|
| 1 | +.. _interoperability: |
| 2 | + |
| 3 | +Interoperability of Xarray |
| 4 | +========================== |
| 5 | + |
| 6 | +Xarray is designed to be extremely interoperable, in many orthogonal ways. |
| 7 | +Making xarray as flexible as possible is the common theme of most of the goals on our :ref:`roadmap`. |
| 8 | + |
| 9 | +This interoperability comes via a set of flexible abstractions that users can plug into. The current full list is: |
| 10 | + |
| 11 | +- :ref:`Custom file backends <add_a_backend>` via the :py:class:`~xarray.backends.BackendEntrypoint` system, |
| 12 | +- NumPy-like :ref:`"duck" array wrapping <internals.duckarrays>`, which supports the `Python Array API Standard <https://data-apis.org/array-api/latest/>`_, |
| 13 | +- :ref:`Chunked distributed array computation <internals.chunkedarrays>` via the :py:class:`~xarray.core.parallelcompat.ChunkManagerEntrypoint` system, |
| 14 | +- Custom :py:class:`~xarray.Index` objects for :ref:`flexible label-based lookups <internals.custom indexes>`, |
| 15 | +- Extending xarray objects with domain-specific methods via :ref:`custom accessors <internals.accessors>`. |
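As an illustration of the last item in the list above, here is a minimal sketch of a custom accessor. The accessor name ``demo_stats`` and the ``anomaly`` method are hypothetical names chosen for this example; only ``xarray.register_dataarray_accessor`` is part of xarray itself.

```python
import numpy as np
import xarray as xr


# "demo_stats" is a hypothetical accessor name used only for this sketch.
@xr.register_dataarray_accessor("demo_stats")
class DemoStatsAccessor:
    def __init__(self, xarray_obj):
        # xarray passes in the DataArray the accessor is attached to.
        self._obj = xarray_obj

    def anomaly(self, dim):
        """Return the data minus its mean along ``dim``."""
        return self._obj - self._obj.mean(dim)


da = xr.DataArray(np.array([1.0, 2.0, 3.0]), dims="x")
anom = da.demo_stats.anomaly("x")
```

The accessor wraps the object rather than subclassing it, so the domain-specific logic lives in its own namespace on every ``DataArray``.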
| 16 | + |
| 17 | +.. warning:: |
| 18 | + |
| 19 | + One obvious way in which xarray could be more flexible is subclassing: whilst subclassing xarray objects is possible, we |
| 20 | + currently don't support it in most transformations, and instead recommend composition over inheritance. See the |
| 21 | + :ref:`internal design page <internal design.subclassing>` for the rationale and look at the corresponding `GitHub issue <https://github.com/pydata/xarray/issues/3980>`_ |
| 22 | + if you're interested in improving support for subclassing! |
| 23 | + |
| 24 | +.. note:: |
| 25 | + |
| 26 | + If you think there is another way in which xarray could become more generically flexible then please |
| 27 | + tell us your ideas by `raising an issue to request the feature <https://github.com/pydata/xarray/issues/new/choose>`_! |
| 28 | + |
| 29 | + |
| 30 | +Whilst xarray was originally designed specifically to open ``netCDF4`` files as :py:class:`numpy.ndarray` objects labelled by :py:class:`pandas.Index` objects, |
| 31 | +it is entirely possible today to: |
| 32 | + |
| 33 | +- lazily open an xarray object directly from a custom binary file format (e.g. using ``xarray.open_dataset(path, engine='my_custom_format')``), |
| 34 | +- handle the data as any API-compliant NumPy-like array type (e.g. sparse or GPU-backed), |
| 35 | +- distribute out-of-core computation across that array type in parallel (e.g. via :ref:`dask`), |
| 36 | +- track the physical units of the data through computations (e.g. via `pint-xarray <https://pint-xarray.readthedocs.io/en/stable/>`_), |
| 37 | +- query the data via custom index logic optimized for specific applications (e.g. an :py:class:`~xarray.Index` object backed by a KDTree structure), |
| 38 | +- attach domain-specific logic via accessor methods (e.g. to understand geographic Coordinate Reference System metadata), |
| 39 | +- organize hierarchical groups of xarray data in a :py:class:`~datatree.DataTree` (e.g. to treat heterogeneous simulation and observational data together during analysis). |
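To make the first item in the list above more concrete, the following is a rough sketch of a custom backend. The class ``RawFloat64Backend`` and its file layout (a flat run of float64 values) are hypothetical and exist only for this example. For brevity it reads the file eagerly; a real backend would typically wrap the data in xarray's lazy indexing machinery, and would usually be registered under an engine name via an entry point rather than passed as a class.

```python
import os
import tempfile

import numpy as np
import xarray as xr
from xarray.backends import BackendEntrypoint


class RawFloat64Backend(BackendEntrypoint):
    """Hypothetical backend for a headerless file of float64 values."""

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        # Eager read for simplicity (an assumption of this sketch, not lazy).
        values = np.fromfile(filename_or_obj, dtype="float64")
        ds = xr.Dataset({"values": ("x", values)})
        if drop_variables:
            ds = ds.drop_vars(drop_variables, errors="ignore")
        return ds


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.bin")
    np.arange(4, dtype="float64").tofile(path)

    # A BackendEntrypoint subclass can be passed directly as the engine.
    ds = xr.open_dataset(path, engine=RawFloat64Backend)
    loaded = ds["values"].values.copy()
    ds.close()
```

Once the backend has returned a ``Dataset``, everything downstream (indexing, computation, accessors) works the same way as for any other xarray object.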
| 40 | + |
| 41 | +All of these features can be provided simultaneously, using libraries compatible with the rest of the scientific Python ecosystem. |
| 42 | +In this situation xarray would essentially be a thin wrapper acting as a pure-Python framework, providing a common interface and |
| 43 | +separation of concerns via various domain-agnostic abstractions. |
| 44 | + |
| 45 | +Most of the remaining pages in the documentation of xarray's internals describe these various types of interoperability in more detail. |