Skip to content

Commit 503c90e

Browse files
committed
Eigen<->numpy referencing support
This commit largely rewrites the Eigen dense matrix support to avoid copying in many cases: Eigen arguments can now reference numpy data, and numpy objects can now reference Eigen data (given compatible types). Eigen::Ref<...> arguments now also make use of the new `convert` argument use (added in PR pybind#634) to avoid conversion, allowing `py::arg().noconvert()` to be used when binding a function to prohibit copying when invoking the function. Respecting `convert` also means Eigen overloads that avoid copying will be preferred during overload resolution to ones that require copying. This commit also rewrites the Eigen documentation and test suite to explain and test the new capabilities.
1 parent 9102cca commit 503c90e

File tree

5 files changed

+1431
-271
lines changed

5 files changed

+1431
-271
lines changed

docs/advanced/cast/eigen.rst

Lines changed: 288 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -1,48 +1,308 @@
11
Eigen
2-
=====
2+
#####
33

44
`Eigen <http://eigen.tuxfamily.org>`_ is C++ header-based library for dense and
55
sparse linear algebra. Due to its popularity and widespread adoption, pybind11
6-
provides transparent conversion support between Eigen and Scientific Python linear
7-
algebra data types.
6+
provides transparent conversion and limited mapping support between Eigen and
7+
Scientific Python linear algebra data types.
88

9-
Specifically, when including the optional header file :file:`pybind11/eigen.h`,
10-
pybind11 will automatically and transparently convert
9+
To enable the built-in Eigen support you must include the optional header file
10+
:file:`pybind11/eigen.h`.
1111

12-
1. Static and dynamic Eigen dense vectors and matrices to instances of
13-
``numpy.ndarray`` (and vice versa).
12+
Pass-by-value
13+
=============
1414

15-
2. Returned matrix expressions such as blocks (including columns or rows) and
16-
diagonals will be converted to ``numpy.ndarray`` of the expression
17-
values.
15+
When binding a function with ordinary Eigen dense object arguments (for
16+
example, ``Eigen::MatrixXd``), pybind11 will accept any input value that is
17+
already (or convertible to) a ``numpy.ndarray`` with dimensions compatible with
18+
the Eigen type, copy its values into a temporary Eigen variable of the
19+
appropriate type, then call the function with this temporary variable.
1820

19-
3. Returned matrix-like objects such as Eigen::DiagonalMatrix or
20-
Eigen::SelfAdjointView will be converted to ``numpy.ndarray`` containing the
21-
expressed value.
21+
Sparse matrices are similarly copied to or from
22+
``scipy.sparse.csr_matrix``/``scipy.sparse.csc_matrix`` objects.
2223

23-
4. Eigen sparse vectors and matrices to instances of
24-
``scipy.sparse.csr_matrix``/``scipy.sparse.csc_matrix`` (and vice versa).
24+
Pass-by-reference
25+
=================
2526

26-
This makes it possible to bind most kinds of functions that rely on these types.
27-
One major caveat are functions that take Eigen matrices *by reference* and modify
28-
them somehow, in which case the information won't be propagated to the caller.
27+
One major limitation of the above is that every data conversion implicitly
28+
involves a copy, which can be both expensive (for large matrices) and disallows
29+
binding functions that change their (Matrix) arguments. Pybind11 allows you to
30+
work around this by using Eigen's ``Eigen::Ref<MatrixType>`` class much as you
31+
would when writing a function taking a generic type in Eigen itself (subject to
32+
some limitations discussed below).
33+
34+
When calling a bound function accepting a ``Eigen::Ref<const MatrixType>``
35+
type, pybind11 will attempt to avoid copying by using an ``Eigen::Map`` object
36+
that maps into the source ``numpy.ndarray`` data: this requires both that the
37+
data types are the same (e.g. ``dtype='float64'`` and ``MatrixType::Scalar`` is
38+
``double``); and that the storage is layout compatible. The latter limitation
39+
is discussed in detail in the section below, and requires careful
40+
consideration: by default, numpy matrices and eigen matrices are *not* storage
41+
compatible.
42+
43+
If the numpy matrix cannot be used as is (either because its types differ, e.g.
44+
passing an array of integers to an Eigen paramater requiring doubles, or
45+
because the storage is incompatible), pybind11 makes a temporary copy and
46+
passes the copy instead.
47+
48+
When a bound function parameter is instead ``Eigen::Ref<MatrixType>`` (note the
49+
lack of ``const``), pybind11 will only allow the function to be called if it
50+
can be mapped *and* if the numpy array is writeable (that is
51+
``a.flags.writeable`` is true). Any access (including modification) made to
52+
the passed variable will be transparently carried out directly on the
53+
``numpy.ndarray``.
54+
55+
This means you can can write code such as the following and have it work as
56+
expected:
2957

3058
.. code-block:: cpp
3159
32-
/* The Python bindings of these functions won't replicate
33-
the intended effect of modifying the function arguments */
34-
void scale_by_2(Eigen::Vector3f &v) {
60+
void scale_by_2(Eigen::Ref<Eigen::VectorXd> m) {
3561
v *= 2;
3662
}
37-
void scale_by_2(Eigen::Ref<Eigen::MatrixXd> &v) {
38-
v *= 2;
63+
64+
Note, however, that you will likely run into limitations due to numpy and
65+
Eigen's difference default storage order for data; see the below section on
66+
:ref:`storage_orders` for details on how to bind code that won't run into such
67+
limitations.
68+
69+
.. note::
70+
71+
Passing by reference is not supported for sparse types.
72+
73+
Returning values to Python
74+
==========================
75+
76+
When returning an ordinary dense Eigen matrix type to numpy (e.g.
77+
``Eigen::MatrixXd`` or ``Eigen::RowVectorXf``) pybind11 keeps the matrix and
78+
returns a numpy array that directly references the Eigen matrix: no copy of the
79+
data is performed. The numpy array will have ``array.flags.owndata`` set to
80+
``False`` to indicate that it does not own the data, and the lifetime of the
81+
stored Eigen matrix will be tied to the returned ``array``.
82+
83+
If you bind a function with a non-reference, ``const`` return type (e.g.
84+
``const Eigen::MatrixXd``), the same thing happens except that pybind11 also
85+
sets the numpy array's ``writeable`` flag to false.
86+
87+
If you return an lvalue reference or pointer, the usual pybind11 rules apply,
88+
as dictated by the binding function's return value policy (see the
89+
documentation on :ref:`return_value_policies` for full details). That means,
90+
without an explicit return value policy, lvalue references will be copied and
91+
pointers will be managed by pybind11. In order to avoid copying, you should
92+
explictly specify an appropriate return value policy, as in the following
93+
example:
94+
95+
.. code-block:: cpp
96+
97+
class MyClass {
98+
Eigen::MatrixXd big_mat = Eigen::MatrixXd::Zero(10000, 10000);
99+
public:
100+
Eigen::MatrixXd &getMatrix() { return big_mat; }
101+
const Eigen::MatrixXd &viewMatrix() { return big_mat; }
102+
};
103+
104+
// Later, in binding code:
105+
py::class_<MyClass>(m, "MyClass")
106+
.def(py::init<>())
107+
.def("copy_matrix", &MyClass::getMatrix) // Makes a copy!
108+
.def("get_matrix", &MyClass::getMatrix, py::return_value_policy::reference_internal)
109+
.def("view_matrix", &MyClass::viewMatrix, py::return_value_policy::reference_internal)
110+
;
111+
112+
.. code-block:: python
113+
114+
a = MyClass()
115+
m = a.get_matrix() # flags.writeable = True, flags.owndata = False
116+
v = a.view_matrix() # flags.writeable = False, flags.owndata = False
117+
c = a.copy_matrix() # flags.writeable = True, flags.owndata = True
118+
# m[5,6] and v[5,6] refer to the same element, c[5,6] does not.
119+
120+
Note in this example that ``py::return_value_policy::reference_internal`` is
121+
used to tie the life of the MyClass object to the life of the returned arrays.
122+
123+
You may also return an ``Eigen::Ref``, ``Eigen::Map`` or other map-like Eigen
124+
object (for example, the return value of ``matrix.block()`` and related
125+
methods) that map into a dense Eigen type. When doing so, the default
126+
behaviour of pybind11 is to simply reference the returned data: you must take
127+
care to ensure that this data remains valid! You may ask pybind11 to
128+
explicitly *copy* such a return value by using the
129+
``py::return_value_policy::copy`` policy when binding the function. You may
130+
also use ``py::return_value_policy::reference_internal`` or a
131+
``py::keep_alive`` to ensure the data stays valid as long as the returned numpy
132+
array does.
133+
134+
When returning such a reference of map, pybind11 additionally respects the
135+
readonly-status of the returned value, marking the numpy array as non-writeable
136+
if the reference or map was itself read-only.
137+
138+
.. note::
139+
140+
Sparse types are always copied when returned.
141+
142+
.. _storage_orders:
143+
144+
Storage orders
145+
==============
146+
147+
Passing arguments via ``Eigen::Ref`` has some limitations that you must be
148+
aware of in order to effectively pass matrices by reference. First and
149+
foremost is that the default ``Eigen::Ref<MatrixType>`` class requires
150+
contiguous storage along columns (for column-major types, the default in Eigen)
151+
or rows if ``MatrixType`` is specifically an ``Eigen::RowMajor`` storage type.
152+
The former, Eigen's default, is incompatible with ``numpy``'s default row-major
153+
storage, and so you will not be able to pass numpy arrays to Eigen by reference
154+
without making one of two changes.
155+
156+
(Note that this does not apply to vectors (or column or row matrices): for such
157+
types the "row-major" and "column-major" distinction is meaningless).
158+
159+
The first approach is to change the use of ``Eigen::Ref<MatrixType>`` to the
160+
more general ``Eigen::Ref<MatrixType, 0, Eigen::Stride<Eigen::Dynamic,
161+
Eigen::Dynamic>>`` (or similar type with a fully dynamic stride type in the
162+
third template argument). Since this is a rather cumbersome type, pybind11
163+
provides a ``py::EigenDRef<MatrixType>`` type alias for your convenience (along
164+
with EigenDMap for the equivalent Map, and EigenDStride for just the stride
165+
type).
166+
167+
This type allows Eigen to map into any arbitrary storage order. This is not
168+
the default in Eigen for performance reasons: contiguous storage allows
169+
vectorization that cannot be done when storage is not known to be contiguous at
170+
compile time. The default ``Eigen::Ref`` stride type allows non-contiguous
171+
storage along the outer dimension (that is, the rows of a column-major matrix
172+
or columns of a row-major matrix), but not along the inner dimension.
173+
174+
This type, however, has the added benefit of also being able to map numpy array
175+
slices. For example, the following (contrived) example uses Eigen with a numpy
176+
slice to multiply by 2 all coefficients that are both on even rows (0, 2, 4,
177+
...) and in columns 2, 5, or 8:
178+
179+
.. code-block:: cpp
180+
181+
m.def("scale", [](py::EigenDRef<Eigen::MatrixXd> m, double c) { m *= c; });
182+
183+
.. code-block:: python
184+
185+
# a = np.array(...)
186+
scale_by_2(myarray[0::2, 2:9:3])
187+
188+
The second approach to avoid copying is more intrusive: rearranging the
189+
underlying data types to not run into the non-contiguous storage problem in the
190+
first place. In particular, that means using matrices with ``Eigen::RowMajor``
191+
storage, where appropriate, such as:
192+
193+
.. code-block:: cpp
194+
195+
using RowMatrixXd = Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;
196+
// Use RowMatrixXd instead of MatrixXd
197+
198+
Now bound functions accepting ``Eigen::Ref<RowMatrixXd>`` arguments will be
199+
callable with numpy's (default) arrays without involving a copying.
200+
201+
You can, alternatively, change the storage order that numpy arrays use by
202+
adding the ``order='F'`` option when creating an array:
203+
204+
.. code-block:: python
205+
206+
myarray = np.array(source, order='F')
207+
208+
Such an object will be passable to a bound function accepting an
209+
``Eigen::Ref<MatrixXd>`` (or similar column-major Eigen type).
210+
211+
One major caveat with this approach, however, is that it is not entirely as
212+
easy as simply flipping all Eigen or numpy usage from one to the other: some
213+
operations may alter the storage order of a numpy array. For example, ``a2 =
214+
array.transpose()`` results in ``a2`` being a view of ``array`` that references
215+
the same data, but in the opposite storage order!
216+
217+
While this approach allows fully optimized vectorized calculations in Eigen, it
218+
cannot be used with array slices, unlike the first approach.
219+
220+
When *returning* a matrix to Python (either a regular matrix, a reference via
221+
``Eigen::Ref<>``, or a map/block into a matrix), no special storage
222+
consideration is required: the created numpy array will have the required
223+
stride that allows numpy to properly interpret the array, whatever its storage
224+
order.
225+
226+
Failing rather than copying
227+
===========================
228+
229+
The default behaviour when binding ``Eigen::Ref<const MatrixType>`` eigen
230+
references is to copy matrix values when passed a numpy array that does not
231+
conform to the element type of ``MatrixType`` or does not have a compatible
232+
stride layout. If you want to explicitly avoid copying in such a case, you
233+
should bind arguments using the ``py::arg().noconvert()`` annotation (as
234+
described in the :ref:`nonconverting_arguments` documentation).
235+
236+
The following example shows an example of arguments that don't allow data
237+
copying to take place:
238+
239+
.. code-block:: cpp
240+
241+
// The method and function to be bound:
242+
class MyClass {
243+
// ...
244+
double some_method(const Eigen::Ref<const MatrixXd> &matrix) { /* ... */ }
245+
};
246+
float some_function(const Eigen::Ref<const MatrixXf> &big,
247+
const Eigen::Ref<const MatrixXf> &small) {
248+
// ...
39249
}
40250
41-
To see why this is, refer to the section on :ref:`opaque` (although that
42-
section specifically covers STL data types, the underlying issue is the same).
43-
The :ref:`numpy` sections discuss an efficient alternative for exposing the
44-
underlying native Eigen types as opaque objects in a way that still integrates
45-
with NumPy and SciPy.
251+
// The associated binding code:
252+
using namespace pybind11::literals; // for "arg"_a
253+
py::class_<MyClass>(m, "MyClass")
254+
// ... other class definitions
255+
.def("some_method", &MyClass::some_method, py::arg().nocopy());
256+
257+
m.def("some_function", &some_function,
258+
"big"_a.nocopy(), // <- Don't allow copying for this arg
259+
"small"_a // <- This one can be copied if needed
260+
);
261+
262+
With the above binding code, attempting to call the the ``some_method(m)``
263+
method on a ``MyClass`` object, or attempting to call ``some_function(m, m2)``
264+
will raise a ``RuntimeError`` rather than making a temporary copy of the array.
265+
It will, however, allow the ``m2`` argument to be copied into a temporary if
266+
necessary.
267+
268+
Note that explicitly specifying ``.noconvert()`` is not required for *mutable*
269+
Eigen references (e.g. ``Eigen::Ref<MatrixXd>`` without ``const`` on the
270+
``MatrixXd``): mutable references will never be called with a temporary copy.
271+
272+
Vectors versus column/row matrices
273+
==================================
274+
275+
Eigen and numpy have fundamentally different notions of a vector. In Eigen, a
276+
vector is simply a matrix with the number of columns or rows set to 1 at
277+
compile time (for a column vector or row vector, respectively). Numpy, in
278+
contast, has comparable 2-dimensional 1xN and Nx1 arrays, but *also* has
279+
1-dimensional arrays of size N.
280+
281+
When passing a 2-dimensional 1xN or Nx1 array to Eigen, the Eigen type must
282+
have matching dimensions: That is, you cannot pass a 2-dimensional Nx1 numpy
283+
array to an Eigen value expecting a row vector, or a 1xN numpy array as a
284+
column vector argument.
285+
286+
On the other hand, pybind11 allows you to pass 1-dimensional arrays of length N
287+
as Eigen parameters. If the Eigen type can hold a column vector of length N it
288+
will be passed as such a column vector. If not, but the Eigen type constraints
289+
will accept a row vector, it will be passed as a row vector. (The column
290+
vector takes precendence when both are supported, for example, when passing a
291+
1D numpy array to a MatrixXd argument). Note that the type need not be
292+
expicitly a vector: it is permitted to pass a 1D numpy array of size 5 to an
293+
Eigen ``Matrix<double, Dynamic, 5>``: you would end up with a 1x5 Eigen matrix.
294+
Passing the same to an ``Eigen::MatrixXd`` would result in a 5x1 Eigen matrix.
295+
296+
When returning an eigen vector to numpy, the conversion is ambiguous: a row
297+
vector of length 4 could be returned as either a 1D array of length 4, or as a
298+
2D array of size 1x4. When encoutering such a situation, pybind11 compromises
299+
by considering the returned Eigen type: if it is a compile-time vector--that
300+
is, the type has either the number of rows or columns set to 1 at compile
301+
time--pybind11 converts to a 1D numpy array when returning the value. For
302+
instances that are a vector only at run-time (e.g. ``MatrixXd``,
303+
``Matrix<float, Dynamic, 4>``), pybind11 returns the vector as a 2D array to
304+
numpy. If this isn't want you want, you can use ``array.reshape(...)`` to get
305+
a view of the same data in the desired dimensions.
46306

47307
.. seealso::
48308

docs/advanced/functions.rst

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,8 @@ with the basics of binding functions and classes, as explained in :doc:`/basics`
66
and :doc:`/classes`. The following guide is applicable to both free and member
77
functions, i.e. *methods* in Python.
88

9+
.. _return_value_policies:
10+
911
Return value policies
1012
=====================
1113

@@ -319,6 +321,8 @@ like so:
319321
py::class_<MyClass>("MyClass")
320322
.def("myFunction", py::arg("arg") = (SomeType *) nullptr);
321323
324+
.. _nonconverting_arguments:
325+
322326
Non-converting arguments
323327
========================
324328

0 commit comments

Comments
 (0)