Skip to content

[BUG]: def_property doesn't seems to keep parent alive when the return object is a py::array #4236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
3 tasks done
zhqrbitee opened this issue Oct 12, 2022 · 5 comments
Open
3 tasks done
Labels
triage New bug, unverified

Comments

@zhqrbitee
Copy link

Required prerequisites

Problem description

I posted this as a discussion before: #4200, but got no answer for a while, so repost it here.

I have a use case that have some internal C++ vectors and want to return to python side as numpy array. To avoid copy, we only return a view on a C++ vector. It works OK in most of times, but I'm now hitting below corner case:

#include <vector>
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

struct Wrapper1 {
  Wrapper1(const std::vector<double> &values): values{values} {};
  std::vector<double> values;
};

struct Wrapper2 {
  Wrapper2(const std::vector<double> &values): values{values} {};

  std::shared_ptr<Wrapper1> wrapper1() const { return std::make_shared<Wrapper1>(values); };

  std::vector<double> values;
};

namespace py = pybind11;
using namespace pybind11::literals;

PYBIND11_MODULE(test_return_numpy_array_no_copy_helper, m) {
  py::class_<Wrapper1, std::shared_ptr<Wrapper1>>(m, "Wrapper1")
      .def(py::init<const std::vector<double> &>(), "values"_a)
      .def_property_readonly("values", [](const Wrapper1& self) {
        // https://github.com/pybind/pybind11/issues/1042#issuecomment-325941022
        auto array = pybind11::array(self.values.size(), self.values.data(),
                                     // the base of the numpy array, as long as it is set to a python object, it won't do copy
                                     pybind11::handle{Py_None});
        // https://github.com/pybind/pybind11/issues/481, make numpy array read-only
        reinterpret_cast<pybind11::detail::PyArray_Proxy*>(array.ptr())->flags &=
            ~pybind11::detail::npy_api::NPY_ARRAY_WRITEABLE_;
        return array;
      });

  py::class_<Wrapper2, std::shared_ptr<Wrapper2>>(m, "Wrapper2")
      .def(py::init<const std::vector<double> &>(), "values"_a)
      .def_property_readonly("wrapper1", &Wrapper2::wrapper1);
}

Wrapper2 create a new instance of std::shared_ptr<Wrapper1> which contains a C++ std::vector values. The values is then return to python as a numpy array view.
From the docs: https://pybind11.readthedocs.io/en/stable/advanced/functions.html, def_property_readonly will use return_value_policy::reference_internal and thus keep the parent object alive. So I expect the values returned will keep its parent, i.e. the new instance of std::shared_ptr<Wrapper1> alive and the underlying memory will be valid as long as values is alive in Python.

However, below test case shows it is not the case

def test_return_np_array_with_no_copy():
    vec = [0.0, 0.5, 1.3, 2.2]
    wrapper2 = Wrapper2(vec)
    values = wrapper2.wrapper1.values
    # in some cases, you will see values be something like
    # array([1.08885983e-309, 5.00000000e-001, 1.30000000e+000, 2.20000000e+000])).
    # It means the underlying memory is destroyed
    assert np.all(values == vec), f"{values}"

I also try to manually to use keep_alive<0, 1> in the values binding of Wrapper1, but that doesn't seem to help.
So is the keep_alive system not supposed to work with an existing python object, e.g. py::array? If so, what should I do to make above test case work?

Any thoughts would be appreciated. Thanks.

Reproducible example code

No response

@zhqrbitee zhqrbitee added the triage New bug, unverified label Oct 12, 2022
@Mi-La
Copy link

Mi-La commented Jan 31, 2023

Hi,
I have similar problem with properties and keep_alive. It seems that documentation to return value policies is slightly misleading.
For return_value_policy::reference_internal it says:

This is the default policy for property getters created via def_property, def_readwrite, etc.

But it obviously is NOT true, at least not in all cases.

See the most simplified example:

struct B
{
    B() { std::cout << "B" << std::endl; }
    ~B() { std::cout << "~B" << std::endl; }
    B(B&& other) { std::cout << "B move" << std::endl; }
};

struct A
{
    A() { std::cout << "A" << std::endl; }
    ~A() { std::cout << "~A" << std::endl; }
};

PYBIND11_MODULE(my_module, m)
{
    py::class_<A>(m, "A")
        .def(py::init())
        .def_property_readonly("b_wrong", [](A& a) { return B(); })
        .def_property_readonly("b_ok", py::cpp_function([](A& a) { return B(); }, py::keep_alive<0, 1>()));

    py::class_<B>(m, "B")
        .def(py::init()) ;
}

Note that the keep_alive must be explicitly specified, which must be done using cpp_funciton, which is almost undocumented :-(.

See that property b_wrong does not keep A alive:

>>> from my_module import A, B
>>> a = A()
A
>>> b = a.b_wrong
B
B move
~B
>>> del a
~A

While it works with b_ok:

>>> from my_module import A, B
>>> a = A()
A
>>> b = a.b_ok
B
B move
~B
>>> del a
>>> del b
~B
~A

@Mi-La
Copy link

Mi-La commented Jan 31, 2023

I've also checked test_call_policies.cpp which could have been used as good reference, but it unfortunately also doesn't contain any def_property / def_property_readonly test cases.

@xtactis
Copy link

xtactis commented Jan 22, 2024

We recently ran into this problem when writing bindings for std::span. It took some digging, but I believe I found the root cause.

This happens because .def_property and co. are broken in two interesting ways.
NOTE: One important distinction, this presupposes that we are in some way relying on the type_caster_generic class. This is what type_caster_base and by extension pybind::class_ uses, so if you're running into this issue, you are probably using it directly or indirectly.

First things first, every templated .def_property method first puts the getter (and setter) into a cpp_function, which upon initialization produces a lambda which can discard the policy given by the user. As per this function call, if the return type of the invocable is not an lvalue reference nor a pointer, the policy gets overridden to a move. While I don't personally agree with the design decision of doing this behind the user's back, in most cases this seems to be a safe operation. One important thing to note here is that if the user is returning an object by value, and has set the policy to reference_internal it will get overridden to move, this discards the keep_alive<0, 1> aspect of reference_internal entirely.

But ok, I can just use keep_alive<0, 1> directly, right? That should be fine regardless of policy, it's just telling pybind that the returned object in some way depends on the existence of the implicit self argument. And as we see from @Mi-La's example, if we pass keep_alive<0, 1> to a cpp_function constructor before passing that to def_property, it works! So what's the problem?

The problem is that if I don't construct a cpp_function object, but just pass keep_alive<0, 1>, the self object will not in fact be kept alive. It would if it was a def or def_static call, but for def_property it doesn't! This is because when a def_property call receives a lambda or method pointer, it first constructs a cpp_function, but importantly, it does not pass extras... in that construction. Instead, the extras along with the getter (and setter), and the implicit reference_internal get passed along all the way to def_static_property. There we see that the extras get applied to the getter (and setter) via a detail::process_attributes<Extra...>::init(extra..., rec_fget); call. Great! What, you say, is the issue with this? The issue with this is that the specialization of process_attributes for keep_alive doesn't have an init method! Well.. it does, but it's just the inherited default which does nothing.

I believe this fully covers this bug. As for how this should be resolved, I'm not sure. I'm not familiar enough with the internals of pybind11 to propose a solution that won't break something else. But I would love to see this fixed in the near future. Hope this information helps!

@gentlegiantJGC
Copy link
Contributor

I don't see how process_attributes::init could patch in keep_alive. It only has the ability to mutate the function_record object and keep_alive isn't stored as part of that. It is passed into the cpp_function constructor and used directly in a lambda which is then assigned to the function record.

@gentlegiantJGC
Copy link
Contributor

I don't know what the correct solution here is (beyond the cpp_function workaround) but I think a compiler warning might help some people.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triage New bug, unverified
Projects
None yet
Development

No branches or pull requests

4 participants