Skip to content

Prefer non-converting argument overloads #643

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 5, 2017

Conversation

jagerman
Copy link
Member

@jagerman jagerman commented Feb 4, 2017

This changes function dispatching to work in two passes for overloaded functions: the first tries calling a function with all arguments loaded with convert=false. If this fails for all overloads, a second pass is tried which allows conversion (except for those arguments where it is explicitly disabled with py::arg().noconvert()).

The main point here is to prefer calls to overloads that don't require conversion over calls that do require conversion, regardless of the order in which the functions were registered. This should allow issues like #631 to be solved relatively easily by not loading when convert=false except when the type already matches.

For non-overloaded functions, this code doesn't bother with two passes at all (there's no point in first trying a non-converting call because if it fails we'd just immediately make the converting call anyway).

For overloads where there are no converting argument anyway (either no arguments, or all arguments are py::arg().noconvert()) we don't bother with the second pass attempt (since it would have convert_args flags exactly identical to the first pass).


Without the included tests, this increases the test .so size by 4KB under both GCC and macOS Sierra Clang, but that's a constant increase (i.e. the same increase for binding just one function or the entire test suite).

Note that the first two commits here are just PR #634, which this depends on, but I'm submitting this as a separate PR because it's really a completely different feature and deserves separate consideration/discussion. (Also note that the no-variable-.so increase I mentioned above is relative to #634, i.e. there's a small variable increase relative to master, but that's coming from #634).

Arithmetic and complex casters now only do a converting cast when
`convert=true`; previously they would convert always (e.g. when passing
an int to a float-accepting function, or a float to complex-accepting
function).
This adds support for controlling the `convert` flag of arguments
through the py::arg annotation.  This then allows arguments to be
flagged as non-converting, which the type_caster is able to use to
request different behaviour.

Currently, AFAICS `convert` is only used for type converters of regular
pybind11-registered types; all of the other core type_casters ignore it.
We can, however, repurpose it to control internal conversion of
converters like Eigen and `array`: most usefully to give callers a way
to disable the conversion that would otherwise occur when a
`Eigen::Ref<const Eigen::Matrix>` argument is passed a numpy array that
requires conversion (either because it has an incompatible stride or the
wrong dtype).

Specifying a noconvert looks like one of these:

    m.def("f1", &f, "a"_a.noconvert() = "default"); // Named, default, noconvert
    m.def("f2", &f, "a"_a.noconvert()); // Named, no default, no converting
    m.def("f3", &f, py::arg().noconvert()); // Unnamed, no default, no converting

(The last part--being able to declare a py::arg without a name--is new:
previous py::arg() only accepted named keyword arguments).

Such an non-convert argument is then passed `convert = false` by the
type caster when loading the argument.  Whether this has an effect is up
to the type caster itself, but as mentioned above, this would be
extremely helpful for the Eigen support to give a nicer way to specify
a "no-copy" mode than the custom wrapper in the current PR, and
moreover isn't an Eigen-specific hack.
@@ -28,9 +28,10 @@ def test_pointers(msg):
print_opaque_list, return_null_str, get_null_str_value,
return_unique_ptr, ConstructorStats)

living_before = ConstructorStats.get(ExampleMandA).alive()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't entirely obvious why this change, but I was getting false positives of this test triggered by another test failure (the other failure resulting in an ExampleMandA instance not being destroyed). This just prevents the false positive when a failure is happening elsewhere.

This changes the function dispatching code for overloaded functions into
a two-pass procedure where we first try all overloads with
`convert=false` for all arguments.  If no function calls succeeds in the
first pass, we then try a second pass where we allow arguments to have
`convert=true` (unless, of course, the argument was explicitly specified
with `py::arg().noconvert()`).

For non-overloaded methods, the two-pass procedure is skipped (we just
make the overload-allowed call).  The second pass is also skipped if it
would result in the same thing (i.e. where all arguments are
`.noconvert()` arguments).
@wjakob
Copy link
Member

wjakob commented Feb 5, 2017

I think this is great and will solve a number of recurring difficulties with method resolution. It's nice that all of this also integrates with the existing implicit conversions for custom types and extends them to builtin types. My vote would be to merge this one and close the other PR.

I find the ExampleMandA change a bit troubling -- to me, this seems to indicate that an ExampleMandA instance is leaked somwhere else. But this does not make much sense: it only occurs in two testcases, and both of them test that the reference count is zero at the end. I think it would be good to track down what is causing this rather than committing a workaround.

@jagerman
Copy link
Member Author

jagerman commented Feb 5, 2017

There isn't a leak in the current test code when everything passes, but if some other test fails and leaks, it causes both the earlier test and this one to fail, when really it should be just the other one failing.

@jagerman
Copy link
Member Author

jagerman commented Feb 5, 2017

As for merging this instead of the other, it really doesn't make any difference: this includes the other one anyway.

@wjakob wjakob merged commit 6fa316d into pybind:master Feb 5, 2017
@wjakob
Copy link
Member

wjakob commented Feb 5, 2017

Merged!

@dean0x7d
Copy link
Member

dean0x7d commented Feb 5, 2017

This is awesome! Actual overload resolution 👍

@wjakob
Copy link
Member

wjakob commented Feb 5, 2017

I agree, this is really nice work -- thank you for the beautiful implementation.

@wjakob
Copy link
Member

wjakob commented Feb 6, 2017

Thinking a bit more about this patch, I was wondering about the following: does it make sense for the type_casters to even try non-converting conversions when convert=false? Due to the two-phase method resolution, we know that this failed already in a prior pass, so it might be more efficient to restructure the type casters to have an if statement with two fully disjoint branches rather than a sequence of two steps where the second one is potentially turned off. What do you think?

@jagerman
Copy link
Member Author

jagerman commented Feb 6, 2017

I'm not sure I follow: convert = false will be the first pass, where we're asking the type caster "Can you load this without conversion." If the answer is yes (i.e. true) for all arguments, we finish immediately. It's only if nothing can load without conversion that we consider a second pass with convert = true.

(And if the method isn't overloaded, there is no second pass at all: we essentially just skip the first noconvert attempt entirely).

I think what you might be getting at is that we can, however, get in a situation where we call an argument's type_caster::load() with convert = false twice. This can only happen if the method is overloaded and there are other arguments that switched to convert = true for the second pass (if nothing switches, i.e. every argument has py::arg().noconvert(), the overload doesn't make it to the second pass).

We could possibly avoid that by letting the ->impl lambda store argument casters in the function_call; if it gets invoked a second time, it would then only need to load the casters that returned false the first time rather than all of them. This would change type caster logic a bit: currently we only call load() once on any instance; we'd now be potentially calling it twice (but only if it returned false the first time).

Now that I've written it up, it doesn't seem all that complicated, I'll whip something up.

@wjakob
Copy link
Member

wjakob commented Feb 6, 2017

Right! Thinking a bit more about this, I'm starting to get worried about object code size, since this is the sort of thing that will touch every bound function ;). If data has to be moved around in every function call to get this to work, it might be better to tolerate the unnecessary extra load() call.

@jagerman
Copy link
Member Author

jagerman commented Feb 7, 2017

Yeah, I think you're right: the biggest problem is that every function ->impl has its own distinct argument loader specialization, and thus its own argument loader destruction. I can't see any cheap way to move that destructor out: we'd either have to store a custom lambda per function signature, or add a base class with a virtual destructor to argument_loader<...>, both of which are going to have non-negligible object size increases.

Given that this only has an effect on overloaded functions, but incurs the cost for all functions, it doesn't seem worthwhile. Better just to sometimes invoke load() twice (with the first one, if there are two, always being a convert = false which can presumably avoid expensive operations).

@pschella
Copy link
Contributor

Just FYI. We are using custom converters to implement a form of lazy loading of big objects (say pulling from a database). By registering a in implicit conversion from ReadProxy to T for any T that may require lazy loading. ReadProxy has a py::object __subject__ member that it then attempts to cast to T. Accessing __subject__ triggers the load.
I assume this isn't broken by this commit, but given that it does rely on the internals of implicit conversions (we can't just use py::implicitly_convertible so we had to roll our own), it is a bit fragile.

@jagerman
Copy link
Member Author

If the type isn't an argument to an overloaded function/method, this PR won't change anything. If it is, you could get multiple type_caster constructions and load() calls (which could happen before, as well, but potentially more now if mixed with other arguments whose type casters decline loading without conversion). If your expensive load() is always considered a conversion (and so you always want to defer to other overloads that can be called without a conversion), you'll probably want to immediately return false from the heavy type caster's load(handle src, bool convert) method whenever convert is false so that your conversion only participates if no other method claims to be able to load arguments without conversion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants