Skip to content

PEP 590: allow passing a dict to PyObject_Vectorcall #1038

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

jdemeyer
Copy link
Contributor

@jdemeyer jdemeyer commented May 9, 2019

Allow passing a dict as keywords argument to PyObject_Vectorcall. Since checking for a tuple or dict is very fast, this costs almost nothing in terms of performance.

The fact that _PyObject_FastCallDict (which does vectorcall with a dict) exists and is used in various places in CPython shows that such functionality is useful.

CC @encukou @markshannon

@encukou
Copy link
Member

encukou commented May 9, 2019

The argument against is that we want to position PyObject_Vectorcall as the "fast path" for calling.
You're right that it would not be that hard to allow passing dict to it. But, that wouldn't give much improvement over using tp_call directly.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented May 9, 2019

The argument against is that we want to position PyObject_Vectorcall as the "fast path" for calling.

That's just a matter of documentation: we could say that a tuple is recommended.

You're right that it would not be that hard to allow passing dict to it. But, that wouldn't give much improvement over using tp_call directly.

There is an obvious advantage over tp_call that it doesn't require a temporary tuple for the positional arguments.

An important note is also that it's not just a matter of performance but also of convenience: by allowing various kinds of calls, we make it as easy as possible for the users of the API. There are various existing use cases for using a vector of args but a dict of kwargs (that's the reason why _PyObject_FastCallDict exists).

@encukou
Copy link
Member

encukou commented May 9, 2019

_PyObject_FastCallDict is nice as internal API in CPython, where there's lots of existing code that can use it. But that doesn't automatically mean it should be exposed externally. Not including it would provide an incentive for new code to use vectorcall's fast path.

Also, this would not be hard to add afterwards, if we find compelling use cases.

@scoder
Copy link
Contributor

scoder commented May 10, 2019

I agree that optionally allowing a dict would be nice. Some callers might simply already have a dict lying around (if only for legacy reasons to support older Python versions), in which case forcing them to either create a pos-args tuple and use tp_call(), or unpack the dict somehow to use tp_vectorcall(), would be annoying. If the C-API call function can handle that case internally, why not let it do it.

@jdemeyer
Copy link
Contributor Author

_PyObject_FastCallDict is nice as internal API in CPython, where there's lots of existing code that can use it. But that doesn't automatically mean it should be exposed externally. Not including it would provide an incentive for new code to use vectorcall's fast path.

I don't get this argument. In any case, it seems more like a documentation issue: we could document that a tuple is recommended.

Also, this would not be hard to add afterwards, if we find compelling use cases.

We don't need to find use cases, there are already use cases in the CPython source code. To name two random use cases, it's used for implementing functools.partial and __init_subclass__. I don't see a reason that these use cases are specific to CPython. I'd like to play "practicality beats purity" here.

@encukou
Copy link
Member

encukou commented May 10, 2019

That convinces me that PyObject_Vectorcall is the right place to convert kwarg dicts to the vectorcall protocol (with a warning in the docs that this loses performance).

@encukou encukou requested a review from markshannon May 10, 2019 14:34
@markshannon
Copy link
Member

I think this is a bad idea, as it suggests to the user that passing a dictionary to PyObject_Vectorcall is an efficient way to call the target callable. However, in most cases it will not be.

The first reason is that tp_call and vectorcall are two parallel calling conventions. Converting from one to the other just adds overhead.
Most callables just consume their arguments. In which case it is best to let the callable decode its arguments, it has the most context and should be able to do so most efficiently.
For example, when calling Python functions any additional conversion is pure overhead.

The second reason is that it is not clear what PyObject_Vectorcall should do. Should it convert the vector arguments to a tuple and call tp_call or convert the dictionary, adding the values to the existing vector and create a tuple of names and then call vectorcall?
The former is a cheaper conversion, but the latter might be more efficient for the callee.
If the underlying callable is a Python function, then the tp_call route would be more efficient.
If the underlying callable is a METH_VECTORCALL method descriptor, the vectorcall route would be more efficient.

_PyObject_FastCallDict exists mainly as a shim to help callables like partial that are called with the tp_call convention, and then call objects that are likely to be Python or builtin functions which support this "vector plus dict" form of arguments.
Note that _PyFunction_FastCallDict does not call _PyFunction_FastCallKeywords, nor does _PyCFunction_FastCallDict call _PyCFunction_FastCallKeywords as the conversion of arguments would be unnecessarily expensive.
I don't see any pressing need to remove _PyObject_FastCallDict for internal use.

To answer @scoder's comment about caller having a dictionary "lying around". If they do so for legacy call reasons, they should also have tuple "lying around" and can just use the tp_call calling convention.

Would an API function to help convert from the vector/dict form proposed by this PR to the vectorcall vector/tuple form be valuable to tools like Cython?
If so, then making _PyStack_UnpackDict might be an option.

@jdemeyer
Copy link
Contributor Author

I think this is a bad idea, as it suggests to the user that passing a dictionary to PyObject_Vectorcall is an efficient way to call the target callable.

As I already said, that's just a documentation issue...

@jdemeyer
Copy link
Contributor Author

Also read what I posted earlier: it's not only a matter of performance, but also of convenience. With this in mind, it shouldn't be a problem to accept a dict also.

@jdemeyer
Copy link
Contributor Author

Note that _PyFunction_FastCallDict does not call _PyFunction_FastCallKeywords, nor does _PyCFunction_FastCallDict call _PyCFunction_FastCallKeywords as the conversion of arguments would be unnecessarily expensive.

But that's an argument in favor of this PR. When you don't allow passing a dict in PyObject_Vectorcall, you force the caller to do a conversion in the "dict lying around" scenario. If the callee happens to use tp_call, then there will be a double conversion dict -> tuple (by the caller) and tuple -> dict (by the callee) which is obviously suboptimal.

@encukou
Copy link
Member

encukou commented May 17, 2019

The second reason is that it is not clear what PyObject_Vectorcall should do. Should it convert the vector arguments to a tuple and call tp_call or convert the dictionary, adding the values to the existing vector and create a tuple of names and then call vectorcall?
The former is a cheaper conversion, but the latter might be more efficient for the callee.
If the underlying callable is a Python function, then the tp_call route would be more efficient.
If the underlying callable is a METH_VECTORCALL method descriptor, the vectorcall route would be more efficient.

If the caller knows the callee, it should use the right protocol. So, let's assume it doesn't know the callee (and doesn't want to introspect it).

Now, PyObject_Vectorcall already checks if the underlying callable implements vectorcall. It can make the decision for tp_call vs. vectorcall relatively easily.

tp_call and vectorcall are two parallel calling conventions, but they should usually be selected based on what you have "lying around" .
What should you do if you have vector + dict lying around? Let's make PyObject_Vectorcall the best handler for that situation.

@jdemeyer
Copy link
Contributor Author

I should add that I've been using this PyObject_Vectorcall in the implementation python/cpython#13185 and the new convention feels very natural and easy to use.

@markshannon
Copy link
Member

I'm unconvinced that this "have a vector + dict lying around" is a real scenario, and it is definitely not a common one.
Can you give an example of where allowing PyObject_Vectorcall to accept a dict would be beneficial in terms of performance?
I have given a common case, calling a Python function, of where it would be harmful.

The purpose of PEP 590 is to add one new calling convention. Allowing PyObject_Vectorcall to accept a dict is effectively another calling convention, vector+dict.

PEP 590 is not about convenience or ease of use. It is about performance. Vagueness in the API is likely to introduce confusion and inefficiency.
If you want convenience and ease of use, don't write C, write Python 🙂

I think that accepting two fundamentally different kinds of things for a single function is poor API design. The change in the second argument implicitly changes the type of first argument. If the second argument is a tuple, then the first argument is a vector of all the argument values. However if the second argument were to be a dict, then the first argument becomes a vector of just the positional arguments. Which seems needlessly confusing and error prone.

If the ability to pass vector+dict is that valuable, which I doubt, then use _PyObject_FastCallDict.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented May 18, 2019

If the ability to pass vector+dict is that valuable, which I doubt, then use _PyObject_FastCallDict.

I'm fine creating a second function for the vector + dict calling convention, if that's the compromise. But I would put those two functions on the same level (i.e. make both public, calling them PyObject_Vectorcall and PyObject_VectorcallDict for example)

@jdemeyer
Copy link
Contributor Author

jdemeyer commented May 18, 2019

I have given a common case, calling a Python function, of where it would be harmful.

Could you please explain

  1. why using PyObject_Vectorcall with a dict to call Python functions is harmful;
  2. how this performance problem is caused by the API of PyObject_Vectorcall rather than the implementation of that function.

@jdemeyer
Copy link
Contributor Author

I think that accepting two fundamentally different kinds of things for a single function is poor API design.

Counter-proposal at #1057 with the same functionality as this PR but split over two different API functions.

@jdemeyer jdemeyer closed this May 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants