random.choice fails on numpy arrays #100805
Comments
This issue was also reported by @llimeht here: #30008 (comment)
Thanks for the report. I'll look at it again shortly. I suspect we would have to make a speed trade-off, which would be a bummer in the case of a fine-grained function like this one. Is there a reason for not using numpy.random.choice, which is specifically designed to support numpy arrays and does so more efficiently than CPython can?
Mark, do you have thoughts on whether the random sequence functions should support NumPy arrays? And what about other functions outside the random module? The |
No strong opinions either way on the "should". It's convenient when they do, but NumPy has its own rich selection of RNG functionality, and NumPy arrays differ enough from the Sequence interface that it's not reasonable to expect them to be usable everywhere that a Python sequence is. That said, this particular case seems like a clear regression. If we want to enforce that the argument to |
My 2 cents: in general, I don't think there's any obligation for functions in the |
Hmm. That wouldn't actually help, anyway, since as far as I can tell being an instance of |
Thank you all for reacting to this so quickly. I actually did not consider that this check is performance relevant. But this is my first issue on CPython and it makes sense that here every percent of performance counts.
In my case it was mainly consistency, since I already used |
I'll go ahead and make a PR and backport it. It is not really a bug fix; it is more of an accommodation for reliance on an implementation detail. The function contract requires that Numpy has intentionally (and legitimately) decided not to follow this path. Knowingly it chose (because the benefits outweighed the costs) to not work with a huge swath of pure Python functions that expect either a "sequence" lowercase or "Sequence" uppercase. Interestingly, MyPy has no ability to detect this issue. The choice() stub requires lowercase "sequence" semantics. It has no ability to detect that Perhaps a shout-out to Hyrum's law is warranted:
…100830) (cherry picked from commit 9a68ff1) Co-authored-by: Raymond Hettinger <[email protected]>
(cherry picked from commit 9a68ff1) Co-authored-by: Raymond Hettinger <[email protected]>
Thanks @NicoNeureiter and @llimeht for both reporting this issue! Support for |
random.choice never supported numpy arrays: python/cpython#100805
Bug report
In a project of mine I use random.choice on a numpy array (at multiple points in the code), which worked fine until recently. After upgrading to Python 3.11, my code crashes with:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
The value error is caused by an `if not seq` check in random.choice (line 369) that was introduced in commit 3fee777. The check is clearly intended to test for emptiness of the sequence and would raise an error if the sequence is empty. With numpy arrays (even non-empty ones) it now causes the ValueError when trying to cast the array to a boolean.

The issue is easily reproducible with the following code:
The code worked on earlier versions, but fails in Python 3.11.
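The failure mode described in this report can be sketched without NumPy installed. This is a minimal illustration, not the actual CPython source: `ArrayLike` is a hypothetical stand-in that mimics only the relevant NumPy behaviour (sized, indexable, but with an ambiguous truth value), and `choice_truthiness` and `choice_len` are hypothetical helpers that contrast a truthiness-based emptiness test with a length-based one.

```python
import random


class ArrayLike:
    """Hypothetical stand-in for a NumPy array: sized and indexable,
    but bool() on it raises ValueError, as it does for multi-element arrays."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __bool__(self):
        raise ValueError(
            "The truth value of an array with more than one element is ambiguous."
        )


def choice_truthiness(seq):
    # Emptiness test in the style the report describes: `not seq` calls bool(seq).
    if not seq:
        raise IndexError("Cannot choose from an empty sequence")
    return seq[random.randrange(len(seq))]


def choice_len(seq):
    # Emptiness test via len(), which never consults __bool__.
    if len(seq) == 0:
        raise IndexError("Cannot choose from an empty sequence")
    return seq[random.randrange(len(seq))]


data = ArrayLike([1, 2, 3])

try:
    choice_truthiness(data)
    truthiness_error = None
except ValueError as exc:
    # The truthiness check fails even though the container is non-empty.
    truthiness_error = exc

picked = choice_len(data)  # succeeds: only len() and indexing are used
print(type(truthiness_error).__name__, picked in (1, 2, 3))
```

Both helpers still raise IndexError for an empty plain list, so the two checks differ only for objects whose `__bool__` does not follow the usual "empty means false" convention.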
The problem could be avoided by checking for emptiness using `if len(seq) == 0`. I understand that `if not seq` checks are relatively common and many people prefer the conciseness, but numpy compatibility and backwards compatibility seem important to me in this context.

Tested on:
Linked PRs