Skip to content

MAINT: avoid nested asarray calls #22947

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 8, 2025
Merged

Conversation

ev-br
Copy link
Member

@ev-br ev-br commented May 8, 2025

fixes #22945

cc @crusaderky

@ev-br ev-br requested a review from steppi as a code owner May 8, 2025 12:05
@ev-br ev-br added the maintenance Items related to regular maintenance tasks label May 8, 2025
@ev-br ev-br force-pushed the fix_nested_asarrays branch from 653c22d to 130873c Compare May 8, 2025 12:24
@lucascolley lucascolley removed the request for review from steppi May 8, 2025 14:55
@lucascolley lucascolley added this to the 1.16.0 milestone May 8, 2025
@@ -8237,7 +8237,7 @@ def test_monotonicity(self, variant, method, xp):
pvaluess = xp.sort(xp.asarray(rng.uniform(0, 1, size=(m, n))), axis=0)

combined_pvalues = xp.asarray([
stats.combine_pvalues(pvaluess[i, :], method=method)[1]
float(stats.combine_pvalues(pvaluess[i, :], method=method)[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Pdb) i=0
(Pdb) p stats.combine_pvalues(pvaluess[i, :], method=method)[1]
Array(0.39087839, dtype=array_api_strict.float64)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that looks like a bug 🐛

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, from where I stand it seems to make sense to merge this PR to fix gh-22945 and improve at leisure.
Feel free to take over or close this and fix the combine_pvalues bug instead (no idea how disruptive it will be, probably tolerable).

To me, the main takeaway is that the current main of array-api-strict:

Which means there's no need to revert data-apis/array-api-strict#147

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function returns a np.float64 on numpy, which is a subclass of float.
However, res[()] is a no-op on all other backends.
When it comes to lazy backends, they cannot return a float as that would materialize the graph.

There are two options here:

  1. change the function to return float(res) if res.shape == () and not is_lazy_array(res) else res. Update the documentation to clarify this specific quirk of lazy backends.
  2. Just return res[()]. Update the documentation to clarify that it's a float when working in numpy, and a 0d array otherwise.

I personally prefer 2 by far.

Copy link
Contributor

@mdhaber mdhaber May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but I'm assuming that we're looking for technicalities, since we're ignoring the glaring fact that concrete generic subclasses don't provide any behavior for >0 dimensions, so they are not usable as arrays in any practical sense.

The much more practical limitation, and the reason that NumPy scalars are not acceptable in place of real NumPy arrays, is that instances do not support __setitem__, but I guess that is also ignored if we consider JAX arrays to be arrays. (I don't think __setitem__ actually excuses immutable types, though. If immutable arrays are excused elsewhere in the standard, it probably deserves mention in __setitem__, too.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concrete generic subclasses don't provide any behavior for >0 dimensions.

Should they though? They implement all required behaviours when they interact with an ndarray with >0 dimensions, by returning a new ndarray.

Read-only arrays are explicitly catered for (with maturity issues, e.g. being currently unable to identify them).
JAX arrays are arrays.
NumPy arrays with flags.writeable == False are arrays.

Fun fact: read-only np.ndarrays currently fall foul the standard because their in-place op methods (__iadd__ etc.) raise, whereas they're expected to fall back to the default behaviour to cause the Python interpreter to swap out the object with the return value of the matching op (__add__ etc.). Again something that can be easily fixed upstream.

Copy link
Contributor

@mdhaber mdhaber May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that another implied rule of this game is that we ignore array type in = array type out. If this is not explicit somewhere, it is implied by the signatures if "array" in the input type description equals "array" in the output description. (Otherwise, is it ok for xp functions to accept yp type as input and return zp type output?)

They implement all required behaviours when they interact with an ndarray with >0 dimensions,

I think so, because:

Furthermore, a conforming implementation of the array API standard must support, at minimum, array objects of rank (i.e., number of dimensions) 0, 1, 2, 3, and 4 and must explicitly document their maximum supported rank N.

generic and its subclasses do not support this. ndarray does, but the two are unrelated except that both inherit from object IIUC.

Fun fact: read-only np.ndarrays currently fall foul the standard because their in-place op

I thought so, but they still fail for that reason, so I guess that is another answer to your question.

Copy link
Contributor

@crusaderky crusaderky May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no requirements for the Array classes of a namespace to be related though. Array is a Protocol.

Furthermore, a conforming implementation of the array API standard must support, at minimum, array objects of rank (i.e., number of dimensions) 0, 1, 2, 3, and 4 and must explicitly document their maximum supported rank N.

This quote doesn't say that all implementations of the array Protocol in the namespace must support arrays of rank 4, only that some must, and (I'm personally adding) you must not need to know if the specific object you have in your hands does. np.int64(1)[None, None] transparently returns a 2-dimensionl ndarray, as does np.reshape or np.broadcast_to applied to a generic.

Copy link
Contributor

@mdhaber mdhaber May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, so let's improve the standard to make all these things more clear, and let's make NumPy scalars (and other immutable arrays) actually compliant, if these fixes are indeed trivial.

Copy link
Member

@lucascolley lucascolley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow-up tracked in #22954. Thanks all!

@lucascolley lucascolley merged commit 694791a into scipy:main May 8, 2025
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Items related to regular maintenance tasks scipy.integrate scipy.ndimage scipy.stats
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TST: Nested arrays failing in array-api-strict git tip
4 participants