🏷️ ufunc annotations for `cbrt`, `deg2rad`, `degrees`, `fabs`, `rad2deg`, `radians` #373

guan404ming · 2025-03-22T12:55:47Z

Toward #230

jorenham

These ufuncs seem to handle object_ dtypes differently from the others.

If you look at this, you might think that object dtypes are not supported:

>>> np.cbrt(8, dtype=np.object_)
AttributeError: 'int' object has no attribute 'cbrt'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<python-input-23>", line 1, in <module>
    np.cbrt(8, dtype=np.object_)
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^
TypeError: loop of ufunc does not support argument 0 of type int which has no callable cbrt method

Directly passing an object array doesn't work either:

>>> np.cbrt(np.array(8, dtype=np.object_))
AttributeError: 'int' object has no attribute 'cbrt'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<python-input-25>", line 1, in <module>
    np.cbrt(np.array(8, dtype=np.object_))
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: loop of ufunc does not support argument 0 of type int which has no callable cbrt method

This seems to contradict cbrt.types, which explicitly contains the O->O signature:

>>> np.cbrt.types
['e->e', 'f->f', 'd->d', 'e->e', 'f->f', 'd->d', 'g->g', 'O->O']

What's going on here?

But if we take a closer look at the message, we see that it mentions a "callable cbrt method". This made me suspect that for object_ input, the ufunc simply tries to call x.cbrt().

A simple experiment confirms that this is indeed what's going on:

>>> class Cube:
...     def __init__(self, volume):
...         self._size = volume**(1/3)
...     def cbrt(self):
...         return self._size
... 
>>> np.cbrt(Cube(8))
2.0

How about the other ufuncs?

It's pretty much the same story: passing object_ input to fabs calls x.fabs(), passing it to degrees calls x.degrees(), etc.

But, can we type that?

We can. But it will be a lot of work, and it will be messy.

Taking cbrt as an example again, it would require a dedicated _Call11Cbrt protocol that, instead of x: _ArrayLikeObject, accepts x: _CanCbrt and returns NDArray[object_]. The _CanCbrt type is a new protocol definition:

@type_check_only
class _CanCbrt(Protocol):
    def cbrt(self) -> object: ...
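To make the structural-typing idea concrete, here is a minimal runtime sketch. The names CanCbrt, Cube and satisfies_can_cbrt are hypothetical, and @runtime_checkable is swapped in for @type_check_only purely so that the check can be executed; the stubs themselves would not do this.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable  # assumption: only here so isinstance() works at runtime
class CanCbrt(Protocol):
    def cbrt(self) -> object: ...


class Cube:
    """Toy type with a cbrt method; it stores its side length directly."""

    def __init__(self, side: float) -> None:
        self.side = side

    def cbrt(self) -> float:
        return self.side


def satisfies_can_cbrt(x: object) -> bool:
    # A runtime-checkable protocol only checks that the method exists.
    return isinstance(x, CanCbrt)


print(satisfies_can_cbrt(Cube(2.0)))  # True
print(satisfies_can_cbrt(8))          # False
```

A plain int fails the check for the same reason it fails at the ufunc level: it has no cbrt method.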

We have 6 ufuncs, and this solution requires us to write 2 protocols for each ufunc; that's 12 protocols in total.
That would become a big mess, with lots of code duplication.

Can we do better?

Unfortunately, the _Can{Cbrt,Fabs,Degrees,...} protocols are unavoidable. But luckily, each one only requires 3 lines of code (+ 1 blank line) to define. The price we have to pay for them will be a total of 4 * 6 = 24 lines of code. That seems reasonable.

But the remaining 6 callable protocols are the main issue. Most of their overloads would be identical to one another, so we'd get a lot of code duplication.
But if we're smart about it, we could avoid all of that, and instead define a single protocol to rule them all.
It would look something like this:

_T_contra = TypeVar("_T_contra", contravariant=True)

class _Call11FloatObject(Protocol[_T_contra]):
    # <insert floating scalar overloads>
    @overload
    def __call__(
        self,
        x: _T_contra,
        /,
        out: None = None,
        *,
        dtype: _DTypeLike[np.object_] | None = None,
    ) -> Any: ...
    # <insert floating array overloads>
    @overload
    def __call__(
        self,
        x: _ArrayLikeObject_co | _NestedSequence[_T_contra],
        /,
        out: None = None,
        *,
        dtype: _DTypeLike[np.object_] | None = None,
    ) -> NDArray[np.object_]: ...
    # <insert fallback overload>

For cbrt, for example, you would use it as _Call11FloatObject[_CanCbrt].
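The "one generic protocol, specialised per ufunc" idea can be sketched without NumPy. Everything below is a simplified stand-in: Call11Object covers only the scalar object overload, and method_caller is a hypothetical helper that mimics the object loop by calling the method of the same name.

```python
from typing import Any, Callable, Protocol, TypeVar

_T_contra = TypeVar("_T_contra", contravariant=True)


class CanCbrt(Protocol):
    def cbrt(self) -> object: ...


class CanFabs(Protocol):
    def fabs(self) -> object: ...


class Call11Object(Protocol[_T_contra]):
    # Stand-in for the scalar object overload only; it returns Any.
    def __call__(self, x: _T_contra, /) -> Any: ...


def method_caller(name: str) -> Callable[[Any], Any]:
    """Hypothetical helper: mimic the object loop by calling x.<name>()."""

    def call(x: Any, /) -> Any:
        return getattr(x, name)()

    return call


# One generic protocol, specialised per function.
cbrt: Call11Object[CanCbrt] = method_caller("cbrt")
fabs: Call11Object[CanFabs] = method_caller("fabs")


class Cube:
    def __init__(self, side: float) -> None:
        self.side = side

    def cbrt(self) -> float:
        return self.side


class Signed:
    def __init__(self, value: float) -> None:
        self.value = value

    def fabs(self) -> float:
        return abs(self.value)


print(cbrt(Cube(2.0)))     # 2.0
print(fabs(Signed(-1.5)))  # 1.5
```

With these annotations, a type checker should accept cbrt(Cube(2.0)) and reject cbrt(Signed(-1.5)), while the return type stays Any in both cases.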

What did those 6 overloads get us?

We managed to precisely describe the allowed object-like input, in the case of scalars. But because we use a generic protocol, and Python does not have support for higher-kinded types (HKT), we cannot annotate the return type, and are forced to use Any.

For object_ array-like input, this did not help at all. That is because np.object_ is not generic, so we don't know what the type of the underlying object is. We therefore can't write x: _ArrayLike[np.object_[_CanCbrt]], and _ArrayLikeObject_co is the best we can do.
But we did manage to additionally accept nested sequences of object-like "things", because we know that these "things" are of type _T_contra, which for e.g. cbrt we set to _CanCbrt.
However, the problem with the return types for scalar-like input also applies here. Therefore the best we can do is to return -> NDArray[object_].

Wait... will people actually use this?

Honestly, I think that this is a very unlikely use-case. I wouldn't be surprised if only one in a million users would rely on it.

However, NumPy has many users. The exact number is unknown, but I think it's safe to say that it lies somewhere between 10**7 and 10**8.

If we combine these estimates, then we reach the conclusion that we'd be helping between 10 and 100 people with this (given that our assumptions are actually true).
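The arithmetic behind that range, written out as a quick check. The inputs are the rough guesses from the comment, not measured data.

```python
ONE_IN = 10**6                        # "one in a million" users
users_low, users_high = 10**7, 10**8  # guessed range for the number of NumPy users

affected_low = users_low // ONE_IN
affected_high = users_high // ONE_IN

print(affected_low, affected_high)  # 10 100
```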

So yes, it's likely that people will actually use this. But it'll only be a few. There are probably other problems that can be solved in numtype that will help many more, and won't require as much of our time.

So what do we do?

Personally, I think I would just ignore the object_ overloads. Only a very small subset of object-like input is allowed, and there are many alternatives to using the ufuncs this way. I would also leave a # NOTE in the code that mentions that there exist some legal object-like input types, but that they're not supported. I'd also reference this PR.

By cheating this way, we effectively pretend that the signature isn't {efdgO}->$1, but {efdg}->$1. That's exactly equivalent to spacing!
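For readers unfamiliar with the shorthand, the sketch below expands it into the strings that ufunc.types reports. The expand_signature helper is hypothetical, and it assumes one reading of the notation: the braces list the input type characters, and $1 repeats whichever one was chosen.

```python
import re


def expand_signature(shorthand: str) -> list[str]:
    """Expand e.g. '{efdgO}->$1' into ['e->e', 'f->f', ...].

    Assumed reading: '{...}' lists type characters, '$1' repeats the chosen one.
    """
    match = re.fullmatch(r"\{(\w+)\}->\$1", shorthand)
    if match is None:
        raise ValueError(f"unsupported shorthand: {shorthand!r}")
    return [f"{char}->{char}" for char in match.group(1)]


with_object = expand_signature("{efdgO}->$1")
without_object = expand_signature("{efdg}->$1")

print(with_object)     # ['e->e', 'f->f', 'd->d', 'g->g', 'O->O']
print(without_object)  # ['e->e', 'f->f', 'd->d', 'g->g']
print(set(with_object) - set(without_object))  # {'O->O'}
```

The only loop that the cheat drops is O->O, which is the one that dispatches to a method on the object.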

So with this (imho very reasonable) cheat, we can avoid having to write 7 new protocols (or 8 if you also consider the .at method).

But that being said, I'll leave it up to you what you want to do here.

guan404ming · 2025-03-22T16:25:09Z

Thank you for the clarification! Actually, I’ve been investigating this issue myself over the past hour. When I tried to add tests related to dtype=np.object_, I came across the same problem.

Regarding the issue, I believe that, rather than adding extra complexity to the code just to handle these edge cases (which might be used by only a small number of people), handling it like spacing and leaving a note is a more reasonable tradeoff.

I'll make a change to keep things consistent with spacing and add a note referencing the comments you provided.


jorenham · 2025-03-22T16:31:05Z

Regarding the issue, I believe that, rather than adding extra complexity to the code just to handle these edge cases (which might be used by only a small number of people), handling it like spacing and leaving a note is a more reasonable tradeoff.

And of course, we can always reconsider if someone opens an issue for it.

guan404ming · 2025-03-22T16:52:11Z

@jorenham I found that [arc]{cos,sin,tan}[h] also have this kind of issue.
Should we handle them like we discussed above?

>>> np.cos(10, dtype=np.object_)
AttributeError: 'int' object has no attribute 'cos'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<python-input-7>", line 1, in <module>
    np.cos(10, dtype=np.object_)
    ~~~~~~^^^^^^^^^^^^^^^^^^^^^^
TypeError: loop of ufunc does not support argument 0 of type int which has no callable cos method

jorenham · 2025-03-22T17:22:04Z

I found that [arc]{cos,sin,tan}[h] also have this kind of issue

Good catch!

Should we handle them like we discussed above?

That's probably for the best then. Types with a .cos method are also pretty uncommon, at least in numpy and the python stdlib.

jorenham · 2025-03-22T17:24:02Z

Thanks, Guan-Ming

guan404ming changed the title ~~🏷️ add protocol for {[f]O} -> $1~~ 🏷️ ufunc annotations for cbrt, deg2rad, degrees, fabs, rad2deg and radians Mar 22, 2025

guan404ming changed the title ~~🏷️ ufunc annotations for cbrt, deg2rad, degrees, fabs, rad2deg and radians~~ 🏷️ ufunc annotations for cbrt, deg2rad, degrees, fabs, rad2deg, radians Mar 22, 2025

jorenham mentioned this pull request Mar 22, 2025

Annotating the ufuncs #230

Open

jorenham added stubs: Enhancement numpy.ufunc labels Mar 22, 2025

jorenham added this to the v2.2.x.0 milestone Mar 22, 2025

guan404ming force-pushed the 1-to-1-float-obj branch from edaa930 to 0c261fd Compare March 22, 2025 14:54

jorenham reviewed Mar 22, 2025


🏷️ ufunc annotations for cbrt, deg2rad, degrees, fabs, `rad2d…

059c896

…eg`, `radians`

guan404ming force-pushed the 1-to-1-float-obj branch from 0c261fd to 059c896 Compare March 22, 2025 16:30

jorenham self-requested a review March 22, 2025 16:31

jorenham approved these changes Mar 22, 2025


jorenham merged commit 3e87a93 into numpy:main Mar 22, 2025
21 checks passed

guan404ming deleted the 1-to-1-float-obj branch March 22, 2025 17:37
