Skip to content

Add clip #482

Closed
Closed
@jakirkham

Description

@jakirkham

The clip function is quite handy for constraining data to a certain range. Would be nice to include this in the spec

Activity

rgommers

rgommers commented on Sep 23, 2022

@rgommers
Member

gh-187 says clip is universally implemented by libraries. This looks to me like a useful function to add. IIRC we had some issues with clip behavior in numpy, we should dig those up to check for possible semantics issues.

rgommers

rgommers commented on Oct 5, 2022

@rgommers
Member

IIRC we had some issues with clip behavior in numpy, we should dig those up to check for possible semantics issues.

This wasn't too bad actually, a lot of it was segfaults and whether it was a ufunc or not. The relevant ones:

And of course we need to choose keyword names, always a hard problem. Existing signatures don't match:

  • np.clip(a, a_min, a_max, out=None, **kwargs)
  • torch.clip(input, min=None, max=None, *, out=None)

No name is ideal - min/max are nicer names, but not everyone may be happy that it conflicts with builtin names. a_min and a_max are bad though, given that we consistently name our input arrays x and not a. So how about:

def clip(x, min, max, /):  # all positional-only args to avoid incompatibilities
shoyer

shoyer commented on Oct 5, 2022

@shoyer
Contributor

No name is ideal - min/max are nicer names, but not everyone may be happy that it conflicts with builtin names. a_min and a_max are bad though, given that we consistently name our input arrays x and not a. So how about:

def clip(x, min, max, /):  # all positional-only args to avoid incompatibilities

I would lean towardsclip(x, /, min=None, max=None). The naming conflict is only inside the array library, not in user code, which can call these variables whatever it likes.

Side note: this is yet another example (like unstack and moveaxis) of functionality that could perhaps have a universal default implementation terms of other array API functionality (specifically where).

rgommers

rgommers commented on Oct 5, 2022

@rgommers
Member

The naming conflict is only inside the array library, not in user code, which can call these variables whatever it likes.

Only if they're positional, right?

I actually do like your signature better - clip(x, max=3) is nicer than having to write clip(x, None, 3). It's just another thing then that we have to deal with when we aim to converge NumPy's main namespace to the array API standard. But I guess it's not too bad, there's already **kwargs in np.clip, so using aliases is easy.

shoyer

shoyer commented on Oct 5, 2022

@shoyer
Contributor

The naming conflict is only inside the array library, not in user code, which can call these variables whatever it likes.

Only if they're positional, right?

I only get lint errors for overriding builtins if I define variables with the same name as a builtin. Something like clip(x, max=3) is fine -- the lint error is defining a variable named max, not for using a keyword argument.

clip(x, max=max) would be problematic, but there's nothing in the interface that requires calling the variable max. Users can write something like clip(x, max=a_max) if desired.

jakirkham

jakirkham commented on Oct 5, 2022

@jakirkham
MemberAuthor

cc @seberg (in case you have thoughts here)

seberg

seberg commented on Oct 6, 2022

@seberg
Contributor

I agree that clip is clear enough that it seems fine to allow passing positional and min, max seems OK (no strong opinions on it though).
I would lean against a_min, etc. even if that is what NumPy has. The alternative I may seriously consider is lower and upper (I have not quite made up my mind whether that is the better or worse name for bounds – mathematically speaking).

rgommers

rgommers commented on Oct 7, 2022

@rgommers
Member

Questions that were brought up yesterday:

  • Should the min and max inputs participate in determining the output dtype, or should the output dtype equal the input dtype?
    • Tentative answer: == input_dtype
  • Are min and max allowed to be arrays, or only scalars?
    • Tentative answer: yes, because otherwise there's no way to clip along an axis.
added this to the v2023 milestone on Jun 29, 2023
oleksandr-pavlyk

oleksandr-pavlyk commented on Oct 25, 2023

@oleksandr-pavlyk
Contributor

if min and max do not participate in the type promotion, this makes clip not equivalent to xp.maximum(xp.minimum(x, min), max) in functional behavior.

In NumPy both clip's min and max arguments participate in the type promotion:

In [3]: np.clip(np.arange(10, dtype=np.int8), 0, np.inf)
Out[3]: array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
seberg

seberg commented on Oct 25, 2023

@seberg
Contributor

I don't want to volunteer it, but it shouldn't be hard to change that in NumPy or fix in a compat namespace. And yes, it will mean that it isn't the same as a chained minimum/maximum call.

3 remaining items

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    API extensionAdds new functions or objects to the API.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      Participants

      @seberg@rgommers@shoyer@kgryte@jakirkham

      Issue actions

        Add `clip` · Issue #482 · data-apis/array-api