WIP: Remove 0d/1d special-casing from _numdiff #11158
Conversation
Force-pushed from 3668c60 to c77eca9
Dependent modules now fail with this new interface, which I expected - will come back to this later
if m == 1:
    J_transposed = np.ravel(J_transposed)
This is now handled by ret.reshape(f0.shape + x0.shape) instead.
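A minimal sketch (not the actual _numdiff code; the helper name is hypothetical) of why a single reshape rule subsumes the `m == 1` ravel special case: the Jacobian of f at x0 naturally has shape f0.shape + x0.shape, which collapses to () for scalar-to-scalar and to the usual matrix shape for vector-to-vector.

```python
import numpy as np

def jacobian_shape(f0, x0):
    # Hypothetical helper: the shape the flat result takes after
    # the single ret.reshape(f0.shape + x0.shape) step.
    return np.shape(f0) + np.shape(x0)

# scalar -> scalar: a 0-d Jacobian, no ravel special case needed
assert jacobian_shape(1.0, 2.0) == ()
# R^2 -> R^3: the usual (3, 2) Jacobian matrix
assert jacobian_shape(np.zeros(3), np.zeros(2)) == (3, 2)
# scalar -> R^3 (the fun_scalar_vector case below): a plain length-3 derivative
assert jacobian_shape(np.zeros(3), 2.0) == (3,)
```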
def fun_scalar_vector(self, x):
    return np.array([x[0]**2, np.tan(x[0]), np.exp(x[0])])
Note: this changed because the caller passed x0 as a scalar, but then expected it to be an array here. Since the function name is scalar, it seems to me that x should be a scalar here anyway, so this is an improvement.
Please don't make changes to the existing tests at the same time as changing the code, if the aim of the changes is to keep existing code working. Here it's a bit less important than elsewhere because IIRC _numdiff.py is not public API, but it would still be better to separate the changes, at least into different commits.
I think there are two things mixed in here that are separate: (i) cleaning up the dimension handling in numdiff, and (ii) dealing with possible(?) future numpy changes where it will start emitting warnings.
I'm only looking at (i) here, (ii) was just the catalyst for spotting that the dimension handling was a little clumsy.
> Here it's a bit less important than elsewhere because IIRC _numdiff.py is not public API

That was my hope too. I can't clean up _numdiff.py without changing the tests, because the tests are testing for the design flaws I'm trying to remove, and I don't think there's value in an intermediate commit that does not pass CI.
Right, however, numdiff is called on user-provided functions in the optimizers, and for that looking at the tests, it is not clear whether it behaves the same now as before. There seem to be some behavior changes in this PR, so it's not just internal cleanup.
This allows a scalar domain without identifying it with a 1D vector space, so I really like the change. (The 1D vector case is still possible, right?)
Force-pushed from c77eca9 to 8dd2c90
Yes, and it is now distinct from the scalar case. What is no longer possible with this patch is calling a vector-taking function with a scalar.
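A small illustration of that distinction (the wrapper name is hypothetical, not the PR's code): when the internal flat step vector is reshaped back to x0's original shape, a scalar x0 hands the user function a 0-d value, while a length-1 x0 hands it a 1-d array.

```python
import numpy as np

def call_preserving_shape(fun, x0, x_flat):
    # reshape the internal flat vector back to x0's declared shape
    return fun(x_flat.reshape(np.shape(x0)))

# scalar x0: fun receives a 0-d value, so `x**2` works (but `x[0]` would fail)
assert call_preserving_shape(lambda x: x**2, 0.5, np.array([3.0])) == 9.0
# length-1 x0: fun receives a 1-d array, so indexing works
assert call_preserving_shape(lambda x: x[0]**2, [0.5], np.array([3.0])) == 9.0
```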
orig_fun = fun
orig_x0 = x0
def fun(x, *args):
    return orig_fun(x.reshape(orig_x0.shape), *args)
Although nice, this is not fully backward-compatible, because previously a scalar x0 was cast to a 1-d array and passed as such to the function.
Once I get this passing, I was considering changing this to:
needs_deprecated_conversion = None if x0.shape == () else False

def fun(x, *args):
    nonlocal needs_deprecated_conversion
    if needs_deprecated_conversion is True:
        return orig_fun(x.ravel(), *args)
    elif needs_deprecated_conversion is False:
        return orig_fun(x.reshape(orig_x0.shape), *args)
    else:
        try:
            f = orig_fun(x.reshape(orig_x0.shape), *args)
            needs_deprecated_conversion = False
            return f
        except Exception:
            try:
                f = orig_fun(x.ravel(), *args)
                emit_warning_to_change_x0_to_be_scalar()
                # the legacy 1-d conversion is what worked, so record that
                needs_deprecated_conversion = True
                return f
            except Exception:
                pass  # the function is just broken; reraise the outer error
            raise
The fix in general is to convert calls to minimize(f=lambda x: x[0], x0=0) to either minimize(f=lambda x: x[0], x0=[0]) or minimize(f=lambda x: x, x0=0), both of which seem like good things to encourage the caller to do.
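The two recommended rewrites can be sketched without calling minimize itself; this just shows which objective shape pairs with which x0 shape:

```python
import numpy as np

# Deprecated pattern: scalar x0 with an objective that indexes x,
# relying on the silent atleast_1d upcast.
f_old = lambda x: x[0]**2

# Option 1: keep the indexing objective, make x0 explicitly a 1-element array.
assert f_old(np.atleast_1d(0.5)) == 0.25

# Option 2: keep the scalar x0, write the objective on a scalar.
f_scalar = lambda x: x**2
assert f_scalar(0.5) == 0.25
```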
TBH, the try: ... except: ... thing looks ugly. I'd prefer to just do x0 = np.atleast_1d(x0) at the beginning. I think this particular point is not important for the numdiff changes?
Arguably none of the numdiff changes are really needed here - my original plan was to add the reshape wrapping at that level, but it quickly became apparent that wasn't going to work.
One of the motivations of this patch was to remove / deprecate the weird atleast_1d behavior that was causing @nschloe all the pain in the first place.
Collecting some thoughts:
Most of the optimizers actually do atleast_1d(x0).flatten(), so the dim>1 generalization is also a behavior change that looks a bit annoying to pull through. Again, legacy from the 2000s... It looks like the change here would then necessarily be a hard backward-compat break.
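For reference, the flattening most optimizers apply to x0 today can be demonstrated directly, which shows why generalizing to dim > 1 would be a compat break:

```python
import numpy as np

# A scalar x0 becomes a length-1 vector...
assert np.atleast_1d(0.0).flatten().shape == (1,)

# ...and any higher-dimensional x0 is flattened, discarding its shape,
# so the user function only ever sees a 1-d array today.
x0 = np.zeros((2, 2))
assert np.atleast_1d(x0).flatten().shape == (4,)
```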
Generally, minimize() follows the usual literature picture, where you minimize over a vector of parameters: x is a vector, a scalar x0 gets upcast to a 1-element one, and the objective function returns something castable to a scalar. For minimizing over just one parameter, I'm not so sure people are consistent between def f(x): return x**2 and having x0 be scalar when calling minimize(), because it's a distinction that does not matter currently, so the proper dimension generalization of x0 probably won't help with the backward-compat issue.
With minimize() objective functions, I think the intent is that any scalar-castable return values (as in float(y)) are ok. We probably should just make sure that any internal usage of minimize() is clean; old tests should either get warnings silenced or a separate cleanup PR, and new compatibility tests should be added. For sloppy user code, if NumPy (numpy/numpy#10404) starts emitting warnings, that's probably OK, and enough. If NumPy plans to remove the old behavior, then, because there's likely no small amount of def f(x): return x**2-style scalar minimization code written, I think it would be better if minimize() is also changed to do the scalar cast as early as possible, and to emulate the old behavior plus a warning as necessary.
Then we have to similarly check whether the proposed change numpy/numpy#10404 in NumPy would break some of the other APIs in a similar way. For most, it probably makes sense to just let the scalar-cast warnings bubble up, so probably not many changes are needed.
Since numdiff is not public, I think the proper dimension-generalization API changes can be done there, and minimize probably needs to wrap it so that the scalar cast is done first. It may also be useful to check whether it is used for computing Hessians...
Before too much further work is done on this I'd prefer if #10673 could be merged. There will be merge clashes, and resolving those clashes would require a lot of further work and re-review of what has already been reviewed. (pinging @ev-br and @rgommers because they were involved with looking at that PR)
@andyfaff: sure, this isn't particularly high priority; I'm fine with not merging this until that concludes. Left a review too.
Is there an easy way to rerun CI with the just-merged numpy/numpy#15022? Is there a job that builds off master?
Maybe the "Draft" status of the PR is blocking the Travis/Azure runs? I think I've seen that before. There should be two matrix entries in the Travis CI run that pull in recent NumPy: technically, I think it is the nightly NumPy wheels that get built for Linux, but the PR you reference is from 4 days ago, so it is likely that numpy-wheels has produced a binary with that included by now. I can add a "needs-work" label if you want to lift the draft status/reopen to see if you can flush it through Travis, etc.
Based on this comment, it sounds like this needs some work. It's been inactive for a while, and the author did not consider it high priority, so I'll go ahead and close it to help us focus on the PRs that need review. But if anyone would like to revive this (e.g. for gh-14946), please feel free to reopen! |
I don't have a BLAS setup on my machine, so trying this out in CI here. Tips for developing scipy on Windows would be appreciated.
WSL was reasonably painless.
Related to #11147