
[amss2] Lecture failed nightly build #130


Closed
najuzilu opened this issue Jun 1, 2020 · 16 comments · Fixed by #171
Labels
bug Something isn't working

Comments

@najuzilu
Contributor

najuzilu commented Jun 1, 2020

The amss2 lecture is failing our nightly build but my local coverage does not throw any errors:
amss2 -- pass -- 94.93s.

When I run the lecture in jupyter notebook no errors pop up but this warning is displayed
Screen Shot 2020-06-01 at 1 52 29 PM

The same warning appears on our website:
Screen Shot 2020-06-01 at 1 53 10 PM

The execution error from the nightly build is located on the chunk of code displayed above. This is the beginning of the report:

Exception: CellExecutionError("An error occurred while executing the 
following cell:\n------------------\nμ_grid = np.linspace(-0.09, 0.1, 100)
\n\nlog_example = CRRAutility()\n\nlog_example.transfers = True

The error message is

Exception: Positive directional derivative for linesearch
@mmcky mmcky added the bug Something isn't working label Jun 1, 2020
@mmcky
Contributor

mmcky commented Jun 1, 2020

@najuzilu are you using the same qe-lectures environment to replicate the nightly build software?

@najuzilu
Contributor Author

najuzilu commented Jun 1, 2020

@mmcky yes, it's the first thing I do before I run coverage locally.

@mmcky
Contributor

mmcky commented Jun 1, 2020

I guess I don't understand then why one would fail and the other doesn't.

We should be able to replicate the environment to get the same error messages. Hmm ...

@najuzilu
Contributor Author

najuzilu commented Jun 1, 2020

I'm not sure but I will check that no upgrade/update is included in any of the previous files executed. Locally, amss2 is the first file that gets executed whereas in the Nightly build, it's executed as the fourth file.

@mmcky
Contributor

mmcky commented Jun 1, 2020

That's a nice observation - thanks @najuzilu. Sounds like a good starting point to debug

@mmcky
Contributor

mmcky commented Jul 18, 2020

It looks like all of the amss lectures, 1 through 3, are failing. @najuzilu would you have time to review?

@najuzilu
Contributor Author

The failure seems to have been introduced during Nightly build #51 with the scipy upgrade from scipy=1.4.1 to scipy=1.5.0. Downgrading to scipy=1.4.1 fixes the issue for all three amss lectures.
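
A quick way to tell which side of that boundary an environment is on is to compare version tuples. The sketch below is only an illustration: the helper names `parse_version` and `affected_by_regression` are hypothetical, and it assumes 1.5.0 is the first affected release, as reported above.

```python
import re


def parse_version(text):
    """Return the first three numeric components, e.g. '1.5.0' -> (1, 5, 0)."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def affected_by_regression(scipy_version):
    """Assumption from this thread: scipy 1.5.0 and later trigger the failure."""
    return parse_version(scipy_version) >= (1, 5, 0)


for version in ("1.4.1", "1.5.0"):
    print(version, affected_by_regression(version))
```

In a real environment the string would come from `scipy.__version__` rather than a hard-coded literal.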

@mmcky
Contributor

mmcky commented Jul 21, 2020

I wonder why it's upgrading to scipy=1.5.0 when anaconda contains scipy=1.4.1

https://docs.anaconda.com/anaconda/packages/py3.7_linux-64/

@mmcky
Contributor

mmcky commented Jul 21, 2020

It must be coming from the conda-forge channel. @najuzilu can we make the amss code work with scipy=1.5.0?

@najuzilu
Contributor Author

@mmcky as I suspected, the amss.rst lecture fails due to a floating point issue. The program halts under these conditions:

x0 = [0.3764893 , 0.3262772 , 0.4764893 , 0.5262772 , 2.50455943, 2.50455943, 0., 0. ]
bounds=([ 0.,  0.,  0.,  0., -3.41077573, -3.41077573,  0.,  0.], [100., 100., 100., 100., 3.70946441, 3.70946441, 100., 100.])

The comparison 3.7094644090901125 > 3.709464409090112 evaluates to True rather than False, and that is when the exception ValueError: `x0` violates bound constraints is thrown. This happens with scipy>=1.5.0.
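
The comparison can be reproduced in plain Python. The sketch below uses the two values quoted above and places the larger one into an illustrative `x0` (the placement is an assumption made for demonstration, not the lecture's actual iterate). The `clamp_to_bounds` helper is hypothetical and shows one possible way to pull a starting point back inside its bounds; it is not the fix adopted in this thread.

```python
# The two values quoted in the comment above.
upper_bound = 3.709464409090112
candidate = 3.7094644090901125

# The literals differ by more than one unit in the last place,
# so they are distinct doubles and the strict comparison holds.
exceeds = candidate > upper_bound


def clamp_to_bounds(x, lower, upper):
    """Clamp each coordinate of x into [lower_i, upper_i] (hypothetical helper)."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


# Illustrative starting point: entries 4 and 5 sit just above the upper bound.
x0 = [0.3764893, 0.3262772, 0.4764893, 0.5262772, candidate, candidate, 0.0, 0.0]
lower = [0.0, 0.0, 0.0, 0.0, -3.41077573, -3.41077573, 0.0, 0.0]
upper = [100.0, 100.0, 100.0, 100.0, upper_bound, upper_bound, 100.0, 100.0]

x0_safe = clamp_to_bounds(x0, lower, upper)
print(exceeds, x0_safe[4] <= upper_bound)
```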

@mmcky
Contributor

mmcky commented Jul 24, 2020

thanks for diagnosing the issue @najuzilu -- so what do you think is the best approach to fix this for scipy>=1.5.0?

Do we need to update the code in the lecture, adjust the sensitivity of the bounds? Any ideas on best way forward on this.

@najuzilu
Contributor Author

I suspect the best approach would be to implement this fix in SciPy under slsqp_optmz.f somewhere here:
https://github.com/scipy/scipy/blob/3bf0af5312f6b6b82bef5fbc8ce1b802522b2d21/scipy/optimize/slsqp/slsqp_optmz.f#L2176-L2187.

@shizejin
Member

shizejin commented Dec 1, 2020

The bound constraint violation issue in amss2 was introduced by this commit. It looks like the scipy team wanted to transition to a different design for the code that approximates the Jacobian matrix in optimization, and for slsqp.py, which we are using in the lecture, L305 was replaced by L328-333. Somehow the new approximation method does not yield exactly the same numerical results as before, and that causes the problem we are now facing. This is a bug which we should report to the scipy team.

If we don't want to wait for it to be solved (I expect it will take quite a while before a fix is released), there is one way to circumvent this, which is to pass our own Jacobian approximation function to the optimization routine. It only takes a few lines and I tested that it is working on my machine. So this solves the ValueError: `x0` violates bound constraints issue.
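
For readers following along, a user-supplied Jacobian approximation might look roughly like the sketch below. It is a minimal forward-difference gradient, not the code from the lecture: the toy objective `objf`, the step size `eps`, and the body of `objf_prime` are all assumptions made for illustration.

```python
def objf(x):
    """Toy objective standing in for the lecture's objective (assumption)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def objf_prime(x, eps=1e-7):
    """Forward-difference approximation of the gradient of objf at x."""
    f0 = objf(x)
    grad = []
    for i in range(len(x)):
        x_step = list(x)      # copy so the caller's point is left untouched
        x_step[i] += eps      # perturb one coordinate at a time
        grad.append((objf(x_step) - f0) / eps)
    return grad


# The exact gradient of the toy objective at the origin is (-2, 4).
print(objf_prime([0.0, 0.0]))
```

A function of this shape could then be handed to the solver through the `fprime` argument of `scipy.optimize.fmin_slsqp`, so that SciPy does not fall back on its own finite-difference routine.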

I will investigate the remaining issue

Exception: Positive directional derivative for linesearch

It would be great if you have any idea about which version of scipy worked with this lecture nicely. That's going to be a very helpful starting point for me. @mmcky @najuzilu

@shizejin
Member

Hi team, I am just wondering if there is any info available about the versions of the packages involved in this notebook when it was still working, or any printed output of the final results so that I could have something to refer to when I am debugging? :)

@mmcky
Contributor

mmcky commented Dec 15, 2020

hey @shizejin -- @najuzilu put together an environment that ran this lecture and it is available here

That file suggests the main requirement is scipy=1.4.1

Hope that helps.

@shizejin
Member

Thank you so much @mmcky, this is a huge help! I will come back to you when I have some progress in fixing this.

shizejin added a commit that referenced this issue Dec 30, 2020
close #130

1. add jacobian approximation function `objf_prime` and pass it to the slsqp minimization routine
2. adjust the tolerance level of slsqp from 1e-12 to 1e-10
mmcky added a commit that referenced this issue Dec 30, 2020
* fix amss2 lecture failure

close #130

1. add jacobian approximation function `objf_prime` and pass it to the slsqp minimization routine
2. adjust the tolerance level of slsqp from 1e-12 to 1e-10

* update environment file

Co-authored-by: mmcky