Non-deterministically corrupted results from np.dot (conditional on large imports) #12394
Comments
Hmm, OpenBLAS 0.3.1 is known to be buggy, which is why we don't use it. Where did you get NumPy?
Thanks a lot! I'm using the nix package manager. The latest stable release of packages (18.09) happened to have OpenBLAS 0.3.1 as the default version (upstream is already at 0.3.3). I created an issue there in which I argue that by default nix should package numpy/scipy using exactly the same versions of dependencies as you do in your binary builds. Do I understand correctly that you are building numpy
Yes, we built the wheels with 0.3.0. See the comments in MacPython/numpy-wheels@cd53070. Note that we have discovered that 0.3.0 is not thread safe, so we will be looking for 0.3.4.
See OpenMathLib/OpenBLAS#1851 for the 0.3.0 threading problem.
See also OpenMathLib/OpenBLAS#1844.
Thanks. Apparently it's indeed multithreading-related. I get no errors if I force OpenBLAS to use only one thread. I don't think seaborn creates any threads so it's not clear why import matters.
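One common way to force OpenBLAS onto a single thread is sketched below. The comment above does not say which mechanism was actually used, so treat this as an assumption: it relies on the `OPENBLAS_NUM_THREADS` environment variable, which OpenBLAS reads when the library is loaded, and the array sizes are purely illustrative.

```python
import os

# Assumption: the thread limit is applied through the environment variable.
# It must be set before NumPy (and with it OpenBLAS) is first imported,
# otherwise the library has already sized its thread pool.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np

# Illustrative sizes only; the thread does not state the original dimensions.
A = np.asfortranarray(np.random.randn(200, 200))
b = np.random.randn(200)

# With a single BLAS thread the threading bug discussed here should not trigger.
result = np.dot(A, b)
print(result.shape)
```

This only works around the problem for the current process; it does not fix the underlying OpenBLAS build.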
You could try applying the diff from the PR referenced in the linked issue, which is supposed to fix it. The code being modified is quite old, so the diff should apply cleanly to many versions, even ones that are not so recent. The fix is due for 0.3.4, if you can wait a bit and upgrade the binary.
@brada4 What sort of schedule are you looking at for 0.3.4? I'm looking to delay NumPy 1.16 until it comes out.
@martin-frbg has a plan exactly because of this issue:
I put a 1.16 milestone on this for tracking purposes.
I think this has been fixed in OpenBLAS 0.3.4, so removing the milestone. @ilya-kolpakov you can test this now by downloading the latest numpy wheel builds from https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com/, look for files beginning
Considering that this is an OpenBLAS issue from more than half a year ago, which apparently was fixed upstream, I am closing this issue.
I am getting randomly corrupted results from np.dot if the first argument is an array of doubles with FORTRAN layout. The behavior is non-deterministic and does not occur unless I import a large module beforehand (e.g. seaborn or theano does the trick).
Reproducing code example:
prints
The error reliably occurs for matrices of larger sizes.
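The kind of check described above can be sketched as follows. This is a hypothetical reconstruction under stated assumptions, not the reporter's snippet: the helper name `count_dot_mismatches`, the matrix size, the trial count, and the seed are all made up for illustration. It multiplies a Fortran-ordered array of doubles by a vector with `np.dot` several times and compares each result against a reference computed without BLAS.

```python
import numpy as np


def count_dot_mismatches(n=500, trials=20, seed=0):
    """Count how often np.dot on a Fortran-ordered matrix disagrees
    with a reference result (hypothetical helper, illustrative sizes)."""
    rng = np.random.RandomState(seed)
    A = rng.randn(n, n)
    b = rng.randn(n)

    # Same data, Fortran (column-major) memory layout, dtype float64.
    Af = np.asfortranarray(A)

    # Reference matrix-vector product via elementwise multiply and sum,
    # which does not go through the BLAS gemv routine.
    reference = (A * b).sum(axis=1)

    mismatches = 0
    for _ in range(trials):
        # Repeat the call because the reported corruption is non-deterministic.
        if not np.allclose(np.dot(Af, b), reference):
            mismatches += 1
    return mismatches


print("mismatching trials:", count_dot_mismatches())
```

On an unaffected installation the count should be zero. Per the report, a large import such as seaborn or theano before the check was needed to trigger the failure on the affected build.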
Examples of corrupted results (random data)
If the failure occurs and I evaluate np.dot(Ac, b) a few times, the output is non-deterministic and the errors look like this:

Examples of corrupted results (structured data)
If I replace the normal random generator with linspace:
I get error plots looking like this:

Numpy/Python version information:
Since I am using numpy as packaged for the nix package manager, it is easy to reproduce exactly the same Python environment on a different machine. I have not done this yet.
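For anyone comparing environments, version details of this kind can be collected roughly as shown below. The helper name `environment_summary` is hypothetical; `np.show_config()` is the NumPy call that reports which BLAS/LAPACK libraries the build was linked against, which is the relevant detail in this thread.

```python
import sys

import numpy as np


def environment_summary():
    """Return NumPy and Python versions (hypothetical helper name)."""
    return {
        "numpy": np.__version__,
        "python": sys.version.split()[0],
    }


print(environment_summary())

# Prints build information, including the BLAS/LAPACK libraries in use.
np.show_config()
```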