Non-deterministically corrupted results from np.dot (conditional on large imports) #12394

Closed
@1pakch

I am getting randomly corrupted results from np.dot if the first argument is an array of doubles with FORTRAN layout.

The behavior is non-deterministic and does not occur unless I import a large module beforehand (e.g. seaborn or theano does the trick).
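For readers unfamiliar with the layout distinction, here is a minimal sketch of what "array of doubles with FORTRAN layout" means in practice. The helper name `make_layout_pair` and the fixed seed are illustrative assumptions, not part of the original report. It builds the same float64 matrix in C and Fortran order so the two np.dot paths can be compared on identical data.

```python
import numpy as np

def make_layout_pair(n_rows, n_cols=2, seed=0):
    """Return one float64 matrix in C order and in Fortran order, plus a vector.

    Hypothetical helper for illustration; the seed is fixed so runs are comparable.
    """
    rng = np.random.RandomState(seed)
    A_c = np.ascontiguousarray(rng.randn(n_rows, n_cols))  # row-major copy
    A_f = np.asfortranarray(A_c)                           # column-major copy
    b = rng.randn(n_cols)
    return A_c, A_f, b

A_c, A_f, b = make_layout_pair(8192)

# Same values and dtype, different memory layout.
print(A_f.dtype, A_c.flags['C_CONTIGUOUS'], A_f.flags['F_CONTIGUOUS'])
print(np.array_equal(A_c, A_f))

# On a healthy build the two layouts should give matching products.
print(np.allclose(np.dot(A_c, b), np.dot(A_f, b)))
```

If the last line prints False only for the Fortran-ordered input, that matches the symptom described here.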

Reproducing code example:

import numpy as np

random = dict(
    int = lambda shape: np.random.randint(low=0, high=10, size=shape),
    float = lambda shape: np.random.randn(*shape)
)

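def check_dot(A, b, attempts=10, eps=1e-1):
    # Reference result that does not go through np.dot
    expected = (A * b).sum(axis=1)
    # Repeat the call because the corruption is intermittent
    for _ in range(attempts):
        actual = np.dot(A, b)
        # eps is the relative tolerance (rtol); atol keeps its default
        if not np.allclose(expected, actual, rtol=eps):
            return False
    return True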

def test(n_cols=2):
    for n_rows in 2**np.arange(1, 16, 1):
        for type, f in random.items():
            A = f((n_rows, n_cols))
            b = f((n_cols,))
            for order in 'CF':
                Ac = np.copy(A, order=order)
                if not check_dot(Ac, b):
                    print(type, n_rows, order)
                    return Ac, b
    return None


assert test() is None
import seaborn # or import theano
Ac, b = test()

prints

float 8192 F

The error reliably occurs for matrices of larger sizes.

Examples of corrupted results (random data)

If the failure occurs and I evaluate np.dot(Ac, b) a few times

import matplotlib.pyplot as plt

n_eval = 4
# sharex/sharey are passed by keyword; newer matplotlib rejects them positionally
fig, axes = plt.subplots(n_eval, 1, sharex=True, sharey=True, figsize=(8, 1.5*n_eval))

expected = (Ac*b).sum(axis=1)
for ax in axes:
    actual = np.dot(Ac, b)
    ax.plot(actual - expected)

the output is non-deterministic and the errors look like this:
[Figure: errors-randn]
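The same check can be done without plotting. The sketch below is an illustration only (the function name `dot_diagnostics` is hypothetical, not from the original report): it evaluates np.dot several times, reports the largest absolute deviation from the elementwise reference for each evaluation, and reports whether the repeated results are bitwise identical.

```python
import numpy as np

def dot_diagnostics(A, b, n_eval=4):
    """Summarize repeated np.dot(A, b) calls against an elementwise reference.

    Returns (max_errors, repeatable): the largest absolute error of each
    evaluation, and whether all evaluations are bitwise equal to the first.
    """
    reference = (A * b).sum(axis=1)          # does not go through np.dot
    results = [np.dot(A, b) for _ in range(n_eval)]
    max_errors = [float(np.max(np.abs(r - reference))) for r in results]
    repeatable = all(np.array_equal(results[0], r) for r in results[1:])
    return max_errors, repeatable

# Tiny Fortran-ordered example with exactly representable values.
A = np.asfortranarray(np.arange(8, dtype=float).reshape(4, 2))
b = np.array([1.0, 2.0])
max_errors, repeatable = dot_diagnostics(A, b)
print(max_errors, repeatable)
```

Calling `dot_diagnostics(Ac, b)` on the failing pair returned by test() should show non-zero and varying errors if the problem is present.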

Examples of corrupted results (structured data)

If I replace the normal random generator with linspace:

random = dict(
    int = lambda shape: np.random.randint(low=0, high=10, size=shape),
    float = lambda shape: np.linspace(0, 1, np.prod(shape)).reshape(shape)  # np.product was removed in NumPy 2.0
)

I get error plots looking like this:
[Figure: errors-linspace]

Numpy/Python version information:

Since I am using NumPy packaged for the Nix package manager, it is easy to reproduce exactly the same Python environment on a different machine. I have not done this yet.

>> sys.version
3.6.6 (default, Jun 27 2018, 05:47:41) 
[GCC 7.3.0]

>> np.__version__
1.15.1

>> np.show_config()
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/nix/store/blk28p4cr6r2nc7fi1c4gggiqpd7pkqy-openblas-0.3.1/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/nix/store/blk28p4cr6r2nc7fi1c4gggiqpd7pkqy-openblas-0.3.1/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/nix/store/blk28p4cr6r2nc7fi1c4gggiqpd7pkqy-openblas-0.3.1/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/nix/store/blk28p4cr6r2nc7fi1c4gggiqpd7pkqy-openblas-0.3.1/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
