Broken test: TestPreprocessingDistributed::test_normalize_per_cell[dask] #2526

Open
2 of 3 tasks
flying-sheep opened this issue Jun 23, 2023 · 0 comments
Labels
Area - Out of core 💾 Working with on disk data

flying-sheep commented Jun 23, 2023

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the master branch of scanpy.

What happened?

In #2235, some previously disabled tests were re-enabled at a finer granularity. Before that change, all tests in TestPreprocessingDistributed were skipped, given the optional dependencies we run our tests with:

required = ['dask', 'zappy', 'zarr']
installed = {mod: bool(find_spec(mod)) for mod in required}


@pytest.mark.skipif(
    not all(installed.values()), reason=f'{required} all required: {installed}'
)
class TestPreprocessingDistributed:

Now the dask tests are enabled and only the zappy tests are skipped:

@needs("zarr")
class TestPreprocessingDistributed:
    @pytest.fixture()
    def adata(self):
        a = ad.read_zarr(input_file)  # regular anndata
        a.X = a.X[:]  # convert to numpy array
        return a

    @pytest.fixture(
        params=[
            pytest.param("direct", marks=[needs("zappy")]),
            pytest.param("dask", marks=[needs("dask")]),
        ]
    )
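
The difference between the two skip strategies above can be sketched without pytest. This is an illustration only: `skip_whole_class` and `skipped_params` are hypothetical helpers, not scanpy's `needs` marker, and the module names are stand-ins (`json` is always importable; the other name is assumed not to be installed).

```python
from importlib.util import find_spec


def installed(mods):
    """Map each module name to whether it can be imported."""
    return {mod: find_spec(mod) is not None for mod in mods}


def skip_whole_class(required):
    """Old strategy: skip every test unless all modules are present."""
    return not all(installed(required).values())


def skipped_params(params):
    """New strategy: skip only the parameters whose own module is missing."""
    return {name: find_spec(mod) is None for name, mod in params.items()}


# Stand-ins for an installed and a missing optional dependency (assumptions).
present, absent = "json", "hypothetical_missing_module_xyz"

print(skip_whole_class([present, absent]))
print(skipped_params({"direct": absent, "dask": present}))
```

Under the old strategy a single missing module hides every test in the class, which is how a broken dask test could go unnoticed while zappy was not installed.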

Minimal code sample

pytest -v --disable-warnings -k 'test_normalize_per_cell[dask]' --runxfail

Error output

===================================================================================================== test session starts ======================================================================================================
platform linux -- Python 3.8.17, pytest-7.3.1, pluggy-1.0.0 -- /home/phil/Dev/Python/venvs/single-cell/bin/python
cachedir: .pytest_cache
rootdir: /home/phil/Dev/Python/Single Cell/scanpy
configfile: pyproject.toml
testpaths: scanpy
plugins: cov-4.1.0, nunit-1.0.3, memray-1.4.0, xdist-3.3.1
collected 986 items / 985 deselected / 1 selected                                                                                                                                                                              

scanpy/tests/test_preprocessing_distributed.py::TestPreprocessingDistributed::test_normalize_per_cell[dask] FAILED                                                                                                       [100%]

=========================================================================================================== FAILURES ===========================================================================================================
__________________________________________________________________________________ TestPreprocessingDistributed.test_normalize_per_cell[dask] __________________________________________________________________________________

self = <scanpy.tests.test_preprocessing_distributed.TestPreprocessingDistributed object at 0x7fdd21f2e9d0>, adata = AnnData object with n_obs × n_vars = 9999 × 1000
    obs: 'n_counts'
    var: 'gene_ids'
adata_dist = AnnData object with n_obs × n_vars = 9999 × 1000
    obs: 'n_counts'
    var: 'gene_ids'
    uns: 'dist-mode'

    def test_normalize_per_cell(self, adata, adata_dist):
        if adata_dist.uns["dist-mode"] == "dask":
            pytest.xfail("TODO: Test broken for dask")
        normalize_per_cell(adata_dist)
        result = materialize_as_ndarray(adata_dist.X)
        normalize_per_cell(adata)
        assert result.shape == adata.shape
        assert result.shape == (adata.n_obs, adata.n_vars)
>       npt.assert_allclose(result, adata.X)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       
E       Mismatched elements: 688287 / 9999000 (6.88%)
E       Max absolute difference: 573.4154
E       Max relative difference: 11.335767
E        x: array([[0., 0., 0., ..., 0., 0., 0.],
E              [0., 0., 0., ..., 0., 0., 0.],
E              [0., 0., 0., ..., 0., 0., 0.],...
E        y: array([[0., 0., 0., ..., 0., 0., 0.],
E              [0., 0., 0., ..., 0., 0., 0.],
E              [0., 0., 0., ..., 0., 0., 0.],...

scanpy/tests/test_preprocessing_distributed.py:64: AssertionError
----------------------------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------------------------
normalizing by total count per cell
filtered out dask.array<sum-aggregate, shape=(), dtype=int64, chunksize=(), chunktype=numpy.ndarray> cells that have less than 1 counts
    finished (0:00:00): normalized adata.X and added    'n_counts', counts per cell before normalization (adata.obs)
normalizing by total count per cell
filtered out 1 cells that have less than 1 counts
    finished (0:00:00): normalized adata.X and added    'n_counts', counts per cell before normalization (adata.obs)
=================================================================================================== short test summary info ====================================================================================================
FAILED scanpy/tests/test_preprocessing_distributed.py::TestPreprocessingDistributed::test_normalize_per_cell[dask] - AssertionError: 
======================================================================================== 1 failed, 985 deselected, 11 warnings in 3.74s ========================================================================================
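
The assertion above reports how many elements differ but not where. A minimal sketch for locating the affected rows, assuming both matrices are available as NumPy arrays as they are in the test; `mismatched_rows` is a hypothetical helper, not part of scanpy, and the arrays below are toy data rather than the real test matrices.

```python
import numpy as np


def mismatched_rows(result, expected, rtol=1e-7, atol=0.0):
    """Return indices of rows in which at least one element is not close."""
    # Same tolerance rule as npt.assert_allclose: |a - b| <= atol + rtol * |b|
    close = np.isclose(result, expected, rtol=rtol, atol=atol)
    return np.flatnonzero(~close.all(axis=1))


# Toy data: row 1 is scaled differently from the expected values.
expected = np.array([[0.0, 2.0], [1.0, 3.0], [4.0, 0.0]])
result = expected.copy()
result[1] *= 2

print(mismatched_rows(result, expected))  # → [1]
```

Run against `result` and `adata.X` from the failing test, this would show whether the differences are confined to particular cells or spread across the matrix.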

Versions

-----
anndata     0.9.0rc2.dev43+g21a76088
scanpy      1.10.0.dev117+g6b9e734f
-----
PIL                 9.1.1
asciitree           NA
awkward             2.2.1
awkward_cpp         NA
beta_ufunc          NA
binom_ufunc         NA
cffi                1.15.0
cloudpickle         2.2.1
cycler              0.10.0
cython_runtime      NA
dask                2023.5.0
dateutil            2.8.2
defusedxml          0.7.1
entrypoints         0.4
fasteners           0.17.3
h5py                3.7.0
hypergeom_ufunc     NA
igraph              0.10.4
importlib_resources NA
jinja2              3.1.2
joblib              1.1.0
kiwisolver          1.4.3
leidenalg           0.9.1
llvmlite            0.38.1
markupsafe          2.1.1
matplotlib          3.7.1
mpl_toolkits        NA
natsort             8.1.0
nbinom_ufunc        NA
numba               0.55.2
numcodecs           0.10.2
numpy               1.22.4
packaging           21.3
pandas              2.0.2
pkg_resources       NA
psutil              5.9.1
pyparsing           3.0.9
pytz                2022.1
scipy               1.8.1
session_info        1.0.0
setuptools          67.8.0
setuptools_scm      NA
six                 1.16.0
sklearn             1.1.1
sphinxcontrib       NA
texttable           1.6.7
threadpoolctl       3.1.0
tlz                 0.12.0
toolz               0.12.0
typing_extensions   NA
wcwidth             0.2.5
yaml                6.0
zarr                2.12.0
zipp                NA
-----
Python 3.8.17 (default, Jun 17 2023, 20:09:37) [GCC 13.1.1 20230429]
Linux-6.3.8-zen1-1-zen-x86_64-with-glibc2.34
-----
Session information updated at 2023-06-23 14:29
