Commit ddeb820

Minimum dependency test job (#2816)
* Start mindeps
* Fix check_is_fitted import
* Temporarily bump anndata dep to access test utilities
* Support for numpy 1.23, where np.equal didn't work on strings
* Fix palette color mapping for pandas < 2.1
* Bump networkx
* Exit on error for test script
* Bump numba for numpy compat
* Update CI
* Fix array comparison in both envs
* Bump statsmodels version
* Test returns different plot type with older dependencies
* Skip test that relies on pd.value_counts
* Try to use better naming for test results in CI
* Temporarily bump pandas
* Add dependency on pynndescent, bump packaging version
* Skip doctest for dendrogram
* Install pre-commit in env
* Bump networkx
* Get tests to collect with an old anndata version
* Fix most preprocessing tests (account for old anndata constructor)
* Bump anndata min version to 0.7.8
* Fix pytest_itemcollected
* Bump min anndata version to 0.8
* Fix test_get.py cases
* Fix neighbor test
* Fix dendrogram plotting cases
* Fix stacked violin ordering
* Bump tolerance for older versions of numba
* Fix ordering for matrixplot
* Fix preprocessing tests
* xfail masking test for anndata 0.8
* Fix order
* Fix min-deps.py
* Discard changes to scanpy/plotting/_utils.py
* Remove TODOs from min-deps.py
* Remove dev script
* Rename test jobs to be more identifiable
* Use marker for xfail
* Add warning for PCA order
* Fix usage of pytest.mark.xfail
* Remove commented-out code from CI job
* Obey signature test
* Don't error on warning for dask.dataframe
* Update dask version
* Fix dask version better
* Fix view issue with anndata==0.8
* Typo
* Release note
* Coverage for min deps
* Fix coverage for minimum-version install

---------

Co-authored-by: Philipp A <[email protected]>
1 parent 102b4ef commit ddeb820

22 files changed (+314, -73 lines)

.azure-pipelines.yml

Lines changed: 23 additions & 17 deletions

@@ -6,10 +6,9 @@ variables:
   python.version: '3.11'
   PIP_CACHE_DIR: $(Pipeline.Workspace)/.pip
   PYTEST_ADDOPTS: '-v --color=yes --durations=0 --nunit-xml=test-data/test-results.xml'
-  ANNDATA_DEV: no
-  RUN_COVERAGE: no
   TEST_EXTRA: 'test-full'
-  PRERELEASE_DEPENDENCIES: no
+  DEPENDENCIES_VERSION: "latest" # | "pre-release" | "minimum-version"
+  TEST_TYPE: "standard" # | "coverage"

 jobs:
 - job: PyTest

@@ -20,12 +19,16 @@ jobs:
       Python3.9:
         python.version: '3.9'
       Python3.11: {}
-      minimal_tests:
+      minimal_dependencies:
         TEST_EXTRA: 'test-min'
       anndata_dev:
-        ANNDATA_DEV: yes
-        RUN_COVERAGE: yes
-        PRERELEASE_DEPENDENCIES: yes
+        DEPENDENCIES_VERSION: "pre-release"
+        TEST_TYPE: "coverage"
+      minimum_versions:
+        python.version: '3.9'
+        DEPENDENCIES_VERSION: "minimum-version"
+        TEST_TYPE: "coverage"
+

   steps:
   - task: UsePythonVersion@0

@@ -52,51 +55,54 @@ jobs:
       pip install wheel coverage
       pip install .[dev,$(TEST_EXTRA)]
     displayName: 'Install dependencies'
-    condition: eq(variables['PRERELEASE_DEPENDENCIES'], 'no')
+    condition: eq(variables['DEPENDENCIES_VERSION'], 'latest')

   - script: |
       python -m pip install --pre --upgrade pip
       pip install --pre wheel coverage
       pip install --pre .[dev,$(TEST_EXTRA)]
+      pip install -v "anndata[dev,test] @ git+https://github.com/scverse/anndata"
     displayName: 'Install dependencies release candidates'
-    condition: eq(variables['PRERELEASE_DEPENDENCIES'], 'yes')
+    condition: eq(variables['DEPENDENCIES_VERSION'], 'pre-release')

   - script: |
-      pip install -v "anndata[dev,test] @ git+https://github.com/scverse/anndata"
-    displayName: 'Install development anndata'
-    condition: eq(variables['ANNDATA_DEV'], 'yes')
+      python -m pip install pip wheel tomli packaging pytest-cov
+      pip install `python3 ci/scripts/min-deps.py pyproject.toml --extra dev test`
+      pip install --no-deps .
+    displayName: 'Install dependencies minimum version'
+    condition: eq(variables['DEPENDENCIES_VERSION'], 'minimum-version')

   - script: |
       pip list
     displayName: 'Display installed versions'

   - script: pytest
     displayName: 'PyTest'
-    condition: eq(variables['RUN_COVERAGE'], 'no')
+    condition: eq(variables['TEST_TYPE'], 'standard')

   - script: |
       coverage run -m pytest
       coverage xml
     displayName: 'PyTest (coverage)'
-    condition: eq(variables['RUN_COVERAGE'], 'yes')
+    condition: eq(variables['TEST_TYPE'], 'coverage')

   - task: PublishCodeCoverageResults@1
     inputs:
       codeCoverageTool: Cobertura
       summaryFileLocation: 'test-data/coverage.xml'
       failIfCoverageEmpty: true
-    condition: eq(variables['RUN_COVERAGE'], 'yes')
+    condition: eq(variables['TEST_TYPE'], 'coverage')

   - task: PublishTestResults@2
     condition: succeededOrFailed()
     inputs:
       testResultsFiles: 'test-data/test-results.xml'
       testResultsFormat: NUnit
-      testRunTitle: 'Publish test results for Python $(python.version)'
+      testRunTitle: 'Publish test results for $(Agent.JobName)'

   - script: bash <(curl -s https://codecov.io/bash)
     displayName: 'Upload to codecov.io'
-    condition: eq(variables['RUN_COVERAGE'], 'yes')
+    condition: eq(variables['TEST_TYPE'], 'coverage')

 - job: CheckBuild
   pool:

ci/scripts/min-deps.py

Lines changed: 99 additions & 0 deletions

@@ -0,0 +1,99 @@ (new file; shown without the leading + markers)

#!python3
from __future__ import annotations

import argparse
import sys
from collections import deque
from pathlib import Path
from typing import TYPE_CHECKING

if sys.version_info >= (3, 11):
    import tomllib
else:
    import tomli as tomllib

from packaging.requirements import Requirement
from packaging.version import Version

if TYPE_CHECKING:
    from collections.abc import Generator, Iterable


def min_dep(req: Requirement) -> Requirement:
    """
    Given a requirement, return the minimum version specifier.

    Example
    -------

    >>> min_dep(Requirement("numpy>=1.0"))
    <Requirement('numpy==1.0.*')>
    """
    req_name = req.name
    if req.extras:
        req_name = f"{req_name}[{','.join(req.extras)}]"

    if not req.specifier:
        return Requirement(req_name)

    min_version = Version("0.0.0.a1")
    for spec in req.specifier:
        if spec.operator in [">", ">=", "~="]:
            min_version = max(min_version, Version(spec.version))
        elif spec.operator == "==":
            min_version = Version(spec.version)

    return Requirement(f"{req_name}=={min_version}.*")


def extract_min_deps(
    dependencies: Iterable[Requirement], *, pyproject
) -> Generator[Requirement, None, None]:
    dependencies = deque(dependencies)  # We'll be mutating this
    project_name = pyproject["project"]["name"]

    while len(dependencies) > 0:
        req = dependencies.pop()

        # If we are referring to other optional dependency lists, resolve them
        if req.name == project_name:
            assert req.extras, f"Project included itself as dependency, without specifying extras: {req}"
            for extra in req.extras:
                extra_deps = pyproject["project"]["optional-dependencies"][extra]
                dependencies += map(Requirement, extra_deps)
        else:
            yield min_dep(req)


def main():
    parser = argparse.ArgumentParser(
        prog="min-deps",
        description="""Parse a pyproject.toml file and output a list of minimum dependencies.

        Output is directly passable to `pip install`.""",
        usage="pip install `python min-deps.py pyproject.toml`",
    )
    parser.add_argument(
        "path", type=Path, help="pyproject.toml to parse minimum dependencies from"
    )
    parser.add_argument(
        "--extras", type=str, nargs="*", default=(), help="extras to install"
    )

    args = parser.parse_args()

    pyproject = tomllib.loads(args.path.read_text())

    project_name = pyproject["project"]["name"]
    deps = [
        *map(Requirement, pyproject["project"]["dependencies"]),
        *(Requirement(f"{project_name}[{extra}]") for extra in args.extras),
    ]

    min_deps = extract_min_deps(deps, pyproject=pyproject)

    print(" ".join(map(str, min_deps)))


if __name__ == "__main__":
    main()
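A quick sanity check of `min_dep`'s pinning behavior. This is a minimal sketch, assuming the script lives at `ci/scripts/min-deps.py` as in the CI job above; the hyphenated filename means it has to be loaded via `importlib` rather than a plain import:

import importlib.util

from packaging.requirements import Requirement

# Load min-deps.py by path, since the hyphen makes it un-importable by name.
spec = importlib.util.spec_from_file_location("min_deps", "ci/scripts/min-deps.py")
min_deps = importlib.util.module_from_spec(spec)
spec.loader.exec_module(min_deps)

print(min_deps.min_dep(Requirement("numpy>=1.23")))        # numpy==1.23.*
print(min_deps.min_dep(Requirement("anndata[dev]>=0.8")))  # anndata[dev]==0.8.*
print(min_deps.min_dep(Requirement("tqdm")))               # tqdm (no specifier, left unpinned)

Feeding the whole printed list to `pip install`, as the `minimum-version` CI step does, then resolves each `==X.*` pin to the oldest supported release line.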

docs/release-notes/1.10.0.md

Lines changed: 1 addition & 0 deletions

@@ -35,6 +35,7 @@
 * Fix setting `sc.settings.verbosity` in some cases {pr}`2605` {smaller}`P Angerer`
 * Fix all remaining pandas warnings {pr}`2789` {smaller}`P Angerer`
 * Fix some annoying plotting warnings around violin plots {pr}`2844` {smaller}`P Angerer`
+* Scanpy now has a test job which tests against the minimum versions of the dependencies. In the process of implementing this, many bugs associated with using older versions of `pandas`, `anndata`, `numpy`, and `matplotlib` were fixed. {pr}`2816` {smaller}`I Virshup`

 ```{rubric} Ecosystem
 ```

pyproject.toml

Lines changed: 15 additions & 13 deletions

@@ -46,24 +46,25 @@ classifiers = [
     "Topic :: Scientific/Engineering :: Visualization",
 ]
 dependencies = [
-    "anndata>=0.7.4",
+    "anndata>=0.8",
     # numpy needs a version due to #1320
-    "numpy>=1.17.0",
+    "numpy>=1.23",
     "matplotlib>=3.6",
-    "pandas >=2.1.3",
-    "scipy>=1.4",
-    "seaborn>=0.13.0",
-    "h5py>=3",
+    "pandas >=1.5",
+    "scipy>=1.8",
+    "seaborn>=0.13",
+    "h5py>=3.1",
     "tqdm",
     "scikit-learn>=0.24",
-    "statsmodels>=0.10.0rc2",
+    "statsmodels>=0.13",
     "patsy",
-    "networkx>=2.3",
+    "networkx>=2.7",
     "natsort",
     "joblib",
-    "numba>=0.41.0",
+    "numba>=0.56",
     "umap-learn>=0.3.10",
-    "packaging",
+    "pynndescent>=0.5",
+    "packaging>=21.3",
     "session-info",
     "legacy-api-wrap>=1.4", # for positional API deprecations
     "get-annotations; python_version < '3.10'",

@@ -132,8 +133,8 @@ dev = [
 ]
 # Algorithms
 paga = ["igraph"]
-louvain = ["igraph", "louvain>=0.6,!=0.6.2"] # Louvain community detection
-leiden = ["igraph>=0.10", "leidenalg>=0.9"] # Leiden community detection
+louvain = ["igraph", "louvain>=0.6.0,!=0.6.2"] # Louvain community detection
+leiden = ["igraph>=0.10", "leidenalg>=0.9.0"] # Leiden community detection
 bbknn = ["bbknn"] # Batch balanced KNN (batch correction)
 magic = ["magic-impute>=2.0"] # MAGIC imputation method
 skmisc = ["scikit-misc>=0.1.3"] # highly_variable_genes method 'seurat_v3'

@@ -142,7 +143,7 @@ scanorama = ["scanorama"] # Scanorama dataset integration
 scrublet = ["scikit-image"] # Doublet detection with automatic thresholds
 # Acceleration
 rapids = ["cudf>=0.9", "cuml>=0.9", "cugraph>=0.9"] # GPU accelerated calculation of neighbors
-dask = ["dask[array]!=2.17.0"] # Use the Dask parallelization engine
+dask = ["dask[array]>=2022.09.2"] # Use the Dask parallelization engine
 dask-ml = ["dask-ml", "scanpy[dask]"] # Dask-ML for sklearn-like API

 [tool.hatch.build]

@@ -166,6 +167,7 @@ nunit_attach_on = "fail"
 markers = [
     "internet: tests which rely on internet resources (enable with `--internet-tests`)",
     "gpu: tests that use a GPU (currently unused, but needs to be specified here as we import anndata.tests.helpers, which uses it)",
+    "anndata_dask_support: tests that require dask support in anndata",
 ]
 filterwarnings = [
     # legacy-api-wrap: internal use of positional API
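The `.0` suffixes added to `louvain>=0.6.0` and `leidenalg>=0.9.0` look cosmetic, but they change what min-deps.py pins to: the script emits `=={min_version}.*`, and the wildcard matches by version prefix. The motivation below is an inference, not stated in the commit; a sketch with `packaging`:

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# ">=0.6" would be pinned to "==0.6.*", which still admits the explicitly
# excluded louvain 0.6.2; ">=0.6.0" pins to "==0.6.0.*" instead.
print(Version("0.6.2") in SpecifierSet("==0.6.*"))    # True
print(Version("0.6.2") in SpecifierSet("==0.6.0.*"))  # False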

scanpy/get/get.py

Lines changed: 1 addition & 1 deletion

@@ -260,7 +260,7 @@ def obs_df(
     ... )
     >>> plotdf.columns
     Index(['CD8B', 'n_genes', 'X_umap-0', 'X_umap-1'], dtype='object')
-    >>> plotdf.plot.scatter("X_umap-0", "X_umap-1", c="CD8B")
+    >>> plotdf.plot.scatter("X_umap-0", "X_umap-1", c="CD8B") # doctest: +SKIP
     <Axes: xlabel='X_umap-0', ylabel='X_umap-1'>

     Calculating mean expression for marker genes by cluster:
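For reference, `# doctest: +SKIP` keeps the example visible in rendered docs but stops doctest from executing it, which is how version-dependent matplotlib reprs like the one above are kept out of the minimum-version test run. A self-contained illustration:

import doctest

def example():
    """
    >>> 1 / 0  # doctest: +SKIP
    this would normally fail, but the directive skips the example entirely
    """

print(doctest.testmod())  # TestResults(failed=0, attempted=0)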

scanpy/neighbors/_backends/rapids.py

Lines changed: 2 additions & 1 deletion

@@ -3,8 +3,9 @@
 from typing import TYPE_CHECKING, Any, Literal

 import numpy as np
-from sklearn.base import BaseEstimator, TransformerMixin, check_is_fitted
+from sklearn.base import BaseEstimator, TransformerMixin
 from sklearn.exceptions import NotFittedError
+from sklearn.utils.validation import check_is_fitted

 from ..._settings import settings
 from ._common import TransformerChecksMixin
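The canonical home of `check_is_fitted` is `sklearn.utils.validation`; importing it from `sklearn.base` only works on some scikit-learn versions, hence the moved import. A minimal sketch of the stable import path (`LinearRegression` here is just an arbitrary estimator for demonstration):

from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression
from sklearn.utils.validation import check_is_fitted

est = LinearRegression()
try:
    check_is_fitted(est)  # raises NotFittedError: .fit() has not been called
except NotFittedError:
    print("not fitted yet")

est.fit([[0.0], [1.0]], [0.0, 1.0])
check_is_fitted(est)  # passes silently once the estimator is fitted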

scanpy/plotting/_baseplot_class.py

Lines changed: 1 addition & 1 deletion

@@ -347,7 +347,7 @@ def add_totals(
     >>> adata = sc.datasets.pbmc68k_reduced()
     >>> markers = {'T-cell': 'CD3D', 'B-cell': 'CD79A', 'myeloid': 'CST3'}
     >>> plot = sc.pl._baseplot_class.BasePlot(adata, markers, groupby='bulk_labels').add_totals()
-    >>> plot.plot_group_extra['counts_df']
+    >>> plot.plot_group_extra['counts_df'] # doctest: +SKIP
     bulk_labels
     CD4+/CD25 T Reg 68
     CD4+/CD45RA+/CD25- Naive T 8

scanpy/plotting/_matrixplot.py

Lines changed: 9 additions & 1 deletion

@@ -168,7 +168,15 @@ def __init__(

         if values_df is None:
             # compute mean value
-            values_df = self.obs_tidy.groupby(level=0, observed=True).mean()
+            values_df = (
+                self.obs_tidy.groupby(level=0, observed=True)
+                .mean()
+                .loc[
+                    self.categories_order
+                    if self.categories_order is not None
+                    else self.categories
+                ]
+            )

         if standard_scale == "group":
             values_df = values_df.sub(values_df.min(1), axis=0)
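The added `.loc[...]` step makes the row order explicit instead of relying on whatever order `groupby` returns, which can differ across pandas versions for categorical indexes. A minimal sketch with made-up data (the same pattern is applied in `_stacked_violin.py` below):

import pandas as pd

df = pd.DataFrame(
    {"gene1": [1.0, 2.0, 3.0, 4.0]},
    index=pd.CategoricalIndex(["B", "A", "B", "A"], name="group"),
)

means = df.groupby(level=0, observed=True).mean()  # rows come back in category order: A, B
categories_order = ["B", "A"]  # the order the plot should use
print(means.loc[categories_order])  # rows reindexed to B, A regardless of pandas version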

scanpy/plotting/_stacked_violin.py

Lines changed: 9 additions & 6 deletions

@@ -383,14 +383,17 @@ def _mainplot(self, ax):
         if self.var_names_idx_order is not None:
             _matrix = _matrix.iloc[:, self.var_names_idx_order]

-        if self.categories_order is not None:
-            _matrix.index = _matrix.index.reorder_categories(
-                self.categories_order, ordered=True
-            )
-
         # get mean values for color and transform to color values
         # using colormap
-        _color_df = _matrix.groupby(level=0, observed=True).median()
+        _color_df = (
+            _matrix.groupby(level=0, observed=True)
+            .median()
+            .loc[
+                self.categories_order
+                if self.categories_order is not None
+                else self.categories
+            ]
+        )
         if self.are_axes_swapped:
             _color_df = _color_df.T
scanpy/plotting/_tools/scatterplots.py

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@
2020
from matplotlib.colors import Colormap, Normalize
2121
from matplotlib.figure import Figure # noqa: TCH002
2222
from numpy.typing import NDArray # noqa: TCH002
23+
from packaging.version import Version
2324

2425
from ... import logging as logg
2526
from ..._settings import settings
@@ -1247,8 +1248,10 @@ def _color_vector(
12471248
}
12481249
# If color_map does not have unique values, this can be slow as the
12491250
# result is not categorical
1250-
color_vector = pd.Categorical(values.map(color_map, na_action="ignore"))
1251-
1251+
if Version(pd.__version__) < Version("2.1.0"):
1252+
color_vector = pd.Categorical(values.map(color_map))
1253+
else:
1254+
color_vector = pd.Categorical(values.map(color_map, na_action="ignore"))
12521255
# Set color to 'missing color' for all missing values
12531256
if color_vector.isna().any():
12541257
color_vector = color_vector.add_categories([to_hex(na_color)])
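The version gate exists because mapping categorical data with `na_action="ignore"` is only reliably supported from pandas 2.1 on; older pandas either ignores or rejects the keyword for categoricals, and missing values end up as `NaN` either way. A sketch of the same check with a made-up series and palette:

import pandas as pd
from packaging.version import Version

values = pd.Series(["a", "b", None], dtype="category")  # toy stand-in for cluster labels
color_map = {"a": "#1f77b4", "b": "#ff7f0e"}            # toy palette

# Only pass na_action where pandas supports it for categorical data.
if Version(pd.__version__) < Version("2.1.0"):
    color_vector = pd.Categorical(values.map(color_map))
else:
    color_vector = pd.Categorical(values.map(color_map, na_action="ignore"))

print(color_vector)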

scanpy/preprocessing/_highly_variable_genes.py

Lines changed: 5 additions & 0 deletions

@@ -252,6 +252,11 @@ def _highly_variable_genes_single_batch(
     `highly_variable`, `means`, `dispersions`, and `dispersions_norm`.
     """
     X = _get_obs_rep(adata, layer=layer)
+
+    if hasattr(X, "_view_args"):  # AnnData array view
+        # For compatibility with anndata<0.9
+        X = X.copy()  # Doesn't actually copy memory, just removes View class wrapper
+
     if flavor == "seurat":
         X = X.copy()
         if "log1p" in adata.uns_keys() and adata.uns["log1p"].get("base") is not None:
