           ██████╗  ██╗   ██╗  ██████╗  ███████╗  ██████╗  ██████╗  ███╗   ██╗ ██╗   ██╗
           ██╔══██╗ ╚██╗ ██╔╝  ██╔══██╗ ██╔════╝ ██╔════╝ ██╔═══██╗ ████╗  ██║ ██║   ██║
           ██████╔╝  ╚████╔╝   ██║  ██║ █████╗   ██║      ██║   ██║ ██╔██╗ ██║ ██║   ██║
           ██╔═══╝    ╚██╔╝    ██║  ██║ ██╔══╝   ██║      ██║   ██║ ██║╚██╗██║ ╚██╗ ██╔╝
           ██║         ██║     ██████╔╝ ███████╗ ╚██████╗ ╚██████╔╝ ██║ ╚████║  ╚████╔╝
           ╚═╝         ╚═╝     ╚═════╝  ╚══════╝  ╚═════╝  ╚═════╝  ╚═╝  ╚═══╝   ╚═══╝

A Python implementation of bulk RNA-seq deconvolution algorithms.

How to install

package

pip install "pydeconv"

dev

uv sync --all-groups

How to use: overview

from pydeconv import SignatureMatrix
from pydeconv.model import OLS, NNLS, DWLS, Tape, Scaden, MixupVI, NuSVR, RLR, WNNLS
import anndata as ad

signature_matrix = SignatureMatrix.load("path/to/signature_matrix.csv")  # index: gene names, columns: cell types
solver = NNLS(signature_matrix)

adata = ad.read_h5ad("path/to/adata.h5ad")  # index: sample_id, columns: gene_names
adata.layers["raw_counts"] = ...  # raw counts, or the output of your own preprocessing step

cell_prop = solver.transform(adata, layer="raw_counts", ratio=True)

How to use: detailed

1. Load an already registered signature matrix

from pydeconv.signature_matrix.registry import sig_matrix_laughney_lung_cancer
signature_matrix = sig_matrix_laughney_lung_cancer()

Note

See the project documentation for descriptions of the other registered signature matrices.

2. Load a custom signature matrix

from pydeconv import SignatureMatrix
signature_matrix = SignatureMatrix.load("path/to/signature_matrix.csv")  # index: gene names, columns: cell types

Note

Only the .csv format is currently supported. Any keyword arguments accepted by pd.read_csv can be passed after the path.
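
As an illustration of the expected layout, the sketch below reads a tiny signature matrix with pandas directly. The gene names, cell types, and values are made up, and the semicolon delimiter is only there to show a typical pd.read_csv keyword argument (sep) in action. How SignatureMatrix.load combines such arguments with its own defaults is not shown in this README, so treat this as an assumption to verify.

```python
import io

import pandas as pd

# Hypothetical signature matrix: rows are genes, columns are cell types.
# This file uses ";" as delimiter, so it needs an extra read_csv argument.
csv_text = """gene;B_cells;T_cells;Monocytes
CD19;9.1;0.2;0.1
CD3E;0.3;8.7;0.2
CD14;0.1;0.4;7.9
"""

# sep and index_col are standard pd.read_csv keywords:
# sep sets the delimiter, index_col=0 uses the gene column as the index.
signature_df = pd.read_csv(io.StringIO(csv_text), sep=";", index_col=0)

print(signature_df.shape)
print(list(signature_df.columns))
```

For a semicolon-delimited file on disk, the note above suggests the same keyword could be forwarded as SignatureMatrix.load("path/to/signature_matrix.csv", sep=";").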

3. Predict

import anndata as ad
from pydeconv.model import Tape, Scaden

adata = ad.read_h5ad("path/to/adata.h5ad")  # index: sample_id, columns: gene_names
adata.layers["counts_sum"] = ...

solver = Scaden(weights_version="cti_2nd_level_granularity")
cell_prop = solver.transform(adata, layer="counts_sum", ratio=True)

Note

The model will check that you have the corresponding gene names in your input data.
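
It can be convenient to run a similar check yourself before calling transform. The following is a minimal sketch, not PyDeconv's own validation: missing_genes is a hypothetical helper, and the two gene lists are placeholders for the genes a model expects and the gene names in your data (for example adata.var_names).

```python
def missing_genes(required_genes, available_genes):
    """Return the required genes absent from the input data, preserving order."""
    available = set(available_genes)
    return [gene for gene in required_genes if gene not in available]


# Placeholder gene lists; in practice `available` would come from adata.var_names.
required = ["CD19", "CD3E", "CD14", "NKG7"]
available = ["CD3E", "CD14", "CD19", "ACTB"]

missing = missing_genes(required, available)
if missing:
    print(f"{len(missing)} required gene(s) missing: {missing}")
```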

4. Predict (signature based method)

import anndata as ad
from pydeconv.model import OLS, NNLS, DWLS

signature_matrix = ...
adata = ad.read_h5ad("path/to/adata.h5ad")
adata.layers["relative_counts"] = ...

solver = DWLS(signature_matrix)
cell_prop = solver.transform(adata, layer="relative_counts", ratio=True)
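
To make the idea behind signature-based methods concrete, here is a minimal non-negative least squares sketch written with NumPy and SciPy rather than PyDeconv itself. It assumes a bulk profile is modeled as the signature matrix multiplied by a non-negative vector of cell-type weights, and that ratio=True corresponds to rescaling those weights so they sum to 1. This README does not confirm either detail, and all numbers are synthetic.

```python
import numpy as np
from scipy.optimize import nnls

# Synthetic signature matrix: 4 genes (rows) x 3 cell types (columns).
signature = np.array([
    [9.0, 0.5, 0.2],
    [0.3, 8.0, 0.4],
    [0.1, 0.6, 7.0],
    [2.0, 2.0, 2.0],
])

# Simulate a noise-free bulk sample from known proportions.
true_prop = np.array([0.5, 0.3, 0.2])
bulk = signature @ true_prop

# Solve min ||signature @ w - bulk||_2 subject to w >= 0.
weights, residual_norm = nnls(signature, bulk)

# Rescale to proportions (the assumed meaning of ratio=True).
cell_prop = weights / weights.sum()
print(np.round(cell_prop, 3))
```

Because the simulated sample is noise-free and the signature columns are linearly independent, the recovered proportions match the ones used to build it. Real bulk data will leave a non-zero residual.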

Benchmark

We benchmarked several deconvolution algorithms on the CTI dataset, including our own method, MixUpVI. This repository and the proposed methods accompany the paper "Joint probabilistic modeling of pseudobulk and single-cell transcriptomics enables accurate estimation of cell type composition", published at the Generative AI & Biology workshop of ICML 2025.

The results are shown below.

To run the benchmark, you can use the following command:

python benchmark/run_benchmark.py

Note

This repository supports inference only. It cannot train MixUpVI or the other deep learning methods, and it cannot create signature matrices. We therefore provide the weights of the trained models presented in the publication, along with pre-computed signature matrices. To use these models on other datasets, you must supply your own weights and/or pre-computed signature matrices.

Results 1st granularity

Results 2nd granularity

Note

These results were computed with, and are only guaranteed for, the adata.raw.X layer of the CTI dataset available on cellxgene. The dataset is downloaded automatically when you run the benchmark.

Cite

If you found our work useful in your research, please consider citing it at:

@inproceedings{grouard2025joint,
  title={Joint Probabilistic Modeling of Pseudobulk and Single-Cell Transcriptomics Enables Accurate Estimation of Cell Composition},
  author={Simon Grouard and Khalil Ouardini and Yann Rodriguez and Jean-Philippe Vert and Almudena Espin-Perez},
  booktitle={ICML 2025 Generative AI and Biology (GenBio) Workshop},
  year={2025},
  url={https://openreview.net/forum?id=JhDJ0MGo2z}
}
