PCA implementation #262

Merged 1 commit on Oct 14, 2020
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -26,4 +26,4 @@ repos:
hooks:
- id: mypy
args: ["--strict", "--show-error-codes"]
- additional_dependencies: ["numpy", "xarray", "dask[array]", "scipy", "typing-extensions", "zarr", "numba"]
+ additional_dependencies: ["numpy", "xarray", "dask[array]", "scipy", "typing-extensions", "zarr", "numba", "dask-ml"]
2 changes: 2 additions & 0 deletions requirements-dev.txt
@@ -6,6 +6,7 @@ pytest-cov
pytest-datadir
pytest-mock
hypothesis
+ scikit-allel
statsmodels
zarr
msprime
@@ -18,3 +19,4 @@ git+https://github.com/pangeo-data/rechunker.git
cbgen
cyvcf2; platform_system != "Windows"
yarl
+ matplotlib
1 change: 1 addition & 0 deletions requirements.txt
@@ -1,6 +1,7 @@
numpy
xarray
dask[array]
+ dask-ml
Collaborator

This pulls in scikit-learn, Dask distributed, and some other dependencies. That's probably OK, but I'm wondering whether there's a way to minimise the transitive dependencies here.

scipy
typing-extensions
numba
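
One common way to limit the transitive dependencies raised in the review comment on `dask-ml` is to treat the package as optional and import it only when the feature that needs it is called. The following is a minimal sketch of that pattern, not what this PR does: the PR adds `dask-ml` as a hard requirement. The helper names `require_optional` and `pca_backend`, and the `sgkit[pca]` extra named in the error message, are hypothetical and used for illustration only.

```python
import importlib
import importlib.util
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import a top-level optional module lazily, or raise a helpful error.

    Hypothetical helper, not part of sgkit. `module_name` must be a
    top-level import name (for example "dask_ml", not "dask_ml.decomposition").
    """
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        # The "sgkit[...]" extra in this message is an assumption; no such
        # extra is defined in the diff shown here.
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install 'sgkit[{extra}]'"
        )
    return importlib.import_module(module_name)


def pca_backend() -> ModuleType:
    """Resolve the PCA backend only when PCA is actually requested."""
    # "dask_ml" is the import name of the dask-ml distribution.
    return require_optional("dask_ml", extra="pca")
```

With this layout, importing the package would not require scikit-learn or Dask distributed; only a call that reaches `pca_backend()` would. The trade-off is that a missing dependency surfaces at call time rather than at install time.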
10 changes: 7 additions & 3 deletions setup.cfg
@@ -28,6 +28,7 @@ install_requires =
numpy
xarray
dask[array]
+ dask-ml
scipy
zarr
numba
@@ -87,15 +88,14 @@ ignore =
profile = black
default_section = THIRDPARTY
known_first_party = sgkit
- known_third_party = dask,fire,glow,hail,hypothesis,invoke,msprime,numba,numpy,pandas,pkg_resources,pyspark,pytest,setuptools,sgkit_plink,sklearn,sphinx,typing_extensions,xarray,yaml,zarr
+ known_third_party = allel,dask,fire,glow,hail,hypothesis,invoke,msprime,numba,numpy,pandas,pkg_resources,pyspark,pytest,setuptools,sgkit_plink,sklearn,sphinx,typing_extensions,xarray,yaml,zarr
multi_line_output = 3
include_trailing_comma = True
force_grid_wrap = 0
use_parentheses = True
line_length = 88

- [mypy-allel.*]
- ignore_missing_imports = True

[mypy-callee.*]
ignore_missing_imports = True
[mypy-cyvcf2.*]
@@ -104,6 +104,8 @@ ignore_missing_imports = True
ignore_missing_imports = True
[mypy-fsspec.*]
ignore_missing_imports = True
+ [mypy-dask_ml.*]
+ ignore_missing_imports = True
[mypy-numpy.*]
ignore_missing_imports = True
[mypy-pandas.*]
@@ -132,6 +134,8 @@ ignore_missing_imports = True
ignore_missing_imports = True
[mypy-yarl.*]
ignore_missing_imports = True
+ [mypy-allel.*]
+ ignore_missing_imports = True
[mypy-sgkit.*]
allow_redefinition = True
[mypy-sgkit.*.tests.*]
2 changes: 2 additions & 0 deletions sgkit/__init__.py
@@ -12,6 +12,7 @@
from .stats.association import gwas_linear_regression
from .stats.hwe import hardy_weinberg_test
from .stats.pc_relate import pc_relate
+ from .stats.pca import pca
from .stats.popgen import Fst, Tajimas_D, divergence, diversity
from .stats.regenie import regenie
from .testing import simulate_genotype_call_dataset
@@ -38,4 +39,5 @@
"pc_relate",
"simulate_genotype_call_dataset",
"variables",
+ "pca",
]