
Top-level Python package structure proposal #349


Closed
jeromekelleher opened this issue Apr 14, 2025 · 0 comments · Fixed by #375


jeromekelleher commented Apr 14, 2025

We do want to start providing a Python API soon, and I think @benjeffery's excellent refactoring work in #339 has brought this pretty close. Here's a proposal for how we could provide a long-term stable API that allows the package to grow new (potentially optional) modules for formats, and that also supports writing things other than VCZ.

  • bio2zarr.vcz: All code for writing VCZ
  • bio2zarr.plink: All code for working with plink format, defining conversion methods to VCZ
  • bio2zarr.vcf: All code for working with VCF format, defining conversion methods to VCZ
  • bio2zarr.tskit: All code for working with tskit format, defining conversion methods to VCZ
  • bio2zarr.bgen...

This feels like an easy thing for users to remember, and it gives a nice clean separation between the multi-format reading code and the writing-to-Zarr code. It should then be straightforward to have optional dependencies, if we don't want to bundle (e.g.) plink and bgen into the default install, which would keep things simpler for users when dependencies misbehave.
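
As a rough illustration of the optional-dependency idea, here is a minimal sketch of how a format module could defer importing its third-party reader until it is actually needed. Everything named here is an assumption for illustration only: the `require_optional` helper and the `bio2zarr[plink]`-style extra names are hypothetical and not part of the current package.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper: `extra` is an assumed name for a packaging
    extra (e.g. "plink"), not something bio2zarr defines today.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Re-raise with a message that tells the user which extra to install.
        raise ImportError(
            f"'{module_name}' is needed for this format; "
            f"install it with: pip install 'bio2zarr[{extra}]'"
        ) from e
```

A format module such as `bio2zarr.plink` could then call this helper inside its conversion function rather than at import time, so that `import bio2zarr` keeps working when the plink reader is missing or broken.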
