Skip to content

Remove schema data #343

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2025
Merged

Conversation

benjeffery
Copy link
Contributor

@benjeffery benjeffery commented Apr 3, 2025

Stacked on #339 - clean diff at 9f727a1

@benjeffery benjeffery force-pushed the remove-schema-data branch from 9f727a1 to f3c040d Compare April 3, 2025 13:15
@benjeffery benjeffery mentioned this pull request Apr 3, 2025
@jeromekelleher
Copy link
Contributor

Yes, looks good. There's a few details to follow up on, but it's the right approach.

@benjeffery benjeffery force-pushed the remove-schema-data branch from f3c040d to d61af16 Compare April 5, 2025 10:29
@coveralls
Copy link
Collaborator

coveralls commented Apr 5, 2025

Coverage Status

coverage: 98.409% (-0.4%) from 98.765%
when pulling aedf90c on benjeffery:remove-schema-data
into eed60f0 on sgkit-dev:main.

@benjeffery benjeffery force-pushed the remove-schema-data branch 2 times, most recently from d30687c to 5ce3baa Compare April 8, 2025 23:44
@benjeffery
Copy link
Contributor Author

Rebased

Copy link
Contributor

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I think we just need to formalise the interface we're implementing a bit here.

CHANGELOG.md Outdated
@@ -4,6 +4,8 @@

- Add support for unindexed (and uncompressed) VCFs (#337)

- Remove explicit sample, contig and filter lists from the schema (#343)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking schema format change, so will require existing ICFs to be recreated. I listed these as "Breaking changes" below

bio2zarr/vcz.py Outdated
self.encode_samples(root)
self.encode_filter_id(root)
self.encode_contig_id(root)
if hasattr(self.source, "samples"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Control flow by hasattr is pretty :vomit: for me - so easy for mistakes to get hidden as code evolves over time.

Every implementation has to have samples anyway, right? And, I would imagine likewise for contigs, so they should both be unconditional.

For filters, I think we should probably define a superclass which defines the interface we require here, and have filters return None by default. That way we can detect if filters (and other optional stuff) makes sense or not.

@benjeffery benjeffery force-pushed the remove-schema-data branch 2 times, most recently from de7b6ce to 5f1f224 Compare April 9, 2025 23:44
@benjeffery
Copy link
Contributor Author

I've added a base class for the sources and removed the mapping base class from the icf source. I've kept contigs optional as plink doesn't have them in this PR.

@jeromekelleher jeromekelleher merged commit 9a8745f into sgkit-dev:main Apr 11, 2025
15 of 16 checks passed
@benjeffery benjeffery deleted the remove-schema-data branch April 11, 2025 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants