-
Notifications
You must be signed in to change notification settings - Fork 17
Add nested data generation and tests (zarr-python and zarrita) #33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
need to fix upstream zarr-python to respect key_separator on reading
Fix released as 2.7.1. Is trying with both implementations overkill? |
Indeed, it's not supported by z5py yet. I will try to implement this at some point; in the meantime skipping the test and getting the mismatch message in the report is a good solution. |
Did you mean only reading the same file with both or did you want to also generate with both store types? Would we then test with both FSStore and DirectoryStore for the flat case as well? |
Considering the bug you already found, if possible, I'd go for:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using z5py, blosc] PASSED [ 27%]
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using z5py, gzip] PASSED [ 28%]
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using z5py, raw] PASSED [ 30%]
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using z5py, zlib] PASSED [ 31%]
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using zarr, blosc] PASSED [ 32%]
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using zarr, gzip] PASSED [ 33%]
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using zarr, raw] PASSED [ 34%]
test/test_read_all.py::test_correct_read[read zarr (DirectoryStore) zarr using zarr, zlib] PASSED [ 35%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using z5py, blosc] PASSED [ 36%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using z5py, gzip] PASSED [ 37%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using z5py, raw] PASSED [ 38%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using z5py, zlib] PASSED [ 40%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using zarr, blosc] PASSED [ 41%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using zarr, gzip] PASSED [ 42%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using zarr, raw] PASSED [ 43%]
test/test_read_all.py::test_correct_read[read zarr (FSStore) zarr using zarr, zlib] PASSED [ 44%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using z5py, blosc] SKIPPED [ 45%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using z5py, gzip] SKIPPED [ 46%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using z5py, raw] SKIPPED [ 47%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using z5py, zlib] SKIPPED [ 48%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using zarr, blosc] PASSED [ 50%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using zarr, gzip] PASSED [ 51%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using zarr, raw] PASSED [ 52%]
test/test_read_all.py::test_correct_read[read zarr (FSStore, nested) zarr using zarr, zlib] PASSED [ 53%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using z5py, blosc] SKIPPED [ 54%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using z5py, gzip] SKIPPED [ 55%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using z5py, raw] SKIPPED [ 56%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using z5py, zlib] SKIPPED [ 57%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using zarr, blosc] PASSED [ 58%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using zarr, gzip] PASSED [ 60%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using zarr, raw] PASSED [ 61%]
test/test_read_all.py::test_correct_read[read zarr (NestedDirectoryStore, nested) zarr using zarr, zlib] PASSED [ 62%]
Nice additions.
f"file not found: {fpath}. Make sure you have generated the data " | ||
"using 'make data'" | ||
) | ||
test = read_fn(fpath, codec, nested) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While struggling with zarr-developers/zarr-python#717, I introduced a pseudo-checksum in ome_zarr_test_suite that just asserts that there's at least one non-zero value in the opened dataset. My guess is that without zarr-developers/zarr-python#716 that will be the case here as well.
Longer-term, I assume we could make use of some form of true checksum'ing to show that when nesting is not properly detected, that the checksums differ (since all values revert to fill_value
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention here was mainly to guard against mistakes in the test parameter generation. We should never be trying to read from a data/XXX folder that doesn't exist.
The issue I saw in the past with fill_value was when the files did exist, but due to the /
separator not being found as a key, the fill value was used instead. I think they are two separate issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that they are separate, but I think adding a checksum will cause test failures that we aren't seeing yet. But merging this for now. Thanks.
@grlee77 : do you have anything else for this PR? |
No, I think it should be good to go. Merging will likely cause conflicts in #34, but hopefully those would be fairly easy to sort out. |
This PR adds nested data generation and reads with zarr-python (v2) and zarrita (v3).
z5py does not seem to support reading the nested data from zarr-python so I marked it as "skipped" in the test suite. It will currently show up as "FAILED: mismatch" in the test report tables.
I was originally using
zarr.storage.FSStore
to read the nested data with zarr-python in the tests, but this was failing due to zarr-developers/zarr-python#717. I switched it to NestedDirectoryStore, which passes.For the report, row names now have ", flat" or ", nested" appended to indicate which format was read.