This is really neat, and I'm excited to try things out with some additional HDF files!
I realize the goal is to flesh out the specification and this is not a general conversion tool yet, but it seems like working with more HDF files out in the wild might bring things to light.
Some initial questions/suggestions:
- How do I write out the .zchunkstore? It seems things are currently set up to only output logging info.
- Add chunk info output to logger (maybe dtype and MB too?)
lggr.debug(f'_ARRAY_CHUNKS = {h5obj.chunks}')
- It could be useful to first check that the input file is valid HDF5. This is easy for a local file; I'm not sure about remote:
if not h5py.is_hdf5(f):
    raise ValueError('Not an hdf5 file')
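The logging and validation suggestions above could be sketched roughly as follows. This is a minimal sketch, not the project's code: `chunk_summary` and `looks_like_hdf5` are hypothetical helper names, and the logger name is guessed from the debug output and may differ. The validity check relies only on the HDF5 file signature (the 8 bytes `\x89HDF\r\n\x1a\n` at offset 0, 512, 1024, and so on, doubling each time), so it needs just a seekable file-like object, which should also cover a remote file opened through fsspec.

```python
import logging
import math

# Assumption: logger name guessed from the "DEBUG:h5-to-zarr:..." output.
lggr = logging.getLogger("h5-to-zarr")

# Fixed 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def chunk_summary(h5obj):
    """Describe chunking, dtype and uncompressed size of a dataset-like object.

    Only .shape, .chunks and .dtype.itemsize are used, which h5py.Dataset
    provides. .chunks is None for contiguous (unchunked) datasets.
    """
    nbytes = math.prod(h5obj.shape) * h5obj.dtype.itemsize
    return (
        f"_ARRAY_CHUNKS = {h5obj.chunks}, "
        f"dtype = {h5obj.dtype}, "
        f"size = {nbytes / 1e6:.2f} MB"
    )


def looks_like_hdf5(fileobj, max_offset=1 << 20):
    """Return True if the HDF5 signature is found at a valid superblock offset.

    The signature may sit at byte 0, 512, 1024, 2048, ...; max_offset is a
    hypothetical cut-off so a non-HDF5 file is not scanned indefinitely.
    """
    offset = 0
    while offset <= max_offset:
        fileobj.seek(offset)
        head = fileobj.read(len(HDF5_SIGNATURE))
        if head == HDF5_SIGNATURE:
            return True
        if len(head) < len(HDF5_SIGNATURE):
            return False  # ran past the end of the file
        offset = 512 if offset == 0 else offset * 2
    return False
```

Inside the translator this would be called as `lggr.debug(chunk_summary(h5obj))`. For a remote file, passing the object returned by something like `fs.open(url, 'rb')` to `looks_like_hdf5` should presumably read only a few small byte ranges rather than the whole file, but that is untested here.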
What isn't supported?
https://github.com/intake/fsspec-reference-maker/blob/bf41138add53b0201e583aa40840cd4fa5fb907b/fsspec_reference_maker/hdf.py#L103-L106
The first file I tried to generate a .zchunkstore for hit the error raised by the lines linked above. Code and traceback below:
def ATL06_remote():
    return hdf2zarr.run(
        's3://its-live-data.jpl.nasa.gov/icesat2/alt06/rel003/ATL06_20181230162257_00340206_003_01.h5',
        mode='rb', anon=False, requester_pays=True,
        default_fill_cache=False, default_cache_type='none'
    )
DEBUG:h5-to-zarr:translator:Group: /gt1l/land_ice_segments
DEBUG:h5-to-zarr:translator:Dataset: /gt1l/land_ice_segments/atl06_quality_summary
Traceback (most recent call last):
  File "h5py/h5o.pyx", line 302, in h5py.h5o.cb_obj_simple
  File "/Users/scott/miniconda3/envs/fsspec-ref/lib/python3.8/site-packages/h5py/_hl/group.py", line 591, in proxy
    return func(name, self[name])
  File "/Users/scott/GitHub/fsspec-reference-maker/fsspec_reference_maker/hdf.py", line 105, in translator
    raise RuntimeError(
RuntimeError: /gt1l/land_ice_segments/atl06_quality_summary uses unsupported HDF5 filters

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./test.py", line 45, in <module>
    ATL06_remote()
  File "./test.py", line 15, in ATL06_remote
    return hdf2zarr.run(
  File "/Users/scott/GitHub/fsspec-reference-maker/fsspec_reference_maker/hdf.py", line 273, in run
    return h5chunks.translate()
  File "/Users/scott/GitHub/fsspec-reference-maker/fsspec_reference_maker/hdf.py", line 54, in translate
    self._h5f.visititems(self.translator)
  File "/Users/scott/miniconda3/envs/fsspec-ref/lib/python3.8/site-packages/h5py/_hl/group.py", line 592, in visititems
    return h5o.visit(self.id, proxy)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
SystemError: <built-in function visit> returned a result with an error set
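To narrow down which filter triggers the RuntimeError, it might help to list the filter pipeline of the offending dataset. The sketch below is an assumption-laden illustration: `list_filters` and `unsupported_filters` are hypothetical helpers, and the `supported` set has to be supplied by the caller because which filters hdf.py actually accepts is not stated in this issue. It works on any object exposing `get_nfilters()` and `get_filter(i)`, which is what h5py's low-level dataset creation property list provides.

```python
# Filter IDs predefined by the HDF5 library; third-party filters use other IDs.
STANDARD_FILTERS = {
    1: "deflate",
    2: "shuffle",
    3: "fletcher32",
    4: "szip",
    5: "nbit",
    6: "scaleoffset",
}


def list_filters(dcpl):
    """Return [(filter_id, name), ...] for a dataset creation property list.

    get_filter(i) is expected to return (code, flags, values, name), as the
    h5py low-level property list does.
    """
    found = []
    for i in range(dcpl.get_nfilters()):
        code, _flags, _values, name = dcpl.get_filter(i)
        if isinstance(name, bytes):
            name = name.decode(errors="replace")
        found.append((code, STANDARD_FILTERS.get(code) or name or "unknown"))
    return found


def unsupported_filters(dcpl, supported):
    """Return the filters whose ID is not in the caller-supplied set."""
    return [(code, name) for code, name in list_filters(dcpl) if code not in supported]
```

With h5py, the property list for a dataset would come from `dset.id.get_create_plist()`, so something like `list_filters(f['/gt1l/land_ice_segments/atl06_quality_summary'].id.get_create_plist())` should show what that dataset uses.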