This is carrying on from zarr-developers/zarr-python#258.

I've tried to come up with a minimal example, but it's tricky to illustrate without showing the context. Here is an interaction with zarr, with some instrumentation added to the JSON codec's encode/decode methods:
```
INPUT: (1,)
INPUT: (1,)
OUTPUT: (1, 1)
OUTPUT: (1, 2)
Traceback (most recent call last):
  File "dev.py", line 34, in <module>
    print(z[:]) # Borks
  File "/home/jk/.local/lib/python3.5/site-packages/zarr/core.py", line 559, in __getitem__
    return self.get_basic_selection(selection, fields=fields)
  File "/home/jk/.local/lib/python3.5/site-packages/zarr/core.py", line 685, in get_basic_selection
    fields=fields)
  File "/home/jk/.local/lib/python3.5/site-packages/zarr/core.py", line 727, in _get_basic_selection_nd
    return self._get_selection(indexer=indexer, out=out, fields=fields)
  File "/home/jk/.local/lib/python3.5/site-packages/zarr/core.py", line 1015, in _get_selection
    drop_axes=indexer.drop_axes, fields=fields)
  File "/home/jk/.local/lib/python3.5/site-packages/zarr/core.py", line 1608, in _chunk_getitem
    chunk = self._decode_chunk(cdata)
  File "/home/jk/.local/lib/python3.5/site-packages/zarr/core.py", line 1751, in _decode_chunk
    chunk = chunk.reshape(self._chunks, order=self._order)
ValueError: cannot reshape array of size 2 into shape (1,)
```
The INPUT lines are the shapes of the input arrays to encode and the OUTPUT lines are the corresponding output shapes of the arrays from decode.
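For reference, here is a rough sketch of the kind of script that produces this failure (the original dev.py isn't shown above, so the array parameters and element values here are guesses):

```python
import numpy as np
import zarr
from numcodecs import JSON

# Guessed reproduction: a 1-D object array with one variable-length list of
# strings per chunk, stored via the JSON codec.
z = zarr.empty(2, chunks=1, dtype=object, object_codec=JSON())

fill = np.empty(2, dtype=object)
fill[0] = ["s1"]
fill[1] = ["s2", "s3"]
z[:] = fill

print(z[:])  # Borks: decode returns arrays of shape (1, 1) and (1, 2)
```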
Problem description
When calling numpy.array([["s1", "s2"], ["s3", "s4"]], dtype=object), numpy is quite aggressive about reshaping the array to store things more efficiently.
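For example, equal-length nested lists are folded into a 2-D object array, while only ragged lists are kept as a 1-D array of list objects:

```python
>>> import numpy as np
>>> np.array([["s1", "s2"], ["s3", "s4"]], dtype=object).shape
(2, 2)
>>> np.array([["s1", "s2"], ["s3"]], dtype=object).shape
(2,)
```

So when decode rebuilds the chunk with numpy.array() from the output of json.loads(), the result can have a different shape (and size) from the array that was passed to encode, which is what the INPUT/OUTPUT shapes above show.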
I've played around with this a fair bit, and I think the only options are to:

1. Drop the numpy dependency in the encoding and decoding steps for JSON (i.e., don't include the dtype in the JSON encoding), and pass the supplied argument directly to the JSON encoder (and, conversely, return the value of json.loads() directly from decode).
2. Also encode the input array's shape in the JSON encoding.
Both of these options are ugly because they break backward compatibility. I'll put up a PR demonstrating option 2 in a minute for discussion.
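To make option 2 concrete, here is a rough sketch of the idea (the field layout is illustrative only, not the actual PR):

```python
import json
import numpy as np

def encode(buf):
    # Record the chunk's shape (and dtype) alongside the flattened items.
    arr = np.asarray(buf, dtype=object)
    doc = [arr.ravel().tolist(), str(arr.dtype), list(arr.shape)]
    return json.dumps(doc).encode('utf-8')

def decode(data):
    items, dtype, shape = json.loads(bytes(data).decode('utf-8'))
    # Fill element by element so numpy.array() never gets a chance to re-nest
    # list elements into extra dimensions, then restore the recorded shape.
    out = np.empty(shape, dtype=dtype)
    flat = out.reshape(-1)
    for i, item in enumerate(items):
        flat[i] = item
    return out
```

The extra shape entry is what breaks compatibility with data already written in the current format.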
Thanks a lot for this. FWIW, if there is no way to fix this without changing the encoded format, we can work through that; there's some information in the developer guide that is intended to cover this type of situation. Basically, any change to the encoding format should be implemented via a new codec class with a new codec ID, so compatibility is preserved for any existing data using the old codec. But it's probably best to figure out the technical solution to the problem first, then deal with compatibility.
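As a rough illustration of that mechanism (the class name, codec ID and format below are placeholders, not a decided design), the new behaviour would live in its own codec class registered under a new ID, so anything written with the existing 'json' codec keeps decoding through the old class:

```python
import json
import numpy as np
from numcodecs.abc import Codec
from numcodecs.registry import register_codec

class JSONV2(Codec):
    """Placeholder codec using the same shape-preserving idea sketched above."""

    codec_id = 'json2'  # hypothetical new ID; data written as 'json' is untouched

    def encode(self, buf):
        arr = np.asarray(buf, dtype=object)
        return json.dumps([arr.ravel().tolist(), list(arr.shape)]).encode('utf-8')

    def decode(self, buf, out=None):
        items, shape = json.loads(bytes(buf).decode('utf-8'))
        dec = np.empty(shape, dtype=object)
        flat = dec.reshape(-1)
        for i, item in enumerate(items):
            flat[i] = item
        if out is not None:
            out[...] = dec
            return out
        return dec

register_codec(JSONV2)
```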