Commit 6e80e97

Revert change to default write_empty_chunks. (#1001)
1 parent b8565a9 commit 6e80e97

4 files changed: 22 additions and 14 deletions.


docs/release.rst

Lines changed: 8 additions & 0 deletions
@@ -6,6 +6,14 @@ Release notes
 Unreleased
 ----------
 
+Bug fixes
+~~~~~~~~~
+
+* Changes the default value of ``write_empty_chunks`` to ``True`` to prevent
+  unanticipated data losses when the data types do not have a proper default
+  value when empty chunks are read back in.
+  By :user:`Vyas Ramasubramani <vyasr>`; :issue:`965`.
+
 .. _release_2.11.1:
 
 2.11.1

docs/tutorial.rst

Lines changed: 2 additions & 2 deletions
@@ -1309,7 +1309,7 @@ Empty chunks
 
 As of version 2.11, it is possible to configure how Zarr handles the storage of
 chunks that are "empty" (i.e., every element in the chunk is equal to the array's fill value).
-When creating an array with ``write_empty_chunks=False`` (the default),
+When creating an array with ``write_empty_chunks=False``,
 Zarr will check whether a chunk is empty before compression and storage. If a chunk is empty,
 then Zarr does not store it, and instead deletes the chunk from storage
 if the chunk had been previously stored.
@@ -1318,7 +1318,7 @@ This optimization prevents storing redundant objects and can speed up reads, but
 added computation during array writes, since the contents of
 each chunk must be compared to the fill value, and these advantages are contingent on the content of the array.
 If you know that your data will form chunks that are almost always non-empty, then there is no advantage to the optimization described above.
-In this case, creating an array with ``write_empty_chunks=True`` will instruct Zarr to write every chunk without checking for emptiness.
+In this case, creating an array with ``write_empty_chunks=True`` (the default) will instruct Zarr to write every chunk without checking for emptiness.
 
 The following example illustrates the effect of the ``write_empty_chunks`` flag on
 the time required to write an array with different values.::
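
The write path described in the tutorial text above can be illustrated with a toy model that does not use Zarr at all. This is a minimal sketch, not Zarr's implementation: the class `ToyChunkStore`, its dict-backed `store`, and the list-based chunks are hypothetical names chosen for illustration. It assumes only the behaviour stated in the documentation: with `write_empty_chunks=False`, a chunk equal to the fill value is not stored and any previously stored copy is deleted, and a missing chunk is read back as the fill value.

```python
class ToyChunkStore:
    """Toy model of chunk writes (hypothetical; not part of Zarr)."""

    def __init__(self, fill_value=0, write_empty_chunks=True):
        self.fill_value = fill_value
        self.write_empty_chunks = write_empty_chunks
        self.store = {}  # maps chunk key -> stored chunk contents

    def write_chunk(self, key, chunk):
        is_empty = all(v == self.fill_value for v in chunk)
        if not self.write_empty_chunks and is_empty:
            # Empty chunk: skip the write and drop any previously stored copy.
            self.store.pop(key, None)
            return
        self.store[key] = list(chunk)

    def read_chunk(self, key, length):
        # A chunk that is absent from the store is rebuilt from the fill value.
        return self.store.get(key, [self.fill_value] * length)


dense = ToyChunkStore(write_empty_chunks=True)
sparse = ToyChunkStore(write_empty_chunks=False)
for s in (dense, sparse):
    s.write_chunk("0", [1, 2, 3])
    s.write_chunk("1", [0, 0, 0])
    s.write_chunk("0", [0, 0, 0])  # overwrite chunk "0" with fill values

print(sorted(dense.store))        # ['0', '1']
print(sorted(sparse.store))       # []
print(sparse.read_chunk("0", 3))  # [0, 0, 0]
```

In this sketch the emptiness check runs on every write when `write_empty_chunks=False`, which is the per-write overhead the tutorial mentions. Reading back a skipped chunk depends entirely on the fill value, which is why the choice of default matters for data types without a well-defined fill value.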

zarr/core.py

Lines changed: 1 addition & 1 deletion
@@ -161,7 +161,7 @@ def __init__(
         cache_metadata=True,
         cache_attrs=True,
         partial_decompress=False,
-        write_empty_chunks=False,
+        write_empty_chunks=True,
         zarr_version=None,
     ):
         # N.B., expect at this point store is fully initialized with all

zarr/creation.py

Lines changed: 11 additions & 11 deletions
@@ -74,11 +74,11 @@ def create(shape, chunks=True, dtype=None, compressor='default',
         .. versionadded:: 2.8
 
     write_empty_chunks : bool, optional
-        If True, all chunks will be stored regardless of their contents. If
-        False (default), each chunk is compared to the array's fill value prior
-        to storing. If a chunk is uniformly equal to the fill value, then that
-        chunk is not be stored, and the store entry for that chunk's key is
-        deleted. This setting enables sparser storage, as only chunks with
+        If True (default), all chunks will be stored regardless of their
+        contents. If False, each chunk is compared to the array's fill value
+        prior to storing. If a chunk is uniformly equal to the fill value, then
+        that chunk is not be stored, and the store entry for that chunk's key
+        is deleted. This setting enables sparser storage, as only chunks with
         non-fill-value data are stored, at the expense of overhead associated
         with checking the data of each chunk.
 
@@ -403,7 +403,7 @@ def open_array(
     chunk_store=None,
     storage_options=None,
     partial_decompress=False,
-    write_empty_chunks=False,
+    write_empty_chunks=True,
     *,
     zarr_version=None,
     dimension_separator=None,
@@ -462,11 +462,11 @@ def open_array(
         is Blosc, when getting data from the array chunks will be partially
         read and decompressed when possible.
     write_empty_chunks : bool, optional
-        If True, all chunks will be stored regardless of their contents. If
-        False (default), each chunk is compared to the array's fill value prior
-        to storing. If a chunk is uniformly equal to the fill value, then that
-        chunk is not be stored, and the store entry for that chunk's key is
-        deleted. This setting enables sparser storage, as only chunks with
+        If True (default), all chunks will be stored regardless of their
+        contents. If False, each chunk is compared to the array's fill value
+        prior to storing. If a chunk is uniformly equal to the fill value, then
+        that chunk is not be stored, and the store entry for that chunk's key
+        is deleted. This setting enables sparser storage, as only chunks with
         non-fill-value data are stored, at the expense of overhead associated
         with checking the data of each chunk.
 
