numcodecs.zfpy is ready #229

Merged
merged 84 commits into from
Mar 18, 2021
80a571c
added zfp compressor
halehawk Dec 21, 2018
50cfdb5
modified zfp in setup.py for new zfp054 directory
halehawk Dec 21, 2018
8b40523
changed print statement to raise,also changed zfp source path
halehawk Dec 27, 2018
dbc2b64
Merge remote-tracking branch 'upstream/master'
halehawk Jan 9, 2019
33de949
added set_stride, changed zfp source path
halehawk Jan 9, 2019
035f90d
added zfp source as a submodule
halehawk Jan 9, 2019
1adcf4b
removed zfp source as a submodule
halehawk Jan 9, 2019
a33e947
added zfp module
halehawk Jan 9, 2019
6acb24d
added zfp docs
halehawk Jan 9, 2019
812441a
used zfp submodule and define type by _type
halehawk Jan 10, 2019
5972e09
removed blanked lines
halehawk Jan 10, 2019
1a7c105
added white space after mode
halehawk Jan 10, 2019
161cbca
modified comments to meet numpy docstring
halehawk Jan 23, 2019
d44f308
updated c-blosc
halehawk Mar 5, 2020
a1674da
format setup.py
halehawk Mar 5, 2020
27676f4
added zfp.c
halehawk Mar 5, 2020
785547f
add path to bitstream.h
halehawk Mar 5, 2020
f0361cb
updated include path
halehawk Mar 5, 2020
8cfad3e
update zfp.c with include path
halehawk Mar 5, 2020
e205381
changed include path again
halehawk Mar 5, 2020
fce55da
Merge branch 'master' into master
halehawk Mar 12, 2020
fa33fa4
sync with latest numcodecs master
halehawk Apr 16, 2020
4f90933
Removed zfp submodule
halehawk Apr 16, 2020
dc37e5d
Removed zfp_extension
halehawk Apr 16, 2020
6ef6096
Remove test_zfp.py
halehawk Apr 16, 2020
fffc757
added zfp.py
halehawk Apr 18, 2020
fcf92fa
removed zfp.py
halehawk Apr 18, 2020
e896edf
added zfpy.py
halehawk Apr 18, 2020
ec63632
add pip install zfpy
halehawk Apr 20, 2020
1bd37da
add pip install modified
halehawk Apr 20, 2020
ec3b548
add pip install modified 2
halehawk Apr 20, 2020
58337ff
add pip install modified 3
halehawk Apr 20, 2020
eab0541
add zfpy.py
halehawk Apr 20, 2020
9f508f1
add zfpy in requirements_dev
halehawk Apr 20, 2020
eba81c6
add zfpy finxture
halehawk Apr 20, 2020
9d7ce65
add zfpy with index-url
halehawk Apr 21, 2020
92826fc
remove zfp.rst
halehawk Apr 21, 2020
2dd10db
get 100% coverage
halehawk Apr 21, 2020
7e1fc71
change for coverage
halehawk Apr 21, 2020
0ba8be0
change for PEP 8
halehawk Apr 21, 2020
5c4029c
add ensure_contiguous_ndarray(buf)
halehawk Apr 24, 2020
baa2821
skip tests at py38 and osx
halehawk Apr 24, 2020
548bad8
fixed format issue
halehawk Apr 24, 2020
995dab5
skip tests at py38
halehawk Apr 24, 2020
b99da10
remove variable comp_arr
halehawk Apr 24, 2020
a96ae56
skip test on darwin
halehawk Apr 24, 2020
dca2c58
remove extra blank line
halehawk Apr 24, 2020
01b0870
removed import sys
halehawk Apr 24, 2020
fbe0778
removed zfp.pyx
halehawk Apr 30, 2020
0cd7622
change zfpy repr
halehawk May 1, 2020
80249e0
change the zfpy repr test
halehawk May 1, 2020
6cd85b7
Merge remote-tracking branch 'upstream/master' into zfpy
halehawk May 23, 2020
b2bbc4e
use zfpy at PyPI
halehawk Jun 2, 2020
dea4fdf
update zfpy in release.rst
halehawk Jun 2, 2020
a8ea612
Merge branch 'master' into zfpy
halehawk Jun 8, 2020
6fcf53e
Merge branch 'master' into zfpy
halehawk Jul 20, 2020
04ddd98
Merge branch 'master' into zfpy
halehawk Sep 8, 2020
0d5e320
updating submodule to latest
halehawk Sep 9, 2020
0eab0ff
Revert "updating submodule to latest"
halehawk Sep 9, 2020
2f4fe41
Revert "Revert "updating submodule to latest""
halehawk Sep 9, 2020
032b0dc
Revert "Revert "Revert "updating submodule to latest"""
halehawk Sep 9, 2020
0321bf1
update c-blosc to 1.18.1
halehawk Sep 9, 2020
1650d42
Merge branch 'master' into zfpy
halehawk Sep 10, 2020
509b9a9
Merge branch 'master' into zfpy
halehawk Sep 11, 2020
12108a0
Update release.rst
halehawk Sep 11, 2020
2e6757f
Revert unrelated C file changes
jakirkham Sep 11, 2020
7b45cf6
Use `pytest.skip` to skip testing
jakirkham Sep 11, 2020
5c1cbcb
Run `black` on zfpy
jakirkham Sep 11, 2020
db76330
Merge branch 'master' into zfpy
halehawk Sep 11, 2020
e9c786c
Merge branch 'master' into zfpy
halehawk Sep 16, 2020
c34f99d
Merge branch 'master' into zfpy
halehawk Feb 2, 2021
a2fcb0f
Merge branch 'master' into zfpy
halehawk Mar 17, 2021
ce4be10
Update requirements_rtfd.txt
halehawk Mar 17, 2021
3e6bd38
Update requirements_dev.txt
halehawk Mar 17, 2021
97abde3
Update requirements_dev.txt
halehawk Mar 17, 2021
3669bd5
Update requirements_rtfd.txt
halehawk Mar 17, 2021
d561587
Update requirements_rtfd.txt
halehawk Mar 17, 2021
c7b699a
Update requirements_dev.txt
halehawk Mar 17, 2021
97848a0
Update conf.py
halehawk Mar 18, 2021
bd5d835
Update conf.py
halehawk Mar 18, 2021
fa34826
Update conf.py
halehawk Mar 18, 2021
7e96469
Merge branch 'master' into zfpy
halehawk Mar 18, 2021
0c44788
Update conf.py
halehawk Mar 18, 2021
44dde11
Update conf.py
halehawk Mar 18, 2021
1 change: 1 addition & 0 deletions .github/workflows/ci-osx.yaml
@@ -46,3 +46,4 @@ jobs:
run: |
conda activate env
pytest -v --pyargs numcodecs

2 changes: 1 addition & 1 deletion docs/conf.py
@@ -24,7 +24,7 @@ def __getattr__(cls, name):
return Mock()


-MOCK_MODULES = ['msgpack']
+MOCK_MODULES = ['msgpack', 'zfpy']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)
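
The `MOCK_MODULES` change above lets the docs build import `numcodecs.zfpy` on a machine where `zfpy` is not installed. A minimal sketch of the idea follows. It uses `unittest.mock.MagicMock` rather than the project's own `Mock` class, and the module name `hypothetical_backend` is made up for illustration; both are assumptions, not part of this PR.

```python
import importlib
import sys
from unittest.mock import MagicMock

# Hypothetical module name, chosen so that it is not installed anywhere.
MOCK_MODULES = ["hypothetical_backend"]

# Register a stand-in object under each name before anything imports it.
sys.modules.update((mod_name, MagicMock()) for mod_name in MOCK_MODULES)

# The import machinery consults sys.modules first, so this now succeeds.
backend = importlib.import_module("hypothetical_backend")

# Attribute access on the stand-in yields another mock instead of failing.
mode = backend.mode_fixed_accuracy
```

Because the stand-in answers any attribute lookup, module-level references such as a default argument that reads a constant from the mocked package can be evaluated during the docs build.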


1 change: 1 addition & 0 deletions docs/index.rst
@@ -62,6 +62,7 @@ Contents
registry
blosc
lz4
zfpy
zstd
zlib
gzip
10 changes: 10 additions & 0 deletions docs/release.rst
@@ -1,6 +1,14 @@
Release notes
=============


Upcoming Release
----------------

* The :class:`numcodecs.zfpy.ZFPY` codec is now supported on Python 3.8 if
  `zfpy==0.5.5 <https://pypi.org/project/zfpy/>`_ is installed


.. _unreleased:

Unreleased
@@ -15,6 +23,7 @@ Unreleased
By :user:`Jackson Maxfield Brown <JacksonMaxfield>`, :issue:`276`.
Help from :user:`Oleg Höfling <hoefling>`, :issue:`273`.


.. _release_0.7.3:

0.7.3
@@ -43,6 +52,7 @@ Unreleased
* Update docs regarding wheels.
By :user:`Josh Moore <joshmoore>`, :issue:`250`.


.. _release_0.7.1:

0.7.1
11 changes: 11 additions & 0 deletions docs/zfpy.rst
@@ -0,0 +1,11 @@
ZFPY
====
.. automodule:: numcodecs.zfpy

.. autoclass:: ZFPY

    .. autoattribute:: codec_id
    .. automethod:: encode
    .. automethod:: decode
    .. automethod:: get_config
    .. automethod:: from_config
Binary file added fixture/zfpy/array.00.npy
Binary file not shown.
Binary file added fixture/zfpy/array.01.npy
Binary file not shown.
Binary file added fixture/zfpy/array.02.npy
Binary file not shown.
Binary file added fixture/zfpy/array.03.npy
Binary file not shown.
Binary file added fixture/zfpy/array.04.npy
Binary file not shown.
Binary file added fixture/zfpy/array.05.npy
Binary file not shown.
Binary file added fixture/zfpy/array.06.npy
Binary file not shown.
Binary file added fixture/zfpy/array.07.npy
Binary file not shown.
10 changes: 10 additions & 0 deletions fixture/zfpy/codec.00/config.json
@@ -0,0 +1,10 @@
{
"compression_kwargs": {
"rate": -1
},
"id": "zfpy",
"mode": 2,
"precision": -1,
"rate": -1,
"tolerance": -1
}
Binary file added fixture/zfpy/codec.00/encoded.00.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.00/encoded.01.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.00/encoded.02.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.00/encoded.03.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.00/encoded.04.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.00/encoded.05.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.00/encoded.06.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.00/encoded.07.dat
Binary file not shown.
10 changes: 10 additions & 0 deletions fixture/zfpy/codec.01/config.json
@@ -0,0 +1,10 @@
{
"compression_kwargs": {
"tolerance": -1
},
"id": "zfpy",
"mode": 4,
"precision": -1,
"rate": -1,
"tolerance": -1
}
Binary file added fixture/zfpy/codec.01/encoded.00.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.01/encoded.01.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.01/encoded.02.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.01/encoded.03.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.01/encoded.04.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.01/encoded.05.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.01/encoded.06.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.01/encoded.07.dat
Binary file not shown.
10 changes: 10 additions & 0 deletions fixture/zfpy/codec.02/config.json
@@ -0,0 +1,10 @@
{
"compression_kwargs": {
"tolerance": -1
},
"id": "zfpy",
"mode": 4,
"precision": -1,
"rate": -1,
"tolerance": -1
}
Binary file added fixture/zfpy/codec.02/encoded.00.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.02/encoded.01.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.02/encoded.02.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.02/encoded.03.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.02/encoded.04.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.02/encoded.05.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.02/encoded.06.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.02/encoded.07.dat
Binary file not shown.
10 changes: 10 additions & 0 deletions fixture/zfpy/codec.03/config.json
@@ -0,0 +1,10 @@
{
"compression_kwargs": {
"precision": -1
},
"id": "zfpy",
"mode": 3,
"precision": -1,
"rate": -1,
"tolerance": -1
}
Binary file added fixture/zfpy/codec.03/encoded.00.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.03/encoded.01.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.03/encoded.02.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.03/encoded.03.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.03/encoded.04.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.03/encoded.05.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.03/encoded.06.dat
Binary file not shown.
Binary file added fixture/zfpy/codec.03/encoded.07.dat
Binary file not shown.
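
The fixture configs above carry a `compression_kwargs` key next to `mode`, `rate`, `precision` and `tolerance`. That is what one would expect if `get_config` serialises every public instance attribute and `from_config` passes the result back to the constructor, which would also explain why `ZFPY.__init__` accepts a `compression_kwargs` argument it never reads. The sketch below illustrates that round trip with a stand-in class. `FixedRateSketch` is hypothetical, models only the fixed-rate case, and is not the numcodecs implementation.

```python
import json


class FixedRateSketch:
    """Hypothetical stand-in for a codec; models only the fixed-rate case."""

    codec_id = "zfpy"

    def __init__(self, mode=2, tolerance=-1, rate=-1, precision=-1,
                 compression_kwargs=None):
        # compression_kwargs is accepted and ignored so that the output of
        # get_config() can be passed straight back to the constructor.
        self.mode = mode
        self.tolerance = tolerance
        self.rate = rate
        self.precision = precision
        self.compression_kwargs = {"rate": rate}

    def get_config(self):
        # Assumed behaviour: the codec id plus every public instance attribute.
        config = {"id": self.codec_id}
        config.update(
            (k, v) for k, v in vars(self).items() if not k.startswith("_")
        )
        return config

    @classmethod
    def from_config(cls, config):
        # Drop the id and feed the remaining keys back as keyword arguments.
        return cls(**{k: v for k, v in config.items() if k != "id"})


codec = FixedRateSketch()
text = json.dumps(codec.get_config(), sort_keys=True, indent=4)
restored = FixedRateSketch.from_config(json.loads(text))
```

Dropping the `compression_kwargs` parameter from the constructor in this sketch would make `from_config` fail with a `TypeError`, since the serialised config contains that key.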
9 changes: 7 additions & 2 deletions numcodecs/__init__.py
@@ -3,7 +3,7 @@
transformation codecs for use in data storage and communication
applications. These include:

-* Compression codecs, e.g., Zlib, BZ2, LZMA and Blosc.
+* Compression codecs, e.g., Zlib, BZ2, LZMA, ZFPY and Blosc.
* Pre-compression filters, e.g., Delta, Quantize, FixedScaleOffset,
PackBits, Categorize.
* Integrity checks, e.g., CRC32, Adler32.
@@ -16,7 +16,6 @@
<https://github.com/alimanfoo/numcodecs/issues>`_.

"""

import multiprocessing
import atexit

@@ -68,6 +67,12 @@
except ImportError:  # pragma: no cover
    pass

try:
    from numcodecs.zfpy import ZFPY
    register_codec(ZFPY)
except ImportError:  # pragma: no cover
    pass

from numcodecs.astype import AsType
register_codec(AsType)
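
The `try`/`except ImportError` block above registers the codec only when its optional dependency is importable. A minimal sketch of that pattern, using a plain dictionary as the registry: `codec_registry` and `register_if_available` are hypothetical names for illustration, and numcodecs keeps its real registry elsewhere.

```python
import importlib

# Hypothetical, simplified registry keyed by codec id.
codec_registry = {}


def register_if_available(codec_id, module_name):
    """Register codec_id only when its backing module can be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Optional dependency missing: skip registration silently.
        return False
    codec_registry[codec_id] = module
    return True


# "json" ships with Python; the second module name is made up and
# is assumed not to exist.
registered_json = register_if_available("json", "json")
registered_fake = register_if_available("fake", "hypothetical_missing_backend")
```

The trade-off is that a missing dependency is silent at import time; users only find out when they look up the codec id and it is not registered.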

2 changes: 0 additions & 2 deletions numcodecs/tests/common.py
@@ -268,12 +268,10 @@ def check_backwards_compatibility(codec_id, arrays, codecs, precision=None, pref

# file with codec configuration information
codec_fn = os.path.join(codec_dir, 'config.json')

# one time save config
if not os.path.exists(codec_fn): # pragma: no cover
with open(codec_fn, mode='w') as cf:
_json.dump(codec.get_config(), cf, sort_keys=True, indent=4)

# load config and compare with expectation
with open(codec_fn, mode='r') as cf:
config = _json.load(cf)
86 changes: 86 additions & 0 deletions numcodecs/tests/test_zfpy.py
@@ -0,0 +1,86 @@
import pytest


import numpy as np


try:
    # noinspection PyProtectedMember
    from numcodecs.zfpy import ZFPY, _zfpy
except ImportError:  # pragma: no cover
    pytest.skip("ZFPY not available", allow_module_level=True)


from numcodecs.tests.common import (
    check_encode_decode_array,
    check_config,
    check_repr,
    check_backwards_compatibility,
    check_err_decode_object_buffer,
    check_err_encode_object_buffer,
)


codecs = [
    ZFPY(mode=_zfpy.mode_fixed_rate, rate=-1),
    ZFPY(),
    ZFPY(mode=_zfpy.mode_fixed_accuracy, tolerance=-1),
    ZFPY(mode=_zfpy.mode_fixed_precision, precision=-1),
]


# mix of dtypes: float, integer
# mix of shapes: 1D, 2D, 3D, 4D
# mix of orders: C, F
arrays = [
    np.linspace(1000, 1001, 1000, dtype="f4"),
    np.linspace(1000, 1001, 1000, dtype="f8"),
    np.random.normal(loc=1000, scale=1, size=(100, 10)),
    np.random.normal(loc=1000, scale=1, size=(10, 10, 10)),
    np.random.normal(loc=1000, scale=1, size=(2, 5, 10, 10)),
    np.asfortranarray(np.random.normal(loc=1000, scale=1, size=(5, 10, 20))),
    np.random.randint(-(2 ** 31), -(2 ** 31) + 20, size=1000, dtype="i4").reshape(
        100, 10
    ),
    np.random.randint(-(2 ** 63), -(2 ** 63) + 20, size=1000, dtype="i8").reshape(
        10, 10, 10
    ),
]


def test_encode_decode():
    for arr in arrays:
        if arr.dtype == np.int32 or arr.dtype == np.int64:
            codec = [codecs[-1]]
        else:
            codec = codecs
        for code in codec:
            check_encode_decode_array(arr, code)


def test_config():
    for codec in codecs:
        check_config(codec)


def test_repr():
    check_repr("ZFPY(mode=4, tolerance=0.001, rate=-1, precision=-1)")


def test_backwards_compatibility():
    for i, code in enumerate(codecs):
        if code.mode == _zfpy.mode_fixed_rate:
            codec = [code]
            check_backwards_compatibility(ZFPY.codec_id, arrays, codec)
        else:
            check_backwards_compatibility(
                ZFPY.codec_id, arrays[: len(arrays) - 2], codecs
            )


def test_err_decode_object_buffer():
    check_err_decode_object_buffer(ZFPY())


def test_err_encode_object_buffer():
    check_err_encode_object_buffer(ZFPY())
89 changes: 89 additions & 0 deletions numcodecs/zfpy.py
@@ -0,0 +1,89 @@
_zfpy = None
try:
    import zfpy as _zfpy
except ImportError:  # pragma: no cover
    pass


if _zfpy:

    from .abc import Codec
    from .compat import ndarray_copy, ensure_contiguous_ndarray, ensure_bytes

    # noinspection PyShadowingBuiltins
    class ZFPY(Codec):
        """Codec providing compression using the zfp library via its zfpy
        Python bindings.

        Parameters
        ----------
        mode : integer
            One of the zfpy mode choices, e.g., ``zfpy.mode_fixed_accuracy``.
        tolerance : double, optional
            A double-precision number, specifying the compression accuracy needed.
        rate : double, optional
            A double-precision number, specifying the compression rate needed.
        precision : int, optional
            An integer number, specifying the compression precision needed.

        """

        codec_id = "zfpy"

        def __init__(
            self,
            mode=_zfpy.mode_fixed_accuracy,
            tolerance=-1,
            rate=-1,
            precision=-1,
            compression_kwargs=None,
        ):
            self.mode = mode
            if mode == _zfpy.mode_fixed_accuracy:
                self.compression_kwargs = {"tolerance": tolerance}
            elif mode == _zfpy.mode_fixed_rate:
                self.compression_kwargs = {"rate": rate}
            elif mode == _zfpy.mode_fixed_precision:
                self.compression_kwargs = {"precision": precision}
            else:
                pass

            self.tolerance = tolerance
            self.rate = rate
            self.precision = precision

        def encode(self, buf):

            # normalise inputs
            buf = ensure_contiguous_ndarray(buf)

            # do compression
            return _zfpy.compress_numpy(
                buf, write_header=True, **self.compression_kwargs
            )

        def decode(self, buf, out=None):

            # normalise inputs
            buf = ensure_bytes(buf)
Member

Why do you need bytes? This will copy the data unlike ensure_contiguous_ndarray or ensure_ndarray.

Contributor

John, can you clarify this? I am confused about why the input to decode would be an array. The numcodecs API docs describe buf as follows:

Encoded data. May be any object supporting the new-style buffer protocol or array.array under Python 2.

Should we open an issue in zfpy to inquire whether they could accept an array instead of bytes?

Contributor

It seems like zfpy could easily be modified to accept a buffer rather than bytes:
https://github.com/LLNL/zfp/blob/079c409fcb53de5a06fa4336972c04869e33c48d/python/zfpy.pyx#L331-L343

Contributor Author

I am asking Peter for a py38 build on Linux and Windows. It may take a while to get it. In the meantime, can I skip py38 testing and OSX testing? I cannot find a way to skip py38 testing for numcodecs.zfpy only. Do you know how to do it? @rabernat @jakirkham

Member

@rabernat @halehawk, ensure_ndarray takes a view onto anything that is a new-style buffer and returns it as a NumPy array (also a new-style buffer). Though it is just the same compressed 1-D uint8 array of data. Mainly it acts as a type checker as it raises on anything that is not a new-style buffer. It also gives us something easier to manipulate. That said, we could cast it to a memoryview after if you prefer. The main point is ensure_ndarray is basically free whereas ensure_bytes always copies (unless it is actually a bytes object). Would be better not to copy if we can avoid it both for performance and memory usage reasons.

Contributor

This conversation does not seem resolved. Have we decided to keep ensure_bytes? Have we discussed this choice with the zfp devs?
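
The copy-versus-view distinction raised in this thread can be illustrated with the standard library alone. The sketch below does not call numcodecs' `ensure_bytes` or `ensure_ndarray`; it uses the built-in `bytes` and `memoryview` as stand-ins, on the assumption that they show the same trade-off in miniature.

```python
# A mutable buffer standing in for a chunk of compressed data.
encoded = bytearray(b"compressed-chunk")

# memoryview wraps the existing memory without copying it.
view = memoryview(encoded)

# bytes() allocates a new object and copies the contents into it.
copied = bytes(encoded)

# Mutating the original is visible through the view but not in the copy.
encoded[0] = ord("C")
view_first = view[0]
copy_first = copied[0]
```

For large chunks the copy costs both time and peak memory, which is the concern behind the question of whether `decode` really needs `bytes`.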

            if out is not None:
                out = ensure_contiguous_ndarray(out)

            # do decompression
            dec = _zfpy.decompress_numpy(buf)

            # handle destination
            if out is not None:
                return ndarray_copy(dec, out)
            else:
                return dec

        def __repr__(self):
            r = "%s(mode=%r, tolerance=%s, rate=%s, precision=%s)" % (
                type(self).__name__,
                self.mode,
                self.tolerance,
                self.rate,
                self.precision,
            )
            return r
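
In `__init__` above, an unrecognised `mode` falls through to `pass`, so `self.compression_kwargs` is never set and a later `encode` call would fail with an `AttributeError`. One possible alternative, which is not part of this PR, is to validate the mode up front. The helper name `select_compression_kwargs` is hypothetical, and the numeric mode values are taken from the fixture configs in this PR on the assumption that they match zfpy's constants.

```python
# Mode values as recorded in the fixture configs (assumed to match zfpy).
MODE_FIXED_RATE = 2
MODE_FIXED_PRECISION = 3
MODE_FIXED_ACCURACY = 4


def select_compression_kwargs(mode, tolerance=-1, rate=-1, precision=-1):
    """Return the single keyword argument that applies to the given mode."""
    if mode == MODE_FIXED_ACCURACY:
        return {"tolerance": tolerance}
    if mode == MODE_FIXED_RATE:
        return {"rate": rate}
    if mode == MODE_FIXED_PRECISION:
        return {"precision": precision}
    # Unlike the merged code, fail early instead of silently continuing.
    raise ValueError("unsupported zfp mode: %r" % (mode,))
```

Failing in the constructor puts the error next to the bad argument rather than at the first `encode` call.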
2 changes: 2 additions & 0 deletions requirements_dev.txt
@@ -1,3 +1,5 @@
Cython==0.29.21
msgpack==1.0.2
numpy==1.19.0
zfpy==0.5.5; python_version < '3.9'

1 change: 1 addition & 0 deletions requirements_rtfd.txt
@@ -6,3 +6,4 @@ numpydoc
mock
numpy
cython
zfpy==0.5.5; python_version < '3.9'
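
The `python_version < '3.9'` environment marker in both requirements files restricts the `zfpy` pin to interpreters older than 3.9. A rough standard-library approximation of that check is sketched below. `zfpy_pin_applies` is a hypothetical helper for illustration and is not how pip evaluates markers internally.

```python
import sys


def zfpy_pin_applies(version_info=sys.version_info):
    """Approximate the marker `python_version < '3.9'` for an interpreter."""
    # python_version is "major.minor", so only the first two fields matter.
    return tuple(version_info[:2]) < (3, 9)
```

Comparing tuples rather than strings matters here: as a string, `'3.10'` sorts before `'3.9'`, whereas `(3, 10)` correctly compares greater than `(3, 9)`.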
3 changes: 3 additions & 0 deletions setup.py
@@ -76,6 +76,8 @@ def blosc_extension():
if os.path.isdir(d)]
include_dirs += [d for d in glob('c-blosc/internal-complibs/*/*')
if os.path.isdir(d)]
include_dirs += [d for d in glob('c-blosc/internal-complibs/*/*/*')
if os.path.isdir(d)]
define_macros += [('HAVE_LZ4', 1),
('HAVE_SNAPPY', 1),
('HAVE_ZLIB', 1),
@@ -315,6 +317,7 @@ def run_setup(with_extensions):
if with_extensions:
ext_modules = (blosc_extension() + zstd_extension() + lz4_extension() +
compat_extension() + shuffle_extension() + vlen_extension())

cmdclass = dict(build_ext=ve_build_ext)
else:
ext_modules = []