
Developed zfp python wrapper for numcodecs and is ready to be used #160


Closed
wants to merge 22 commits

Conversation

@halehawk
Contributor

halehawk commented Dec 21, 2018

Closes #117

I'd like to know how to add tests.
I'd also like to know how to add the zfp 0.5.4 C source code to c-blosc/internal-complibs.

TODO:

  • Unit tests and/or doctests in docstrings
  • tox -e py37 passes locally
  • tox -e py27 passes locally
  • Docstrings and API docs for any new/modified user-facing classes and functions
  • Changes documented in docs/release.rst
  • tox -e docs passes locally
  • AppVeyor and Travis CI passes
  • Test coverage to 100% (Coveralls passes)

@rabernat
Contributor

Hi @halehawk -- thanks so much for getting this started! We really appreciate your contribution and we look forward to working with you to get this pull request merged.

Since you are a new numcodecs contributor, I would like to explain some aspects of our process. The requirements for all PRs are outlined in the "TODO" list at the top. There are 8 items that must be satisfied (and currently 0 have been checked off). These requirements are all related to testing and documentation. Nearly every open source python package follows similar procedures. Since you asked for some help on tests, I'm going to assume you are unfamiliar with this process but interested in learning.

Numcodecs has a comprehensive test suite which helps us maintain the reliability of the package. These tests are located in the numcodecs/tests/ subdirectory. For example, the tests for blosc are in
https://github.com/zarr-developers/numcodecs/blob/master/numcodecs/tests/test_blosc.py.
You will need to add a new test file test_zfp.py which validates that your code does what it is supposed to do. This is significantly more work than just writing the code itself; however, it is a useful and necessary part of the development process. We will be happy to advise you on the details of testing once you have made a first pass at it.

The test suite is run using tox. You need to have tox and pytest installed in your local environment (full requirements are in requirements_test.txt). You run the tests from the command line using the commands tox -e py37 and tox -e py27. You should get used to running these commands from your local numcodecs development environment.

The tests are also run automatically in the cloud using two free services, Travis (for Unix) and AppVeyor (for Windows). As you can see above, those tests are currently failing on this PR. If you follow the Travis link, you will eventually see the following error:

==================================== ERRORS ====================================
________________ ERROR collecting numcodecs/tests/test_blosc.py ________________
ImportError while importing test module '/home/travis/build/zarr-developers/numcodecs/numcodecs/tests/test_blosc.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
numcodecs/tests/test_blosc.py:12: in <module>
    from numcodecs import blosc
numcodecs/compat_ext.pxd:4: in init numcodecs.blosc
    cdef class Buffer:
E   ModuleNotFoundError: No module named 'numcodecs.compat_ext'

This indicates that the tests could not be run at all because of an import error introduced by your changes.

The fraction of the package which is covered by tests is called the "coverage". We use the coveralls utility to measure our coverage. Currently coverage is at 100%: https://coveralls.io/github/zarr-developers/numcodecs?branch=master.

We will also want documentation on the new zfp functions. Our documentation is built by sphinx. You will want to add a new documentation page modeled after https://github.com/zarr-developers/numcodecs/blob/master/docs/blosc.rst.

I would recommend starting with the tests. Your first step will be to get the test suite running locally. Once all the old tests are running correctly and passing, then you can begin adding new tests.

To update your pull request, simply make more commits to your local branch and then run git push. This page will automatically update, and all the automated tests will re-run.

Please let us know if anything is unclear. If you are new to this process, it can be a bit overwhelming. However, these are very useful skills to learn for anyone involved in open source software, so I hope you bear with us and help finish this off right.

Thanks again for your excellent contribution!

@halehawk
Contributor Author

halehawk commented Dec 22, 2018 via email

@halehawk
Contributor Author

halehawk commented Dec 22, 2018 via email

    ext_modules = (blosc_extension() + zstd_extension() + lz4_extension() +
                   compat_extension() + vlen_extension())
    ext_modules = (blosc_extension() + lz4_extension() +  # zstd_extension() +
                   zfp_extension() + compat_extension() + vlen_extension())


@halehawk Nice work! I have a couple of comments.

First, you're making the assumption that the ndarray uses C ordering, which may or may not be the case. Perhaps Zarr enforces this? It would be better to verify that this is the case, perhaps by checking ndarray.flags or ndarray.strides (I'm no Python expert--there may be a better way).

Second, the default settings of tolerance=0.01, precision=32, and rate=16 are arbitrary and inconsistent with the default expert mode parameters (minbits, maxbits, maxprec, minexp). Absent such specifications, it would be better to use the zfp defaults. For instance, tolerance=0.01 could lose all data if exponents are small; I would expect tolerance=0 to be a more reasonable default value. precision=32 is similarly not an appropriate choice for double-precision data; I would suggest using ZFP_MAX_PREC, which is analogous to setting tolerance=0 and works for both floats and doubles. Finally, a default rate of 16 will also inevitably introduce some level of data loss. It is difficult to suggest a good default rate, as you wouldn't want to use more bits per value than the uncompressed representation requires. Perhaps the safest setting is rate=0, which I believe implies that the user has to override this value and set a rate meaningful to the application.

@rabernat
Contributor

I wrote a test_zfp.py, but when I try to run the test, it keeps saying it cannot import zfp. I tried several ways and can't make it work.

Then how do you know that any of your code works? How are you testing it now?

setup.py Outdated

# setup sources - use ZFP bundled in blosc
zfp_sources = glob('c-blosc/internal-complibs/zfp*/src/*.c')
include_dirs = [d for d in glob('c-blosc/internal-complibs/zfp*/*') if os.path.isdir(d)]
Contributor

I'm confused about this. zfp is not part of blosc. Why are you looking for it there?

@halehawk
Contributor Author

halehawk commented Dec 22, 2018 via email

@halehawk
Contributor Author

halehawk commented Dec 22, 2018 via email

@halehawk
Contributor Author

halehawk commented Dec 22, 2018 via email

@halehawk
Contributor Author

halehawk commented Dec 22, 2018 via email

@lindstro

zfp expects the dimensions to be specified in what's commonly referred to as Fortran order, i.e., x (the leftmost dimension) varies fastest, then y, then z, etc. The rightmost dimension varies slowest. This behavior can be modified by specifying strides (see the zfp_set_stride family of functions).

My concern was that the ndarray being compressed can be ordered in Python to follow a Fortran contiguous ordering (e.g., via numpy.asfortranarray or via order='F' in the array constructor), in which case I believe (although I could be wrong) the dimensions from ndarray.shape should be reversed. In other words, ndarray.shape is the same whether order='C' or order='F', but the strides (i.e., memory layout) will be different, and zfp needs to know the correct memory layout.

A more general solution is to consult ndarray.strides and set the zfp_field strides to match this. Then you don't have to explicitly test whether the order is C, F, or something else.

@jakirkham
Member

We have some utility functions (e.g. ensure_contiguous_ndarray), which take an ndarray view of any contiguous array-like object implementing the buffer protocol and flatten it. I expect that's what you will want to do here.

@jakirkham
Member

As to the C code, we will probably want to do the same thing we do with Blosc and include it as a Git submodule.

@halehawk
Contributor Author

I did the following:

    elif input_array.ndim == 2:
        zfp_field_set_size_2d(field, input_array.shape[1], input_array.shape[0])
    elif input_array.ndim == 3:
        zfp_field_set_size_3d(field, input_array.shape[2], input_array.shape[1], input_array.shape[0])
    elif input_array.ndim == 4:
        print(input_array.shape)
        zfp_field_set_size_4d(field, input_array.shape[3], input_array.shape[2], input_array.shape[1], input_array.shape[0])
    else:

This way, zfp can read a C-order array correctly.

After I decompress, I reshape the array to (nw, nz, ny, nx). I tested the accuracy and precision modes, and they worked fine.

ensure_contiguous_ndarray flattens the array, which zfp does not expect; zfp works better with higher-dimensional data.

I will try set_stride later.

@halehawk
Contributor Author

halehawk commented Dec 22, 2018 via email

@jakirkham
Member

Ah ok. Didn’t realize the compressor was doing something different for higher dimensional arrays. Your proposal makes sense.

IIRC under the hood Buffer is using ensure_contiguous_ndarray, but it should still work as the data is unchanged.

Basically yes. Would go ahead and make it a new Git submodule. If you’re unfamiliar with Git submodules, this doc may help.

@jakirkham
Member

Also, for errors we should raise exceptions instead of printing. For example, use raise TypeError(...) if the type is unsupported.

@lindstro

I did the following:

    elif input_array.ndim == 2:
        zfp_field_set_size_2d(field, input_array.shape[1], input_array.shape[0])
    elif input_array.ndim == 3:
        zfp_field_set_size_3d(field, input_array.shape[2], input_array.shape[1], input_array.shape[0])
    elif input_array.ndim == 4:
        print(input_array.shape)
        zfp_field_set_size_4d(field, input_array.shape[3], input_array.shape[2], input_array.shape[1], input_array.shape[0])
    else:

This works as long as the ndarray uses (the default) C order layout, but fails otherwise. What if the user attempts to compress an ndarray laid out in Fortran order?

My suggestion would be to keep your code the same, since C order will be the most common case for numpy arrays, but to also set strides to correctly handle other layouts. Note that ndarray.strides are measured in bytes, while zfp measures strides in number of scalars.

@lindstro

Do you mean linking to Peter's LLNL/zfp on GitHub directly? Then type and exec need to be changed.

Is there a Cython workaround for this? If not, then we can probably address this in a January release together with some other bug fixes.

@halehawk
Contributor Author

halehawk commented Dec 23, 2018 via email

@lindstro mentioned this pull request Dec 23, 2018
@halehawk
Contributor Author

halehawk commented Dec 28, 2018 via email

@jakirkham
Member

My recommendation (though it would be good to hear others' thoughts as well) would be to raise a TypeError for any data types that it cannot compress. In terms of getting the type, one can run ensure_ndarray and inspect the dtype of the resulting ndarray to figure out what it is. This technique will also catch bytes and bytearray, as those come back as uint8.

@rabernat
Contributor

rabernat commented Jan 9, 2019

I agree with @jakirkham's suggestion. If a type can't be compressed by a particular codec, an informative TypeError should be raised.

@halehawk - please let us know if you have any other questions or need help with getting your tests running. Feel free to push your work-in-progress branch to github so we can follow along.

@rabernat
Contributor

rabernat commented Jan 9, 2019

Also, given the rapid recent progress in LLNL/zfp#29, perhaps we should consider whether to try that implementation of zfp in python.

@lindstro

lindstro commented Mar 5, 2020

@halehawk Is this PR based on zfp's official Python bindings (available since 0.5.5 and released in May 2019) or your own wrappers?

@halehawk
Contributor Author

halehawk commented Mar 5, 2020 via email

@halehawk
Contributor Author

halehawk commented Mar 5, 2020 via email

@lindstro

lindstro commented Mar 5, 2020

Currently it is based on my wrappers. I need a tool to compress MPAS data and decompress it using zarr/numcodecs. But I am having trouble getting it to pass the CI build and tests now. If I use your bindings, can I include them in numcodecs?

Of course. zfp is licensed under BSD, so there are very few restrictions on what you can do with it. I figured zarr and numcodecs would be better served by integrating with the official zfpy implementation. Thanks for your help with this.

@rabernat
Contributor

rabernat commented Mar 6, 2020

Now that the official zfp python wrapper has been released, it seems like we should consider using that, rather than generating new python bindings ourselves. This will mean less maintenance work going forward. It might also fix our windows problem (or at least make it not our problem). @jakirkham - do you have a preference? How do we feel about introducing a new dependency?

The zfp python bindings are documented here: https://zfp.readthedocs.io/en/release0.5.5/python.html

However, @lindstro, I'm confused about the python package release status. On pip there is a separate pyzfp package, which does not appear to be the official bindings.

Ideally, if we were to include zfp as a dependency, we would need both a pip and conda-forge package.

If we do go this route, we should probably start fresh with a new PR.

@lindstro

lindstro commented Mar 6, 2020

@rabernat We have no association with pyzfp, which was developed independently. The conda zfpy package is based on the official zfpy API. We reserved the name for a corresponding pip package, but we have not had the time to develop it yet. We would gladly accept some help with this in case anyone is interested.

@jakirkham
Member

Trying to build a codec using zfpy seems like a good first step. No objection to adding the codec here. Though maybe we should consider making zfpy an optional dependency. WDYT @rabernat?

@rabernat
Contributor

rabernat commented Mar 6, 2020

Though maybe we should consider making zfpy an optional dependency.

Do we have any optional dependencies currently? What happens if you try to read a zarr dataset without the dependency installed? Is there a useful error message?

If so, I think we should make zfpy an optional dependency and write a very thin layer for its codec in numcodecs. We could also help with the pip packaging issue.

@halehawk
Contributor Author

halehawk commented Mar 6, 2020 via email

@jakirkham
Member

Yep these all seem like good ideas.

This is sort of how Blosc worked for people pip installing it. We initially made LZMA optional as part of Python 2 support. Though yeah this is a bit less common.

There is an error raised here if the codec is not available. We could improve this a bit to include some instructions on what to do to make it work. Agree this is important to have right.

Please let me know if you have any thoughts here 🙂

@halehawk
Contributor Author

halehawk commented Mar 6, 2020 via email

@rabernat
Contributor

rabernat commented Mar 7, 2020

So where is the lzma module? I only see it imported in lzma.py.

It is a totally separate package, like numpy. We don't include numpy code in this repo either. The required dependencies for numcodecs are listed here:

cython
numpy
msgpack
pytest

These are packages that will be automatically installed by pip if we try to install numcodecs.

Note that lzma is not one of them. The tests are configured to skip lzma tests if it is not installed:

    try:
        # noinspection PyProtectedMember
        from numcodecs.lzma import LZMA, _lzma
    except ImportError:  # pragma: no cover
        raise unittest.SkipTest("LZMA not available")

@halehawk
Contributor Author

halehawk commented Mar 9, 2020 via email

@lindstro

We have already reserved zfpy but would appreciate help with packaging.

@alimanfoo
Member

Hi folks, can I just check in on the status here: are we still waiting on zfpy to be available via PyPI? I see there is a 0.0.0 release on PyPI/zfpy, but conda-forge/zfpy is at 0.5.5.

Could you just release zfpy source tarballs to PyPI and not worry about any binaries (if people need them then recommend the conda-forge route)?

@lindstro

lindstro commented Apr 6, 2020

PyPI/zfpy is being worked on (see https://test.pypi.org/project/zfpy/0.5.5rc1/#files for current progress). Hopefully not too much longer.

@jakirkham
Member

jakirkham left a comment

Added a couple comments to firm things up here.

Comment on lines +4 to +6
[submodule "zfp"]
path = zfp
url = https://github.com/LLNL/zfp.git
Member

We shouldn't need the source code here since we can go the dependency route.

Suggested change
[submodule "zfp"]
path = zfp
url = https://github.com/LLNL/zfp.git

Please remove the submodule as well.

@@ -0,0 +1,17 @@
Zfp
Member

Not to bikeshed things, but are we wanting this to be zfp or zfpy? Ok either way, just wanted to check that we are happy with the naming before we ship this.

Contributor Author

In fact, PR #160 is not needed any more, because @lindstro has their own zfp wrapper, so I have PR #229 ready for you to approve and merge. @jakirkham

Member

Sorry, must have gotten turned around. Should we close this PR out then?

@jakirkham
Member

It looks like we have PyPI packages now. Are these good to use @lindstro?

@lindstro

lindstro commented Jun 2, 2020

@jakirkham I thought these packages were good to go, but after further testing it looks like the osx wheels don't include the libzfp dylib. As I understand, the solution to this is to use delocate. Does anyone have an example of how to do this on Travis?

The Python interface to zfp is called zfpy, so I'd go with zfpy.

@halehawk
Contributor Author

halehawk commented Jun 2, 2020

@lindstro @jakirkham I changed my pip install from test PyPI to the PyPI zfpy 0.5.5 in PR #229, and then it passed all checks, including the osx checks.

@jakirkham
Member

@jakirkham I thought these packages were good to go, but after further testing it looks like the osx wheels don't include the libzfp dylib. As I understand, the solution to this is to use delocate. Does anyone have an example of how to do this on Travis?

Unfortunately I do not. @matthew-brett is an expert in this area. Maybe he can advise?

The Python interface to zfp is called zfpy, so I'd go with zfpy.

+1

@matthew-brett

Sorry - I am confused - can I help here - or is all good?

@halehawk closed this Jun 2, 2020
@halehawk mentioned this pull request Jun 2, 2020
@lindstro

lindstro commented Jun 2, 2020

Sorry - I am confused - can I help here - or is all good?

@matthew-brett We're building osx wheels on Travis CI and deploying to PyPI. Our Python package is a couple of Cython wrappers around the zfp C/C++ library. Unfortunately, the osx wheels do not include the libzfp dylib. I ran across an SO post that suggested using delocate to force inclusion of the dylib, though I'm not sure how to set up our Travis scripts to make this happen.

I'm the developer of zfp and have very limited expertise when it comes to Python, PyPI, and Travis CI. I was hoping someone could point to an example of how to properly include dylibs when building wheels on Travis. Any pointers would be appreciated.

@jakirkham
Member

Sorry Matthew. The PR is out-of-date, but the packaging issue with zfpy remains, on which the newer PR will depend. If you are able to advise, that would be very helpful 🙂
