Fixing blosc encode error handling #81
db1b956
to
af83108
Compare
Looks like CI providers are OK with making a 2GB array, so that's good. We'll need to take a bit of care on 32 bit builds on Windows, but this seems manageable. The bigger issue seems that there are hard limits on the buffer size for other compressors too (LZ4 has anyway, haven't gone through the others). LZ4 fails correctly, but doesn't give any error message. This is poor from a user perspective. I guess the only good way to do this is to have a |
Thanks @jeromekelleher. Adding a Re point (2), I haven't seen that before, @FrancescAlted is there something we should be doing to clean up after a blosc internal error has occurred, when using blosc with multiple threads and global state (blosc_compress(), blosc_decompress())? Re point (3), @FrancescAlted could we discuss how blosc reports errors and if there is a better way to enable applications like numcodecs to capture and propagate them appropriately? Happy to raise an issue on the c-blosc repo if you think appropriate. |
Hi. I don't have that much experience with crash recovery (Blosc crashes very rarely, if ever, for me), but I'd say that a Regarding the error messages, yes, currently a description (more or less accurate) is sent to stderr and a negative value is returned. Fixing this, while feasible, would require the introduction of a couple of APIs and a better catalog of the different errors (and messages) that can occur internally. While not difficult, this would take quite a bit of time, so happy if anybody would be interested in doing a PR for fixing this. |
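The convention described here (a negative return value signals failure, while the description only goes to stderr) can be wrapped on the Python side so that callers get an exception instead of a silent failure. The following is a minimal sketch under stated assumptions: `CompressionError`, `check_return_code` and `fake_compress` are hypothetical names for illustration and are not part of Blosc or numcodecs; `fake_compress` merely stands in for a C call.

```python
class CompressionError(RuntimeError):
    """Hypothetical exception type; not part of Blosc or numcodecs."""


def check_return_code(rc, operation):
    """Raise if a C-style return code signals failure (a negative value)."""
    if rc < 0:
        raise CompressionError(
            "%s failed with error code %d" % (operation, rc))
    return rc


def fake_compress(nbytes, limit=2**31 - 1):
    """Stand-in for a C call: returns a size on success, negative on error.

    Assumption for illustration only: the call fails when the input
    exceeds `limit` bytes, and otherwise reports half the input size.
    """
    if nbytes > limit:
        return -1
    return nbytes // 2
```

With a wrapper like this the caller sees which operation failed and with what code, even though the library's own message never leaves stderr.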
Thanks for the clarifications @FrancescAlted, this is very helpful. I agree with @alimanfoo, in that I think the best way forward here is to keep the |
ps. I'm happy to do the coding for this, but might let some other PRs get merged before taking it up again. |
Thanks @jeromekelleher, SGTM. |
Hi @jeromekelleher, would you be interested in taking this up again? We're pushing towards a release and would be great to include this. |
OK, I can take a look here @alimanfoo and see if there's something I can do cleanly fairly quickly. What's the timeline for the release? |
Thanks @jeromekelleher. No specific timeline, but there's only this and a couple of other maintenance issues, then I think we would be at a nice point to release. |
af83108
to
f1cbde4
Compare
I've made a pass at handling this in a reasonably general way @alimanfoo. We could embed the check down in the C code or in the guts of each codec separately, which would be simpler and more efficient. However, it would be very difficult to test this. The testing hack I have here is a bit messy, but at least we get coverage on the actual mechanism. Having to create > 2GiB to provoke this will be a mess on CI, as we'll regularly have failures when the tests happen to run on a machine that's a bit more memory constrained or whatever. If you like the approach, I'd need to add calls to |
f1cbde4
to
5545096
Compare
Ah, good ole Python 2. I'm sure this is fixable anyway if we like the general approach. |
Hi @jeromekelleher, I think the approach looks great, thanks. I think it would be worth holding further work on this until we get either of #128 or #121 merged, those PRs are both attempts to simplify the handling and normalisation of different possible input types. Once one of those is merged, we could then rebase this PR, which would simplify a little and provide a consistent base to work from across other codecs. It would also enable a fix to the PY27 failures on travis about |
SGTM @alimanfoo, would you mind pinging me when the infrastructure is in place? |
@jeromekelleher will do, thanks. |
Went ahead and merged |
Thanks @jakirkham. FWIW I think it would be OK to restrict this to just Blosc for the current PR. Playing around with some other codecs now, looks like at least Zlib can handle larger inputs, but no idea how big. |
For the record... LZ4 has a max input buffer size of 0x7E000000 Zstd, Zlib, LZMA, BZ2 all work for buffers larger than 2**31, don't know what is max. |
Thanks for the update @jakirkham, this is much neater. @alimanfoo, how about we have a default max_buffer_size of 2**63 - 1 (practically unlimited) and set the buffer sizes for LZ4 and Blosc appropriately? I think it is good to keep this in the ABC so that we can test the limit checking properly, otherwise it's a real mess trying to provoke the error conditions. |
SGTM |
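The proposal in this exchange, a practically unlimited default max_buffer_size on the base class that individual codecs override, might look roughly like the sketch below. It is an illustration only: the `Codec` class, `check_buffer_size` and `SmallLimitCodec` here are hypothetical and do not reproduce the numcodecs ABC. The point is that a test-only subclass with a tiny limit exercises the check without allocating more than 2 GiB.

```python
from abc import ABC, abstractmethod
import zlib


class Codec(ABC):
    """Hypothetical base class; mirrors the idea, not the numcodecs API."""

    # Default is practically unlimited, as proposed in the thread.
    max_buffer_size = 2**63 - 1

    def check_buffer_size(self, buf):
        """Raise ValueError if `buf` is larger than this codec supports."""
        nbytes = memoryview(buf).nbytes
        if nbytes > self.max_buffer_size:
            raise ValueError(
                "Codec does not support buffers of > %d bytes"
                % self.max_buffer_size)
        return nbytes

    @abstractmethod
    def encode(self, buf):
        """Return the encoded form of `buf`."""


class SmallLimitCodec(Codec):
    """Test-only codec with a tiny limit, so no huge allocation is needed."""

    max_buffer_size = 16

    def encode(self, buf):
        self.check_buffer_size(buf)
        return zlib.compress(bytes(buf))
```

A codec with a real hard limit would set the attribute in the same way, for example to the 0x7E000000 figure reported for LZ4 earlier in the thread.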
Cool. I'm happy to finish this one up so. One quick question: how do I make a test that will run for all of the different encoders? |
Thanks @jeromekelleher. Currently, any tests to be run on multiple codecs are defined in numcodecs/tests/common.py, e.g., |
4bf5b7f
to
9f50722
Compare
I had a change of heart here @alimanfoo, and changed this to only check the What do you think? |
Maybe this check should live in ensure_contiguous_ndarray? That's where other similar checks are already handled. |
Good point --- presumably these codecs are requiring contiguity in the C code anyway, so we're not making any extra requirements. @alimanfoo? |
Sounds like a good solution to me. |
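Placing the size check next to the contiguity check, as suggested here, keeps all input validation in one helper. The sketch below shows the idea with the standard library only; `ensure_contiguous_buffer` is a hypothetical stand-in and is not the ensure_contiguous_ndarray function discussed in the thread, whose actual signature and behaviour may differ.

```python
def ensure_contiguous_buffer(buf, max_buffer_size=None):
    """Return a contiguous memoryview of `buf`, optionally size-checked.

    Hypothetical stdlib-only stand-in for the helper named in the thread.
    """
    mv = memoryview(buf)
    # Codecs that hand a raw pointer to C code need contiguous memory.
    if not mv.contiguous:
        raise ValueError("a contiguous buffer is required")
    # Only enforce a limit when the calling codec supplies one.
    if max_buffer_size is not None and mv.nbytes > max_buffer_size:
        raise ValueError(
            "Codec does not support buffers of > %d bytes" % max_buffer_size)
    return mv
```

A codec would then call the helper once at the top of encode and decode, passing its own limit, rather than repeating the check in each method.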
9f50722
to
2c5fcd0
Compare
Reverted earlier changes to the ABC layer as converting to ndarray and checking buffer size seems pointless and inefficient.
2c5fcd0
to
e4f9152
Compare
OK, I think we're ready to go @alimanfoo and @jakirkham. I can squash down to one commit if you prefer (seemed impolite to nuke @jakirkham's commit!). |
Many thanks @jeromekelleher. There is one small thing worth mentioning. Currently a call to Probably worth keeping things tight and avoiding duplicated calls. Suggest the simplest thing to do would be to remove the |
Otherwise I think it's good to go. |
OK, that's done @alimanfoo. There was a slight complication in that decoding into a buffer also needed the conversion to a contiguous array, which was automatically being done by Buffer. I just put in explicit calls to fix it up, and it seems fine now. |
Hmm, seems like some PY2 test specialisations aren't needed either now. Nice side effect I guess. |
Actually the PY2 test failures are an indicator that the vlen codecs also need to make use of I think a decent solution to this would be to add |
4170f93
to
7333035
Compare
I see, thanks @alimanfoo. That's done now. Out of curiosity, is there some quirk in Cython that won't let you use superclasses? The |
Thanks @jeromekelleher. You're right about the code duplication, the decode methods are almost identical, except for one line where they construct an item to be placed in the output array. AFAIK super-classes are fine, and there is clearly an opportunity for refactoring here. Suggest we deal with that as a separate issue. |
WIP. Closes #80.
Some problems:
blosc.use_threads is true. It looks like there is some library cleanup that should be performed after an error occurs?
TODO:
tox -e py36 passes locally
tox -e py27 passes locally
tox -e docs passes locally