Fixing blosc encode error handling #81

Merged
merged 7 commits into from
Nov 29, 2018
5 changes: 5 additions & 0 deletions docs/release.rst
@@ -40,6 +40,11 @@ Release notes
some codecs, and also simplifies the implementation of some codecs, improving
code readability and maintainability. By :user:`John Kirkham <jakirkham>` and
:user:`Alistair Miles <alimanfoo>`; :issue:`119`, :issue:`121`, :issue:`128`.

+* Improvements to handling of errors in the :class:`numcodecs.blosc.Blosc` and
+:class:`numcodecs.lz4.LZ4` codecs when the maximum allowed size of an input
+buffer is exceeded. By :user:`Jerome Kelleher <jeromekelleher>`, :issue:`80`,
+:issue:`81`.


.. _release_0.5.5:
1,190 changes: 687 additions & 503 deletions numcodecs/blosc.c

Large diffs are not rendered by default.

11 changes: 8 additions & 3 deletions numcodecs/blosc.pyx
@@ -16,7 +16,7 @@ from cpython.bytes cimport PyBytes_FromStringAndSize, PyBytes_AS_STRING

from .compat_ext cimport Buffer
from .compat_ext import Buffer
-from .compat import PY2, text_type
+from .compat import PY2, text_type, ensure_contiguous_ndarray
from .abc import Codec


@@ -248,7 +248,8 @@ def compress(source, char* cname, int clevel, int shuffle=SHUFFLE,
char *source_ptr
char *dest_ptr
Buffer source_buffer
-size_t nbytes, cbytes, itemsize
+size_t nbytes, itemsize
+int cbytes
bytes dest

# check valid cname early
@@ -365,7 +366,8 @@ def decompress(source, dest=None):
dest_ptr = PyBytes_AS_STRING(dest)
dest_nbytes = nbytes
else:
-dest_buffer = Buffer(dest, PyBUF_ANY_CONTIGUOUS | PyBUF_WRITEABLE)
+arr = ensure_contiguous_ndarray(dest)
+dest_buffer = Buffer(arr, PyBUF_ANY_CONTIGUOUS | PyBUF_WRITEABLE)
dest_ptr = dest_buffer.ptr
dest_nbytes = dest_buffer.nbytes

@@ -472,6 +474,7 @@ class Blosc(Codec):
SHUFFLE = SHUFFLE
BITSHUFFLE = BITSHUFFLE
AUTOSHUFFLE = AUTOSHUFFLE
+max_buffer_size = 2**31 - 1

def __init__(self, cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=AUTOBLOCKS):
self.cname = cname
@@ -484,9 +487,11 @@ class Blosc(Codec):
self.blocksize = blocksize

def encode(self, buf):
+buf = ensure_contiguous_ndarray(buf, self.max_buffer_size)
return compress(buf, self._cname_bytes, self.clevel, self.shuffle, self.blocksize)

def decode(self, buf, out=None):
+buf = ensure_contiguous_ndarray(buf, self.max_buffer_size)
return decompress(buf, out)

def __repr__(self):
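
Note on the changes above: the diff does not state why `cbytes` becomes an `int` or why the limit is `2**31 - 1`. A plausible reading, offered here as an assumption rather than the author's stated rationale, is that the compressor reports its result through a signed 32-bit integer, with negative values signalling failure. The sketch below uses only `ctypes` to show the arithmetic: `2**31 - 1` is the largest value such an integer can hold, and a negative error code stored in an unsigned type turns into a huge positive number that looks like a valid size. The helper names are hypothetical, and `c_uint64` stands in for a 64-bit `size_t`.

```python
import ctypes

INT32_MAX = 2**31 - 1  # same value as Blosc.max_buffer_size in the diff


def as_int32(value):
    """Reinterpret a Python int as a signed 32-bit C integer (no overflow check)."""
    return ctypes.c_int32(value).value


def as_uint64(value):
    """Reinterpret a Python int as an unsigned 64-bit integer (stand-in for size_t)."""
    return ctypes.c_uint64(value).value


largest_ok = as_int32(INT32_MAX)    # fits unchanged
wrapped = as_int32(INT32_MAX + 1)   # one byte more wraps to a negative number
error_as_signed = as_int32(-1)      # a negative error code stays negative
error_as_unsigned = as_uint64(-1)   # the same code read as unsigned looks like a size

print(largest_ok, wrapped, error_as_signed, error_as_unsigned)
```

Under that assumption, a check such as `cbytes <= 0` can only detect a failure when `cbytes` is declared as a signed type.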
13 changes: 11 additions & 2 deletions numcodecs/compat.py
@@ -90,15 +90,20 @@ def ensure_ndarray(buf):
return arr


-def ensure_contiguous_ndarray(buf):
+def ensure_contiguous_ndarray(buf, max_buffer_size=None):
"""Convenience function to coerce `buf` to a numpy array, if it is not already a
numpy array. Also ensures that the returned value exports fully contiguous memory,
-and supports the new-style buffer interface.
+and supports the new-style buffer interface. If the optional max_buffer_size is
+provided, raise a ValueError if the number of bytes consumed by the returned
+array exceeds this value.

Parameters
----------
buf : array-like or bytes-like
A numpy array or any object exporting a buffer interface.
+max_buffer_size : int
+If specified, the largest allowable value of arr.nbytes, where arr
+is the returned array.

Returns
-------
@@ -132,6 +137,10 @@ def ensure_contiguous_ndarray(buf):
else:
raise ValueError('an array with contiguous memory is required')

+if max_buffer_size is not None and arr.nbytes > max_buffer_size:
+msg = "Codec does not support buffers of > {} bytes".format(max_buffer_size)
+raise ValueError(msg)

return arr
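
To illustrate the behaviour the new `max_buffer_size` parameter adds, here is a minimal standard-library sketch. It is not the numcodecs implementation: the function name `ensure_contiguous_buffer` is hypothetical, and it swaps the numpy array for a plain `memoryview` so the example runs without numpy. The size comparison and the error message mirror the lines added in the diff.

```python
def ensure_contiguous_buffer(buf, max_buffer_size=None):
    """Hypothetical stand-in for ensure_contiguous_ndarray, built on memoryview.

    Returns a contiguous view of `buf`, raising ValueError if the view is not
    contiguous or if it holds more than `max_buffer_size` bytes.
    """
    view = memoryview(buf)
    if not view.contiguous:
        raise ValueError('an array with contiguous memory is required')
    # Same check as in the diff: the limit is inclusive, only larger buffers fail.
    if max_buffer_size is not None and view.nbytes > max_buffer_size:
        msg = "Codec does not support buffers of > {} bytes".format(max_buffer_size)
        raise ValueError(msg)
    return view


# A buffer exactly at the limit is accepted.
small = ensure_contiguous_buffer(b"abcd", max_buffer_size=4)

# One byte over the limit is rejected before any compression would start.
try:
    ensure_contiguous_buffer(b"abcde", max_buffer_size=4)
except ValueError as exc:
    error_message = str(exc)
```

Running the check up front, in `encode` and `decode`, means an oversized input fails with a clear `ValueError` instead of reaching the compressor.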

