-
Notifications
You must be signed in to change notification settings - Fork 97
WIP: Allow CuPy #212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Allow CuPy #212
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,7 @@ | |
import array | ||
|
||
|
||
import cupy as cp | ||
import numpy as np | ||
|
||
|
||
|
@@ -48,8 +49,8 @@ def ensure_ndarray(buf): | |
|
||
""" | ||
|
||
if isinstance(buf, np.ndarray): | ||
# already a numpy array | ||
if isinstance(buf, (np.ndarray, cp.ndarray)): | ||
# already an array | ||
arr = buf | ||
|
||
elif isinstance(buf, array.array) and buf.typecode in 'cu': | ||
|
@@ -151,6 +152,12 @@ def ensure_bytes(buf): | |
if arr.dtype == object: | ||
raise TypeError('object arrays are not supported') | ||
|
||
# Force CuPy arrays to NumPy arrays | ||
# because they don't have a `tobytes` method (yet) | ||
# xref: https://github.com/cupy/cupy/pull/2617 | ||
if isinstance(arr, cp.ndarray): | ||
arr = arr.get() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As noted, this is a workaround. The real solution here is PR ( cupy/cupy#2617 ), which should land in CuPy 7.0.0. IOW this can probably be ignored for the sake of this discussion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ooh, +1 to |
||
|
||
# create bytes | ||
buf = arr.tobytes(order='A') | ||
|
||
|
@@ -160,6 +167,12 @@ def ensure_bytes(buf): | |
def ensure_text(s, encoding='utf-8'): | ||
if not isinstance(s, text_type): | ||
s = ensure_contiguous_ndarray(s) | ||
|
||
# Force CuPy arrays to NumPy arrays | ||
# as they support the buffer protocol | ||
if isinstance(s, cp.ndarray): | ||
s = s.get() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One way to solve this might be to call On the flip side that does incur a copy. Though this may be pretty inexpensive in the grand scheme of things. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think in Dask we do a check like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just for context, this is what the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's also possible we can get away with dropping this part entirely. This function is mainly used in conjunction with reading things (like JSON). So ensuring it can work with CuPy is probably unnecessary. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes not sure you need to worry about cupy arrays here as this is mainly for JSON reading IIRC. |
||
|
||
s = codecs.decode(s, encoding) | ||
return s | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is easy enough to do, but it would be nice if we didn't need to learn about CuPy here.
We will probably need to find another solution for Numcodecs in the near term, but wanted to mention some related things. Namely there is some discussion about generally checking array types in issue ( numpy/numpy#14891 ).
Also mentioned by Matt in the aforementioned issue, Dask has
is_arraylike
, which presents a way of solving this problem. I'm not sure it totally helps us here as we really need to check that there is a data buffer we can grab and use, but it is at least a good starting point.Another thought I had is we could registered
ndarray
types and their conversion mechanisms with Numcodecs. This could allow users to add their own array types (like CuPy), but not complicate things too much here.Other thoughts on this would generally be welcome. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps one could check for the presence of a
__array__
or__cuda_array_interface__
protocol?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. Thanks! Will give that a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, if goal here is to ultimately enable codecs with GPU implementations then using
__cuda_array_interface__
sounds good. E.g., when using numba cuda functions you get back an instance of some other class (DeviceArray or something like that IIRC) but that exports__cuda_array_interface__
too.