Implement dpnp.linalg.lu_factor batch inputs #2565
base: impl_lu_factor
View rendered docs @ https://intelpython.github.io/dpnp/pull/2565/index.html
Array API standard conformance tests for dpnp=0.19.0dev3=py313h509198e_35 ran successfully.
# Copy each 2D slice to a new array because getrf will destroy
# the input matrix
a_vecs[i] = dpnp.empty_like(a[i], order="F", dtype=res_type)
ht_ev, copy_ev = ti._copy_usm_ndarray_into_usm_ndarray(
    src=a_usm_arr[i],
    dst=a_vecs[i].get_array(),
    sycl_queue=a_sycl_queue,
    depends=dep_evs,
)
_manager.add_event_pair(ht_ev, copy_ev)

ipiv_vecs[i] = dpnp.empty(
    (k,),
    dtype=dpnp.int64,
    order="C",
    usm_type=a_usm_type,
    sycl_queue=a_sycl_queue,
)
Is there any reason not to perform a single copy and a single allocation, then just make views along the batch axis?
That is, copy all of src into one new array, allocate all of the ipiv_vecs arrays as one allocation, then split both along the batch axis in the loop.
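To make the suggestion concrete, here is a minimal sketch of the single-allocation idea. It uses NumPy as a stand-in for dpnp (an assumption made so the snippet runs without a SYCL device), and the helper name `alloc_batch_fortran_slices` is hypothetical, not part of this PR. The one subtlety is that getrf wants each 2D slice in column-major order, so the buffer is allocated with its last two axes swapped and then viewed through a transpose.

```python
import numpy as np


def alloc_batch_fortran_slices(a):
    """Copy a (batch, m, n) array into ONE buffer whose 2D slices are F-ordered.

    Hypothetical helper for illustration only; NumPy stands in for dpnp.
    """
    batch, m, n = a.shape
    # One allocation for the whole batch, with the matrix axes swapped.
    buf = np.empty((batch, n, m), dtype=a.dtype)
    # Transposed view: shape (batch, m, n), each a_f[i] is F-contiguous.
    a_f = buf.transpose(0, 2, 1)
    # One copy for the whole batch instead of one copy per slice.
    a_f[...] = a
    return a_f


batch_size, m, n = 2, 3, 4
k = min(m, n)

a = np.arange(batch_size * m * n, dtype=np.float64).reshape(batch_size, m, n)
a_f = alloc_batch_fortran_slices(a)

# One allocation for all pivot vectors; rows are C-contiguous.
ipiv = np.empty((batch_size, k), dtype=np.int64)

# Inside the per-batch loop, only views are taken; nothing is allocated.
a_vecs = [a_f[i] for i in range(batch_size)]
ipiv_vecs = [ipiv[i] for i in range(batch_size)]
```

Whether the same layout trick carries over unchanged to dpnp arrays and the `_getrf` binding is not confirmed here; the sketch only shows that per-slice views of a single buffer can still satisfy the column-major requirement.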
dep_evs = _manager.submitted_events

# Process each batch
for i in range(batch_size):
It may not make sense in this PR, but it might be worth looking at moving this entire loop into C++: a custom pybind11 function that takes lists of arrays and loops over them, calling _getrf on each.
Just wanted to tag, it could fix: #2498
This PR suggests extending dpnp.linalg.lu_factor() (#2557) to batch arrays.
In addition, this PR includes: