[OpenCL] Fix bugs in BatchedReduceAddInst implementation #3118
Description
This commit fixes two bugs in the OpenCL implementation of `BatchedReduceAddInst` and adds a few comments for clarity.

The first is a segmentation fault caused by incorporating feedback on #2958. A suggestion was made to make the loop variable `i` in the loop that computes `batchSliceSizes` count down instead of count up, but this suggestion was taken without changing the type (which was `size_t`, an unsigned type), so the loop never terminates and eventually leads to a segmentation fault.
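As a minimal sketch of the first bug (not the actual Glow code), the snippet below shows why a countdown loop over a `size_t` index cannot use `i >= 0` as its condition, along with one common safe idiom. The helper name `reverseIndices` is hypothetical and exists only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Glow): returns the indices n-1, ..., 0 in
// the order a countdown loop visits them.
std::vector<size_t> reverseIndices(size_t n) {
  std::vector<size_t> visited;
  // Buggy form: for (size_t i = n - 1; i >= 0; --i) { ... }
  // `i >= 0` is always true for an unsigned type. Decrementing 0 wraps around
  // to SIZE_MAX, so the loop never terminates and soon indexes out of bounds.
  //
  // Safe form: test and decrement in the condition. The body sees
  // n-1, ..., 0, and the loop exits once i is 0 before the decrement.
  for (size_t i = n; i-- > 0;) {
    assert(i < n); // the index stays in range on every iteration
    visited.push_back(i);
  }
  return visited;
}
```

Switching the index to a signed type is the other usual fix. The idiom above keeps `size_t` and also handles `n == 0` without wrapping before the first comparison.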
The second bug is an incorrect computation of `destSliceSizes`. Instead of multiplying the slice size at a dimension with the number of elements in that same dimension, the code was multiplying the former with the number of elements in the adjacent dimension. This was surfaced by the unit test added in #2958 for `axis = 2`.

Test Plan
`ninja check` with OpenCL enabled, DEBUG mode
`ninja check` with OpenCL enabled, RELEASE mode
`ninja check` with OpenCL enabled, ASAN+UBSAN mode
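To illustrate the `destSliceSizes` fix described above, here is a small sketch of the two computations. It is not the code from this PR: the function names are hypothetical, and the exact indexing of the buggy version is an assumption based on the description ("the adjacent dimension").

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: sliceSizes[i] is the number of elements spanned by one
// step along dimension i, with the innermost dimension having slice size 1.
std::vector<size_t> correctSliceSizes(const std::vector<size_t> &dims) {
  std::vector<size_t> sliceSizes(dims.size(), 1);
  for (size_t i = dims.size(); i-- > 1;) {
    // Slice size at dimension i times the element count of that SAME dimension.
    sliceSizes[i - 1] = sliceSizes[i] * dims[i];
  }
  return sliceSizes;
}

// Assumed shape of the bug: the slice size at dimension i is multiplied by
// the element count of the ADJACENT dimension (i - 1) instead.
std::vector<size_t> buggySliceSizes(const std::vector<size_t> &dims) {
  std::vector<size_t> sliceSizes(dims.size(), 1);
  for (size_t i = dims.size(); i-- > 1;) {
    sliceSizes[i - 1] = sliceSizes[i] * dims[i - 1];
  }
  return sliceSizes;
}
```

For dimensions `{2, 3, 4}` the two versions disagree, but for a shape whose dimensions are all equal they produce the same result, which may be why a mistake of this kind can go unnoticed until a test uses differing dimension sizes.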