Fixes boolean indexing for strided masks #1370
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1370/index.html |
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_21 ran successfully. |
It was a mistake to use However I don't think removing simplification altogether is the right solution. We could use |
I've tested this out and it does also resolve the problem. Would it be worthwhile then to write an abbreviated version of |
Yes, I was thinking about this as well. It is a good idea. I'd suggest naming it |
- The cumulative sum was being calculated incorrectly: the offset from stride simplification was unused, and the result was incorrect for some cases with non-C-contiguous strides.
- To fix this, a new function ``compact_iteration_space`` and its complementary function ``compact_iteration`` have been implemented.
Compacting strides can reduce the dimensionality of the array, but it cannot turn an input that is not already C-contiguous into a C-contiguous one. Hence the branch checking whether the input became C-contiguous after compacting is effectively dead.
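To illustrate the point about compacting, here is a minimal pure-Python sketch. The helpers `compact_dims` and `visit_order` are hypothetical names made up for this example and are not the dpctl implementation of ``compact_iteration_space``; the sketch assumes at least one dimension and only merges adjacent axes, never reordering axes or flipping stride signs.

```python
def compact_dims(shape, strides):
    """Merge adjacent axes that are laid out back to back in memory.

    Hypothetical helper, not dpctl's implementation. Axes are never
    reordered and negative strides are never flipped, so the order in
    which elements are visited is unchanged.
    """
    out_shape, out_strides = [shape[0]], [strides[0]]
    for n, s in zip(shape[1:], strides[1:]):
        # Merge when the outer axis steps exactly over one full run
        # of the inner axis.
        if out_strides[-1] == n * s:
            out_shape[-1] *= n
            out_strides[-1] = s
        else:
            out_shape.append(n)
            out_strides.append(s)
    return out_shape, out_strides


def visit_order(shape, strides, offset=0):
    """Return buffer positions in row-major (logical) iteration order."""
    positions = [offset]
    for n, s in zip(shape, strides):
        positions = [p + i * s for p in positions for i in range(n)]
    return positions


# A C-contiguous 2x3 input collapses to one contiguous axis.
print(compact_dims([2, 3], [3, 1]))
# A strided 2x3 input loses a dimension but keeps its element stride of 2,
# so it is still not C-contiguous.
print(compact_dims([2, 3], [6, 2]))
```

In this sketch a compacted result can only be a single unit-stride axis if every original axis already satisfied the C-contiguity relation, which is why a post-compaction contiguity check would have nothing new to detect.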
7a30569 to 741dc7f
@oleksandr-pavlyk |
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_23 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_30 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_31 ran successfully. |
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞 |
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_26 ran successfully. |
This PR fixes some edge cases in boolean indexing that were not working as expected.
The cumulative sum used to get the count of set elements in the mask was misbehaving, especially for cases with negative strides (and most noticeably, the 1D case). This seems to have been caused by the fact that stride simplification can change the order in which elements are traversed, which produced incorrect results.
Additionally, the ``offset`` parameter was unused when the contiguous kernel was called after simplification. A new stride simplification method called ``compact_iteration_space`` was introduced to fix the problem.
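The following pure-Python sketch shows why traversal order matters for the cumulative sum. It is only an illustration: `running_count` is a hypothetical helper, and flipping a negative stride to a positive one is assumed here as one example of a simplification that changes the traversal order.

```python
from itertools import accumulate


def running_count(buffer, size, stride, offset):
    """Inclusive running count of set mask elements for one traversal."""
    return list(
        accumulate(int(buffer[offset + i * stride]) for i in range(size))
    )


buffer = [True, False, False, True, True]

# Logical order of a reversed 1D mask: start at the last element, stride -1.
logical = running_count(buffer, 5, -1, 4)

# Memory order: what a simplification that flips the stride to +1
# (and moves the offset to 0) would walk instead.
flipped = running_count(buffer, 5, 1, 0)

print(logical)  # [1, 2, 2, 2, 3]
print(flipped)  # [1, 1, 1, 2, 3]
```

Both traversals agree on the total number of set elements, but the intermediate running counts differ. Any consumer that relies on the per-element running count, rather than only the total, would therefore see different values once the order changes.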