naoyam
Collaborator

naoyam commented Mar 9, 2023

Fixes #2560

Previously, the rfactor ID sets gathered by the CA map did not include reduction rfactor IDs. However, in the repro of #2560, traversing through a reduction rfactor ID is necessary to index partially inlined broadcast tensors. I don't see any reason not to include reduction rfactor IDs in the CA map, nor any potential side effects. Reduction rfactor IDs have been relatively trivial compared to view rfactor IDs because the IDs are reduced away, but they still need to be processed by the ID traversal logic for indexing.
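As a rough illustration of the change in what gets gathered, here is a toy sketch. It is not the code in this PR: `IterDomain`, `gather_rfactor_ids`, and the `include_reduction` flag are hypothetical names made up for the example, and the real CA map works on the compiler's own IR nodes rather than on these simplified records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IterDomain:
    """Hypothetical stand-in for an iteration domain (ID)."""
    name: str
    is_rfactor: bool = False
    is_reduction: bool = False


def gather_rfactor_ids(ids, include_reduction):
    """Collect rfactor IDs, optionally keeping the reduction ones.

    include_reduction=False models the old behavior described above,
    include_reduction=True models the behavior after this change.
    """
    return {
        d for d in ids
        if d.is_rfactor and (include_reduction or not d.is_reduction)
    }


ids = [
    IterDomain("i0"),                                          # plain ID
    IterDomain("i1_view", is_rfactor=True),                    # view rfactor ID
    IterDomain("r2_rf", is_rfactor=True, is_reduction=True),   # reduction rfactor ID
]

before = gather_rfactor_ids(ids, include_reduction=False)
after = gather_rfactor_ids(ids, include_reduction=True)

print(sorted(d.name for d in before))  # ['i1_view']
print(sorted(d.name for d in after))   # ['i1_view', 'r2_rf']
```

In this toy model, a traversal that can only step through IDs in the gathered set would never reach `r2_rf` under the old behavior, which is roughly the situation the repro runs into.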

naoyam changed the title from "[WIP] Fix indexing failure with non-view rfactor" to "[DRAFT] Fix indexing failure with non-view rfactor" Mar 9, 2023
naoyam marked this pull request as draft March 9, 2023 11:53
@csarofeen
Owner

Odd that this wasn't caught before. I'm not sure why this is happening or why this fixes it, but I'm comfortable with the change.

csarofeen self-requested a review March 9, 2023 14:38
@naoyam
Collaborator Author

naoyam commented Mar 11, 2023

Bandwidth and speedup curves:

[image]

Speedup histogram:

[image]

Overall, the results look mostly within random noise. Some benchmarks show a 10% degradation, but that is most likely because they are very short-running kernels.

naoyam marked this pull request as ready for review March 11, 2023 00:41
naoyam changed the title from "[DRAFT] Fix indexing failure with non-view rfactor" to "Fix indexing failure with non-view rfactor" Mar 11, 2023
naoyam merged commit 9c62d94 into devel Mar 11, 2023
Development

Successfully merging this pull request may close these issues.

Indexing failure