[EXIR] Update RemoveCloneOpsTransform to be dim order aware #12976
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12976. Note: links to docs will display an error until the docs builds have completed. ✅ No failures as of commit ffd1549 with merge base d1c87e4. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
@pytorchbot label "release notes: none"
### Summary

This is PR 1 of 3 implementing a dim order aware clone op.

Currently, clone ops are removed during export as no-ops, causing memory layout (dim order) changes to be lost. This can cause backend failures, incorrect outputs when ops expect specific layouts, and performance degradation.

This set of PRs introduces a dim order aware clone op, `_clone_dim_order`, which preserves memory layout changes by explicitly storing dim order information. This is implemented by replacing standard clone ops with this variant during export and updating the clone removal transform to preserve clones that change layout.

This PR adds the portable CPU kernel for the `_clone_dim_order` op, implementing a clone variant that preserves dim order at runtime. The portable kernel validates dtype and layout compatibility, resizes the output tensor if needed, and performs an element-wise clone of the tensors.

Note: A future PR will add the ATen kernel for `_clone_dim_order`.

Related PRs:
- PR 2: [#12971](#12971) - Register `_clone_dim_order` op and map `aten.clone`
- PR 3: [#12976](#12976) - Update RemoveCloneOpsTransform to be dim_order aware

Fixes #12645

### Test plan

Added kernel runtime tests to verify:
- Tensors of all real dtypes are cloned correctly.
- Failure when input and output tensor shapes mismatch.
- Failure with unsupported memory formats.
- Failure when `non_blocking=true`, since the portable kernel only supports blocking data transfer.
- Dynamic shape outputs are cloned with correct values.
- Layout conversions are cloned correctly for `contiguous` to `channels_last`, `channels_last` to `contiguous`, and `channels_last` is preserved.

All runtime tests pass via: `build-ninja/kernels/test/portable_kernels_test`

Co-authored-by: Gasoonjia <[email protected]>
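As a minimal illustration of what "dim order" means here (pure Python, not the ExecuTorch API; the helper name is hypothetical): dim order lists a tensor's dimensions from largest stride to smallest, so a contiguous NCHW tensor has dim order `(0, 1, 2, 3)` and a channels-last one has `(0, 2, 3, 1)`. A clone that converts between these layouts is exactly the kind of op the transform must not delete.

```python
def dim_order_from_strides(sizes, strides):
    # Sort dimensions by descending stride, breaking ties by dimension index,
    # to recover the memory layout ordering.
    return tuple(sorted(range(len(sizes)), key=lambda d: (-strides[d], d)))

# Contiguous NCHW strides for shape (2, 3, 4, 5): (60, 20, 5, 1)
print(dim_order_from_strides((2, 3, 4, 5), (60, 20, 5, 1)))  # (0, 1, 2, 3)
# Channels-last strides for the same shape: (60, 1, 12, 3)
print(dim_order_from_strides((2, 3, 4, 5), (60, 1, 12, 3)))  # (0, 2, 3, 1)
```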
```python
if node.op != "call_function":
    continue

# Identify clone_dim_order ops with unchanged memory layout.
```
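The rule the snippet above is implementing can be sketched in isolation like this (a pure-Python mock with assumed names, not the actual FX pass): a clone node qualifies for removal only when it is a `call_function` clone whose output dim order matches its input's.

```python
from dataclasses import dataclass

@dataclass
class Node:
    op: str
    target: str
    dim_order: tuple

def is_removable_clone(node: Node, input_node: Node) -> bool:
    # Only call_function nodes are candidates.
    if node.op != "call_function":
        return False
    if node.target != "_clone_dim_order":
        return False
    # Keep layout-changing clones; only identity clones are no-ops.
    return node.dim_order == input_node.dim_order

src = Node("placeholder", "x", (0, 1, 2, 3))
noop = Node("call_function", "_clone_dim_order", (0, 1, 2, 3))
to_channels_last = Node("call_function", "_clone_dim_order", (0, 2, 3, 1))
print(is_removable_clone(noop, src))             # True
print(is_removable_clone(to_channels_last, src)) # False
```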
If we are supporting `aten.clone` elimination through this pass, then we should similarly check the `memory_format` arg.
Great point! I added the check for `aten.clone` and updated the tests. I'll refactor/simplify the test cases if needed once we land the AOT PR, since it includes its own tests.
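The suggested `memory_format` check could look roughly like this (a sketch with string stand-ins for the real `torch.memory_format` values, which are assumptions here): `aten.clone` is only a layout no-op when its `memory_format` argument keeps the input's layout.

```python
# Stand-in for torch.preserve_format (hypothetical; real code would compare
# torch.memory_format enum values).
PRESERVE_FORMAT = "preserve_format"

def clone_preserves_layout(memory_format, input_format):
    # No memory_format arg, or preserve_format, means the layout is unchanged.
    if memory_format is None or memory_format == PRESERVE_FORMAT:
        return True
    # Otherwise the clone is a no-op only if it requests the layout the
    # input already has.
    return memory_format == input_format

print(clone_preserves_layout(None, "contiguous"))              # True
print(clone_preserves_layout("channels_last", "contiguous"))   # False
```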
### Summary

This is PR 3 of 3 implementing a dim order aware clone op.

This PR updates the clone removal pass to retain layout-changing `_clone_dim_order` ops and remove no-op clones, ensuring memory layout is preserved through export.

Related PRs:
- PR 1: `_clone_dim_order` portable kernel
- PR 2: Register `_clone_dim_order` op and map `aten.clone`

Fixes #12645

### Test plan

Added tests to verify:

All tests pass via:
`python -m unittest exir.tests.test_memory_format_ops_pass`
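The tests in the plan above are run with `unittest`; a minimal sketch of the kind of check involved (with an assumed helper, not the actual ExecuTorch pass or test suite) verifies that after the pass runs, no-op clones are gone while layout-changing clones remain:

```python
import unittest

def run_remove_clone_pass(ops):
    # ops: list of (op_name, changes_layout) pairs. Drop clone ops that do
    # not change layout; keep everything else. Hypothetical stand-in for
    # RemoveCloneOpsTransform.
    return [(name, changes) for name, changes in ops
            if not (name == "_clone_dim_order" and not changes)]

class TestRemoveCloneOps(unittest.TestCase):
    def test_noop_clone_removed(self):
        ops = [("conv", False), ("_clone_dim_order", False)]
        self.assertEqual(run_remove_clone_pass(ops), [("conv", False)])

    def test_layout_changing_clone_kept(self):
        ops = [("_clone_dim_order", True)]
        self.assertEqual(run_remove_clone_pass(ops), ops)
```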