[EXIR] Update RemoveCloneOpsTransform to be dim order aware #12976

Draft: keyprocedure wants to merge 3 commits into main

@keyprocedure keyprocedure commented Jul 29, 2025

Summary

This is PR 3 of 3 implementing a dim order aware clone op.

This PR updates the clone removal pass to retain layout-changing _clone_dim_order ops and remove no-op clones, so that memory layout is preserved through export.
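
The retain-or-remove decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CloneNode`, `is_noop_clone`, and `remove_noop_clones` are hypothetical names, and the sketch assumes that a missing `dim_order` argument means the input layout is kept.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class CloneNode:
    """Hypothetical stand-in for a graph node that calls _clone_dim_order."""

    input_dim_order: Tuple[int, ...]  # dim order of the tensor being cloned
    requested_dim_order: Optional[Sequence[int]]  # the op's dim_order argument, if any


def is_noop_clone(node: CloneNode) -> bool:
    """Return True when the clone leaves the memory layout unchanged."""
    if node.requested_dim_order is None:
        # Assumption: no explicit dim_order means the input layout is preserved.
        return True
    return tuple(node.requested_dim_order) == node.input_dim_order


def remove_noop_clones(nodes: List[CloneNode]) -> List[CloneNode]:
    """Keep only the clones that change layout; drop the no-ops."""
    return [n for n in nodes if not is_noop_clone(n)]
```

For example, a clone from dim order `(0, 1, 2, 3)` to `[0, 2, 3, 1]` (contiguous to channels_last) would be kept, while a clone that requests `[0, 1, 2, 3]` on an already contiguous input would be dropped.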

Related PRs:

  • PR 1: #12974 - Add _clone_dim_order portable kernel
  • PR 2: #12971 - Register _clone_dim_order op and map aten.clone

Fixes #12645

Test plan

Added tests to verify:

  • Clones that change layout are preserved
  • Clones with unchanged layout are removed

All tests pass via:
python -m unittest exir.tests.test_memory_format_ops_pass

pytorch-bot bot commented Jul 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12976

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ffd1549 with merge base d1c87e4:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Jul 29, 2025
@keyprocedure
Contributor Author

@pytorchbot label "release notes: none"

@pytorch-bot pytorch-bot bot added the release notes: none label Jul 29, 2025
Gasoonjia added a commit that referenced this pull request Aug 11, 2025
### Summary
This is PR 1 of 3 implementing a dim order aware clone op. 

Currently, clone ops are removed during export as no-ops, causing memory
layout (dim order) changes to be lost. This can cause backend failures,
incorrect outputs when ops expect specific layouts, and performance
degradation. This set of PRs introduces a dim order aware clone op,
`_clone_dim_order`, which preserves memory layout changes by explicitly
storing dim order information. This is implemented by replacing standard
clone ops with this variant during export and updating the clone removal
transform to preserve clones that change layout.
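
To make "dim order" concrete, here is a small pure-Python sketch of how a dim order can be read off a tensor's strides. The helper names are hypothetical and are not ExecuTorch APIs; the sketch assumes a dense 4-D tensor and ignores the ambiguity that size-1 dimensions can introduce.

```python
def contiguous_strides(shape):
    """Strides (in elements) for a row-major, contiguous tensor."""
    strides = []
    running = 1
    for size in reversed(shape):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def channels_last_strides(shape):
    """Strides for an NCHW-shaped tensor stored in NHWC (channels_last) order."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def dim_order_from_strides(strides):
    """List dimensions from outermost (largest stride) to innermost."""
    # sorted() is stable, so dimensions with equal strides keep their order.
    return tuple(sorted(range(len(strides)), key=lambda d: -strides[d]))


shape = (2, 3, 4, 5)
print(dim_order_from_strides(contiguous_strides(shape)))     # (0, 1, 2, 3)
print(dim_order_from_strides(channels_last_strides(shape)))  # (0, 2, 3, 1)
```

Dropping a clone that moves a tensor from the first layout to the second discards exactly this difference, which is why such a clone cannot be treated as a no-op.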

This PR adds the portable CPU kernel for the `_clone_dim_order` op,
implementing a clone variant that preserves dim order at runtime. The
portable kernel validates dtype and layout compatibility, resizes the
output tensor if needed, and performs an element-wise clone of the
tensors.
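
As a rough illustration of what an element-wise clone between layouts involves, the sketch below copies a flat buffer from one dim order to another in pure Python. It is not the portable kernel itself: the function names are hypothetical, buffers are plain lists, and dtype checks and output resizing are left out.

```python
from itertools import product


def strides_for_dim_order(shape, dim_order):
    """Strides (in elements) for a dense tensor laid out in the given dim order."""
    strides = [0] * len(shape)
    running = 1
    for d in reversed(dim_order):  # innermost dimension first
        strides[d] = running
        running *= shape[d]
    return strides


def clone_dim_order(src, shape, src_dim_order, dst_dim_order):
    """Copy a flat buffer into a new flat buffer that uses dst_dim_order."""
    numel = 1
    for size in shape:
        numel *= size
    if len(src) != numel:
        raise ValueError("buffer size does not match shape")
    for order in (src_dim_order, dst_dim_order):
        if sorted(order) != list(range(len(shape))):
            raise ValueError("dim order must be a permutation of the dimensions")

    src_strides = strides_for_dim_order(shape, src_dim_order)
    dst_strides = strides_for_dim_order(shape, dst_dim_order)
    dst = [None] * numel
    # Visit every logical index once and move the element to its new offset.
    for index in product(*(range(size) for size in shape)):
        src_off = sum(i * s for i, s in zip(index, src_strides))
        dst_off = sum(i * s for i, s in zip(index, dst_strides))
        dst[dst_off] = src[src_off]
    return dst
```

Cloning a contiguous `(1, 2, 2, 2)` buffer holding `0..7` into dim order `(0, 2, 3, 1)` interleaves the two channels, and cloning the result back to `(0, 1, 2, 3)` restores the original buffer.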

Note: A future PR will add the ATen kernel for `_clone_dim_order`.

Related PRs:
- PR 2: [#12971](#12971) -
Register `_clone_dim_order` op and map `aten.clone`
- PR 3: [#12976](#12976) -
Update RemoveCloneOpsTransform to be dim_order aware

Fixes #12645 

### Test plan
Added kernel runtime tests to verify:
- Tensors of all real dtypes are cloned correctly.
- Failure when input and output tensor shapes mismatch.
- Failure with unsupported memory formats.
- Failure when `non_blocking=true` since the portable kernel only
supports blocking data transfer.
- Dynamic shape outputs are cloned with correct values.
- Layout conversions are cloned correctly for `contiguous` to
`channels_last`, `channels_last` to `contiguous`, and `channels_last` is
preserved.

All runtime tests pass via:
`build-ninja/kernels/test/portable_kernels_test`

---------

Co-authored-by: Gasoonjia <[email protected]>
if node.op != "call_function":
    continue

# Identify clone_dim_order ops with unchanged memory layout.
Contributor

If we are supporting aten.clone elimination through this pass, then we should similarly check the memory_format arg.

Contributor Author

Great point! I added the check for aten.clone and updated the tests. I'll refactor/simplify the test cases if needed once we land the AOT PR since it includes its own tests.
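
A memory_format check for aten.clone along the lines discussed in this thread might look like the sketch below. This is an assumption about the shape of the check, not the code added in the PR: `MemoryFormat` is a hypothetical stand-in for `torch.memory_format`, and the input's current format is passed in directly rather than derived from a graph node.

```python
from enum import Enum
from typing import Optional


class MemoryFormat(Enum):
    """Hypothetical stand-in for the torch.memory_format values used here."""

    PRESERVE = "preserve_format"
    CONTIGUOUS = "contiguous_format"
    CHANNELS_LAST = "channels_last"


def aten_clone_is_noop(
    requested: Optional[MemoryFormat], input_format: MemoryFormat
) -> bool:
    """Return True when an aten.clone call would not change the layout."""
    # aten.clone treats a missing memory_format as preserve_format.
    if requested is None or requested is MemoryFormat.PRESERVE:
        return True
    # An explicit format is a no-op only if the input already uses it.
    return requested is input_format
```

With this check, `aten.clone(x)` and `aten.clone(x, memory_format=torch.preserve_format)` would be removable, while a clone requesting channels_last on a contiguous input would be kept.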

Development

Successfully merging this pull request may close these issues.

Add dim order variant clone operator
2 participants