
Conversation

metascroy (Contributor)

If a partitioner requests that to_edge_transform_and_lower keep mutable or aliasing ops (e.g., transpose, view, permute), lowering with ExecuTorch fails because those ops cannot be functionalized once they are wrapped as custom ops in the EDGE_DO_NOT_DECOMP namespace.

This PR filters out unsupported ops that backends request for preservation.
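
For context, whether an op is mutable or aliasing is visible in its schema. A minimal sketch of that check (illustrative only; it uses public torch schema attributes rather than ExecuTorch's internal helpers):

import torch

def is_mutable_or_aliasing(op: torch._ops.OpOverload) -> bool:
    schema = op._schema
    # In-place ops such as aten.add_.Tensor mutate their input.
    if schema.is_mutable:
        return True
    # View ops such as aten.transpose.int return a tensor that aliases
    # the input, e.g. "transpose(Tensor(a) self, ...) -> Tensor(a)".
    return any(ret.alias_info is not None for ret in schema.returns)

print(is_mutable_or_aliasing(torch.ops.aten.transpose.int))   # True (aliasing)
print(is_mutable_or_aliasing(torch.ops.aten.add_.Tensor))     # True (mutable)
print(is_mutable_or_aliasing(torch.ops.aten.linear.default))  # False (functional)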

pytorch-bot bot commented Feb 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8776

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 5 Pending

As of commit 548d770 with merge base 38384a2:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Feb 27, 2025

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@metascroy added the partner: apple label on Feb 27, 2025
Comment on lines 972 to 1009
def _remove_invalid_ops_for_not_decompose(
    ops_to_not_decompose: List[torch._ops.OpOverload],
) -> List[torch._ops.OpOverload]:
    def keep(op):
        # Mutable and aliasing ops cannot be functionalized once they are
        # wrapped as custom ops in the EDGE_DO_NOT_DECOMP namespace, so
        # drop them from the preservation list.
        schema = op._schema
        native_schema = _pybind_schema_to_native_schema(schema)
        if native_schema.is_mutable:
            return False
        # A non-aliasing op reports [None] here; anything else aliases
        # its input.
        if native_schema.aliased_return_names() != [None]:
            return False
        return True

    return list(filter(keep, ops_to_not_decompose))
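
For example (a hypothetical request list; only the functional op survives the filter):

requested = [
    torch.ops.aten.transpose.int,   # aliasing view op: dropped
    torch.ops.aten.add_.Tensor,     # mutable in-place op: dropped
    torch.ops.aten.linear.default,  # functional op: kept
]
assert _remove_invalid_ops_for_not_decompose(requested) == [
    torch.ops.aten.linear.default
]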
Contributor

Hmm, this is a bit too sneaky in my mind. If the user wants to debug why an op is decomposed vs. not decomposed, it might take them a while to find this filter.

Do you think it's reasonable to raise an error at the partitioner level when it tries to keep an aliasing op, instead of silently filtering them out?

Contributor Author

Do you think it's reasonable to raise error at the partitioner level

The user does not specify which ops should or should not be decomposed; the partitioner automatically returns the list based on what the backend supports. If we did this filtering in the partitioner, each partitioner would have to repeat the same filtering logic, because it is really a limitation of ExecuTorch. In fact, I originally added this logic to the CoreML partitioner, but since it applies to all partitioners, I moved it to _program.py instead.

We could add logging for this filter though?

I think the ideal solution for to_edge_transform_and_lower is:

  1. First functionalize the graph and remove aliased ops
  2. Ask partitioners for ops to preserve
  3. Run decompositions

Today, step 1 happens after step 2 and that is the issue. Note that doing the functionalization and removing aliased ops before asking partitioners what ops to preserve has the same effect as the filter above.
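
Roughly, in pseudocode (the helper names below are hypothetical and only sketch the proposed ordering, not an existing API):

def to_edge_transform_and_lower_proposed(exported_program, partitioners):
    # Step 1 (hypothetical helper): functionalize and remove aliasing ops
    # before anyone can ask to preserve them.
    exported_program = _functionalize_and_remove_aliasing(exported_program)

    # Step 2: collect preservation requests; mutable/aliasing ops are
    # already gone, so nothing invalid can be requested.
    ops_to_preserve = []
    for partitioner in partitioners:
        ops_to_preserve.extend(partitioner.ops_to_not_decompose())

    # Step 3 (hypothetical helper): decompose everything except the
    # preserved ops.
    return _run_decompositions_preserving(exported_program, ops_to_preserve)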

Contributor

@metascroy I remember vaguely that there was a way to functionalize the graph and remove aliased ops without calling run_decompositions. @tugsbayasgalan @angelayi is this possible?

Contributor

I think the ideal solution for to_edge_transform_and_lower is:

  1. First functionalize the graph and remove aliased ops
  2. Ask partitioners for ops to preserve
  3. Run decompositions

Today, step 1 happens after step 2 and that is the issue. Note that doing the functionalization and removing aliased ops before asking partitioners what ops to preserve has the same effect as the filter above.

100% agree, and this should be the long-term fix. Can we create an issue to track that? We can go with the current approach to unblock ourselves.

Contributor Author

Created tracking issue here: #8781

Contributor Author

@larryliu0820 are there any more concerns with this short-term fix?

Contributor

No, thanks for creating the issue.

@facebook-github-bot (Contributor)

@metascroy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Feb 27, 2025
…preservation (#8776)

Summary:
If a partitioner requests that to_edge_transform_and_lower keep mutable or aliasing ops (e.g., transpose, view, permute), lowering with ExecuTorch fails because those ops cannot be functionalized once they are wrapped as custom ops in the EDGE_DO_NOT_DECOMP namespace.

This PR filters out unsupported ops that backends request for preservation.


Differential Revision: D70333876

Pulled By: metascroy
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D70333876

@facebook-github-bot (Contributor)

@metascroy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

if op in [
    # Hits infinite recursion error when op is in
    # EDGE_DO_NOT_DECOMP namespace
    torch.ops.aten._to_copy.default,
Contributor

Perhaps I didn't understand the full context, but why can't a partitioner just not specify these ops for preservation? I don't think they'll be decomposed anyway?

@metascroy (Contributor Author) commented Feb 27, 2025

Because if a partitioner requests these for preservation, they are no longer aten ops (they become custom ops in the EDGE_DO_NOT_DECOMP namespace), and that runs into issues during export because PyTorch places more restrictions on custom ops than on aten ops.
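
As a rough illustration (the namespace and schema below are made up, not what to_edge_transform_and_lower actually generates): a custom op can be defined with an aliasing schema, but export's functionalization pass cannot handle the alias the way it handles native aten view ops.

import torch

# Hypothetical custom namespace standing in for EDGE_DO_NOT_DECOMP.
lib = torch.library.Library("MY_DO_NOT_DECOMP", "DEF")
# The Tensor(a) annotations declare that the return aliases the input;
# defining this succeeds, but a graph containing it fails to
# functionalize during export, unlike the native aten equivalent.
lib.define("my_transpose(Tensor(a) self, int dim0, int dim1) -> Tensor(a)")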

@mcr229 (Contributor) left a comment

seems ok for now, gonna accept, but do you mind linking the issue # to the _remove_invalid_ops_for_not_decompose function?

@facebook-github-bot (Contributor)

@metascroy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@metascroy (Contributor Author)

seems ok for now, gonna accept, but do you mind linking the issue # to the _remove_invalid_ops_for_not_decompose function?

Added

@metascroy merged commit 781b082 into main on Feb 28, 2025 (55 of 56 checks passed)
@metascroy deleted the coreml-part-up branch on February 28, 2025 20:39