
Conversation

metascroy (Contributor)

If a partitioner requests that to_edge_transform_and_lower keep mutable or aliasing ops (e.g., transpose, view, permute), lowering with ExecuTorch fails because those ops cannot be functionalized once they are wrapped as custom ops in the EDGE_DO_NOT_DECOMP namespace.

This PR filters out unsupported ops that backends request for preservation.
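
For context, whether an op is mutable or aliasing is visible in its schema. A minimal sketch of that check (illustrative only; it uses public torch schema attributes rather than ExecuTorch's internal helpers):

import torch

def is_mutable_or_aliasing(op: torch._ops.OpOverload) -> bool:
    schema = op._schema
    # In-place ops such as aten.add_.Tensor mutate their input.
    if schema.is_mutable:
        return True
    # View ops such as aten.transpose.int return a tensor that aliases
    # the input, e.g. "transpose(Tensor(a) self, ...) -> Tensor(a)".
    return any(ret.alias_info is not None for ret in schema.returns)

print(is_mutable_or_aliasing(torch.ops.aten.transpose.int))   # True (aliasing)
print(is_mutable_or_aliasing(torch.ops.aten.add_.Tensor))     # True (mutable)
print(is_mutable_or_aliasing(torch.ops.aten.linear.default))  # False (functional)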

pytorch-bot bot commented Feb 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8776

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 5 Pending

As of commit 548d770 with merge base 38384a2:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Feb 27, 2025

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@metascroy added the partner: apple label on Feb 27, 2025
Comment on lines 972 to 1009
def _remove_invalid_ops_for_not_decompose(
    ops_to_not_decompose: List[torch._ops.OpOverload],
) -> List[torch._ops.OpOverload]:
    def keep(op):
        # Mutable and aliasing ops cannot be functionalized once they are
        # wrapped as custom ops in the EDGE_DO_NOT_DECOMP namespace, so
        # drop them from the preservation list.
        schema = op._schema
        native_schema = _pybind_schema_to_native_schema(schema)
        if native_schema.is_mutable:
            return False
        # A non-aliasing op reports [None] here; anything else aliases
        # its input.
        if native_schema.aliased_return_names() != [None]:
            return False
        return True

    return list(filter(keep, ops_to_not_decompose))
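
For example (a hypothetical request list; only the functional op survives the filter):

requested = [
    torch.ops.aten.transpose.int,   # aliasing view op: dropped
    torch.ops.aten.add_.Tensor,     # mutable in-place op: dropped
    torch.ops.aten.linear.default,  # functional op: kept
]
assert _remove_invalid_ops_for_not_decompose(requested) == [
    torch.ops.aten.linear.default
]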
Contributor

Hmm, this is a bit too sneaky in my mind. If the user wants to debug why an op is decomposed vs. not decomposed, it might take them a while to find this filter.

Do you think it's reasonable to raise an error at the partitioner level when it tries to keep an aliasing op, instead of silently filtering them out?

Contributor Author

Do you think it's reasonable to raise error at the partitioner level

The user does not specify which ops should or should not be decomposed; the partitioner automatically returns the list based on what the backend supports. If we did this filtering in the partitioner, each partitioner would have to repeat the same filtering logic, because it is really a limitation of ExecuTorch. In fact, I originally added this logic to the CoreML partitioner, but since it applies to all partitioners, I moved it to _program.py instead.

We could add logging for this filter though?

I think the ideal solution for to_edge_transform_and_lower is:

  1. First functionalize the graph and remove aliased ops
  2. Ask partitioners for ops to preserve
  3. Run decompositions

Today, step 1 happens after step 2 and that is the issue. Note that doing the functionalization and removing aliased ops before asking partitioners what ops to preserve has the same effect as the filter above.
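
Roughly, in pseudocode (the helper names below are hypothetical and only sketch the proposed ordering, not an existing API):

def to_edge_transform_and_lower_proposed(exported_program, partitioners):
    # Step 1 (hypothetical helper): functionalize and remove aliasing ops
    # before anyone can ask to preserve them.
    exported_program = _functionalize_and_remove_aliasing(exported_program)

    # Step 2: collect preservation requests; mutable/aliasing ops are
    # already gone, so nothing invalid can be requested.
    ops_to_preserve = []
    for partitioner in partitioners:
        ops_to_preserve.extend(partitioner.ops_to_not_decompose())

    # Step 3 (hypothetical helper): decompose everything except the
    # preserved ops.
    return _run_decompositions_preserving(exported_program, ops_to_preserve)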

Contributor

@metascroy I remember vaguely that there was a way to functionalize the graph and remove aliased ops without calling run_decompositions. @tugsbayasgalan @angelayi is this possible?

Contributor

I think the ideal solution for to_edge_transform_and_lower is:

  1. First functionalize the graph and remove aliased ops
  2. Ask partitioners for ops to preserve
  3. Run decompositions

Today, step 1 happens after step 2 and that is the issue. Note that doing the functionalization and removing aliased ops before asking partitioners what ops to preserve has the same effect as the filter above.

100% agree, and this should be the long-term fix. Can we create an issue to track that? We can go with the current approach to unblock ourselves.

Contributor Author

Created tracking issue here: #8781

Contributor Author

@larryliu0820 are there any more concerns with this short-term fix?

Contributor

No, thanks for creating the issue.

@facebook-github-bot (Contributor)

@metascroy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Feb 27, 2025
…preservation (#8776)

Summary:
If a partitioner requests that to_edge_transform_and_lower keep mutable or aliasing ops (e.g., transpose, view, permute), lowering with ExecuTorch fails because those ops cannot be functionalized once they are wrapped as custom ops in the EDGE_DO_NOT_DECOMP namespace.

This PR filters out unsupported ops that backends request for preservation.


Differential Revision: D70333876

Pulled By: metascroy
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D70333876

@facebook-github-bot (Contributor)

@metascroy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

if op in [
    # Hits infinite recursion error when op is in
    # EDGE_DO_NOT_DECOMP namespace
    torch.ops.aten._to_copy.default,
Contributor

Perhaps I didn't understand the full context, but why can't a partitioner just not specify these ops for preservation? I don't think they'll be decomposed anyway?

@metascroy (Contributor Author) commented Feb 27, 2025

Because if a partitioner requests these for preservation, they are no longer aten ops (they become custom ops in the EDGE_DO_NOT_DECOMP namespace), and that runs into issues during export because PyTorch places more restrictions on custom ops than on aten ops.
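
As a rough illustration (the namespace and schema below are made up, not what to_edge_transform_and_lower actually generates): a custom op can be defined with an aliasing schema, but export's functionalization pass cannot handle the alias the way it handles native aten view ops.

import torch

# Hypothetical custom namespace standing in for EDGE_DO_NOT_DECOMP.
lib = torch.library.Library("MY_DO_NOT_DECOMP", "DEF")
# The Tensor(a) annotations declare that the return aliases the input;
# defining this succeeds, but a graph containing it fails to
# functionalize during export, unlike the native aten equivalent.
lib.define("my_transpose(Tensor(a) self, int dim0, int dim1) -> Tensor(a)")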

@mcr229 (Contributor) left a comment

seems ok for now, gonna accept, but do you mind linking the issue # to the _remove_invalid_ops_for_not_decompose function?

@facebook-github-bot (Contributor)

@metascroy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@metascroy (Contributor Author)

seems ok for now, gonna accept, but do you mind linking the issue # to the _remove_invalid_ops_for_not_decompose function?

Added

@metascroy merged commit 781b082 into main on Feb 28, 2025 (55 of 56 checks passed)
@metascroy deleted the coreml-part-up branch on February 28, 2025 20:39