[xnn update prep] deprecate sdpa #11506
Conversation
Stack from ghstack (oldest at bottom):
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11506
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV. If your PR is affected, please view it below.
❌ 4 New Failures, 7 Pending, 1 Unrelated Failure as of commit 6e7028e with merge base d4cc258.
NEW FAILURES: the following jobs have failed.
FLAKY: the following job failed, but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -2097,7 +2061,6 @@ DefineNodeFunc getDefineNodeFunc(fb_xnnpack::XNodeUnion nodeType) {
      _DEFINE(Concatenate4)
      _DEFINE(Concatenate5)
      _DEFINE(StaticSlice)
-     _DEFINE(ScaledDotProductAttention)
Not updating the schema to mark XNNScaledDotProductAttention as deprecated?
XNNPACK removed the operator from their codebase, so we need to delete it for the next update. I can mark the operator in the schema as deprecated, though.
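For illustration, a minimal sketch of what a guard at export time could look like; the names (`DEPRECATED_XNODES`, `serialize_node`) are hypothetical, not the actual ExecuTorch API:

```python
# Hypothetical guard in the XNNPACK AoT serialization layer; names are
# illustrative only. Once the runtime drops
# _DEFINE(ScaledDotProductAttention), a serialized SDPA node would fail
# at load time, so it is better to fail loudly at export time instead.
DEPRECATED_XNODES = {"XNNScaledDotProductAttention"}

def serialize_node(node_type: str, payload: dict) -> dict:
    if node_type in DEPRECATED_XNODES:
        raise RuntimeError(
            f"{node_type} is deprecated: XNNPACK removed the fused op; "
            "lower SDPA as decomposed ops instead."
        )
    return {"node_type": node_type, **payload}
```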
@@ -1,111 +0,0 @@
- # Copyright (c) Meta Platforms, Inc. and affiliates.
Should we continue partitioning this and lowering it as decomposed?
Otherwise this will be a BC-breaking change.
Ever since my new partitioner landed, SDPA has not been delegated. I assume that no models in production have really used any SDPA implementation since then (also because I heard our SDPA is slow), so I believe this is safe to remove. I can import it internally to make sure.
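A quick, illustrative way to double-check that: export a toy model and see whether SDPA survives as a plain aten node. This sketch only covers the export step, assuming a recent PyTorch; the real check would inspect the graph after partitioning/to_backend:

```python
import torch
from torch.export import export

class Attn(torch.nn.Module):
    def forward(self, q, k, v):
        return torch.nn.functional.scaled_dot_product_attention(q, k, v)

example = (torch.randn(1, 4, 8, 16),) * 3
ep = export(Attn(), example)

# If the partitioner no longer claims SDPA, the op stays in the graph
# as an aten call instead of disappearing into a delegate call:
print(any("scaled_dot_product_attention" in str(n.target)
          for n in ep.graph.nodes))
```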
What I was thinking was that we should handle the decomposition of SDPA inside the XNNPACK AoT flow so that we don't regress current perf. Lowering it in pieces is OK, but it can cause unexpected perf drops with slight changes.
And I guess when the constraint fails it will get decomposed and we will lower it in pieces anyway, right?
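For context, a minimal reference sketch of the decomposition under discussion, assuming the standard SDPA formulation (this is not ExecuTorch's actual decomposition pass):

```python
import math

import torch

def decomposed_sdpa(q, k, v, mask=None):
    # Scaled dot-product attention spelled out as the primitive ops
    # (matmul, mul, add, softmax) that the partitioner would lower
    # one by one once the fused XNNPACK op is gone.
    scale = 1.0 / math.sqrt(q.size(-1))
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale
    if mask is not None:
        scores = scores + mask
    attn = torch.softmax(scores, dim=-1)
    return torch.matmul(attn, v)
```

Each of those pieces partitions on its own, so a failed constraint already falls back to this path; the open question is whether the fused kernel ever beat the sum of the pieces.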
@mcr229 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
ghstack-source-id: 479c94a ghstack-comment-id: 2957149191 Pull Request resolved: pytorch/executorch#11506
ghstack-source-id: 929a8bc ghstack-comment-id: 2957149191 Pull Request resolved: pytorch#11506
@mcr229 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Stamping to unblock the rebase, but we should revisit this under the larger task of replacing the custom SDPA op with XNNPACK's in the future.
@@ -27,8 +27,6 @@ bool check_tensor_dtype(
        return executorch::runtime::tensor_is_floating_type(t);
      case SupportedTensorDtypes::INTB:
        return executorch::runtime::tensor_is_integral_type(t, true);
-     case SupportedTensorDtypes::BOOL:
Why is this change here? Bad rebase?
This change seems to have accidentally reverted several of my changes due to (presumably) a bad rebase. Differential Revision: [D77559984](https://our.internmc.facebook.com/intern/diff/D77559984/) [ghstack-poisoned]
Pull Request resolved: #12121 This change seems to have accidentally reverted several of my changes due to (presumably) a bad rebase. ghstack-source-id: 293898720 @exported-using-ghexport Differential Revision: [D77559984](https://our.internmc.facebook.com/intern/diff/D77559984/)
Differential Revision: D77559984 Pull Request resolved: #12121
This PR was created by the merge bot to help merge the original PR into the main branch. ghstack PR number: #12121 by @swolchok ^ Please use this as the source of truth for the PR details, comments, and reviews ghstack PR base: https://github.com/pytorch/executorch/tree/gh/swolchok/487/base ghstack PR head: https://github.com/pytorch/executorch/tree/gh/swolchok/487/head Merge bot PR base: https://github.com/pytorch/executorch/tree/main Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/swolchok/487/orig @diff-train-skip-merge Co-authored-by: Scott Wolchok <[email protected]>
Differential Revision: D77265464