
Qualcomm AI Engine Direct - Enable custom operator #8726


Merged: 4 commits into pytorch:main on Jun 13, 2025

Conversation

shewu-quic (Collaborator)

Summary:

  • Support registering an op package in the QNN backend
  • Add an example script that runs a torch custom op with a QNN op package
  • Allow an op package to override a torch built-in operator
  • Add an op package example
  • Modify the dlopen flags for the QNN library
  • Generate the custom op from the meta and _schema.arguments of the torch.fx.Node (see the sketch after this list)
  • Add a README for the custom op
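
The meta/_schema bullet deserves a concrete illustration. Below is a hedged sketch (illustrative, not the PR's exact code) of deriving a custom op's I/O description from a torch.fx.Node:

# Hedged sketch: how the output FakeTensor and the operator schema can be
# read off a torch.fx.Node whose target is an OpOverload.
import torch

def describe_custom_op(node: torch.fx.Node) -> None:
    out_meta = node.meta["val"]    # FakeTensor carrying output shape/dtype
    schema = node.target._schema   # torch._C.FunctionSchema of the OpOverload
    for arg, schema_arg in zip(node.args, schema.arguments):
        print(f"input {schema_arg.name}: {schema_arg.type} = {arg}")
    print(f"output: shape={tuple(out_meta.shape)}, dtype={out_meta.dtype}")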

Reproduce commands:

# Follow the README to install QPM
# Follow the README to install the Hexagon SDK and Hexagon tools
# install hexagon sdk 5.4.0 for SM8650
# qpm-cli --install hexagonsdk5.x --version 5.4.0.3 --path /path/to/Qualcomm/Hexagon_SDK/hexagon-sdk-5.4.0
# install hexagon sdk 6.0.0 for x86
# qpm-cli --install hexagonsdk6.x --version 6.0.0.2 --path /path/to/Qualcomm/Hexagon_SDK/hexagon-sdk-6.0.0
# install hexagon tool 8.8.02 for x86
# qpm-cli --extract hexagon8.8 --version 8.8.02.1 --path /path/to/Qualcomm/Hexagon_SDK/hexagon-sdk-6.0.0/tools/HEXAGON_Tools/8.8.02

export HEXAGON_SDK_ROOT=/path/to/hexagon-sdk-5.4.0
export ANDROID_NDK_ROOT=/path/to/android-ndk-r26c
# use clang-9.0.0
export X86_CXX=/path/to/clang++
# run custom op with example script
python3 examples/qualcomm/custom_op/custom_ops_1.py --build_folder build-android -s <device_serial> -H <host> -m SM8650 --op_package_dir examples/qualcomm/custom_op/example_op_package_htp/ExampleOpPackage --build_op_package
# run custom op with unit test
python3 backends/qualcomm/tests/test_qnn_delegate.py TestUtilScript.test_custom_op -b build-android -s <device_serial> -H <host> -m SM8650 --op_package_dir examples/qualcomm/custom_op/example_op_package_htp/ExampleOpPackage -r </path/to/executorch> -a </path/to/artifacts>

@shewu-quic shewu-quic requested a review from cccclai as a code owner February 26, 2025 09:11

pytorch-bot bot commented Feb 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8726

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit de02835 with merge base c6c3616:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 26, 2025
@shewu-quic (Collaborator, Author)

Hi @cccclai,

This PR adds support for custom kernels in the QNN backend.
Could you please take a look?
If you have any problems, please let me know. Thanks :)



@impl(my_op_lib, "mul3", dispatch_key="CompositeExplicitAutograd")
def mul3_impl(a: torch.Tensor) -> torch.Tensor:
    return 3 * a  # assumed reference implementation: multiply by 3
Contributor

I may not know enough details, but can you tell me how the qnn backend knows this custom op can be consumed? Previously I was thinking qnn custom ops could be registered in a specific namespace, but this looks a bit different to me.

Collaborator (Author)

Certainly. QNN uses the op package mechanism to create custom ops. Once the op package is prepared, it is registered using qnn_backend_register_op_package. In the op builder, you provide the corresponding op_package_name, qnn_op_type_name, and schema to create a QNN node.
In ExecuTorch, you pass QnnExecuTorchOpPackageOptions through compile_spec; it includes the op package info and the custom_op_name. Using the custom_op_name, such as my_op_lib.mul3.default, a CustomOp builder is created to consume the custom op.
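
For concreteness, a hedged sketch of the AoT wiring described above. The class names come from this PR, but the field names are assumptions for illustration and may not match the actual serialization schema:

# Hedged sketch: field names here are illustrative assumptions; see
# backends/qualcomm/serialization/qc_schema.py for the real definitions.
from executorch.backends.qualcomm.serialization.qc_schema import (
    QnnExecuTorchOpPackageInfo,
    QnnExecuTorchOpPackageOptions,
)

op_package_info = QnnExecuTorchOpPackageInfo(
    op_package_name="ExampleOpPackage",           # name the package registers
    op_package_path="libQnnExampleOpPackage.so",  # resolved via LD_LIBRARY_PATH
    custom_op_name="my_op_lib.mul3.default",      # the torch.fx node target
    qnn_op_type_name="ExampleCustomOp",
)
op_package_options = QnnExecuTorchOpPackageOptions(
    op_package_infos=[op_package_info],           # field name is an assumption
)
# op_package_options is then attached to the compile_spec handed to the
# QNN partitioner/backend.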

Contributor

Sorry a bit late on this, need some time to read

Contributor

Interesting approach: the custom op is authored as a torch custom op and then lowered through ET, with annotation and partitioning driven through compile_spec.

Curious, did you consider other design choices?

For example, if someone were to use a custom op, they have to write all of this; an alternative would be to take a set of "already partitioned" nodes and convert them into a custom op package in the preprocess function.

I have more questions but I haven't read the PR in detail yet :(

@shewu-quic (Collaborator, Author) commented May 14, 2025

For example, if someone were to use a custom op, they have to write all of this; an alternative would be to take a set of "already partitioned" nodes and convert them into a custom op package in the preprocess function.

Let me clarify your point about "already partitioned" nodes. Do you mean defining a namespace for QNN custom ops, identifying this namespace during partitioning to return True, and then creating the corresponding builder during the preprocess stage?

If so, my initial design choice to create the custom op builder during partitioning was because QNN provides an op validation API. When dealing with custom ops, users can implement how to validate the op within the op package. Therefore, I chose to create the custom op builder during partitioning to validate the op; a rough sketch of this flow follows.
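
A rough sketch of that partition-time validation, with illustrative names (this is an assumption about the shape of the code, not the PR's exact implementation; qnn_manager.IsNodeSupportedByBackend stands in for QNN's validation API):

# Hedged sketch, illustrative names only: validate a custom op at partition
# time by building its op wrapper and asking the QNN backend to validate it.
def is_custom_node_supported(self, node: torch.fx.Node) -> bool:
    builder = self.node_visitors.get(node.target.__name__)
    if builder is None:
        return False  # no op package/builder registered for this target
    op_wrapper = builder.define_node(node, self.nodes_to_wrappers)
    # The op package implements the validation logic invoked here.
    return self.qnn_manager.IsNodeSupportedByBackend([op_wrapper])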

I have more questions but I haven't read the PR in detail yet :(

Feel free to ask them whenever you're ready. :)

Contributor

Can we have HTP x86 simulator support to run the test?

Collaborator (Author)

Can we have HTP x86 simulator support to run the test?

Sure. Let me add it.

@digantdesai (Contributor) left a comment

The README is nice, I will try to reproduce this. Thanks!

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_custom_operator branch from 064d12f to 5f65810 Compare March 6, 2025 09:42
@digantdesai (Contributor)

Apologies, this is still pending. I will also help review this next week.

@cccclai (Contributor)

cccclai commented Apr 9, 2025

Will finish review by the end of this week, thank you for being patient.

@shewu-quic (Collaborator, Author)

Will finish review by the end of this week, thank you for being patient.

Thanks for your effort.
I want to add more details about the internal issue mentioned in today's meeting.
When we tried to implement a custom embedding op, we found an issue where a static variable is not freed normally after calling QNN backend free. You will get the following error if you use the macro DEF_TENSOR_PROPERTIES in the op implementation file.
(error screenshots attached)

We will try to create a workaround PR for this issue in the meantime, and reach out to the internal HTP backend owner to address it.

@cccclai cccclai requested review from sxu and billmguo April 17, 2025 19:52
@@ -44,7 +49,11 @@ def __init__(
skip_node_id_set: set = None,
skip_node_op_set: set = None,
):
self.node_visitors = node_visitor.get_node_visitors(edge_program)
python_options = flatbuffer_to_option(compiler_specs[0].value)
Contributor

What is python_options and why do we need this?

Contributor

If there is a custom op that users wrote for QNN and they failed to tag it, are we able to error out?

@shewu-quic (Collaborator, Author) commented Apr 22, 2025

These python options are used to create the custom op builders. If the user didn't add the custom op package info to compile_spec and the graph contains a custom op, an error will happen here:

op_wrapper = self.node_visitors[node.target.__name__].define_node(

The custom op builder cannot be found in node_visitor.
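
Put differently, the failure today is a missing-key lookup; an explicit guard (an illustrative sketch, not the PR's code) could surface a clearer error:

# Hedged sketch: turn the missing op-package failure into an explicit error.
# Names mirror the snippet above, but the guard itself is illustrative.
builder = self.node_visitors.get(node.target.__name__)
if builder is None:
    raise RuntimeError(
        f"No builder registered for '{node.target.__name__}'. If this is a "
        "custom op, pass its op package info via QnnExecuTorchOpPackageOptions "
        "in compile_spec."
    )
op_wrapper = builder.define_node(node, self.nodes_to_wrappers)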

@cccclai (Contributor)

cccclai commented Apr 21, 2025

Finally got time to finish reading; a couple of questions (can be follow-ups as well).

  1. It looks like users can define any custom op, and as long as they have the custom op package compiled and the package name defined as part of the compile spec, they can lower it to the QNN backend. It is indeed flexible and offers more options for users.
  2. It looks like the package path should be predefined AoT; however, the PoC who exports the model might not be the same as the PoC who runs the model on device. Maybe we can just specify the .so path on device and the package can be found automatically?

@cccclai (Contributor) left a comment

Approving, as I generally like the flexibility of this approach. The only thing I'm worried about is user experience, and I would like us to improve it.

Originally, our thought on supporting custom ops was to force users to register the custom ops under a specific namespace, like a qnn namespace. It's less flexible, but easier to debug, given that users are required to write the QNN op package anyway for QNN custom ops. It's similar to how you support the AIHub models. What do you think?

@shewu-quic (Collaborator, Author)

Finally got time to finish reading; a couple of questions (can be follow-ups as well).

  1. It looks like users can define any custom op, and as long as they have the custom op package compiled and the package name defined as part of the compile spec, they can lower it to the QNN backend. It is indeed flexible and offers more options for users.
  2. It looks like the package path should be predefined AoT; however, the PoC who exports the model might not be the same as the PoC who runs the model on device. Maybe we can just specify the .so path on device and the package can be found automatically?

  1. Yes, the flexibility is intended. It seems to me that those who use custom ops are advanced users.
  2. Got it. Actually, I share the same thought. For the runtime op package, we can add a follow-up that sets it via a runtime option. Regarding the AoT op package, I have tested that the op package can be found through LD_LIBRARY_PATH, due to the dlopen call in the QNN SDK. Therefore, users only need to specify the .so file and set LD_LIBRARY_PATH correctly to access the library.

@shewu-quic (Collaborator, Author)

Approving, as I generally like the flexibility of this approach. The only thing I'm worried about is user experience, and I would like us to improve it.

Originally, our thought on supporting custom ops was to force users to register the custom ops under a specific namespace, like a qnn namespace. It's less flexible, but easier to debug, given that users are required to write the QNN op package anyway for QNN custom ops. It's similar to how you support the AIHub models. What do you think?

That's fine with me. What are your thoughts on adding a namespace check on top of the current design?

@cccclai (Contributor)

cccclai commented Apr 22, 2025

Let me get some thoughts from @sxu and @billmguo, as it's one of their requests as well.

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_custom_operator branch from 5f65810 to eaf8aa6 Compare May 7, 2025 07:21
@shewu-quic (Collaborator, Author)

@cccclai I have rebased this PR. Thanks!

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_custom_operator branch from 5b04add to 3bbb5c0 Compare June 2, 2025 06:20
@cccclai cccclai added the release notes: qualcomm Changes to the Qualcomm backend delegate label Jun 2, 2025
@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai (Contributor)

cccclai commented Jun 2, 2025

The CI is currently failing because we pin to 338393f8 in flatbuffers internally, which is different from the one in executorch open source... https://github.com/google/flatbuffers/commits/338393f8/

@cccclai (Contributor)

cccclai commented Jun 3, 2025

Thanks for enabling x86 in the latest commit. Would you be able to help fix the issue due to the flatbuffers version mismatch? We may need to downgrade the flatbuffers in open source, unfortunately.

@shewu-quic (Collaborator, Author)

shewu-quic commented Jun 4, 2025

Thanks for enabling x86 in the latest commit. Would you be able to help fix the issue due to the flatbuffers version mismatch? We may need to downgrade the flatbuffers in open source, unfortunately.

Let me clarify the following.

  1. May I know whether this mismatch issue is occurring because of this PR?
  2. What problems did you encounter after the downgrade?

cc @haowhsu-quic for awareness

@cccclai (Contributor)

cccclai commented Jun 4, 2025

The mismatch issue is an existing issue, and this PR uses a newer flatbuffers feature, causing the build to fail when compiling with the old flatbuffers version. If you downgrade flatbuffers to this older version (https://github.com/google/flatbuffers/commits/338393f8/), the error message is like the following:

In file included from fbcode/executorch/backends/qualcomm/aot/python/PyQnnManagerAdaptor.cpp:8:
In file included from buck-out/v2/gen/fbcode/af0eca3ee0e06022/executorch/backends/qualcomm/aot/python/__PyQnnManagerAdaptor__/buck-private-headers/executorch/backends/qualcomm/aot/python/PyQnnManagerAdaptor.h:13:
In file included from buck-out/v2/gen/fbcode/af0eca3ee0e06022/executorch/backends/qualcomm/runtime/__runtime__/buck-headers/executorch/backends/qualcomm/runtime/QnnManager.h:15:
In file included from buck-out/v2/gen/fbcode/af0eca3ee0e06022/executorch/backends/qualcomm/runtime/__runtime__/buck-headers/executorch/backends/qualcomm/runtime/backends/QnnBackendFactory.h:13:
buck-out/v2/gen/fbcode/af0eca3ee0e06022/executorch/backends/qualcomm/runtime/__runtime__/buck-headers/executorch/backends/qualcomm/runtime/backends/QnnBackendCommon.h:64:26: error: too many template arguments for class template 'Vector'
   64 |       const flatbuffers::Vector<
      |                          ^
   65 |           flatbuffers::Offset<qnn_delegate::QnnExecuTorchOpPackageInfo>,
   66 |           flatbuffers::uoffset_t>* op_packages_info);
      |           ~~~~~~~~~~~~~~~~~~~~~~~
buck-out/v2/gen/fbsource/af0eca3ee0e06022/third-party/flatbuffers/__flatbuffers-api__/buck-headers/flatbuffers/flatbuffers.h:243:28: note: template is declared here
  243 | template<typename T> class Vector {
      | ~~~~~~~~~~~~~~~~~~~~       ^
1 error generated.

Action sub-errors produced by error handlers:
- [cxx_too_many_template_arguments]
Action failed: fbcode//executorch/backends/qualcomm/runtime:runtime (cfg:opt-linux-x86_64-fbcode-platform010-clang17-no-san-opt-by-default#af0eca3ee0e06022) (cxx_compile backends/QnnBackendCommon.cpp (pic))
Remote command returned non-zero exit code 1
Remote action, reproduce with: `frecli cas download-action 80ae0e035470c1af7018dd5d7929fcf5023e90d9d8911a5be6e46eeef50c35f3:145`
Stdout: <empty>
Stderr:
In file included from fbcode/executorch/backends/qualcomm/runtime/backends/QnnBackendCommon.cpp:8:
buck-out/v2/gen/fbcode/af0eca3ee0e06022/executorch/backends/qualcomm/runtime/__runtime__/buck-headers/executorch/backends/qualcomm/runtime/backends/QnnBackendCommon.h:64:26: error: too many template arguments for class template 'Vector'
   64 |       const flatbuffers::Vector<
      |                          ^
   65 |           flatbuffers::Offset<qnn_delegate::QnnExecuTorchOpPackageInfo>,
   66 |           flatbuffers::uoffset_t>* op_packages_info);
      |           ~~~~~~~~~~~~~~~~~~~~~~~
buck-out/v2/gen/fbsource/af0eca3ee0e06022/third-party/flatbuffers/__flatbuffers-api__/buck-headers/flatbuffers/flatbuffers.h:243:28: note: template is declared here
  243 | template<typename T> class Vector {
      | ~~~~~~~~~~~~~~~~~~~~       ^
fbcode/executorch/backends/qualcomm/runtime/backends/QnnBackendCommon.cpp:34:24: error: too many template arguments for class template 'Vector'
   34 |     const flatbuffers::Vector<
      |                        ^
   35 |         flatbuffers::Offset<qnn_delegate::QnnExecuTorchOpPackageInfo>,
   36 |         flatbuffers::uoffset_t>* op_packages_infos) {
      |         ~~~~~~~~~~~~~~~~~~~~~~~
buck-out/v2/gen/fbsource/af0eca3ee0e06022/third-party/flatbuffers/__flatbuffers-api__/buck-headers/flatbuffers/flatbuffers.h:243:28: note: template is declared here
  243 | template<typename T> class Vector {

@shewu-quic (Collaborator, Author)

When I check out 338393f8 in flatbuffers, I get the following error with ./backends/qualcomm/scripts/build.sh.
Am I missing anything?

Error log:

Error while generating /local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/build-android/executorch_srcs.cmake. Exit code: 1
Output:

Error:
2025-06-05 17:18:30,928 [ExecuTorch] ERROR: Failed to query buck for sources. Failed command:
buck2 cquery inputs(deps('//runtime/executor:program')) --target-platforms shim_et//:android-arm64 This is likely due to missing git submodules or outdated CMake cache. Please run the following before retry: ./install_executorch.sh --clean
git submodule sync
git submodule update --init

Traceback (most recent call last):
File "/local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/tools/cmake/buck_util.py", line 34, in run
cp: subprocess.CompletedProcess = subprocess.run(
File "/local/mnt/workspace/miniconda3/envs/executorch/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/buck2-bin/buck2-2025-05-06-201beb86106fecdc84e30260b0f1abb5bf576988', 'cquery', "inputs(deps('//runtime/executor:program')
)", '--target-platforms', 'shim_et//:android-arm64']' returned non-zero exit status 3.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/tools/cmake/extract_sources.py", line 255, in
main()
File "/local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/tools/cmake/extract_sources.py", line 240, in main
target_to_srcs[name] = sorted(target.get_sources(graph, runner, buck_args))
File "/local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/tools/cmake/extract_sources.py", line 144, in get_sources
raise e
File "/local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/tools/cmake/extract_sources.py", line 132, in get_sources
sources: set[str] = set(runner.run(["cquery", query] + buck_args))
File "/local3/mnt/workspace/shewu/executorch/executorch_shewu/executorch/tools/cmake/buck_util.py", line 42, in run
raise RuntimeError(ex.stderr.decode("utf-8")) from ex
RuntimeError: [2025-06-05T17:18:17.612+08:00] Starting new buck2 daemon...
[2025-06-05T17:18:30.149+08:00] Connected to new buck2 daemon.
[2025-06-05T17:18:30.198+08:00] Build ID: b928dfaf-981f-44d0-9077-9d3406669451
Command failed:
Error in configured node dependency, dependency chain follows (-> indicates depends on, ^ indicates same configuration as previous):
root//runtime/executor:program (shim_et//:android-arm64#529b86ff5de06a9c)
-> root//runtime/executor:program_no_prim_ops (^)
-> root//runtime/executor:pte_data_map (^)
-> root//schema:program (^)
-> root//schema:generate_program (^)
-> root//third-party:flatc (^)

Caused by:
0: looking up unconfigured target node root//third-party:flatc
1: Error loading targets in package root//third-party for target root//third-party:flatc
2: Error evaluating build file: root//third-party:TARGETS
3: Traceback (most recent call last):
* third-party/TARGETS:123, in
runtime.cxx_library(
* shim_et/xplat/executorch/build/runtime_wrapper.bzl:249, in _cxx_library
_cxx_library_common(*args, **kwargs)
* shim_et/xplat/executorch/build/runtime_wrapper.bzl:242, in _cxx_library_common
env.cxx_library(*args, **kwargs)

   error: Error coercing attribute `raw_headers` of `root//third-party:flatc_library`
     --> shim_et/xplat/executorch/build/runtime_wrapper.bzl:242:5
       |
   242 |     env.cxx_library(*args, **kwargs)
       |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       |
4: Error coercing attribute `raw_headers` of type `attrs.list(attrs.source(), default=[])`
5: Error coercing ["flatbuffers/include/flatbuffers/allocator.h", "flatbuffers/include/flatbuffers/array.h", "flatbuffers/include/flatbuffers/base.h", "flatbuffers/include/flatbuffers/buffer.h", "flatbuffers/include/flatbuffers

/buffer_ref.h", "flatbuffers/include/flatbuffers/code_generator.h", "flatbuffers/include/flatbuffers/default_allocator.h", "flatbuffers/include/flatbuffers/detached_buffer.h", "flatbuffers/include/flatbuffers/file_manager.h", "flat
buffers/include/flatbuffers/flatbuffer_builder.h", "flatbuffers/include/flatbuffers/flatbuffers.h", "flatbuffers/include/flatbuffers/flex_flat_util.h", "flatbuffers/include/flatbuffers/flexbuffers.h", "flatbuffers/include/flatbuffe
rs/hash.h", "flatbuffers/include/flatbuffers/idl.h", "flatbuffers/include/flatbuffers/minireflect.h", "flatbuffers/include/flatbuffers/reflection.h", "flatbuffers/include/flatbuffers/reflection_generated.h", "flatbuffers/include/fl
atbuffers/registry.h", "flatbuffers/include/flatbuffers/stl_emulation.h", "flatbuffers/include/flatbuffers/string.h", "flatbuffers/include/flatbuffers/struct.h", "flatbuffers/include/flatbuffers/table.h", "flatbuffers/include/flatb
uffers/util.h", "flatbuffers/include/flatbuffers/vector.h", "flatbuffers/include/flatbuffers/vector_downward.h", "flatbuffers/include/flatbuffers/verifier.h"]
6: Error coercing "flatbuffers/include/flatbuffers/allocator.h"
7: Coercing flatbuffers/include/flatbuffers/allocator.h as a source
8: Source file flatbuffers/include/flatbuffers/allocator.h does not exist as a member of package root//third-party.

CMake Error at tools/cmake/Utils.cmake:109 (message):
executorch: source list generation failed
Call Stack (most recent call first):
CMakeLists.txt:303 (extract_sources)

-- Configuring incomplete, errors occurred!

@shewu-quic (Collaborator, Author)

Whoops, I got it. That commit is very different from the current one; it is missing some header files...

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_custom_operator branch from eff8673 to a22cdd2 Compare June 5, 2025 10:12
@shewu-quic (Collaborator, Author)

It seems flatbuffers::Vector only takes one template argument in that version (it is declared as template<typename T> class Vector), so the explicit flatbuffers::uoffset_t argument has to be dropped. I've made the adjustments. Could you please try again?

@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai (Contributor)

cccclai commented Jun 5, 2025

It seems like the flatbuffers::Vector issue is resolved now.

@cccclai (Contributor)

cccclai commented Jun 5, 2025

But there is a different issue...

Error log:

from executorch.backends.qualcomm.serialization.qc_schema import (
  File "/data/sandcastle/boxes/eden-trunk-hg-full-fbsource/buck-out/v2/gen/fbcode/192ecd433d810457/executorch/examples/models/llama/fb/__test_llama_qnn__/test_llama_qnn#link-tree/executorch/backends/qualcomm/serialization/qc_schema.py", line 178, in <module>
    @dataclass
    ^^^^^^^^^
  File "/usr/local/fbcode/platform010/lib/python3.12/dataclasses.py", line 1275, in dataclass
    return wrap(cls)
    ^^^^^^^^^
  File "/usr/local/fbcode/platform010/lib/python3.12/dataclasses.py", line 1265, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/fbcode/platform010/lib/python3.12/dataclasses.py", line 994, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/fbcode/platform010/lib/python3.12/dataclasses.py", line 852, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default for field op_package_options is not allowed: use default_factory

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_custom_operator branch from a22cdd2 to f28ab0e Compare June 6, 2025 06:36
@shewu-quic
Copy link
Collaborator Author

But there is a different issue...

Error log

It seems that a dataclass cannot use a mutable value as a class-attribute default.
I have changed it to use default_factory, as the error suggests. Could you take another shot? Thanks.
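
A minimal sketch of the kind of fix involved, using stand-in class names rather than the actual schema classes:

# Hedged sketch with stand-in names: a mutable dataclass default must be
# created through default_factory, not shared as a class attribute.
from dataclasses import dataclass, field

@dataclass
class OpPackageOptions:  # stand-in for QnnExecuTorchOpPackageOptions
    op_package_infos: list = field(default_factory=list)

@dataclass
class Options:  # stand-in for the enclosing options dataclass
    # op_package_options: OpPackageOptions = OpPackageOptions()  # ValueError
    op_package_options: OpPackageOptions = field(default_factory=OpPackageOptions)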

@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai (Contributor)

cccclai commented Jun 6, 2025

Hmm, it seems like the import failed; can you rebase?

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_custom_operator branch from f28ab0e to 655e416 Compare June 9, 2025 01:55
@shewu-quic (Collaborator, Author)

Hmm, it seems like the import failed; can you rebase?

Done. Thanks :)

@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


@cccclai (Contributor)

cccclai commented Jun 10, 2025

Hmm, our infra was a bit flaky last week; can you rebase again? It should have recovered by now.

Summary:
- Support registering an op package in the QNN backend
- Add an example script that runs a torch custom op with a QNN op package
- Allow an op package to override a torch built-in operator
- Add an op package example
  - Move test_custom_op to TestUtilScript
- Modify the dlopen flags for the QNN library
- Generate the custom op from the meta and _schema.arguments of the torch.fx.Node
- Add a README for the custom op
@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_custom_operator branch from 655e416 to de02835 Compare June 11, 2025 03:32
@facebook-github-bot (Contributor)

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


@cccclai cccclai merged commit 67b6009 into pytorch:main Jun 13, 2025
103 checks passed