[Platform] import activation_quant_fusion for CUDA only #23882
Conversation
Force-pushed from 94e8e36 to 16767d0
Code Review
The pull request aims to fix an AttributeError on platforms without custom ops by adding hasattr checks. While the intention is correct, the implementation is incomplete and introduces new runtime errors because other parts of the code unconditionally assume these ops are available. The pattern registration logic needs to be made conditional to fully resolve the issue.
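To illustrate what the review is pointing at, here is a minimal sketch (SILU_MUL_OP mirrors the PR; register_silu_mul_pattern and its registry argument are hypothetical): guarding only the global lookup with hasattr lets the module import cleanly, but any registration code that dereferences the op unconditionally still breaks.

```python
import torch

# Guarded lookup: importing this module no longer raises AttributeError
# on platforms where the custom op was never registered.
if hasattr(torch.ops._C, "silu_and_mul"):
    SILU_MUL_OP = torch.ops._C.silu_and_mul.default
else:
    SILU_MUL_OP = None


def register_silu_mul_pattern(registry: list) -> None:
    # Without this early return, registration would hand a None op to
    # the pattern matcher and fail at runtime -- the registration logic
    # itself must be made conditional, as the review notes.
    if SILU_MUL_OP is None:
        return
    registry.append(SILU_MUL_OP)
```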
Force-pushed from 16767d0 to b96e1cf
https://github.com/vllm-project/vllm/blob/main/vllm/compilation/fusion.py#L41-L54
@elvischenv there is a platform check when importing the file, so it's fine with custom platforms (vllm/compilation/pass_manager.py, lines 10 to 12 at 006477e). And I'm not sure if …
Thanks for fixing this! Maybe we need some refactoring to avoid the op registration issue across multiple platforms.
This is very ugly, but I understand that we can't set these globals because of other platforms. I think we've already dealt with this in other files in a nicer way; could you look around briefly?
Yeah, I see the link now. Can you just make the pass a conditional import like it's done for fusion?
Yes, this is only for CUDA. Let's follow this conditional import as @ProExpertProg mentioned. Thanks.
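For reference, a minimal sketch of the conditional-import pattern being agreed on here, modeled on how fusion is handled in vllm/compilation/pass_manager.py; the pass-manager class shown is a hypothetical stand-in, and the constructor signature of ActivationQuantFusionPass is assumed rather than quoted.

```python
from vllm.platforms import current_platform

# The module performs torch.ops._C lookups at import time, so it is
# imported only on platforms where those custom ops are registered.
if current_platform.is_cuda():
    from vllm.compilation.activation_quant_fusion import (
        ActivationQuantFusionPass)


class PassManagerSketch:
    """Hypothetical stand-in for the real pass manager."""

    def __init__(self):
        self.passes = []

    def configure(self, config):
        # Construct the CUDA-only pass behind the same platform guard
        # that protected its import above.
        if current_platform.is_cuda():
            self.passes.append(ActivationQuantFusionPass(config))
```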
Force-pushed from b96e1cf to 280c59a
@elvischenv @ProExpertProg Done, and tested with vllm-ascend again; it works.
Signed-off-by: wangxiyuan <[email protected]>
Force-pushed from 280c59a to 8d8c2c5
Purpose
The commit #23671 added custom init logic like
SILU_MUL_OP = torch.ops._C.silu_and_mul.default
but this doesn't work on platforms that don't support custom ops: it raises an error when setting up vLLM. This PR imports activation_quant_fusion only in the CUDA case; a sketch of the failure mode follows.
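To make the failure mode concrete, a minimal sketch (the try/except stands in for the observed behavior and is not code from the PR):

```python
import torch

try:
    # On CUDA builds this resolves to the registered custom op. On
    # platforms that never registered vLLM's _C ops (e.g. Ascend), the
    # attribute lookup raises AttributeError at import time, breaking
    # vLLM setup -- hence the CUDA-only import in this PR.
    SILU_MUL_OP = torch.ops._C.silu_and_mul.default
except AttributeError:
    SILU_MUL_OP = None
```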
Test Plan
This is a change for custom platforms. No new tests are needed, and existing tests should not break. The vLLM Ascend e2e tests will be happy after this change.
Test Result
Custom platforms work well after this change.