add quantized vision transformer model #6545
Comments
@WZMIAOMIAO Quantizing models used to be a more involved process, as Eager Mode Quantization often requires overwriting the class to introduce additional Quant/DeQuant stubs and structuring your code so that elements can be easily replaced (using nn modules instead of functionals, etc.). The PyTorch Core team is working on a series of APIs that would make quantizing models easier. FX Graph Mode Quantization is a new API that would allow you to do this; see #5797 for details. Work on the API is still in progress and might take a bit longer to complete, but it's worth keeping an eye on, as it might remove most of the past complexities. @andrewor14 Do you have any idea how close we are to finalizing the API and reopening the above PR?
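For context, the manual rewrite described above looks roughly like the following. This is a minimal sketch with a hypothetical toy model (`SmallNet`, not a torchvision model), showing the QuantStub/DeQuantStub placement and module-based ops the comment refers to; exact API namespaces vary slightly across PyTorch versions.

```python
# Minimal Eager Mode post-training static quantization sketch.
# SmallNet is a hypothetical toy model, not a torchvision model.
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Stubs mark where tensors enter and leave the quantized region.
        self.quant = torch.ao.quantization.QuantStub()
        self.dequant = torch.ao.quantization.DeQuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()  # an nn module (not torch.relu) so it can be fused/replaced

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

torch.backends.quantized.engine = "fbgemm"  # x86 server backend
model = SmallNet().eval()
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
torch.ao.quantization.fuse_modules(model, [["conv", "relu"]], inplace=True)
torch.ao.quantization.prepare(model, inplace=True)   # insert observers
model(torch.randn(1, 3, 16, 16))                     # calibration pass
torch.ao.quantization.convert(model, inplace=True)   # swap in quantized modules
out = model(torch.randn(1, 3, 16, 16))
```

Every op on the quantized path has to exist as a named submodule so that `fuse_modules`/`convert` can find and replace it, which is why functional calls in `forward` break this workflow.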
Hi @datumbox, the FX graph mode quantization API should more or less be finalized at this point, cc'ing @jerryzh168 just to confirm. However, due to recent priority shifts I no longer have the bandwidth to continue this work. I do think it's close to being done and it would be worth it to finish the remaining work. If it is high priority I can check with the team to see if we can prioritize this.
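By contrast with the eager workflow, FX Graph Mode traces the model and inserts the quant/dequant logic into the graph automatically, so no class rewrite is needed. A minimal sketch (API names as in recent `torch.ao` releases; exact import paths may differ by version):

```python
# Minimal FX Graph Mode post-training static quantization sketch.
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval()
example_inputs = (torch.randn(1, 3, 16, 16),)

# No QuantStub/DeQuantStub needed: FX rewrites the traced graph itself.
prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)
prepared(*example_inputs)            # calibration pass
quantized = convert_fx(prepared)     # lower to quantized ops
out = quantized(*example_inputs)
```

The trade-off is that the model must be symbolically traceable, which is its own constraint for models with data-dependent control flow.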
@andrewor14 We would definitely like to complete the work, expose the new API to users, and build trust in the solution. I appreciate, though, that you might be limited by bandwidth. Please do check with your team and let me know if that's something we could tackle in Q4.
@datumbox Well, the FX Graph Mode Quantization tool is very convenient, but I found a bug when quantizing MobileNetV2. If I set

But if I delete the env:
@andrewor14 Any thoughts on @WZMIAOMIAO's bug report above? Should he open an issue on Core for review?
🚀 The feature
Hi, thanks for your great work. I hope a quantized ViT model can be added (for PTQ or QAT).
Motivation, pitch
In 'torchvision/models/quantization' there are several quantized models (Eager Mode Quantization) that are very useful for learning quantization. In recent years, Transformer models have become very popular. I want to learn how to quantize Transformer models, e.g. Vision Transformer, Swin Transformer, etc., using official PyTorch tools like Eager Mode Quantization. I also tried to modify the model myself, but failed: I don't know how to quantize 'pos_embedding' (an nn.Parameter) or the nn.MultiheadAttention module. Looking forward to your reply.
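One possible Eager Mode approach to the pos_embedding question, sketched below as an unverified illustration with a hypothetical toy module: plain tensor arithmetic like `x + self.pos_embedding` is invisible to Eager Mode Quantization, so the addition has to go through `FloatFunctional` and the float parameter through its own `QuantStub` before `convert()` can swap in quantized equivalents. (For attention, PyTorch ships a quantizable variant, `torch.ao.nn.quantizable.MultiheadAttention`, intended to substitute for `nn.MultiheadAttention` during quantization.)

```python
# Hedged sketch for quantizing a learned positional embedding in Eager Mode.
# PatchEmbedWithPos is a hypothetical toy module, not torchvision's ViT.
import torch
import torch.nn as nn
from torch.ao.nn.quantized import FloatFunctional

class PatchEmbedWithPos(nn.Module):
    def __init__(self, seq_len=4, dim=8):
        super().__init__()
        self.pos_embedding = nn.Parameter(torch.zeros(1, seq_len, dim))
        self.quant_pos = torch.ao.quantization.QuantStub()  # observes/quantizes the parameter
        self.add = FloatFunctional()  # module-form add that convert() can replace

    def forward(self, x):
        pos = self.quant_pos(self.pos_embedding)
        # In float mode this is a plain torch.add; after convert() it becomes
        # a quantized add with its own output scale/zero-point.
        return self.add.add(x, pos)

m = PatchEmbedWithPos()
y = m(torch.randn(1, 4, 8))  # runs unquantized; prepare/convert as usual for PTQ
```

Whether this yields acceptable accuracy for a full ViT is a separate question; it only shows how to make the operations visible to the eager quantization machinery.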
Alternatives
No response
Additional context
No response