Add trt decoder #307
base: main
Conversation
Force-pushed from bd14f16 to c34be87
- Add trt_decoder class implementing TensorRT-accelerated inference
- Support both ONNX model loading and pre-built engine loading
- Include precision configuration (fp16, bf16, int8, fp8, tf32, best)
- Add hardware platform detection for capability-based precision selection
- Implement CUDA memory management and stream-based execution
- Add Python utility script for ONNX to TensorRT engine conversion
- Update CMakeLists.txt to build TensorRT decoder plugin
- Add comprehensive parameter validation and error handling
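The capability-based precision selection mentioned above can be sketched as follows. The compute-capability thresholds reflect general NVIDIA GPU generations (FP8 on SM 8.9+, BF16/TF32 on SM 8.0+, FP16 tensor cores on SM 7.0+); the function names and the exact selection policy are illustrative assumptions, not the PR's actual code:

```python
def supported_precisions(sm_major, sm_minor):
    """Return precision modes a GPU of the given compute capability can
    typically use with TensorRT (illustrative thresholds, not the PR's code)."""
    cc = sm_major * 10 + sm_minor
    modes = ["tf32"] if cc >= 80 else []
    if cc >= 70:
        modes.append("fp16")   # Volta and newer: FP16 tensor cores
    if cc >= 80:
        modes.append("bf16")   # Ampere and newer
    if cc >= 89:
        modes.append("fp8")    # Ada/Hopper and newer
    modes.append("int8")       # widely supported, but needs calibration data
    return modes

def pick_best(sm_major, sm_minor,
              preference=("fp8", "bf16", "fp16", "tf32", "int8")):
    """Resolve a 'best' precision setting to the highest-preference
    mode the hardware supports (assumed policy)."""
    avail = supported_precisions(sm_major, sm_minor)
    return next(p for p in preference if p in avail)
```

For example, on an SM 7.0 (Volta) part this policy would fall back to fp16, while SM 8.9+ parts would select fp8.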
Force-pushed from c34be87 to 9e97e26
Signed-off-by: Scott Thornton <[email protected]>
libs/qec/python/cudaq_qec/plugins/tensorrt_utils/build_engine_from_onnx.py
import tensorrt as trt

def build_engine(onnx_file,
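For context on the snippet under review, a minimal sketch of what an ONNX-to-TensorRT build function of this shape typically looks like. The flag mapping and the function body are illustrative assumptions, not the PR's actual implementation; `tensorrt` is imported lazily so the precision table can be inspected without the library installed:

```python
# Illustrative mapping of the PR's precision strings to TensorRT builder flags.
# "tf32" is TensorRT's default and "best" typically enables every applicable
# flag, so neither maps to a single flag here (assumption).
PRECISION_TO_FLAG = {"fp16": "FP16", "bf16": "BF16", "int8": "INT8", "fp8": "FP8"}

def build_engine(onnx_file, engine_file, precision="fp16"):
    """Parse an ONNX model and serialize a TensorRT engine (sketch)."""
    import tensorrt as trt  # lazy import: only needed when actually building

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_file, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("ONNX parse failed")

    config = builder.create_builder_config()
    if precision in PRECISION_TO_FLAG:
        config.set_flag(getattr(trt.BuilderFlag, PRECISION_TO_FLAG[precision]))

    # build_serialized_network returns a host-memory blob to write to disk
    serialized = builder.build_serialized_network(network, config)
    with open(engine_file, "wb") as f:
        f.write(serialized)
```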
Is this file exposed as part of the wheel such that regular users will be able to use it?
…trix) Signed-off-by: Scott Thornton <[email protected]>
…ecoder model, added to unittest Signed-off-by: Scott Thornton <[email protected]>
I, Scott Thornton <[email protected]>, hereby add my Signed-off-by to this commit: 9e97e26 Signed-off-by: Scott Thornton <[email protected]>
/ok to test fb16b36
@wsttiger, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/
/ok to test c9e563f
Add TensorRT Decoder Plugin for Quantum Error Correction

Overview

This PR introduces a new TensorRT-based decoder plugin for quantum error correction, leveraging NVIDIA TensorRT for accelerated neural network inference in QEC applications.

Key Features

Technical Implementation

trt_decoder class implementing the decoder interface with a TensorRT backend

Files Added/Modified

libs/qec/include/cudaq/qec/trt_decoder_internal.h - Internal API declarations
libs/qec/lib/decoders/plugins/trt_decoder/trt_decoder.cpp - Main decoder implementation
libs/qec/lib/decoders/plugins/trt_decoder/CMakeLists.txt - Plugin build configuration
libs/qec/python/cudaq_qec/plugins/tensorrt_utils/build_engine_from_onnx.py - Python utility
libs/qec/unittests/test_trt_decoder.cpp - Comprehensive unit tests

Testing

Usage Example

Dependencies

Performance Benefits

This implementation provides a production-ready TensorRT decoder plugin that can significantly accelerate quantum error correction workflows while maintaining compatibility with the existing CUDA-Q QEC framework.
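A hypothetical usage sketch in Python: the decoder name "trt_decoder" comes from this PR, but the `get_decoder` call follows the general CUDA-Q QEC plugin convention, and the `onnx_load_path` keyword argument is an assumption for illustration, not confirmed by this PR:

```python
import numpy as np

def make_trt_decoder(H, onnx_path):
    """Instantiate the TensorRT decoder plugin (hypothetical API sketch)."""
    import cudaq_qec as qec  # lazy import: needs CUDA-Q QEC with the plugin built
    # "trt_decoder" is the plugin name added by this PR; the keyword
    # argument name is an assumption for illustration.
    return qec.get_decoder("trt_decoder", H, onnx_load_path=onnx_path)

# Example parity-check matrix for a tiny 3-qubit repetition code
H = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=np.uint8)
```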