Bug Description
Hello, I am compiling my model with TensorRT on a Jetson AGX Orin developer kit and would like to make use of the DLAs on the system. By default the DLA local DRAM pool is set to 1024 MiB, and I need to increase it because of the size of some of the layers in my network. The network is a simple U-Net, but the first layer produces feature maps of shape [32, 64, 592, 784], which are large enough to require more DRAM to execute on the DLA. In `torch_tensorrt.compile` I set the kwarg `dla_local_dram_size` to a larger value, e.g. twice the default, but when I run the script to build the engine, the DLA local DRAM is still the default 1024 MiB.
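For reference, a rough back-of-the-envelope check (my own estimate, ignoring any DLA padding or alignment) of why the default pool is too small: a single fp16 feature map of that shape already exceeds 1024 MiB on its own.

```python
# Approximate size of one fp16 feature map of shape [32, 64, 592, 784].
# fp16 = 2 bytes per element; this ignores DLA-internal padding/alignment.
elements = 32 * 64 * 592 * 784
size_mib = elements * 2 / 1024**2
print(f"{size_mib:.0f} MiB")  # -> 1813 MiB, well above the 1024 MiB default
```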
To Reproduce
Steps to reproduce the behavior:
```python
import torch
import torch_tensorrt

from unet import UNet  # custom U-Net; inherits from nn.Module, so any torchvision model would also reproduce this

model = UNet(3, 64, 1).eval().half()
inputs = [torch_tensorrt.Input([32, 3, 592, 784], dtype=torch.half)]
enabled_precisions = {torch.half}
device = torch_tensorrt.Device("dla:0", allow_gpu_fallback=True)

trt_ts_model = torch_tensorrt.compile(
    model,
    inputs=inputs,
    enabled_precisions=enabled_precisions,
    device=device,
    dla_local_dram_size=2 * 1024**3,  # 2 GiB, double the 1024 MiB default
)
```
Expected behavior
Expected the TensorRT engine to be built with a DLA local DRAM pool of 2048 MiB, but the pool remains at the default 1024 MiB.
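To be explicit about the expected value: `dla_local_dram_size` takes a size in bytes, and the value passed above works out to exactly twice the default.

```python
# dla_local_dram_size is specified in bytes; convert to MiB to compare
# against the 1024 MiB default reported in the build logs.
requested = 2 * 1024**3           # value passed to torch_tensorrt.compile
print(requested // 1024**2)       # -> 2048 (MiB), i.e. double the default
```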
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- Torch-TensorRT Version (e.g. 1.0.0): 1.4.0
- PyTorch Version (e.g. 1.0): 2.0.0
- CPU Architecture: aarch64
- OS (e.g., Linux): Linux
- How you installed PyTorch (`conda`, `pip`, `libtorch`, source): pip
- Build command you used (if compiling from source):
- Are you using local sources or building from archives: yes, torch_tensorrt is built from source
- Python version: 3.8.10
- CUDA version: 11.4
- GPU models and configuration: NVIDIA Jetson AGX Orin Developer Kit
- Any other relevant information: