Skip to content

chore: Remove CUDNN dependencies #2804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 2 additions & 4 deletions .github/workflows/docker_builder.yml
Original file line number Diff line number Diff line change
Expand Up @@ -44,18 +44,16 @@ jobs:
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

# Automatically detect TensorRT and cuDNN default versions for Torch-TRT build
# Automatically detect TensorRT default versions for Torch-TRT build
- name: Build Docker image
env:
DOCKER_TAG: ${{ env.DOCKER_REGISTRY }}/${{ steps.fix_slashes.outputs.container_name }}
run: |
python3 -m pip install pyyaml
TRT_VERSION=$(python3 -c "import versions; versions.tensorrt_version()")
echo "TRT VERSION = ${TRT_VERSION}"
CUDNN_VERSION=$(python3 -c "import versions; versions.cudnn_version()")
echo "CUDNN VERSION = ${CUDNN_VERSION}"

DOCKER_BUILDKIT=1 docker build --build-arg TENSORRT_VERSION=$TRT_VERSION --build-arg CUDNN_VERSION=$CUDNN_VERSION -f docker/Dockerfile --tag $DOCKER_TAG .
DOCKER_BUILDKIT=1 docker build --build-arg TENSORRT_VERSION=$TRT_VERSION -f docker/Dockerfile --tag $DOCKER_TAG .

- name: Push Docker image
env:
Expand Down
26 changes: 5 additions & 21 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ Torch-TensorRT is distributed in the ready-to-run NVIDIA [NGC PyTorch Container]

## Building a docker container for Torch-TensorRT

We provide a `Dockerfile` in `docker/` directory. It expects a PyTorch NGC container as a base but can easily be modified to build on top of any container that provides, PyTorch, CUDA, cuDNN and TensorRT. The dependency libraries in the container can be found in the <a href="https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html">release notes</a>.
We provide a `Dockerfile` in `docker/` directory. It expects a PyTorch NGC container as a base but can easily be modified to build on top of any container that provides, PyTorch, CUDA, and TensorRT. The dependency libraries in the container can be found in the <a href="https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html">release notes</a>.

Please follow this instruction to build a Docker container.

Expand Down Expand Up @@ -152,14 +152,13 @@ bash ./compile.sh
You need to start by having CUDA installed on the system, LibTorch will automatically be pulled for you by bazel,
then you have two options.

#### 1. Building using cuDNN & TensorRT tarball distributions
#### 1. Building using TensorRT tarball distributions

> This is recommended so as to build Torch-TensorRT hermetically and insures any bugs are not caused by version issues

> Make sure when running Torch-TensorRT that these versions of the libraries are prioritized in your `$LD_LIBRARY_PATH`

1. You need to download the tarball distributions of TensorRT and cuDNN from the NVIDIA website.
- https://developer.nvidia.com/cudnn
1. You need to download the tarball distributions of TensorRT from the NVIDIA website.
- https://developer.nvidia.com/tensorrt
2. Place these files in a directory (the directories `third_party/dist_dir/[x86_64-linux-gnu | aarch64-linux-gnu]` exist for this purpose)
3. Compile using:
Expand All @@ -168,25 +167,16 @@ then you have two options.
bazel build //:libtorchtrt --compilation_mode opt --distdir third_party/dist_dir/[x86_64-linux-gnu | aarch64-linux-gnu]
```

#### 2. Building using locally installed cuDNN & TensorRT
#### 2. Building using locally installed TensorRT

> If you find bugs and you compiled using this method please disclose you used this method in the issue
> (an `ldd` dump would be nice too)

1. Install TensorRT, CUDA and cuDNN on the system before starting to compile.
1. Install TensorRT and CUDA on the system before starting to compile.
2. In `WORKSPACE` comment out

```py
# Downloaded distributions to use with --distdir
http_archive(
name = "cudnn",
urls = ["<URL>",],

build_file = "@//third_party/cudnn/archive:BUILD",
sha256 = "<TAR SHA256>",
strip_prefix = "cuda"
)

http_archive(
name = "tensorrt",
urls = ["<URL>",],
Expand All @@ -201,12 +191,6 @@ and uncomment

```py
# Locally installed dependencies
new_local_repository(
name = "cudnn",
path = "/usr/",
build_file = "@//third_party/cudnn/local:BUILD"
)

new_local_repository(
name = "tensorrt",
path = "/usr/",
Expand Down
243 changes: 0 additions & 243 deletions cmake/Modules/FindcuDNN.cmake

This file was deleted.

2 changes: 0 additions & 2 deletions cmake/dependencies.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,9 @@ endif()

# If the custom finders are needed at this point, there are good chances that they will be needed when consuming the library as well
install(FILES "${CMAKE_SOURCE_DIR}/cmake/Modules/FindTensorRT.cmake" DESTINATION "${CMAKE_INSTALL_LIBDIR}/cmake/torchtrt/Modules")
install(FILES "${CMAKE_SOURCE_DIR}/cmake/Modules/FindcuDNN.cmake" DESTINATION "${CMAKE_INSTALL_LIBDIR}/cmake/torchtrt/Modules")

# CUDA
find_package(CUDAToolkit REQUIRED)
find_package(cuDNN REQUIRED) # Headers are needed somewhere

# libtorch
find_package(Torch REQUIRED)
Expand Down
1 change: 0 additions & 1 deletion core/plugins/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@ target_link_libraries(${lib_name}
TensorRT::nvinfer_plugin
torch
core_util
cuDNN::cuDNN
PRIVATE
Threads::Threads
)
Expand Down
1 change: 0 additions & 1 deletion dev_dep_versions.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@
__version__: "2.4.0.dev0"
__cuda_version__: "12.1"
__cudnn_version__: "8.9"
__tensorrt_version__: "10.0.1"
6 changes: 1 addition & 5 deletions docker/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,6 @@ ENV BASE_IMG=nvidia/cuda:12.1.1-devel-ubuntu22.04
ARG TENSORRT_VERSION
ENV TENSORRT_VERSION=${TENSORRT_VERSION}
RUN test -n "$TENSORRT_VERSION" || (echo "No tensorrt version specified, please use --build-arg TENSORRT_VERSION=x.y to specify a version." && exit 1)
ARG CUDNN_VERSION
ENV CUDNN_VERSION=${CUDNN_VERSION}
RUN test -n "$CUDNN_VERSION" || (echo "No cudnn version specified, please use --build-arg CUDNN_VERSION=x.y to specify a version." && exit 1)

ARG PYTHON_VERSION=3.10
ENV PYTHON_VERSION=${PYTHON_VERSION}
Expand All @@ -35,13 +32,12 @@ RUN wget -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-instal
RUN pyenv install -v ${PYTHON_VERSION}
RUN pyenv global ${PYTHON_VERSION}

# Install CUDNN + TensorRT + dependencies
# Install TensorRT + dependencies
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
RUN mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/7fa2af80.pub
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
RUN apt-get update
Comment on lines 36 to 40
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part can be removed I think. Reference:

RUN pyenv global ${PYTHON_VERSION}
# Install TensorRT + dependencies
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
RUN apt-get update
RUN apt-get install -y libnvinfer10=${TENSORRT_VERSION}.* libnvinfer-plugin10=${TENSORRT_VERSION}.* libnvinfer-dev=${TENSORRT_VERSION}.* libnvinfer-plugin-dev=${TENSORRT_VERSION}.* libnvonnxparsers10=${TENSORRT_VERSION}.* libnvonnxparsers-dev=${TENSORRT_VERSION}.*
# Setup Bazel via Bazelisk
RUN wget -q https://github.com/bazelbuild/bazelisk/releases/download/v1.17.0/bazelisk-linux-amd64 -O /usr/bin/bazel &&\

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave it like this. I've removed this as a part of #2793

RUN apt-get install -y libcudnn8=${CUDNN_VERSION}* libcudnn8-dev=${CUDNN_VERSION}*

RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
Expand Down
Loading