@dfalbel dfalbel commented Jul 24, 2025

Updates for LibTorch v2.7.1

On CI, when running with CUDA 12.8, we see:

CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1

This is likely because our CI GPU, an NVIDIA GTX 1080 Ti, uses the Pascal architecture (compute capability 6.1), which is no longer included in the architecture list used for the CUDA 12.8 builds. See:

TORCH_CUDA_ARCH_LIST="5.0;6.0;7.0;7.5;8.0;8.6"
case ${CUDA_VERSION} in
    12.8)
        TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6;9.0;10.0;12.0+PTX" #removing sm_50-sm_70 as these architectures are deprecated in CUDA 12.8 and will be removed in future releases
        EXTRA_CAFFE2_CMAKE_FLAGS+=("-DATEN_NO_TEST=ON")
        ;;
    12.6)
        TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST};9.0"
        EXTRA_CAFFE2_CMAKE_FLAGS+=("-DATEN_NO_TEST=ON")
        ;;
    12.4)
        TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST};9.0"
        EXTRA_CAFFE2_CMAKE_FLAGS+=("-DATEN_NO_TEST=ON")
        ;;
    11.8)
        TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST};3.7;9.0"
        EXTRA_CAFFE2_CMAKE_FLAGS+=("-DATEN_NO_TEST=ON")
        ;;
    *)
        echo "unknown cuda version $CUDA_VERSION"
        exit 1
        ;;
esac

in the repo

Thus, CI now uses CUDA 12.6. Going forward, we'll need to upgrade the runner's GPU or move to a different CI runner.
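
To make the reasoning above concrete, here is a minimal sketch that checks whether a device's compute capability is covered by a `TORCH_CUDA_ARCH_LIST`-style string. The `arch_supported` helper is hypothetical and not part of the PyTorch build scripts. It assumes the usual CUDA compatibility rules: a compiled kernel for `X.Y` runs on devices with the same major version and an equal or higher minor version, and a `+PTX` entry can be JIT-compiled on any equal or newer device. Architecture-specific variants are ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of the PyTorch build scripts):
# succeeds if compute capability $1 is covered by the arch list $2.
arch_supported() {
    local cap="$1" list="$2"
    local cap_major="${cap%%.*}" cap_minor="${cap##*.}"
    local entry arch major minor
    local -a entries
    IFS=';' read -r -a entries <<< "$list"
    for entry in "${entries[@]}"; do
        arch="${entry%+PTX}"      # strip the optional +PTX suffix
        major="${arch%%.*}"
        minor="${arch##*.}"
        # Compiled kernels: same major version, device minor >= built minor.
        if (( cap_major == major && cap_minor >= minor )); then
            return 0
        fi
        # PTX: can be JIT-compiled on any equal or newer capability.
        if [[ "$entry" == *+PTX ]] &&
           (( cap_major > major || (cap_major == major && cap_minor >= minor) )); then
            return 0
        fi
    done
    return 1
}

# Arch lists as they resolve in the build script quoted above.
cuda_12_8_list="7.5;8.0;8.6;9.0;10.0;12.0+PTX"
cuda_12_6_list="5.0;6.0;7.0;7.5;8.0;8.6;9.0"

# 6.1 is the compute capability of the Pascal GPU on the CI runner.
for list in "$cuda_12_8_list" "$cuda_12_6_list"; do
    if arch_supported "6.1" "$list"; then
        echo "6.1 covered by: $list"
    else
        echo "6.1 NOT covered by: $list"
    fi
done
```

Under these assumptions, the 12.8 list has no entry that covers 6.1, while the 12.6 list covers it through the `6.0` entry, which is consistent with the error seen on CI and with the fallback to CUDA 12.6.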

@dfalbel dfalbel added the lantern label (Use this label if your PR affects lantern so it's built in the CI) Jul 24, 2025
@dfalbel dfalbel merged commit 3f1bb59 into main Jul 25, 2025
0 of 8 checks passed
@dfalbel dfalbel deleted the libtorch-v2.7.1 branch July 25, 2025 16:10