Commit a8c5d0b

Merge branch 'main' into chunked-prefill-spec-dec
2 parents 95ab84c + 10dbf4f commit a8c5d0b

4 files changed: 47 additions and 19 deletions

docker/Makefile

Lines changed: 2 additions & 1 deletion
@@ -180,7 +180,8 @@ jenkins-aarch64_%: IMAGE_WITH_TAG = $(shell . ../jenkins/current_image_tags.prop
 jenkins-aarch64_%: STAGE = tritondevel
 
 # For x86_64
-jenkins-rockylinux8_%: IMAGE_WITH_TAG = $(shell . ../jenkins/current_image_tags.properties && echo $$LLM_ROCKYLINUX8_PY312_DOCKER_IMAGE)
+jenkins-rockylinux8_%: PYTHON_VERSION_TAG_ID = $(if $(findstring 3.12,${PYTHON_VERSION}),PY312,$(if $(findstring 3.10,${PYTHON_VERSION}),PY310,$(error Unknown PYTHON_VERSION specified)))
+jenkins-rockylinux8_%: IMAGE_WITH_TAG = $(shell . ../jenkins/current_image_tags.properties && echo $$LLM_ROCKYLINUX8_${PYTHON_VERSION_TAG_ID}_DOCKER_IMAGE)
 jenkins-rockylinux8_%: STAGE = tritondevel
 jenkins-rockylinux8_%: BASE_IMAGE = nvidia/cuda
 jenkins-rockylinux8_%: BASE_TAG = 12.9.0-devel-rockylinux8
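
The nested `$(if $(findstring ...))` expression added above can be hard to read at a glance. The following Python sketch mirrors its apparent behavior: a substring check for `3.12`, then for `3.10`, and an error otherwise. The function names `python_version_tag_id` and `rockylinux8_image_variable` are hypothetical and exist only for illustration; they are not part of the repository.

```python
def python_version_tag_id(python_version: str) -> str:
    """Mirror the Makefile's PYTHON_VERSION_TAG_ID selection (illustrative only)."""
    # $(findstring 3.12,${PYTHON_VERSION}) is a plain substring test,
    # so a full version such as "3.12.3" matches.
    if "3.12" in python_version:
        return "PY312"
    if "3.10" in python_version:
        return "PY310"
    # Corresponds to $(error Unknown PYTHON_VERSION specified).
    raise ValueError("Unknown PYTHON_VERSION specified")


def rockylinux8_image_variable(python_version: str) -> str:
    """Build the name of the variable that is looked up in current_image_tags.properties."""
    return f"LLM_ROCKYLINUX8_{python_version_tag_id(python_version)}_DOCKER_IMAGE"


if __name__ == "__main__":
    for version in ("3.12.3", "3.10.12"):
        print(version, "->", rockylinux8_image_variable(version))
```

Because the check is a substring match, an empty or unsupported `PYTHON_VERSION` falls through to the error branch, which is why the README examples pass `PYTHON_VERSION` explicitly.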

docker/README.md

Lines changed: 35 additions & 6 deletions
@@ -89,13 +89,10 @@ equivalent containers as [described above](#building-docker-images-with-gnu-make
 ### Jenkins Integration
 
 [`Makefile`](Makefile) has special targets for building, pushing and running the Docker build image used on Jenkins.
-The full image name and tag is defined in [`L0_MergeRequest.groovy`](../jenkins/L0_MergeRequest.groovy). The `make`
-system will parse this name as the value of `LLM_DOCKER_IMAGE`. To build and push a new Docker image for Jenkins,
-define a new image name and tag in [`L0_MergeRequest.groovy`](../jenkins/L0_MergeRequest.groovy) and run
+The full image names and tags are defined in [`current_image_tags.properties`](../jenkins/current_image_tags.properties). The `make`
+system will parse the names/tags from this file.
 
-```bash
-make -C docker jenkins_push
-```
+#### Running
 
 Start a new container using the same image as Jenkins using your local user account with
 
@@ -134,6 +131,38 @@ make -C docker trtllm_run LOCAL_USER=1 DOCKER_PULL=1
 The argument `DOCKER_PULL=1` instructs `make` to pull the latest version of the image before deploying it in the container.
 By default, the release images built in the above manner are tagged by their `git` branch name and may be frequently updated.
 
+#### Building CI images
+
+To build and push a new Docker image for Jenkins, define new image names and tags in [`current_image_tags.properties`](../jenkins/current_image_tags.properties) and run
+
+```bash
+# Commands assume an amd64 host
+make -C docker jenkins_build
+#
+docker buildx create --name multi-builder
+make -C docker jenkins-aarch64_build \
+DOCKER_BUILD_ARGS="--platform arm64 --builder=multi-builder"
+#
+# check jenkins/BuildDockerImage.groovy for current Python versions
+make -C docker jenkins-rockylinux8_build PYTHON_VERSION=3.12.3
+make -C docker jenkins-rockylinux8_build PYTHON_VERSION=3.10.12
+```
+
+The resulting images then need to be pushed:
+
+```bash
+sh -c '. jenkins/current_image_tags.properties && echo $LLM_DOCKER_IMAGE $LLM_SBSA_DOCKER_IMAGE $LLM_ROCKYLINUX8_PY310_DOCKER_IMAGE $LLM_ROCKYLINUX8_PY312_DOCKER_IMAGE' | tr ' ' '\n' | xargs -I{} docker push {}
+```
+
+Alternatively, it is possible to trigger the image build by opening a new pull request and commenting
+
+```text
+/bot run --stage-list "Build-Docker-Images"
+```
+
+The resulting images can then be re-tagged using `scripts/rename_docker_images.py`
+and the new tags included in [`current_image_tags.properties`](../jenkins/current_image_tags.properties).
+
 ### Docker rootless
 
 Some aspects require special treatment when using [Docker rootless mode](https://docs.docker.com/engine/security/rootless/). The `docker/Makefile` contains heuristics to detect Docker rootless mode. When assuming

jenkins/current_image_tags.properties

Lines changed: 7 additions & 4 deletions
@@ -8,7 +8,10 @@
 # NB: Although string interpolation is supported, redundant substrings are
 # kept in the variables below for interoperability with
 # scripts/rename_docker_images.py
-LLM_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.05-py3-x86_64-ubuntu24.04-trt10.11.0.33-skip-tritondevel-202507150652-9504
-LLM_SBSA_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.05-py3-aarch64-ubuntu24.04-trt10.11.0.33-skip-tritondevel-202507150652-9504
-LLM_ROCKYLINUX8_PY310_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-12.9.0-devel-rocky8-x86_64-rocky8-py310-trt10.11.0.33-skip-tritondevel-202507150652-9504
-LLM_ROCKYLINUX8_PY312_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-12.9.0-devel-rocky8-x86_64-rocky8-py312-trt10.11.0.33-skip-tritondevel-202507150652-9504
+#
+# NB: Typically, the suffix indicates the PR whose CI pipeline generated the images. In case that
+# images are adopted from PostMerge pipelines, the abbreviated commit hash is used instead.
+LLM_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.05-py3-x86_64-ubuntu24.04-trt10.11.0.33-skip-tritondevel-202507162011-ec3ebae
+LLM_SBSA_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.05-py3-aarch64-ubuntu24.04-trt10.11.0.33-skip-tritondevel-202507162011-ec3ebae
+LLM_ROCKYLINUX8_PY310_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-12.9.0-devel-rocky8-x86_64-rocky8-py310-trt10.11.0.33-skip-tritondevel-202507162011-ec3ebae
+LLM_ROCKYLINUX8_PY312_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:cuda-12.9.0-devel-rocky8-x86_64-rocky8-py312-trt10.11.0.33-skip-tritondevel-202507162011-ec3ebae
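
To check which images a properties file like the one above refers to, and which suffix each tag carries, a small parser can help. The sketch below assumes the file contains only plain `KEY=VALUE` lines and `#` comments, as in this hunk; it does not handle the shell string interpolation that the file's header comment says is supported. The helpers `parse_image_tags` and `tag_suffix` are hypothetical names used only for this example.

```python
def parse_image_tags(text: str) -> dict[str, str]:
    """Collect *_DOCKER_IMAGE entries from plain KEY=VALUE lines (no interpolation)."""
    images = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.endswith("_DOCKER_IMAGE"):
            images[key] = value
    return images


def tag_suffix(image: str) -> str:
    """Return the text after the last '-' (PR number or abbreviated commit hash)."""
    return image.rsplit("-", 1)[-1]


SAMPLE = """\
# NB: Typically, the suffix indicates the PR whose CI pipeline generated the images.
LLM_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.05-py3-x86_64-ubuntu24.04-trt10.11.0.33-skip-tritondevel-202507162011-ec3ebae
LLM_SBSA_DOCKER_IMAGE=urm.nvidia.com/sw-tensorrt-docker/tensorrt-llm:pytorch-25.05-py3-aarch64-ubuntu24.04-trt10.11.0.33-skip-tritondevel-202507162011-ec3ebae
"""

if __name__ == "__main__":
    for name, image in parse_image_tags(SAMPLE).items():
        print(name, tag_suffix(image))
```

In this commit the suffix changes from `9504` to `ec3ebae`, which matches the new comment's description of a PR identifier being replaced by an abbreviated commit hash.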

tensorrt_llm/_torch/pyexecutor/py_executor.py

Lines changed: 3 additions & 8 deletions
@@ -966,19 +966,14 @@ def _executor_loop(self):
     self._prepare_disagg_gen_transmission_complete(
         scheduled_batch)
 
+# Return the first token to the client
+self._handle_first_token_response(scheduled_batch)
+
 self.resource_manager.prepare_resources(scheduled_batch)
 if self.drafter is not None:
     self.drafter.prepare_draft_tokens(
         scheduled_batch, self.resource_manager)
 
-if self.kv_cache_transceiver:
-    # For generation requests which have completed KV cache transfer
-    self._prepare_disagg_gen_transmission_complete(
-        scheduled_batch)
-
-# Return the first token to the client
-self._handle_first_token_response(scheduled_batch)
-
 batch_outputs = self._forward_step(scheduled_batch)
 
 if self.guided_decoder is not None:
