Commit fe8839f

fix an error
Signed-off-by: Shixiaowei02 <[email protected]>
1 parent 012f575 commit fe8839f

1 file changed: +3 -3 lines changed

docs/source/blogs/tech_blog/blog5_Disaggregated_Serving_in_TensorRT-LLM.md

Lines changed: 3 additions & 3 deletions
@@ -16,7 +16,7 @@ By NVIDIA TensorRT-LLM Team
 - [Measurement Methodology](#Measurement-Methodology)
 - [DeepSeek R1](#DeepSeek-R1)
 - [ISL 4400 - OSL 1200 (Machine Translation Dataset)](#ISL-4400---OSL-1200-Machine-Translation-Dataset)
-- [ISL 8192 - OS L256 (Synthetic Dataset)](#ISL-8192---OS-L256-Synthetic-Dataset)
+- [ISL 8192 - OSL 256 (Synthetic Dataset)](#ISL-8192---OSL-256-Synthetic-Dataset)
 - [ISL 4096 - OSL 1024 (Machine Translation Dataset)](#ISL-4096---OSL-1024-Machine-Translation-Dataset)
 - [Reproducing Steps](#Reproducing-Steps)
 - [Future Work](#Future-Work)
@@ -218,7 +218,7 @@ For some data points on the performance curve, the context/generation instance n
 
 As shown in Figure 10, enabling MTP increases speedups of disaggregation over aggregation further, reaching 1.6x to 2.5x, averaging 20 – 30 % higher than MTP-off.
 
-#### ISL 8192 - OS L256 (Synthetic Dataset)
+#### ISL 8192 - OSL 256 (Synthetic Dataset)
 
 <div align="center">
 <figure>
@@ -251,7 +251,7 @@ By comparing the disaggregated serving E2E results with the “rate-matched” c
 
 <div align="center">
 <figure>
-<img src="../media/tech_blog5_Picture13.png" width="640" height="auto">
+<img src="../media/tech_blog5_Picture14.png" width="640" height="auto">
 </figure>
 </div>
 <p align="center"><sub><em>Figure 14. DeepSeek R1 E2E Pareto curves without MTP.</em></sub></p>
