[DRAFT] refactor: PyExecutor uses a list-type for response handling #5406

jaedeok-nvidia · 2025-06-23T17:07:23Z

PyExecutor._enqueue_responses now accepts a list of LlmResponse instead of a dictionary mapping req_id to response. Since each response already contains its request_id, using req_id as a dictionary key is redundant and causes issues when num_return_sequences > 1, where multiple responses share the same request_id.

This change aligns PyExecutor with the C++ Executor's behavior, which returns std::vector<Response>. The previous dictionary-based approach overwrote responses that shared a request_id, losing all but the last one. With this fix, all responses for multi-sequence generation are preserved.

Changes:

- _enqueue_responses now accepts List[LlmResponse] instead of Dict[int, LlmResponse]
- All callers updated: _handle_errors, _handle_cancelled_requests, _handle_first_token_response, _handle_responses
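
The overwrite described above can be reproduced with a minimal sketch. This is only an illustration, not the PyExecutor code: the `LlmResponse` dataclass below is a hypothetical stand-in for the real class, and its `sequence_index` and `text` fields, as well as the helpers `collect_as_dict` and `collect_as_list`, are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LlmResponse:
    # Hypothetical stand-in for the real response type; only the fields
    # needed for this illustration are modeled.
    request_id: int
    sequence_index: int  # assumed field, distinguishes sequences of one request
    text: str


def collect_as_dict(responses: List[LlmResponse]) -> Dict[int, LlmResponse]:
    # Dictionary style: keyed by request_id, so a later response with the
    # same request_id replaces the earlier one.
    collected: Dict[int, LlmResponse] = {}
    for resp in responses:
        collected[resp.request_id] = resp
    return collected


def collect_as_list(responses: List[LlmResponse]) -> List[LlmResponse]:
    # List style: every response is kept; the request_id is read from the
    # response itself when needed.
    return list(responses)


# Two responses for the same request, as with num_return_sequences = 2.
responses = [
    LlmResponse(request_id=7, sequence_index=0, text="first"),
    LlmResponse(request_id=7, sequence_index=1, text="second"),
]

as_dict = collect_as_dict(responses)
as_list = collect_as_list(responses)

print(len(as_dict), len(as_list))
```

With two responses sharing one request_id, the dictionary ends up with a single entry while the list keeps both, which is the behavior the list-based signature is meant to provide.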

Signed-off-by: Jaedeok Kim <[email protected]>

jaedeok-nvidia · 2025-06-23T17:13:52Z

This is a draft PR for discussion purposes. Please DO NOT MERGE yet.

jaedeok-nvidia · 2025-06-23T17:33:10Z

/bot run

tensorrt-cicd · 2025-06-23T17:41:03Z

PR_Github #9611 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-23T19:23:14Z

PR_Github #9611 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7064 completed with status: 'FAILURE'

jaedeok-nvidia · 2025-06-25T16:17:37Z

Closing this PR because it is no longer needed for #5415. We will revisit it later if needed.

jaedeok-nvidia requested a review from a team as a code owner June 23, 2025 17:07

jaedeok-nvidia requested a review from schetlur-nv June 23, 2025 17:07

jaedeok-nvidia self-assigned this Jun 23, 2025

jaedeok-nvidia changed the title ~~refactor: PyExecutor uses a list-type for response handling~~ [DRAFT] refactor: PyExecutor uses a list-type for response handling Jun 23, 2025

jaedeok-nvidia mentioned this pull request Jun 24, 2025

fix: Enable num_return_sequences (n) support in PyTorch backend #5415

Closed

jaedeok-nvidia removed the request for review from schetlur-nv June 25, 2025 16:16

jaedeok-nvidia closed this Jun 27, 2025
