
Conversation

@netanel-haber (Collaborator)

No description provided.

@netanel-haber force-pushed the user/nhaber/feature/align_sample_state_with_trtllm_sampler_sample_state branch from af06fcd to 27fd1d1 on May 19, 2025 at 13:06
@netanel-haber (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #6632 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #6632 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #4848 completed with status: 'FAILURE'

@netanel-haber force-pushed the user/nhaber/feature/align_sample_state_with_trtllm_sampler_sample_state branch 12 times, most recently from 8345da8 to b07fa8e, on June 4, 2025 at 11:54
@netanel-haber (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #7516 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #7516 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #5451 completed with status: 'FAILURE'

@netanel-haber (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #7698 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #7698 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #5588 completed with status: 'FAILURE'

@netanel-haber force-pushed the user/nhaber/feature/align_sample_state_with_trtllm_sampler_sample_state branch 2 times, most recently from f48672a to 71783a4, on June 5, 2025 at 16:11
@netanel-haber marked this pull request as ready for review on June 5, 2025 at 16:16
@netanel-haber requested review from a team as code owners on June 5, 2025 at 16:16
@netanel-haber (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #9539 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #9539 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7000 completed with status: 'SUCCESS'
Pipeline passed with automatically retried tests. Check the rerun report for details.

@Funatiq requested a review from Copilot on June 20, 2025 at 07:05
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR refactors how speculative samplers handle new token formatting by unifying on a single TorchSampler.Args structure, streamlining decoder factory logic, and replacing legacy sampler implementations.

  • Refactored get_spec_decoder to accept TorchSampler.Args and updated MTP/Eagle3OneModel sampler constructors.
  • Consolidated request iteration via ScheduledRequests.all_requests(), replacing itertools.chain across the codebase.
  • Removed outdated Eagle3Sampler/Eagle3Decoder classes and integrated SeqSlotManager for draft slot management.
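The consolidation in the second bullet can be sketched as follows. This is a hypothetical illustration of the pattern, not the actual TensorRT-LLM class: the attribute names and list-based return shape are assumptions for the sake of the example.

```python
# Hypothetical sketch: replacing scattered itertools.chain calls with a
# single ScheduledRequests.all_requests() method, as the refactor describes.
# Attribute names here are illustrative assumptions, not the real API.
import itertools
from dataclasses import dataclass, field


@dataclass
class ScheduledRequests:
    context_requests: list = field(default_factory=list)
    generation_requests: list = field(default_factory=list)

    def all_requests(self) -> list:
        # One canonical place to concatenate both request kinds, instead of
        # each call site building its own itertools.chain.
        return self.context_requests + self.generation_requests


batch = ScheduledRequests(["ctx0", "ctx1"], ["gen0"])
# Before the refactor, call sites would do something like:
legacy = list(itertools.chain(batch.context_requests, batch.generation_requests))
assert batch.all_requests() == legacy  # same ordering, one method call
```

Centralizing the concatenation also makes it natural for the method to return a list (as the scheduler change in this PR does), so callers can take `len()` or index into it without materializing an iterator first.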

Reviewed Changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| speculative/utils.py | Updated decoder factory signature and imports for spec decoders |
| speculative/mtp.py | Refactored MTPSampler constructor, updated stop criteria logic |
| speculative/eagle3.py | Removed legacy sampler/decoder classes, added new sampler class |
| pyexecutor/seq_slot_manager.py | Simplified resource prep using all_requests() |
| pyexecutor/scheduler.py | Changed all_requests to a method returning a list |
| pyexecutor/py_executor.py | Integrated SeqSlotManager, updated logits field assignments |
| pyexecutor/model_engine.py | Introduced BEAM_WIDTH, centralized batch indexing logic |
| pyexecutor/llm_request.py | Added py_is_draft flag to LlmRequest |
| pyexecutor/guided_decoder.py | Replaced itertools.chain with all_requests() |
| pyexecutor/_util.py | Centralized sampler instantiation with create_torch_sampler_args |
| auto_deploy/shim/ad_executor.py | Updated AD executor to use TorchSampler.Args and slot manager |
Comments suppressed due to low confidence (3)

tensorrt_llm/_torch/speculative/mtp.py:314

  • The returned SampleStateMTP no longer includes a logits field, which may be accessed downstream in the executor (e.g., in _executor_loop_pp). Consider preserving or setting device.logits and host.logits in SampleStateMTP to avoid missing attribute errors.

tensorrt_llm/_torch/speculative/utils.py:83

  • [nitpick] The parameter name sampler_args is more verbose than other code that uses args for TorchSampler parameters. Consider renaming it to args for consistency and brevity.
def get_spec_decoder(sampler_args: TorchSampler.Args, spec_config: SpecConfig):

tensorrt_llm/_torch/pyexecutor/model_engine.py:1160

  • [nitpick] The nonlocal mtp_batch_idx declaration appears after a conditional return in the nested py_batch_idx function. For clarity, move the nonlocal statement to the top of the function body before any logic.
            nonlocal mtp_batch_idx
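The style point in this last comment can be illustrated with a small closure. The function and variable names echo the review comment, but the enclosing logic is an invented stand-in and does not reproduce model_engine.py:

```python
# Sketch of the suggested fix: declare `nonlocal` at the top of the nested
# function body, before any conditional return, so a reader immediately sees
# which enclosing-scope name the closure mutates. Surrounding logic is
# hypothetical, for illustration only.
def make_batch_indexer(is_mtp: bool):
    mtp_batch_idx = 0

    def py_batch_idx(request: dict) -> int:
        nonlocal mtp_batch_idx  # declared first, before any branching
        if not is_mtp:
            return request["seq_slot"]
        idx = mtp_batch_idx
        mtp_batch_idx += 1
        return idx

    return py_batch_idx


indexer = make_batch_indexer(is_mtp=True)
assert [indexer({"seq_slot": 5}), indexer({"seq_slot": 7})] == [0, 1]
```

Python permits the `nonlocal` statement anywhere before the name's first use in the function, so placement after an early return is legal; hoisting it to the top is purely a readability convention.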

@netanel-haber requested a review from dcampora on June 20, 2025 at 09:31
@suyoggupta (Collaborator) left a comment

AD changes LGTM

@netanel-haber merged commit 58a8a8f into NVIDIA:main on Jun 23, 2025
3 checks passed
netanel-haber added a commit to netanel-haber/TensorRT-LLM that referenced this pull request Jun 25, 2025
…er new_tokens format (NVIDIA#4401)"

This reverts commit 58a8a8f.

Signed-off-by: Netanel Haber <[email protected]>
litaotju pushed a commit that referenced this pull request Jun 26, 2025
@netanel-haber deleted the user/nhaber/feature/align_sample_state_with_trtllm_sampler_sample_state branch on July 1, 2025 at 09:56
dominicshanshan pushed 16 commits to dominicshanshan/TensorRT-LLM that referenced this pull request between Jul 9 and Jul 11, 2025
5 participants