Skip to content

Conversation

tomeras91
Copy link
Collaborator

Nemotron-H correctness tests doesn't just check that the completions are as expected, but rather also validates that logprobs aren't too far from reference logprobs. Initially, the test was implemented using a manual generation for-loop, since the LLM API didn't support returning context and generation logprobs.

After PRs #4538 and #4819 added context and generation logits support for the LLM API, it can now be used to compare logprobs as well, removing the need for a manual generation loop. It is also better to use the LLM API in the unittest since that's the standard way to use a model in TRTLLM.

Therefore, this PR merges the old Nemotron-H correctness and LLM API unittests, such that now the LLM API is used in the Nemotron-H correctness test.

Also, the reference generation logprobs were updated, since turns out there was a bug in the position_ids in the old manual generation loop. This further emphasizes why, when possible, standard APIs should be used instead of custom code.

tomeras91 added 19 commits June 5, 2025 11:13
…failed or succeeded (2) don't add BOS token to match expected outputs

Signed-off-by: Tomer Asida <[email protected]>
…y state_indices during forward pass. Now LLM API test passes

Signed-off-by: Tomer Asida <[email protected]>
…-LLM into fix-nemotron-h-warmup

Signed-off-by: Tomer Asida <[email protected]>
…eference logprobs from commit 5ce1102 after fix

Signed-off-by: Tomer Asida <[email protected]>
Signed-off-by: Tomer Asida <[email protected]>
@tomeras91 tomeras91 requested a review from Copilot June 10, 2025 16:57
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

@tomeras91 tomeras91 requested review from amitz-nv and omera-nv June 10, 2025 16:57
@tomeras91
Copy link
Collaborator Author

FYI @vegaluisjose @suyoggupta

@tomeras91 tomeras91 changed the title [feat] Use LLM API for Nemotron-h correctness test [test] Use LLM API for Nemotron-h correctness test Jun 10, 2025
@tomeras91 tomeras91 changed the title [test] Use LLM API for Nemotron-h correctness test [test] Use LLM API for Nemotron-H correctness test Jun 10, 2025
@tomeras91
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8337 [ run ] triggered by Bot

Copy link
Collaborator

@vegaluisjose vegaluisjose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@omera-nv omera-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8337 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6039 completed with status: 'FAILURE'

…, and other tests don't necessarily clean memory after themselves

Signed-off-by: Tomer Asida <[email protected]>
@tomeras91
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8358 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8358 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #6052 completed with status: 'FAILURE'

@tomeras91
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8419 [ run ] triggered by Bot

@tomeras91
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8440 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8419 [ run ] completed with state ABORTED

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8440 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #6119 completed with status: 'FAILURE'

@tomeras91
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8493 [ run ] triggered by Bot

Copy link
Collaborator

@amitz-nv amitz-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of get context & generation logits LGTM.
Note that logprobs support was merged this morning: #4836 - using it can probably save some more code from this test

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8493 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #6155 completed with status: 'FAILURE'

@tomeras91
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8530 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8530 [ run ] completed with state FAILURE

@tomeras91
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8535 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #8535 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6188 completed with status: 'SUCCESS'

@shaharmor98 shaharmor98 merged commit 06d9f1e into NVIDIA:main Jun 12, 2025
3 checks passed
@tomeras91 tomeras91 deleted the llm-api-for-nemotron-h-correctness-test branch June 12, 2025 07:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants