Fix: fix the deterministic issue in the MTP Eagle path #5285
Signed-off-by: Fanrong Li <[email protected]>
Description
This PR updates the first draft forward in the MTP Eagle path to use all accepted tokens instead of just the last one. This allows the KV cache for the draft layer to be updated.
Before this fix, the KV cache never stored the key/value pair for the last draft token of each iteration (call it D), because D was only the output of the last draft forward pass and was never fed back in as an input. In the next iteration, if all draft tokens were accepted, we used the newly generated tokens as inputs for the first draft forward. Because the KV cache held incorrect key/value data for token D, identical inputs could produce different draft tokens.
With this fix, the KV cache will be updated in the first draft forward each iteration, making MTP Eagle deterministic.
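The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not the TensorRT-LLM implementation: the function names, the list-based cache, the token values, and the sum-based "prediction" are all hypothetical stand-ins chosen only to show why a cache slot that is never written makes the output depend on stale memory.

```python
def draft_forward(kv, tokens, start_pos):
    """Toy draft forward: write one cache entry per input token, then
    'predict' the next token from the whole cached prefix.
    The modular sum is a stand-in for attention over the KV cache."""
    for offset, tok in enumerate(tokens):
        kv[start_pos + offset] = tok
    end = start_pos + len(tokens)
    return sum(kv[:end]) % 1000


def first_draft_of_next_iteration(stale_value, use_all_accepted):
    """Run one iteration with two draft steps, then return the first draft
    token of the following iteration.

    stale_value models whatever data happens to sit in unwritten cache slots.
    """
    kv = [stale_value] * 8          # pre-allocated cache holding stale data
    prompt = [11, 12, 13]           # hypothetical context tokens

    # Iteration 1: two draft forwards. d2 plays the role of D: it is only
    # an output, so its cache slot (position 4) is never written here.
    d1 = draft_forward(kv, prompt, 0)
    d2 = draft_forward(kv, [d1], 3)

    # Iteration 2: assume both drafts were accepted and the target model
    # produced one additional token.
    bonus = 99
    accepted = [d1, d2, bonus]      # positions 3, 4, 5

    if use_all_accepted:
        # Behaviour after the fix: feeding all accepted tokens rewrites
        # positions 3..5, so the slot for D now holds correct data.
        return draft_forward(kv, accepted, 3)
    # Behaviour before the fix: only the last token is fed, so position 4
    # still holds stale data when the prefix is read.
    return draft_forward(kv, [bonus], 5)


if __name__ == "__main__":
    # Same inputs, different stale memory: the results diverge.
    print(first_draft_of_next_iteration(7, use_all_accepted=False))    # 178
    print(first_draft_of_next_iteration(500, use_all_accepted=False))  # 671
    # With all accepted tokens fed back in, the result is stable.
    print(first_draft_of_next_iteration(7, use_all_accepted=True))     # 243
    print(first_draft_of_next_iteration(500, use_all_accepted=True))   # 243
```

In this sketch the only difference between the two paths is which tokens the first draft forward receives. Feeding every accepted token costs a few extra positions in that forward pass but guarantees each cache slot read later was written from real inputs.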
I tested the DS-R1-FP4 model with the same dataset on the same node (with BS=1). The acceptance rate increases slightly:
Before the changes:
After:
For model accuracy, with this fix I enabled MTP Eagle and ran GPQA diamond twice with the same random seed. Both runs gave the same result, 70.202 ± 3.2586, which is expected.