Use backend to replace macro to control enablement of MNNVL all reduce #4635
Conversation
Thanks, Jerry. This indeed makes the code cleaner. June |
/bot run |
PR_Github #6402 [ run ] triggered by Bot |
PR_Github #6402 [ run ] completed with state |
/bot run --stage-list="DGX_H100-4_GPUs-PyTorch-Others-1" |
/bot run --stage-list="DGX_H100-4_GPUs-PyTorch-Others-1" |
PR_Github #6459 [ run ] triggered by Bot |
PR_Github #6459 [ run ] completed with state |
/bot run |
PR_Github #7074 [ run ] triggered by Bot |
PR_Github #7074 [ run ] completed with state |
/bot run |
PR_Github #7105 [ run ] triggered by Bot |
PR_Github #7105 [ run ] completed with state |
/bot run --disable-fail-fast |
PR_Github #8389 [ run ] triggered by Bot |
PR_Github #8389 [ run ] completed with state |
The failure in fp8_block_scaling_gemm is not related to this change. The error is: RuntimeError: Assertion failed: cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize, smem_size) == cudaSuccess. It's a known issue, so the run needs to be repeated. |
/bot run --stage-list="DGX_H100-4_GPUs-PyTorch-DeepSeek-1" |
PR_Github #8487 [ run ] triggered by Bot |
Signed-off-by: Hui Gao <[email protected]> Signed-off-by: Hui Gao <[email protected]>
Signed-off-by: Hui Gao <[email protected]> Signed-off-by: Hui Gao <[email protected]>
Signed-off-by: Hui Gao <[email protected]> Signed-off-by: Hui Gao <[email protected]>
Signed-off-by: Hui Gao <[email protected]>
Signed-off-by: Hui Gao <[email protected]>
Signed-off-by: Hui Gao <[email protected]> Signed-off-by: Hui Gao <[email protected]>
Signed-off-by: Hui Gao <[email protected]>
mappingg Signed-off-by: Hui Gao <[email protected]>
/bot run --stage-list="DGX_H100-4_GPUs-PyTorch-DeepSeek-1" --comment="rerun after rebase with that known issue is waived " |
PR_Github #8499 Bot args parsing error: usage: /bot [-h] |
/bot run --stage-list="DGX_H100-4_GPUs-PyTorch-DeepSeek-1" |
PR_Github #8506 [ run ] triggered by Bot |
PR_Github #8506 [ run ] completed with state |
Signed-off-by: Hui Gao <[email protected]>
/bot run --stage-list="DGX_H100-4_GPUs-PyTorch-DeepSeek-1" |
PR_Github #8555 [ run ] triggered by Bot |
PR_Github #8555 [ run ] completed with state |
/bot skip --comment="After rerun, all multi-gpu stages passed." |
PR_Github #8587 [ skip ] triggered by Bot |
PR_Github #8587 [ skip ] completed with state |
No description provided.