Hi developers,
Thanks for such a great project that introduces FP8 training.
I am trying to use FP8 computation in ColossalAI, but after checking the code and the blog post, it looks like the FP8 implementation uses current scaling. Could you confirm whether that is the case?
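To make clear what I mean by current scaling, here is a minimal sketch. It is my own illustration in plain Python, not ColossalAI's code: the function names `current_scale` and `scale_and_clamp` are hypothetical, and the actual rounding to FP8 is left out. The only fixed fact it relies on is that the largest finite value in the FP8 E4M3 format is 448.

```python
# Illustrative sketch only; not taken from ColossalAI.
E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def current_scale(values):
    """Current scaling: compute the scale from this tensor's own max magnitude."""
    amax = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where no scaling is needed.
    return E4M3_MAX / amax if amax > 0 else 1.0


def scale_and_clamp(values, scale):
    """Scale into the FP8 range and clamp; rounding to FP8 is omitted here."""
    return [max(-E4M3_MAX, min(E4M3_MAX, v * scale)) for v in values]


x = [0.5, -2.0, 1.0]
s = current_scale(x)            # amax is 2.0, so the scale is 448 / 2
scaled = scale_and_clamp(x, s)  # the largest magnitude now maps to 448
```

In this sketch the scale is recomputed from the tensor being cast at every call, which is what I understand "current scaling" to mean.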
I also have another question: does ColossalAI FP8 support group_gemm ops in MoE models?