🚀 The feature, motivation and pitch
Add support for the text decoder backbone of the new Gemma 3 (1B / 4B for edge) model.
Use --local_global_attention for the sliding window attention.
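For context, here is a rough sketch of how a sliding window (local) causal mask differs from a global one, in plain Python. The names `causal_mask` and `window` are made up for illustration and are not taken from any existing code; the sketch also does not show how the flag itself is parsed or which layers end up local versus global.

```python
def causal_mask(seq_len, window=None):
    """Build a seq_len x seq_len boolean mask.

    mask[i][j] is True when query position i may attend to key position j.
    With window=None the mask is a plain (global) causal mask; with an
    integer window, each query only sees itself and the window - 1 keys
    immediately before it. Both names are hypothetical.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            visible = j <= i  # causal: never attend to future positions
            if window is not None:
                visible = visible and (i - j) < window  # local: limit lookback
            row.append(visible)
        mask.append(row)
    return mask


global_mask = causal_mask(5)            # global attention layer
local_mask = causal_mask(5, window=2)   # sliding window attention layer
```

A local/global setup would then apply the windowed mask on some layers and the full causal mask on the others.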
Checkpoints are on HuggingFace:
cc @mergennachin @iseeyuan @lucylq @helunwencser @tarun292 @kimishpatel