Pull requests: NVIDIA/TransformerEngine
Honor COMPACT data_format for FP8 blockwise scales in MoE up-projection path to remove 5× redundant rowwise_scale_inv.T.contiguous() passes
#2199
opened Sep 24, 2025 by
xiaoxi-wangfj
2 of 13 tasks
[JAX] Update JAX version requirement in pyproject.toml
2.8.0
#2197
opened Sep 23, 2025 by
phu0ngng
5 of 13 tasks
[PyTorch] fix int32 overflow in permute kernels
#2196
opened Sep 23, 2025 by
hxbai
1 of 13 tasks
[PyTorch] Add max_score support for MuonClip
2.9.0
#2195
opened Sep 22, 2025 by
cyanguwa
8 of 13 tasks
[Feature] Enable rope application with offsets for training
2.8.0
#2188
opened Sep 19, 2025 by
sudhakarsingh27
1 of 13 tasks
Context Parallel integration tests with a transformer layer: BSHD and THD + CP
2.9.0
#2176
opened Sep 16, 2025 by
jomitchellnv
7 of 13 tasks
blockwise fp8 weight memory optimization: on-demand columnwise fp8 weight creation
#2168
opened Sep 10, 2025 by
skydoorkai
7 of 13 tasks
[Pytorch] Support for Swiglu Activation used in GPT OSS
#2161
opened Sep 8, 2025 by
vthumbe1503
8 of 13 tasks
Add support for the FP8 Block Scaling (ie. Deepseek) recipe on Blackwell
#2157
opened Sep 5, 2025 by
janekb04
5 of 13 tasks
[Common][PyTorch][Rework] PDL for Quantization
#2150
opened Sep 4, 2025 by
yaox12
1 of 13 tasks
[main][feature][under updating]adapt for offload activation
#2145
opened Sep 2, 2025 by
GeYuhong
1 of 13 tasks
[PyTorch] Add record_stream and untyped_storage func op in QuantizedTensor
#2144
opened Sep 2, 2025 by
xiaoxi-wangfj
1 of 13 tasks
[PyTorch Debug] Support precision debug tools for fp8 model parameters.
#2141
opened Sep 1, 2025 by
pggPL
8 of 13 tasks
ci: Build and attach bdist wheels to release page
#2138
opened Aug 29, 2025 by
ko3n1g
13 tasks
[PyTorch Debug] Add max_blockwise_dynamic_range stats
#2137
opened Aug 29, 2025 by
pggPL
8 of 13 tasks
Adds dst.dtype information in copy_ method of quantized tensors.
#2120
opened Aug 26, 2025 by
zobeideThePlayer
3 of 13 tasks
[PyTorch Debug] Fix issue with microbatching + debug value caching
#2108
opened Aug 25, 2025 by
pggPL
8 of 13 tasks