NVIDIA / TransformerEngine Public

Pull requests: NVIDIA/TransformerEngine

85 Open 1,567 Closed

Pull requests list

Honor COMPACT data_format for FP8 blockwise scales in MoE up-projection path to remove 5× redundant rowwise_scale_inv.T.contiguous() passes

#2199 opened Sep 24, 2025 by xiaoxi-wangfj

2 of 13 tasks

[JAX] Update JAX version requirement in pyproject.toml (label: 2.8.0)

#2197 opened Sep 23, 2025 by phu0ngng

5 of 13 tasks

[PyTorch] fix int32 overflow in permute kernels

#2196 opened Sep 23, 2025 by hxbai

1 of 13 tasks

[PyTorch] Add max_score support for MuonClip (label: 2.9.0)

#2195 opened Sep 22, 2025 by cyanguwa

8 of 13 tasks

[JAX] Clamped Swiglu Integration

#2194 opened Sep 22, 2025 by vthumbe1503 • Draft

13 tasks

FSDP grad fusion support

#2191 opened Sep 21, 2025 by sanandaraj5597

[Feature] Enable rope application with offsets for training (label: 2.8.0)

#2188 opened Sep 19, 2025 by sudhakarsingh27

1 of 13 tasks

[Feat] Draft: support offloading activation

#2187 opened Sep 18, 2025 by lhb8125

13 tasks

[Core][PyTorch] NVFP4 recipe (label: 2.8.0)

#2177 opened Sep 16, 2025 by ksivaman

3 of 13 tasks

Context Parallel integration tests with a transformer layer: BSHD and THD + CP (label: 2.9.0)

#2176 opened Sep 16, 2025 by jomitchellnv

7 of 13 tasks

[PyTorch Debug] Add nvdlfw-inspect to dependencies

#2173 opened Sep 15, 2025 by pggPL • Draft

7 tasks done

blockwise fp8 weight memory optimization: on-demand columnwise fp8 weight creation

#2168 opened Sep 10, 2025 by skydoorkai

7 of 13 tasks

[JAX] CollectiveGemm (label: 2.8.0)

#2166 opened Sep 9, 2025 by phu0ngng

8 of 13 tasks

Fix issue with RNG state shape

#2164 opened Sep 8, 2025 by epwalsh

5 of 13 tasks

[Pytorch] Support for Swiglu Activation used in GPT OSS

#2161 opened Sep 8, 2025 by vthumbe1503

8 of 13 tasks

Add support for the FP8 Block Scaling (ie. Deepseek) recipe on Blackwell

#2157 opened Sep 5, 2025 by janekb04

5 of 13 tasks

[Common][PyTorch][Rework] PDL for Quantization

#2150 opened Sep 4, 2025 by yaox12

1 of 13 tasks

[PyTorch] CPU Overhead Micro-optimizations

#2146 opened Sep 2, 2025 by zhongbozhu

13 tasks

[main][feature][under updating]adapt for offload activation

#2145 opened Sep 2, 2025 by GeYuhong

1 of 13 tasks

[PyTorch] Add record_stream and untyped_storage func op in QuantizedTensor

#2144 opened Sep 2, 2025 by xiaoxi-wangfj

1 of 13 tasks

[PyTorch Debug] Support precision debug tools for fp8 model parameters.

#2141 opened Sep 1, 2025 by pggPL

8 of 13 tasks

ci: Build and attach bdist wheels to release page

#2138 opened Aug 29, 2025 by ko3n1g

13 tasks

[PyTorch Debug] Add max_blockwise_dynamic_range stats

#2137 opened Aug 29, 2025 by pggPL

8 of 13 tasks

Adds dst.dtype information in copy_ method of quantized tensors.

#2120 opened Aug 26, 2025 by zobeideThePlayer

3 of 13 tasks

[PyTorch Debug] Fix issue with microbatching + debug value caching

#2108 opened Aug 25, 2025 by pggPL

8 of 13 tasks
