Pull requests: huggingface/trl

Pull requests list

Deprecate TextEnvironment and tools
#3389 opened Apr 29, 2025 by lewtun
5 tasks
[SFT] Handle formatting_func + completion_only_loss
#3385 opened Apr 29, 2025 by LeonEricsson
3 of 5 tasks
Introduce vLLM colocation
#3383 opened Apr 28, 2025 by toslali-ibm Draft
[IterativeSFT] Small refresher
#3378 opened Apr 28, 2025 by LeonEricsson
3 of 5 tasks
DPO fixes for evaluations
#3377 opened Apr 28, 2025 by winglian
5 tasks
📝 vLLM-integration documentation
#3376 opened Apr 28, 2025 by shirinyamani
5 tasks
Reintroduce generate method for PPOTrainer
#3374 opened Apr 27, 2025 by CloseChoice
4 tasks done
An Unified Example Format Checker
#3373 opened Apr 27, 2025 by innerNULL
1 of 5 tasks
add support for reward func using nn.Module in GRPOTrainer
#3372 opened Apr 27, 2025 by Tavish9
1 of 5 tasks
[Feat] Suppport SGLang as rollout engine of GRPO trainer
#3370 opened Apr 27, 2025 by ryang-max
2 of 8 tasks
Environments
#3367 opened Apr 26, 2025 by August-murr
PEFT support for Liger GRPO
#3355 opened Apr 24, 2025 by SalmanMohammadi
5 tasks
Support FSDP in GRPO trainer
#3354 opened Apr 24, 2025 by jglaser
1 of 4 tasks
Add support for FSDP2
#3317 opened Apr 17, 2025 by lewtun
1 of 5 tasks
[DPO] Model forward pass padding side fix
#3307 opened Apr 16, 2025 by LeonEricsson
2 of 5 tasks
add vllm support for token ids as input
#3280 opened Apr 11, 2025 by wybryan
Reward takes completion ids
#3272 opened Apr 9, 2025 by qgallouedec Draft
5 tasks
🦙 Llama 4
#3267 opened Apr 9, 2025 by qgallouedec Draft
5 tasks