Pull requests: vllm-project/vllm
[Misc] Support MMMU accuracy benchmark
performance
Performance-related issues
#23034
opened Aug 16, 2025 by
tanruixiang
Draft
4 tasks
[Bugfix] fix some minor issues of marlin kernel
bug
Something isn't working
ready
ONLY add when PR is ready to merge/full CI is needed
#23032
opened Aug 16, 2025 by
jinzhen-lin
[Bugfix] fix qwen3 moe fp8 accuracy issue
qwen
Related to Qwen models
#23031
opened Aug 16, 2025 by
jinzhen-lin
[Refactor] Defer tensor data construction in MultiModalKwargs
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#23030
opened Aug 16, 2025 by
DarkLight1337
4 tasks
[Bugfix] fix IntermediateTensors equal method
#23027
opened Aug 16, 2025 by
andyxning
4 tasks
[Bugfix] Fix Dense module loading for sentence-transformers embedding models (simplified version)
ci/build
#23019
opened Aug 16, 2025 by
FFFfff1FFFfff
[Core] Use key-only cache for BaseMultiModalProcessor
documentation
Improvements or additions to documentation
llama
Related to Llama models
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
tpu
Related to Google TPUs
v1
#23018
opened Aug 16, 2025 by
DarkLight1337
Draft
3 of 4 tasks
Allows initialize TorchAOConfig object through quantization_config_file
#23014
opened Aug 15, 2025 by
jerryzh168
Draft
[FlashInfer] Truncate block tables for sliding window attention
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#23010
opened Aug 15, 2025 by
WoosukKwon
Use Blackwell FlashInfer MXFP4 MoE by default if available
#23008
opened Aug 15, 2025 by
mgoin
4 tasks
[UX] Separate marlin moe config logic from triton moe
ready
ONLY add when PR is ready to merge/full CI is needed
#23006
opened Aug 15, 2025 by
mgoin
4 tasks
[CI/Build] Replace lm-eval gsm8k tests with faster implementation
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#23002
opened Aug 15, 2025 by
mgoin
4 tasks
[Performance] Reapply Performance improvements in non-blockwise fp8 CUTLASS MoE
needs-rebase
performance
Performance-related issues
[Benchmarks] add benchmark for embedding models
performance
Performance-related issues
#23000
opened Aug 15, 2025 by
ZJY0516
4 tasks
Optimize MoE Token Dispatch for Tensor Parallel Configurations
#22993
opened Aug 15, 2025 by
skyloevil
[ROCm][Bugfix] Add missing max_qlen argument
rocm
Related to AMD ROCm
#22984
opened Aug 15, 2025 by
tuukkjs
3 of 4 tasks
[XPU] Delay BF16 check to worker init for spawn compatibility
v1
#22979
opened Aug 15, 2025 by
chaojun-zhang
[Frontend] Complete Redesign of Tool Calling
frontend
tool-calling
#22977
opened Aug 15, 2025 by
chaunceyjiang
Draft
4 tasks
[BugFix] pp cannot run successfully under NixlConnector
#22976
opened Aug 15, 2025 by
R2-Y
4 tasks