Pull requests: vllm-project/vllm
Pull requests list
[v1] Introduce KVCacheBlocks as interface between Scheduler and KVCacheManager
v1
#17479
opened Apr 30, 2025 by
heheda12345
•
Review required
Bump Compressed Tensors version to 0.9.4
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#17478
opened Apr 30, 2025 by
rahul-tuli
•
Approved
[v1] Move block management logic from KVCacheManager to SpecializedManager
needs-rebase
v1
#17474
opened Apr 30, 2025 by
heheda12345
•
Review required
[Misc] refactor example - cpu_offload_lmcache
documentation
Improvements or additions to documentation
#17460
opened Apr 30, 2025 by
reidliu41
•
Review required
[CI/Build] Reorganize models tests
ci/build
multi-modality
Related to multi-modality (#4194)
#17459
opened Apr 30, 2025 by
DarkLight1337
•
Approved
Improve configs - ObservabilityConfig
needs-rebase
ready
ONLY add when PR is ready to merge/full CI is needed
#17453
opened Apr 30, 2025 by
hmellor
•
Review required
[Feature][Frontend]: Deprecate --enable-reasoning
documentation
Improvements or additions to documentation
frontend
structured-output
tool-calling
#17452
opened Apr 30, 2025 by
chaunceyjiang
•
Changes requested
Fix more broken speculative decode tests
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
[Bugfix] Fix TritonPlaceholder conflicts with torch.compile
v1
#17446
opened Apr 30, 2025 by
MengqingCao
•
Draft
[benchmark][structured output] Add offline benchmark script for structured output
structured-output
#17440
opened Apr 30, 2025 by
lk-chen
•
Review required
fix missing _num_cached_tokens in subtract_num_batched_tokens
#17436
opened Apr 30, 2025 by
initzhang
•
Review required
[Misc][AMD] Add query_platform method to interface.py
#17424
opened Apr 29, 2025 by
rasmith
•
Review required
[Feature][CLI] Unify configuration for structured outputs via --structured-output-config
documentation
Improvements or additions to documentation
needs-rebase
structured-output
tool-calling
v1
#17420
opened Apr 29, 2025 by
aarnphm
•
Changes requested
[DO NOT MERGE] Manual Fusion PR for Comparison
#17417
opened Apr 29, 2025 by
rasmith
•
Review required
Fix noisy warning for uncalibrated q_scale/p_scale
#17414
opened Apr 29, 2025 by
mgoin
•
Review required
[Bugfix] Temporarily disable gptq_bitblas on ROCm
documentation
Improvements or additions to documentation
#17411
opened Apr 29, 2025 by
nlzy
•
Review required
[Frontend] Fix tool_call handling in llama3.1 and llama3.2 chat template to allow zero tool_calls
documentation
Improvements or additions to documentation
tool-calling
#17409
opened Apr 29, 2025 by
CatherineSue
•
Review required