Pull requests: vllm-project/vllm
[grpc] Support gRPC server entrypoint (ci/build, frontend)
#30190 opened Dec 6, 2025 by CatherineSue
feat(metrics): Add prefill KV compute metric excluding cached tokens (v1)
#30189 opened Dec 6, 2025 by ziliangpeng
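Judging by the title alone, the metric above counts only the prompt tokens that are not served from the prefix cache. A minimal sketch of that accounting (the helper name and signature are assumptions for illustration, not the PR's code):

```python
# Sketch only: tokens already covered by the prefix cache do not contribute
# to prefill KV compute, so the metric is the non-cached remainder.
def prefill_kv_compute_tokens(num_prompt_tokens: int, num_cached_tokens: int) -> int:
    return max(num_prompt_tokens - num_cached_tokens, 0)
```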
Adds Jais 2 support (new-model)
#30188 opened Dec 6, 2025 by sarathc-cerebras
Fix #15483: Add error handling for model-dependent endpoints during … (frontend, v1)
#30186 opened Dec 6, 2025 by erdaltoprak
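The title above is truncated in the listing; as a generic illustration of error handling on a model-dependent endpoint (the endpoint path, status code, and `engine` placeholder below are assumptions, not the PR's actual change):

```python
# Hypothetical sketch: return a clear HTTP error instead of an unhandled
# exception when a model-dependent endpoint is hit before a model is available.
from fastapi import FastAPI, HTTPException

app = FastAPI()
engine = None  # stands in for an engine/model handle that may not exist yet

@app.get("/v1/models")
async def list_models():
    if engine is None:
        raise HTTPException(status_code=503, detail="Model is not loaded")
    return {"object": "list", "data": []}  # a real handler would list served models
```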
[Feature] Add offline FastAPI documentation support for air-gapped environments (frontend)
#30184 opened Dec 6, 2025 by rickychen-infinirc
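FastAPI's documented pattern for self-hosting the interactive docs is the likely mechanism for an air-gapped setup. A minimal sketch assuming locally bundled Swagger UI assets (the static paths are illustrative, not what the PR ships):

```python
# Sketch of FastAPI's standard self-hosted docs pattern: disable the
# CDN-backed default docs and serve Swagger UI from local static files.
from fastapi import FastAPI
from fastapi.openapi.docs import get_swagger_ui_html
from fastapi.staticfiles import StaticFiles

app = FastAPI(docs_url=None, redoc_url=None)
app.mount("/static", StaticFiles(directory="static"), name="static")

@app.get("/docs", include_in_schema=False)
async def offline_docs():
    # /docs now works without internet access, using the bundled assets.
    return get_swagger_ui_html(
        openapi_url=app.openapi_url,
        title=f"{app.title} - Swagger UI",
        swagger_js_url="/static/swagger-ui-bundle.js",
        swagger_css_url="/static/swagger-ui.css",
    )
```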
[WIP][Feat][Sched] Add Buffered_Response (v1)
#30183 opened Dec 6, 2025 by Pr0Wh1teGivee
[MISC]: change NIXL compatibility hash logging level to debug (kv-connector)
#30182 opened Dec 6, 2025 by AuruTus
[ROCm][MXFP4] Enable FP4 MLA BMM support (rocm, v1)
#30177 opened Dec 6, 2025 by dllehr-amd
[Misc][Core] Remove unused req_index increment in scheduler (v1, ready)
#30176 opened Dec 6, 2025 by ivanium
[Frontend] Add --uvicorn-access-log-exclude-paths option (frontend)
#30175 opened Dec 6, 2025 by GeoffreyWang1117
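Assuming the new flag works by filtering uvicorn's access logger (an inference from the flag name, not a description of the PR's implementation), the underlying technique looks roughly like this:

```python
# Sketch of path-based access-log filtering with uvicorn's standard logging;
# the excluded paths and filter class are hypothetical, not vLLM's code.
import logging

EXCLUDED_PATHS = {"/health", "/metrics"}  # example paths only

class ExcludePathsFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # uvicorn's access logger passes
        # (client_addr, method, path_with_query, http_version, status_code)
        # as record.args; drop records whose path is excluded.
        args = record.args
        if isinstance(args, tuple) and len(args) >= 3:
            return args[2] not in EXCLUDED_PATHS
        return True

logging.getLogger("uvicorn.access").addFilter(ExcludePathsFilter())
```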
[Bugfix] Improve DCP error message with backend hint (v1)
#30174 opened Dec 6, 2025 by GeoffreyWang1117
[BugFix] Fix assert batch_descriptor.num_tokens == num_tokens_padded (speculative-decoding, v1, nvidia, ready)
#30173 opened Dec 6, 2025 by LucasWilkinson
[Frontend] Remove confusing -O.xx flag error (ready)
#30169 opened Dec 6, 2025 by gmagogsfm
[Core][Hybrid allocator + connector] Support hybrid allocator + kv cache connector (kv-connector, tpu, v1)
#30166 opened Dec 6, 2025 by ivanium
Nvidia ModelOpt workaround for issue 28072 (nvidia, quantization)
#30164 opened Dec 6, 2025 by shengliangxu
[Deepseek] Fix OOM during DeepSeek R1 startup (deepseek, v1)
#30162 opened Dec 5, 2025 by MatthewBonanni
[CI] Update Test Dependencies (ci/build, ready-run-all-tests)
#30160 opened Dec 5, 2025 by junpuf
[Perf] Optimize group_topk kernel, 1.9% Throughput improvement, 2.1% TPOT improvement (ready)
#30159 opened Dec 5, 2025 by yewentao256
feat: add TxtSlicesDataset to allow sampling slices from txt file for benchmarking (performance)
#30156 opened Dec 5, 2025 by hypdeb
update torchao safetensors impl (ready)
#30155 opened Dec 5, 2025 by liangel-02
Integration for Ray LLM with load_format=runai_streamer
#30154 opened Dec 5, 2025 by jiangwu300
Bump nvshmem to 3.3.24 and fix CUDA 13 installation (nvidia)
#30149 opened Dec 5, 2025 by dmitry-tokarev-nv