Jiangyun Zhu
|
29c9cb8007
|
[CI] Add tests for cudagraph (#27391)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2025-10-25 02:37:33 +00:00 |
|
kourosh hakhamaneshi
|
7e1d697b56
|
[Bugfix] Fix MultiConnector stats reconstruction across process boundaries (#27366)
Signed-off-by: Kourosh Hakhamaneshi <Kourosh@anyscale.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
|
2025-10-24 17:08:05 +00:00 |
|
Jonathan Chen
|
ca76486a16
|
[Chore] Separate out vllm.utils.platform_utils.py (#27374)
Signed-off-by: Jonathan <chenleejonathan@gmail.com>
|
2025-10-23 19:08:06 +00:00 |
|
Tova Movshovitz
|
88afa11010
|
[Metrics] [KVConnector] Add connector prefix cache hit rate stats (#26245)
Signed-off-by: tovam <tovam@pliops.com>
|
2025-10-23 12:21:08 +02:00 |
|
Zhewen Li
|
50b788a17a
|
[CI/Build] Fix AMD CI: test_cpu_gpu.py (#27388)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-10-23 07:55:00 +00:00 |
|
Giancarlo Delfin
|
6644796bf4
|
[V1][spec decode] return logprobs for spec decoding (#26060)
Signed-off-by: Giancarlo Delfin <gdelfin@meta.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-10-22 22:59:59 -07:00 |
|
Andrew Sansom
|
ff93cc8c84
|
[CORE] Support Prefix Caching with Prompt Embeds (#27219)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
|
2025-10-22 22:18:07 -07:00 |
|
dongbo910220
|
a0003b56b0
|
[Chore] Separate out system utilities from vllm.utils (#27201)
Signed-off-by: dongbo910220 <1275604947@qq.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-10-22 20:25:25 +00:00 |
|
Sage
|
1651003c35
|
[Prefix Cache] Use LoRA name for consistent KV-cache block hashing (#27211)
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
|
2025-10-22 18:13:03 +00:00 |
|
Nicolò Lucchesi
|
4dfdb821c8
|
[P/D] Dynamic kv_output_aggregator collect size (#26734)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-10-22 18:07:58 +02:00 |
|
Russell Bryant
|
58fab50d82
|
[Frontend] Require flag for loading text and image embeds (#27204)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-22 15:52:02 +00:00 |
|
Mark McLoughlin
|
4ca13a8667
|
[NIXL] Terminate handshake listener thread in shutdown (#26404)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-10-22 16:59:53 +02:00 |
|
Nicolò Lucchesi
|
bfa59be8f1
|
[CI] Nixl integration tests DP-EP (#27199)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-10-22 11:17:48 +08:00 |
|
Tyler Michael Smith
|
6c2eef5a5d
|
[P/D] KVConnector for decode benchmarking (#25986)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2025-10-21 16:30:47 -07:00 |
|
ExtReMLapin
|
4a8a567e16
|
Updated xgrammar backend to not deny supported string formats (#27253)
Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com>
Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-21 22:25:23 +00:00 |
|
Huy Do
|
becb7de40b
|
Update PyTorch to 2.9.0+cu129 (#24994)
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-21 17:20:18 -04:00 |
|
Nick Hill
|
647214f3d5
|
[V0 Deprecation] Remove V0 executors (#27142)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-21 11:09:37 -07:00 |
|
Nicolò Lucchesi
|
72f431e709
|
[Nixl] Minor refactor to handshake related metadata (#26410)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-10-21 09:07:47 +02:00 |
|
Nicolò Lucchesi
|
f9e7ad5400
|
[Bugfix][CI] Fix Distributed Tests (4 GPUs) async_sched+ray test (#27195)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-10-20 16:34:54 +00:00 |
|
dongbo910220
|
8a297115e2
|
[Chore] Separate out hashing utilities from vllm.utils (#27151)
Signed-off-by: dongbo910220 <1275604947@qq.com>
|
2025-10-19 11:09:38 +08:00 |
|
Tova Movshovitz
|
83e760c57d
|
[V1][Metrics][Plugin] Add plugin support for custom StatLoggerBase implementations (#22456)
Signed-off-by: tovam <tovam@pliops.com>
|
2025-10-18 15:12:46 -07:00 |
|
Isotr0py
|
6ac5e06f7c
|
[Chore] Clean up pytorch helper functions in vllm.utils (#26908)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: isotr0py <2037008807@qq.com>
|
2025-10-18 09:48:22 -07:00 |
|
Wentao Ye
|
245e4f2c01
|
[Feature] Batch Invariant: Support DeepGEMM and Blackwell (#27127)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-18 09:28:05 -04:00 |
|
iAmir97
|
1d165d6d85
|
[Chore] Separate out vllm.utils.mem_utils (#27143)
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-18 10:06:59 +00:00 |
|
Hanchenli
|
7c572544e4
|
[GPT-OSS] Structure_Tag support for gpt-oss tool-call in cot (#25515)
Signed-off-by: Hanchenli <lihanc2002@gmail.com>
Signed-off-by: Hanchenli <61769611+Hanchenli@users.noreply.github.com>
Signed-off-by: Wei Wei <wwei6@meta.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Wei Wei <wwei6@meta.com>
Co-authored-by: Wei Wei <weiweinpu@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-10-17 21:55:54 -07:00 |
|
Huamin Li
|
c312320764
|
[CI/Build] tests(v1): feed Triton attention the (num_blocks, 2, …) KV cache layout in backend-correctness tests (#26663)
Signed-off-by: Huamin Li <3ericli@gmail.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-10-17 21:11:26 -07:00 |
|
Wentao Ye
|
f50cc221ea
|
[Test] Make test_failure more stable for batch invariance (#27054)
|
2025-10-17 16:59:08 -04:00 |
|
Nicolò Lucchesi
|
2ba60ec7fe
|
[CI] Nixl integration tests (#27010)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-10-17 07:13:31 -07:00 |
|
Jee Jee Li
|
fec2b341ad
|
[Kernel] Lazy import FlashInfer (#26977)
|
2025-10-17 04:48:18 +00:00 |
|
Cyrus Leung
|
4d4d6bad19
|
[Chore] Separate out vllm.utils.importlib (#27022)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-17 00:48:59 +00:00 |
|
Bram Wasti
|
b2f78cbad4
|
[small][batch invariance] Rename the env and internal flags to simplify usage (#26855)
Signed-off-by: Bram Wasti <bwasti@meta.com>
|
2025-10-16 21:40:25 +00:00 |
|
Wentao Ye
|
2ed8b6b3d0
|
[Bug] Fix batch invariant test `has` to `is` (#27032)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-16 19:45:14 +00:00 |
|
Tahsin Tunan
|
43721bc67f
|
[CI] Replace large models with tiny alternatives in tests (#24057)
Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-16 15:51:27 +01:00 |
|
Bram Wasti
|
7d8975de84
|
Deepseek-v3 Batch Invariant on 8xH100 (#26609)
Signed-off-by: Bram Wasti <bwasti@meta.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2025-10-15 22:06:02 -07:00 |
|
Cyrus Leung
|
f93e348010
|
[Misc] Remove isort and yapf ignores (#26888)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-15 12:09:03 +00:00 |
|
Morrison Turnansky
|
96b9aa5aa0
|
[Frontend][torch.compile] CompilationConfig Overhaul (#20283): name change compilation level to compilation mode, deprecation compilation level (#26355)
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-15 02:51:16 +00:00 |
|
Nick Hill
|
4aed506b65
|
[Core] Streamline some structured output related code (#26737)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-14 23:27:44 +00:00 |
|
Matthew Bonanni
|
82af928c41
|
[Attention][Spec Decode] FlashMLA spec decode support (#26541)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-10-14 19:38:20 +00:00 |
|
Qier Li
|
720394de43
|
[KVConnector][Metrics] Aggregate scheduler-side KVConnectorStats (#26046)
Signed-off-by: Qier Li <kevin44036@gmail.com>
|
2025-10-14 14:38:07 +00:00 |
|
Chendi.Xue
|
7e6edb1469
|
[NIXL][HeteroTP] Enable KV transfer from HND prefill to NHD decode (#26556)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2025-10-14 09:46:05 +00:00 |
|
Lucia Fang
|
8317f72354
|
[Misc][DP] support customized aggregated logger for dp (#24354)
Signed-off-by: Lu Fang <fanglu@fb.com>
|
2025-10-13 17:45:59 -07:00 |
|
Maximilien de Bayser
|
d8bebb008a
|
Add tests for chunked prefill and prefix cache with causal pooling models (#26526)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Ayush Singh <ayush1009208@gmail.com>
|
2025-10-14 07:45:04 +08:00 |
|
Will Eaton
|
53c9a7cee2
|
[P/D] [NixlConnector] kv load recovery integration (#26171)
Signed-off-by: Will Eaton <weaton@redhat.com>
|
2025-10-13 08:48:04 -07:00 |
|
Bram Wasti
|
3263799056
|
[unrevert] Add batch invariant kernel override for FlashInfer backend [2/n] (#26373)
Signed-off-by: Bram Wasti <bwasti@meta.com>
Signed-off-by: Bram Wasti <bwasti@fb.com>
|
2025-10-13 10:24:53 -04:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Chendi.Xue
|
9bb38130cb
|
[Bugfix] Fix GPU_ID issue in test script (#26442)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2025-10-12 11:39:05 +00:00 |
|
Nick Hill
|
5bc26c438d
|
[BugFix] Make penalties and bad_words work with async scheduling (#26467)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-10 23:27:04 +00:00 |
|
Nick Hill
|
949cb0170d
|
[BugFix] Fix async scheduling + request preemption (#26385)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-10 20:29:57 +00:00 |
|
Harry Mellor
|
7c12763b24
|
Fix some typing issues found by mypy==1.18.2 (#26596)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-10 18:21:25 +00:00 |
|
Mark McLoughlin
|
784c231151
|
[NIXL] Ignore abort on already-finished request (#25067)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-10-10 12:21:56 +02:00 |
|