Xiong Wang
|
19a9b169bf
|
Add Qwen3-Omni moe thinker (#25550)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Xiong Wang <feizi.wx@alibaba-inc.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-10 17:00:56 +00:00 |
|
Roberto L. Castro
|
96ad65b7fe
|
[Transform] [Quantization] Add QuTLASS support to vLLM (#24440)
Signed-off-by: LopezCastroRoberto <roberto.lopez.castro@udc.es>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Signed-off-by: Andrei Panferov <andrei@panferov.org>
Co-authored-by: Andrei Panferov <andrei@panferov.org>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-10-10 09:43:40 -07:00 |
|
Shane A
|
8d2b8c0ff2
|
[Model] Add FlexOlmo model implementation (#24923)
Signed-off-by: Shane A <shanea@allenai.org>
|
2025-10-10 09:43:15 -07:00 |
|
Chauncey
|
910abdbd08
|
[Bugfix] fixed top_logprobs: -1 does not appear to work as intended (#26470)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-11 00:41:17 +08:00 |
|
baonudesifeizhai
|
cddce79fda
|
[torch.compile] Make inductor partition rules respect splitting_ops #25691 (#25845)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-10 16:35:28 +00:00 |
|
Mark McLoughlin
|
e519281920
|
[Metrics] Add test for multi-modal cache stats logging (#26588)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-10-10 16:00:50 +00:00 |
|
Elvir Crnčević
|
7b03584de8
|
Silu v2 (#25074)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: elvircrn <elvircrn@gmail.com>
Signed-off-by: Elvir Crnčević <elvircrn@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Varun Sundar Rabindranath <varunsundar08@gmail.com>
|
2025-10-10 15:19:53 +00:00 |
|
Daniel Cámpora
|
0e67102d93
|
Added test_top_k_per_row to test-pipeline.yaml. (#26569)
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
|
2025-10-10 10:48:33 -04:00 |
|
Chauncey
|
1e6848a65d
|
[CI] fix test_run_batch.py::test_completions - AssertionError (#26578)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-10 22:16:28 +08:00 |
|
Andy Lo
|
67661375fa
|
[BugFix] Fix noop elimination edge case (#26394)
Signed-off-by: Andy Lo <andy@mistral.ai>
|
2025-10-10 13:33:04 +00:00 |
|
Mark McLoughlin
|
784c231151
|
[NIXL] Ignore abort on already-finished request (#25067)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-10-10 12:21:56 +02:00 |
|
Chen Zhang
|
606b00e80f
|
[bugfix][DCP] fix block_size of hash in DCP prefix caching (#26296)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-10-10 03:02:49 -07:00 |
|
Chauncey
|
720d3cd0f0
|
[CI] fix ruff format (#26579)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-10 03:02:12 -07:00 |
|
Ashwin Phadke
|
ab196edefb
|
Remove LoRA bias support (#25807)
Signed-off-by: Ashwin Phadke <ashwinphadke12@rediffmail.com>
Signed-off-by: Ashwin Phadke <23502062+ashwin-phadke@users.noreply.github.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-10 09:50:33 +00:00 |
|
Luis Tomas Bolivar
|
3ee202ea1e
|
[GPT-OSS] Add support for arrays at tool message content (#25593)
Signed-off-by: Luis Tomas Bolivar <ltomasbo@redhat.com>
|
2025-10-10 09:00:45 +00:00 |
|
Cyrus Leung
|
ad430a67ca
|
[Metrics] Log multi-modal cache stats and fix reset (#26285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-10 01:45:55 -07:00 |
|
Boyuan Feng
|
b545a0b207
|
fix test_simple_inductor_graph_partition (#26522)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
|
2025-10-10 06:39:19 +00:00 |
|
Ben Browning
|
da4455609d
|
[Chore]: One pythonic tool parser test uses the wrong parser (#26515)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
|
2025-10-10 04:03:55 +00:00 |
|
Julien Denize
|
c6187f55f7
|
Refactor MistralTokenizer (#26358)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-10-09 22:48:58 +00:00 |
|
elvischenv
|
44f633dba1
|
[Flashinfer][gpt-oss] Support FP8-qkv Flashinfer TRTLLM Sinks Attention (#25674)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2025-10-09 16:13:39 -04:00 |
|
Jiangyun Zhu
|
5728da11ea
|
Revert #26113 "[Frontend] CompilationConfig overhaul (#20283): deprecate use_inductor in favor of backend, simplify custom_ops" (#26472)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2025-10-09 05:43:55 -07:00 |
|
Wenzheng Bi
|
ec10fd0abc
|
[Bugfix] Move current_platform import to avoid python import cache. (#16601)
Signed-off-by: iwzbi <wzbi@zju.edu.cn>
|
2025-10-09 10:46:19 +00:00 |
|
Cyrus Leung
|
4bdf7ac593
|
[Bugfix] Fix SHM cache initialization (#26427)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 02:48:04 -07:00 |
|
Cyrus Leung
|
dc7976dd9f
|
[Misc] Upgrade more code to Python 3.10 (#26463)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 10:43:53 +01:00 |
|
Jerry Zhang
|
a83ff278d6
|
[torchao] Add support for ModuleFqnToConfig using regex (#26001)
Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>
|
2025-10-09 08:32:32 +00:00 |
|
Rahul Tuli
|
cf4cd6c24f
|
Add: Support for multiple hidden layers in Eagle3 (#26164)
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
|
2025-10-09 07:30:50 +00:00 |
|
elvischenv
|
5e49c3e777
|
Bump Flashinfer to v0.4.0 (#26326)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2025-10-08 23:58:44 -07:00 |
|
Cyrus Leung
|
0f29dca988
|
[CI/Build] Fix model nightly tests (#26466)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-08 23:44:16 -07:00 |
|
Zhiyuan Li
|
d24cf322e1
|
[Hybrid]: Decouple Kernel Block Size from KV Page Size (#24486)
Signed-off-by: lizhiyuan <uniartisan2017@gmail.com>
Signed-off-by: Zhiyuan Li <uniartisan2017@gmail.com>
|
2025-10-08 23:43:39 -07:00 |
|
Qier Li
|
d17f0fbf30
|
[Core][KVConnector] Propagate all tokens on resumed preemptions (#24926)
Signed-off-by: Qier Li <kevin44036@gmail.com>
Co-authored-by: Qier Li <qier@fb.com>
|
2025-10-09 14:43:31 +08:00 |
|
bnellnm
|
da364615fc
|
[Kernels] Modular kernel refactor (#24812)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2025-10-08 17:51:52 -04:00 |
|
Elaine Zhao
|
f08919b7d1
|
[Bugfix] Respect min_tokens in scheduler stop check (#26317)
Signed-off-by: Elaine Zhao <elaineyz@amazon.com>
|
2025-10-08 14:08:24 -07:00 |
|
Matthew Bonanni
|
76879cc160
|
[Attention] Implement universal BACKEND_MAP (#25900)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-10-08 12:00:25 -07:00 |
|
Wentao Ye
|
4ba8875749
|
[Bug] Fix Test in Batch Invariant (#26128)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-08 10:13:47 -07:00 |
|
Wentao Ye
|
9fb3ae4e6f
|
[Bug] Fix DeepGEMM Attention Test (#26423)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-08 12:23:41 -04:00 |
|
Harry Mellor
|
2f99f2f506
|
Tidy vllm/config/__init__.py to only add classes and functions (#26405)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-08 07:10:00 -07:00 |
|
wang.yuqi
|
e39dc46f8f
|
[CI] Pooling models mteb test disable enforce_eager (#26408)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-10-08 12:15:36 +00:00 |
|
liangel-02
|
b32260ab85
|
[torchao] safetensors integration (#25969)
Signed-off-by: Angel Li <liangel@meta.com>
|
2025-10-07 20:12:35 -06:00 |
|
Lucas Wilkinson
|
f80e7866c0
|
[Misc] Clean up cruft from previous FlashMLA sparse implementation (#26125)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-10-08 10:09:34 +08:00 |
|
Thomas Parnell
|
31a4b3e6c4
|
Revert #24446 and #26168 (#26332)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-10-07 16:38:19 -06:00 |
|
Sergei Skvortsov
|
6ebaf43ee4
|
[V1] Logit processors for rejection sampler (#19482)
Signed-off-by: southfreebird <yvorott@gmail.com>
Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Signed-off-by: Sergei Skvortsov <yvorott@gmail.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-10-07 13:02:49 -07:00 |
|
Morrison Turnansky
|
0c824fc46f
|
[Frontend] CompilationConfig overhaul (#20283): deprecate use_inductor in favor of backend, simplify custom_ops (#26113)
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com>
|
2025-10-07 12:53:43 -07:00 |
|
Michael Goin
|
30a3e5af69
|
[CI] Add Qwen3 MoE NVFP4 to Blackwell lm-eval (#26316)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-07 10:36:15 -07:00 |
|
fxmarty-amd
|
a38c1bfe09
|
[ci] Rename test_mxfp4_moe.py to test_ocp_mx_moe.py (#26364)
Signed-off-by: Felix Marty <Felix.Marty@amd.com>
|
2025-10-07 09:52:24 -07:00 |
|
Paul Pak
|
320feae6f5
|
[Model] Lfm2Moe (#26344)
Signed-off-by: Paul Pak <paulpak58@gmail.com>
|
2025-10-07 16:03:05 +00:00 |
|
Cyrus Leung
|
1e4ecca1d0
|
[V0 Deprecation] Remove VLLM_USE_V1 from tests (#26341)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-07 15:42:31 +00:00 |
|
Cyrus Leung
|
c0a7b89d8e
|
[Misc] Move LRUCache into its own file (#26342)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-07 15:08:40 +00:00 |
|
antrec
|
6f59beaf0b
|
[Model] Add support for ModernBertForTokenClassification (#26340)
Signed-off-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Signed-off-by: antrec <antoine.recanati@gmail.com>
Co-authored-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-07 14:29:19 +00:00 |
|
fxmarty-amd
|
41f1cf38f2
|
[Feature][OCP MX] Support mxfp6 and mixed mxfp6-mxfp4 (#21166)
|
2025-10-07 09:35:26 -04:00 |
|
Daniel Cámpora
|
e1098ced95
|
Add topk logits torch op for DS3.2. (#25945)
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
Signed-off-by: Daniel Cámpora <961215+dcampora@users.noreply.github.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-10-07 10:07:32 +00:00 |
|