Harry Mellor
|
4f207c7174
|
Ignore large reformatting PRs in git blame (#26690)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-13 01:20:47 -07:00 |
|
CSWYF3634076
|
782505ed8e
|
[Model] Add reasoning_parser and tool_parser for Ernie45 thinking (#25027)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
|
2025-10-13 15:55:20 +08:00 |
|
Jee Jee Li
|
98f30b8cba
|
[Model] Fix Skywork R1V mlp (#26673)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-12 22:42:17 -07:00 |
|
yihong
|
3cd36660f7
|
docs: wrong command in structured_outputs README (#26677)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
|
2025-10-12 20:59:01 -07:00 |
|
yyzxw
|
46ad73955a
|
[FIX] Throwing an exception when the model does not support pool tasks (#25840) (#25855)
Signed-off-by: zxw <1020938856@qq.com>
Co-authored-by: wang.yuqi <noooop@126.com>
|
2025-10-12 20:56:21 -07:00 |
|
quanliu
|
41f3884438
|
[Bugfix][Core]Fix block table out-of-range issue in priority scheduling (#26661)
Signed-off-by: quanliu <18646313696@163.com>
|
2025-10-13 01:25:42 +00:00 |
|
bnellnm
|
60e419c1ee
|
[Misc] cache result of disable_inplace (#26666)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2025-10-13 00:17:50 +00:00 |
|
Michael Goin
|
7ef6052804
|
[CI/Build] Add tool to build vllm-tpu wheel (#19165)
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-12 16:25:40 -06:00 |
|
Huamin Li
|
4fca1a1bd2
|
[easy] fix pre commit error on trunk (#26665)
Signed-off-by: Huamin Li <3ericli@gmail.com>
|
2025-10-12 21:25:34 +00:00 |
|
Lukas Geiger
|
a6049be73c
|
[Models][Qwen3VL] Speedup fast_pos_embed_interpolate (#26647)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-10-13 01:20:07 +08:00 |
|
gjgjos
|
18ed7746ea
|
[Feature] Add support for naver/splade-v3 (BERT-based sparse embedding model) (#26339)
Signed-off-by: gjgjos <gjgjos@naver.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-10-12 17:00:52 +00:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Chendi.Xue
|
9bb38130cb
|
[Bugfix] Fix GPU_ID issue in test script (#26442)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2025-10-12 11:39:05 +00:00 |
|
Jaya Yuan
|
b91d8db873
|
[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP (#26574)
Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
|
2025-10-12 09:58:38 +00:00 |
|
Isotr0py
|
045b396d09
|
[Bugfix][CI/Build] Fix failing Mteb CI (#26638)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-12 02:42:42 -07:00 |
|
wang.yuqi
|
76852017ea
|
[MISC] Rename the torch profiler filename as instance_id+rank_id for merging the Profiler results of each Rank (#25867)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-10-12 09:29:08 +00:00 |
|
Vadim Gimpelson
|
82e64c7a20
|
[PERF] [Qwen3-next] Speed up gated RMSNorm (#26207)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
Signed-off-by: Vadim Gimpelson <156319763+vadiklyutiy@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-12 08:27:50 +00:00 |
|
wang.yuqi
|
4ca204055e
|
Add @noooop to codeowner for pooling models (#26652)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-10-12 14:04:44 +08:00 |
|
Haisheng Chen
|
c5c8f5ea59
|
[EPLB] Support ernie4.5-moe (#22100)
Signed-off-by: Haisheng Chen <langzs335@outlook.com>
Signed-off-by: Haisheng Chen <60504847+HsChen-sys@users.noreply.github.com>
Signed-off-by: Haisheng Chen <hac048@ucsd.edu>
Co-authored-by: Haisheng Chen <langzs335@outlook.com>
|
2025-10-12 10:40:47 +08:00 |
|
Angela Yi
|
01653a917b
|
[compile] Fix inductor partition config (#26645)
Signed-off-by: angelayi <yiangela7@gmail.com>
|
2025-10-11 21:03:14 +00:00 |
|
Huamin Li
|
0cd103e7cb
|
CP: make correct_attn_out robust to 4‑D views and fix Triton arg binding (#26509)
Signed-off-by: Huamin Li <3ericli@gmail.com>
|
2025-10-11 20:50:57 +00:00 |
|
Cyrus Leung
|
5be7ca1b99
|
[Benchmark] Support Infinity API (#26641)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-12 01:45:32 +08:00 |
|
Jee Jee Li
|
f0a30a067b
|
[Bugfix] Fix qwen-moe packed_modules_mapping (#26634)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-11 15:21:33 +00:00 |
|
JJJYmmm
|
9d6cff3ede
|
[Bugfix][Qwen3VL] fix deepstack in qwen3vl (#26626)
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
|
2025-10-11 05:58:33 -07:00 |
|
Angela Yi
|
a25f2adee9
|
[compile] Add patched_fused_scaled_matmul_reduce_scatter (#26604)
Signed-off-by: angelayi <yiangela7@gmail.com>
|
2025-10-11 05:44:43 -07:00 |
|
Chauncey
|
d0bed837ac
|
[Refactor]Reduce duplicate code in serving_chat (#26627)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-11 12:04:49 +00:00 |
|
muzian666
|
f7ee69868a
|
[CPU] fix the issue when the node is '-' cause json decode error. (#26562)
Signed-off-by: muzian666 <andylee_2001@163.com>
Co-authored-by: qingan.li <qingan.li@wizpresso.com>
|
2025-10-11 12:04:04 +00:00 |
|
Rahul Tuli
|
d2a71530c1
|
Add EAGLE-3 Speculative Decoding Support for Qwen3 MoE (#26485)
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
|
2025-10-11 10:14:41 +00:00 |
|
ihb2032
|
086609de64
|
fix(nix): Allow local oneDNN path to fix vLLM CPU build failure (#26401)
Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>
Signed-off-by: ihb2032 <1355790728@qq.com>
|
2025-10-11 09:12:16 +00:00 |
|
dsinghvi
|
727144bed1
|
[Refactor]: Use M-RoPE interface directly while defining model class instead of maintaining model specific M-RoPE implementation in mrope.py (#24172)
Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-10-11 07:21:04 +00:00 |
|
sangho.lee
|
55392bc879
|
[Bugfix][Multi Modal] Fix incorrect Molmo image processing (#26563)
Signed-off-by: sanghol <sanghol@allenai.org>
|
2025-10-10 22:28:23 -07:00 |
|
Roger Wang
|
ddaff2938e
|
[MM] Move Qwen3Omni MRoPE impl to model file (#26608)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-10 22:17:24 -07:00 |
|
liuzhenwei
|
27ed39a347
|
[XPU] Upgrade NIXL to remove CUDA dependency (#26570)
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
|
2025-10-11 05:15:23 +00:00 |
|
Nishidha Panpaliya
|
8f8474fbe3
|
[CI/Build] Fix ppc64le CPU build and tests (#22443)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
|
2025-10-11 13:04:42 +08:00 |
|
Chauncey
|
be067861c6
|
[Frontend] Improve the performance of is_reasoning_end (#25735)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-11 10:43:39 +08:00 |
|
Nick Hill
|
5bc26c438d
|
[BugFix] Make penalties and bad_words work with async scheduling (#26467)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-10 23:27:04 +00:00 |
|
Zhengxu Chen
|
eef921f45e
|
AOT Compilation for torch.compile (Bundled) (#24274)
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
|
2025-10-10 19:02:11 -04:00 |
|
Bram Wasti
|
e317414ce1
|
Cache the environment variable check for batch invariance (#26510)
Signed-off-by: Bram Wasti <bwasti@meta.com>
|
2025-10-10 22:47:34 +00:00 |
|
Nick Hill
|
949cb0170d
|
[BugFix] Fix async scheduling + request preemption (#26385)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-10 20:29:57 +00:00 |
|
Vadim Gimpelson
|
e94cfd51da
|
[BUG] Qwen3-next MTP. Fix attn metadata build bug (#26564)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
|
2025-10-10 14:59:03 -04:00 |
|
Harry Mellor
|
7c12763b24
|
Fix some typing issues found by mypy==1.18.2 (#26596)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-10 18:21:25 +00:00 |
|
Will Eaton
|
3b780a4bbb
|
Update CUDA architecture list in build pipeline for 12.9.1 wheels (#26592)
Signed-off-by: Will Eaton <wseaton@users.noreply.github.com>
|
2025-10-10 11:15:27 -07:00 |
|
Harry Mellor
|
30f78af147
|
Update pre-commit hook versions (#26591)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-10 17:03:44 +00:00 |
|
Xiong Wang
|
19a9b169bf
|
Add Qwen3-Omni moe thinker (#25550)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Xiong Wang <feizi.wx@alibaba-inc.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-10 17:00:56 +00:00 |
|
Roberto L. Castro
|
96ad65b7fe
|
[Transform] [Quantization] Add QuTLASS support to vLLM (#24440)
Signed-off-by: LopezCastroRoberto <roberto.lopez.castro@udc.es>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Signed-off-by: Andrei Panferov <andrei@panferov.org>
Co-authored-by: Andrei Panferov <andrei@panferov.org>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-10-10 09:43:40 -07:00 |
|
Shane A
|
8d2b8c0ff2
|
[Model] Add FlexOlmo model implementation (#24923)
Signed-off-by: Shane A <shanea@allenai.org>
|
2025-10-10 09:43:15 -07:00 |
|
Lukas Geiger
|
b2155ed317
|
[Model][Qwen3VL] Compute cu_seqlens on CPU to remove (#26496)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-10 09:42:17 -07:00 |
|
Chauncey
|
910abdbd08
|
[Bugfix] fixed top_logprobs: -1 does not appear to work as intended (#26470)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-11 00:41:17 +08:00 |
|
baonudesifeizhai
|
cddce79fda
|
[torch.compile] Make inductor partition rules respect splitting_ops #25691 (#25845)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-10 16:35:28 +00:00 |
|
Mark McLoughlin
|
e519281920
|
[Metrics] Add test for multi-modal cache stats logging (#26588)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-10-10 16:00:50 +00:00 |
|