zhang-prog
|
40b69e33e7
|
[Model] Add PaddleOCR-VL Model Support (#27758)
Signed-off-by: zhangyue <zhangyue66@baidu.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: zhangyue66 <zhangyue66@baidu.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-03 19:04:22 +08:00 |
|
Asaf Joseph Gardin
|
00b31a36a2
|
[V1] [Hybrid] Mamba1 Automatic Prefix Caching (#26377)
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
|
2025-11-02 04:16:23 -08:00 |
|
Cyrus Leung
|
853a8eb53b
|
[Bugfix] Fix Qwen Omni audio inference (#27920)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-02 05:06:05 +00:00 |
|
TJian
|
e2347dbf58
|
[Bugfix] [Model] Missing MRoPE function definition from KeyeForConditionalGeneration (#27895)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2025-11-01 13:45:23 +08:00 |
|
Cyrus Leung
|
879a06579e
|
[CI/Build] Bump transformers version (#27528)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-31 22:11:07 -07:00 |
|
Yan Ma
|
7e2729b57e
|
[Multimodal][XPU]Enable vision attn backend for xpu platform (#27525)
Signed-off-by: Yan Ma <yan.ma@intel.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Yejing Lai <yejing.lai@intel.com>
Co-authored-by: Guancheng Fu <110874468+gc-fu@users.noreply.github.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-11-01 04:45:02 +00:00 |
|
ZiTian Zhao
|
bc306fe5e9
|
fix incorrect type annotation in KimiMLP (#27885)
Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>
|
2025-10-31 17:38:02 +00:00 |
|
Isotr0py
|
7e06c40e63
|
[Bugfix] Fix broken MRoPE for GLM-4.1V/GLM-4.5V (#27860)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-31 17:04:51 +00:00 |
|
toncao
|
e5ef4dfc11
|
[Kimi-Linear] Correct prefixes and add compatibility to AWQ quants (#27834)
Signed-off-by: toncao <cpatonn@gmail.com>
Co-authored-by: toncao <cpatonn@gmail.com>
|
2025-10-31 17:36:37 +08:00 |
|
Tyler Michael Smith
|
ab98f6556f
|
[Bugfix] Fix 2 precommit issues - (mamba_block_size, kv_cache_config) (#27811)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-10-30 11:52:18 -07:00 |
|
Mengqing Cao
|
1004205795
|
[MTP] Refactor mtp predictor to avoid d2h operation (#27643)
Signed-off-by: MengqingCao <cmq0113@163.com>
|
2025-10-30 17:27:39 +00:00 |
|
Fan Yin
|
9956aae4ea
|
[Model][Ouro] Support Ouro Model (#27794)
Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: yinfan.1024 <yinfan.1024@bytedance.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-30 22:34:41 +08:00 |
|
Zhiyuan Li
|
4e68cc9b6a
|
[Model] Introduce Kimi Linear to vLLM (#27809)
Signed-off-by: lizhiyuan <lizhiyuan@moonshot.cn>
Signed-off-by: Zhiyuan Li <uniartisan2017@gmail.com>
|
2025-10-30 21:02:27 +08:00 |
|
wang.yuqi
|
4464723f22
|
[Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-10-30 12:13:05 +00:00 |
|
Zhewen Li
|
e806178d2a
|
[BugFix][VL] Fix FA selection on Qwen2.5-VL (#27790)
Signed-off-by: zhewenli <zhewenli@meta.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-30 07:54:44 +00:00 |
|
Chenheli Hua
|
48eb8eba58
|
[Temp fix] Disable torch.compile for Qwen2.5 VL's VisionBlock temporarily. (#27760)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-29 23:17:48 +00:00 |
|
JartX
|
7568a282b9
|
[FIXBUG] Qwen3VL hallucinations without Contiguous on Torch.SDPA (#27744)
Signed-off-by: JartX <sagformas@epdcenter.es>
Co-authored-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-10-29 16:55:35 +00:00 |
|
Roger Young
|
d6704dd099
|
Fix MiniMax-M2 rmsnorm precision and remove useless code (#27627)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
|
2025-10-29 21:01:05 +08:00 |
|
Jiangyun Zhu
|
8df98c2161
|
[perf] Enable concurrent execution of "shared_experts" and "selected_experts" in qwen3-next (#27578)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2025-10-29 08:12:54 +00:00 |
|
Lukas Geiger
|
0d8161b075
|
[Model] Fix Qwen3VL and Qwen3Omni after torch.compile changes (#27705)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-29 05:28:20 +00:00 |
|
Lucas Kabela
|
94666612a9
|
[Misc][qwen2_5_vl][torch.compile] Enable supports_torch_compile on generic nn.Module and demonstrate speedup on Qwen Vision model (#23207)
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Signed-off-by: Lucas Kabela <lucasakabela@gmail.com>
|
2025-10-28 22:36:43 +00:00 |
|
Asaf Joseph Gardin
|
05181cc57f
|
[Hybrid] Add mamba_block_size to Engine Args (#27289)
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
|
2025-10-28 12:54:24 +00:00 |
|
tingtinggithub
|
23ad820553
|
fixing mm placeholder replacement issue with gemma3 (#27538)
Signed-off-by: tingtingtang1992 <streamttt@gmail.com>
|
2025-10-27 14:34:01 +00:00 |
|
Yu Jiaqi
|
4f882be4a0
|
[Model] Siglip2 Model Support (#27566)
Signed-off-by: piood <2477084691@qq.com>
|
2025-10-27 06:57:37 -07:00 |
|
Asaf Joseph Gardin
|
9273754222
|
[Hybrid] Added supports_mamba_prefix_caching Protocol (#27339)
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
|
2025-10-27 13:05:20 +00:00 |
|
Jee Jee Li
|
2d631d28c6
|
[Doc] Slight improvement to M2 and beyond (#27554)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-27 09:02:10 +00:00 |
|
Cyrus Leung
|
cbd5e07a51
|
[Model] Use merge_by_field_config for MM models (Qwen series) (#27546)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-27 05:38:05 +00:00 |
|
CSWYF3634076
|
63b22e0dbb
|
[Model][Bugfix] fix ernie45 moe 300B SharedFusedMoE output tuple (#27316)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
|
2025-10-26 20:53:31 -07:00 |
|
Roger Young
|
5980604c44
|
Fix MiniMax-M2 copyright (#27537)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
|
2025-10-27 03:29:51 +00:00 |
|
Roger Young
|
720af6ab79
|
[Model][MiniMax-M2] Support MiniMax-M2 Model (#27535)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
|
2025-10-27 00:59:11 +08:00 |
|
Yeshwanth N
|
71b1c8b667
|
[Chore]:Extract math and argparse utilities to separate modules (#27188)
Signed-off-by: Yeshwanth Surya <yeshsurya@gmail.com>
Signed-off-by: Yeshwanth N <yeshsurya@gmail.com>
Signed-off-by: yeshsurya <yeshsurya@gmail.com>
|
2025-10-26 04:03:32 -07:00 |
|
JartX
|
65d2cf9511
|
[BUGFIX][ROCM] ViT FlashAttention on ROCm (no GFX9) and contiguous on qwen3vl ROCm TORCH_SDPA (#27190)
Signed-off-by: JartX <sagformas@epdcenter.es>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2025-10-26 15:08:52 +08:00 |
|
Cyrus Leung
|
66a168a197
|
[CI/Build] Refactor processing tests (#27470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-25 16:14:30 +00:00 |
|
Isotr0py
|
acc78aeb88
|
[Bugfix] Fix interns1-vit qk norm code path (#27480)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-24 17:43:45 +00:00 |
|
fhl2000
|
284cc92275
|
[MISC] cudagraph_capture_sizes related improvements (#26016)
Signed-off-by: fhl <2410591650@qq.com>
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-24 05:11:05 -07:00 |
|
Isotr0py
|
42efe609ba
|
[MM][Bugfix] Replace PatchEmbed's conv3d to linear layer (#27418)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-24 07:32:47 +00:00 |
|
Harry Mellor
|
1f9460c4c1
|
Fix pooling adapters for Transformers backend (#27338)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-23 20:23:55 -07:00 |
|
xiao-llm
|
70022ffc00
|
Granite 4.0 quark quantization support (#26944)
Signed-off-by: Xiao YU <Xiao.YU@xilinx.com>
Signed-off-by: Xiao Yu <xiao.yu.dc@outlook.com>
Co-authored-by: Xiao YU <Xiao.YU@xilinx.com>
|
2025-10-24 02:14:03 +00:00 |
|
Yu Jiaqi
|
0552cfb195
|
[Model] Siglip Embedding Support (#27324)
Signed-off-by: piood <2477084691@qq.com>
|
2025-10-23 20:19:48 +00:00 |
|
Jonathan Chen
|
ca76486a16
|
[Chore] Separate out vllm.utils.platform_utils.py (#27374)
Signed-off-by: Jonathan <chenleejonathan@gmail.com>
|
2025-10-23 19:08:06 +00:00 |
|
wang.yuqi
|
3fa2c12185
|
[Frontend][4/N] Improve all pooling task | Add plugin pooling task (#26973)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Christian Pinto <christian.pinto@ibm.com>
|
2025-10-23 14:46:18 +00:00 |
|
Cyrus Leung
|
fe2016de2d
|
[CI/Build] Remove unnecessary flags from test registry (#27353)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-23 14:42:40 +00:00 |
|
Bradley D
|
570c3e1cd4
|
[Bugfix] Honor --mm_encoder_attn_backend when used (#27124)
Co-authored-by: Bradley D <4551889+bradleyhd@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-23 20:09:52 +08:00 |
|
tomeras91
|
61089465a6
|
[Model] Add MoE support for NemotronH (#25863)
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
|
2025-10-23 10:27:23 +00:00 |
|
Isotr0py
|
2566dca2a9
|
[Bugfix] Fix deepseek-ocr multi-image inference and add merge_by_field_config=True with tensor schema support (#27361)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-22 17:15:38 -07:00 |
|
Luciano Martins
|
e05a6754a8
|
[Model] Revert PR #26715: Restore custom PaliGemma and Gemma3-MM impl… (#27309)
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
|
2025-10-22 10:05:34 -07:00 |
|
Isotr0py
|
db6f28d898
|
[Bugfix] Fix HF format InternVL large variants video processing (#27330)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-22 08:39:23 -07:00 |
|
Cyrus Leung
|
14e2f1231e
|
[Bugfix] Make get_mrope_input_positions instance methods (#27342)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-22 08:38:34 -07:00 |
|
Isotr0py
|
675aa2ec64
|
[Model] Upstream Deepseek-OCR model (#27247)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-22 07:59:15 -07:00 |
|
Lain
|
09a7e6f617
|
[Deepseek v3.2] Remove extra logics in indexer (#26465)
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
Signed-off-by: Lain <siyuanf@nvidia.com>
Co-authored-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
|
2025-10-21 23:34:03 +00:00 |
|