ZiTian Zhao
|
ae88aada38
|
[Feature]Add EVS (Efficient Video Sampling) Support for Qwen3-VL (#29752)
Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>
Co-authored-by: deitxfge <huhaibo1990@126.com>
|
2025-12-14 05:24:56 -08:00 |
|
zifeitong
|
48b8456ff9
|
[Bugfix] Revert Qwen2-VL part of change in #28271 (#30542)
Signed-off-by: Zifei Tong <zifeitong@gmail.com>
|
2025-12-14 05:20:08 -08:00 |
|
Ilya Markov
|
3224ea9915
|
[torch.compile] Add encoder tag for compilation (#30489)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
|
2025-12-14 18:15:11 +08:00 |
|
Lasha Koroshinadze
|
3a20450d31
|
Add AudioFlamingo3 model support (#30539)
Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com>
Signed-off-by: Lasha Koroshinadze <26011196+lashahub@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-14 02:14:55 -08:00 |
|
Chen Zhang
|
ace34e3783
|
[Bugfix] Qwen3-next with --hf-overrides \{\"num_hidden_layers\":8\} (#30433)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-12-13 22:12:45 +08:00 |
|
Cyrus Leung
|
64251f48df
|
[Chore] Adjust tokenizer import to avoid circular imports (#30601)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-13 04:42:39 -08:00 |
|
Roberto L. Castro
|
4fa7ce46f3
|
[Feature] Add SM103 (Blackwell Ultra) Support to vLLM (#30484)
Signed-off-by: LopezCastroRoberto <robertol.c510@gmail.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-12-12 19:34:23 -08:00 |
|
Lucas Wilkinson
|
3e41992fec
|
[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-12-12 05:57:47 -08:00 |
|
Jaehwang Jung
|
f90319d5d1
|
[Bugfix] Schedule failure due to wrong get_image_size_with_most_features (#29692)
|
2025-12-12 02:27:20 -08:00 |
|
Michael Goin
|
9f2fc16a69
|
[Bugfix][Model] Fix Afmoe rope_parameters issue (#30505)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-12 02:53:57 +00:00 |
|
Nicolò Lucchesi
|
0efd9f867c
|
[Core] Whisper Enable Encoder Batching (#29421)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-11 21:06:51 +00:00 |
|
Harry Mellor
|
cf3eacfe58
|
Standardise get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim (#30389)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-11 20:45:23 +00:00 |
|
Harry Mellor
|
8781cd6b88
|
Add Eagle and Eagle3 support to Transformers modeling backend (#30340)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-11 17:02:10 +00:00 |
|
Harry Mellor
|
93db3256a4
|
Give pooling examples better names (#30488)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-11 16:22:58 +00:00 |
|
Cyrus Leung
|
3a3b06ee70
|
[Misc] Improve error message for is_multimodal (#30483)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-11 06:39:51 -08:00 |
|
Cyrus Leung
|
13d63b65e0
|
[Deprecation] Remove missed fallback for embed_input_ids (#30469)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-11 10:06:36 +00:00 |
|
Cyrus Leung
|
979f50efd0
|
[Deprecation] Remove fallbacks for embed_input_ids and embed_multimodal (#30458)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-11 06:58:23 +00:00 |
|
gh-wf
|
36c9ce2554
|
Ensure minimum frames for GLM 4.6V compatibility (#30285)
Signed-off-by: Wayne Ferguson <wayneferguson@gmail.com>
|
2025-12-11 05:26:49 +00:00 |
|
Anker
|
e8e8cd73e5
|
[Bugfix] Fix HunyuanOCR cross-image contamination in batch processing (#30344)
Signed-off-by: Lennart Brog <lennart.borg@list-ag.de>
Signed-off-by: Anker <20343812+anker-c2@users.noreply.github.com>
|
2025-12-10 18:09:31 +00:00 |
|
Roger Young
|
d017bceb08
|
[BugFix] Fix minimax m2 model rotary_dim (#30384)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
|
2025-12-10 04:58:50 -08:00 |
|
haoyangli-amd
|
06462392e4
|
[bugfix][quantization] fix quark qwen3 kv_cache quantization (#30308)
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
|
2025-12-10 03:24:12 +00:00 |
|
Tsukasa OI
|
73a484caa1
|
[Model][Quantization] Fix / Add GGUF support for Qwen2 MoE models (#30307)
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
|
2025-12-09 19:13:10 +00:00 |
|
wang.yuqi
|
9c32df6101
|
[Bugfix] Qwen 3 VL Embedding loading (#30303)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-09 08:04:02 +00:00 |
|
shaharmor98
|
fcd5306f65
|
Add latent MoE support (#30203)
Signed-off-by: Shahar Mor <smor@nvidia.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2025-12-08 17:35:01 +00:00 |
|
Daniel Cámpora
|
184076c3fe
|
[DeepSeek v3.2] Make top-k work for any logit values. (#27568)
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-08 06:55:58 -08:00 |
|
wang.yuqi
|
9e77ffca3f
|
[Model][7/N] Improve all pooling task | Deprecation as_reward_model. Extract hidden states prefer using new multi-vector retrieval API (#26686)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2025-12-08 08:10:09 +00:00 |
|
Dazhi Jiang
|
bcb6f5947f
|
[Perf] Remove sync point in vit torch sdpa attn backend (#30232)
Signed-off-by: Dazhi Jiang <dazhi_jiang@163.com>
|
2025-12-08 07:12:42 +00:00 |
|
Cyrus Leung
|
e83b7e379c
|
Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199)
|
2025-12-07 00:00:22 -08:00 |
|
Cyrus Leung
|
27f4c2fd46
|
[Renderer] Separate out RendererConfig from ModelConfig (#30145)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-06 23:15:42 -08:00 |
|
Cyrus Leung
|
671427efbf
|
[Model] Move multimodal_cpu_fields definition to field config (#30181)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-06 13:40:02 +00:00 |
|
Cyrus Leung
|
c46b932df2
|
[Chore] Deprecate SupportsMultiModal.merge_by_field_config (#30170)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-06 07:57:28 +00:00 |
|
Peter Salas
|
e858bc4d14
|
[Model] Add support for transformer-based Ultravox v0.7 projector (#30089)
Signed-off-by: Peter Salas <peter@fixie.ai>
|
2025-12-05 20:55:43 -08:00 |
|
Divakar Verma
|
962d703818
|
[Bugfix][llama4_eagle] Fix missing 'lm_head' attribute (#29926)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-12-05 19:57:26 +00:00 |
|
Matthew Bonanni
|
66e674cdd5
|
[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
|
2025-12-05 09:48:43 -08:00 |
|
amitz-nv
|
6038b1b04b
|
[Frontend][Model] Add 'float16' to possible mamba cache dtype values, override mamba SSM cache dtype value for NemotronH (#29978)
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
|
2025-12-05 00:34:33 -08:00 |
|
Harry Mellor
|
e10c84e06a
|
Access partial_rotary_factor from rope_parameters (#29966)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-04 18:42:49 +00:00 |
|
Tao Yun
|
6dcb07f676
|
support qwen3-vl handle requests with embeddings (#30037)
Signed-off-by: taoyun <1069423820@qq.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-12-04 17:34:06 +00:00 |
|
Cyrus Leung
|
b286a311c2
|
[Chore] Deprecate merge_by_field_config arg (#30035)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-04 17:21:24 +00:00 |
|
Harry Mellor
|
9998ea5b57
|
Delete HF version of Phi 4 MM (#30049)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-04 13:44:50 +00:00 |
|
wang.yuqi
|
74c4d80c6c
|
[Model][6/N] Improve all pooling task | Support chunked prefill with ALL pooling (#27145)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-04 13:44:15 +00:00 |
|
Cyrus Leung
|
68eb5c8d97
|
[Misc] Move functions into PoolingMetadata (#30027)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-04 08:21:19 +00:00 |
|
TJian
|
3f1b03739a
|
[ROCm] [Bugfix] compute_attn_mask_seqlen for qwen3 omni (#29974)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2025-12-04 08:20:24 +00:00 |
|
Cyrus Leung
|
9ae2f60374
|
[Misc] Various cleanups for MM input processing (#29970)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-04 06:22:20 +00:00 |
|
Isotr0py
|
a21cd9ed23
|
[Bugfix] Fix incorrect image_grid_thw rank for HunyuanOCR from missing merge_by_field_config=True (#29950)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-03 10:05:10 +00:00 |
|
Julien Denize
|
5e5646e206
|
[BUGFIX] llama_4_scaling wrongly passed to DeepseekAttention (#29908)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
|
2025-12-02 14:51:20 -08:00 |
|
Harry Mellor
|
6fc5841db1
|
Fix some more Transformers nightly tests (#29872)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 21:49:44 +00:00 |
|
Navanit Dubey
|
a2b053dc85
|
feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE (#29896)
Signed-off-by: navanit-git <navanitdubey@gmail.com>
|
2025-12-02 19:28:35 +00:00 |
|
Isotr0py
|
0ec8422171
|
[Bugfix] Fix incorrect channel order for idefics3 in edge case (#29881)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-02 16:03:52 +00:00 |
|
Matthew Bonanni
|
51c57b51dd
|
[Bugfix] Fix DeepSeek R1 MTP weight loading (#29545)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-12-02 15:52:18 +00:00 |
|
Cyrus Leung
|
68ffbca7e4
|
[Chore] Use tokenizer.encode and tokenizer.decode directly (#29851)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 12:30:40 +00:00 |
|