Harry Mellor
14385c80fc
Fix weight mapping test for Transformers v5 ( #33162 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2026-01-27 12:30:14 +00:00
Roger Wang
b539f988e1
[Models] Kimi-K2.5 ( #33131 )
...
Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn >
Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com >
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Signed-off-by: youkaichao <youkaichao@gmail.com >
Signed-off-by: Roger Wang <hey@rogerw.io >
Co-authored-by: wanglinian <wanglinian@stu.pku.edu.cn >
Co-authored-by: wangln19 <96399074+wangln19@users.noreply.github.com >
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Co-authored-by: Nick Hill <nickhill123@gmail.com >
Co-authored-by: youkaichao <youkaichao@gmail.com >
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-27 14:50:31 +08:00
Cyrus Leung
c25dbee40d
[Model] Bump transformers version for test registry ( #33100 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-26 18:53:22 +00:00
Nicolò Lucchesi
19ab0f7ce5
[Bugfix] Fix Voxtral streaming slot_mapping ( #33073 )
...
Signed-off-by: NickLucche <nlucches@redhat.com >
2026-01-26 10:40:40 -08:00
Andy Lo
d56afd45fd
Remove unused logic in models/mistral.py ( #33095 )
...
Signed-off-by: Andy Lo <andy@mistral.ai >
2026-01-26 09:01:52 -08:00
Pleaplusone
be6931ee27
[ROCm][Bugfix] Fix ptpc scale load issue for fused shared expert path in deepseek mtp ( #33018 )
...
Signed-off-by: ganyi <ygan@amd.com >
2026-01-26 23:19:04 +08:00
Yuxuan Zhang
bb17e8f11c
[GLM-OCR] GLM-OCR with MTP Support ( #33005 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com >
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2026-01-26 06:24:43 -08:00
Cyrus Leung
dcd80206b7
[Chore] Update type annotation of input_ids in model forward ( #33063 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-26 06:02:10 -08:00
VihaanThat
208c56256f
[Feature] Add LoRA support for Gemma3 vision components ( #32764 )
2026-01-26 13:56:40 +00:00
Itay Etelis
6ca2c91b96
[Model] Use mm_position to compute mrope positions for Qwen3-Omni ( #33010 )
...
Signed-off-by: Itay Etelis <itay.etelis@ibm.com >
Co-authored-by: Itay Etelis <itay.etelis@ibm.com >
2026-01-26 13:48:07 +00:00
Cyrus Leung
11b556878b
[Refactor] Use data parser for matching data items to multi-modal UUIDs ( #32955 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-26 15:00:28 +08:00
ltd0924
105d104576
[StepVL] support close img patch ( #32923 )
...
Signed-off-by: luotingdan <luotingdan@stepfun.com >
Signed-off-by: ltd0924 <32387785+ltd0924@users.noreply.github.com >
Co-authored-by: luotingdan <luotingdan@stepfun.com >
2026-01-25 20:56:39 -08:00
Lucas Wilkinson
566cdb6cfb
[CI] Fix MHA attention test failure (AttributeError when model_config is None in ViT attention backend) ( #33033 )
...
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com >
2026-01-25 19:49:53 -08:00
Andreas Karatzas
22aeb43007
[Bugfix][VLM] Fix transformers backend embed_multimodal for Qwen2.5-VL profiling ( #32969 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-01-26 08:34:05 +08:00
Itay Etelis
a698e8e7ad
[Model] Use mm_position to compute mrope positions for Qwen2.5-Omni ( #32772 )
...
Signed-off-by: Itay Etelis <itay.etelis@ibm.com >
Co-authored-by: Itay Etelis <itay.etelis@ibm.com >
2026-01-25 20:15:53 +08:00
JJJYmmm
7e67df5570
[Bugfix] fix encoder cache hang in Qwen3VL ( #32684 )
...
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com >
Signed-off-by: Roger Wang <hey@rogerw.io >
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Co-authored-by: Roger Wang <hey@rogerw.io >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2026-01-25 05:17:31 +00:00
david guan
bc0d291bfe
feat: Complete LoRA support for MiniMaxM2 Fixes #32736 ( #32763 )
...
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com >
2026-01-24 20:48:46 +08:00
Isotr0py
9ad7f89f55
[Models]: Make Multimodal config implicit in ViT implementation ( #31972 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2026-01-24 20:34:26 +08:00
Isotr0py
8edaf38570
[Models] Add SharedFusedMoE support to Qwen3MoE ( #32082 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2026-01-23 23:36:31 -08:00
Wentao Ye
37c9859fab
[Refactor] Clean up unused variables & func ( #32692 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com >
2026-01-23 17:04:25 -05:00
Harry Huang
5206e5e28c
[V1][Hybrid] Mamba Prefix Caching with align mode ( #30877 )
...
Signed-off-by: huanghaoyan.hhy <huanghaoyan.hhy@alibaba-inc.com >
Signed-off-by: Chen Zhang <zhangch99@outlook.com >
Co-authored-by: Chen Zhang <zhangch99@outlook.com >
2026-01-23 09:56:48 -08:00
Matteo Fari
fec9da0af4
[Model] Enable LoRA support for internvl2 ( #32397 )
...
Signed-off-by: Matteo Fari <matteofari06@gmail.com >
2026-01-24 01:39:01 +08:00
baonudesifeizhai
1fb648bf10
[Bugfix] Fix FP8 MoE EP Weight Loading for ModelOpt Llama4 ( #32886 )
...
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com >
2026-01-23 10:31:48 -05:00
Raushan Turganbay
d95d650762
[Bugfix] Fix getting vision features in Transformer Multimodal backend ( #32933 )
...
Signed-off-by: raushan <raushan@huggingface.co >
2026-01-23 13:34:48 +00:00
tianshu-Michael-yu
13d8746c54
[Feature]: Remove DtoH Copy for lfm2_vl On Default Stream ( #32815 )
...
Signed-off-by: Tianshu Yu <tianshuyu.formal@gmail.com >
2026-01-23 13:20:30 +00:00
Patrick von Platen
3f3f89529d
[Voxtral] Add new streaming arch ( #32861 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-23 12:41:52 +01:00
Andreas Karatzas
a8eb1182f1
[CI][Models] Add VLM Support for Sequence Classification Conversion ( #32885 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-01-23 16:22:51 +08:00
Eldar Kurtić
44f08af3a7
Add llmcompressor fp8 kv-cache quant (per-tensor and per-attn_head) ( #30141 )
...
Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com >
Signed-off-by: eldarkurtic <8884008+eldarkurtic@users.noreply.github.com >
2026-01-22 13:29:57 -07:00
Maximilien de Bayser
ff365eea94
Support bge-m3 sparse embeddings and colbert embeddings ( #14526 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com >
Signed-off-by: Max de Bayser <maxdebayser@gmail.com >
2026-01-22 23:52:57 +08:00
Cyrus Leung
2b8a38b6d6
[Model] Extend collect_children and no_init_weights contexts ( #32757 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-22 08:20:27 +00:00
Patrick von Platen
1579c9b5fd
[Llama.py -> mistral.py] Extract mistral-only relevant code into separate file ( #32780 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com >
2026-01-22 05:14:57 +00:00
Xin Yang
63227accf5
[Kernel] Add topk_sigmoid kernel ( #31246 )
...
Signed-off-by: Xin Yang <xyangx@amazon.com >
2026-01-21 22:49:51 +00:00
Pleaplusone
6c20e89c02
[ROCm][Deepseekv3.2] Refactor Sparse Indexer as CustomOp ( #29287 )
...
Signed-off-by: ganyi <ygan@amd.com >
2026-01-21 23:16:30 +08:00
Robert Shaw
cea3c754c4
[Quantization][Deprecation] Remove DeepSpeedFp8 ( #32679 )
...
Signed-off-by: Robert Shaw <robshaw@redhat.com >
Co-authored-by: Robert Shaw <robshaw@redhat.com >
2026-01-21 09:32:12 -05:00
Robert Shaw
42135d6898
[MoE Refactor] Oracle Select FP8+NVFP4 Kernels In Priority ( #32414 )
2026-01-21 08:22:33 -05:00
Divakar Verma
e14467be43
[bugfix] Aria model ( #32727 )
...
Signed-off-by: Divakar Verma <divakar.verma@amd.com >
2026-01-21 05:11:31 -08:00
Kim Hee Su
7727ce35c2
[Model] Add Eagle2.5-8B Vision-Language Model support ( #32456 )
...
Signed-off-by: kimheesu <wlskaka4@gmail.com >
2026-01-21 09:39:53 +00:00
RickyChen / 陳昭儒
f23fb5a7c1
[Bugfix] Support HF sharded weights for Mistral3/Pixtral models ( #32673 )
...
Signed-off-by: ricky-chaoju <ricky.chen@infinirc.com >
Signed-off-by: vllm-dev <ricky.chen@infinirc.com >
2026-01-20 23:27:30 -08:00
Netanel Haber
27ca95b3c9
[Bugfix] Fix Nemotron-Nano-v2-vlm static resolution ( #32682 )
...
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com >
2026-01-21 06:28:21 +00:00
shanjiaz
7ab80a8e37
Added qwen3 vision language moe support for speculative decoding ( #32048 )
...
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com >
Signed-off-by: shanjiaz <43143795+shanjiaz@users.noreply.github.com >
2026-01-21 03:24:05 +00:00
gopalsarda
0900cedb3f
Enable Eagle3 speculative decoding for Pixtral (LlavaForConditionalGeneration) ( #32542 )
...
Signed-off-by: gopalsarda <gopal.sarda@servicenow.com >
2026-01-21 11:18:05 +08:00
Alex Brooks
27b81e010d
[Bugfix] Fix Granite Vision / Don't use Siglip Pooling Head Nested Models by Default ( #32299 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com >
2026-01-21 11:11:52 +08:00
Cyrus Leung
193069d129
[5/N] Initialize MM components in context managers (Q-Z) ( #32695 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-20 19:10:23 +00:00
JJJYmmm
04a9e064db
[Bugfix] fix the ima issue of qwen-vit ( #32687 )
...
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com >
2026-01-20 17:21:25 +00:00
Cyrus Leung
fda3f03eb2
[4/N] Initialize MM components in context managers (M-P) ( #32663 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-20 14:06:32 +00:00
Chauncey
c4e5bdf61b
[Bugfix] Fix the fp8_mqa_logits dim mismatch ( #32652 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2026-01-20 18:48:07 +08:00
Cyrus Leung
7f1bcd18ff
[3/N] Initialize MM components in context managers (I-L) ( #32650 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-20 10:21:56 +00:00
Cyrus Leung
e1a34c3a5d
[2/N] Initialize MM components in context managers (E-H) ( #32641 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-20 08:12:56 +00:00
Cyrus Leung
b75e85dede
[1/N] Initialize MM components in context managers (A-D) ( #32632 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-20 14:12:42 +08:00
Cyrus Leung
4753f3bf69
[Model] Use context managers for encoder- and LM-only mode ( #32605 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-20 11:43:38 +08:00