biondizzle/vllm - vllm

Author	SHA1	Message	Date
Harry Mellor	14385c80fc	Fix weight mapping test for Transformers v5 (#33162 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-01-27 12:30:14 +00:00
Roger Wang	b539f988e1	[Models] Kimi-K2.5 (#33131 ) Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: youkaichao <youkaichao@gmail.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: wanglinian <wanglinian@stu.pku.edu.cn> Co-authored-by: wangln19 <96399074+wangln19@users.noreply.github.com> Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Nick Hill <nickhill123@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-01-27 14:50:31 +08:00
Cyrus Leung	c25dbee40d	[Model] Bump transformers version for test registry (#33100 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-26 18:53:22 +00:00
Nicolò Lucchesi	19ab0f7ce5	[Bugfix] Fix Voxtral streaming slot_mapping (#33073 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2026-01-26 10:40:40 -08:00
Andy Lo	d56afd45fd	Remove unused logic in `models/mistral.py` (#33095 ) Signed-off-by: Andy Lo <andy@mistral.ai>	2026-01-26 09:01:52 -08:00
Pleaplusone	be6931ee27	[ROCm][Bugfix] Fix ptpc scale load issue for fused shared expert path in deepseek mtp (#33018 ) Signed-off-by: ganyi <ygan@amd.com>	2026-01-26 23:19:04 +08:00
Yuxuan Zhang	bb17e8f11c	[GLM-OCR] GLM-OCR with MTP Support (#33005 ) Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2026-01-26 06:24:43 -08:00
Cyrus Leung	dcd80206b7	[Chore] Update type annotation of `input_ids` in model forward (#33063 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-26 06:02:10 -08:00
VihaanThat	208c56256f	[Feature] Add LoRA support for Gemma3 vision components (#32764 )	2026-01-26 13:56:40 +00:00
Itay Etelis	6ca2c91b96	[Model] Use mm_position to compute mrope positions for Qwen3-Omni (#33010 ) Signed-off-by: Itay Etelis <itay.etelis@ibm.com> Co-authored-by: Itay Etelis <itay.etelis@ibm.com>	2026-01-26 13:48:07 +00:00
Cyrus Leung	11b556878b	[Refactor] Use data parser for matching data items to multi-modal UUIDs (#32955 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-26 15:00:28 +08:00
ltd0924	105d104576	[StepVL] support close img patch (#32923 ) Signed-off-by: luotingdan <luotingdan@stepfun.com> Signed-off-by: ltd0924 <32387785+ltd0924@users.noreply.github.com> Co-authored-by: luotingdan <luotingdan@stepfun.com>	2026-01-25 20:56:39 -08:00
Lucas Wilkinson	566cdb6cfb	[CI] Fix MHA attention test failure (AttributeError when model_config is None in ViT attention backend) (#33033 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2026-01-25 19:49:53 -08:00
Andreas Karatzas	22aeb43007	[Bugfix][VLM] Fix transformers backend embed_multimodal for Qwen2.5-VL profiling (#32969 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2026-01-26 08:34:05 +08:00
Itay Etelis	a698e8e7ad	[Model] Use mm_position to compute mrope positions for Qwen2.5-Omni (#32772 ) Signed-off-by: Itay Etelis <itay.etelis@ibm.com> Co-authored-by: Itay Etelis <itay.etelis@ibm.com>	2026-01-25 20:15:53 +08:00
JJJYmmm	7e67df5570	[Bugfix] fix encoder cache hang in Qwen3VL (#32684 ) Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2026-01-25 05:17:31 +00:00
david guan	bc0d291bfe	feat: Complete LoRA support for MiniMaxM2 Fixes #32736 (#32763 ) Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-24 20:48:46 +08:00
Isotr0py	9ad7f89f55	[Models]: Make Multimodal config implicit in ViT implementation (#31972 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2026-01-24 20:34:26 +08:00
Isotr0py	8edaf38570	[Models] Add `SharedFusedMoE` support to Qwen3MoE (#32082 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2026-01-23 23:36:31 -08:00
Wentao Ye	37c9859fab	[Refactor] Clean up unused variables & func (#32692 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2026-01-23 17:04:25 -05:00
Harry Huang	5206e5e28c	[V1][Hybrid] Mamba Prefix Caching with align mode (#30877 ) Signed-off-by: huanghaoyan.hhy <huanghaoyan.hhy@alibaba-inc.com> Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: Chen Zhang <zhangch99@outlook.com>	2026-01-23 09:56:48 -08:00
Matteo Fari	fec9da0af4	[Model] Enable LoRA support for internvl2 (#32397 ) Signed-off-by: Matteo Fari <matteofari06@gmail.com>	2026-01-24 01:39:01 +08:00
baonudesifeizhai	1fb648bf10	[Bugfix] Fix FP8 MoE EP Weight Loading for ModelOpt Llama4 (#32886 ) Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>	2026-01-23 10:31:48 -05:00
Raushan Turganbay	d95d650762	[Bugfix] Fix getting vision features in Transformer Multimodal backend (#32933 ) Signed-off-by: raushan <raushan@huggingface.co>	2026-01-23 13:34:48 +00:00
tianshu-Michael-yu	13d8746c54	[Feature]: Remove DtoH Copy for lfm2_vl On Default Stream (#32815 ) Signed-off-by: Tianshu Yu <tianshuyu.formal@gmail.com>	2026-01-23 13:20:30 +00:00
Patrick von Platen	3f3f89529d	[Voxtral] Add new streaming arch (#32861 ) Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-01-23 12:41:52 +01:00
Andreas Karatzas	a8eb1182f1	[CI][Models] Add VLM Support for Sequence Classification Conversion (#32885 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2026-01-23 16:22:51 +08:00
Eldar Kurtić	44f08af3a7	Add llmcompressor fp8 kv-cache quant (per-tensor and per-attn_head) (#30141 ) Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com> Signed-off-by: eldarkurtic <8884008+eldarkurtic@users.noreply.github.com>	2026-01-22 13:29:57 -07:00
Maximilien de Bayser	ff365eea94	Support bge-m3 sparse embeddings and colbert embeddings (#14526 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Signed-off-by: Max de Bayser <maxdebayser@gmail.com>	2026-01-22 23:52:57 +08:00
Cyrus Leung	2b8a38b6d6	[Model] Extend `collect_children` and `no_init_weights` contexts (#32757 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-22 08:20:27 +00:00
Patrick von Platen	1579c9b5fd	[Llama.py -> mistral.py] Extract mistral-only relevant code into separate file (#32780 ) Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>	2026-01-22 05:14:57 +00:00
Xin Yang	63227accf5	[Kernel] Add topk_sigmoid kernel (#31246 ) Signed-off-by: Xin Yang <xyangx@amazon.com>	2026-01-21 22:49:51 +00:00
Pleaplusone	6c20e89c02	[ROCm][Deepseekv3.2] Refactor Sparse Indexer as CustomOp (#29287 ) Signed-off-by: ganyi <ygan@amd.com>	2026-01-21 23:16:30 +08:00
Robert Shaw	cea3c754c4	[Quantization][Deprecation] Remove `DeepSpeedFp8` (#32679 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Robert Shaw <robshaw@redhat.com>	2026-01-21 09:32:12 -05:00
Robert Shaw	42135d6898	[MoE Refactor] Oracle Select FP8+NVFP4 Kernels In Priority (#32414 )	2026-01-21 08:22:33 -05:00
Divakar Verma	e14467be43	[bugfix] Aria model (#32727 ) Signed-off-by: Divakar Verma <divakar.verma@amd.com>	2026-01-21 05:11:31 -08:00
Kim Hee Su	7727ce35c2	[Model] Add Eagle2.5-8B Vision-Language Model support (#32456 ) Signed-off-by: kimheesu <wlskaka4@gmail.com>	2026-01-21 09:39:53 +00:00
RickyChen / 陳昭儒	f23fb5a7c1	[Bugfix] Support HF sharded weights for Mistral3/Pixtral models (#32673 ) Signed-off-by: ricky-chaoju <ricky.chen@infinirc.com> Signed-off-by: vllm-dev <ricky.chen@infinirc.com>	2026-01-20 23:27:30 -08:00
Netanel Haber	27ca95b3c9	[Bugfix] Fix Nemotron-Nano-v2-vlm static resolution (#32682 ) Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>	2026-01-21 06:28:21 +00:00
shanjiaz	7ab80a8e37	Added qwen3 vision language moe support for speculative decoding (#32048 ) Signed-off-by: shanjiaz <zsjwpianpian@gmail.com> Signed-off-by: shanjiaz <43143795+shanjiaz@users.noreply.github.com>	2026-01-21 03:24:05 +00:00
gopalsarda	0900cedb3f	Enable Eagle3 speculative decoding for Pixtral (LlavaForConditionalGeneration) (#32542 ) Signed-off-by: gopalsarda <gopal.sarda@servicenow.com>	2026-01-21 11:18:05 +08:00
Alex Brooks	27b81e010d	[Bugfix] Fix Granite Vision / Don't use Siglip Pooling Head Nested Models by Default (#32299 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2026-01-21 11:11:52 +08:00
Cyrus Leung	193069d129	[5/N] Initialize MM components in context managers (Q-Z) (#32695 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-20 19:10:23 +00:00
JJJYmmm	04a9e064db	[Bugfix] fix the ima issue of qwen-vit (#32687 ) Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>	2026-01-20 17:21:25 +00:00
Cyrus Leung	fda3f03eb2	[4/N] Initialize MM components in context managers (M-P) (#32663 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-20 14:06:32 +00:00
Chauncey	c4e5bdf61b	[Bugfix] Fix the fp8_mqa_logits dim mismatch (#32652 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-01-20 18:48:07 +08:00
Cyrus Leung	7f1bcd18ff	[3/N] Initialize MM components in context managers (I-L) (#32650 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-20 10:21:56 +00:00
Cyrus Leung	e1a34c3a5d	[2/N] Initialize MM components in context managers (E-H) (#32641 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-20 08:12:56 +00:00
Cyrus Leung	b75e85dede	[1/N] Initialize MM components in context managers (A-D) (#32632 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-20 14:12:42 +08:00
Cyrus Leung	4753f3bf69	[Model] Use context managers for encoder- and LM-only mode (#32605 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-20 11:43:38 +08:00

2155 Commits