biondizzle/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Jee Jee Li	ce1eafd1a5	[Core] Initialize LoRA support for tower and connector in multi-modal models (#26674 ) Signed-off-by: bk-201 <joy25810@foxmail.com> Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com> Co-authored-by: bk-201 <joy25810@foxmail.com> Co-authored-by: prashanth058 <prashanth.dannamaneni@uipath.com> Co-authored-by: Anexdeus <5142168@mail.ru>	2025-12-26 04:48:20 -08:00
dengyunyang	8f8f469b1b	[BugFix] skip language model in Encoder (#30242 ) Signed-off-by: dengyunyang <584797741@qq.com>	2025-12-22 05:25:59 -08:00
Roger Wang	f5f51e5931	[Core][MM] Optimize encoder cache manager by operating with embeddings only (#30475 ) Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Sun Kim <sunytokki@gmail.com>	2025-12-16 14:18:17 -08:00
Shanshan Shen	87b4d1557d	[CustomOp][MM] Extract MMEncoderAttention as CustomOp and replace the backend of QwenVisionAttention with it. (#30125 ) Signed-off-by: shen-shanshan <467638484@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-12-15 11:13:32 +08:00
ZiTian Zhao	ae88aada38	[Feature]Add EVS (Efficient Video Sampling) Support for Qwen3-VL (#29752 ) Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com> Co-authored-by: deitxfge <huhaibo1990@126.com>	2025-12-14 05:24:56 -08:00
Harry Mellor	cf3eacfe58	Standardise `get_rope` to use `rope_parameters["partial_rotary_factor"]`, not `rotary_dim` (#30389 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-11 20:45:23 +00:00
Cyrus Leung	3a3b06ee70	[Misc] Improve error message for `is_multimodal` (#30483 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-11 06:39:51 -08:00
Cyrus Leung	979f50efd0	[Deprecation] Remove fallbacks for `embed_input_ids` and `embed_multimodal` (#30458 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-11 06:58:23 +00:00
Cyrus Leung	671427efbf	[Model] Move `multimodal_cpu_fields` definition to field config (#30181 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 13:40:02 +00:00
Cyrus Leung	c46b932df2	[Chore] Deprecate `SupportsMultiModal.merge_by_field_config` (#30170 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 07:57:28 +00:00
Tao Yun	6dcb07f676	support qwen3-vl handle requests with embeddings (#30037 ) Signed-off-by: taoyun <1069423820@qq.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-12-04 17:34:06 +00:00
Cyrus Leung	fe3398fab2	[Chore] Enable passing `tokenizer=None` into MM processor (#29724 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-29 06:25:10 -08:00
Mingyuan Ma	460d8bbf2d	Remove upstream fa checks (#29471 ) Signed-off-by: mingyuanm <mingyuanm@nvidia.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-28 05:52:42 -08:00
Cyrus Leung	b34e8775a3	Revert "[CPU]Update CPU PyTorch to 2.9.0 (#29589 )" (#29647 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-27 22:43:18 -08:00
EanWang211123	37b15e97e8	[Multimodal][Speculative Decoding]Eagle3 mm support, enablement on qwen3vl (#29594 ) Signed-off-by: Tsai, Louie <louie.tsai@intel.com> Signed-off-by: EanWang211123 <wangyiheng@sangfor.com.cn> Co-authored-by: Louie Tsai <louie.tsai@intel.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-11-27 22:05:45 -08:00
Roger Wang	0ff70821c9	[Core] Deprecate `xformers` (#29262 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-11-24 04:18:55 +00:00
Lukas Geiger	a9705a290a	[Model][QwenVL] Replace `torch.repeat_interleave` with faster `np.repeat` (#28964 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-19 22:04:23 -08:00
Harry Mellor	a8b70304d6	Update `rope_scaling` to `rope_parameters` in preparation for Transformers v5 (#28542 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-19 09:06:36 -08:00
Lukas Geiger	3d4e7d34be	[Model][QwenVL] Simplify cos/sin rotary embedding indexing (#28962 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-19 05:43:01 +00:00
Canlin Guo	b9489f51e1	[Model][Perf] Use cos and sin cache in QwenVL (#28798 ) Signed-off-by: gcanlin <canlinguosdu@gmail.com>	2025-11-18 11:51:54 +00:00
Lukas Geiger	07cadab27a	[Model][Qwen3VL] Cache positional embedding indices (#28475 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-15 19:03:09 +00:00
Lukas Geiger	f05d474c8a	[Model][Qwen3VL] Use `mm_position` to compute mrope positions (#28730 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-11-14 19:45:11 -08:00
GuanH	cec275efce	[Bugfix] resolve Qwen3-VL GPTQModel quantized model loading failure (#28663 ) Signed-off-by: GuanH <guansdrailib@gmail.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-14 18:44:27 +00:00
Shanshan Shen	41b92f7d38	[Model][MM] Extract conv layer as CustomOp (#28455 ) Signed-off-by: shen-shanshan <467638484@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-14 19:16:13 +08:00
Harry Mellor	97d1c99302	Rename clashing method names for vLLM model protocol (#27583 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 19:14:33 -08:00
Lukas Geiger	cbb799e314	[Model][Qwen3VL] Simplify `get_mrope_input_positions` using numpy (#28302 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-12 02:55:10 +00:00
Jee Jee Li	9d1c474704	[LoRA][1/N]Remove LoRA extra vocab (#28382 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-11-11 11:06:21 -08:00
Cyrus Leung	afffd3cc8a	[Model] Pass `mm_features` directly into `get_mrope_input_positions` (#28399 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-11 21:14:48 +08:00
Matthew Bonanni	b30dfa03c5	[Attention] Refactor CUDA attention backend selection logic (#24794 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-11 07:40:44 -05:00
Lukas Geiger	9973e6e04a	[Model][Qwen3VL] Slighly speedup `fast_pos_embed_interpolate` (#28434 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-11 10:35:10 +00:00
Cyrus Leung	d0e186c16f	[V0 Deprecation] Remove unused `context_len` and `seq_len` from M-RoPE (#28395 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-11 00:30:06 +08:00
Lukas Geiger	e0919f331d	[Core][MM] Add mechanism to configure multimodal fields which should stay on CPU (#28168 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-07 12:14:29 +00:00
Lukas Geiger	0d8161b075	[Model] Fix Qwen3VL and Qwen3Omni after torch.compile changes (#27705 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-29 05:28:20 +00:00
Cyrus Leung	cbd5e07a51	[Model] Use merge_by_field_config for MM models (Qwen series) (#27546 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-27 05:38:05 +00:00
Cyrus Leung	66a168a197	[CI/Build] Refactor processing tests (#27470 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-10-25 16:14:30 +00:00
Isotr0py	42efe609ba	[MM][Bugfix] Replace `PatchEmbed`'s conv3d to linear layer (#27418 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-24 07:32:47 +00:00
Cyrus Leung	14e2f1231e	[Bugfix] Make `get_mrope_input_positions` instance methods (#27342 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-22 08:38:34 -07:00
Roger Wang	c3a2c6ac5f	[MM][Core] Decouple ViT backend from LM backend (#27061 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-10-21 00:30:10 -07:00
Cyrus Leung	d31f7844f8	[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-19 05:20:55 -07:00
燃	4c91a28e30	[bugfix] Qwen3-VL fix video incorrect timestamp calculations while do_sample_frames=True (#27104 ) Co-authored-by: 松灵 <wpf272043@alibaba-inc.com>	2025-10-17 16:26:33 +00:00
Jee Jee Li	9f4e30904b	[Model] Fix Qwen3VL mm mapping (#27027 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-10-16 09:45:59 -07:00
Cyrus Leung	d2740fafbf	[Chore] Separate out `vllm.utils.collections` (#26990 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 08:35:35 +00:00
Cyrus Leung	d2f816d6ff	[Bugfix] Standardize merging multimodal embeddings (#26771 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-14 09:36:21 +00:00
Lukas Geiger	a6049be73c	[Models][Qwen3VL] Speedup `fast_pos_embed_interpolate` (#26647 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-10-13 01:20:07 +08:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
JJJYmmm	9d6cff3ede	[Bugfix][Qwen3VL] fix deepstack in qwen3vl (#26626 ) Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com> Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com> Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>	2025-10-11 05:58:33 -07:00
dsinghvi	727144bed1	[Refactor]: Use M-RoPE interface directly while defining model class instead of maintaining model specific M-RoPE implementation in mrope.py (#24172 ) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>	2025-10-11 07:21:04 +00:00
Lukas Geiger	b2155ed317	[Model][Qwen3VL] Compute `cu_seqlens` on CPU to remove (#26496 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-10-10 09:42:17 -07:00
Lukas Geiger	2c1c7dfb35	[Models][Qwen] Replace `pad` with `cat` for better performance (#26486 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-10-09 14:51:26 +00:00
Lukas Geiger	0426e3c5e1	[Models][Qwen3VL] Optimise `_validate_and_reshape_mm_tensor` (#26426 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-10-09 10:25:48 +00:00

74 Commits