biondizzle/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Matthew Bonanni	2612ba9285	[1/N][Attention] Restructure attention: move files (#31916 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2026-01-09 13:10:24 -08:00
Isotr0py	2972a05473	[MM Encoder]: Make MMEncoderAttention's `scale` takes effect properly (#31950 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2026-01-08 02:33:48 -08:00
Jee Jee Li	ce1eafd1a5	[Core] Initialize LoRA support for tower and connector in multi-modal models (#26674 ) Signed-off-by: bk-201 <joy25810@foxmail.com> Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com> Co-authored-by: bk-201 <joy25810@foxmail.com> Co-authored-by: prashanth058 <prashanth.dannamaneni@uipath.com> Co-authored-by: Anexdeus <5142168@mail.ru>	2025-12-26 04:48:20 -08:00
dengyunyang	8f8f469b1b	[BugFix] skip language model in Encoder (#30242 ) Signed-off-by: dengyunyang <584797741@qq.com>	2025-12-22 05:25:59 -08:00
Shanshan Shen	3bd9c49158	[CustomOp] Extract ApplyRotaryEmb as CustomOp and unify the dispatch logic (#29873 ) Signed-off-by: shen-shanshan <467638484@qq.com> Co-authored-by: gcanlin <canlinguosdu@gmail.com> Co-authored-by: TJian <tunjian.tan@embeddedllm.com>	2025-12-15 19:08:16 -08:00
Shanshan Shen	87b4d1557d	[CustomOp][MM] Extract MMEncoderAttention as CustomOp and replace the backend of QwenVisionAttention with it. (#30125 ) Signed-off-by: shen-shanshan <467638484@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-12-15 11:13:32 +08:00
Ilya Markov	3224ea9915	[torch.compile] Add encoder tag for compilation (#30489 ) Signed-off-by: ilmarkov <markovilya197@gmail.com>	2025-12-14 18:15:11 +08:00
Harry Mellor	cf3eacfe58	Standardise `get_rope` to use `rope_parameters["partial_rotary_factor"]`, not `rotary_dim` (#30389 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-11 20:45:23 +00:00
Cyrus Leung	671427efbf	[Model] Move `multimodal_cpu_fields` definition to field config (#30181 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 13:40:02 +00:00
Cyrus Leung	c46b932df2	[Chore] Deprecate `SupportsMultiModal.merge_by_field_config` (#30170 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 07:57:28 +00:00
Cyrus Leung	b286a311c2	[Chore] Deprecate `merge_by_field_config` arg (#30035 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-04 17:21:24 +00:00
Mingyuan Ma	460d8bbf2d	Remove upstream fa checks (#29471 ) Signed-off-by: mingyuanm <mingyuanm@nvidia.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-28 05:52:42 -08:00
Roger Wang	0ff70821c9	[Core] Deprecate `xformers` (#29262 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-11-24 04:18:55 +00:00
ZiTian Zhao	d84d8f4429	Fix EVS crash when using `video_embeds` inputs in Qwen2.5-VL (#29232 ) Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-11-22 06:48:59 -08:00
Harry Mellor	a8b70304d6	Update `rope_scaling` to `rope_parameters` in preparation for Transformers v5 (#28542 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-19 09:06:36 -08:00
Lukas Geiger	3d4e7d34be	[Model][QwenVL] Simplify cos/sin rotary embedding indexing (#28962 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-19 05:43:01 +00:00
Canlin Guo	b9489f51e1	[Model][Perf] Use cos and sin cache in QwenVL (#28798 ) Signed-off-by: gcanlin <canlinguosdu@gmail.com>	2025-11-18 11:51:54 +00:00
Lukas Geiger	5a87076d6e	[Model][QwenVL] Optimize `Qwen2_5_VisionAttention` q,k preparation (#28769 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-16 17:37:15 +00:00
Shanshan Shen	41b92f7d38	[Model][MM] Extract conv layer as CustomOp (#28455 ) Signed-off-by: shen-shanshan <467638484@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-14 19:16:13 +08:00
Harry Mellor	97d1c99302	Rename clashing method names for vLLM model protocol (#27583 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 19:14:33 -08:00
Canlin Guo	bc5bd45c7d	[Refactor] Remove redundant TP gather/split in split_qkv in QwenVL (#28271 ) Signed-off-by: gcanlin <canlinguosdu@gmail.com>	2025-11-12 15:56:47 +00:00
Cyrus Leung	afffd3cc8a	[Model] Pass `mm_features` directly into `get_mrope_input_positions` (#28399 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-11 21:14:48 +08:00
Matthew Bonanni	b30dfa03c5	[Attention] Refactor CUDA attention backend selection logic (#24794 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-11 07:40:44 -05:00
Cyrus Leung	d0e186c16f	[V0 Deprecation] Remove unused `context_len` and `seq_len` from M-RoPE (#28395 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-11 00:30:06 +08:00
Lukas Geiger	e0919f331d	[Core][MM] Add mechanism to configure multimodal fields which should stay on CPU (#28168 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-07 12:14:29 +00:00
Harry Mellor	c0a4b95d64	Fix issues from #28242 (#28257 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-07 04:23:17 +00:00
Lucas Kabela	4bf56c79cc	[Multimodal][torch.compile] Add compilation config field for turning off ViT/MM compile (#28242 ) Signed-off-by: Lucas Kabela <lucaskabela@meta.com>	2025-11-07 00:16:03 +00:00
Lucas Kabela	55011aef24	[Bugfix][Qwen][Multimodal] Move Qwen2_5_vl sdpa to custom op and reenable compile (#27764 ) Signed-off-by: Lucas Kabela <lucaskabela@meta.com>	2025-11-03 11:12:15 -08:00
Yan Ma	7e2729b57e	[Multimodal][XPU]Enable vision attn backend for xpu platform (#27525 ) Signed-off-by: Yan Ma <yan.ma@intel.com> Signed-off-by: Kunshang Ji <kunshang.ji@intel.com> Co-authored-by: Yejing Lai <yejing.lai@intel.com> Co-authored-by: Guancheng Fu <110874468+gc-fu@users.noreply.github.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2025-11-01 04:45:02 +00:00
Zhewen Li	e806178d2a	[BugFix][VL] Fix FA selection on Qwen2.5-VL (#27790 ) Signed-off-by: zhewenli <zhewenli@meta.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-30 07:54:44 +00:00
Chenheli Hua	48eb8eba58	[Temp fix] Disable torch.compile for Qwen2.5 VL's VisionBlock temporarily. (#27760 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-29 23:17:48 +00:00
JartX	7568a282b9	[FIXBUG] Qwen3VL hallucinations without Contiguous on Torch.SDPA (#27744 ) Signed-off-by: JartX <sagformas@epdcenter.es> Co-authored-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-10-29 16:55:35 +00:00
Lukas Geiger	0d8161b075	[Model] Fix Qwen3VL and Qwen3Omni after torch.compile changes (#27705 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-29 05:28:20 +00:00
Lucas Kabela	94666612a9	[Misc][qwen2_5_vl][torch.compile] Enable `supports_torch_compile` on generic nn.Module and demonstrate speedup on Qwen Vision model (#23207 ) Signed-off-by: Lucas Kabela <lucaskabela@meta.com> Signed-off-by: Lucas Kabela <lucasakabela@gmail.com>	2025-10-28 22:36:43 +00:00
Cyrus Leung	cbd5e07a51	[Model] Use merge_by_field_config for MM models (Qwen series) (#27546 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-27 05:38:05 +00:00
JartX	65d2cf9511	[BUGFIX][ROCM] ViT FlashAttention on ROCm (no GFX9) and contiguous on qwen3vl ROCm TORCH_SDPA (#27190 ) Signed-off-by: JartX <sagformas@epdcenter.es> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-10-26 15:08:52 +08:00
Isotr0py	42efe609ba	[MM][Bugfix] Replace `PatchEmbed`'s conv3d to linear layer (#27418 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-24 07:32:47 +00:00
Jonathan Chen	ca76486a16	[Chore] Separate out `vllm.utils.platform_utils.py` (#27374 ) Signed-off-by: Jonathan <chenleejonathan@gmail.com>	2025-10-23 19:08:06 +00:00
Cyrus Leung	14e2f1231e	[Bugfix] Make `get_mrope_input_positions` instance methods (#27342 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-22 08:38:34 -07:00
Roger Wang	c3a2c6ac5f	[MM][Core] Decouple ViT backend from LM backend (#27061 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-10-21 00:30:10 -07:00
Lukas Geiger	5c2acb270a	[Models][QwenVL] Remove unnecessary `.contiguous()` calls (#27106 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-10-18 07:05:05 -07:00
ZiTian Zhao	7bb736d00e	Fix Qwen2.5 VL image grid docstring (#27033 ) Signed-off-by: zitian zhao <zitian.zhao@tencentmusic.com>	2025-10-16 09:57:36 -07:00
Cyrus Leung	d2f816d6ff	[Bugfix] Standardize merging multimodal embeddings (#26771 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-14 09:36:21 +00:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x \| None` and `Union[x, y]` to `x \| y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
dsinghvi	727144bed1	[Refactor]: Use M-RoPE interface directly while defining model class instead of maintaining model specific M-RoPE implementation in mrope.py (#24172 ) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>	2025-10-11 07:21:04 +00:00
tomeras91	b8f603cebe	[Model] EVS support for nano_nemotron_vl (#26269 ) Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com> Signed-off-by: tomeras91 <57313761+tomeras91@users.noreply.github.com> Signed-off-by: Eugene Khvedchenia <ekhvedchenia@nvidia.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Eugene Khvedchenia <ekhvedchenia@nvidia.com>	2025-10-07 00:23:37 +08:00
Harry Mellor	557b2e961d	Remove all cases of `fmt: on/off` (#26253 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 09:18:14 -07:00
Harry Mellor	4e256cadc2	Remove all references to `yapf` as it's no longer used (#26251 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 09:18:11 -07:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
TJian	9c5ee91b2a	[ROCm] [VL] [Bugfix] Fix vit flash attn dispatcher logic for ROCm (#26104 ) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-10-02 22:34:53 -07:00

1 2 3

117 Commits