Cyrus Leung
|
d0e186c16f
|
[V0 Deprecation] Remove unused context_len and seq_len from M-RoPE (#28395)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-11 00:30:06 +08:00 |
|
Lukas Geiger
|
e0919f331d
|
[Core][MM] Add mechanism to configure multimodal fields which should stay on CPU (#28168)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-11-07 12:14:29 +00:00 |
|
Lukas Geiger
|
0d8161b075
|
[Model] Fix Qwen3VL and Qwen3Omni after torch.compile changes (#27705)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-29 05:28:20 +00:00 |
|
Cyrus Leung
|
cbd5e07a51
|
[Model] Use merge_by_field_config for MM models (Qwen series) (#27546)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-27 05:38:05 +00:00 |
|
Cyrus Leung
|
66a168a197
|
[CI/Build] Refactor processing tests (#27470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-25 16:14:30 +00:00 |
|
Isotr0py
|
42efe609ba
|
[MM][Bugfix] Replace PatchEmbed's conv3d to linear layer (#27418)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-24 07:32:47 +00:00 |
|
Cyrus Leung
|
14e2f1231e
|
[Bugfix] Make get_mrope_input_positions instance methods (#27342)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-22 08:38:34 -07:00 |
|
Roger Wang
|
c3a2c6ac5f
|
[MM][Core] Decouple ViT backend from LM backend (#27061)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-10-21 00:30:10 -07:00 |
|
Cyrus Leung
|
d31f7844f8
|
[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-19 05:20:55 -07:00 |
|
燃
|
4c91a28e30
|
[bugfix] Qwen3-VL fix video incorrect timestamp calculations while do_sample_frames=True (#27104)
Co-authored-by: 松灵 <wpf272043@alibaba-inc.com>
|
2025-10-17 16:26:33 +00:00 |
|
Jee Jee Li
|
9f4e30904b
|
[Model] Fix Qwen3VL mm mapping (#27027)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-16 09:45:59 -07:00 |
|
Cyrus Leung
|
d2740fafbf
|
[Chore] Separate out vllm.utils.collections (#26990)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-16 08:35:35 +00:00 |
|
Cyrus Leung
|
d2f816d6ff
|
[Bugfix] Standardize merging multimodal embeddings (#26771)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-14 09:36:21 +00:00 |
|
Lukas Geiger
|
a6049be73c
|
[Models][Qwen3VL] Speedup fast_pos_embed_interpolate (#26647)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-10-13 01:20:07 +08:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
JJJYmmm
|
9d6cff3ede
|
[Bugfix][Qwen3VL] fix deepstack in qwen3vl (#26626)
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
|
2025-10-11 05:58:33 -07:00 |
|
dsinghvi
|
727144bed1
|
[Refactor]: Use M-RoPE interface directly while defining model class instead of maintaining model specific M-RoPE implementation in mrope.py (#24172)
Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-10-11 07:21:04 +00:00 |
|
Lukas Geiger
|
b2155ed317
|
[Model][Qwen3VL] Compute cu_seqlens on CPU to remove (#26496)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-10 09:42:17 -07:00 |
|
Lukas Geiger
|
2c1c7dfb35
|
[Models][Qwen] Replace pad with cat for better performance (#26486)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-10-09 14:51:26 +00:00 |
|
Lukas Geiger
|
0426e3c5e1
|
[Models][Qwen3VL] Optimise _validate_and_reshape_mm_tensor (#26426)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-10-09 10:25:48 +00:00 |
|
Lukas Geiger
|
93f2c0aa08
|
[Models] Improve iteration over layers (#26425)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-10-08 20:48:33 +00:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Roger Wang
|
67bc0c003e
|
[Bugfix] Fix qwen3 vl dummy data generation with overrides (#26193)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-10-04 01:40:20 +00:00 |
|
Wenlong Wang
|
79aa244678
|
[Multi Modal] Configurable MM Profiling (#25631)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-03 03:59:10 -07:00 |
|
TJian
|
9c5ee91b2a
|
[ROCm] [VL] [Bugfix] Fix vit flash attn dispatcher logic for ROCm (#26104)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2025-10-02 22:34:53 -07:00 |
|
Matthew Bonanni
|
2aaa423842
|
[Attention] Move Backend enum into registry (#25893)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-10-02 20:32:24 -07:00 |
|
Roger Wang
|
66bca9b8bd
|
[MM] Add text-only mode for Qwen3-VL (#26000)
|
2025-09-30 21:13:42 -07:00 |
|
Isotr0py
|
bd51f78e39
|
[V0 Deprecation][Models] Remove all V0 condition for mm embeddings merge (#25331)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: isotr0py <2037008807@qq.com>
|
2025-09-29 14:09:18 +08:00 |
|
Roger Wang
|
65ecb4f134
|
[Bugfix] Fallback ViT attn backend to SDPA for blackwell (#25851)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-09-29 06:03:51 +00:00 |
|
Isotr0py
|
0efd540dbc
|
[VLM] Update Qwen3-VL max_num_video_tokens calculation for configurable video profiling (#25557)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-28 04:21:01 +00:00 |
|
Cyrus Leung
|
27d7638b94
|
[Bugfix] Merge MM embeddings by index instead of token IDs (#16229)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-27 08:15:12 +00:00 |
|
Wentao Ye
|
c242c98031
|
[Bugfix] Allow Only SDPA Backend for ViT on B200 for Qwen3-VL (#25788)
|
2025-09-26 20:44:52 -07:00 |
|
Isotr0py
|
d4d9899860
|
[Quantization] Add field to skip unquantized modules for GPTQ config (#25455)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-26 15:47:41 +00:00 |
|
Isotr0py
|
17b4c6685c
|
[Bugfix] Fix Qwen3-VL max_num_video_tokens calculation for video profiling (#25648)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-25 18:36:01 +08:00 |
|
Roger Wang
|
7be9ffcd9f
|
[Misc] Fix Qwen3-VL video_grid_thw typing (#25646)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-09-25 10:16:45 +00:00 |
|
Harry Mellor
|
8c853050e7
|
[Docs] Enable fail_on_warning for the docs build in CI (#25580)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-24 19:30:33 +00:00 |
|
Cyrus Leung
|
babad6e5dd
|
[Misc] Move DP for ViT code inside model executor dir (#25459)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-23 09:20:52 +00:00 |
|
Isotr0py
|
af7dfb0d1a
|
[Perf] Further optimization for Qwen3-VL fast_pos_embed_interpolate (#25347)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-21 20:12:45 +00:00 |
|
Woosuk Kwon
|
1c3ffdbecc
|
[V0 Deprecation] Remove V0 sampling metadata (#25345)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-09-21 10:37:11 -07:00 |
|
Roger Wang
|
30d08911f7
|
[MM][Perf] Minor Optimization on Qwen3-VL fast_pos_embed_interpolate (#25337)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-09-21 11:05:20 +00:00 |
|
LJH-LBJ
|
d90e212a3a
|
Remove Redundant Assignment in Qwen3_VisionPatchMerger (#25224)
Signed-off-by: Junhong <liujunhong11@huawei.com>
Co-authored-by: Junhong <liujunhong11@huawei.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-19 12:15:13 -06:00 |
|
Roger Wang
|
1dfea5f4a9
|
[Bugfix][Perf] Misc fixes for Qwen3 VL (#25238)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-09-19 10:46:16 +00:00 |
|
Roger Wang
|
3127274d02
|
[MM Encoder] Apply DP ViT for Qwen3-VL model series (#24955)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-17 21:04:21 -07:00 |
|
Roger Wang
|
0f7acdd73c
|
[Model] Support Qwen3-VL Model Series (#24727)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-17 05:01:04 +00:00 |
|