Commit Graph

142 Commits

Author SHA1 Message Date
Russell Bryant
6f3b2047ab [Core] Fix SSRF bypass via backslash-@ URL parsing inconsistency (#34743)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: isotr0py <2037008807@qq.com>
2026-02-18 03:53:35 +00:00
Reagan Lee
c4df59ad43 Add embedding input functionality for disabled modalities [remake] (#32493)
Signed-off-by: Reagan Lee <“reaganjlee@gmail.com”>
Signed-off-by: Reagan Lee <reaganjlee@gmail.com>
Signed-off-by: Reagan Lee <96998476+reaganjlee@users.noreply.github.com>
Co-authored-by: Reagan Lee <“reaganjlee@gmail.com”>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-08 04:57:16 -08:00
Cyrus Leung
48312e579a [Misc] Make PlaceholderRange.get_num_embeds a method (#34035)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-07 05:30:17 +00:00
Cyrus Leung
116880a5a0 [Bugfix] Make MM batching more robust (#33817)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-05 20:40:58 +00:00
Cyrus Leung
88c3e114d8 [Refactor] Move MM data parsing outside processor (#33408)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-31 16:46:14 +00:00
Cyrus Leung
c87eac18f7 [Refactor] Move MM item count validation outside of processor (#33396)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-30 09:27:31 +00:00
Cyrus Leung
831453fcef [Chore] Move MediaConnector to vllm.multimodal.media (#33324)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-29 16:54:31 +00:00
Cyrus Leung
c6e7404cc5 [Multimodal] Simplify MM input definitions (#33331)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-29 13:32:04 +00:00
Cyrus Leung
28459785ff [3/N] Group together media-related code (#32406)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-15 11:52:12 +00:00
Cyrus Leung
cbbae38f93 [2/N] Move cache factories to MM registry (#32382)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-15 01:02:30 -08:00
Cyrus Leung
9ea07b41da [1/N] Reorganize multimodal processing code (#32327)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-14 15:25:31 +00:00
Cyrus Leung
90db5b31e4 [Refactor] Move top-level dummy data generation to registry (#32310)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-14 02:17:46 -08:00
Cyrus Leung
252c011012 [Refactor] Remove MultiModalProfiler (#32254)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-13 15:10:20 +00:00
Jeremy Teboul
657e9c0e18 [Fix] Introduce audio channels spec (#31595)
Signed-off-by: Jeremy Teboul <jeremyte@meta.com>
2026-01-09 19:34:51 +00:00
prashanth058
d3235cb503 [Fix] Enable mm_processor_cache with vision LoRA (#31927)
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
2026-01-08 15:31:51 +08:00
vSeamar
6f351548b2 [Frontend] Implement robust video frame recovery for corrupted videos (#29197)
Signed-off-by: cmartinez <cmartinez@roblox.com>
Signed-off-by: vSeamar <cmartinez@roblox.com>
2026-01-07 01:13:24 +00:00
Nikhil G
ada6f91d56 Fix RecursionError in MediaWithBytes unpickling (#31191)
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
2026-01-06 20:11:26 +00:00
jeremyteboul
97a01308e9 Improve HF qwen3_omni: preserve audio_sample_rate in kwargs restructuring (#29255)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
2026-01-03 04:31:09 +00:00
Wenqi Glantz
4924ac582c Add hidden dimension validation for multimodal embedding inputs (#30968)
Signed-off-by: Wenqi Glantz <wglantz@nvidia.com>
2025-12-19 07:59:36 +00:00
Roger Wang
f5f51e5931 [Core][MM] Optimize encoder cache manager by operating with embeddings only (#30475)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Sun Kim <sunytokki@gmail.com>
2025-12-16 14:18:17 -08:00
Wenqi Glantz
84e23d103d additional protection for CVE-2025-62164 (#30649)
Signed-off-by: Wenqi Glantz <wglantz@nvidia.com>
2025-12-15 03:07:10 +00:00
jeremyteboul
4515eb1a0b [Fix] Update lazing loading of video loader backend (#30444)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
2025-12-11 10:14:57 +00:00
Andreas Karatzas
ed7af3178a [ROCm][CI] Attempt to fix the failures under a subgroup of the e2e the test group (#29358)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
2025-12-10 05:33:13 +00:00
Cyrus Leung
e83b7e379c Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199) 2025-12-07 00:00:22 -08:00
Cyrus Leung
27f4c2fd46 [Renderer] Separate out RendererConfig from ModelConfig (#30145)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-06 23:15:42 -08:00
Cyrus Leung
671427efbf [Model] Move multimodal_cpu_fields definition to field config (#30181)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-06 13:40:02 +00:00
Cyrus Leung
b286a311c2 [Chore] Deprecate merge_by_field_config arg (#30035)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-04 17:21:24 +00:00
knlnguyen1802
fc6acc88ca [Bugfix] Missing cached item in the MultiModalReceiverCache (#28525)
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: Chenguang Zheng <645327136@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-01 10:18:07 -08:00
Cyrus Leung
fe3398fab2 [Chore] Enable passing tokenizer=None into MM processor (#29724)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-29 06:25:10 -08:00
Cyrus Leung
34a984274e [Misc] Refactor tokenizer interface (#29693)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-29 04:02:21 -08:00
Mert Unsal
c625d7b1c6 [Bugfix] Fix O(n²) multimodal string prompt processing (#29667)
Signed-off-by: mertunsall <mertunsal1905@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-28 16:10:39 -08:00
Canlin Guo
fe25772aa9 [Bugfix] Handle broken frames in video loading (#29001)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: 凌葭 <lvjiang.lj@alibaba-inc.com>
Co-authored-by: 凌葭 <lvjiang.lj@alibaba-inc.com>
2025-11-20 04:38:12 +00:00
Kevin H. Luu
c64c0b78de [chore] Move the rest of wikimedia url to S3 (#28921)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-18 09:44:18 -08:00
Harry Mellor
8fcaaf6a16 Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Harry Mellor
b893d661b1 Fix per file ruff ignores related to simplification (#26259)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 20:31:53 +00:00
Harry Mellor
4e256cadc2 Remove all references to yapf as it's no longer used (#26251)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 09:18:11 -07:00
Harry Mellor
d6953beb91 Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Reza Barazesh
bc546f76a1 [CI] Move applicable tests to CPU (#24080)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-30 14:45:20 +01:00
Russell Bryant
3958b96bf5 Add option to restrict media domains (#25783)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
2025-09-27 01:23:52 +00:00
Cyrus Leung
6b0fcbbf43 [Misc] Simplify test_argsort_mm_positions (#25690)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-25 18:23:01 +00:00
Cyrus Leung
6488f3481b [Misc]] Move processing context to multimodal directory (#25548)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-24 08:15:00 +00:00
Cyrus Leung
babad6e5dd [Misc] Move DP for ViT code inside model executor dir (#25459)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-23 09:20:52 +00:00
Deboleina
5aeb925452 Multimodal - audio tests (#25285)
Signed-off-by: Debolina Roy <debroy@redhat.com>
2025-09-22 07:07:11 +08:00
Harry Mellor
c4afdb69cc Move MultiModalConfig from config/__init__.py to config/multimodal.py (#24659)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-15 17:43:16 +00:00
dongluw
a5b84f1cbf [Core] Shared memory based object store for Multimodal data caching and IPC (#20452)
Signed-off-by: donglu <donglu@cohere.com>
2025-09-12 07:54:17 -07:00
Isotr0py
bcbe2a4d9e [VLM] Optimize GLM4.5-V-style video processing to only decode necessary frames (#24161)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-11 09:44:34 -07:00
dsinghvi
70549c1245 [CI/Build] Serve images used by multimodal tests through local HTTP Server (#23907)
Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-09-03 16:13:11 +08:00
Didier Durand
d7e1e59972 [Doc]: fix typos in Python comments (#24093)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-09-02 21:05:45 -07:00
WeiQing Chen
a0e0efd6bd [Model] Support DP for ViT on Kimi-VL-A3B-Thinking-2506 (#23817)
Signed-off-by: Junhong <liujunhong11@huawei.com>
Signed-off-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Junhong <liujunhong11@huawei.com>
Co-authored-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-09-01 16:56:56 +00:00
Cyrus Leung
69244e67e6 [Core] Use key-only cache for BaseMultiModalProcessor (#23018)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-27 14:19:13 +08:00