Commit Graph

53 Commits

Author SHA1 Message Date
Alex Brooks
bd2659a566 Increase Flexibility for OOV Multimodal Token Handling (#34858)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
2026-03-08 20:30:49 -07:00
Avery Miao
e998fa76b9 [BUGFIX]Fix Qwen-Omni models audio max_token_per_item estimation error leading to encoder_cache_size is 0 (#35994)
Signed-off-by: Miao, Avery <avery.miao@intel.com>
2026-03-05 09:16:29 -08:00
AllenDou
3ee68590c7 refactor funasr model. (#36108)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-05 08:07:37 -08:00
Ye (Charlotte) Qi
fa6a6be519 [Bugfix] Fix missing sequence_lengths in qwen3_omni_moe_thinker (#35741)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
2026-03-02 21:11:56 +00:00
Yueqian Lin
c0615a296d [Bugfix] Fix Qwen2.5-Omni and Qwen3-Omni mixed-modality embed regression (#35368)
Signed-off-by: linyueqian <linyueqian@outlook.com>
2026-02-26 11:58:23 +00:00
Isotr0py
71cd89264f [MM Encoder] Add Triton ViT attention backend (#32183)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-02-15 06:32:47 -08:00
Isotr0py
0ab06100f4 [Multimodal] Expose mm_processor_kwargs for DummyInputsBuilder (#34330)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-02-11 09:37:40 -08:00
Muhammad Hashmi
535de06cb1 [Model] Add transcription support for Qwen3-Omni (#29828)
Signed-off-by: Muhammad Hashmi <mhashmi@berkeley.edu>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: NickLucche <nlucches@redhat.com>
2026-02-04 21:17:47 +00:00
Yueqian Lin
f8516a1ab9 [Bugfix][Model] Fix audio-in-video support for Qwen2.5-Omni and Qwen3-Omni (#33605)
Signed-off-by: linyueqian <linyueqian@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2026-02-04 12:15:29 +00:00
Shanshan Shen
9fb27dd3b3 [MM] Align the prefix of MMEncoderAttention with Attention (#33750)
Signed-off-by: shen-shanshan <467638484@qq.com>
2026-02-04 04:07:30 +00:00
Shanshan Shen
5c4f2dd6ef [MM] Pass prefix parameter to MMEncoderAttention (#33674)
Signed-off-by: shen-shanshan <467638484@qq.com>
2026-02-03 06:47:41 -08:00
JartX
cd86fff38f [BUGFIX] Fix hipErrorIllegalState in Qwen3-Omni during startup profiling allow inference Omni on ROCM (#33077)
Signed-off-by: JartX <sagformas@epdcenter.es>
2026-02-01 13:36:25 +00:00
Cyrus Leung
dcd80206b7 [Chore] Update type annotation of input_ids in model forward (#33063)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-26 06:02:10 -08:00
Itay Etelis
6ca2c91b96 [Model] Use mm_position to compute mrope positions for Qwen3-Omni (#33010)
Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
Co-authored-by: Itay Etelis <itay.etelis@ibm.com>
2026-01-26 13:48:07 +00:00
Isotr0py
9ad7f89f55 [Models]: Make Multimodal config implicit in ViT implementation (#31972)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-24 20:34:26 +08:00
Cyrus Leung
193069d129 [5/N] Initialize MM components in context managers (Q-Z) (#32695)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-20 19:10:23 +00:00
Cyrus Leung
9ea07b41da [1/N] Reorganize multimodal processing code (#32327)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-14 15:25:31 +00:00
Roger Wang
b8199f6049 [Model] Re-implement Qwen3Omni Audio Encoder (#32167)
Signed-off-by: Roger Wang <hey@rogerw.io>
2026-01-14 15:40:30 +08:00
Matthew Bonanni
2612ba9285 [1/N][Attention] Restructure attention: move files (#31916)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2026-01-09 13:10:24 -08:00
Jzz1943
2c1a4f2488 [Bugfix]: avoid overriding audio/text kwargs (Qwen3-Omni) (#31790)
Signed-off-by: Zhongze Jiang <jiangzhongze.jzz@ant-intl.com>
2026-01-06 12:59:17 +00:00
Cyrus Leung
da71d44410 [Doc] Show that use_audio_in_video is supported in docs (#30837)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-05 23:27:19 -08:00
jeremyteboul
97a01308e9 Improve HF qwen3_omni: preserve audio_sample_rate in kwargs restructuring (#29255)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
2026-01-03 04:31:09 +00:00
Xiong Wang
bb24592d13 [Qwen3-Omni] fixed _get_feat_extract_output_lengths function (#31007)
Signed-off-by: Xiong Wang <wangxiongts@163.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-12-23 21:33:54 -08:00
Kevin McKay
14c3e6ade3 [Misc] Fix spelling typos in model comments (#31117)
Signed-off-by: c0de128 <kevin.mckay@outlook.com>
2025-12-21 21:14:14 -08:00
Shanshan Shen
87b4d1557d [CustomOp][MM] Extract MMEncoderAttention as CustomOp and replace the backend of QwenVisionAttention with it. (#30125)
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-12-15 11:13:32 +08:00
Harry Mellor
cf3eacfe58 Standardise get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim (#30389)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-11 20:45:23 +00:00
Cyrus Leung
c46b932df2 [Chore] Deprecate SupportsMultiModal.merge_by_field_config (#30170)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-06 07:57:28 +00:00
TJian
3f1b03739a [ROCm] [Bugfix] compute_attn_mask_seqlen for qwen3 omni (#29974)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-12-04 08:20:24 +00:00
Navanit Dubey
a2b053dc85 feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE (#29896)
Signed-off-by: navanit-git <navanitdubey@gmail.com>
2025-12-02 19:28:35 +00:00
Mingyuan Ma
460d8bbf2d Remove upstream fa checks (#29471)
Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-28 05:52:42 -08:00
Chenheli Hua
839c6b7b72 [Multimodal][Qwen3 Omni] Make Qwen3 Omni work with audio-in-video inputs in V1 engine. (#27721)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-24 19:24:37 +00:00
Roger Wang
0ff70821c9 [Core] Deprecate xformers (#29262)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Harry Mellor
a8b70304d6 Update rope_scaling to rope_parameters in preparation for Transformers v5 (#28542)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-19 09:06:36 -08:00
Lukas Geiger
3d4e7d34be [Model][QwenVL] Simplify cos/sin rotary embedding indexing (#28962)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-11-19 05:43:01 +00:00
Canlin Guo
b9489f51e1 [Model][Perf] Use cos and sin cache in QwenVL (#28798)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
2025-11-18 11:51:54 +00:00
Shanshan Shen
41b92f7d38 [Model][MM] Extract conv layer as CustomOp (#28455)
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-14 19:16:13 +08:00
Harry Mellor
97d1c99302 Rename clashing method names for vLLM model protocol (#27583)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-12 19:14:33 -08:00
Cyrus Leung
afffd3cc8a [Model] Pass mm_features directly into get_mrope_input_positions (#28399)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-11 21:14:48 +08:00
Matthew Bonanni
b30dfa03c5 [Attention] Refactor CUDA attention backend selection logic (#24794)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-11-11 07:40:44 -05:00
Cyrus Leung
d0e186c16f [V0 Deprecation] Remove unused context_len and seq_len from M-RoPE (#28395)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-11 00:30:06 +08:00
Cyrus Leung
853a8eb53b [Bugfix] Fix Qwen Omni audio inference (#27920)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-02 05:06:05 +00:00
Lukas Geiger
0d8161b075 [Model] Fix Qwen3VL and Qwen3Omni after torch.compile changes (#27705)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-29 05:28:20 +00:00
Cyrus Leung
cbd5e07a51 [Model] Use merge_by_field_config for MM models (Qwen series) (#27546)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-27 05:38:05 +00:00
Cyrus Leung
66a168a197 [CI/Build] Refactor processing tests (#27470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-25 16:14:30 +00:00
Isotr0py
42efe609ba [MM][Bugfix] Replace PatchEmbed's conv3d to linear layer (#27418)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-24 07:32:47 +00:00
Cyrus Leung
14e2f1231e [Bugfix] Make get_mrope_input_positions instance methods (#27342)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-22 08:38:34 -07:00
Roger Wang
c3a2c6ac5f [MM][Core] Decouple ViT backend from LM backend (#27061)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-10-21 00:30:10 -07:00
Cyrus Leung
f93e348010 [Misc] Remove isort and yapf ignores (#26888)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-15 12:09:03 +00:00
Isotr0py
8c851f6d04 [Bugfix] Fix qwen3-omni audio truncation issue (#26815)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-15 05:38:36 +00:00
Cyrus Leung
d2f816d6ff [Bugfix] Standardize merging multimodal embeddings (#26771)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-14 09:36:21 +00:00