Isotr0py
|
912fbe9555
|
[Bugfix] Fix Qwen2.5-Omni/Qwen3-Omni use_audio_in_video with multi-video inputs (#37147)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-16 08:56:06 +00:00 |
|
Isotr0py
|
abf61aaa8e
|
[Bugfix] Fix Qwen2.5-omni/Qwen3-omni mm_processor cache for audio_in_video request (#36800)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-13 18:16:05 +00:00 |
|
István Ketykó
|
00726c74c9
|
[Bugfix][Model] Fix DeepSeek-OCR TensorSchema crash on empty images_crop (#36670)
Signed-off-by: István Ketykó <istvan.ketyko@gmail.com>
|
2026-03-12 15:35:54 +08:00 |
|
Cyrus Leung
|
196802dfa6
|
[Misc] Clean up renderers (#36770)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-11 16:39:29 +00:00 |
|
Weiguang Li
|
724759684c
|
[Bugfix] Fix Qwen3-VL timestamp mismatch when using num_frames without fps (#36136)
Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-11 03:13:06 -07:00 |
|
tunglinwood
|
42fadebecb
|
[Model] Add support for moonshotai/Kimi-Audio-7B-Instruct (#36127)
Signed-off-by: tunglinwood <tunglinwood@gmail.com>
Signed-off-by: tunglinwood <tomwu.tunglin@gmail.com>
Signed-off-by: tunglinwood <113751333+tunglinwood@users.noreply.github.com>
|
2026-03-10 21:24:48 -07:00 |
|
Matthew Bonanni
|
77a73458e3
|
Reapply [Attention] Refactor check_and_update_config (#35122)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-03-09 07:17:14 -07:00 |
|
Cyrus Leung
|
6dd302653f
|
[Misc] Rename group_mm_kwargs_by_modality -> group_and_batch_mm_kwargs (#36158)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-06 12:32:48 +08:00 |
|
Isotr0py
|
7d8bbe6f42
|
[CI/Build] Automatically patch video metadata for multimodal processor test (#35822)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-03 04:27:45 +00:00 |
|
Roger Wang
|
1b82b433fc
|
[Bugfix] Fix MM processor test for Qwen3.5 (#35797)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2026-03-02 23:05:08 +00:00 |
|
Yueqian Lin
|
e8249378e4
|
[Bugfix] Fix check_interleaved_audio_video false positive for batched non-interleaved requests (#35487)
Signed-off-by: linyueqian <linyueqian@outlook.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2026-02-27 06:48:25 -08:00 |
|
Cyrus Leung
|
845ee348ef
|
[Misc] Standardize handling of mm_processor_kwargs.size (#35284)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-26 13:05:46 +00:00 |
|
Yueqian Lin
|
c0615a296d
|
[Bugfix] Fix Qwen2.5-Omni and Qwen3-Omni mixed-modality embed regression (#35368)
Signed-off-by: linyueqian <linyueqian@outlook.com>
|
2026-02-26 11:58:23 +00:00 |
|
Isotr0py
|
d12d201409
|
[Bugfix] Fix failing FunASR processor test (#35111)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-24 04:13:45 -08:00 |
|
eustlb
|
b3ad37c5db
|
[glm-asr] change defaults dummy audio size (#35108)
Signed-off-by: Eustache Le Bihan <eulebihan@gmail.com>
|
2026-02-24 04:13:33 -08:00 |
|
Cyrus Leung
|
392645454b
|
[Refactor] Decouple TimingContext from InputProcessingContext (#35083)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-23 14:15:50 +00:00 |
|
Neil Schemenauer
|
54e2f83d0a
|
[Feature] Lazy import for the "mistral" tokenizer module. (#34651)
Signed-off-by: Neil Schemenauer <nas@arctrix.com>
|
2026-02-23 00:43:01 -08:00 |
|
Cyrus Leung
|
987506bca6
|
[Refactor] Simplify dummy data generation (#35025)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-22 20:55:27 -08:00 |
|
Lucas Wilkinson
|
aaefc58ee0
|
[CI] Revert PRs 34818 and 33600 (#34979)
|
2026-02-20 13:25:50 -08:00 |
|
Matthew Bonanni
|
662205d34e
|
[Bugfix] Fix Basic Models Test (#34818)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-02-19 14:49:07 -08:00 |
|
Cyrus Leung
|
574fe75245
|
[Renderer] Move InputPreprocessor into Renderer (2/2) (#34560)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-17 05:29:01 -08:00 |
|
Isotr0py
|
91ac5d9bfd
|
[CI/Build] Enable tests for recent day-0 new models (#34585)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-15 18:17:04 -08:00 |
|
Andreas Karatzas
|
de42abb366
|
[CI] Heavy refactoring of Voxtral multimodal audio model tests (#34294)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-02-13 20:04:29 -08:00 |
|
Cyrus Leung
|
372b2e762a
|
[Bugfix] Standardize getting number of image patches/tokens (#34358)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-12 20:47:01 -08:00 |
|
Patrick von Platen
|
1100a97621
|
[Voxstral Realtime] Enable tests (#33803)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
|
2026-02-12 09:43:24 -08:00 |
|
Raushan Turganbay
|
527ca32197
|
[Bugfix] Fix more multimodal tests for transformers V5 (#34334)
Signed-off-by: raushan <raushan@huggingface.co>
|
2026-02-11 22:02:05 +01:00 |
|
Cyrus Leung
|
48312e579a
|
[Misc] Make PlaceholderRange.get_num_embeds a method (#34035)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-07 05:30:17 +00:00 |
|
Isotr0py
|
192ad4648b
|
[Bugfix] Fix interns1-pro initialization and PP (#33793)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-04 17:54:45 +00:00 |
|
Isotr0py
|
4061dcf4c5
|
[Bugfix] Enable Kimi k25 processor test (#33562)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-02 14:25:25 +00:00 |
|
Cyrus Leung
|
88c3e114d8
|
[Refactor] Move MM data parsing outside processor (#33408)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-31 16:46:14 +00:00 |
|
Harry Mellor
|
67239c4c42
|
Fix encoder-decoder model disabling mm processor cache (#33236)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-01-30 16:30:10 +00:00 |
|
Cyrus Leung
|
c6e7404cc5
|
[Multimodal] Simplify MM input definitions (#33331)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-29 13:32:04 +00:00 |
|
Isotr0py
|
3a92c6f3b5
|
[Misc] Cleanup Kimi-K2.5's vision chunk modality entrypoints (#33157)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-29 09:46:02 +00:00 |
|
Yuxuan Zhang
|
bb17e8f11c
|
[GLM-OCR] GLM-OCR with MTP Support (#33005)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-26 06:24:43 -08:00 |
|
JJJYmmm
|
7e67df5570
|
[Bugfix] fix encoder cache hang in Qwen3VL (#32684)
Signed-off-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-25 05:17:31 +00:00 |
|
Cyrus Leung
|
90db5b31e4
|
[Refactor] Move top-level dummy data generation to registry (#32310)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-14 02:17:46 -08:00 |
|
sangho.lee
|
7e6f123810
|
Add Molmo2 multimodal model support (#30997)
Signed-off-by: sanghol <sanghol@allenai.org>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-14 15:33:09 +08:00 |
|
Cyrus Leung
|
252c011012
|
[Refactor] Remove MultiModalProfiler (#32254)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-13 15:10:20 +00:00 |
|
Jeremy Teboul
|
07286ec5a6
|
[Bugfix] Fix integer overflow in Gemma3n audio processing (#31657)
Signed-off-by: Jeremy Teboul <jeremyte@meta.com>
|
2026-01-10 17:52:53 +08:00 |
|
Jeremy Teboul
|
657e9c0e18
|
[Fix] Introduce audio channels spec (#31595)
Signed-off-by: Jeremy Teboul <jeremyte@meta.com>
|
2026-01-09 19:34:51 +00:00 |
|
Lucas Wilkinson
|
6cdf015c3c
|
[Misc] Fix `Current vLLM config is not set.` warnings, assert to avoid issues in the future (#31747)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-01-08 15:20:49 -08:00 |
|
Andreas Karatzas
|
8dd2419fa9
|
[CI] Skip Qwen-VL in multimodal processing tests due to flaky external dependency (#31932)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-01-08 02:58:01 +00:00 |
|
amitz-nv
|
ee21291825
|
[Model] Nemotron Parse 1.1 Support (#30864)
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-01-05 13:00:14 -08:00 |
|
jeremyteboul
|
97a01308e9
|
Improve HF qwen3_omni: preserve audio_sample_rate in kwargs restructuring (#29255)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
|
2026-01-03 04:31:09 +00:00 |
|
baonudesifeizhai
|
d722e9e614
|
Add GLM-ASR multimodal support (#31436)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-31 23:12:24 +08:00 |
|
Isotr0py
|
3d024985ab
|
[CI/Build] Ignore max transformers version for more common tests (#31401)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-27 13:06:26 +00:00 |
|
SongHe
|
2d6001f491
|
[Model][Ernie4.5-VL] Support video metadata for timestamp rendering (#31274)
Signed-off-by: dengsonghe <dengsonghe@baidu.com>
Co-authored-by: dengsonghe <dengsonghe@baidu.com>
|
2025-12-25 14:07:15 +00:00 |
|
Kevin McKay
|
8c084de59d
|
[Misc] Fix spelling typos in comments (#31114)
Signed-off-by: c0de128 <kevin.mckay@outlook.com>
|
2025-12-21 21:13:14 -08:00 |
|
Roger Wang
|
f5f51e5931
|
[Core][MM] Optimize encoder cache manager by operating with embeddings only (#30475)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Sun Kim <sunytokki@gmail.com>
|
2025-12-16 14:18:17 -08:00 |
|
Lasha Koroshinadze
|
3a20450d31
|
Add AudioFlamingo3 model support (#30539)
Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com>
Signed-off-by: Lasha Koroshinadze <26011196+lashahub@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-14 02:14:55 -08:00 |
|