Patrick von Platen
3f3f89529d
[Voxtral] Add new streaming arch ( #32861 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-23 12:41:52 +01:00
Nicolò Lucchesi
ea6102b85d
[Bugfix] Fix Whisper/encoder-decoder GPU memory leak ( #32789 )
...
Signed-off-by: NickLucche <nlucches@redhat.com >
2026-01-22 10:50:37 +00:00
Alex Brooks
27b81e010d
[Bugfix] Fix Granite Vision / Don't use Siglip Pooling Head Nested Models by Default ( #32299 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com >
2026-01-21 11:11:52 +08:00
Andreas Karatzas
9d0d7f48d5
[ROCm][CI] Handle missing vision_config in Isaac model attention patch ( #32281 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-01-14 07:21:26 +00:00
Andreas Karatzas
5e714f7ff4
[ROCm][CI] Fix HuggingFace flash_attention_2 accuracy issue in Isaac vision encoder ( #32233 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-01-12 22:33:59 -08:00
Akshat Shrivastava
e45946bd91
feature/issac 0.2 ( #31550 )
...
Signed-off-by: Roger Wang <hey@rogerw.io >
Co-authored-by: Roger Wang <hey@rogerw.io >
2026-01-10 03:18:05 +00:00
Matthew Bonanni
2612ba9285
[1/N][Attention] Restructure attention: move files ( #31916 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com >
2026-01-09 13:10:24 -08:00
amitz-nv
ee21291825
[Model] Nemotron Parse 1.1 Support ( #30864 )
...
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com >
Signed-off-by: Michael Goin <mgoin64@gmail.com >
Co-authored-by: Michael Goin <mgoin64@gmail.com >
2026-01-05 13:00:14 -08:00
Isotr0py
51e38a8e30
[Misc] Enable Paligemma's PrefixLM attention mask computation ( #31725 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2026-01-06 03:31:49 +08:00
Isotr0py
6aa5b18e1d
[v1] Add encoder-only/cross attention support to Triton Attention backend ( #31406 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2026-01-06 00:00:23 +08:00
oscardev256
b7165d53c6
Feature/isaac 0.1 ( #28367 )
...
Signed-off-by: oscardev256 <42308241+oscardev256@users.noreply.github.com >
Signed-off-by: Oscar Gonzalez <ogonzal6@alumni.jh.edu >
Signed-off-by: Yang <lymailforjob@gmail.com >
Co-authored-by: Yang <lymailforjob@gmail.com >
2025-12-25 18:49:11 -08:00
Cyrus Leung
aa3868ecfe
[Chore] Remove unused noqas ( #31263 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-12-24 05:38:46 -08:00
Andreas Karatzas
e42894f5b5
[ROCm][CI][Bugfix] Fix Siglip2 rotary embedding dispatch and InternVL video test tolerance ( #31235 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2025-12-24 02:56:58 +00:00
Cyrus Leung
bb62dda2c3
[Misc] Introduce encode_*_url utility function ( #31208 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-12-23 13:45:21 +00:00
Patrick von Platen
3faa8bee57
adapt voxtral ( #31095 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com >
2025-12-23 05:31:55 -08:00
Andreas Karatzas
7b43db210c
[ROCm][CI][Bugfix] Multi-Modal Model Support Fixes and Attention Backend Improvements ( #30270 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2025-12-19 02:17:27 +00:00
Isotr0py
74a1ac38b0
[v1] Add PrefixLM support to TritonAttention backend ( #30386 )
2025-12-17 16:05:24 -08:00
Matthew Bonanni
7eb6cb6c18
[Attention] Update tests to remove deprecated env vars ( #30563 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com >
2025-12-17 09:49:59 -08:00
Isotr0py
4de08ad698
[CI/Build] Skip broken ViT backend functionality test tempoarily ( #30782 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2025-12-16 06:45:25 -08:00
Shanshan Shen
87b4d1557d
[CustomOp][MM] Extract MMEncoderAttention as CustomOp and replace the backend of QwenVisionAttention with it. ( #30125 )
...
Signed-off-by: shen-shanshan <467638484@qq.com >
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com >
2025-12-15 11:13:32 +08:00
Lasha Koroshinadze
3a20450d31
Add AudioFlamingo3 model support ( #30539 )
...
Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com >
Signed-off-by: Lasha Koroshinadze <26011196+lashahub@users.noreply.github.com >
Co-authored-by: Isotr0py <2037008807@qq.com >
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com >
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk >
2025-12-14 02:14:55 -08:00
Cyrus Leung
64251f48df
[Chore] Adjust tokenizer import to avoid circular imports ( #30601 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-12-13 04:42:39 -08:00
Nicolò Lucchesi
57e9bf1864
[CI] Whisper logprobs tests ( #30504 )
...
Signed-off-by: NickLucche <nlucches@redhat.com >
2025-12-13 10:49:11 +08:00
Nicolò Lucchesi
c756fb6781
[Core] Whisper enable FULL_DECODE_ONLY CudaGraph ( #30072 )
...
Signed-off-by: NickLucche <nlucches@redhat.com >
2025-12-10 06:14:24 -08:00
Aditya Tewari
cebda2a4af
[CPU] Support for Whisper ( #30062 )
...
Signed-off-by: Aditya Tewari <aditya.tewari@arm.com >
2025-12-10 04:58:42 -08:00
Isotr0py
b952f4d3c3
[v1] Add PrefixLM support to FlexAttention backend ( #27938 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2025-12-07 15:51:36 +00:00
Russell Bryant
3633035a3f
[Misc] Rename CohereForAI references to CohereLabs ( #30147 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com >
2025-12-05 19:41:40 +00:00
Harry Mellor
9998ea5b57
Delete HF version of Phi 4 MM ( #30049 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-12-04 13:44:50 +00:00
Andreas Karatzas
e96a6a6dca
[ROCm][CI][Bugfix] Fixing the Multi-Modal Models Test (Extended) 1 group ( #30013 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2025-12-04 11:00:16 +00:00
Isotr0py
cc4e296ea6
[CI/Build] Avoid duplicate empty inputs test for common multimodal generation tests ( #29907 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2025-12-03 10:27:36 +00:00
Andreas Karatzas
506ed87e87
[ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues ( #29909 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2025-12-03 10:36:49 +08:00
Huamin Li
83805a6078
[CI] Skip paddleocr_vl for transformer 4.57.3 ( #29758 )
...
Signed-off-by: Huamin Li <3ericli@gmail.com >
2025-12-01 04:38:06 +00:00
Cyrus Leung
34a984274e
[Misc] Refactor tokenizer interface ( #29693 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-11-29 04:02:21 -08:00
Lukas Geiger
a9705a290a
[Model][QwenVL] Replace torch.repeat_interleave with faster np.repeat ( #28964 )
...
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com >
2025-11-19 22:04:23 -08:00
Luciano Martins
c2612371ad
[Model] Add Gemma3 GGUF multimodal support ( #27772 )
...
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com >
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2025-11-18 08:56:29 -08:00
Laith Sakka
2e0ad629b0
Avoid bytecode hook and simplify TorchCompileWrapperWithCustomDipatch ( #25110 )
...
Signed-off-by: Laith Sakka <lsakka@meta.com >
2025-11-14 14:11:10 -08:00
dongbo910220
c934caee88
[Fix] improve aspect ratio in dummy image generation and add common VLM tests for PaddleOCR-VL ( #28711 )
...
Signed-off-by: dongbo910220 <1275604947@qq.com >
2025-11-14 16:07:20 +00:00
Roger Wang
d3387750f1
[Misc] Turn off encoder torch compile by default ( #28634 )
...
Signed-off-by: Roger Wang <hey@rogerw.io >
2025-11-13 08:38:08 -08:00
Harry Mellor
a39dd7bb06
[CI] Skip "Multi-Modal Models Test (Extended) 3" test that's broken in current Transformers ( #28559 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-11-12 19:38:13 +00:00
Harry Mellor
a742134cc5
Remove deprecated fields from CompilationConfig ( #27593 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-11-12 16:10:28 +00:00
Andreas Karatzas
9f0247cfa4
VLLM_USE_TRITON_FLASH_ATTN V0 variable deprecation (#27611 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
Signed-off-by: Andreas Karatzas <Andreas.Karatzas@amd.com >
2025-11-11 18:34:36 -08:00
TJian
e2347dbf58
[Bugfix] [Model] Missing MRoPE function definition from KeyeForConditionalGeneration ( #27895 )
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com >
2025-11-01 13:45:23 +08:00
Cyrus Leung
879a06579e
[CI/Build] Bump transformers version ( #27528 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-10-31 22:11:07 -07:00
Isotr0py
ad3ec89532
[VLM] Add Qwen3-VL generation test ( #25185 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn >
Signed-off-by: Roger Wang <hey@rogerw.io >
Co-authored-by: Roger Wang <hey@rogerw.io >
2025-10-29 12:19:37 +00:00
Cyrus Leung
fe2016de2d
[CI/Build] Remove unnecessary flags from test registry ( #27353 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-10-23 14:42:40 +00:00
Luciano Martins
e05a6754a8
[Model] Revert PR #26715 : Restore custom PaliGemma and Gemma3-MM impl… ( #27309 )
...
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com >
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com >
2025-10-22 10:05:34 -07:00
Russell Bryant
58fab50d82
[Frontend] Require flag for loading text and image embeds ( #27204 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-10-22 15:52:02 +00:00
Cyrus Leung
d31f7844f8
[Misc] Move utils to avoid conflicts with stdlib, and move tests ( #27169 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-10-19 05:20:55 -07:00
Cyrus Leung
8c017b3490
[Model] Always use Transformers backend for PaliGemma and Gemma3-MM ( #26715 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-10-17 05:03:35 +00:00
Cyrus Leung
d2740fafbf
[Chore] Separate out vllm.utils.collections ( #26990 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-10-16 08:35:35 +00:00