Michael Goin
|
db5d0719e1
|
[Kernel] Add MXFP8 to Marlin GEMM/MoE and refactor Mxfp8LinearOp (#34664)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-04-01 09:41:42 -07:00 |
|
Lukas Geiger
|
4f6eed3bd4
|
[Core] Simplify multimodal masking (#34246)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2026-04-01 01:18:22 -07:00 |
|
wliao2
|
4dfad17ed1
|
replace cuda_device_count_stateless() to current_platform.device_count() (#37841)
Signed-off-by: Liao, Wei <wei.liao@intel.com>
Signed-off-by: wliao2 <wei.liao@intel.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-03-31 22:32:54 +08:00 |
|
wang.yuqi
|
719735d6c5
|
[CI Failure] pin colmodernvbert revision (#38612)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-03-31 10:54:54 +00:00 |
|
wang.yuqi
|
d9d21eb8e3
|
[Frontend][3/n] Improve pooling entrypoints | scoring. (#28631)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-03-31 07:52:00 +00:00 |
|
Micah Williamson
|
d9c7db18da
|
[ROCm][CI] Pin test_hybrid test to TRITON_ATTN on ROCm (#38381)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-03-30 20:26:46 +00:00 |
|
Benjamin Chislett
|
494636b29d
|
[Feat][Spec Decode] DFlash (#36847)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-03-30 15:03:15 -04:00 |
|
haosdent
|
a08b7733fd
|
[CI] Fix SPLADE pooler test broken by #38139 (#38495)
Signed-off-by: haosdent <haosdent@gmail.com>
|
2026-03-30 07:48:33 +00:00 |
|
haosdent
|
d39b8daf5f
|
[Feature] Add Qwen3-ForcedAligner support via token classification pooling (#35367)
Signed-off-by: haosdent <haosdent@gmail.com>
|
2026-03-29 00:27:52 +00:00 |
|
haosdent
|
b2bc736b12
|
[CI] Fix Ernie4.5-VL initialization test (#38429)
Signed-off-by: haosdent <haosdent@gmail.com>
|
2026-03-28 22:43:24 +08:00 |
|
Nicolò Lucchesi
|
44a6528028
|
[CI] Skip failing test (#38369)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2026-03-27 13:25:19 -07:00 |
|
Divakar Verma
|
b9dbc5c4ab
|
[Mamba][APC] Add test case to compare apc outputs (#34977)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2026-03-26 16:40:35 +00:00 |
|
wang.yuqi
|
dcdc145893
|
[CI] Reorganize scoring tests (#38207)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-03-26 12:07:01 +00:00 |
|
Ekagra Ranjan
|
7b54f60db0
|
[Cohere] Enable Cohere-Transcribe (#38120)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
|
2026-03-25 16:13:51 -07:00 |
|
Cyrus Leung
|
ba2f0acc2d
|
[Misc] Reorganize inputs (#35182)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-25 10:22:54 -07:00 |
|
Harry Mellor
|
d215d1efca
|
[Mypy] Better fixes for the mypy issues in vllm/config (#37902)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-25 06:14:43 -07:00 |
|
Nick Cao
|
935c46dd9b
|
[Model] Add Granite 4.0 1B speech to supported models (#38019)
Signed-off-by: Nick Cao <ncao@redhat.com>
|
2026-03-24 18:23:41 +00:00 |
|
Lasha Koroshinadze
|
e7767eccae
|
Fix AudioFlamingo3/MusicFlamingo HF parity and RoTE handling (#37643)
Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com>
|
2026-03-23 10:29:07 +08:00 |
|
Andreas Karatzas
|
c862481c02
|
[CI] Skip ISAAC multimodal tests due to broken upstream HF model weights (#37781)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-22 13:23:32 +08:00 |
|
Andreas Karatzas
|
0d50fa1db6
|
[ROCm][CI] Mark gemma3 as large GPU test to avoid OOM on MI250 (#37610)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-21 12:57:25 +08:00 |
|
Isotr0py
|
c7f98b4d0a
|
[Frontend] Remove librosa from audio dependency (#37058)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-21 11:36:15 +08:00 |
|
Andreas Karatzas
|
fb4e8bf442
|
[ROCm][CI] Fix accuracy for llama-nemotron-vl pooling tests (#37613)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-20 10:16:59 -07:00 |
|
Ilya Boytsov
|
8b6c6b9505
|
[Model] Add LFM2-ColBERT-350M support (#37528)
Signed-off-by: Ilya Boytsov <ilyaboytsov1805@gmail.com>
|
2026-03-20 14:57:57 +00:00 |
|
Harry Mellor
|
9f6d9dd371
|
Fix attribute error in isaac_patch_hf_runner (#37685)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-20 14:49:40 +00:00 |
|
Andreas Karatzas
|
5a4a179591
|
[ROCm][CI] Fix granite_speech test for gfx90a by selecting compatible attention backend (#37611)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-20 17:07:26 +08:00 |
|
Cyrus Leung
|
765e461065
|
[Bugfix] Fix Nemotron Parse loading (#37407)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-19 09:55:29 +00:00 |
|
Cyrus Leung
|
99267c23ca
|
[2/3] Refactor InternVL-based processors (#37324)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-18 22:22:19 +08:00 |
|
Andreas Karatzas
|
eaf7c9b976
|
[CI] Fix PaddleOCR-VL HF test failure due to create_causal_mask API rename (#37328)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-18 09:44:12 +00:00 |
|
Athrael Soju
|
c0745a851a
|
[Model] Add ColQwen3.5 4.5B support (#36887)
Signed-off-by: Athrael Soju <athrael.soju@gmail.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-03-17 21:17:02 +00:00 |
|
Ekagra Ranjan
|
b5ca9c3557
|
[Models] Cohere ASR (#35809)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
|
2026-03-17 21:04:17 +00:00 |
|
Cyrus Leung
|
51f0acda79
|
[Model] Remove unused handle_oov_mm_token (#37321)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-17 19:44:52 +00:00 |
|
Isotr0py
|
a836524d20
|
[Chore] Replace all base64 usages with faster pybase64 package (#37290)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-17 14:44:19 +00:00 |
|
Harry Mellor
|
ecfcdd2ce4
|
Fix Phi3 test that fails with Transformers v5 (#37298)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-17 14:29:24 +00:00 |
|
Cyrus Leung
|
f340324335
|
[1/2] Move InternVL-based processors (#37260)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-17 21:50:56 +08:00 |
|
EdalatiAli
|
e5b807607c
|
[Quant][Feature] Support online MXFP8 quantization for MoE and dense models (#35448)
Signed-off-by: EdalatiAli <aliedalati@cohere.com>
|
2026-03-16 18:07:39 -04:00 |
|
Raushan Turganbay
|
55e6d3d5c0
|
[Bugfix] Make siglip/clip compatible with transformers v5 (#37200)
Signed-off-by: raushan <raushan@huggingface.co>
|
2026-03-16 16:48:18 +00:00 |
|
Isotr0py
|
912fbe9555
|
[Bugfix] Fix Qwen2.5-Omni/Qwen3-Omni use_audio_in_video with multi-video inputs (#37147)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-16 08:56:06 +00:00 |
|
bigshanedogg
|
2390d44209
|
[Model] Add HyperCLOVAX-SEED-Think-14B language model support (#37107)
Signed-off-by: bigshanedogg <bigshane319@gmail.com>
|
2026-03-16 06:40:05 +00:00 |
|
Andreas Karatzas
|
d4c57863f7
|
[ROCm][CI] Fix engine teardown and text normalization to stabilize voxtral test (#37138)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-16 04:49:31 +00:00 |
|
Harry Mellor
|
0005d2a3c9
|
Use Transformers v5 WeightRenaming for Transformers modeling backend (#31545)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-13 20:49:08 +00:00 |
|
Isotr0py
|
abf61aaa8e
|
[Bugfix] Fix Qwen2.5-omni/Qwen3-omni mm_processor cache for audio_in_video request (#36800)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-13 18:16:05 +00:00 |
|
whyiug
|
1ce13cf992
|
[Model] Add support for BERT-like Chinese ERNIE pooling models (#36385)
Signed-off-by: whyiug <whyiug@hotmail.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-03-13 03:23:53 +00:00 |
|
Nikita
|
10f08dedfa
|
[Model] Add ColPali late interaction model for multi-modal retrieval (#36818)
Signed-off-by: Nikita Sukharev <kaonael@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-03-13 02:18:57 +00:00 |
|
Marc Sun
|
c973ecdead
|
[bnb] Skip moe + bnb test (#36896)
Signed-off-by: Marc Sun <marc@huggingface.co>
|
2026-03-12 18:03:25 +00:00 |
|
Kunshang Ji
|
53ec16a705
|
[Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145)
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-03-12 07:57:47 -07:00 |
|
István Ketykó
|
00726c74c9
|
[Bugfix][Model] Fix DeepSeek-OCR TensorSchema crash on empty images_crop (#36670)
Signed-off-by: István Ketykó <istvan.ketyko@gmail.com>
|
2026-03-12 15:35:54 +08:00 |
|
wang.yuqi
|
6ecabe4936
|
[CI Failure] Fix Language Models Test (Extended Pooling) daily CI Failure (#36761)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-03-12 12:22:05 +08:00 |
|
Harry Mellor
|
65986db6ba
|
Make Gemma and Gemma 2 accept inputs_embeds like Gemma 3 (#36787)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-11 18:12:43 +00:00 |
|
Harry Mellor
|
5efa206a8c
|
Fix ExaoneMoeMTP test that never ran in Transformers v4 (#36792)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-11 17:10:23 +00:00 |
|
Cyrus Leung
|
196802dfa6
|
[Misc] Clean up renderers (#36770)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-11 16:39:29 +00:00 |
|