Commit Graph

1092 Commits

Author SHA1 Message Date
EdalatiAli
e5b807607c [Quant][Feature] Support online MXFP8 quantization for MoE and dense models (#35448)
Signed-off-by: EdalatiAli <aliedalati@cohere.com>
2026-03-16 18:07:39 -04:00
Raushan Turganbay
55e6d3d5c0 [Bugfix] Make siglip/clip compatible with transformers v5 (#37200)
Signed-off-by: raushan <raushan@huggingface.co>
2026-03-16 16:48:18 +00:00
Isotr0py
912fbe9555 [Bugfix] Fix Qwen2.5-Omni/Qwen3-Omni use_audio_in_video with multi-video inputs (#37147)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-16 08:56:06 +00:00
bigshanedogg
2390d44209 [Model] Add HyperCLOVAX-SEED-Think-14B language model support (#37107)
Signed-off-by: bigshanedogg <bigshane319@gmail.com>
2026-03-16 06:40:05 +00:00
Andreas Karatzas
d4c57863f7 [ROCm][CI] Fix engine teardown and text normalization to stabilize voxtral test (#37138)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-16 04:49:31 +00:00
Harry Mellor
0005d2a3c9 Use Transformers v5 WeightRenaming for Transformers modeling backend (#31545)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-13 20:49:08 +00:00
Isotr0py
abf61aaa8e [Bugfix] Fix Qwen2.5-omni/Qwen3-omni mm_processor cache for audio_in_video request (#36800)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-13 18:16:05 +00:00
whyiug
1ce13cf992 [Model] Add support for BERT-like Chinese ERNIE pooling models (#36385)
Signed-off-by: whyiug <whyiug@hotmail.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-03-13 03:23:53 +00:00
Nikita
10f08dedfa [Model] Add ColPali late interaction model for multi-modal retrieval (#36818)
Signed-off-by: Nikita Sukharev <kaonael@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-03-13 02:18:57 +00:00
Marc Sun
c973ecdead [bnb] Skip moe + bnb test (#36896)
Signed-off-by: Marc Sun <marc@huggingface.co>
2026-03-12 18:03:25 +00:00
Kunshang Ji
53ec16a705 [Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145)
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-12 07:57:47 -07:00
István Ketykó
00726c74c9 [Bugfix][Model] Fix DeepSeek-OCR TensorSchema crash on empty images_crop (#36670)
Signed-off-by: István Ketykó <istvan.ketyko@gmail.com>
2026-03-12 15:35:54 +08:00
wang.yuqi
6ecabe4936 [CI Failure] Fix Language Models Test (Extended Pooling) daily CI Failure (#36761)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-03-12 12:22:05 +08:00
Harry Mellor
65986db6ba Make Gemma and Gemma 2 accept inputs_embeds like Gemma 3 (#36787)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-11 18:12:43 +00:00
Harry Mellor
5efa206a8c Fix ExaoneMoeMTP test that never ran in Transformers v4 (#36792)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-11 17:10:23 +00:00
Cyrus Leung
196802dfa6 [Misc] Clean up renderers (#36770)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-03-11 16:39:29 +00:00
Jhao-Ting Chen
5573894737 Kimi k2.5 MLA based eagle3 (#36361)
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
Co-authored-by: Izzy Putterman <iputterman@nvidia.com>
2026-03-11 11:36:11 -04:00
Harry Mellor
d5816c8c2f Fix tied weights in weight mapping test for Transformers v5 (#36788)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-11 15:10:26 +00:00
Weiguang Li
724759684c [Bugfix] Fix Qwen3-VL timestamp mismatch when using num_frames without fps (#36136)
Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 03:13:06 -07:00
Angela Yi
13e79fc811 [ci] Update rtol for test_classification (#36556)
Signed-off-by: angelayi <yiangela7@gmail.com>
Co-authored-by: Richard Zou <zou3519@users.noreply.github.com>
2026-03-11 03:08:16 -07:00
tunglinwood
42fadebecb [Model] Add support for moonshotai/Kimi-Audio-7B-Instruct (#36127)
Signed-off-by: tunglinwood <tunglinwood@gmail.com>
Signed-off-by: tunglinwood <tomwu.tunglin@gmail.com>
Signed-off-by: tunglinwood <113751333+tunglinwood@users.noreply.github.com>
2026-03-10 21:24:48 -07:00
Nick Hill
65b2f405dc [Core] Simplify core kv-cache blocks initialization logic (#36521)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
2026-03-10 20:20:02 +00:00
wang.yuqi
a3189a08b0 [Model] Consolidate score logic by introduce score_type (#36479)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-03-10 13:32:25 +00:00
Harry Mellor
195c997203 Fix LFM2 MoE test for Transformers v5 (#36534)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-09 22:29:17 -07:00
Hojin Yang
0836be3b03 [Model] Add HyperCLOVAX-SEED-Think-32B vision-language model support (#31471)
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-03-10 10:59:19 +08:00
Micah Williamson
4ff9b045fe [ROCm][CI] Prep Tests For Change To ROCM_ATTN As New Default Backend On ROCm (#36025)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2026-03-09 13:27:55 -05:00
Andreas Karatzas
1e0f917b34 [ROCm][CI] Fix logprob divergence for TitanML/tiny-mixtral under AITER rms_norm (#36101)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-09 12:07:44 -05:00
Matthew Bonanni
77a73458e3 Reapply [Attention] Refactor check_and_update_config (#35122)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2026-03-09 07:17:14 -07:00
Micah Williamson
fc4657756f [ROCm][CI] Enable AITER for failing test_gpt_oss test case on MI355 (#36174) 2026-03-07 13:50:17 -08:00
rahul-sarvam
85f50eb41f Adding support to Sarvam's MoE models (#33942)
Signed-off-by: rahul-sarvam <140298821+rahul-sarvam@users.noreply.github.com>
2026-03-08 01:16:24 +08:00
Isotr0py
1d0c0d209c [Misc] Lazy import registered processors (#36024)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Roger Wang <hey@rogerw.io>
2026-03-06 06:06:45 -08:00
Alex Brooks
10f4db4dbe [Frontend] Add Support for MM Encoder/Decoder Beam Search (Offline) (#36153)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-03-06 01:16:56 -08:00
Andreas Karatzas
a1ffa56a1e [CI] Fix bge-m3 similarity reference values after *Defination* typo fix (#36208)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-06 05:07:29 +00:00
Cyrus Leung
6dd302653f [Misc] Rename group_mm_kwargs_by_modality -> group_and_batch_mm_kwargs (#36158)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-03-06 12:32:48 +08:00
Yanhong Li
a911f4dd20 [Model] Add support for OLMo Hybrid (#32550) 2026-03-05 14:51:06 -05:00
Jiayi Yan
6a895197fa [Bugfix][CI] fix typos (#34934)
Signed-off-by: 1195343015 <1195343015@qq.com>
Signed-off-by: Jiayi Yan <66017932+1195343015@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-05 17:05:46 +00:00
Kunshang Ji
66a2209645 [Hardware] Replace torch.cuda.synchronize() api with torch.accelerator.synchronize (#36085)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-05 10:36:39 +00:00
Isotr0py
21eb2c3372 [Chore] Correct MTP models test registry ordering (#36115)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-05 08:55:04 +00:00
AllenDou
c1d963403c [model] support FireRedASR2 (#35727)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-03 19:41:30 -08:00
Micah Williamson
e7213003cb [ROCm][CI] Fix TP size issue for test_gpt_oss (#35887)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2026-03-03 20:57:34 +00:00
wang.yuqi
fd4a90f337 [CI] And PPL test for Qwen3.5. (#35853)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-03-03 13:15:51 +00:00
Micah Williamson
8b9e8b7454 [ROCm][CI] Fix Assertion Logic For test_gpt_oss (#35806)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2026-03-03 05:08:04 +00:00
Isotr0py
7d8bbe6f42 [CI/Build] Automatically patch video metadata for multimodal processor test (#35822)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-03 04:27:45 +00:00
Jakub Zakrzewski
c8b678e53e [Model] Add support for nvidia/llama-nemotron-rerank-vl-1b-v2 (#35735)
Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>
2026-03-03 08:32:14 +08:00
Roger Wang
1b82b433fc [Bugfix] Fix MM processor test for Qwen3.5 (#35797)
Signed-off-by: Roger Wang <hey@rogerw.io>
2026-03-02 23:05:08 +00:00
Fynn Schmitt-Ulms
9433acb8df [Spec Decode] Add hidden states extraction system (#33736)
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
2026-03-02 14:29:09 -05:00
Isotr0py
cc0d565f40 [CI/Build] Enable Qwen3.5 tests on CI (#35763)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-02 17:43:53 +00:00
Andreas Karatzas
c34963f138 [ROCm][CI] Disable skinny GEMMs in language model standard tests to fix non-determinism (#35152)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-02 15:04:18 +08:00
Itay Alroy
dea268336f [1/N] Elastic EP Milestone 2 (#34861)
Signed-off-by: Yongji Wu <wuyongji317@gmail.com>
Signed-off-by: Itay Alroy <ialroy@nvidia.com>
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Signed-off-by: Ron Tourgeman <rtourgeman@nvidia.com>
Co-authored-by: Yongji Wu <wuyongji317@gmail.com>
Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Ron Tourgeman <rtourgeman@nvidia.com>
2026-02-28 04:46:42 +00:00
fort726
905d76b51d [Model] Add huggingface skt/A.X-K1 model (#32407)
Signed-off-by: Sungwan(Alex) Kim <sw0726.kim@sktelecom.com>
Signed-off-by: fort726 <38447663+fort726@users.noreply.github.com>
Co-authored-by: Sungwan(Alex) Kim <sw0726.kim@sktelecom.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
2026-02-27 09:26:02 -08:00