Harry Mellor
|
f4ae58b38b
|
Remove unused config field from Gemma2 (#36672)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-11 01:51:19 -07:00 |
|
Hongbin Guo
|
4bf533623b
|
[Doc] Fix duplicate words in comments (#36713)
Signed-off-by: Hongbin10 <jdmjdm1998@163.com>
|
2026-03-10 21:28:31 -07:00 |
|
tunglinwood
|
42fadebecb
|
[Model] Add support for moonshotai/Kimi-Audio-7B-Instruct (#36127)
Signed-off-by: tunglinwood <tunglinwood@gmail.com>
Signed-off-by: tunglinwood <tomwu.tunglin@gmail.com>
Signed-off-by: tunglinwood <113751333+tunglinwood@users.noreply.github.com>
|
2026-03-10 21:24:48 -07:00 |
|
AllenDou
|
aefc59f088
|
FunASR model bugfix (#36633)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
|
2026-03-10 08:14:21 -07:00 |
|
wang.yuqi
|
a3189a08b0
|
[Model] Consolidate score logic by introduce score_type (#36479)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-03-10 13:32:25 +00:00 |
|
Hojin Yang
|
0836be3b03
|
[Model] Add HyperCLOVAX-SEED-Think-32B vision-language model support (#31471)
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-03-10 10:59:19 +08:00 |
|
Ajay Anubolu
|
4e95ec111c
|
[Bugfix] Fix Qwen3-Next in_proj_ba weight sharding with TP > 1 (#36242)
Signed-off-by: AjAnubolu <anuboluajay@gmail.com>
|
2026-03-09 19:16:26 -07:00 |
|
Lucas Kabela
|
3fd03f1ec2
|
[BE] Rename should_torch_compile_mm_vit to should_torch_compile_mm_encoder (#36281)
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
|
2026-03-09 18:22:05 +00:00 |
|
SoluMilken
|
55d27cca55
|
[Misc] fix typo: dependant -> dependent (2 lines change) (#36511)
Signed-off-by: SoluMilken <ypiheyn.imm02g@g2.nctu.edu.tw>
|
2026-03-09 10:00:12 -07:00 |
|
Matthew Bonanni
|
77a73458e3
|
Reapply [Attention] Refactor check_and_update_config (#35122)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-03-09 07:17:14 -07:00 |
|
Tianyu Guo
|
5578f2a4d3
|
Support online use_audio_in_video (#36319)
Signed-off-by: Tianyu Guo <guoty9@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-09 07:16:44 -07:00 |
|
Xin Yang
|
dc6b578466
|
[Kernel] Add fused_sigmoid_gating_delta_rule_update kernel for Qwen3 Next (#35777)
Signed-off-by: Xin Yang <xyangx@amazon.com>
|
2026-03-08 23:41:01 -07:00 |
|
Cyrus Leung
|
d62856b928
|
[Misc] Move processors to transformers_utils (#35953)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-09 11:31:39 +08:00 |
|
Alex Brooks
|
bd2659a566
|
Increase Flexibility for OOV Multimodal Token Handling (#34858)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
|
2026-03-08 20:30:49 -07:00 |
|
nvnbagrov
|
b7332b058c
|
[Model] Nano Nemotron VL - fast media preprocessing (#35657)
Signed-off-by: Natan Bagrov <nbagrov@nvidia.com>
|
2026-03-08 03:04:05 -07:00 |
|
Wei Zhao
|
379689d533
|
[Perf] Support FP8 KV cache for Flashinfer MLA Sparse (#35891)
|
2026-03-07 13:51:54 -08:00 |
|
rahul-sarvam
|
85f50eb41f
|
Adding support to Sarvam's MoE models (#33942)
Signed-off-by: rahul-sarvam <140298821+rahul-sarvam@users.noreply.github.com>
|
2026-03-08 01:16:24 +08:00 |
|
vllmellm
|
ee8a29511f
|
[Bugfix] Fix compressed-tensors quantization failure for DeepSeek-R1 on MI300x (#36247)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
|
2026-03-07 09:26:59 +00:00 |
|
Isotr0py
|
1d0c0d209c
|
[Misc] Lazy import registered processors (#36024)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2026-03-06 06:06:45 -08:00 |
|
Andreas Karatzas
|
2a00d3241f
|
[CI][MM] Gate vision encoder attention mask to MiniCPM only, fixing Aria regression (#36206)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-06 01:17:08 -08:00 |
|
Russell Bryant
|
00bd08edee
|
[Security] Respect user trust_remote_code setting in NemotronVL and KimiK25 (#36192)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2026-03-05 22:15:19 -08:00 |
|
Yanhong Li
|
a911f4dd20
|
[Model] Add support for OLMo Hybrid (#32550)
|
2026-03-05 14:51:06 -05:00 |
|
Netanel Haber
|
b93a9e6f6d
|
ParakeetProjection.norm = RMSNorm instead of nn.LayerNorm (#36133)
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
|
2026-03-05 17:29:30 +00:00 |
|
Avery Miao
|
e998fa76b9
|
[BUGFIX]Fix Qwen-Omni models audio max_token_per_item estimation error leading to encoder_cache_size is 0 (#35994)
Signed-off-by: Miao, Avery <avery.miao@intel.com>
|
2026-03-05 09:16:29 -08:00 |
|
Jiayi Yan
|
6a895197fa
|
[Bugfix][CI] fix typos (#34934)
Signed-off-by: 1195343015 <1195343015@qq.com>
Signed-off-by: Jiayi Yan <66017932+1195343015@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-05 17:05:46 +00:00 |
|
AllenDou
|
3ee68590c7
|
refactor funasr model. (#36108)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-05 08:07:37 -08:00 |
|
Cyrus Leung
|
7196348157
|
[Bugfix] Fix Qwen-VL tokenizer implementation (#36140)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-05 08:07:19 -08:00 |
|
Harry Mellor
|
ecde7af9c4
|
Fix import that was moved in Transformers 5.2.0 (#36120)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-05 13:59:44 +00:00 |
|
Hanjun Cho
|
f600d5192e
|
[Bugfix] Fix score layer quantization for sequence classification models - Qwen3 (VL) Reranker (#35849)
Signed-off-by: Hanjun Cho <gkswns0531@gmail.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-03-04 20:57:20 -08:00 |
|
Andrii Skliar
|
0a12cea25f
|
Order config.py in Lexicographical order (#35866)
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Co-authored-by: Andrii Skliar <askliar@nvidia.com>
|
2026-03-04 20:56:47 -08:00 |
|
daje0601
|
3b23d57c96
|
[Model] Add LoRA support for Whisper models (#29856)
Signed-off-by: daje0601 <englishmt4118@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-05 10:38:25 +08:00 |
|
tc-mb
|
bfdb512f11
|
fix minicpmo4.5: fix attn_mask in vit attn && fix resampler pos_emb i… (#34127)
Signed-off-by: tc-mb <caitianchi@modelbest.cn>
Co-authored-by: hezhihui <hezhihui@modelbest.cn>
|
2026-03-04 17:46:17 +00:00 |
|
Yan Ma
|
58cfe0dc44
|
Fix phi4-mm and remove cuda binding (#35964)
Signed-off-by: Yan Ma <yan.ma@intel.com>
|
2026-03-05 01:08:05 +08:00 |
|
Netanel Haber
|
289fc48ab7
|
Use MMEncoderAttention (=use FlashAttention) instead of torch.sdpa in radio.py (#35653)
|
2026-03-04 08:43:13 -08:00 |
|
Raghavan
|
c8c3935b70
|
[Bugfix][Model] Fix FP8 k_scale/v_scale not loaded for Qwen3-MoE (#35656)
Signed-off-by: raghavan <oneraghavan@gmail.com>
|
2026-03-04 13:15:38 +00:00 |
|
Nathan Price
|
36bf213181
|
[Bugfix] Add missing dynamic_arg_dims for Qwen3-ASR torch.compile (#35869)
Signed-off-by: Nathan Price <nathan@abridge.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-04 08:29:01 +00:00 |
|
Andrii Skliar
|
5d199ac8f2
|
Support Audio Extraction from MP4 Video for Nemotron Nano VL (#35539)
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Andrii <askliar@nvidia.com>
Co-authored-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Co-authored-by: Andrii Skliar <askliar@oci-nrt-cs-001-vscode-01.cm.cluster>
Co-authored-by: Andrii <askliar@nvidia.com>
Co-authored-by: root <root@pool0-03748.cm.cluster>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: root <root@pool0-02416.cm.cluster>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: root <root@pool0-04880.cm.cluster>
|
2026-03-03 23:20:33 -08:00 |
|
Andreas Karatzas
|
edba15045a
|
[Bugfix] Guard mm_token_type_ids kwarg in get_mrope_input_positions (#35711)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-04 04:12:51 +00:00 |
|
Isotr0py
|
6e9f21e8a2
|
[Chore] Remove debug code in model implementation (#35883)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-03 19:50:58 -08:00 |
|
AllenDou
|
c1d963403c
|
[model] support FireRedASR2 (#35727)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-03 19:41:30 -08:00 |
|
Shanshan Shen
|
77e6dcbbfa
|
[PluggableLayer][MM] Add PluggableLayer for RelPosAttention (#33753)
Signed-off-by: shen-shanshan <467638484@qq.com>
|
2026-03-03 19:41:27 -08:00 |
|
William Zhang
|
70c73df69e
|
[Bugfix] Fix EVS implementation for Qwen3 VL (#33607)
Signed-off-by: 2ez4bz <133824995+2ez4bz@users.noreply.github.com>
|
2026-03-04 02:18:11 +00:00 |
|
Isotr0py
|
8ea8ba275e
|
[V0 deprecation] Remove Swin model (#35821)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-02 20:03:41 -08:00 |
|
Jakub Zakrzewski
|
c8b678e53e
|
[Model] Add support for nvidia/llama-nemotron-rerank-vl-1b-v2 (#35735)
Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>
|
2026-03-03 08:32:14 +08:00 |
|
Robert Shaw
|
9319044ee9
|
[MoE][Perf] Wrap DSV3 QKVAProj GEMM in custom op for torch.compile (#35751)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-03-02 23:03:49 +00:00 |
|
Ye (Charlotte) Qi
|
fa6a6be519
|
[Bugfix] Fix missing sequence_lengths in qwen3_omni_moe_thinker (#35741)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2026-03-02 21:11:56 +00:00 |
|
Fynn Schmitt-Ulms
|
9433acb8df
|
[Spec Decode] Add hidden states extraction system (#33736)
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
|
2026-03-02 14:29:09 -05:00 |
|
CSWYF3634076
|
2a9e3347e9
|
[BugFix][Model]Fix the garbled code in Ernie4.5-VL caused by fast_moe_cold_start (#35587)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
|
2026-03-02 18:56:33 +00:00 |
|
lin-shh
|
a9ec392c86
|
Fix typo: implictly -> implicitly in isaac.py docstring (#35646)
|
2026-02-28 23:34:37 -08:00 |
|
lailoo
|
afd089f231
|
[Bugfix][Model] Fix Qwen3.5/Qwen3Next ignoring --dtype flag on older GPUs (#35617)
|
2026-03-01 03:27:37 +00:00 |
|