Yichuan Wang
|
80f2ba6ea6
|
Fix DeepSeek-OCR tensor validation for all size variants (#34085)
Co-authored-by: Cursor <cursoragent@cursor.com>
|
2026-02-11 22:50:23 -08:00 |
|
Lucas Wilkinson
|
136b0bfa59
|
[BugFix] Fix DP chunking (#34379)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Bill Nell <bnell@redhat.com>
|
2026-02-12 06:44:03 +00:00 |
|
Cyrus Leung
|
b96f7314b4
|
[Refactor] Pass Renderer to Input Processor (#34329)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-11 19:38:11 -08:00 |
|
Cyrus Leung
|
ced2a92f40
|
[Refactor] Move validation to params definitions (#34362)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-11 19:33:15 -08:00 |
|
Runkai Tao
|
e1d97c38f8
|
[Bug Fix] Fix naive_block_assignment always defaulting to False due to arg misalignment (#33848)
Signed-off-by: Runkai Tao <rt572@physics.rutgers.edu>
|
2026-02-12 11:30:57 +08:00 |
|
Michael Goin
|
ec12d39d44
|
[Bugfix] Fix MTP accuracy for GLM-5 (#34385)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-02-12 11:08:19 +08:00 |
|
Michael Goin
|
ff1f83b056
|
[Refactor] Replace activation: str with MoEActivation enum (#33843)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
|
2026-02-11 17:29:32 -08:00 |
|
Kevin H. Luu
|
83b47f67b1
|
[ci] Integrate AMD tests into CI (#33626)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
|
2026-02-12 08:54:17 +08:00 |
|
Micah Williamson
|
fb7b30c716
|
[ROCm][CI] Revert Test Groups From mi325_8 to mi325_1 Agent Pool In AMD CI (#34384)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-02-11 15:52:34 -08:00 |
|
bnellnm
|
31d992d215
|
[Bugfix] Fix some issues with MoERunner PR #32344 (#34371)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2026-02-11 14:33:14 -08:00 |
|
Wei Zhao
|
5aff2699bd
|
Fix CI failure - Flashinfer Kernel tests (#34316)
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
|
2026-02-11 14:17:16 -08:00 |
|
Raushan Turganbay
|
527ca32197
|
[Bugfix] Fix more multimodal tests for transformers V5 (#34334)
Signed-off-by: raushan <raushan@huggingface.co>
|
2026-02-11 22:02:05 +01:00 |
|
Junseo Park
|
5458eb835d
|
[Bugfix] send None sentinel on final commit so server properly sends transcription.done (#33963)
Signed-off-by: pjs102793 <pjs102793@naver.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
|
2026-02-11 21:01:53 +00:00 |
|
Tomas Ruiz
|
144d9b7cc8
|
[Benchmarks] Reduce ready checker log verbosity (#34349)
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
|
2026-02-11 20:57:57 +00:00 |
|
elvischenv
|
83e26c834e
|
[GPT-OSS] Remove unnecessary contiguous (#34337)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2026-02-11 15:29:29 -05:00 |
|
TJian
|
5001211369
|
[ROCm] [CI] fix test_unrecognized_env (#34350)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2026-02-11 18:50:44 +00:00 |
|
Eldar Kurtić
|
11c7ace340
|
[Bugfix] Enable attn quantization of Llama-4 by correctly permuting scales for rope (int8, fp8) (#34243)
Signed-off-by: Your Name <you@example.com>
Co-authored-by: Your Name <you@example.com>
|
2026-02-11 13:24:22 -05:00 |
|
Xinyu Dong
|
be7f3d5d20
|
[Bugfix] fix default is_neox_style is True for deepseek (#34353)
Signed-off-by: dongxinyu03 <dongxinyu03@baidu.com>
|
2026-02-11 18:20:45 +00:00 |
|
Isotr0py
|
0ab06100f4
|
[Multimodal] Expose mm_processor_kwargs for DummyInputsBuilder (#34330)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-11 09:37:40 -08:00 |
|
Xinyu Chen
|
ffb3d553cc
|
[Model Runner V2] Init cuda graph pool when necessary (#33217)
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
|
2026-02-11 09:12:13 -08:00 |
|
junuxyz
|
fa7e0bfacf
|
[CI][BugFix] Fix silent failure in shellcheck hook and baseline exist… (#32458)
Signed-off-by: junuxyz <216036880+junuxyz@users.noreply.github.com>
|
2026-02-11 17:03:48 +00:00 |
|
SorenDreano
|
48134a2c22
|
[Docs] Fix typo ("defult") and double spacing (#34348)
Signed-off-by: SorenDreano <71752785+SorenDreano@users.noreply.github.com>
Co-authored-by: Soren Dreano <soren@numind.ai>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-02-11 09:02:27 -08:00 |
|
kliuae
|
64f570ab56
|
[ROCm] [aiter] Split KV cache update for AiterFlashAttention (#33681)
Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>
|
2026-02-11 16:26:44 +00:00 |
|
Rohan Potdar
|
fd618871b4
|
[Bugfix]: Fix ROCm fusion attn test; use AttentionBackend utils to create kv cache (#33948)
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
|
2026-02-11 11:12:05 -05:00 |
|
Harry Mellor
|
67a42b5a44
|
Don't try and run GLM-ASR with remote code (#34352)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-02-11 08:09:40 -08:00 |
|
Lucas Wilkinson
|
c7914d30f9
|
Reapply [Attention][FA3] Update FA3 to include new swizzle optimization (#34043)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-02-11 07:07:56 -08:00 |
|
Adam Binford
|
1b8756562e
|
Responses harmony system message structured (#34268)
Signed-off-by: Adam Binford <adamq43@gmail.com>
|
2026-02-11 05:14:28 -08:00 |
|
Linda
|
275e0d2a99
|
[NVIDIA][test] Tests for flashinfer TRTLLM BF16 MoE (#33715)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Co-authored-by: Pavani Majety <pmajety@nvidia.com>
|
2026-02-11 12:38:11 +00:00 |
|
Harry Mellor
|
0f5e55e7a8
|
Make JAIS compatible with Transformers v5 (#34264)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-02-11 12:30:37 +00:00 |
|
Harry Mellor
|
1e9204bff3
|
Make Qwen3VL compatible with Transformers v5 (#34262)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2026-02-11 04:13:23 -08:00 |
|
Li, Jiang
|
05339a7b20
|
[Bugfix][CPU] Fix llama4 inference on CPU (#34321)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-02-11 19:07:23 +08:00 |
|
Harry Mellor
|
40b8f55358
|
[Docs] Reduce time spent generating API docs (#34255)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-02-11 02:56:02 -08:00 |
|
Seiji Eicher
|
5045d5c983
|
Patch protobuf for CVE-2026-0994 (#34253)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: Kevin H. Luu <khluu000@gmail.com>
|
2026-02-11 02:25:04 -08:00 |
|
Nick Hill
|
e09546cf05
|
[Frontend] Exploit tokenizers "new stream" in FastIncrementalDetokenizer (#34217)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-02-11 11:03:24 +01:00 |
|
Tianqi Ren
|
786806dd44
|
[Doc] Update Marlin support matrix for Turing (#34319)
Signed-off-by: Tianqi Ren <tianqi.r@outlook.com>
|
2026-02-11 09:03:41 +00:00 |
|
Nick Hill
|
79504027ef
|
[Misc] Bump fastsafetensors version for latest fixes (#34273)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-02-11 00:30:09 -08:00 |
|
Luka Govedič
|
addac0e653
|
[torch.compile] Enable AR+rms fusion by default available for -O2 (#34299)
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
|
2026-02-11 00:30:00 -08:00 |
|
Cyrus Leung
|
675a22ed66
|
[Chore] Move BaseRenderer to base.py (#34308)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-11 00:29:51 -08:00 |
|
Kunshang Ji
|
cb9574eb85
|
[XPU][9/N] clean up existing ipex code/doc (#34111)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-02-11 00:27:15 -08:00 |
|
AllenDou
|
21dfb842d7
|
[model] support FunASR model (#33247)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
|
2026-02-11 07:37:09 +00:00 |
|
R3hankhan
|
d1b837f0ae
|
[CPU] Enable FP16 (Half dtype) support for s390x (#34116)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
|
2026-02-11 14:41:42 +08:00 |
|
Roger Wang
|
0b20469c62
|
[Bugfix] Fix weight naming in Qwen3.5 (#34313)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2026-02-10 21:37:14 -08:00 |
|
Tyler Michael Smith
|
d7982daff5
|
[Bugfix] Fix fused MoE IMA (sans chunking) by using int64 for strides (#34279)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-02-11 05:15:52 +00:00 |
|
Robert Shaw
|
9b17c57460
|
[ModelBash][DSR1 NVFp4] Removed Bf16 Bias Cast (#34298)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-02-11 05:00:00 +00:00 |
|
Hashem Hashemi
|
1b3540e6c6
|
Threshold fix wvSplitk for occasional CI fails (#34013)
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
|
2026-02-11 03:59:14 +00:00 |
|
Matthias Gehre
|
7a048ee65f
|
[Bugfix] Fix benchmark_moe.py inplace assertion with torch >= 2.9 (#34149)
Signed-off-by: Matthias Gehre <matthias.gehre@amd.com>
|
2026-02-11 03:58:56 +00:00 |
|
Cyrus Leung
|
c9a1923bb4
|
[Plugin] Simplify IO Processor Plugin interface (#34236)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-10 19:47:39 -08:00 |
|
zofia
|
b482f71e9f
|
[XPU][7/N] enable xpu fp8 moe (#34202)
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
|
2026-02-11 03:33:59 +00:00 |
|
Дзержи́нский
|
1485396abb
|
[Kernel] Apply 256bit LDG/STG To Activation Kernels (#33022)
Signed-off-by: Dzerzhinsky <256908701+AstroVoyager7@users.noreply.github.com>
Signed-off-by: Дзержи́нский <256908701+AstroVoyager7@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-02-10 19:31:51 -08:00 |
|
Kebe
|
5ee5c86eeb
|
[Bugfix][DeepSeek-V3.2] fix fp8 kvcache type cast (#33884)
Signed-off-by: Kebe <mail@kebe7jun.com>
|
2026-02-10 19:31:36 -08:00 |
|