Yanhong Li
|
a911f4dd20
|
[Model] Add support for OLMo Hybrid (#32550)
|
2026-03-05 14:51:06 -05:00 |
|
Jiayi Yan
|
6a895197fa
|
[Bugfix][CI] fix typos (#34934)
Signed-off-by: 1195343015 <1195343015@qq.com>
Signed-off-by: Jiayi Yan <66017932+1195343015@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-05 17:05:46 +00:00 |
|
Sage Moore
|
8c760b6ab6
|
[ROCm] Refactor ROCm attention backend selection logic (#35246)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2026-03-05 10:51:26 -06:00 |
|
Cyrus Leung
|
7196348157
|
[Bugfix] Fix Qwen-VL tokenizer implementation (#36140)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-05 08:07:19 -08:00 |
|
Ning Xie
|
176c799f4c
|
[openai api] log exception in exception handler (1/N) (#31164)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2026-03-05 16:00:12 +00:00 |
|
Or Ozeri
|
612e7729c2
|
[KVConnector] Scheduler: Fix num_computed_tokens after async KV load (#34616)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
|
2026-03-05 14:25:15 +00:00 |
|
Andreas Karatzas
|
b03ff6a96b
|
[CI] Stabilize test_no_args_tool_call and add ROCm-specific server args (#36107)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-05 21:52:49 +08:00 |
|
Kunshang Ji
|
66a2209645
|
[Hardware] Replace torch.cuda.synchronize() api with torch.accelerator.synchronize (#36085)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-03-05 10:36:39 +00:00 |
|
Isotr0py
|
21eb2c3372
|
[Chore] Correct MTP models test registry ordering (#36115)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-05 08:55:04 +00:00 |
|
Benjamin Chislett
|
57c629e9c1
|
[Bugfix] Fix block_size for hybrid model MTP (#36036)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-03-05 06:10:54 +00:00 |
|
Zhengxu Chen
|
dd6dbd93f8
|
[compile] Fix extra cache save on warm start. (#35921)
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
|
2026-03-05 12:56:30 +08:00 |
|
daje0601
|
3b23d57c96
|
[Model] Add LoRA support for Whisper models (#29856)
Signed-off-by: daje0601 <englishmt4118@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-03-05 10:38:25 +08:00 |
|
Simon Mo
|
f678c3f61a
|
[RL] [Weight Sync] Guard IPC update-info pickle deserialization behind insecure serialization flag (#35928)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
|
2026-03-04 17:05:32 -05:00 |
|
Harry Mellor
|
17dc9c7fc9
|
[CI] Bump mypy version (#34950)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-04 20:55:11 +00:00 |
|
Richard Zou
|
5569f5218d
|
[torch.compile] Stop lazily compiling (#35472)
Signed-off-by: Richard Zou <zou3519@gmail.com>
|
2026-03-04 12:13:17 -08:00 |
|
Stefano Castagnetta
|
d7166e74c1
|
[CI] Add Blackwell AsyncTP correctness test (#35871)
Signed-off-by: Stefano Castagnetta <scastagnetta@nvidia.com>
|
2026-03-04 19:41:21 +00:00 |
|
Hyunkyun Moon
|
bc6be89d16
|
[Frontend] Add vllm launch command for GPU-less preprocessing serving (#34551)
Signed-off-by: HyunKyun Moon <mhg5303@gmail.com>
|
2026-03-04 18:41:52 +00:00 |
|
Bhuminjay Soni
|
fb3e78ab09
|
[Feature][CI]: compare func & no_func outputs in test_functionalization.py (#35481)
Signed-off-by: Bhuminjay <bhuminjaysoni@gmail.com>
Signed-off-by: Bhuminjay Soni <Soni5Happy@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-03-04 18:01:16 +00:00 |
|
Christian Pinto
|
2f2212e6cc
|
Split generic IO Processor plugins tests from Terratorch specific ones (#35756)
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
|
2026-03-05 00:01:03 +08:00 |
|
Nicolò Lucchesi
|
18e01a0a10
|
[Misc] Add --attention-backend auto option (#35738)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2026-03-04 15:12:27 +00:00 |
|
sungsoo ha
|
6cb901093f
|
[Core] Add All-to-All communication backend for DCP (#34883)
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Signed-off-by: sungsoo ha <hasungsoo@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-04 10:01:57 -05:00 |
|
Qi Wang
|
6aa6ad8992
|
[BugFix] Fix implicit and incorrect assumption on ECConnector is_producer (#34783)
Signed-off-by: Qi Wang <qiwa@nvidia.com>
|
2026-03-04 15:01:30 +01:00 |
|
Raghavan
|
c8c3935b70
|
[Bugfix][Model] Fix FP8 k_scale/v_scale not loaded for Qwen3-MoE (#35656)
Signed-off-by: raghavan <oneraghavan@gmail.com>
|
2026-03-04 13:15:38 +00:00 |
|
Ronen Schaffer
|
bb6888b8b1
|
[Bugfix][CPUOffloadingManager] Prevent eviction of already-stored blocks in LRU/ARC prepare_store() (#35846)
Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
|
2026-03-04 14:25:33 +02:00 |
|
haosdent
|
d6e04f4c43
|
[Bugfix] Cap FULL decode cudagraph sizes for Mamba/hybrid models (#34094) (#34571)
Signed-off-by: haosdent <haosdent@gmail.com>
Co-authored-by: zjy0516 <riverclouds.zhu@qq.com>
|
2026-03-04 11:56:22 +01:00 |
|
Kunshang Ji
|
16d2ad1d38
|
[Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-04 09:49:47 +00:00 |
|
Joe Runde
|
6f0dd93801
|
[Core] Remove busy loop from idle buffer readers (#28053)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
|
2026-03-04 07:44:20 +00:00 |
|
Cyrus Leung
|
e379396167
|
[Refactor] Clean up processor kwargs extraction (#35872)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-03 19:53:53 -08:00 |
|
AllenDou
|
c1d963403c
|
[model] support FireRedASR2 (#35727)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-03 19:41:30 -08:00 |
|
William Zhang
|
70c73df69e
|
[Bugfix] Fix EVS implementation for Qwen3 VL (#33607)
Signed-off-by: 2ez4bz <133824995+2ez4bz@users.noreply.github.com>
|
2026-03-04 02:18:11 +00:00 |
|
Micah Williamson
|
e7213003cb
|
[ROCm][CI] Fix TP size issue for test_gpt_oss (#35887)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-03-03 20:57:34 +00:00 |
|
Robert Shaw
|
97995f6376
|
[MoE Refactor] Create MK for TRTLLM Kernels (#32564)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
|
2026-03-03 10:39:50 -08:00 |
|
Robert Shaw
|
881a6b011b
|
[CI] Temporarily Disable Llama4 MoE Refactor Test (#35870)
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-03-03 10:36:15 -08:00 |
|
Matthew Bonanni
|
8e1fd5baf0
|
[CI] Bump num_speculative_tokens to 3 in nightly DeepSeek tests (#35882)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-03-03 09:26:44 -08:00 |
|
JasonCohere
|
ae88468bcc
|
fix: Ensure invalid audio files return 400 error (#34715)
Signed-off-by: Jason Ozuzu <jasonozuzu@cohere.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
|
2026-03-03 08:47:39 -08:00 |
|
ojhaanshika
|
e05cb3b93e
|
TRTLLM gen-full attn Test Coverage (#34986)
Signed-off-by: Anshika Ojha <anshikao@nvidia.com>
Co-authored-by: Anshika Ojha <anshikao@gb-nvl-059-compute09.nvidia.com>
|
2026-03-03 11:35:34 -05:00 |
|
Lucas Wilkinson
|
28ef9ba399
|
[BugFix] Add support for MTP num_speculative_tokens > 1 with sparse MLA (#34552)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-03-03 07:21:57 -08:00 |
|
TJian
|
fb7fdc49c4
|
[ROCm] [CI] Add new fusion test cases that are relevant to vLLM IR Ops (#34307)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: vllmellm <vllm.ellm@embeddedllm.com>
|
2026-03-03 06:24:21 -08:00 |
|
wang.yuqi
|
fd4a90f337
|
[CI] Add PPL test for Qwen3.5. (#35853)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-03-03 13:15:51 +00:00 |
|
hallerite
|
b8401cde0e
|
add regression test (#35834)
Signed-off-by: hallerite <git@hallerite.com>
|
2026-03-03 07:32:15 +00:00 |
|
Micah Williamson
|
8b9e8b7454
|
[ROCm][CI] Fix Assertion Logic For test_gpt_oss (#35806)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-03-03 05:08:04 +00:00 |
|
Isotr0py
|
7d8bbe6f42
|
[CI/Build] Automatically patch video metadata for multimodal processor test (#35822)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-03 04:27:45 +00:00 |
|
aykoppol
|
25e02647c2
|
[Core] Add optional flags to check for repetitive token patterns in engine output (#35451)
Signed-off-by: aykoppol <aykoppol+git@gmail.com>
|
2026-03-03 12:23:25 +08:00 |
|
Robert Shaw
|
6521ccf286
|
[CI] Temporarily Disable Nightly Failures (#35770)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-03-03 01:49:13 +00:00 |
|
Jakub Zakrzewski
|
c8b678e53e
|
[Model] Add support for nvidia/llama-nemotron-rerank-vl-1b-v2 (#35735)
Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>
|
2026-03-03 08:32:14 +08:00 |
|
Roger Wang
|
1b82b433fc
|
[Bugfix] Fix MM processor test for Qwen3.5 (#35797)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2026-03-02 23:05:08 +00:00 |
|
Fynn Schmitt-Ulms
|
9433acb8df
|
[Spec Decode] Add hidden states extraction system (#33736)
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
|
2026-03-02 14:29:09 -05:00 |
|
Richard Zou
|
d1a6e96d9e
|
[torch.compile] Improve cold and warm start compile tests (#35709)
Signed-off-by: Richard Zou <zou3519@gmail.com>
|
2026-03-02 19:27:06 +00:00 |
|
Isotr0py
|
cc0d565f40
|
[CI/Build] Enable Qwen3.5 tests on CI (#35763)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-02 17:43:53 +00:00 |
|
Turner Jabbour
|
4034c3d32e
|
[Core] Move test utility to test file (#35672)
Signed-off-by: Turner Jabbour <doubleujabbour@gmail.com>
|
2026-03-02 10:56:03 -05:00 |
|