Matthew Bonanni
|
300622e609
|
[CI][Attention] Add more CI dependencies for attention tests (#32487)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-01-22 18:44:56 +00:00 |
|
Cyrus Leung
|
d117a4d1a9
|
[Frontend] Introduce Renderer for processing chat messages (using ModelConfig) (#30200)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-22 12:44:22 +00:00 |
|
Robert Shaw
|
42135d6898
|
[MoE Refactor] Oracle Select FP8+NVFP4 Kernels In Priority (#32414)
|
2026-01-21 08:22:33 -05:00 |
|
Matthew Bonanni
|
1a1fc3bbc0
|
[Attention][MLA] Make FLASHINFER_MLA the default MLA backend on Blackwell, and TRTLLM the default prefill (#32615)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-01-19 18:41:34 -05:00 |
|
Yanan Cao
|
9d1e611f0e
|
[CI] Add Helion as an optional dependency (#32482)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
|
2026-01-19 19:09:56 +00:00 |
|
Robert Shaw
|
afc3622602
|
[CI] Move Distributed Tests from H200 -> H100 (#32555)
|
2026-01-18 10:25:23 -08:00 |
|
Lucas Wilkinson
|
ca21288080
|
[CI] Fix OOM in Hopper Fusion E2E Tests (H100) (#32489)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-01-16 21:27:16 +00:00 |
|
Lucas Wilkinson
|
14ce524249
|
[CI] Breakup h200 tests (#30499)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-01-16 06:23:22 +00:00 |
|
Roberto L. Castro
|
8ef50d9a6b
|
[Kernel][Performance] Enable smaller Scaling Factor tiling for NVFP4 small-batch decoding (#30885)
Signed-off-by: LopezCastroRoberto <roberto.lopez.castro@udc.es>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
|
2026-01-13 15:22:53 -08:00 |
|
Cyrus Leung
|
a374532111
|
[CI/Build] Separate out flaky responses API tests (#32110)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-11 05:01:12 -08:00 |
|
Matthew Bonanni
|
2612ba9285
|
[1/N][Attention] Restructure attention: move files (#31916)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-01-09 13:10:24 -08:00 |
|
Nicolò Lucchesi
|
83e1c76dbe
|
[CI][ROCm] Fix NIXL tests on ROCm (#31728)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2026-01-09 01:34:43 +08:00 |
|
TJian
|
72c068b8e0
|
[CI] [Bugfix] Fix unbounded variable in run-multi-node-test.sh (#31967)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2026-01-08 05:42:01 -08:00 |
|
Robert Shaw
|
5dcd7ef1f2
|
[MoE Refactor][15/N] Apply Refactor to Fp8 (#31415)
|
2026-01-07 19:42:33 -05:00 |
|
Robert Shaw
|
d3e477c013
|
[MoE Refactor] Add Temporary Integration Tests - H100/B200 (#31759)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-01-06 10:34:17 -05:00 |
|
Michael Goin
|
ccb309a964
|
Revert "[CI Failure] Disable B200 tests while runner is broken" (#31750)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
|
2026-01-05 17:26:33 -08:00 |
|
Matthew Bonanni
|
276e03b92c
|
[CI][DeepSeek] Add nightly DeepSeek R1 lm_eval tests on H200 (#30356)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-01-05 17:17:59 -05:00 |
|
Michael Goin
|
eefa713a66
|
[CI Failure] Disable B200 tests while runner is broken (#31732)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-01-05 08:50:51 -08:00 |
|
TJian
|
578c8f51f6
|
[CI] [Critical] [CUDA] Fix duplicated test name (#31562)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2025-12-30 21:01:09 -08:00 |
|
Nicolò Lucchesi
|
ab1af6aa3e
|
[CI][NIXL] Split DPEP tests (#31491)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-30 07:26:12 -05:00 |
|
Lucas Wilkinson
|
7e065eba59
|
[CI] Fix "2 Node Tests (4 GPUs in total)" (#31090)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-12-22 10:32:40 +08:00 |
|
Ameen Patel
|
93cabc417c
|
ci: add nvidia-smi warmup before Prime-RL integration test (#31093)
Signed-off-by: AmeenP <ameenp360@gmail.com>
|
2025-12-21 15:43:01 +00:00 |
|
Lucas Wilkinson
|
ae0770fa6b
|
[CI] Fix H200 Distributed test (#31054)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-12-20 16:48:49 -05:00 |
|
Nick Hill
|
45c0526ac9
|
[BugFix] Handle errors when preprocessing added requests (#30895)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-12-19 01:29:11 +00:00 |
|
Elizabeth Thomas
|
41b6f9200f
|
Remove all2all backend envvar (#30363)
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-18 19:46:28 +00:00 |
|
Andrey Talman
|
e06d0bf0aa
|
2.9.1 PyTorch release update (#28495)
|
2025-12-17 12:20:22 -08:00 |
|
Chauncey
|
9ad5b21710
|
[Refactor] [4/N] Move VLLM_SERVER_DEV endpoints into the serve directory (#30749)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-17 02:27:30 -08:00 |
|
Michael Goin
|
10ee1c64cf
|
[CI] Generalize gsm8k test args and add Qwen3-Next MTP B200 test (#30723)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-12-16 14:28:34 -05:00 |
|
Lucas Wilkinson
|
00a8d7628c
|
[BugFix] Fix memory spike in workspace allocation (#30744)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-16 06:46:22 -08:00 |
|
Cyrus Leung
|
ed586e7724
|
[Refactor] [3/N] Move tool parser tests and run on CPU (#30693)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-15 13:45:36 +00:00 |
|
Michael Goin
|
2f32a68d75
|
[CI] Update several models in registry that are available online now (#30514)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-12-12 18:28:13 -08:00 |
|
Kevin H. Luu
|
b4039c08b5
|
[ci] Mark PrimeRL integration test as soft fail (#30578)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2025-12-12 14:13:09 -08:00 |
|
shivampr
|
cd7740ac5c
|
[ROCm] Enable Triton ScaledMM fallback + kernel selection fix (#26668)
Signed-off-by: Shivam <shivampr.dev@gmail.com>
Signed-off-by: Shivam <shivamprasad91@gmail.com>
|
2025-12-12 13:28:20 -05:00 |
|
Sage Moore
|
b4054c8ab4
|
Revert "[CI] Add Async Eplb nightly CI tests (#29385)" (#30431)
|
2025-12-11 00:48:35 +00:00 |
|
Ilya Markov
|
0b6a8a304c
|
[BugFix] Fix non detected failing tests (#30277)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
|
2025-12-09 17:57:55 +00:00 |
|
Zhewen Li
|
263c38d74d
|
[CI/Build] Update batch invariant test trigger (#30080)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-12-05 00:42:37 +00:00 |
|
Zhewen Li
|
c493b9d092
|
[CI/Build] Add MM code path to Examples Test (#29986)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-12-03 19:21:45 -08:00 |
|
WeiQing Chen
|
7fe9c1a223
|
[CI] Add Async Eplb nightly CI tests (#29385)
Signed-off-by: David Chen <530634352@qq.com>
Signed-off-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-03 09:51:08 +00:00 |
|
wang.yuqi
|
2eb4fe9129
|
[examples] Resettle pooling examples. (#29365)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 15:54:28 +00:00 |
|
Shengqi Chen
|
4b612664fd
|
[CI] Renovation of nightly wheel build & generation (take 2) (#29838)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
|
2025-12-01 22:17:10 -08:00 |
|
Kevin H. Luu
|
ec7035c9d4
|
[ci] Make distributed 8 gpus test optional (#29801)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2025-12-01 10:22:05 -08:00 |
|
Cyrus Leung
|
2afcec4dec
|
[Misc] Update TokenizerLike interface and move get_cached_tokenizer (#29730)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-30 14:59:47 +08:00 |
|
Cyrus Leung
|
34a984274e
|
[Misc] Refactor tokenizer interface (#29693)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-29 04:02:21 -08:00 |
|
Angela Yi
|
4b17ce6815
|
Add gpu memory wait before test_async_tp (#28893)
Signed-off-by: angelayi <yiangela7@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-11-28 20:19:05 -08:00 |
|
Isotr0py
|
d40c854009
|
[CI/Build] Rework CPU multimodal processor test (#29684)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-28 17:10:29 +00:00 |
|
HDCharles
|
df01eda4dc
|
[Bugfix] Make compressed-tensors MoEs respect ignored layers (#28878)
Signed-off-by: HDCharles <charlesdavidhernandez@gmail.com>
|
2025-11-26 21:35:13 -05:00 |
|
Huamin Li
|
70d5953f82
|
Revert "[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841)" (#29483)
Signed-off-by: Huamin Li <3ericli@gmail.com>
|
2025-11-26 22:27:26 +08:00 |
|
Harry Mellor
|
bf0c75cd4f
|
Make Transformers Nightly tests soft-fail and enable all tests (#29401)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-25 12:41:15 +00:00 |
|
elvischenv
|
6330f9477d
|
[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2025-11-25 07:59:40 +00:00 |
|
Rémi Delacourt
|
12c007e288
|
EAGLE Support DP>1 (#26086)
Signed-off-by: Rémi Delacourt <remi@mistral.ai>
Signed-off-by: Rémi Delacourt <54138269+Flechman@users.noreply.github.com>
Signed-off-by: remi <remi@mistral.ai>
|
2025-11-25 07:32:21 +00:00 |
|