528 Commits

Author SHA1 Message Date
Kevin H. Luu
b6101d384d Deprecate test-pipeline.yaml (#34864)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
2026-02-19 02:15:27 +00:00
Cyrus Leung
30ebe0dc3c [CI/Build] Remove use of skip_v1 (#34699)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-18 12:19:11 +08:00
Harry Mellor
d0bc520569 Bump mamba-ssm version in CI for Transformers v5 compatibility (#34233)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-10 14:46:01 +01:00
wang.yuqi
dab1de9f38 [Frontend][CI] Consolidate instrumentator entrypoints (#34123)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-02-10 07:30:19 +00:00
wang.yuqi
22b64948f6 [Frontend][last/5] Make pooling entrypoints request schema consensus. (#31127)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-02-09 06:42:38 +00:00
Luka Govedič
ac32e66cf9 [torch.compile] Reorganize vllm/compilation and tests/compile (0/N for vLLM IR) (#33731)
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: ProExpertProg <luka.govedic@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2026-02-06 04:19:49 -08:00
Cyrus Leung
116880a5a0 [Bugfix] Make MM batching more robust (#33817)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-05 20:40:58 +00:00
Aaron Hao
c1858b7ec8 [Feat][RL][1/2] Native Weight Syncing API: NCCL (#31943)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: SumanthRH <sumanthrh99@gmail.com>
2026-02-05 12:13:23 -05:00
Cyrus Leung
038914b7c8 [Refactor] Move task outside of PoolingParams.verify (#33796)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-02-05 09:33:11 +00:00
Luka Govedič
4d9513537d [CI][torch.compile] Reduce e2e fusion test time (#33293)
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: ProExpertProg <luka.govedic@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2026-02-04 19:09:03 -05:00
shanjiaz
5eeba80c74 Adding optional speculator tests for larger models (#32943)
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
2026-01-29 16:54:02 +08:00
Matthew Bonanni
a608b4c6c2 [5/N][Attention] Finish eliminating vllm/attention folder (#32064)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2026-01-27 10:02:51 -05:00
Robert Shaw
5a93b9162b [MoE Refactor] Integrate Naive Prepare Finalize into MK (#32567)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: amirkl94 <203507526+amirkl94@users.noreply.github.com>
2026-01-27 01:28:02 +00:00
Alex Brooks
9ac818a551 [Misc] HF Hub LoRA Resolver (#20320)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2026-01-26 13:56:32 +00:00
Matthew Bonanni
300622e609 [CI][Attention] Add more CI dependencies for attention tests (#32487)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2026-01-22 18:44:56 +00:00
Cyrus Leung
d117a4d1a9 [Frontend] Introduce Renderer for processing chat messages (using ModelConfig) (#30200)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-22 12:44:22 +00:00
Robert Shaw
42135d6898 [MoE Refactor] Oracle Select FP8+NVFP4 Kernels In Priority (#32414)
2026-01-21 08:22:33 -05:00
Matthew Bonanni
1a1fc3bbc0 [Attention][MLA] Make FLASHINFER_MLA the default MLA backend on Blackwell, and TRTLLM the default prefill (#32615)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2026-01-19 18:41:34 -05:00
Yanan Cao
9d1e611f0e [CI] Add Helion as an optional dependency (#32482)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
2026-01-19 19:09:56 +00:00
Robert Shaw
afc3622602 [CI] Move Distributed Tests from H200 -> H100 (#32555)
2026-01-18 10:25:23 -08:00
Lucas Wilkinson
ca21288080 [CI] Fix OOM in Hopper Fusion E2E Tests (H100) (#32489)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2026-01-16 21:27:16 +00:00
Lucas Wilkinson
14ce524249 [CI] Breakup h200 tests (#30499)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2026-01-16 06:23:22 +00:00
Roberto L. Castro
8ef50d9a6b [Kernel][Performance] Enable smaller Scaling Factor tiling for NVFP4 small-batch decoding (#30885)
Signed-off-by: LopezCastroRoberto <roberto.lopez.castro@udc.es>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
2026-01-13 15:22:53 -08:00
Cyrus Leung
a374532111 [CI/Build] Separate out flaky responses API tests (#32110)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-11 05:01:12 -08:00
Matthew Bonanni
2612ba9285 [1/N][Attention] Restructure attention: move files (#31916)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2026-01-09 13:10:24 -08:00
Nicolò Lucchesi
83e1c76dbe [CI][ROCm] Fix NIXL tests on ROCm (#31728)
Signed-off-by: NickLucche <nlucches@redhat.com>
2026-01-09 01:34:43 +08:00
TJian
72c068b8e0 [CI] [Bugfix] Fix unbounded variable in run-multi-node-test.sh (#31967)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-08 05:42:01 -08:00
Robert Shaw
5dcd7ef1f2 [MoE Refactor][15/N] Apply Refactor to Fp8 (#31415)
2026-01-07 19:42:33 -05:00
Robert Shaw
d3e477c013 [MoE Refactor] Add Temporary Integration Tests - H100/B200 (#31759)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2026-01-06 10:34:17 -05:00
Michael Goin
ccb309a964 Revert "[CI Failure] Disable B200 tests while runner is broken" (#31750)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
2026-01-05 17:26:33 -08:00
Matthew Bonanni
276e03b92c [CI][DeepSeek] Add nightly DeepSeek R1 lm_eval tests on H200 (#30356)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-05 17:17:59 -05:00
Michael Goin
eefa713a66 [CI Failure] Disable B200 tests while runner is broken (#31732)
Signed-off-by: mgoin <mgoin64@gmail.com>
2026-01-05 08:50:51 -08:00
TJian
578c8f51f6 [CI] [Critical] [CUDA] Fix duplicated test name (#31562)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-12-30 21:01:09 -08:00
Nicolò Lucchesi
ab1af6aa3e [CI][NIXL] Split DPEP tests (#31491)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-12-30 07:26:12 -05:00
Lucas Wilkinson
7e065eba59 [CI] Fix "2 Node Tests (4 GPUs in total)" (#31090)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-12-22 10:32:40 +08:00
Ameen Patel
93cabc417c ci: add nvidia-smi warmup before Prime-RL integration test (#31093)
Signed-off-by: AmeenP <ameenp360@gmail.com>
2025-12-21 15:43:01 +00:00
Lucas Wilkinson
ae0770fa6b [CI] Fix H200 Distributed test (#31054)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-12-20 16:48:49 -05:00
Nick Hill
45c0526ac9 [BugFix] Handle errors when preprocessing added requests (#30895)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-19 01:29:11 +00:00
Elizabeth Thomas
41b6f9200f Remove all2all backend envvar (#30363)
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-18 19:46:28 +00:00
Andrey Talman
e06d0bf0aa 2.9.1 PyTorch release update (#28495)
2025-12-17 12:20:22 -08:00
Chauncey
9ad5b21710 [Refactor] [4/N] Move VLLM_SERVER_DEV endpoints into the serve directory (#30749)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-12-17 02:27:30 -08:00
Michael Goin
10ee1c64cf [CI] Generalize gsm8k test args and add Qwen3-Next MTP B200 test (#30723)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-12-16 14:28:34 -05:00
Lucas Wilkinson
00a8d7628c [BugFix] Fix memory spike in workspace allocation (#30744)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-16 06:46:22 -08:00
Cyrus Leung
ed586e7724 [Refactor] [3/N] Move tool parser tests and run on CPU (#30693)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-15 13:45:36 +00:00
Michael Goin
2f32a68d75 [CI] Update several models in registry that are available online now (#30514)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-12-12 18:28:13 -08:00
Kevin H. Luu
b4039c08b5 [ci] Mark PrimeRL integration test as soft fail (#30578)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
2025-12-12 14:13:09 -08:00
shivampr
cd7740ac5c [ROCm] Enable Triton ScaledMM fallback + kernel selection fix (#26668)
Signed-off-by: Shivam <shivampr.dev@gmail.com>
Signed-off-by: Shivam <shivamprasad91@gmail.com>
2025-12-12 13:28:20 -05:00
Sage Moore
b4054c8ab4 Revert "[CI] Add Async Eplb nightly CI tests (#29385)" (#30431)
2025-12-11 00:48:35 +00:00
Ilya Markov
0b6a8a304c [BugFix] Fix non detected failing tests (#30277)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
2025-12-09 17:57:55 +00:00
Zhewen Li
263c38d74d [CI/Build] Update batch invariant test trigger (#30080)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-12-05 00:42:37 +00:00