Li, Jiang
|
12449f9492
|
[Bugfix][CPU] Skip set_num_threads after thread binding (#38535)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
(cherry picked from commit 6557f4937f)
|
2026-03-30 23:01:42 -07:00 |
|
Andreas Karatzas
|
4f2ed5fddb
|
[ROCm][CI] Enable hybrid chunked prefill test (#38317)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-30 10:30:26 +08:00 |
|
Kyle Sayers
|
d28d86e8a3
|
[QeRL] Fix online quantized reloading (#38442)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
|
2026-03-29 14:56:41 -06:00 |
|
TJian
|
58a249bc61
|
[ROCm] [Release] Update ROCm variant from rocm700 to rocm721 (#38413)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2026-03-28 06:07:03 +00:00 |
|
Sage Moore
|
497e234d38
|
[EPLB] Cleanup the transfer logic for the various eplb maps (#34520)
Signed-off-by: Sage Moore <sagmoore@redhat.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2026-03-27 10:18:46 +01:00 |
|
Shengqi Chen
|
84e439a9cb
|
[CI/Build] Move nightly wheel index generation to a single post-build step (#38322)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2026-03-27 07:44:18 +00:00 |
|
wenjun liu
|
d86060122a
|
[CI/Build] enable Intel XPU test flow with prebuilt image (#37447)
Signed-off-by: wendyliu235 <wenjun.liu@intel.com>
|
2026-03-26 18:16:04 -07:00 |
|
Giancarlo Delfin
|
c32e97602d
|
[Model Runner V2] Enable forcing a specific acceptance rate during rejection sampling (#38045)
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
|
2026-03-26 13:38:12 -07:00 |
|
TJian
|
bc9c6fbbe6
|
[ROCm] [Bugfix] [Release] Fix nightly rocm release pipeline (#38263)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2026-03-26 18:47:10 +00:00 |
|
Andreas Karatzas
|
bff9a1c266
|
[ROCm][CI] Override PYTORCH_ROCM_ARCH with detected GPU arch in test containers (#38165)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-26 18:33:45 +00:00 |
|
Andreas Karatzas
|
9c3ae04bfe
|
[ROCm][CI] Add LM Eval Qwen3.5 Models test for MI355 (#38155)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-26 16:51:18 +00:00 |
|
TJian
|
60af7b967b
|
[Releases] [ROCm] Enable Nightly Docker Image and Wheel Releases for ROCm (#37283)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: Hongxia Yang <hongxiay.yang@amd.com>
|
2026-03-26 16:32:25 +00:00 |
|
Wentao Ye
|
e054f152fa
|
[CI] Add batch invariant test for b200 (#38014)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-03-26 11:54:54 -04:00 |
|
Fadi Arafeh
|
71161e8b63
|
[cpu][ci] remove soft-fail for Arm CI and add quant model tests (#37691)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
|
2026-03-26 07:03:31 +00:00 |
|
Richard Zou
|
6e37c46b35
|
[compile] Add some more startup tests for top models (#38046)
Signed-off-by: Richard Zou <zou3519@gmail.com>
|
2026-03-25 12:02:22 -04:00 |
|
Andreas Karatzas
|
04417ecd5f
|
[ROCm][CI] Rename filepath test to point to correct file (#38102)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-25 20:05:46 +08:00 |
|
Gregory Shtrasberg
|
189ddefbfd
|
[ROCm] Attention selector reordering (#36702)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
|
2026-03-25 17:42:56 +08:00 |
|
Andreas Karatzas
|
679c6a3ecc
|
[Bugfix][ROCm][MoE] Fix mxfp4 oracle regressions from #37128 (#37787)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-25 08:17:33 +08:00 |
|
Andreas Karatzas
|
8bbb7c7f20
|
[ROCm][CI][PD] Add Hybrid SSM integration tests to CI (#37924)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-25 07:58:39 +08:00 |
|
Kevin H. Luu
|
af945615b5
|
[release] Move the rest of release jobs to release queue (#38044)
Signed-off-by: khluu <khluu000@gmail.com>
|
2026-03-24 16:40:58 -07:00 |
|
amey asgaonkar
|
0c1809c806
|
Add Ubuntu 24.04 support for Docker builds (#35386)
Signed-off-by: aasgaonkar <aasgaonkar@nvidia.com>
|
2026-03-24 13:34:44 -07:00 |
|
Li, Jiang
|
352b90c4a4
|
[Bugfix] Add replacement of _compute_slot_mapping_kernel on CPU (#37987)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-03-24 07:00:20 -07:00 |
|
Kevin H. Luu
|
7281199a8c
|
[release] Move agent queue to Release cluster queues (#37783)
Signed-off-by: khluu <khluu000@gmail.com>
|
2026-03-23 20:36:47 -07:00 |
|
Kevin H. Luu
|
b2dd75eb48
|
Downsize CPU jobs to use small queue (#37913)
Signed-off-by: khluu <khluu000@gmail.com>
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2026-03-23 20:36:37 -07:00 |
|
Andreas Karatzas
|
de99d91ece
|
[ROCm][CI] Split Entrypoints Integration (API Server 1) into 3 jobs (#37906)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-24 09:48:37 +08:00 |
|
Wentao Ye
|
83c9d525b6
|
[CI] Add batch invariant test: Block FP8 + small MOE (#37895)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-03-23 21:16:14 -04:00 |
|
roikoren755
|
56777b5c89
|
[Test] E2E Nemotron-3-Super tests (#36803)
Signed-off-by: Roi Koren <roik@nvidia.com>
|
2026-03-23 17:49:56 -07:00 |
|
Kevin H. Luu
|
2488a82f89
|
[CI] Split V1 Others into 3 separate jobs (#37016)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-24 06:44:38 +08:00 |
|
Kyle Sayers
|
38364a7e32
|
[Sparse24] [Deprecation] Remove Sparse24 CT integration and kernels (#36799)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
|
2026-03-23 16:03:29 -04:00 |
|
Kunshang Ji
|
91fd695b75
|
[CI] split Entrypoints Integration (API Server 1) into 3 jobs (#37882)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-03-23 10:37:56 -07:00 |
|
Nicolò Lucchesi
|
1cbbcfe8a3
|
[CI][PD] Add Hybrid SSM integration tests to CI (#37657)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2026-03-23 23:58:19 +08:00 |
|
Jee Jee Li
|
1f0d210641
|
[CI/Build][LoRA] Update Qwen35 LoRA testing (#37816)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2026-03-23 12:55:49 +08:00 |
|
Woosuk Kwon
|
43877a620b
|
[MRV2] Enable PP CUDA graph test (#37830)
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
|
2026-03-22 16:30:25 -07:00 |
|
Andreas Karatzas
|
6eedec6e36
|
[ROCm][CI] Make some duplicated tests optional so that they are only evaluated in our nightly (#37780)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-22 16:03:18 +08:00 |
|
Andreas Karatzas
|
e78bc74268
|
[ROCm][CI] close missing quote in kernels/moe block in run-amd-test.sh (#37774)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-22 09:42:34 +08:00 |
|
Robert Shaw
|
eeee5b262d
|
[Quantization][Deprecation] Remove PTPC FP8 (#32700)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-03-21 22:10:16 +00:00 |
|
Andreas Karatzas
|
02eec7ecbe
|
[ROCm][CI] Update GSM8K eval config to use fp8-and-mixed models list (MI355) (#37721)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-21 15:27:12 +08:00 |
|
Andreas Karatzas
|
0d50fa1db6
|
[ROCm][CI] Mark gemma3 as large GPU test to avoid OOM on MI250 (#37610)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-21 12:57:25 +08:00 |
|
Andreas Karatzas
|
8bc6b5cdb0
|
[ROCm][CI] Setting some mi325_4 tests back to optional (in parity with upstream) (#37711)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-20 12:25:08 -07:00 |
|
Vadim Gimpelson
|
4f16ebbbd3
|
[Bugfix] Disable monolithic TRTLLM MoE for Renormalize routing (#37591) (#37605)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
|
2026-03-20 12:19:26 -07:00 |
|
Lucas Wilkinson
|
e1d85e5c24
|
[Attention] Support distinguishing between short extends and decodes (#37303)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-03-20 10:49:36 -07:00 |
|
Flora Feng
|
b4c1aef21c
|
[Refactor] Relocate tests from tests/v1/entrypoints/ to tests/entrypoints/ (#37500)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-03-20 02:50:34 -07:00 |
|
Flora Feng
|
6050b93bed
|
[Refactor] Move serve entrypoint tests under tests/entrypoints/serve/ (#37595)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-03-20 02:10:47 -07:00 |
|
Andreas Karatzas
|
37cd9fc107
|
[ROCm][CI] Remove deepep DBO tests on gfx90a (#37614)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-20 17:07:07 +08:00 |
|
Andreas Karatzas
|
9cfd4ebb5e
|
[ROCm][CI] Update GSM8K eval config to use fp8-and-mixed models list (#37619)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-20 17:06:53 +08:00 |
|
Andreas Karatzas
|
bd8c4c0752
|
[CI] Removing deprecated rlhf examples reference (#37585)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-20 15:20:33 +08:00 |
|
Flora Feng
|
e2d1c8b5e8
|
[Refactor] Relocate entrypoint tests to match serving code structure (#37593)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-03-20 05:31:23 +00:00 |
|
Jee Jee Li
|
8fbe3f303f
|
[Bugfix][LoRA] Fix Qwen35 LoRA (#36976)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2026-03-20 11:09:32 +08:00 |
|
Andreas Karatzas
|
040a505ff5
|
[ROCm][CI] Cleaning and restructuring amd-ci legacy pipeline (#34839)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-19 14:30:58 -05:00 |
|
Cyrus Leung
|
c7bc12c20f
|
[CI/Build] Split out MM pooling tests (#37542)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-19 11:36:11 +00:00 |
|