Andreas Karatzas
|
2df2c85be4
|
[Kernels][MoE] Fix legacy_routing to use bitmatrix-based routing path (#38504)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-04-07 10:57:09 +08:00 |
|
Bowen Bao
|
201d2ea5bf
|
[CI][ROCm] Add Qwen3.5-35B-A3B-MXFP4 model eval into CI (#38664)
Signed-off-by: Bowen Bao <bowenbao@amd.com>
|
2026-04-03 04:05:45 +00:00 |
|
Bowen Bao
|
103f0de565
|
[ROCm][Quantization][1/N] Refactor quark_moe w_mxfp4 w/ oracle (#38774)
Signed-off-by: Bowen Bao <bowenbao@amd.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-04-03 03:29:57 +00:00 |
|
Bowen Bao
|
82a006beeb
|
[CI][ROCm] Add gpt-oss w4a8 in CI (#38292)
Signed-off-by: Bowen Bao <bowenbao@amd.com>
|
2026-04-03 00:06:01 +08:00 |
|
Jiangyun Zhu
|
ea7bfde6e4
|
[CI] fix LM Eval Qwen3.5 Models (B200) (#38632)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2026-03-31 13:20:08 +00:00 |
|
Andreas Karatzas
|
9c3ae04bfe
|
[ROCm][CI] Add LM Eval Qwen3.5 Models test for MI355 (#38155)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-26 16:51:18 +00:00 |
|
Vadim Gimpelson
|
52069012fe
|
[Bugfix] Fix DeepGemm E8M0 accuracy degradation for Qwen3.5 FP8 on Blackwell (#38083)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
|
2026-03-26 01:21:47 -07:00 |
|
roikoren755
|
56777b5c89
|
[Test] E2E Nemotron-3-Super tests (#36803)
Signed-off-by: Roi Koren <roik@nvidia.com>
|
2026-03-23 17:49:56 -07:00 |
|
Vadim Gimpelson
|
4f16ebbbd3
|
[Bugfix] Disable monolithic TRTLLM MoE for Renormalize routing (#37591) (#37605)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
|
2026-03-20 12:19:26 -07:00 |
|
Andreas Karatzas
|
9cfd4ebb5e
|
[ROCm][CI] Update GSM8K eval config to use fp8-and-mixed models list (#37619)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-20 17:06:53 +08:00 |
|
Andreas Karatzas
|
040a505ff5
|
[ROCm][CI] Cleaning and restructuring amd-ci legacy pipeline (#34839)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-19 14:30:58 -05:00 |
|
Michael Goin
|
09e4576f65
|
[Kernel] Add non-gated support for NVFP4 CUTLASS MoE (#37320)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-03-17 18:12:04 -04:00 |
|
Andreas Karatzas
|
179547d62c
|
[ROCm][CI] Fix ROCm GPT-OSS Eval test group (#36179)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-09 17:55:20 -07:00 |
|
Alexei-V-Ivanov-AMD
|
225d1090a0
|
Enabling some B200-specific tests on MI355 (#35253)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
Signed-off-by: Alexei-V-Ivanov-AMD <156011006+Alexei-V-Ivanov-AMD@users.noreply.github.com>
|
2026-03-06 19:27:20 +00:00 |
|
Robert Shaw
|
881a6b011b
|
[CI] Temporarily Disable Llama4 MoE Refactor Test (#35870)
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-03-03 10:36:15 -08:00 |
|
Matthew Bonanni
|
8e1fd5baf0
|
[CI] Bump num_speculative_tokens to 3 in nightly DeepSeek tests (#35882)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-03-03 09:26:44 -08:00 |
|
Robert Shaw
|
6521ccf286
|
[CI] Temporarily Disable Nightly Failures (#35770)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-03-03 01:49:13 +00:00 |
|
Michael Goin
|
de527e1cec
|
[UX] Add --moe-backend arg for explicit kernel selection (#33807)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-02-25 17:44:44 -08:00 |
|
Yongye Zhu
|
1976356ee6
|
[MoE Refactor] MXFP4 Cutlass Experts to MK (#34542)
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
|
2026-02-25 17:32:39 -08:00 |
|
Benjamin Chislett
|
ee59a7c615
|
[Tests] Add GSM8k check to SpecDec E2E tests (#34772)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-02-25 07:51:14 -05:00 |
|
Michael Goin
|
caeb887bf6
|
[Bugfix] Fix NVFP4 TRTLLM MoE non-gated support; add gsm8k for Nemotron-3-Nano FP8+NVFP4 (#34725)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-02-18 09:39:22 -08:00 |
|
Linda
|
275e0d2a99
|
[NVIDIA][test] Tests for flashinfer TRTLLM BF16 MoE (#33715)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Co-authored-by: Pavani Majety <pmajety@nvidia.com>
|
2026-02-11 12:38:11 +00:00 |
|
Matthew Bonanni
|
9f8cb81b44
|
[CI] Add DeepSeek V3.2 nightly eval (#33566)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-02-02 16:10:02 +00:00 |
|
Robert Shaw
|
254db42ede
|
[Tests] Remove Duplicates (#33032)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-01-26 05:23:54 +00:00 |
|
Robert Shaw
|
42135d6898
|
[MoE Refactor] Oracle Select FP8+NVFP4 Kernels In Priority (#32414)
|
2026-01-21 08:22:33 -05:00 |
|
jiahanc
|
7350331718
|
[BugFix] Fix TRT-LLM NVFP4 DP/EP (#32349)
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-01-19 14:32:24 -05:00 |
|
Micah Williamson
|
b84c426a8c
|
[ROCm][CI] Skip Qwen3-30B-A3B-MXFP4A16 Eval Test On Non-CUDA Platforms (#32460)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-01-16 00:17:44 -08:00 |
|
Yongye Zhu
|
31c29257c8
|
[MoE Refactor][17/N] Apply Refactor to Bf16 (#31827)
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-01-15 12:53:40 -08:00 |
|
Dipika Sikka
|
361dfdc9d8
|
[Quant] Support MXFP4 W4A16 for compressed-tensors MoE models (#32285)
Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-01-15 07:25:55 -08:00 |
|
Robert Shaw
|
0fa8dd24d2
|
[Bugfix] Fix Typo from NVFP4 Refactor (#31977)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-01-08 16:18:50 -08:00 |
|
Robert Shaw
|
5dcd7ef1f2
|
[MoE Refactor][15/N] Apply Refactor to Fp8 (#31415)
|
2026-01-07 19:42:33 -05:00 |
|
Robert Shaw
|
d3e477c013
|
[MoE Refactor] Add Temporary Integration Tests - H100/B200 (#31759)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-01-06 10:34:17 -05:00 |
|
Matthew Bonanni
|
276e03b92c
|
[CI][DeepSeek] Add nightly DeepSeek R1 lm_eval tests on H200 (#30356)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-01-05 17:17:59 -05:00 |
|
Vadim Gimpelson
|
bc0a5a0c08
|
[CI] Add Qwen3-Next-FP8 to Blackwell model tests (#31049)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
|
2025-12-23 17:21:50 -08:00 |
|
Michael Goin
|
10ee1c64cf
|
[CI] Generalize gsm8k test args and add Qwen3-Next MTP B200 test (#30723)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-12-16 14:28:34 -05:00 |
|
Michael Goin
|
c9a3a02149
|
Add output token counting to gsm8k eval (#28594)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-11-14 09:32:03 +00:00 |
|
Robert Shaw
|
e605e8e323
|
[Bugfix] Fix Stream Sync for Shared Expert Overlap (#28430)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Co-authored-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
|
2025-11-11 05:59:08 +00:00 |
|
Zhewen Li
|
a65a934ebe
|
[CI/Build] Temporary fix to LM Eval Small Models (#28324)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-11-09 21:08:38 +00:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Michael Goin
|
30a3e5af69
|
[CI] Add Qwen3 MoE NVFP4 to Blackwell lm-eval (#26316)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-07 10:36:15 -07:00 |
|
Michael Goin
|
60bc25e74c
|
[CI] Add Blackwell LM Eval Small Models test to nightly (#26052)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-05 14:59:50 -06:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Michael Goin
|
ee04c0cd04
|
[CI] Tweaks to GPT-OSS Eval (Blackwell) for stability (#26030)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-01 12:02:17 -07:00 |
|
WeiQing Chen
|
793be8d057
|
[Docs] GSM8K Accuracy Evaluation doc update (#25360)
Signed-off-by: David Chen <530634352@qq.com>
|
2025-09-22 02:49:13 +00:00 |
|
Michael Goin
|
493b10f8bf
|
[CI] GPT-OSS GPQA eval test for Blackwell (#24920)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-09-16 18:13:21 -07:00 |
|
Wentao Ye
|
3c96e7b8a1
|
[CI] Small Accuracy Eval Test for Deepseek Model (#24259)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-09-15 20:14:50 -06:00 |
|
Michael Goin
|
0f4f0191d8
|
[CI/Build] Replace lm-eval gsm8k tests with faster implementation (#23002)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-08-19 15:07:30 -07:00 |
|