9 Commits

Author SHA1 Message Date
Michael Goin
de527e1cec [UX] Add --moe-backend arg for explicit kernel selection (#33807)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
2026-02-25 17:44:44 -08:00
Michael Goin
caeb887bf6 [Bugfix] Fix NVFP4 TRTLLM MoE non-gated support; add gsm8k for Nemotron-3-Nano FP8+NVFP4 (#34725)
Signed-off-by: mgoin <mgoin64@gmail.com>
2026-02-18 09:39:22 -08:00
Linda
275e0d2a99 [NVIDIA][test] Tests for flashinfer TRTLLM BF16 MoE (#33715)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Co-authored-by: Pavani Majety <pmajety@nvidia.com>
2026-02-11 12:38:11 +00:00
Robert Shaw
254db42ede [Tests] Remove Duplicates (#33032)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2026-01-26 05:23:54 +00:00
Robert Shaw
42135d6898 [MoE Refactor] Oracle Select FP8+NVFP4 Kernels In Priority (#32414)
2026-01-21 08:22:33 -05:00
Yongye Zhu
31c29257c8 [MoE Refactor][17/N] Apply Refactor to Bf16 (#31827)
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
2026-01-15 12:53:40 -08:00
Robert Shaw
0fa8dd24d2 [Bugfix] Fix Typo from NVFP4 Refactor (#31977)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2026-01-08 16:18:50 -08:00
Robert Shaw
5dcd7ef1f2 [MoE Refactor][15/N] Apply Refactor to Fp8 (#31415)
2026-01-07 19:42:33 -05:00
Robert Shaw
d3e477c013 [MoE Refactor] Add Temporary Integration Tests - H100/B200 (#31759)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2026-01-06 10:34:17 -05:00