nvfp4-megamoe-kernel/tests at e8b289e30d127cd049558f174c016b1942e1fe2b

biondizzle/nvfp4-megamoe-kernel

biondizzle e8b289e30d WIP: CuTeDSL shared expert kernel

Dedicated runner (shared_expert_pipeline.py) and test (test_shared_expert.py).
Tried reusing MoE runner with 1 expert — fails because MoE runner assumes
hidden_size != HC_DIM for scatter. Need dedicated runner with correct
scale assembly. Will continue tomorrow.

2026-05-18 20:02:19 +00:00

| File | Last commit | Date |
| --- | --- | --- |
| cudagraph_test.py | fix: test L2 weight N dim should be hidden_size, not hidden_size//2 | 2026-05-16 19:07:36 +00:00 |
| debug_output.py | Update CURRENT_BUG.md: current status, outstanding garbage output issue, hypotheses | 2026-05-17 16:52:40 +00:00 |
| layertest.py | restore: new bridge/moe_pipeline/layertest | 2026-05-16 19:55:19 +00:00 |
| requirements.txt | test: add standalone layer 0 comparison test (no vLLM, no Docker) | 2026-05-16 02:13:18 +00:00 |
| run_test.sh | fix: use setup.py install for CUTLASS extension build | 2026-05-16 02:21:17 +00:00 |
| test_b_layout.py | cleanup: move useful tests to tests/, nuke stale debug tests | 2026-05-16 02:14:37 +00:00 |
| test_cutedsl.py | fix: B tensor K-major strides, scale_b axis swap | 2026-05-16 03:04:31 +00:00 |
| test_multilayer.py | Add MoE scale ratio output | 2026-05-17 22:58:27 +00:00 |
| test_pipeline_real_weights.py | Pipeline test: use max_num_tokens=8192 matching vLLM | 2026-05-17 23:04:44 +00:00 |
| test_quick_rand.py | cleanup: move useful tests to tests/, nuke stale debug tests | 2026-05-16 02:14:37 +00:00 |
| test_runner_vs_pipeline.py | test: runner vs pipeline comparison + scale assembly comparison | 2026-05-17 07:33:20 +00:00 |
| test_scale_assembly.py | fix: separate L1/L2 scale buffers (different K_sf), fix assembly calls | 2026-05-17 07:43:05 +00:00 |
| test_scale_debug.py | test: scale assembly debug | 2026-05-17 07:37:47 +00:00 |
| test_shared_expert.py | WIP: CuTeDSL shared expert kernel | 2026-05-18 20:02:19 +00:00 |
| test_uniform_fp4.py | cleanup: move useful tests to tests/, nuke stale debug tests | 2026-05-16 02:14:37 +00:00 |
| test_warmup_gs.py | test: use runner's built-in warmup method | 2026-05-17 08:24:27 +00:00 |