nvfp4-megamoe-kernel

Files

biondizzle 9b86b2b414 Test: fix fused router test - proper NVFP4 quantization and CuTe tensor setup

- Use quantize_to_nvfp4 for weight quantization
- Use quantize_activation_nvfp4 with computed global_scale
- Get mat_b and scale_b from Nvfp4Linear after finalize_weights
- Compare against both BF16 reference and NVFP4 GEMM reference

2026-06-01 08:56:20 +00:00
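
The quantize-then-compare flow described in the commit above can be illustrated with a minimal NumPy sketch. This is not the repository's code: the signatures of `quantize_to_nvfp4`, `quantize_activation_nvfp4`, and `Nvfp4Linear` are not shown here, so the sketch uses a hypothetical helper, `fake_quantize_nvfp4`, that quantizes and immediately dequantizes. It assumes the commonly used NVFP4 layout (E2M1 values, blocks of 16, a global scale of `448 * 6 / amax`) and, as a simplification, keeps block scales in full precision instead of rounding them to FP8 E4M3.

```python
import numpy as np

# Magnitudes representable in FP4 E2M1 (the sign is handled separately).
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_MAX = 6.0         # largest E2M1 magnitude
FP8_E4M3_MAX = 448.0  # largest FP8 E4M3 magnitude
BLOCK = 16            # assumed NVFP4 block size along the last dimension


def fake_quantize_nvfp4(x):
    """Hypothetical helper: quantize x to the E2M1 grid, then dequantize.

    Returns the dequantized array and the global scale that was used.
    Block scales stay in float64 here (assumption: no FP8 rounding).
    """
    x = np.asarray(x, dtype=np.float64)
    amax = np.abs(x).max()
    # Assumed convention: map the tensor-wide maximum onto FP8_MAX * FP4_MAX.
    global_scale = (FP8_E4M3_MAX * FP4_MAX) / amax if amax > 0 else 1.0

    blocks = x.reshape(-1, BLOCK)
    block_amax = np.abs(blocks).max(axis=1, keepdims=True)
    # Block scale as it would be stored; at most FP8_E4M3_MAX by construction.
    block_scale = global_scale * block_amax / FP4_MAX
    # Effective per-block decode factor; guard all-zero blocks.
    decode = block_scale / global_scale
    decode = np.where(decode == 0, 1.0, decode)

    scaled = blocks / decode  # magnitudes now lie in [0, 6]
    idx = np.abs(np.abs(scaled)[..., None] - E2M1_GRID).argmin(axis=-1)
    quantized = np.sign(scaled) * E2M1_GRID[idx]
    return (quantized * decode).reshape(x.shape), global_scale


# Compare a quantized GEMM against a full-precision reference.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 64))    # activations
w = rng.standard_normal((32, 64))   # weights, one row per output feature

ref = a @ w.T
a_q, a_global_scale = fake_quantize_nvfp4(a)
w_q, w_global_scale = fake_quantize_nvfp4(w)
out = a_q @ w_q.T

rel_err = np.linalg.norm(out - ref) / np.linalg.norm(ref)
print(f"relative error vs. full-precision reference: {rel_err:.4f}")
```

Comparing against two references, as the commit does, separates two error sources: the gap to the high-precision reference reflects quantization error, while the gap to an independent NVFP4 GEMM reference reflects kernel bugs.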

| Name | Last commit | Date |
|---|---|---|
| e2e | E3: model construction test | 2026-05-30 21:22:34 +00:00 |
| integration | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00 |
| unit | Test: fix fused router test - proper NVFP4 quantization and CuTe tensor setup | 2026-06-01 08:56:20 +00:00 |
| check_log.sh | Add check_log.sh convenience script | 2026-05-22 17:07:23 +00:00 |
| compare_hf_reference.py | Add HuggingFace reference comparison test | 2026-05-31 12:05:19 +00:00 |
| compare_layer0.py | Add HF reference test script | 2026-05-31 20:11:37 +00:00 |
| layer_compare.py | Fix remaining mHC API references: layer_compare.py, layer.py comment | 2026-05-31 18:38:34 +00:00 |
| requirements.txt | test: add standalone layer 0 comparison test (no vLLM, no Docker) | 2026-05-16 02:13:18 +00:00 |
| run_test.sh | run_test.sh: SIGKILL all children of screen session on cleanup | 2026-05-22 17:08:12 +00:00 |
| test_minimal_e2e.py | Fix mHCBlock import + relax RoPE round-trip threshold (BF16 noise expected) | 2026-05-31 09:17:07 +00:00 |
| test_residual_diagnostic.py | Fix expert weight indexing for 1D tensor | 2026-05-31 09:23:10 +00:00 |
| validate_layer.py | Fix dtype mismatch in validate_layer: cast flat to float before F.linear | 2026-05-31 20:23:18 +00:00 |
| verify_attention.py | fix verify_attention: proper multi-head SDPA + GQA | 2026-05-31 05:55:10 +00:00 |