nvfp4-megamoe-kernel/tests/unit
biondizzle ba67e055f7 Add production FMHA layer comparison test
The test loads real model weights, runs the attention forward pass for
layers 0-4, and compares the production B1 mixed FP8 FMHA output against
a PyTorch SDPA reference. This should expose the FMHA cosine degradation
(was 0.679 at L1) on real data patterns, not just synthetic random data.

Production values: HD=512, NOPE=448, ROPE=64, H=128, 8 GPUs.
2026-06-03 02:22:23 +00:00
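
The comparison described in the commit message can be sketched roughly as follows. This is not the repository's test: the helper names `cosine_vs_reference` and `run_sketch`, the toy shapes, and the use of a bfloat16 round-trip as a stand-in for the FP8 FMHA kernel are all assumptions made for illustration. It only shows the general shape of scoring a lower-precision attention output against `torch.nn.functional.scaled_dot_product_attention` with cosine similarity.

```python
import torch
import torch.nn.functional as F


def cosine_vs_reference(candidate: torch.Tensor, reference: torch.Tensor) -> float:
    """Cosine similarity of two tensors, flattened to 1-D in float32."""
    a = candidate.flatten().float()
    b = reference.flatten().float()
    return F.cosine_similarity(a, b, dim=0).item()


def run_sketch(seed: int = 0) -> float:
    torch.manual_seed(seed)
    # Toy shapes (assumption): batch 1, 4 heads, 16 tokens, head dim 64.
    # The production values in the commit message (HD=512, H=128) are far larger.
    q, k, v = (torch.randn(1, 4, 16, 64) for _ in range(3))

    # Reference path: float32 PyTorch SDPA.
    ref = F.scaled_dot_product_attention(q, k, v)

    # Hypothetical stand-in for the low-precision kernel: round the inputs
    # through bfloat16, then run the same SDPA call in float32.
    ql, kl, vl = (t.to(torch.bfloat16).float() for t in (q, k, v))
    out = F.scaled_dot_product_attention(ql, kl, vl)

    return cosine_vs_reference(out, ref)


cos = run_sketch()
print(f"cosine vs SDPA reference: {cos:.6f}")
```

A real test would substitute the production kernel's output for `out` and feed it weights and activations from the model, which is where a value such as the 0.679 quoted above would show up.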