nvfp4-megamoe-kernel/tests/unit
biondizzle e173295a3a FMHA SM100: Refactor into common + reference + TMEM epilogue headers
- fmha_common.cuh: BF16, TMEM ops, warp reductions (shared)
- fmha_sm100.cuh: Phase 1 reference (SMEM-based, cosine similarity 0.999999)
- fmha_epilogue_sm100.cuh: Phase 2 TMEM+correction epilogue (Priority 2)
- Test both kernels at hd=64 and hd=128
2026-05-28 06:31:05 +00:00
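The commit message reports a cosine figure for the Phase 1 reference kernel. A minimal sketch of how such a check could be computed on the host follows, assuming the figure is the cosine similarity between the kernel output and a reference output, both copied back and flattened to `float`. The helper name `cosine_similarity` is hypothetical and does not come from the repository.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repository): cosine similarity between a
// kernel output and a reference output, both flattened to float on the host.
// Sums are accumulated in double so rounding in the check itself stays small.
double cosine_similarity(const std::vector<float>& out,
                         const std::vector<float>& ref) {
    // Mismatched or empty inputs are treated as "no similarity".
    if (out.size() != ref.size() || out.empty()) return 0.0;

    double dot = 0.0, norm_out = 0.0, norm_ref = 0.0;
    for (std::size_t i = 0; i < out.size(); ++i) {
        const double a = out[i];
        const double b = ref[i];
        dot += a * b;
        norm_out += a * a;
        norm_ref += b * b;
    }

    // Guard against an all-zero vector, where the ratio is undefined.
    const double denom = std::sqrt(norm_out * norm_ref);
    return denom > 0.0 ? dot / denom : 0.0;
}
```

A unit test could run each kernel at hd=64 and hd=128 and require the returned value to exceed a chosen threshold; the threshold itself is an assumption and should be picked to match the BF16 precision of the outputs.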