nvfp4-megamoe-kernel/tests/unit
biondizzle aa41cfa2e5 Multi-head FMHA kernel (Milestone 5): grid launch with MHA/MQA/batch support
- fmha_6warp_multihead.cuh: grid=(1, n_h, batch) kernel with FmhaParams
- MQA support via k_head_stride=0 / v_head_stride=0
- LSE output for multi-segment KV merge composition
- test_fmha_6warp_multihead.cu: MHA (4+8 heads), MQA, batched tests
- HD-specific wrappers for hd=16/64/128/256
- Marked E2M1 dequant bug as FIXED in consultant issue file
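The grid=(1, n_h, batch) launch and the stride-0 MQA trick above can be sketched on the host as plain index arithmetic. This is only an illustration, not the kernel's code: the struct FmhaParamsSketch, the function head_offsets, and every field other than k_head_stride and v_head_stride are hypothetical names, and the mapping of blockIdx.y to head and blockIdx.z to batch is an assumption read off the grid shape.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of a parameter struct. Only k_head_stride and
// v_head_stride are named in the commit message; the other fields are assumed.
struct FmhaParamsSketch {
    std::int64_t q_head_stride;   // elements between consecutive Q heads
    std::int64_t k_head_stride;   // 0 means all query heads share one K head (MQA)
    std::int64_t v_head_stride;   // 0 means all query heads share one V head (MQA)
    std::int64_t q_batch_stride;  // elements between consecutive batch entries of Q
    std::int64_t k_batch_stride;
    std::int64_t v_batch_stride;
};

struct HeadOffsets {
    std::int64_t q;
    std::int64_t k;
    std::int64_t v;
};

// Assumed mapping for grid=(1, n_h, batch): head = blockIdx.y, batch = blockIdx.z.
// Each block offsets its base pointers by these amounts before running the
// single-head attention loop.
inline HeadOffsets head_offsets(const FmhaParamsSketch& p, int head, int batch) {
    assert(head >= 0 && batch >= 0);
    HeadOffsets o;
    o.q = batch * p.q_batch_stride + head * p.q_head_stride;
    o.k = batch * p.k_batch_stride + head * p.k_head_stride;  // head term vanishes for MQA
    o.v = batch * p.v_batch_stride + head * p.v_head_stride;
    return o;
}
```

With k_head_stride and v_head_stride set to 0, every head in a batch entry resolves to the same K/V offset, so MQA needs no separate code path; with nonzero strides the same arithmetic gives ordinary MHA.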
2026-05-28 19:32:35 +00:00
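For context on why an LSE output makes multi-segment KV merging possible, here is a minimal host-side sketch of the standard log-sum-exp merge of two partial attention results. It assumes each segment's output is already softmax-normalized within that segment and that the LSE is a natural-log sum of exponentiated scores; the kernel's actual convention (for example a base-2 LSE or a folded-in scale factor) is not stated in the commit message. Partial and merge_partials are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of attention over one KV segment for a single query row.
struct Partial {
    std::vector<float> out;  // output normalized by that segment's own softmax sum
    float lse;               // log(sum(exp(score))) over that segment (assumed natural log)
};

// Combine two segment results into the result attention would give over the
// concatenated KV. Subtracting the larger LSE keeps the exponentials in range.
// Empty segments (lse = -infinity on both sides) are not handled in this sketch.
inline Partial merge_partials(const Partial& a, const Partial& b) {
    assert(a.out.size() == b.out.size());
    const float m = std::fmax(a.lse, b.lse);
    const float wa = std::exp(a.lse - m);  // relative softmax mass of segment a
    const float wb = std::exp(b.lse - m);  // relative softmax mass of segment b
    const float denom = wa + wb;

    Partial r;
    r.out.resize(a.out.size());
    for (std::size_t i = 0; i < a.out.size(); ++i) {
        r.out[i] = (wa * a.out[i] + wb * b.out[i]) / denom;
    }
    r.lse = m + std::log(denom);  // LSE over the union of both segments
    return r;
}
```

Because the merged result carries its own LSE, the operation can be applied repeatedly to fold in any number of segments, in any order.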