nvfp4-megamoe-kernel/src
biondizzle 6626b75a2f fix: use filter_zeros for SF allocation + no-branch forward mapping
- Allocation: cute::size(cute::filter_zeros(layout)) matches CUTLASS examples
- Kernel: layout_sf(make_coord(mn, k_sf*16, 0)) — no branching on LayoutRank
- Avoids silent fallthrough that wrote dst[0] for all threads
2026-05-15 22:58:51 +00:00
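
The commit message above turns on one property of CuTe layouts: a scale-factor (SF) layout has stride-0 modes, because every element in a block along K shares one scale factor. `cute::size(layout)` counts all logical coordinates, while `cute::filter_zeros` replaces stride-0 modes with extent-1 modes, so the filtered size counts only distinct SF slots. The sketch below models this in plain C++ without CUTLASS. Everything in it is an assumption for illustration: `Mode`, `kLayoutSf`, `logical_size`, `filtered_size`, `sf_offset`, the 128 x 64 problem size, and the simple row-major SF arrangement are hypothetical, and the real SF layout in this repository is likely tiled differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for one mode of a CuTe layout: an extent and a stride.
struct Mode {
    std::size_t shape;
    std::size_t stride;
};

constexpr std::size_t kSfVec = 16;  // assumed: 16 K-elements share one scale factor
constexpr std::size_t kM = 128;     // assumed problem size
constexpr std::size_t kK = 64;

// Toy SF layout over logical (mn, k), split into three modes.
constexpr std::array<Mode, 3> kLayoutSf{{
    {kM, kK / kSfVec},  // mn: each row holds K/16 scale factors
    {kSfVec, 0},        // k inside a block: stride 0, all 16 map to the same SF
    {kK / kSfVec, 1},   // k block index
}};

// Analogue of cute::size(layout): product of every extent.
constexpr std::size_t logical_size(const std::array<Mode, 3>& l) {
    std::size_t n = 1;
    for (std::size_t i = 0; i < l.size(); ++i) n *= l[i].shape;
    return n;
}

// Analogue of cute::size(cute::filter_zeros(layout)):
// stride-0 modes are treated as extent 1, so only distinct SF slots are counted.
constexpr std::size_t filtered_size(const std::array<Mode, 3>& l) {
    std::size_t n = 1;
    for (std::size_t i = 0; i < l.size(); ++i) {
        n *= (l[i].stride == 0) ? 1 : l[i].shape;
    }
    return n;
}

// Forward mapping from a logical (mn, k) coordinate to an SF offset.
// There is no branch on the layout's rank: the coordinate is always evaluated.
constexpr std::size_t sf_offset(const std::array<Mode, 3>& l,
                                std::size_t mn, std::size_t k) {
    const std::size_t coord[3] = {mn, k % kSfVec, k / kSfVec};
    std::size_t off = 0;
    for (std::size_t i = 0; i < 3; ++i) off += coord[i] * l[i].stride;
    return off;
}

// Allocating by the logical size would over-allocate by a factor of kSfVec.
static_assert(logical_size(kLayoutSf) == kM * kK, "counts every logical coordinate");
static_assert(filtered_size(kLayoutSf) == kM * (kK / kSfVec), "counts distinct SF slots");
// Passing k = k_sf * 16 selects SF block k_sf, mirroring the commit's forward mapping.
static_assert(sf_offset(kLayoutSf, 3, 2 * kSfVec) == 3 * (kK / kSfVec) + 2, "row 3, block 2");
```

In this toy layout the largest offset is `filtered_size(kLayoutSf) - 1`, so a buffer of the filtered size is exactly large enough. Evaluating the mapping unconditionally also means no thread can fall through with a default offset of 0, which is the failure the last bullet describes.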