Files
nvfp4-megamoe-kernel/dsv4
biondizzle 0f6907b001 UMMA: fix descriptor + idesc — use gau-nernst tutorial values
- LBO = BLOCK_MN * 16 (bytes), SBO = 128 (bytes) for K-major NONE
- Canonical SMEM layout: column-major interleaving of core matrices
- idesc is SEPARATE 32-bit value (was using desc_a>>32 = WRONG)
- idesc encodes dtype/atype/btype/MMA_M/MMA_N
- This was the root cause of 'misaligned address' errors
2026-05-28 09:18:45 +00:00
..