nvfp4-megamoe-kernel/dsv4/kernels
biondizzle 446a0ca9fd refactor(tmem): clean rewrite of TMEM epilogue kernel
Removed all dead code from the first (broken) attention loop approach.
Clean pipeline: SMEM attention → TMEM write → TMEM read → normalize → GMEM.

Also renamed sPvBuf to sO for clarity, matching the name used in the reference kernel.
2026-05-28 07:49:03 +00:00