nvfp4-megamoe-kernel/tests
biondizzle bd16e8fa85 fix: use tcgen05.wait::st/ld instead of nonexistent tcgen05.fence
ROOT CAUSE of TMEM hang: tcgen05.fence.cta_group::1.sync.aligned is
NOT a valid PTX instruction. The correct TMEM ordering primitives are:
- tcgen05.wait::st.sync.aligned (wait for TMEM stores to complete)
- tcgen05.wait::ld.sync.aligned (wait for TMEM loads to complete)

Both were found in cutlass/arch/barrier.h, in fence_view_async_tmem_store()
and fence_view_async_tmem_load().
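
For illustration, the two wait primitives named above could be wrapped roughly as follows. This is a minimal sketch, not the code from this commit: the names tmem_wait_st, tmem_wait_ld, TMEM_DEVICE and TMEM_WAIT_ENABLED are hypothetical, and the macro is assumed to be defined by the build only when targeting an architecture that supports tcgen05. Without it the wrappers compile to no-ops, so the file also builds as plain host C++.

```cpp
// Hypothetical wrappers; none of these names come from the repository.
// TMEM_WAIT_ENABLED is an assumed, user-defined macro: define it only when
// compiling device code for a target that supports tcgen05 instructions.
#if defined(__CUDACC__)
#define TMEM_DEVICE __device__ __forceinline__
#else
#define TMEM_DEVICE inline
#endif

// Block until the calling thread's earlier TMEM stores have completed.
TMEM_DEVICE void tmem_wait_st() {
#if defined(TMEM_WAIT_ENABLED) && defined(__CUDA_ARCH__)
  asm volatile("tcgen05.wait::st.sync.aligned;" ::: "memory");
#endif
}

// Block until the calling thread's earlier TMEM loads have completed.
TMEM_DEVICE void tmem_wait_ld() {
#if defined(TMEM_WAIT_ENABLED) && defined(__CUDA_ARCH__)
  asm volatile("tcgen05.wait::ld.sync.aligned;" ::: "memory");
#endif
}
```

Because both instructions carry the .sync.aligned qualifiers, every thread in the warp is expected to reach the same call, so the wrappers should not sit behind warp-divergent branches.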
2026-05-28 07:12:26 +00:00