D1.5: Add TODO for correction epilog - keeping working TMEM round-trip for now

2026-05-24 00:37:36 +00:00
parent 9f88db897f
commit e632490682

@@ -410,6 +410,7 @@ class FmhaKernel:
final_o_bar.arrive_and_wait()
# === NO-OP TMEM round-trip: re-map O from MMA layout to epilog layout ===
# TODO: Replace with correction epilog (D1.5) for zero-error one-way trip
tTMrO_noop = cute.make_rmem_tensor(
(tTMEM_LOADcO.shape, 128 // corr_tile_size), self.acc_dtype
)