biondizzle
d7a0fc2bc2
CRITICAL FIX: K GMEM slice (None,None,0,0) not (None,0,None,0)
K from QK MMA B-partition has GMEM iter at mode 1, NOT mode 2.
(None,0,None,0) hardcodes mode 1 to 0 → TMA always loads tile 0.
(None,None,0,0) keeps mode 1 free → correct multi-tile loading.
Evidence: diag n=256 went from cos 0.711 → 0.999999 (cosine similarity) with this one change.
2026-05-22 17:59:57 +00:00
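
The slice semantics in the commit message can be illustrated with a small NumPy analogy. This is only a sketch, not the kernel's code and not the CuTe API: the helper `slice_modes`, the array `k_gmem`, and the 4-mode shape `(elements, tile_iter, extra, batch)` are all assumptions made for illustration. In the slice coordinate, `None` keeps a mode free and an integer fixes that mode to one index.

```python
import numpy as np

def slice_modes(tensor, coord):
    """Emulate a CuTe-style slice: None keeps a mode free, an int fixes it."""
    index = tuple(slice(None) if c is None else c for c in coord)
    return tensor[index]

# Hypothetical K view with 4 modes: (elements, tile_iter, extra, batch).
# Column t of the 2-D array holds the data of tile t.
ELEMS, NUM_TILES = 4, 3
k_2d = np.arange(ELEMS * NUM_TILES).reshape(NUM_TILES, ELEMS).T  # shape (4, 3)
k_gmem = k_2d[:, :, np.newaxis, np.newaxis]                      # shape (4, 3, 1, 1)

# Buggy slice: mode 1 (the tile iterator) is pinned to 0,
# so only the data of tile 0 is reachable afterwards.
wrong = slice_modes(k_gmem, (None, 0, None, 0))

# Fixed slice: mode 1 stays free, so every tile can still be selected.
right = slice_modes(k_gmem, (None, None, 0, 0))

print(wrong.shape)   # (4, 1)
print(right.shape)   # (4, 3)
print(right[:, 2])   # [ 8  9 10 11]
```

With the buggy coordinate the surviving second mode has extent 1, so any loop over tiles can only ever see tile 0, which matches the "TMA always loads tile 0" symptom described above.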