biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 12:09:47 +00:00
c2e41a858e test: force 2 K-tiles for debug
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 12:07:51 +00:00
8b2200a6d3 test: HD=64 full 4 K-tile accumulate + full-HD scalar reference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 12:04:56 +00:00
afb18caf2d test: clean HD=64, 1 K-tile only, verify SMEM writes + compare vs scalar
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 12:01:29 +00:00
e587e26b06 test: log canonical indices we write Q to
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:59:10 +00:00
facd509c3c test: remove sanity check (zeroing loop overwrites), fix verify offsets
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:57:15 +00:00
20ae390d32 test: fix compile error
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:56:09 +00:00
7b16eceb91 test: more detailed SMEM sanity check
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:54:16 +00:00
eb0ca18e23 test: sanity check sQ[0] write+read
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:52:53 +00:00
8936a2dec7 test: clean SMEM write loops for HD=64
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:51:10 +00:00
2ffbfda47d test: print SMEM verify data
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:49:46 +00:00
4fd41365de test: add SMEM verify for HD=64 K-tile offsets
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:47:57 +00:00
4483539f01 test: HD=64 random data, 4 K-tiles, accumulate
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:46:14 +00:00
73bd21ce01 test: force 1 K-tile for HD=64 debug
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:44:33 +00:00
abe1870429 test: HD=64 all-ones, expected S[0,j]=64 (unscaled) or 8.0 scaled
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:42:32 +00:00
73f9ff98c9 test: UMMA QK HD=64 (4 K-tiles, accumulate) — multi-K-tile test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:41:20 +00:00
df34cae9c6 UMMA QK GEMM WORKING! Update docs — 4x was scale factor, not bug
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:39:17 +00:00
1874a70a6d test: fix var ref
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:38:32 +00:00
8426d13285 test: fix comparison — row 0 is S[0,c], rows 1-127 should be zero
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:36:47 +00:00
6f40fafa91 test: verify ALL 128 rows × 8 cols match scalar reference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-28 11:34:46 +00:00
3c7d9d9303 test: apply 1/sqrt(HD) scale to MMA output — 4x was the scale factor, not a bug!
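
The arithmetic behind the scale-factor commits above can be checked with a small scalar reference. This is a minimal sketch, not the repository's test code: the function names, the 8-column shape, and the K-tile width of 16 (inferred from "HD=64, 4 K-tiles") are assumptions made for illustration. It reproduces the all-ones expectation of S[0,j]=64 unscaled or 8.0 scaled, and shows that accumulating per-K-tile partial sums gives the same result as a full-HD dot product.

```python
import math

def qk_scores(q, k, scale=True):
    """Scalar reference for S = Q K^T, optionally scaled by 1/sqrt(HD)."""
    hd = len(q[0])
    factor = 1.0 / math.sqrt(hd) if scale else 1.0
    return [[factor * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]

def qk_scores_tiled(q, k, k_tile=16):
    """Same result, accumulated one K-tile at a time (k_tile=16 is an assumption)."""
    hd = len(q[0])
    acc = [[0.0] * len(k) for _ in q]
    for start in range(0, hd, k_tile):
        for i, qi in enumerate(q):
            for j, kj in enumerate(k):
                # Partial dot product over this K-tile, added to the accumulator.
                acc[i][j] += sum(
                    a * b
                    for a, b in zip(qi[start:start + k_tile], kj[start:start + k_tile])
                )
    factor = 1.0 / math.sqrt(hd)
    return [[factor * v for v in row] for row in acc]

HD = 64
q = [[1.0] * HD]                    # one query row of all ones
k = [[1.0] * HD for _ in range(8)]  # 8 key rows, giving 8 columns of S

unscaled = qk_scores(q, k, scale=False)
scaled = qk_scores(q, k, scale=True)
tiled = qk_scores_tiled(q, k)

print(unscaled[0][0], scaled[0][0], tiled[0][0])  # 64.0 8.0 8.0
```

A reference compared without the 1/sqrt(HD) factor differs from the scaled output by exactly sqrt(HD). A 4x gap would be consistent with HD=16, though the log does not state which head dimension that earlier test used.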