biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 21:23:24 +00:00
7285331395 fix: replace col_major_src with explicit source strides
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 21:20:01 +00:00
f6fd549800 fix: restore col_major_src handling for SFB source layout
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 20:51:31 +00:00
63e67e1025 fix: rewrite SF remap as forward mapping (source→dst)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 20:44:48 +00:00
30b6c89424 fix: correct SF remap coordinate extraction
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 20:17:27 +00:00
ff5a0843dc fix: divide K element index by SFVecSize to get k_sf
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 20:11:41 +00:00
a09b9b53a3 cleanup: remove printf and diag function from CUDA kernel (build fix)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 20:03:21 +00:00
e7c3341317 docs: update DEBUG_LOG with M/K swap root cause
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 20:01:48 +00:00
deb6b3231a debug: swap M/K in SF remap + add printf diagnostics
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 19:59:43 +00:00
22f0457ccf test: isolate SFA vs SFB remap bug
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 19:58:58 +00:00
9eaf6d07e8 test: quick random test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:56:11 +00:00
fa7b394571 docs: update DEBUG_LOG with root cause (size→cosize) and full debug timeline
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:52:24 +00:00
c3841983a0 fix: SF remap uses cute::cosize() instead of cute::size()
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:51:53 +00:00
67dcfa83f5 test: random data at small dims + alpha sweep
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:51:32 +00:00
60f7f60818 test: ultra-minimal GEMM with all-ones
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:51:12 +00:00
363dd893f0 test: dimension sweep to isolate GEMM bug
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:50:47 +00:00
fee5a97ebb fix: cosine_similarity dim for M>0
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:47:28 +00:00
f9330a1777 test: standalone M=1 GEMM test with deterministic data
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:35:03 +00:00
1b63a46168 docs: update DEBUG_LOG with cosine≈0 finding + new hypotheses
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:27:46 +00:00
773967452f debug: fix gs scalar conversion + add traceback
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 18:09:42 +00:00
df916b87eb debug: fix gs.item() for multi-element tensor