biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:30:50 +00:00
a63f452c86 Fix softmax loop: use self.n_kv_tiles not cute.size(gK, mode=[3])
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:29:54 +00:00
195b0506af auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:28:46 +00:00
88b66e9dca Add O rescale with pre-built paired atoms (corr_tile_size=16)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:27:43 +00:00
fae9f6fbb5 Reset to working_softmax_maybe.py + TMA fix only
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:25:20 +00:00
524b5b1840 Fix final normalize: use working 2D register tensor pattern from working_softmax_maybe.py
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:23:40 +00:00
85973743d6 Fix: add self.n_kv_tiles to __init__
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:22:15 +00:00
261f23e698 Add per-tile O rescale (O *= acc_scale) to softmax loop
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:17:10 +00:00
524f0bdfb4 Clean up: archive diagnostics and superseded tests
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:05:08 +00:00
8273a94506 auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:02:35 +00:00
3a6f946316 auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-23 00:00:48 +00:00
9b51586742 auto: pre-test commit
biondizzle pushed tag v-tma-multitile-fix to biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:51:34 +00:00
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:51:31 +00:00
89b86c4134 🚀🚀🚀 TMA MULTI-TILE FIX VERIFIED ON B200 🚀🚀🚀
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:35:56 +00:00
6db7fd339d FIX: (None,0,None,0) for ALL tma_partition outputs — verified shapes on B200
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:34:05 +00:00
a50cb138c8 auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:30:45 +00:00
e3054b5937 auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:29:16 +00:00
9a3cf248db auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:28:19 +00:00
8718e82258 auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:27:34 +00:00
d45b6173f5 auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-22 23:25:41 +00:00
4ce5926498 FIX: (None,0,None,0) pre-slice keeps KV tile axis (mode 2) free