biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 06:07:21 +00:00
311b28bd9f fixey wixey
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 06:01:54 +00:00
685bce48b4 actually handle expert param mapping
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 05:48:42 +00:00
f17efa340d are the weights ever not zero?
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 05:41:15 +00:00
c5d800f133 can we see the wt in?
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 05:31:17 +00:00
6a4f52cedc god dam i just want the gemm in
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 05:21:53 +00:00
3b3c506af5 whoops
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 05:08:58 +00:00
76e9b078a2 more debug2
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 04:53:29 +00:00
912e4622d7 more debug
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 04:35:49 +00:00
c7f6a1dc4d fix: transpose B and SFB on the Python side at weight-load time, and adjust the SFB remap kernel to read from column-major source layout
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 04:30:32 +00:00
c56cc34ae1 fix: LayoutBTag is now RowMajor
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 03:59:08 +00:00
9975558c23 Add always-on alpha/x_sf debug prints for L1 and L2 GEMM calls
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 03:52:03 +00:00
9c318c3353 force no cache
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 03:32:21 +00:00
ff6bb32684 Plumb global scale as GEMM alpha instead of folding into UE4M3
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 03:27:56 +00:00
d547da2948 stage_activation: add per-tensor global scale matching NVFP4 spec
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 03:01:08 +00:00
108ff07569 debug: remove one-shot gate from logit dump, log every forward
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 02:37:24 +00:00
3600a4b06a debug: add logit quality dump in compute_logits (ungated, once)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 02:17:17 +00:00
29f8b8c174 fix: load lm_head.weight in outer model before forwarding to inner
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 01:54:55 +00:00
46536e5ccf fix: hc param renames missing leading dot
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 01:29:01 +00:00
086f3fa5c5 fix: hc params dot→underscore + compressor position_bias→ape combined rule
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-15 01:16:20 +00:00
44d4b6c225 fix: add missing renames for Hadamard coding + compressor.ape