Files
nvfp4-megamoe-kernel/tests
biondizzle 2ef71dc21a fix: B tensor K-major strides, scale_b axis swap
Two fixes:
1. B tensor: permute(0,2,1).contiguous().permute(0,2,1) gives K-major
   stride (16384,1,128) matching reference
2. scale_b: transpose to (N, K_sf) before swizzling — reference uses
   (intermediate, hidden//16) not (hidden//16, intermediate)
2026-05-16 03:04:31 +00:00
..