nvfp4-megamoe-kernel/src
biondizzle 839835cba4 fix: correct SF remap coordinate extraction for flat_rank=8
m = f0 + f1*32 + f2*128  (CuTe 'first sub varies fastest')
k_sf = f4 + f5*4
f3 is the Step<2> stride (degenerate, always=total), NOT a coordinate.
The previous formula, (f3*2+f2)*128, was catastrophically wrong: it mapped
everything to m=0 or to a huge m.
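
The corrected extraction can be sketched in a few lines of Python as a host-side reference. This is a minimal illustration, not the kernel's code: the function name `sf_remap` and the extents used in the check (f0 < 32, f1 < 4) are assumptions inferred from the strides 32 and 128 in the formula above. f3 is accepted and ignored, since the commit describes it as a stride rather than a coordinate.

```python
def sf_remap(f0, f1, f2, f3, f4, f5):
    """Hypothetical reference for the corrected SF remap.

    f0, f1, f2 combine into the row index m, first sub-coordinate fastest.
    f3 is deliberately unused: per the commit it is a stride, not a coordinate.
    f4, f5 combine into the scale-factor column index k_sf.
    """
    m = f0 + f1 * 32 + f2 * 128
    k_sf = f4 + f5 * 4
    return m, k_sf


def covers_rows_once(num_f2_tiles):
    """Check that (f0, f1, f2) -> m hits every row exactly once.

    Assumes extents f0 in [0, 32) and f1 in [0, 4), which is what the
    strides 32 and 128 imply if the mapping is meant to be dense.
    """
    rows = sorted(
        sf_remap(f0, f1, f2, 0, 0, 0)[0]
        for f2 in range(num_f2_tiles)
        for f1 in range(4)
        for f0 in range(32)
    )
    return rows == list(range(128 * num_f2_tiles))
```

Comparing the kernel's remapped indices against a reference like this on a small tile is one way to catch a coordinate mix-up of the kind this commit fixes, because a wrong formula will fail the dense-coverage check.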
2026-05-14 16:40:48 +00:00