nvfp4-megamoe-kernel/dsv4
biondizzle 014d647ba3 fix: sink bias domain correction — add attn_sink/scale to raw logits
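The softmax multiplies raw logits by scale_log2 = scale * log2(e).
Adding sink_val directly to the raw logits means it gets scaled too,
contributing sink_val * scale * log2(e) instead of sink_val * log2(e).
Fix: add sink_val / scale instead, so that after scaling:
(sink_val / scale) * scale_log2 = sink_val * log2(e).
Since 2^(sink_val * log2(e)) = exp(sink_val), this multiplies the
unnormalized attention weights by exp(sink_val), as intended.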
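
The arithmetic can be checked numerically with a minimal sketch. It assumes the softmax numerator is computed as 2^(logit * scale_log2), which is what the log2(e) factor suggests; the function name and the values of scale, raw_logit and sink_val are hypothetical and not taken from the kernel.

```python
import math

LOG2_E = math.log2(math.e)


def unnormalized_weight(raw_logit, scale):
    """Base-2 softmax numerator: 2^(raw_logit * scale * log2(e)) == exp(raw_logit * scale)."""
    scale_log2 = scale * LOG2_E
    return 2.0 ** (raw_logit * scale_log2)


# Hypothetical values, chosen only for illustration.
scale = 0.125     # e.g. 1 / sqrt(head_dim) with head_dim = 64
raw_logit = 3.0   # a raw (unscaled) attention logit
sink_val = 0.7    # sink bias, expressed in the already-scaled domain

base = unnormalized_weight(raw_logit, scale)

# Before the fix: sink_val is added to the raw logit and is scaled with it.
buggy = unnormalized_weight(raw_logit + sink_val, scale)

# After the fix: sink_val / scale is added, so the scaling cancels out.
fixed = unnormalized_weight(raw_logit + sink_val / scale, scale)

buggy_factor = buggy / base   # equals exp(sink_val * scale)
fixed_factor = fixed / base   # equals exp(sink_val)

print(f"buggy factor: {buggy_factor:.6f}  (exp(sink_val * scale) = {math.exp(sink_val * scale):.6f})")
print(f"fixed factor: {fixed_factor:.6f}  (exp(sink_val)         = {math.exp(sink_val):.6f})")
```

With the pre-fix form the weight changes by exp(sink_val * scale), which depends on the softmax scale; with the fixed form the factor is exp(sink_val) regardless of scale.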
2026-05-26 15:03:49 +00:00