nvfp4-megamoe-kernel/tests/unit
biondizzle 9d64434954 D5c: add sink bias (attn_sink) logit modification to FMHA kernel
- Add n_comp parameter: compressed KV length, sink bias applies to positions >= n_comp
- Add sink_bias parameter: per-head FP32 logit bias for SWA positions
- D3 mask updated: kv_pos >= n_comp + swa_len (backward compatible when n_comp=0)
- D4 causal mask updated: compare SWA-relative position (kv_pos - n_comp) with m_coord
- Mathematical insight: sink merge = single softmax over [S_comp, S_swa + attn_sink]
- Add test_d5c_fused.py with combined KV + sink bias test
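
The "single softmax" remark above can be checked numerically. The sketch below is not the kernel and is not taken from test_d5c_fused.py. It is a pure-Python reference for one head and one query row, assuming that `sink_bias` is added to the logits of the SWA positions (those at index >= n_comp) and that the "merge" means combining two separately normalized softmaxes through their log-sum-exp values. The function names `fused_probs` and `merged_probs` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logsumexp(logits):
    """Numerically stable log(sum(exp(x))) over a non-empty list of floats."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits))


def fused_probs(s_comp, s_swa, sink_bias):
    """One softmax over the concatenation [S_comp, S_swa + sink_bias].

    s_comp: logits for the n_comp compressed KV positions.
    s_swa: logits for the SWA positions (kv_pos >= n_comp).
    sink_bias: scalar bias for this head (assumed additive on SWA logits).
    """
    return softmax(list(s_comp) + [s + sink_bias for s in s_swa])


def merged_probs(s_comp, s_swa, sink_bias):
    """Two separate softmaxes, rescaled by their share of the total mass.

    Assumes both segments are non-empty (so n_comp > 0 here).
    """
    biased = [s + sink_bias for s in s_swa]
    lse_comp = logsumexp(s_comp)
    lse_swa = logsumexp(biased)
    lse_all = logsumexp([lse_comp, lse_swa])
    w_comp = math.exp(lse_comp - lse_all)  # weight of the compressed segment
    w_swa = math.exp(lse_swa - lse_all)    # weight of the SWA segment
    return ([p * w_comp for p in softmax(s_comp)]
            + [p * w_swa for p in softmax(biased)])
```

Under these assumptions the two functions agree to floating-point tolerance, which is why the bias can be folded into the logits before a single softmax pass instead of being handled by a separate merge step. With n_comp=0 only `fused_probs` applies, and it reduces to a plain softmax over the biased SWA logits.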
2026-05-26 14:59:52 +00:00