biondizzle
9d64434954
D5c: add sink bias (attn_sink) logit modification to FMHA kernel
- Add n_comp parameter: compressed KV length, sink bias applies to positions >= n_comp
- Add sink_bias parameter: per-head FP32 logit bias for SWA positions
- D3 mask updated: kv_pos >= n_comp + swa_len (backward compatible when n_comp=0)
- D4 causal mask updated: compare SWA-relative position (kv_pos - n_comp) with m_coord
- Mathematical insight: sink merge = single softmax over [S_comp, S_swa + attn_sink]
- Add test_d5c_fused.py with combined KV + sink bias test
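
The "single softmax" claim above can be checked numerically. The following is a minimal sketch, not the kernel code or the contents of test_d5c_fused.py. It assumes that "sink merge" means combining two separately normalized attention results (compressed KV and SWA with the bias added) using their log-sum-exp weights. All function names and sizes here are hypothetical, and masking (D3/D4) is left out.

```python
import math
import random


def attend(logits, values):
    """Softmax-weighted sum of values for one query row.

    Returns (output vector, log-sum-exp of the logits).
    """
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]
    denom = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / denom
           for d in range(dim)]
    return out, m + math.log(denom)


def fused_attention(s_comp, s_swa, v_comp, v_swa, sink_bias):
    """One softmax over [S_comp, S_swa + sink_bias] (assumed fused form)."""
    logits = list(s_comp) + [x + sink_bias for x in s_swa]
    out, _ = attend(logits, list(v_comp) + list(v_swa))
    return out


def merged_attention(s_comp, s_swa, v_comp, v_swa, sink_bias):
    """Two partial softmaxes combined by log-sum-exp weights (assumed merge)."""
    o_c, lse_c = attend(s_comp, v_comp)
    o_s, lse_s = attend([x + sink_bias for x in s_swa], v_swa)
    m = max(lse_c, lse_s)
    w_c, w_s = math.exp(lse_c - m), math.exp(lse_s - m)
    return [(w_c * a + w_s * b) / (w_c + w_s) for a, b in zip(o_c, o_s)]


# Hypothetical sizes: positions [0, n_comp) are compressed KV,
# positions [n_comp, n_comp + swa_len) are the sliding window.
rng = random.Random(0)
n_comp, swa_len, head_dim = 4, 6, 8
sink_bias = 0.75  # one FP32 bias per head; a single head is shown

s_comp = [rng.gauss(0, 1) for _ in range(n_comp)]
s_swa = [rng.gauss(0, 1) for _ in range(swa_len)]
v_comp = [[rng.gauss(0, 1) for _ in range(head_dim)] for _ in range(n_comp)]
v_swa = [[rng.gauss(0, 1) for _ in range(head_dim)] for _ in range(swa_len)]

fused = fused_attention(s_comp, s_swa, v_comp, v_swa, sink_bias)
merged = merged_attention(s_comp, s_swa, v_comp, v_swa, sink_bias)
max_abs_diff = max(abs(a - b) for a, b in zip(fused, merged))
```

Under these assumptions the two results agree to floating-point rounding, which is why adding the bias to the SWA logits inside one kernel pass can stand in for a separate merge step.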
2026-05-26 14:59:52 +00:00