biondizzle
9d64434954
D5c: add sink bias (attn_sink) logit modification to FMHA kernel
- Add n_comp parameter: compressed KV length, sink bias applies to positions >= n_comp
- Add sink_bias parameter: per-head FP32 logit bias for SWA positions
- D3 mask updated: kv_pos >= n_comp + swa_len (backward compatible when n_comp=0)
- D4 causal mask updated: compare SWA-relative position (kv_pos - n_comp) with m_coord
- Mathematical insight: sink merge = single softmax over [S_comp, S_swa + attn_sink]
- Add test_d5c_fused.py with combined KV + sink bias test
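
The bullets above can be read as one masked softmax over the concatenated compressed and SWA logits. The following NumPy sketch is a single-head reference for that reading, not the kernel itself. The function names, the tensor layout (compressed KV first, then SWA KV), and the interpretation of the D3 and D4 conditions as "mask out" predicates are assumptions for illustration; only `n_comp`, `swa_len`, `sink_bias`, `kv_pos`, and `m_coord` come from the commit message.

```python
import numpy as np

def fused_scores(q, k, n_comp, swa_len, sink_bias, causal=True):
    """Biased, masked logits for one head. q: (M, D), k: (N, D).

    Assumed layout: kv_pos in [0, n_comp) is compressed KV,
    kv_pos in [n_comp, n_comp + swa_len) is the sliding window.
    """
    m, d = q.shape
    n = k.shape[0]
    s = (q @ k.T) / np.sqrt(d)                      # (M, N) raw logits
    kv_pos = np.arange(n)[None, :]                  # (1, N)
    m_coord = np.arange(m)[:, None]                 # (M, 1)
    is_swa = kv_pos >= n_comp
    s = s + np.where(is_swa, sink_bias, 0.0)        # bias only SWA positions
    masked = kv_pos >= n_comp + swa_len             # assumed D3 bound
    if causal:
        # assumed D4 rule: compare the SWA-relative position with the query row
        masked = masked | (is_swa & ((kv_pos - n_comp) > m_coord))
    return np.where(masked, -np.inf, s)

def softmax_attention(s, v):
    """Single softmax over all KV positions (the fused path)."""
    mx = s.max(axis=1, keepdims=True)
    p = np.exp(s - mx)
    return (p @ v) / p.sum(axis=1, keepdims=True)

def merged_attention(s, v, n_comp):
    """Two partial softmaxes (compressed, SWA) merged by log-sum-exp weights.

    Requires n_comp > 0 and at least one unmasked entry per row in each part.
    """
    def partial(s_part, v_part):
        mx = s_part.max(axis=1, keepdims=True)
        p = np.exp(s_part - mx)
        l = p.sum(axis=1, keepdims=True)
        return (p @ v_part) / l, mx + np.log(l)     # output, log-sum-exp

    o_c, lse_c = partial(s[:, :n_comp], v[:n_comp])
    o_s, lse_s = partial(s[:, n_comp:], v[n_comp:])
    w_c = 1.0 / (1.0 + np.exp(lse_s - lse_c))       # weight of compressed part
    return w_c * o_c + (1.0 - w_c) * o_s

rng = np.random.default_rng(0)
M, D, n_comp, swa_len = 4, 8, 3, 5
N = n_comp + swa_len + 2                            # 2 trailing padded positions
q = rng.standard_normal((M, D))
k = rng.standard_normal((N, D))
v = rng.standard_normal((N, D))

s = fused_scores(q, k, n_comp, swa_len, sink_bias=0.7)
fused = softmax_attention(s, v)
merged = merged_attention(s, v, n_comp)
```

In this sketch `fused` and `merged` agree to floating-point tolerance, which is the identity the "single softmax" bullet relies on. With `n_comp=0` every position is an SWA position, so the bias shifts all logits equally and cancels in the softmax, consistent with the backward-compatibility note.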
2026-05-26 14:59:52 +00:00