- exp(LSE) != row_sum (it's row_sum * exp(max(S*scale)))
- Normalize using reference attn_sum (same as other tests)
- D5 merge uses normalized O + LSE: O = sum(exp(lse)*O_norm)/sum(exp(lse))
- Added 4-tile KV merge test (s_k=512)
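
For reference, a minimal pure-Python sketch of the merge identity above, not the actual test code. The helper names (`attend`, `merge_tiles`), the scalar per-key values, and the 8-key / 4-tile sizes are illustrative assumptions, and the `lse_max` shift is added here only for numerical stability; it cancels in the ratio.

```python
import math

def attend(scores, values):
    """Softmax attention for one query row over one KV tile.

    `scores` are assumed to be already scaled (S*scale).
    Returns (normalized output, LSE).
    """
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    row_sum = sum(weights)  # max-subtracted row sum
    out = sum(w * v for w, v in zip(weights, values)) / row_sum
    lse = m + math.log(row_sum)  # so exp(lse) == row_sum * exp(m), not row_sum
    return out, lse

def merge_tiles(tiles):
    """Merge per-tile (O_norm, lse): O = sum(exp(lse)*O_norm) / sum(exp(lse))."""
    lse_max = max(lse for _, lse in tiles)  # assumption: shift for stability
    w = [math.exp(lse - lse_max) for _, lse in tiles]
    return sum(wi * o for wi, (o, _) in zip(w, tiles)) / sum(w)

# Hypothetical data: 8 keys split into 4 KV tiles of 2 keys each.
scores = [0.1 * i - 3.0 for i in range(8)]
values = [float(i) for i in range(8)]

full, full_lse = attend(scores, values)
tiles = [attend(scores[i:i + 2], values[i:i + 2]) for i in range(0, 8, 2)]
merged = merge_tiles(tiles)
```

With these inputs `merged` should match `full` up to floating-point rounding, which is the property the 4-tile merge test checks.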