- Run FMHA twice (compressed KV + SWA KV, normalize=False)
- Merge with sink weights in Python
- Verify end-to-end correctness vs FP32 reference
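
The merge step can be sketched in NumPy as follows. This is a minimal single-head illustration, not the actual kernel interface. It assumes that each unnormalized FMHA pass returns the unnormalized output together with the per-row maximum and per-row sum of exponentials, and that the sink is a single logit that joins the softmax denominator without contributing a value vector. The names `partial_attention`, `merge_with_sink`, and `reference` are hypothetical; `partial_attention` only stands in for the real FMHA call.

```python
import numpy as np

def partial_attention(q, k, v):
    """Stand-in for one FMHA pass with normalization disabled (assumed interface).

    Returns the unnormalized output, the per-row max, and the per-row sum of
    exponentials, all computed relative to that row max.
    """
    scores = q @ k.T / (q.shape[-1] ** 0.5)        # (Lq, Lk)
    m = scores.max(axis=-1, keepdims=True)         # (Lq, 1)
    p = np.exp(scores - m)
    return p @ v, m, p.sum(axis=-1, keepdims=True)

def merge_with_sink(o1, m1, l1, o2, m2, l2, sink):
    """Combine two unnormalized partials; the sink only enlarges the denominator."""
    m = np.maximum(np.maximum(m1, m2), sink)       # shared max for stability
    a1, a2 = np.exp(m1 - m), np.exp(m2 - m)        # rescale each partial to it
    denom = l1 * a1 + l2 * a2 + np.exp(sink - m)
    return (o1 * a1 + o2 * a2) / denom

def reference(q, k_all, v_all, sink):
    """Reference: one softmax over all keys plus the sink logit."""
    scores = q @ k_all.T / (q.shape[-1] ** 0.5)
    sink_col = np.full((q.shape[0], 1), sink, dtype=scores.dtype)
    logits = np.concatenate([scores, sink_col], axis=-1)
    logits = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(logits)
    p = p / p.sum(axis=-1, keepdims=True)
    return p[:, :-1] @ v_all                       # drop the sink column

# Small FP32 example: 4 queries, 5 compressed keys, 3 sliding-window keys.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8), dtype=np.float32)
k_c, v_c = (rng.standard_normal((5, 8), dtype=np.float32) for _ in range(2))
k_s, v_s = (rng.standard_normal((3, 8), dtype=np.float32) for _ in range(2))
sink = 0.5  # illustrative sink logit

merged = merge_with_sink(*partial_attention(q, k_c, v_c),
                         *partial_attention(q, k_s, v_s), sink)
expected = reference(q, np.concatenate([k_c, k_s]), np.concatenate([v_c, v_s]), sink)
max_abs_err = float(np.abs(merged - expected).max())
```

Rescaling both partials to a shared row maximum before summing keeps the exponentials bounded, which is the same trick used when combining split-KV attention results. A real end-to-end check would swap `partial_attention` for the FMHA calls and compare against the reference with a tolerance suited to the kernel's input precision.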