[Bugfix] Zero-init MLA attention output buffers to prevent NaN from CUDA graph padding (#37442)
Released 2026-03-19 01:44:16 +00:00

Signed-off-by: Elvir Crncevic <elvircrn@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
(cherry picked from commit ef2c4f778d)