Commit Graph

10 Commits

Author SHA1 Message Date
Andreas Karatzas
807d680337 [ROCm][CI] Fix tool use test stability - disable skinny GEMM, prefix caching, eliminate batch variance (#35553)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-06 15:15:12 +08:00
Rohan Potdar
c5362c739f Reenable features for ROCm attention backends (#36185)
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
2026-03-05 20:21:06 -08:00
Sage Moore
8c760b6ab6 [ROCm] Refactor ROCm attention backend selection logic (#35246)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2026-03-05 10:51:26 -06:00
Lucas Wilkinson
8b5014d3dd [Attention] FA4 integration (#32974)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2026-03-01 23:44:57 +00:00
Micah Williamson
0edf101d2b [ROCm] Add stablelm Head Size 80 To Supported Head Sizes For ROCM_ATTN (#35527)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2026-02-28 12:16:34 +08:00
Gregory Shtrasberg
9fa6c68fa6 [ROCm] Enabling encoder and encoder-decoder on ROCm and AITER unified backends (#35334)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2026-02-27 21:32:55 +00:00
Matthew Bonanni
f2c47886fd [Attention] Add FlashInfer Sparse MLA backend (#33451)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
2026-02-12 17:21:54 +00:00
Michael Goin
5e75a14a66 [Doc] Add DCP support to attention backend doc (#33936) 2026-02-09 18:33:43 -05:00
jennyyyyzhen
527bcd14d4 [ROCM] Enable aiter attn backend for qwen3-next model (#32492)
Signed-off-by: jennyyyyzhen <yzhen@hmc.edu>
2026-01-31 17:03:57 +08:00
Matthew Bonanni
77c4f45c6c [7/N][Attention][Docs] Add documentation for attention backends (#32477)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2026-01-28 17:20:22 -05:00