Lucas Wilkinson
|
8b5014d3dd
|
[Attention] FA4 integration (#32974)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2026-03-01 23:44:57 +00:00 |
|
Micah Williamson
|
0edf101d2b
|
[ROCm] Add stablelm Head Size 80 To Supported Head Sizes For ROCM_ATTN (#35527)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-02-28 12:16:34 +08:00 |
|
Gregory Shtrasberg
|
9fa6c68fa6
|
[ROCm] Enabling encoder and encoder-decoder on ROCm and AITER unified backends (#35334)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2026-02-27 21:32:55 +00:00 |
|
Matthew Bonanni
|
f2c47886fd
|
[Attention] Add FlashInfer Sparse MLA backend (#33451)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
|
2026-02-12 17:21:54 +00:00 |
|
Michael Goin
|
5e75a14a66
|
[Doc] Add DCP support to attention backend doc (#33936)
|
2026-02-09 18:33:43 -05:00 |
|
jennyyyyzhen
|
527bcd14d4
|
[ROCM] Enable aiter attn backend for qwen3-next model (#32492)
Signed-off-by: jennyyyyzhen <yzhen@hmc.edu>
|
2026-01-31 17:03:57 +08:00 |
|
Matthew Bonanni
|
77c4f45c6c
|
[7/N][Attention][Docs] Add documentation for attention backends (#32477)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-01-28 17:20:22 -05:00 |
|