Wei Zhao
|
a3a51d20e7
|
[Benchmark] Improvements to attention benchmark script (#37115)
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
|
2026-03-16 22:22:40 +00:00 |
|
Matthew Bonanni
|
f444c05c32
|
[Attention] Use FA4 for MLA prefill (#34732)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-03-12 12:10:17 -04:00 |
|
Matthew Bonanni
|
f2c47886fd
|
[Attention] Add FlashInfer Sparse MLA backend (#33451)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
|
2026-02-12 17:21:54 +00:00 |
|
Matthew Bonanni
|
e82fa448c4
|
Add attention benchmarking tools (#26835)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-01-28 00:09:20 +00:00 |
|