Yong Hoon Shin
|
71470bc4af
|
[Misc] Add unit tests for chunked local attention (#21692)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
|
2025-07-31 11:39:16 -07:00 |
|
Maximilien de Bayser
|
1cd6eaba54
|
Support encoder-only models without KV-Cache (#21270)
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
|
2025-07-26 21:09:52 +08:00 |
|
Lucas Wilkinson
|
61b8cea3b4
|
[Attention] Optimize FlashInfer MetadataBuilder Build call (#21137)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-07-24 03:21:46 -07:00 |
|
Lucas Wilkinson
|
76b494444f
|
[Attention] Refactor attention metadata builder interface (#20466)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-07-17 04:44:25 +00:00 |
|