Commit Graph

4 Commits

Author SHA1 Message Date
Gassan Salama
1363e3d6d5 [cpu][performance] CPU Paged Attention NEON BFMMLA BF16 Implementation (#32263)
Signed-off-by: Gassan <gassan.salama@arm.com>
2026-02-06 15:01:48 +08:00
R3hankhan
4dffc5e044 [CPU] Split attention dispatch by head_dim alignment (#32161)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2026-02-03 19:37:15 -08:00
R3hankhan
8e27663b6a [CPU] Add head sizes 80 and 112 with vec16 fallback (#31968)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2026-01-09 22:14:46 +08:00
Fadi Arafeh
730bd35378 [perf][cpu] Accelerate paged attention GEMMs (QK, PV) on Arm CPUs with NEON (#29193)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
2025-11-22 09:04:36 -08:00