Commit Graph

5 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Woosuk Kwon | c9d5b6d4a8 | Replace FlashAttention with xformers (#70) | 2023-05-05 02:01:08 -07:00 |
| Siyuan (Ryans) Zhuang | e3cec88aa5 | Memcpy kernel for flash attention (#29); body: optimize, add benchmark, add assert, add test | 2023-04-10 18:22:49 -07:00 |
| Woosuk Kwon | 0f40557af6 | Implement block copy kernel to optimize beam search (#32) | 2023-04-07 17:45:07 -07:00 |
| Woosuk Kwon | 897cb2ae28 | Optimize data movement (#20) | 2023-04-02 00:30:17 -07:00 |
| Woosuk Kwon | 0deacbce6e | Implement single_query_cached_kv_attention kernel (#3) | 2023-03-01 15:02:19 -08:00 |