Commit Graph

6 Commits

Author SHA1 Message Date
Woosuk Kwon
88c0268a18 Implement custom kernel for LLaMA rotary embedding (#14) 2023-03-30 11:04:21 -07:00
Woosuk Kwon
80a2f812f1 Implement LLaMA (#9) (Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>) 2023-03-30 12:25:32 +08:00
Zhuohan Li
721fa3df15 FastAPI-based working frontend (#10) 2023-03-29 14:48:56 +08:00
Zhuohan Li
2f49f15585 Support tensor parallel (#2) 2023-03-21 13:45:42 -07:00
Woosuk Kwon
cfae35b861 Add miscellaneous updates (#8) 2023-03-13 13:48:38 -07:00
Woosuk Kwon
e9d3f2ff77 Add memory analyzer & automatically configure KV cache size (#6) 2023-03-11 23:23:14 -08:00