Implement AWQ quantization support for LLaMA (#1032)

Co-authored-by: Robert Irvine <robert@seamlessml.com>
Co-authored-by: root <rirv938@gmail.com>
Co-authored-by: Casper <casperbh.96@gmail.com>
Co-authored-by: julian-q <julianhquevedo@gmail.com>
Woosuk Kwon
2023-09-16 00:03:37 -07:00
committed by GitHub
parent b9fe4616f9
commit e3e79e9e8a
19 changed files with 1178 additions and 208 deletions

.gitignore (4 lines changed)

@@ -173,3 +173,7 @@ cython_debug/
# Sphinx documentation
_build/
# vim swap files
*.swo
*.swp