Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)

2024-02-29 13:52:23 +08:00
parent 929b4f2973
commit 01a5d18a53
8 changed files with 1663 additions and 156 deletions
--- a/csrc/quantization/gptq/q_gemm.cu
+++ b/csrc/quantization/gptq/q_gemm.cu