Integrate Marlin Kernels for Int4 GPTQ inference (#2497)
Co-authored-by: Robert Shaw <114415538+rib-2@users.noreply.github.com>
Co-authored-by: alexm <alexm@neuralmagic.com>
1145
csrc/quantization/marlin/marlin_cuda_kernel.cu
Normal file
File diff suppressed because it is too large