vllm/csrc/quantization at 10fa9eea21ae757d17c1369afa6172598db3be92 - vllm

Files

Alexander Matveev 6979ade384 Add GPTQ Marlin 2:4 sparse structured support (#4790)

Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>

2024-05-16 12:56:15 -04:00

AQLM CUDA support (#3287)

2024-04-23 13:59:33 -04:00
