vllm/csrc/quantization/gguf at a95354a36ee65523a499b3eb42f70a4a0ea4322d

biondizzle/vllm

sasha0552 781e3b9a42 [Bugfix][Kernel] Fix build for sm_60 in GGUF kernel (#8506)

2024-09-16 12:15:57 -06:00

dequantize.cuh

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

ggml-common.h

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

gguf_kernel.cu

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

mmq.cuh

[Core] Support loading GGUF model (#5191)

2024-08-05 17:54:23 -06:00

mmvq.cuh

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

vecdotq.cuh

[Bugfix][Kernel] Fix build for sm_60 in GGUF kernel (#8506)

2024-09-16 12:15:57 -06:00