vllm/csrc/quantization/gguf at 661a34fd4fdd700a29b2db758e23e4e243e7ff18

biondizzle/vllm

sasha0552 781e3b9a42 [Bugfix][Kernel] Fix build for sm_60 in GGUF kernel (#8506)

2024-09-16 12:15:57 -06:00

dequantize.cuh

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

ggml-common.h

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

gguf_kernel.cu

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

mmq.cuh

[Core] Support loading GGUF model (#5191)

2024-08-05 17:54:23 -06:00

mmvq.cuh

[Bugfix][Kernel] Add IQ1_M quantization implementation to GGUF kernel (#8357)

2024-09-15 16:51:44 -06:00

vecdotq.cuh

[Bugfix][Kernel] Fix build for sm_60 in GGUF kernel (#8506)

2024-09-16 12:15:57 -06:00