vllm/csrc/quantization/gguf at 2f2c1d73a745d8a38d1a21a5865a7d53d8d616b7

biondizzle/vllm

Gregory Shtrasberg 90eeea8f85 [Bugfix][ROCm] Fix for warp_size uses on host (#21205)

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>

2025-07-24 00:37:19 -07:00

dequantize.cuh

…

ggml-common.h

…

gguf_kernel.cu

[Bugfix][ROCm] Fix for warp_size uses on host (#21205)

2025-07-24 00:37:19 -07:00

mmq.cuh

…

mmvq.cuh

[Kernel] GGUF MMVQ kernel for multiple input vectors (#18754)

2025-06-16 17:33:26 +08:00

moe_vec.cuh

[Kernel] GGUF MoeVec kernel (#16780)

2025-05-06 23:07:23 -07:00

moe.cuh

…

vecdotq.cuh

…