vllm/csrc/quantization at cb0b4432746f8a7aaeeba9a8068e8fd4baf84a0a

biondizzle/vllm

mikaylagawarecki 7c080dd3c5 [4/n] Migrate FP4/W4A8 CUTLASS kernels to torch stable ABI (#37503)

Signed-off-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>

2026-03-31 10:21:13 -07:00

[Kernel] Fix awq error when n is not divisable by 128 (#13227)

2025-02-13 20:07:05 -08:00

[2/n] Migrate per_token_group_quant to torch stable ABI (#36058)

2026-03-25 10:15:13 -07:00

[Bugfix][ROCm] Fix for warp_size uses on host (#21205)

2025-07-24 00:37:19 -07:00

[ROCm][GPTQ][Bugfix] Fix GPTQ GEMM kernel output zeroing race condition (#30719)

2025-12-29 01:13:14 -08:00

[Refactor] Rename gptq_marlin to marlin to match MoE (#32952)

2026-01-23 16:48:12 -05:00

hadamard/hadacore

Use narrow over indexing in hadacore_transform to prep for ABI stable (#28756)

2025-11-15 01:10:15 -08:00

[NVIDIA] Bugfix NVFP4 DGX Spark and RTX50 (#38423)

2026-03-30 09:36:18 -07:00

[Bugfix]fix output Nan/Inf in marlin if dtype=float16 (#33972)

2026-03-27 16:36:08 -07:00

[3/n] Migrate cutlass/scaled_mm_entry.cu torch stable ABI (#37221)

2026-03-30 11:20:13 -07:00

activation_kernels.cu

[Bugfix]fix output Nan/Inf in marlin if dtype=float16 (#33972)

2026-03-27 16:36:08 -07:00

utils.cuh

[Feature][ROCm]Enable fusion pass for torch.compile on ROCm (#15050)

2025-03-31 04:42:18 -07:00