vllm/csrc/quantization at 16a65e41736c5e8d27e9e843668f6e3f99d68d9a - vllm

biondizzle/vllm

Monishver c09ad767cd Feature/silu block quant fusion v1 (#32996)

Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>

2026-04-01 18:50:43 +00:00

[Kernel] Fix awq error when n is not divisable by 128 (#13227)

2025-02-13 20:07:05 -08:00

Feature/silu block quant fusion v1 (#32996)

2026-04-01 18:50:43 +00:00

[Bugfix][ROCm] Fix for warp_size uses on host (#21205)

2025-07-24 00:37:19 -07:00

[ROCm][GPTQ][Bugfix] Fix GPTQ GEMM kernel output zeroing race condition (#30719)

2025-12-29 01:13:14 -08:00

[Refactor] Rename gptq_marlin to marlin to match MoE (#32952)

2026-01-23 16:48:12 -05:00

hadamard/hadacore

Use narrow over indexing in hadacore_transform to prep for ABI stable (#28756)

2025-11-15 01:10:15 -08:00

[NVIDIA] Bugfix NVFP4 DGX Spark and RTX50 (#38423)

2026-03-30 09:36:18 -07:00

[Kernel] Add MXFP8 to Marlin GEMM/MoE and refactor Mxfp8LinearOp (#34664)

2026-04-01 09:41:42 -07:00

[3/n] Migrate cutlass/scaled_mm_entry.cu torch stable ABI (#37221)

2026-03-30 11:20:13 -07:00

activation_kernels.cu

[Bugfix]fix output Nan/Inf in marlin if dtype=float16 (#33972)

2026-03-27 16:36:08 -07:00

utils.cuh

[Feature][ROCm]Enable fusion pass for torch.compile on ROCm (#15050)

2025-03-31 04:42:18 -07:00