vllm/csrc/quantization at da543d1abe2468a1b79f230e91e8bbdc2bf6ee71 - vllm

Files

Roberto L. Castro a201ad72d8 [Refactor][Kernel] Add global helper to deduplicate vectorized memory ops (#35105)

Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: LopezCastroRoberto <roberto.lopez.castro@udc.es>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>

2026-02-27 16:28:17 -08:00

awq

[Kernel] Fix awq error when n is not divisable by 128 (#13227)

2025-02-13 20:07:05 -08:00

cutlass_w4a8

[Kernel]Support W4A8 Grouped GEMM on Hopper (#29691)

2025-12-08 19:29:06 -08:00

fp4

[Refactor][Kernel] Add global helper to deduplicate vectorized memory ops (#35105)

2026-02-27 16:28:17 -08:00

fused_kernels

[Bugfix] Fix quant RMS norm fusion for quantization with TMA-aligned scales (#33255)

2026-02-17 23:35:04 -08:00

gguf

[Bugfix][ROCm] Fix for warp_size uses on host (#21205)

2025-07-24 00:37:19 -07:00

gptq

[ROCm][GPTQ][Bugfix] Fix GPTQ GEMM kernel output zeroing race condition (#30719)

2025-12-29 01:13:14 -08:00

gptq_allspark

[Refactor] Rename gptq_marlin to marlin to match MoE (#32952)

2026-01-23 16:48:12 -05:00

hadamard/hadacore

Use narrow over indexing in hadacore_transform to prep for ABI stable (#28756)

2025-11-15 01:10:15 -08:00

machete

Add flake8-implicit-str-concat rules to Ruff (#33191)

2026-01-28 04:56:10 +00:00

marlin

[Quantization][Deprecation] Remove Marlin 24 (#32688)

2026-01-28 15:54:59 +00:00

w8a8

[WideEP] Remove pplx all2all backend (#33724)

2026-02-26 14:30:10 -08:00

activation_kernels.cu

[Performance][B200] silu_mul_quant: pack scales in int32 (#28358)

2025-11-13 10:16:55 -08:00

utils.cuh

[Feature][ROCm]Enable fusion pass for torch.compile on ROCm (#15050)

2025-03-31 04:42:18 -07:00

vectorization_utils.cuh

Make sure that vectorize_with_alignment produced vectorized global loads (#23182)

2025-08-21 20:06:54 +00:00

vectorization.cuh

[Perf] Tune scaled_fp8_quant by increasing vectorization (#18844)

2025-06-03 13:48:25 -07:00