vllm/csrc/quantization at 75e01a39a16c1f39d5e2cf37edb6525df1a76c9f - vllm

biondizzle/vllm

Gregory Shtrasberg 56c976c1b5 [ROCm] Enable fused_silu_mul_block_quant on ROCm (#38817)

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>

2026-04-08 11:23:32 -05:00

[Kernel] Fix awq error when n is not divisable by 128 (#13227)

2025-02-13 20:07:05 -08:00

[ROCm] Enable fused_silu_mul_block_quant on ROCm (#38817)

2026-04-08 11:23:32 -05:00

[Bugfix][ROCm] Fix for warp_size uses on host (#21205)

2025-07-24 00:37:19 -07:00

[ROCm][GPTQ][Bugfix] Fix GPTQ GEMM kernel output zeroing race condition (#30719)

2025-12-29 01:13:14 -08:00

[Refactor] Rename gptq_marlin to marlin to match MoE (#32952)

2026-01-23 16:48:12 -05:00

hadamard/hadacore

Use narrow over indexing in hadacore_transform to prep for ABI stable (#28756)

2025-11-15 01:10:15 -08:00

[NVIDIA] Bugfix NVFP4 DGX Spark and RTX50 (#38423)

2026-03-30 09:36:18 -07:00

[Kernel] Add MXFP8 to Marlin GEMM/MoE and refactor Mxfp8LinearOp (#34664)

2026-04-01 09:41:42 -07:00

[ROCm] Enable fused_silu_mul_block_quant on ROCm (#38817)

2026-04-08 11:23:32 -05:00

activation_kernels.cu

[Bugfix]fix output Nan/Inf in marlin if dtype=float16 (#33972)

2026-03-27 16:36:08 -07:00

utils.cuh

[Feature][ROCm]Enable fusion pass for torch.compile on ROCm (#15050)

2025-03-31 04:42:18 -07:00