vllm/csrc/quantization/fused_kernels at f8b19c0ffd65f7f6f01a0da4a39b6890f5db40cb - vllm

Files

Luka Govedič bd7157a071 [torch.compile] Enable attention and allreduce fusion without custom ops enabled (#24604)

Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

2025-10-17 08:10:23 -06:00

fused_layernorm_dynamic_per_token_quant.cu  2025-10-17 08:10:23 -06:00
layernorm_utils.cuh                         2025-09-17 09:15:42 -04:00
quant_conversions.cuh                       2025-10-08 10:20:48 -04:00