vllm/csrc/quantization/w8a8 at 60d17251c920ae3c9d02e4b4101b738e4905aee4 - vllm

Files

Wentao Ye 541a2ef892 [Perf] Deepgemm fused layout kernel for activations, 4.3% throughput improvement, 10.7% TTFT improvement. (#29546)

Signed-off-by: yewentao256 <zhyanwentao@126.com>

2025-12-07 20:31:14 +08:00

per_token_group_quant_8bit.h

2025-10-08 10:20:48 -04:00