vllm/csrc/quantization/w8a8 at d4d2751732c3ccae162a5a0160c7d4fe05d2779a - vllm

Files

Wentao Ye 1e6b115300 [Refactor] Reduce duplicate code in per_token_group_quant cuda kernels (#30496)

Signed-off-by: yewentao256 <zhyanwentao@126.com>

2025-12-12 16:45:23 -05:00

per_token_group_quant_8bit.h

2025-10-08 10:20:48 -04:00