vllm/csrc/quantization/fp8 at 4979eb79da80a6dd7d4e52103053bf00b80c65cb - vllm

Files

elvischenv dbeee3844c [Perf] Use NVIDIA hardware-accelerated instruction for float to fp8_e4m3 quantization (#24757)

Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>

2025-09-13 00:16:24 -07:00

2025-06-15 20:05:28 -07:00

2025-09-13 00:16:24 -07:00

common.cu                  2025-08-05 02:36:43 -07:00
common.cuh                 2025-09-13 00:16:24 -07:00
per_token_group_quant.cu   2025-07-29 21:50:46 -06:00