vllm/csrc/quantization/fp8/nvidia at 239ef0c1ac0dfe68d8d2e28c54ecf9aa9bcd945b

biondizzle/vllm

elvischenv dbeee3844c [Perf] Use NVIDIA hardware-accelerated instruction for float to fp8_e4m3 quantization (#24757)

Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>

2025-09-13 00:16:24 -07:00

quant_utils.cuh

[Perf] Use NVIDIA hardware-accelerated instruction for float to fp8_e4m3 quantization (#24757)

2025-09-13 00:16:24 -07:00