vllm/benchmarks/benchmark_throughput.py at ef9b636e2d427f588bf11242e312ba8954d9aff0

Woosuk Kwon 37ca558103 Optimize model execution with CUDA graph (#1926)

Co-authored-by: Chen Shen <scv119@gmail.com>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>

2023-12-16 21:12:08 -08:00
