vllm/vllm/engine at 4efbac6d3593ed35fd5b6ccb3958bd96b2c9b4da

biondizzle/vllm

Woosuk Kwon a463c333dd Use CuPy for CUDA graphs (#2811)

2024-02-13 11:32:06 -08:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

arg_utils.py

Remove hardcoded device="cuda" to support more devices (#2503)

2024-02-01 15:46:39 -08:00

async_llm_engine.py

fix some bugs (#2689)

2024-01-31 10:09:23 -08:00

llm_engine.py

Use CuPy for CUDA graphs (#2811)

2024-02-13 11:32:06 -08:00

metrics.py

Refactor Prometheus and Add Request Level Metrics (#2316)

2024-01-31 14:58:07 -08:00

ray_utils.py

[Ray] Integration compiled DAG off by default (#2471)

2024-02-08 09:57:25 -08:00