# vllm/requirements.txt
ninja # For faster builds.
psutil
ray >= 2.9
sentencepiece # Required for LLaMA tokenizer.
numpy
torch == 2.1.2
transformers >= 4.37.0 # Required for Qwen2.
xformers == 0.0.23.post1 # Required for CUDA 12.1.
fastapi
uvicorn[standard]
pydantic >= 2.0 # Required for OpenAI server.
aioprometheus[starlette]
pynvml == 11.5.0
triton >= 2.1.0
cupy-cuda12x == 12.3.0 # Required for CUDA graphs. CUDA 11.8 users should install cupy-cuda11x instead.
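# Sketch of the CUDA 11.8 alternative mentioned above. The version pin is an
# assumption carried over from the cupy-cuda12x line, not something this file
# specifies. To use it, comment out the cupy-cuda12x line and uncomment this one:
# cupy-cuda11x == 12.3.0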