# Using vLLM
First, vLLM must be [installed](../getting_started/installation/) for your chosen device in either a Python or Docker environment.
Then, vLLM supports the following usage patterns:
- [Inference and Serving](../serving/offline_inference.md): Run a single instance of a model.
- [Deployment](../deployment/docker.md): Scale up model instances for production.
- [Training](../training/rlhf.md): Train or fine-tune a model.