# Registering a Model
vLLM relies on a model registry to determine how to run each model.
A list of pre-registered architectures can be found [here](../../models/supported_models.md).
If your model is not on this list, you must register it with vLLM.
This page provides detailed instructions on how to do so.
## Built-in models
To add a model directly to the vLLM library, start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source][build-from-source].
This gives you the ability to modify the codebase and test your model.
After you have implemented your model (see [tutorial](basic.md)), put it into the <gh-dir:vllm/model_executor/models> directory.
Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
Finally, update our [list of supported models](../../models/supported_models.md) to promote your model!
!!! important
    The list of models in each section should be maintained in alphabetical order.
## Out-of-tree models
You can load an external model [using a plugin](../../design/plugin_system.md) without modifying the vLLM codebase.
To register the model, use the following code:
```python
# The entrypoint of your plugin
def register():
    from vllm import ModelRegistry
    from your_code import YourModelForCausalLM

    ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
```
If your model imports modules that initialize CUDA, consider lazy-importing the model instead, so as to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:
```python
# The entrypoint of your plugin
def register():
    from vllm import ModelRegistry

    ModelRegistry.register_model(
        "YourModelForCausalLM",
        "your_code:YourModelForCausalLM",
    )
```
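For vLLM to discover and call your `register()` function at startup, the plugin package typically exposes it through a Python entry point in the `vllm.general_plugins` group. A minimal `pyproject.toml` sketch, where the package name and the `register_your_model` key are placeholders, might look like this:

```toml
[project]
name = "vllm-your-model-plugin"
version = "0.1.0"

# vLLM invokes every function registered under this entry point group,
# so the register() function above runs automatically on startup.
[project.entry-points."vllm.general_plugins"]
register_your_model = "your_code:register"
```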
!!! important
    If your model is a multimodal model, ensure the model class implements the [SupportsMultiModal][vllm.model_executor.models.interfaces.SupportsMultiModal] interface.
    Read more about that [here](multimodal.md).