---
title: Registering a Model
---
[](){ #new-model-registration }
vLLM relies on a model registry to determine how to run each model.
A list of pre-registered architectures can be found [here][supported-models].
If your model is not on this list, you must register it with vLLM.
This page provides detailed instructions on how to do so.
## Built-in models
To add a model directly to the vLLM library, start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source][build-from-source].
This gives you the ability to modify the codebase and test your model.
After you have implemented your model (see [tutorial][new-model-basic]), put it into the <gh-dir:vllm/model_executor/models> directory.
Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
Finally, update our [list of supported models][supported-models] to promote your model!

!!! important
    The list of models in each section should be maintained in alphabetical order.
## Out-of-tree models
You can load an external model [using a plugin][plugin-system] without modifying the vLLM codebase.
To register the model, use the following code:
```python
# The entrypoint of your plugin
def register():
    from vllm import ModelRegistry
    from your_code import YourModelForCausalLM

    # Map the architecture name to the model class
    ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
```
If your model imports modules that initialize CUDA, consider lazy-importing it by passing a `<module>:<class>` string instead of the class itself, to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:
```python
# The entrypoint of your plugin
def register():
    from vllm import ModelRegistry

    # Pass the location as a string so that `your_code` is not imported here
    ModelRegistry.register_model(
        "YourModelForCausalLM",
        "your_code:YourModelForCausalLM",
    )
```
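
To show why a `module:Class` string defers the import, here is a minimal sketch of how such a string could be resolved on demand. It illustrates the general idea using only the standard library and is not vLLM's actual implementation; the helper name `resolve_lazily` is hypothetical, and `json:JSONDecoder` stands in for `your_code:YourModelForCausalLM`.

```python
import importlib


def resolve_lazily(qualified_name: str):
    """Import the module named in a "module:Class" string and return the class.

    Nothing is imported until this function is called, so any heavy
    initialization in the target module is postponed until then.
    """
    module_name, _, class_name = qualified_name.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A standard-library class is used here in place of a real model class.
decoder_cls = resolve_lazily("json:JSONDecoder")
```

With this pattern, the process that registers the model only stores a string, and the module is imported later by whichever process actually needs the class.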
!!! important
    If your model is a multimodal model, ensure the model class implements the [SupportsMultiModal][vllm.model_executor.models.interfaces.SupportsMultiModal] interface.
    Read more about that [here][supports-multimodal].