---
title: Summary
---
[](){ #new-model }

!!! important
    Many decoder language models can now be automatically loaded using the [Transformers backend][transformers-backend] without having to implement them in vLLM. See if `vllm serve <model>` works first!

vLLM models are specialized [PyTorch](https://pytorch.org/) models that take advantage of various [features][compatibility-matrix] to optimize their performance.

The complexity of integrating a model into vLLM depends heavily on the model's architecture.
The process is considerably more straightforward if the model shares a similar architecture with an existing model in vLLM.
However, it can be more complex for models that include new operators (e.g., a new attention mechanism).

Read through these pages for a step-by-step guide:

- [Basic Model](basic.md)
- [Registering a Model](registration.md)
- [Unit Testing](tests.md)
- [Multi-Modal Support](multimodal.md)

!!! tip
    If you are encountering issues while integrating your model into vLLM, feel free to open a [GitHub issue](https://github.com/vllm-project/vllm/issues)
    or ask on our [developer Slack](https://slack.vllm.ai).
We will be happy to help you out!