
# Unit Testing

This page explains how to write unit tests to verify the implementation of your model.

## Required Tests

These tests are required for your PR to be merged into the vLLM library.
Without them, the CI for your PR will fail.

### Model loading

Include an example Hugging Face repository for your model in [tests/models/registry.py](../../../tests/models/registry.py).
This enables a unit test that loads dummy weights to ensure that the model can be initialized in vLLM.

!!! important
    The list of models in each section should be maintained in alphabetical order.

!!! tip
    If your model requires a development version of HF Transformers, you can set
    `min_transformers_version` to skip the test in CI until the model is released.
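
To make the two notes above concrete, here is a minimal, self-contained sketch of the idea. `ExampleInfo`, `parse_version`, `should_skip`, `is_alphabetical`, the architecture names, and the repository IDs are all hypothetical stand-ins invented for illustration. They are not the actual contents of `tests/models/registry.py`, so check that file for the real entry format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExampleInfo:
    """Hypothetical stand-in for a registry entry (not vLLM's actual class)."""

    default: str  # example Hugging Face repository ID
    min_transformers_version: Optional[str] = None


def parse_version(version: str) -> tuple:
    """Keep the leading numeric components, padded to three parts.

    For example, "4.50.0.dev0" becomes (4, 50, 0) and "4.99" becomes (4, 99, 0).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def should_skip(info: ExampleInfo, installed_version: str) -> bool:
    """Skip the loading test when the installed Transformers is too old."""
    if info.min_transformers_version is None:
        return False
    required = parse_version(info.min_transformers_version)
    return parse_version(installed_version) < required


def is_alphabetical(section: dict) -> bool:
    """Check that the entries of one registry section are sorted by name."""
    names = list(section)
    return names == sorted(names)


# Hypothetical section: keys are model architecture names.
EXAMPLE_SECTION = {
    "AlphaForCausalLM": ExampleInfo("example-org/alpha-7b"),
    "MyNewModelForCausalLM": ExampleInfo(
        "example-org/my-new-model",
        min_transformers_version="4.99.0",
    ),
}
```

With these definitions, `should_skip(EXAMPLE_SECTION["MyNewModelForCausalLM"], "4.50.0")` is `True`, so the test would be skipped until a new enough Transformers release is installed.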

## Optional Tests

These tests are not required for your PR to be merged into the vLLM library.
Passing these tests provides more confidence that your implementation is correct, and helps avoid future regressions.

### Model correctness

These tests compare the model outputs of vLLM against [HF Transformers](https://github.com/huggingface/transformers). You can add new tests under the subdirectories of [tests/models](../../../tests/models).

#### Generative models

For [generative models](../../models/generative_models.md), there are two levels of correctness tests, as defined in [tests/models/utils.py](../../../tests/models/utils.py):

- Exact correctness (`check_outputs_equal`): The text generated by vLLM should exactly match the text generated by HF.
- Logprobs similarity (`check_logprobs_close`): The tokens generated by vLLM should be among the top-k logprobs generated by HF, and vice versa.
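
The difference between the two levels can be illustrated with a simplified sketch. The functions below are hypothetical and are not the helpers in `tests/models/utils.py`, which have their own signatures and error reporting. The sketch also assumes that the comparison stops at the first position where the two token sequences diverge, since later positions are conditioned on different prefixes.

```python
def outputs_equal(texts_a: list, texts_b: list) -> bool:
    """Exact correctness: every generated text must match exactly."""
    return len(texts_a) == len(texts_b) and all(
        a == b for a, b in zip(texts_a, texts_b)
    )


def logprobs_close(tokens_a: list, topk_a: list, tokens_b: list, topk_b: list) -> bool:
    """Logprobs similarity for one prompt (simplified).

    tokens_*: generated token IDs from each implementation.
    topk_*:   one dict per position, mapping token ID -> logprob for the
              top-k candidates at that position.
    """
    for pos, (tok_a, tok_b) in enumerate(zip(tokens_a, tokens_b)):
        if tok_a == tok_b:
            continue
        # First divergence: each side's token must be a top-k candidate
        # of the other side at this position.
        return tok_a in topk_b[pos] and tok_b in topk_a[pos]
    return True


# Toy data: the sequences diverge at the last position (3 vs. 4).
tokens_a = [1, 2, 3]
tokens_b = [1, 2, 4]
topk_a = [{1: -0.1, 5: -2.0}, {2: -0.2, 6: -2.5}, {3: -0.7, 4: -0.8}]
topk_b = [{1: -0.1, 5: -2.1}, {2: -0.2, 6: -2.4}, {4: -0.6, 3: -0.9}]
```

Here `logprobs_close(tokens_a, topk_a, tokens_b, topk_b)` passes even though the token sequences differ, which is why this level tolerates small numerical differences between implementations, while the exact check does not.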

#### Pooling models

For [pooling models](../../models/pooling_models.md), we check the cosine similarity between the outputs of vLLM and HF, as defined in [tests/models/utils.py](../../../tests/models/utils.py).
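
As a rough illustration of what a cosine-similarity check looks like, here is a pure-Python sketch. The function names and the tolerance value are assumptions made for this example and are not taken from `tests/models/utils.py`.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embeddings_close(a: list, b: list, tol: float = 1e-2) -> bool:
    """Pass if the similarity is within `tol` of 1.0 (hypothetical tolerance)."""
    return 1.0 - cosine_similarity(a, b) <= tol
```

Because cosine similarity ignores vector magnitude, this kind of check tolerates a uniform scaling difference between two embeddings but flags a change in direction.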

### Multi-modal processing

#### Common tests

Adding your model to [tests/models/multimodal/processing/test_common.py](../../../tests/models/multimodal/processing/test_common.py) verifies that the following input combinations result in the same outputs:

- Text + multi-modal data
- Tokens + multi-modal data
- Text + cached multi-modal data
- Tokens + cached multi-modal data

#### Model-specific tests

You can add a new file under [tests/models/multimodal/processing](../../../tests/models/multimodal/processing) to run tests that only apply to your model.

For example, if the HF processor for your model accepts user-specified keyword arguments, you can verify that the keyword arguments are being applied correctly, such as in [tests/models/multimodal/processing/test_phi3v.py](../../../tests/models/multimodal/processing/test_phi3v.py).