# Reproducibility

vLLM does not guarantee reproducible results by default, for the sake of performance. To achieve reproducible results:

- In offline mode, you can either set `VLLM_ENABLE_V1_MULTIPROCESSING=0`, which makes scheduling deterministic, or enable [batch invariance](../features/batch_invariance.md) to make the outputs insensitive to scheduling.
- In online mode, you can only enable [batch invariance](../features/batch_invariance.md).
Example: [examples/offline_inference/reproducibility.py](../../examples/offline_inference/reproducibility.py)
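For instance, the linked script can be run with deterministic scheduling by setting the environment variable on the command line (offline mode; requires a working vLLM installation):

```shell
VLLM_ENABLE_V1_MULTIPROCESSING=0 python examples/offline_inference/reproducibility.py
```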
!!! warning

    Setting `VLLM_ENABLE_V1_MULTIPROCESSING=0` will change the random state of the user code
    (i.e. the code that constructs the [LLM][vllm.LLM] class).
!!! note
    Even with the above settings, vLLM only provides reproducible results
    when run on the same hardware with the same vLLM version.
## Setting the global seed
The `seed` parameter in vLLM is used to control the random states for various random number generators.
If a specific seed value is provided, the random states for `random`, `np.random`, and `torch.manual_seed` will be set accordingly.
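The effect of a fixed seed can be sketched with the stdlib alone. This is not vLLM's actual implementation: `set_global_seed` is a hypothetical helper, and the NumPy/PyTorch calls it would also make are only noted in comments.

```python
import random

def set_global_seed(seed: int) -> None:
    # Hypothetical helper mirroring the seeding described above.
    # vLLM additionally seeds np.random and calls torch.manual_seed(seed);
    # those are omitted to keep this sketch stdlib-only.
    random.seed(seed)

def sample_run(seed: int, n: int = 5) -> list[float]:
    # Re-seeding before sampling makes every "run" draw the same values.
    set_global_seed(seed)
    return [random.random() for _ in range(n)]

assert sample_run(0) == sample_run(0)   # same seed: identical draws
assert sample_run(0) != sample_run(1)   # different seed: different draws
```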
### Default Behavior
In V1, the `seed` parameter defaults to `0`, which sets the random state for each worker, so results remain consistent across vLLM runs even if `temperature > 0`.
The seed cannot be left unspecified in V1 because different workers need to sample the same outputs
for workflows such as speculative decoding. For more information, see: <https://github.com/vllm-project/vllm/pull/17929>
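A toy illustration of why the seed must be shared: two identically seeded "workers" draw the same values. The names and the vocabulary size below are made up for the sketch; real workers sample token IDs from model logits.

```python
import random

VOCAB_SIZE = 50_000  # made-up stand-in for a real tokenizer vocabulary

def worker_draws(seed: int, n: int = 4) -> list[int]:
    # Each worker owns its own generator, but seeding them identically
    # means they "sample" the same token IDs, which speculative decoding
    # and similar multi-worker workflows depend on.
    rng = random.Random(seed)
    return [rng.randrange(VOCAB_SIZE) for _ in range(n)]

assert worker_draws(seed=0) == worker_draws(seed=0)
```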
!!! note
    The random state in user code (i.e. the code that constructs the [LLM][vllm.LLM] class) is updated by vLLM
    only if the workers run in the same process as the user code, i.e. when `VLLM_ENABLE_V1_MULTIPROCESSING=0`.
    By default, `VLLM_ENABLE_V1_MULTIPROCESSING=1`, so you can use vLLM without having to worry about
    it accidentally making subsequent operations that rely on random state deterministic.
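The in-process case can be sketched with plain `random`; the seed values and the re-seed call below are stand-ins for what vLLM does internally, not its actual API.

```python
import random

random.seed(1234)   # user code sets up its own random state

# With VLLM_ENABLE_V1_MULTIPROCESSING=0, the in-process worker re-seeds
# the same global generator; the next line stands in for that:
random.seed(0)      # stand-in for vLLM's internal seeding
clobbered = random.random()

random.seed(1234)   # what the user's next draw would have been otherwise
expected = random.random()

# The user-visible random stream changed because vLLM re-seeded in-process.
assert clobbered != expected
```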