[Misc][Doc] Add note regarding loading generation_config by default (#15281)

Signed-off-by: Roger Wang <ywang@roblox.com>
Roger Wang
2025-03-23 14:00:55 -07:00
committed by GitHub
parent d6cd59f122
commit 9c5c81b0da
4 changed files with 27 additions and 1 deletions


@@ -46,6 +46,11 @@ for output in outputs:
print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```
:::{important}
By default, vLLM applies the sampling parameters recommended by the model creator by loading `generation_config.json` from the Hugging Face model repository, if it exists. In most cases, this provides the best results by default when {class}`~vllm.SamplingParams` is not specified.
However, if vLLM's default sampling parameters are preferred, pass `generation_config="vllm"` when creating the {class}`~vllm.LLM` instance.
:::
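The precedence described above can be sketched as plain Python. This is an illustrative model of the documented behavior, not vLLM's actual implementation; the function name, the default values, and the `generation_config` keyword handling here are assumptions for demonstration only.

```python
# Hypothetical sketch of the sampling-parameter precedence described above:
# explicit user parameters win, then generation_config.json values,
# then vLLM's built-in defaults. Not vLLM's real code.

VLLM_DEFAULTS = {"temperature": 1.0, "top_p": 1.0}  # assumed defaults for illustration

def resolve_sampling_params(user_params=None, hf_generation_config=None,
                            generation_config="auto"):
    """Merge sampling parameters following the documented precedence.

    generation_config="auto": apply generation_config.json if present.
    generation_config="vllm": ignore it and keep vLLM's own defaults.
    """
    params = dict(VLLM_DEFAULTS)
    if generation_config == "auto" and hf_generation_config:
        params.update(hf_generation_config)  # model creator's recommendations
    if user_params:
        params.update(user_params)           # explicitly passed values always win
    return params

# With a repo generation_config.json recommending temperature 0.6:
hf_cfg = {"temperature": 0.6}
print(resolve_sampling_params(hf_generation_config=hf_cfg))
print(resolve_sampling_params(hf_generation_config=hf_cfg,
                              generation_config="vllm"))
```

In this sketch, the first call picks up the repo-recommended temperature, while the second keeps the built-in defaults, mirroring the effect of passing `generation_config="vllm"`.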
A code example can be found here: <gh-file:examples/offline_inference/basic/basic.py>
### `LLM.beam_search`