[Feature][Frontend]: Deprecate --enable-reasoning (#17452)

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
This commit is contained in:
Chauncey
2025-05-01 21:46:16 +08:00
committed by GitHub
parent f5a3c655b2
commit 98060b001d
16 changed files with 49 additions and 91 deletions


@@ -21,11 +21,10 @@ vLLM currently supports the following reasoning models:
## Quickstart
-To use reasoning models, you need to specify the `--enable-reasoning` and `--reasoning-parser` flags when making a request to the chat completion endpoint. The `--reasoning-parser` flag specifies the reasoning parser to use for extracting reasoning content from the model output.
+To use reasoning models, you need to specify the `--reasoning-parser` flag when making a request to the chat completion endpoint. The `--reasoning-parser` flag specifies the reasoning parser to use for extracting reasoning content from the model output.
```bash
-vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
-    --enable-reasoning --reasoning-parser deepseek_r1
+vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
```
Next, make a request to the model that should return the reasoning content in the response.
@@ -140,8 +139,7 @@ Remember to check whether the `reasoning_content` exists in the response before
The reasoning content is also available in the structured output. Structured output engines like `xgrammar` will use the reasoning content to generate structured output. This is currently supported only in the v0 engine.
```bash
-VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
-    --enable-reasoning --reasoning-parser deepseek_r1
+VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
```
Please note that the `VLLM_USE_V1` environment variable must be set to `0` to use the v0 engine.
@@ -316,9 +314,8 @@ class DeepSeekReasoner(Reasoner):
Structured output engines like `xgrammar` will use `end_token_id` to check whether reasoning content is still being generated in the model output, and skip structured output while that is the case.
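The end-token check described above can be sketched in a few lines. This is a hypothetical simplification, not vLLM's actual `DeepSeekReasoner` implementation, and the token ids are illustrative:

```python
from dataclasses import dataclass

@dataclass
class ReasonerSketch:
    """Hypothetical sketch of a reasoner's end-of-reasoning check."""
    start_token_id: int  # e.g. the id of a <think> token (illustrative)
    end_token_id: int    # e.g. the id of a </think> token (illustrative)

    def is_reasoning_end(self, token_ids: list[int]) -> bool:
        # The structured output engine holds off on grammar enforcement
        # until the end-of-reasoning token has been generated.
        return self.end_token_id in token_ids

reasoner = ReasonerSketch(start_token_id=100, end_token_id=101)
print(reasoner.is_reasoning_end([100, 5, 7]))       # False: still reasoning
print(reasoner.is_reasoning_end([100, 5, 7, 101]))  # True: reasoning finished
```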
-Finally, you can enable reasoning for the model by using the `--enable-reasoning` and `--reasoning-parser` flags.
+Finally, you can enable reasoning for the model by using the `--reasoning-parser` flag.
```bash
-vllm serve <model_tag> \
-    --enable-reasoning --reasoning-parser example
+vllm serve <model_tag> --reasoning-parser example
```
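The `example` name passed to `--reasoning-parser` resolves to a registered parser class. A self-contained sketch of that name-to-class registration pattern, with hypothetical names throughout (this is not vLLM's actual `ReasoningParser` API, and the `<think>` delimiters are illustrative):

```python
# Hypothetical registry mapping --reasoning-parser names to parser classes.
_PARSER_REGISTRY: dict[str, type] = {}

def register_parser(name: str):
    """Decorator that registers a parser class under a CLI-facing name."""
    def wrap(cls: type) -> type:
        _PARSER_REGISTRY[name] = cls
        return cls
    return wrap

@register_parser("example")
class ExampleReasoningParser:
    def extract(self, text: str) -> tuple[str, str]:
        """Split hypothetical <think>...</think> reasoning from the answer."""
        start, end = "<think>", "</think>"
        if start in text and end in text:
            reasoning = text.split(start, 1)[1].split(end, 1)[0]
            answer = text.split(end, 1)[1]
            return reasoning.strip(), answer.strip()
        return "", text.strip()

# Look up the parser the same way the server would resolve "example".
parser = _PARSER_REGISTRY["example"]()
print(parser.extract("<think>step 1</think> final answer"))
# ('step 1', 'final answer')
```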