[V1] Structured Outputs + Thinking compatibility (#16577)

Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Aaron Pham
2025-05-14 18:45:24 -04:00
committed by GitHub
parent d93c976a0d
commit 2fc9075b82
10 changed files with 233 additions and 75 deletions

@@ -141,10 +141,10 @@ Remember to check whether the `reasoning_content` exists in the response before
The reasoning content is also available in the structured output. The structured output engine like `xgrammar` will use the reasoning content to generate structured output. It is only supported in v0 engine now.
```bash
-VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
+vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
```
-Please note that the `VLLM_USE_V1` environment variable must be set to `0` to use the v0 engine.
The following is an example client:
```python
from openai import OpenAI