[V1] Structured Outputs + Thinking compatibility (#16577)

Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-05-14 18:45:24 -04:00
parent d93c976a0d
commit 2fc9075b82
10 changed files with 233 additions and 75 deletions
--- a/docs/source/features/reasoning_outputs.md
+++ b/docs/source/features/reasoning_outputs.md
@@ -141,10 +141,10 @@ Remember to check whether the `reasoning_content` exists in the response before
 The reasoning content is also available in the structured output. The structured output engine like `xgrammar` will use the reasoning content to generate structured output. It is only supported in v0 engine now.

 ```bash
-VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
+vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
 ```

-Please note that the `VLLM_USE_V1` environment variable must be set to `0` to use the v0 engine.
+The following is an example client:

 ```python
 from openai import OpenAI