[Doc] Unify structured outputs examples (#18196)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
This commit is contained in:
54
examples/online_serving/structured_outputs/README.md
Normal file
54
examples/online_serving/structured_outputs/README.md
Normal file
@@ -0,0 +1,54 @@
|
||||
# Structured Outputs
|
||||
|
||||
This script demonstrates various structured output capabilities of vLLM's OpenAI-compatible server.
|
||||
It can run individual constraint type or all of them.
|
||||
It supports both streaming responses and concurrent non-streaming requests.
|
||||
|
||||
To use this example, you must start an vLLM server with any model of your choice.
|
||||
|
||||
```bash
|
||||
vllm serve Qwen/Qwen2.5-3B-Instruct
|
||||
```
|
||||
|
||||
To serve a reasoning model, you can use the following command:
|
||||
|
||||
```bash
|
||||
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B --reasoning-parser deepseek_r1
|
||||
```
|
||||
|
||||
If you want to run this script standalone with `uv`, you can use the following:
|
||||
|
||||
```bash
|
||||
uvx --from git+https://github.com/vllm-project/vllm#subdirectory=examples/online_serving/structured_outputs structured-output
|
||||
```
|
||||
|
||||
See [feature docs](../../../features/structured_outputs.md) for more information.
|
||||
|
||||
!!! tip
|
||||
If vLLM is running remotely, then set `OPENAI_BASE_URL=<remote_url>` before running the script.
|
||||
|
||||
## Usage
|
||||
|
||||
Run all constraints, non-streaming:
|
||||
|
||||
```bash
|
||||
uv run structured_outputs.py
|
||||
```
|
||||
|
||||
Run all constraints, streaming:
|
||||
|
||||
```bash
|
||||
uv run structured_outputs.py --stream
|
||||
```
|
||||
|
||||
Run certain constraints, for example `structural_tag` and `regex`, streaming:
|
||||
|
||||
```bash
|
||||
uv run structured_outputs.py --constraint structural_tag regex --stream
|
||||
```
|
||||
|
||||
Run all constraints, with reasoning models and streaming:
|
||||
|
||||
```bash
|
||||
uv run structured_outputs.py --reasoning --stream
|
||||
```
|
||||
Reference in New Issue
Block a user