[Docs] Document security risks of GPT-OSS Python tool (#35139)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
@@ -219,6 +219,47 @@ The most effective approach is to deploy vLLM behind a reverse proxy (such as ng
- Blocks all other endpoints, including the unauthenticated inference and operational control endpoints
- Implements additional authentication, rate limiting, and logging at the proxy layer
## Tool Server and MCP Security
vLLM supports connecting to external tool servers via the `--tool-server` argument. This enables models to call tools through the Responses API (`/v1/responses`). Tool server support works with all models — it is not limited to specific model architectures.
**Important:** No tool servers are enabled by default. They must be explicitly opted into via configuration.
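
For illustration, tool servers are opted into at launch time. The commands below are a sketch, not a recommendation: the model name and host/port addresses are placeholders.

```shell
# Hypothetical launch examples; model name and addresses are placeholders.

# Enable the built-in demo tools (see the next section):
vllm serve openai/gpt-oss-20b --tool-server demo

# Or point at external MCP tool servers as a comma-separated host:port list:
vllm serve openai/gpt-oss-20b --tool-server mcp-host-1:8001,mcp-host-2:8002
```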
### Built-in Demo Tools (GPT-OSS)
Passing `--tool-server demo` enables built-in demo tools that work with any model that supports tool calling. The tool implementations are not part of vLLM — they are provided by the separately installed [`gpt-oss`](https://github.com/openai/gpt-oss) package. vLLM provides thin wrappers that delegate to `gpt-oss`.
- **Code interpreter** (`python`): Python execution via Docker (via `gpt_oss.tools.python_docker`)
- **Web browser** (`browser`): Search via Exa API, requires `EXA_API_KEY` (via `gpt_oss.tools.simple_browser`)
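
As a sketch, enabling the demo tools might look like the following. The model name is a placeholder, and `EXA_API_KEY` is only required if the browser tool will actually be used; the Python tool additionally requires a running Docker daemon.

```shell
# Hypothetical example; requires the gpt-oss package to be installed.
export EXA_API_KEY="<your-exa-api-key>"   # only needed for the browser tool
vllm serve openai/gpt-oss-20b --tool-server demo
```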
#### Code Interpreter (Python Tool) Security Risks
The code interpreter executes model-generated code inside a Docker container. However, the container is **not configured with network isolation by default**. It inherits the host's Docker networking configuration (e.g., default bridge network or `--network=host`), which means:
- The container may be able to access the host network and LAN.
- Internal services reachable from the container may be exploited via SSRF (Server-Side Request Forgery).
- Cloud metadata services (e.g., `169.254.169.254`) may be accessible.
- If vulnerable internal services (such as `torch.distributed` endpoints) are reachable from the container, this could be used to attack them.
This is particularly concerning because the code being executed is generated by the model, which may be influenced by adversarial inputs (prompt injection).
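
One way to make this concrete is to check what a container on your host's default Docker network can reach. The probe below is an illustrative sketch (the image and endpoint are examples, not part of vLLM or gpt-oss); model-generated code in the tool container would have the same reachability.

```shell
# Illustrative check only: from a container on the default bridge network,
# probe an endpoint that model-generated code could also reach.
docker run --rm curlimages/curl -s --max-time 2 \
  http://169.254.169.254/latest/meta-data/ \
  && echo "cloud metadata service is reachable from containers"
```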
#### Controlling Built-in Tool Availability
Built-in demo tools are controlled by two settings:
1. **`--tool-server demo`**: Enables the built-in demo tools (browser and Python code interpreter).
2. **`VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS`**: When built-in tools are requested via the `mcp` tool type in the Responses API, this comma-separated allowlist controls which tool labels are permitted. Valid values are:
- `container` - Container tool
- `code_interpreter` - Python code execution tool
- `web_search_preview` - Web search/browser tool
If this variable is not set or is empty, no built-in tools requested via the `mcp` tool type will be enabled.
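
For context, a request that asks for a built-in tool via the `mcp` tool type might look like the sketch below. The exact payload fields are an assumption based on the Responses API shape, and the model name is a placeholder; consult the Responses API reference for the authoritative schema.

```shell
# Hypothetical request; it is honored only if "code_interpreter" appears in
# VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS on the server.
curl http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openai/gpt-oss-20b",
        "input": "Compute 2**10 with Python.",
        "tools": [{"type": "mcp", "server_label": "code_interpreter"}]
      }'
```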
To disable the Python code interpreter specifically, omit `code_interpreter` from `VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS`.
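
The allowlist semantics can be illustrated with a small shell sketch. This mimics comma-separated matching for one label; it is not vLLM's actual parsing code.

```shell
# Illustrative only: approximate the allowlist check for one label.
VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS="container,web_search_preview"
case ",${VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS}," in
  *,code_interpreter,*) echo "code_interpreter: enabled" ;;
  *)                    echo "code_interpreter: disabled" ;;
esac
```

Because `code_interpreter` is omitted from the list above, the check reports the Python tool as disabled.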
**Consider a custom implementation**: The GPT-OSS Python tool is a reference implementation. For production deployments, consider implementing a custom code execution sandbox with stricter isolation guarantees. See the [GPT-OSS documentation](https://github.com/openai/gpt-oss?tab=readme-ov-file#python) for guidance.
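
As a starting point for "stricter isolation", a hardened container invocation for a custom sandbox could look like this sketch. The flags and image are illustrative; adjust them to your threat model.

```shell
# Illustrative hardening for a custom code-execution sandbox:
#   --network=none  blocks SSRF and cloud metadata access
#   --read-only     makes the root filesystem immutable
#   --cap-drop=ALL  drops all Linux capabilities
docker run --rm --network=none --read-only --cap-drop=ALL \
  --pids-limit=64 --memory=512m --cpus=1 \
  python:3.12-slim python -c "print('sandboxed')"
```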
## Reporting Security Vulnerabilities
If you believe you have found a security vulnerability in vLLM, please report it following the project's security policy. For more information on how to report security issues and the project's security policy, please see the [vLLM Security Policy](https://github.com/vllm-project/vllm/blob/main/SECURITY.md).