[Perf][Frontend] eliminate api_key and x_request_id headers middleware overhead (#19946)

Signed-off-by: Yazan-Sharaya <yazan.sharaya.yes@gmail.com>
This commit is contained in:
Yazan Sharaya
2025-06-27 07:44:14 +03:00
committed by GitHub
parent cd4cfee689
commit 6e244ae091
4 changed files with 190 additions and 33 deletions

View File

@@ -146,11 +146,6 @@ completion = client.chat.completions.create(
Only `X-Request-Id` HTTP request header is supported for now. It can be enabled
with `--enable-request-id-headers`.
> Note that enablement of the headers can impact performance significantly at high QPS
> rates. We recommend implementing HTTP headers at the router level (e.g. via Istio),
> rather than within the vLLM layer for this reason.
> See [this PR](https://github.com/vllm-project/vllm/pull/11529) for more details.
??? Code
```python