[Perf][Frontend] eliminate api_key and x_request_id headers middleware overhead (#19946)
Signed-off-by: Yazan-Sharaya <yazan.sharaya.yes@gmail.com>
This commit is contained in:
@@ -146,11 +146,6 @@ completion = client.chat.completions.create(
|
||||
Only `X-Request-Id` HTTP request header is supported for now. It can be enabled
|
||||
with `--enable-request-id-headers`.
|
||||
|
||||
> Note that enablement of the headers can impact performance significantly at high QPS
|
||||
> rates. We recommend implementing HTTP headers at the router level (e.g. via Istio),
|
||||
> rather than within the vLLM layer for this reason.
|
||||
> See [this PR](https://github.com/vllm-project/vllm/pull/11529) for more details.
|
||||
|
||||
??? Code
|
||||
|
||||
```python
|
||||
|
||||
Reference in New Issue
Block a user