[Doc]: fixing typos in various files (#30540)

Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Didier Durand <2927957+didier-durand@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
This commit is contained in:
Didier Durand
2025-12-14 11:14:37 +01:00
committed by GitHub
parent add1b9d3de
commit 1a55cfafcb
12 changed files with 17 additions and 17 deletions

View File

@@ -7,7 +7,7 @@ This guide covers optimization strategies and performance tuning for vLLM V1.
## Preemption
Due to the auto-regressive nature of transformer architecture, there are times when KV cache space is insufficient to handle all batched requests.
Due to the autoregressive nature of transformer architecture, there are times when KV cache space is insufficient to handle all batched requests.
In such cases, vLLM can preempt requests to free up KV cache space for other requests. Preempted requests are recomputed when sufficient KV cache space becomes
available again. When this occurs, you may see the following warning: