fix some typos (#24071)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
This commit is contained in:
@@ -210,7 +210,7 @@ vllm serve Qwen/Qwen2.5-VL-3B-Instruct --api-server-count 4 -dp 2
|
||||
|
||||
!!! note
|
||||
API server scale-out disables [multi-modal IPC caching](#ipc-caching)
|
||||
because it requires a one-to-one correspondance between API and engine core processes.
|
||||
because it requires a one-to-one correspondence between API and engine core processes.
|
||||
|
||||
This does not impact [multi-modal processor caching](#processor-caching).
|
||||
|
||||
@@ -227,7 +227,7 @@ to avoid repeatedly processing the same multi-modal inputs in `BaseMultiModalPro
|
||||
### IPC Caching
|
||||
|
||||
Multi-modal IPC caching is automatically enabled when
|
||||
there is a one-to-one correspondance between API (`P0`) and engine core (`P1`) processes,
|
||||
there is a one-to-one correspondence between API (`P0`) and engine core (`P1`) processes,
|
||||
to avoid repeatedly transferring the same multi-modal inputs between them.
|
||||
|
||||
### Configuration
|
||||
|
||||
Reference in New Issue
Block a user