[Doc] Update CPU doc (#20676)

Signed-off-by: jiang1.li <jiang1.li@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
This commit is contained in:
Li, Jiang
2025-07-10 01:28:30 +08:00
committed by GitHub
parent 0bbac1c1b4
commit 138709f8d1
5 changed files with 99 additions and 84 deletions

View File

@@ -61,6 +61,23 @@ Execute the following commands to build and install vLLM from the source.
# --8<-- [end:pre-built-images]
# --8<-- [start:build-image-from-source]
```bash
docker build -f docker/Dockerfile.s390x \
--tag vllm-cpu-env .
# Launching OpenAI server
docker run --rm \
--privileged=true \
--shm-size=4g \
-p 8000:8000 \
-e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
-e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
vllm-cpu-env \
--model=meta-llama/Llama-3.2-1B-Instruct \
--dtype=float \
other vLLM OpenAI server arguments
```
# --8<-- [end:build-image-from-source]
# --8<-- [start:extra-information]
# --8<-- [end:extra-information]