[Docs] use uv in CPU installation docs (#22089)

Signed-off-by: David Xia <david@davidxia.com>
Author: David Xia
Date:   2025-08-01 10:55:55 -04:00
Committed by: GitHub
Parent: 3146519add
Commit: 97608dc276
3 changed files with 48 additions and 31 deletions


@@ -1,6 +1,6 @@
 # --8<-- [start:installation]
-vLLM has experimental support for s390x architecture on IBM Z platform. For now, users shall build from the vLLM source to natively run on IBM Z platform.
+vLLM has experimental support for s390x architecture on IBM Z platform. For now, users must build from source to natively run on IBM Z platform.
 Currently the CPU implementation for s390x architecture supports FP32 datatype only.
@@ -40,21 +40,32 @@ curl https://sh.rustup.rs -sSf | sh -s -- -y && \
 . "$HOME/.cargo/env"
 ```
-Execute the following commands to build and install vLLM from the source.
+Execute the following commands to build and install vLLM from source.
 !!! tip
-    Please build the following dependencies, `torchvision`, `pyarrow` from the source before building vLLM.
+    Please build the following dependencies, `torchvision`, `pyarrow` from source before building vLLM.
 ```bash
 sed -i '/^torch/d' requirements-build.txt # remove torch from requirements-build.txt since we use nightly builds
-pip install -v \
-    --extra-index-url https://download.pytorch.org/whl/nightly/cpu \
+uv pip install -v \
+    --torch-backend auto \
     -r requirements-build.txt \
     -r requirements-cpu.txt \
 VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
-pip install dist/*.whl
+uv pip install dist/*.whl
 ```
+
+??? console "pip"
+
+    ```bash
+    sed -i '/^torch/d' requirements-build.txt # remove torch from requirements-build.txt since we use nightly builds
+    pip install -v \
+        --extra-index-url https://download.pytorch.org/whl/nightly/cpu \
+        -r requirements-build.txt \
+        -r requirements-cpu.txt \
+    VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
+    pip install dist/*.whl
+    ```
 # --8<-- [end:build-wheel-from-source]
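As an aside, the `sed -i '/^torch/d'` step in the build instructions deletes every line beginning with `torch` from `requirements-build.txt` so the nightly PyTorch build is used instead. A minimal sketch of its effect on a scratch file (hypothetical contents, not the real requirements file):

```shell
# Create a scratch requirements file (hypothetical contents)
printf 'cmake>=3.26\ntorch==2.4.0\ntorchvision\nwheel\n' > /tmp/requirements-build.txt

# Drop every line starting with "torch" (note this also matches torchvision), editing in place
sed -i '/^torch/d' /tmp/requirements-build.txt

cat /tmp/requirements-build.txt   # only cmake>=3.26 and wheel remain
```

Note that the pattern `/^torch/` also removes `torchvision` lines, which is why the tip says to build `torchvision` from source separately.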
# --8<-- [start:pre-built-images]
@@ -63,19 +74,19 @@ Execute the following commands to build and install vLLM from source.
 ```bash
 docker build -f docker/Dockerfile.s390x \
-        --tag vllm-cpu-env .
+    --tag vllm-cpu-env .
 
-# Launching OpenAI server
+# Launch OpenAI server
 docker run --rm \
-    --privileged=true \
-    --shm-size=4g \
-    -p 8000:8000 \
-    -e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
-    -e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
-    vllm-cpu-env \
-    --model=meta-llama/Llama-3.2-1B-Instruct \
-    --dtype=float \
-    other vLLM OpenAI server arguments
+    --privileged true \
+    --shm-size 4g \
+    -p 8000:8000 \
+    -e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
+    -e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
+    vllm-cpu-env \
+    --model meta-llama/Llama-3.2-1B-Instruct \
+    --dtype float \
+    other vLLM OpenAI server arguments
 ```
 # --8<-- [end:build-image-from-source]
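Once the container is up, the server speaks the standard OpenAI-compatible API on the mapped port. A hedged smoke-test sketch (the port follows the `-p 8000:8000` mapping and the model name the `docker run` command above; the `curl` call is left commented out since it needs the live server):

```shell
# Request body for the OpenAI-compatible completions endpoint
# (model name taken from the docker run command above)
PAYLOAD='{"model": "meta-llama/Llama-3.2-1B-Instruct", "prompt": "Hello", "max_tokens": 16}'

# Validate the JSON locally before sending
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload OK"

# Send it to the running server (uncomment once the container is up):
# curl http://localhost:8000/v1/completions \
#     -H "Content-Type: application/json" \
#     -d "$PAYLOAD"
```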