[Docs] use uv in CPU installation docs (#22089)
Signed-off-by: David Xia <david@davidxia.com>
This commit is contained in:
@@ -1,6 +1,6 @@
|
||||
# --8<-- [start:installation]
|
||||
|
||||
vLLM has experimental support for s390x architecture on IBM Z platform. For now, users shall build from the vLLM source to natively run on IBM Z platform.
|
||||
vLLM has experimental support for s390x architecture on IBM Z platform. For now, users must build from source to natively run on IBM Z platform.
|
||||
|
||||
Currently the CPU implementation for s390x architecture supports FP32 datatype only.
|
||||
|
||||
@@ -40,21 +40,32 @@ curl https://sh.rustup.rs -sSf | sh -s -- -y && \
|
||||
. "$HOME/.cargo/env"
|
||||
```
|
||||
|
||||
Execute the following commands to build and install vLLM from the source.
|
||||
Execute the following commands to build and install vLLM from source.
|
||||
|
||||
!!! tip
|
||||
Please build the following dependencies, `torchvision`, `pyarrow` from the source before building vLLM.
|
||||
Please build the following dependencies, `torchvision`, `pyarrow` from source before building vLLM.
|
||||
|
||||
```bash
|
||||
sed -i '/^torch/d' requirements-build.txt # remove torch from requirements-build.txt since we use nightly builds
|
||||
pip install -v \
|
||||
--extra-index-url https://download.pytorch.org/whl/nightly/cpu \
|
||||
uv pip install -v \
|
||||
--torch-backend auto \
|
||||
-r requirements-build.txt \
|
||||
-r requirements-cpu.txt \
|
||||
VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
|
||||
pip install dist/*.whl
|
||||
uv pip install dist/*.whl
|
||||
```
|
||||
|
||||
??? console "pip"
|
||||
```bash
|
||||
sed -i '/^torch/d' requirements-build.txt # remove torch from requirements-build.txt since we use nightly builds
|
||||
pip install -v \
|
||||
--extra-index-url https://download.pytorch.org/whl/nightly/cpu \
|
||||
-r requirements-build.txt \
|
||||
-r requirements-cpu.txt \
|
||||
VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
|
||||
pip install dist/*.whl
|
||||
```
|
||||
|
||||
# --8<-- [end:build-wheel-from-source]
|
||||
# --8<-- [start:pre-built-images]
|
||||
|
||||
@@ -63,19 +74,19 @@ Execute the following commands to build and install vLLM from the source.
|
||||
|
||||
```bash
|
||||
docker build -f docker/Dockerfile.s390x \
|
||||
--tag vllm-cpu-env .
|
||||
--tag vllm-cpu-env .
|
||||
|
||||
# Launching OpenAI server
|
||||
# Launch OpenAI server
|
||||
docker run --rm \
|
||||
--privileged=true \
|
||||
--shm-size=4g \
|
||||
-p 8000:8000 \
|
||||
-e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
|
||||
-e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
|
||||
vllm-cpu-env \
|
||||
--model=meta-llama/Llama-3.2-1B-Instruct \
|
||||
--dtype=float \
|
||||
other vLLM OpenAI server arguments
|
||||
--privileged true \
|
||||
--shm-size 4g \
|
||||
-p 8000:8000 \
|
||||
-e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
|
||||
-e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
|
||||
vllm-cpu-env \
|
||||
--model meta-llama/Llama-3.2-1B-Instruct \
|
||||
--dtype float \
|
||||
other vLLM OpenAI server arguments
|
||||
```
|
||||
|
||||
# --8<-- [end:build-image-from-source]
|
||||
|
||||
Reference in New Issue
Block a user