Allow markdownlint to run locally (#36398)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Author: Harry Mellor
Date: 2026-03-09 03:05:24 +00:00
Committed by: GitHub
Parent: fde4771bbd
Commit: a0f44bb616

47 changed files with 394 additions and 392 deletions


@@ -1,9 +1,10 @@
-# --8<-- [start:installation]
+<!-- markdownlint-disable MD041 -->
+--8<-- [start:installation]
 vLLM supports basic model inferencing and serving on x86 CPU platform, with data types FP32, FP16 and BF16.
-# --8<-- [end:installation]
-# --8<-- [start:requirements]
+--8<-- [end:installation]
+--8<-- [start:requirements]
 - OS: Linux
 - CPU flags: `avx512f` (Recommended), `avx512_bf16` (Optional), `avx512_vnni` (Optional)
@@ -11,11 +12,11 @@ vLLM supports basic model inferencing and serving on x86 CPU platform, with data
 !!! tip
     Use `lscpu` to check the CPU flags.
-# --8<-- [end:requirements]
-# --8<-- [start:set-up-using-python]
+--8<-- [end:requirements]
+--8<-- [start:set-up-using-python]
-# --8<-- [end:set-up-using-python]
-# --8<-- [start:pre-built-wheels]
+--8<-- [end:set-up-using-python]
+--8<-- [start:pre-built-wheels]
 Pre-built vLLM wheels for x86 with AVX512 are available since version 0.13.0. To install release wheels:
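The `lscpu` tip in the hunk above can be made concrete; a minimal sketch that checks for the flags listed in the requirements (reading `/proc/cpuinfo` directly is an assumption, `lscpu` output can be grepped the same way):

```shell
# Report which of the required/optional ISA flags this CPU exposes.
# avx512f is recommended; avx512_bf16 and avx512_vnni are optional.
for flag in avx512f avx512_bf16 avx512_vnni; do
    if grep -qw "$flag" /proc/cpuinfo 2>/dev/null; then
        echo "$flag: present"
    else
        echo "$flag: missing"
    fi
done
```

A missing `avx512f` here means the pre-built wheels and images below will not run on this machine.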
@@ -25,6 +26,7 @@ export VLLM_VERSION=$(curl -s https://api.github.com/repos/vllm-project/vllm/rel
 # use uv
 uv pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cpu-cp38-abi3-manylinux_2_35_x86_64.whl --torch-backend cpu
 ```
+
 ??? console "pip"
     ```bash
     # use pip
@@ -46,7 +48,7 @@ uv pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VE
 export LD_PRELOAD="$TC_PATH:$IOMP_PATH:$LD_PRELOAD"
 ```
-**Install the latest code**
+#### Install the latest code
 To install the wheel built from the latest main branch:
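The `LD_PRELOAD` line in the context above presumes `TC_PATH` and `IOMP_PATH` have already been located; a hedged sketch of setting them (library names and the `/usr` search path are assumptions and vary across distros):

```shell
# Locate the tcmalloc and Intel OpenMP runtimes, then preload them for vLLM.
# Library file names and install locations are assumptions; adjust per distro.
TC_PATH=$(find /usr -name 'libtcmalloc_minimal.so*' 2>/dev/null | head -n 1)
IOMP_PATH=$(find /usr -name 'libiomp5.so' 2>/dev/null | head -n 1)
if [ -n "$TC_PATH" ] && [ -n "$IOMP_PATH" ]; then
    export LD_PRELOAD="$TC_PATH:$IOMP_PATH:$LD_PRELOAD"
else
    echo "tcmalloc or iomp5 not found; skipping LD_PRELOAD" >&2
fi
```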
@@ -54,7 +56,7 @@ To install the wheel built from the latest main branch:
 uv pip install vllm --extra-index-url https://wheels.vllm.ai/nightly/cpu --index-strategy first-index --torch-backend cpu
 ```
-**Install specific revisions**
+#### Install specific revisions
 If you want to access the wheels for previous commits (e.g. to bisect a behavior change or performance regression), you can specify the commit hash in the URL:
@@ -63,8 +65,8 @@ export VLLM_COMMIT=730bd35378bf2a5b56b6d3a45be28b3092d26519 # use full commit ha
 uv pip install vllm --extra-index-url https://wheels.vllm.ai/${VLLM_COMMIT}/cpu --index-strategy first-index --torch-backend cpu
 ```
-# --8<-- [end:pre-built-wheels]
-# --8<-- [start:build-wheel-from-source]
+--8<-- [end:pre-built-wheels]
+--8<-- [start:build-wheel-from-source]
 Install the recommended compiler. We recommend using `gcc/g++ >= 12.3.0` as the default compiler to avoid potential problems. For example, on Ubuntu 22.04, you can run:
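One common way to satisfy that compiler recommendation on Ubuntu 22.04 (a sketch; the package names and the `update-alternatives` priority are assumptions):

```shell
# Install gcc/g++ 12 plus libnuma headers and make them the default compilers.
sudo apt-get update -y
sudo apt-get install -y gcc-12 g++-12 libnuma-dev
sudo update-alternatives \
    --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 \
    --slave /usr/bin/g++ g++ /usr/bin/g++-12
```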
@@ -158,8 +160,8 @@ uv pip install dist/*.whl
 ]
 ```
-# --8<-- [end:build-wheel-from-source]
-# --8<-- [start:pre-built-images]
+--8<-- [end:build-wheel-from-source]
+--8<-- [start:pre-built-images]
 You can pull the latest available CPU image from Docker Hub:
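A sketch of that pull, using the `vllm/vllm-openai-cpu:latest-x86_64` tag that appears in the hunk below (tag taken from this diff's context; check Docker Hub for newer tags):

```shell
# Pull the pre-built x86_64 CPU image from Docker Hub.
docker pull vllm/vllm-openai-cpu:latest-x86_64
```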
@@ -189,10 +191,10 @@ vllm/vllm-openai-cpu:latest-x86_64 <args...>
 !!! warning
     If deploying the pre-built images on machines without `avx512f`, `avx512_bf16`, or `avx512_vnni` support, an `Illegal instruction` error may be raised. See the build-image-from-source section below for build arguments to match your target CPU capabilities.
-# --8<-- [end:pre-built-images]
-# --8<-- [start:build-image-from-source]
+--8<-- [end:pre-built-images]
+--8<-- [start:build-image-from-source]
-## Building for your target CPU
+#### Building for your target CPU
 ```bash
 docker build -f docker/Dockerfile.cpu \
@@ -212,15 +214,15 @@ docker build -f docker/Dockerfile.cpu \
 - `VLLM_CPU_{ISA}=true` - Force-enable the instruction set (build with ISA regardless of build system capabilities)
 - `VLLM_CPU_{ISA}=false` - Rely on auto-detection (default)
-### Examples
+##### Examples
-**Auto-detection build (default)**
+###### Auto-detection build (default)
 ```bash
 docker build -f docker/Dockerfile.cpu --tag vllm-cpu-env --target vllm-openai .
 ```
-**Cross-compile for AVX512**
+###### Cross-compile for AVX512
 ```bash
 docker build -f docker/Dockerfile.cpu \
@@ -231,7 +233,7 @@ docker build -f docker/Dockerfile.cpu \
 --target vllm-openai .
 ```
-**Cross-compile for AVX2**
+###### Cross-compile for AVX2
 ```bash
 docker build -f docker/Dockerfile.cpu \
@@ -240,7 +242,7 @@ docker build -f docker/Dockerfile.cpu \
 --target vllm-openai .
 ```
-## Launching the OpenAI server
+#### Launching the OpenAI server
 ```bash
 docker run --rm \
@@ -255,6 +257,6 @@ docker run --rm \
 other vLLM OpenAI server arguments
 ```
-# --8<-- [end:build-image-from-source]
-# --8<-- [start:extra-information]
-# --8<-- [end:extra-information]
+--8<-- [end:build-image-from-source]
+--8<-- [start:extra-information]
+--8<-- [end:extra-information]
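Once the `docker run` invocation from the last hunk is serving, a quick smoke test of the OpenAI-compatible endpoint (port 8000 is an assumed default; adjust to whatever port you published):

```shell
# List the models the server is serving; a JSON response means it is up.
curl -s http://localhost:8000/v1/models
```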