[CPU] Improve CPU Docker build (#30953)
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com> Co-authored-by: Li, Jiang <jiang1.li@intel.com>
This commit is contained in:
@@ -164,21 +164,76 @@ uv pip install dist/*.whl
|
||||
[https://gallery.ecr.aws/q9t5s3a7/vllm-cpu-release-repo](https://gallery.ecr.aws/q9t5s3a7/vllm-cpu-release-repo)
|
||||
|
||||
!!! warning
|
||||
If deploying the pre-built images on machines without `avx512f`, `avx512_bf16`, or `avx512_vnni` support, an `Illegal instruction` error may be raised. It is recommended to build images for these machines with the appropriate build arguments (e.g., `--build-arg VLLM_CPU_DISABLE_AVX512=true`, `--build-arg VLLM_CPU_AVX512BF16=false`, or `--build-arg VLLM_CPU_AVX512VNNI=false`) to disable unsupported features. Please note that without `avx512f`, AVX2 will be used and this version is not recommended because it only has basic feature support.
|
||||
If deploying the pre-built images on machines without `avx512f`, `avx512_bf16`, or `avx512_vnni` support, an `Illegal instruction` error may be raised. See the build-image-from-source section below for build arguments to match your target CPU capabilities.
|
||||
|
||||
# --8<-- [end:pre-built-images]
|
||||
# --8<-- [start:build-image-from-source]
|
||||
|
||||
## Building for your target CPU
|
||||
|
||||
vLLM supports building Docker images for x86 CPU platforms with automatic instruction set detection.
|
||||
|
||||
### Basic build command
|
||||
|
||||
```bash
|
||||
docker build -f docker/Dockerfile.cpu \
|
||||
--build-arg VLLM_CPU_AVX512BF16=false (default)|true \
|
||||
--build-arg VLLM_CPU_AVX512VNNI=false (default)|true \
|
||||
--build-arg VLLM_CPU_AMXBF16=false|true (default) \
|
||||
--build-arg VLLM_CPU_DISABLE_AVX512=false (default)|true \
|
||||
--build-arg VLLM_CPU_DISABLE_AVX512=<false (default)|true> \
|
||||
--build-arg VLLM_CPU_AVX2=<false (default)|true> \
|
||||
--build-arg VLLM_CPU_AVX512=<false (default)|true> \
|
||||
--build-arg VLLM_CPU_AVX512BF16=<false (default)|true> \
|
||||
--build-arg VLLM_CPU_AVX512VNNI=<false (default)|true> \
|
||||
--build-arg VLLM_CPU_AMXBF16=<false|true (default)> \
|
||||
--tag vllm-cpu-env \
|
||||
--target vllm-openai .
|
||||
```
|
||||
|
||||
# Launching OpenAI server
|
||||
!!! note "Instruction set auto-detection"
|
||||
By default, vLLM will auto-detect CPU instruction sets (AVX512, AVX2, etc.) from the build system's CPU flags. Build arguments like `VLLM_CPU_AVX2`, `VLLM_CPU_AVX512`, `VLLM_CPU_AVX512BF16`, `VLLM_CPU_AVX512VNNI`, and `VLLM_CPU_AMXBF16` are primarily used for **cross-compilation** or for building container images on systems that don't have the target platforms ISA:
|
||||
|
||||
- Set `VLLM_CPU_{ISA}=true` to force-enable an instruction set (for cross-compilation to target platforms with that ISA)
|
||||
- Set `VLLM_CPU_{ISA}=false` to rely on auto-detection
|
||||
- When an ISA build arg is set to `true`, vLLM will build with that instruction set regardless of the build system's CPU capabilities
|
||||
|
||||
### Build examples
|
||||
|
||||
**Example 1: Auto-detection (native build)**
|
||||
|
||||
Build on a machine with the same CPU as your target deployment:
|
||||
|
||||
```bash
|
||||
# Auto-detects all CPU features from the build system
|
||||
docker build -f docker/Dockerfile.cpu \
|
||||
--tag vllm-cpu-env \
|
||||
--target vllm-openai .
|
||||
```
|
||||
|
||||
**Example 2: Cross-compilation for AVX512 deployment**
|
||||
|
||||
Build an AVX512 image on any x86_64 system (even without AVX512):
|
||||
|
||||
```bash
|
||||
docker build -f docker/Dockerfile.cpu \
|
||||
--build-arg VLLM_CPU_AVX512=true \
|
||||
--build-arg VLLM_CPU_AVX512BF16=true \
|
||||
--build-arg VLLM_CPU_AVX512VNNI=true \
|
||||
--tag vllm-cpu-avx512 \
|
||||
--target vllm-openai .
|
||||
```
|
||||
|
||||
**Example 3: Cross-compilation for AVX2 deployment**
|
||||
|
||||
Build an AVX2 image for older CPUs:
|
||||
|
||||
```bash
|
||||
docker build -f docker/Dockerfile.cpu \
|
||||
--build-arg VLLM_CPU_AVX2=true \
|
||||
--tag vllm-cpu-avx2 \
|
||||
--target vllm-openai .
|
||||
```
|
||||
|
||||
## Launching the OpenAI server
|
||||
|
||||
```bash
|
||||
docker run --rm \
|
||||
--security-opt seccomp=unconfined \
|
||||
--cap-add SYS_NICE \
|
||||
|
||||
Reference in New Issue
Block a user