add tip for VLLM_USE_PRECOMPILED arg to reduce docker build time (#31385)

Signed-off-by: yiting.jiang <yiting.jiang@daocloud.io>
This commit is contained in:
yitingdc
2025-12-28 11:19:47 +08:00
committed by GitHub
parent 727c41f3fd
commit b326598e97

View File

@@ -80,6 +80,15 @@ DOCKER_BUILDKIT=1 docker build . \
If you are using Podman instead of Docker, you might need to disable SELinux labeling by
adding `--security-opt label=disable` when running `podman build` command to avoid certain [existing issues](https://github.com/containers/buildah/discussions/4184).
!!! note
If you have not changed any C++ or CUDA kernel code, you can use precompiled wheels to significantly reduce Docker build time.
* **Enable the feature** by adding the build argument: `--build-arg VLLM_USE_PRECOMPILED="1"`.
* **How it works**: By default, vLLM automatically finds the correct wheels from our [Nightly Builds](https://docs.vllm.ai/en/latest/contributing/ci/nightly_builds/) by using the merge-base commit with the upstream `main` branch.
* **Override commit**: To use wheels from a specific commit, provide the `--build-arg VLLM_PRECOMPILED_WHEEL_COMMIT=<commit_hash>` argument.
For a detailed explanation, refer to the documentation on 'Set up using Python-only build (without compilation)' part in [Build wheel from source](https://docs.vllm.ai/en/latest/contributing/ci/nightly_builds.html#precompiled-wheels-usage), these args are similar.
## Building for Arm64/aarch64
A docker container can be built for aarch64 systems such as the Nvidia Grace-Hopper and Grace-Blackwell. Using the flag `--platform "linux/arm64"` will build for arm64.