add tip for VLLM_USE_PRECOMPILED arg to reduce docker build time (#31385)

Signed-off-by: yiting.jiang <yiting.jiang@daocloud.io>
2025-12-28 11:19:47 +08:00
parent 727c41f3fd
commit b326598e97
1 changed files with 9 additions and 0 deletions
--- a/docs/deployment/docker.md
+++ b/docs/deployment/docker.md
@@ -80,6 +80,15 @@ DOCKER_BUILDKIT=1 docker build . \
    If you are using Podman instead of Docker, you might need to disable SELinux labeling by
    adding `--security-opt label=disable` when running the `podman build` command to avoid certain [existing issues](https://github.com/containers/buildah/discussions/4184).

+!!! note
+    If you have not changed any C++ or CUDA kernel code, you can use precompiled wheels to significantly reduce Docker build time.
+
+    *   **Enable the feature** by adding the build argument: `--build-arg VLLM_USE_PRECOMPILED="1"`.
+    *   **How it works**: By default, vLLM automatically finds the matching wheels from our [Nightly Builds](https://docs.vllm.ai/en/latest/contributing/ci/nightly_builds/) by using the merge-base commit between your checkout and the upstream `main` branch.
+    *   **Override commit**: To use wheels from a specific commit, provide the `--build-arg VLLM_PRECOMPILED_WHEEL_COMMIT=<commit_hash>` argument.
+
+    For a detailed explanation, refer to the 'Set up using Python-only build (without compilation)' part of [Build wheel from source](https://docs.vllm.ai/en/latest/contributing/ci/nightly_builds.html#precompiled-wheels-usage). The build arguments described there work similarly.
+
 ## Building for Arm64/aarch64

 A Docker container can be built for aarch64 systems such as the NVIDIA Grace-Hopper and Grace-Blackwell. Using the flag `--platform "linux/arm64"` will build for arm64.
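
The two build arguments from the precompiled-wheels note in this diff can be combined into one build command, roughly as in the sketch below. This is a minimal illustration, not the project's official build script. The variable names `WHEEL_COMMIT`, `DOCKERFILE`, `IMAGE_TAG`, and `RUN_BUILD` are hypothetical, and the default Dockerfile path and image tag are assumptions about the local checkout. The script only prints the command unless `RUN_BUILD=1` is set.

```bash
#!/bin/sh
# Sketch: assemble a docker build command that uses precompiled vLLM wheels.
# All variable names below are hypothetical and exist only for this example.

WHEEL_COMMIT="${WHEEL_COMMIT:-}"               # optional: pin wheels to a specific commit
DOCKERFILE="${DOCKERFILE:-docker/Dockerfile}"  # assumption: Dockerfile location in the checkout
IMAGE_TAG="${IMAGE_TAG:-vllm-precompiled:local}"  # assumption: any local tag works

# Start with the base command and the flag that enables precompiled wheels.
set -- docker build . \
    --file "$DOCKERFILE" \
    --tag "$IMAGE_TAG" \
    --build-arg VLLM_USE_PRECOMPILED=1

# Without an override, the wheel commit is resolved automatically
# (per the note: the merge-base with upstream main). Add the override only if set.
if [ -n "$WHEEL_COMMIT" ]; then
    set -- "$@" --build-arg "VLLM_PRECOMPILED_WHEEL_COMMIT=$WHEEL_COMMIT"
fi

# Dry run by default: print the command so it can be inspected first.
echo "DOCKER_BUILDKIT=1 $*"

# Execute only when explicitly requested.
if [ "${RUN_BUILD:-0}" = "1" ]; then
    DOCKER_BUILDKIT=1 "$@"
fi
```

Running the script with `WHEEL_COMMIT=<commit_hash>` set adds the override argument. Leaving it unset keeps the default wheel lookup described in the note.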