[Docker] Add cuda arch list as build option (#1950)

This commit is contained in:
Simon Mo
2023-12-08 09:53:47 -08:00
committed by GitHub
parent 2b981012a6
commit c85b80c2b6
2 changed files with 13 additions and 1 deletions


@@ -31,6 +31,14 @@ You can build and run vLLM from source via the provided dockerfile. To build vLL
$ DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai # optionally specifies: --build-arg max_jobs=8 --build-arg nvcc_threads=2
.. note::

    By default, vLLM builds for all GPU types for widest distribution. If you are building only for the
    GPU type of the machine you are running on, you can add the argument ``--build-arg torch_cuda_arch_list=""``
    so that vLLM detects the current GPU type and builds for it.
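For example, a build restricted to the local GPU architecture might look like the following (the ``max_jobs`` and ``nvcc_threads`` values are illustrative and can be tuned for your machine):

.. code-block:: console

    $ DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai \
        --build-arg torch_cuda_arch_list="" \
        --build-arg max_jobs=8 --build-arg nvcc_threads=2

Building for a single architecture produces a smaller image and a faster build, at the cost of portability to other GPU types.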
To run vLLM:

.. code-block:: console