[Docker] Add cuda arch list as build option (#1950)

This commit is contained in:
Simon Mo
2023-12-08 09:53:47 -08:00
committed by GitHub
parent 2b981012a6
commit c85b80c2b6
2 changed files with 13 additions and 1 deletions


@@ -31,6 +31,14 @@ You can build and run vLLM from source via the provided dockerfile. To build vLL
$ DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai # optionally specifies: --build-arg max_jobs=8 --build-arg nvcc_threads=2
.. note::

    By default, vLLM builds for all GPU types for widest distribution. If you are building only for the
    GPU type of the machine you are running on, you can add the argument ``--build-arg torch_cuda_arch_list=""``
    so that vLLM detects the current GPU type and builds for it.
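For example, a build restricted to the local GPU architecture might look like the following (the ``max_jobs`` and ``nvcc_threads`` values are illustrative and can be tuned for your machine):

.. code-block:: console

    $ DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai \
        --build-arg torch_cuda_arch_list="" \
        --build-arg max_jobs=8 --build-arg nvcc_threads=2

Building for a single architecture produces a smaller image and a faster build, at the cost of portability to other GPU types.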
To run vLLM:

.. code-block:: console