[ROCm] add support to ROCm 6.0 and MI300 (#2274)

2024-01-26 15:41:10 -05:00
parent 5265631d15
commit 6b7de1a030
8 changed files with 96 additions and 13 deletions
--- a/docs/source/getting_started/amd-installation.rst
+++ b/docs/source/getting_started/amd-installation.rst
@@ -11,10 +11,10 @@ Requirements
 ------------

 * OS: Linux
-* Python: 3.8 -- 3.11 (Verified on 3.10)
-* GPU: MI200s
+* Python: 3.8 -- 3.11
+* GPU: MI200s (gfx90a), MI300 (gfx942)
 * Pytorch 2.0.1/2.1.1/2.2
-* ROCm 5.7
+* ROCm 5.7 (verified on Python 3.10) or ROCm 6.0 (verified on Python 3.9)

 Installation options:

@@ -27,6 +27,8 @@ Installation options:
 (Recommended) Option 1: Quick start with vLLM pre-installed in Docker Image
 ---------------------------------------------------------------------------

+This option is for ROCm 5.7 only:
+
 .. code-block:: console

    $ docker pull embeddedllminfo/vllm-rocm:vllm-v0.2.4
@@ -50,6 +52,9 @@ Option 2: Build from source

 You can build and install vLLM from source:

+The instructions below are for ROCm 5.7 only.
+At the time of this documentation update, a PyTorch wheel for ROCm 6.0 is not yet available on the PyTorch website.
+
 0. Install prerequisites (skip if you are already in an environment/docker with the following installed):

 - `ROCm <https://rocm.docs.amd.com/en/latest/deploy/linux/index.html>`_
@@ -95,6 +100,23 @@ You can build and install vLLM from source:

 Build a docker image from `Dockerfile.rocm`, and launch a docker container.

+The `Dockerfile.rocm` is designed to support both ROCm 5.7 and ROCm 6.0 and later versions. The build of the Docker image can be customized with the following arguments:
+
+* `BASE_IMAGE`: specifies the base image used when running ``docker build``, specifically the PyTorch on ROCm base image. We have tested ROCm 5.7 and ROCm 6.0. The default is `rocm/pytorch:rocm6.0_ubuntu20.04_py3.9_pytorch_2.1.1`
+* `FA_GFX_ARCHS`: specifies the GFX architectures for which flash-attention is built, for example, `gfx90a;gfx942` for MI200 and MI300. The default is `gfx90a;gfx942`
+* `FA_BRANCH`: specifies the branch used to build the flash-attention in `ROCmSoftwarePlatform's flash-attention repo <https://github.com/ROCmSoftwarePlatform/flash-attention>`_. The default is `3d2b6f5`
+
+Their values can be passed in when running ``docker build`` with ``--build-arg`` options.
+
+For example, to build a Docker image for vLLM on ROCm 5.7, you can run:
+
+.. code-block:: console
+
+    $ docker build --build-arg BASE_IMAGE="rocm/pytorch:rocm5.7_ubuntu22.04_py3.10_pytorch_2.0.1" \
+       -f Dockerfile.rocm -t vllm-rocm . 
+
+To build vLLM on ROCm 6.0, you can use the default:
+
 .. code-block:: console

    $ docker build -f Dockerfile.rocm -t vllm-rocm . 
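The build arguments described in this hunk can be combined in a single invocation. The following is a minimal sketch, not part of the commit: it assembles a ``docker build`` command from the ROCm 5.7 base image and the default `FA_BRANCH` value quoted above, and only prints it so the command can be inspected first. The variable names `BASE_IMAGE`, `FA_BRANCH`, and `BUILD_CMD` inside the script are illustrative shell variables.

```shell
#!/bin/sh
# Values taken from the documentation text above; adjust for your setup.
BASE_IMAGE="rocm/pytorch:rocm5.7_ubuntu22.04_py3.10_pytorch_2.0.1"
FA_BRANCH="3d2b6f5"

# Assemble the command as a string so it can be reviewed before running.
BUILD_CMD="docker build --build-arg BASE_IMAGE=${BASE_IMAGE} --build-arg FA_BRANCH=${FA_BRANCH} -f Dockerfile.rocm -t vllm-rocm ."

# Dry run: print the command instead of executing it.
echo "$BUILD_CMD"
```

To run the build for real, execute the printed command from the root of the vLLM checkout, where `Dockerfile.rocm` lives.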
@@ -142,3 +164,8 @@ Alternatively, if you plan to install vLLM-ROCm on a local machine or start from
        $ cd vllm
        $ pip install -U -r requirements-rocm.txt
        $ python setup.py install # This may take 5-10 minutes.
+
+.. note::
+
+    - You may need to turn on the ``--enforce-eager`` flag if you experience a process hang when running the `benchmark_throughput.py` script to test your installation.
+
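As a rough illustration of the note above, the sketch below assembles a throughput benchmark command with ``--enforce-eager`` enabled and prints it without running it. The script path `benchmarks/benchmark_throughput.py`, the model name, and the input/output lengths are assumptions chosen for illustration, not values from this commit.

```shell
#!/bin/sh
# Assumed values for illustration only; substitute your own model and lengths.
MODEL="facebook/opt-125m"
INPUT_LEN=128
OUTPUT_LEN=128

# Assumes the command is run from the root of the vLLM repository.
BENCH_CMD="python benchmarks/benchmark_throughput.py --model ${MODEL} --input-len ${INPUT_LEN} --output-len ${OUTPUT_LEN} --enforce-eager"

# Dry run: print the command instead of executing it.
echo "$BENCH_CMD"
```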