[ROCm] add support to ROCm 6.0 and MI300 (#2274)

This commit is contained in:
Hongxia Yang
2024-01-26 15:41:10 -05:00
committed by GitHub
parent 5265631d15
commit 6b7de1a030
8 changed files with 96 additions and 13 deletions

View File

@@ -11,10 +11,10 @@ Requirements
------------
* OS: Linux
* Python: 3.8 -- 3.11 (Verified on 3.10)
* GPU: MI200s
* Python: 3.8 -- 3.11
* GPU: MI200s (gfx90a), MI300 (gfx942)
* Pytorch 2.0.1/2.1.1/2.2
* ROCm 5.7
* ROCm 5.7 (Verified on python 3.10) or ROCm 6.0 (Verified on python 3.9)
Installation options:
@@ -27,6 +27,8 @@ Installation options:
(Recommended) Option 1: Quick start with vLLM pre-installed in Docker Image
---------------------------------------------------------------------------
This option is for ROCm 5.7 only:
.. code-block:: console
$ docker pull embeddedllminfo/vllm-rocm:vllm-v0.2.4
@@ -50,6 +52,9 @@ Option 2: Build from source
You can build and install vLLM from source:
Below instruction is for ROCm 5.7 only.
At the time of this documentation update, PyTorch on ROCm 6.0 wheel is not yet available on the PyTorch website.
0. Install prerequisites (skip if you are already in an environment/docker with the following installed):
- `ROCm <https://rocm.docs.amd.com/en/latest/deploy/linux/index.html>`_
@@ -95,6 +100,23 @@ You can build and install vLLM from source:
Build a docker image from `Dockerfile.rocm`, and launch a docker container.
The `Dokerfile.rocm` is designed to support both ROCm 5.7 and ROCm 6.0 and later versions. It provides flexibility to customize the build of docker image using the following arguments:
* `BASE_IMAGE`: specifies the base image used when running ``docker build``, specifically the PyTorch on ROCm base image. We have tested ROCm 5.7 and ROCm 6.0. The default is `rocm/pytorch:rocm6.0_ubuntu20.04_py3.9_pytorch_2.1.1`
* `FX_GFX_ARCHS`: specifies the GFX architecture that is used to build flash-attention, for example, `gfx90a;gfx942` for MI200 and MI300. The default is `gfx90a;gfx942`
* `FA_BRANCH`: specifies the branch used to build the flash-attention in `ROCmSoftwarePlatform's flash-attention repo <https://github.com/ROCmSoftwarePlatform/flash-attention>`_. The default is `3d2b6f5`
Their values can be passed in when running ``docker build`` with ``--build-arg`` options.
For example, to build docker image for vllm on ROCm 5.7, you can run:
.. code-block:: console
$ docker build --build-arg BASE_IMAGE="rocm/pytorch:rocm5.7_ubuntu22.04_py3.10_pytorch_2.0.1" \
-f Dockerfile.rocm -t vllm-rocm .
To build vllm on ROCm 6.0, you can use the default:
.. code-block:: console
$ docker build -f Dockerfile.rocm -t vllm-rocm .
@@ -142,3 +164,8 @@ Alternatively, if you plan to install vLLM-ROCm on a local machine or start from
$ cd vllm
$ pip install -U -r requirements-rocm.txt
$ python setup.py install # This may take 5-10 minutes.
.. note::
- You may need to turn on the ``--enforce-eager`` flag if you experience process hang when running the `benchmark_thoughput.py` script to test your installation.