Optimize Mixtral with expert parallelism (#2090)

This commit is contained in:
Antoni Baum
2023-12-13 23:55:07 -08:00
committed by GitHub
parent f1c8520146
commit 21d93c140d
6 changed files with 230 additions and 343 deletions


@@ -74,8 +74,7 @@ Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for in
 Alternatively, you can raise an issue on our `GitHub <https://github.com/vllm-project/vllm/issues>`_ project.
 .. note::
-    Currently, the ROCm version of vLLM does not support Mixtral.
-    Additionally, it only supports Mistral for context lengths up to 4096.
+    Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.
 .. tip::
     The easiest way to check if your model is supported is to run the program below:
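The referenced program is truncated in this excerpt. A minimal sketch of such a check, using vLLM's ``LLM`` entry point (the exact model name is a placeholder you would substitute):

.. code-block:: python

    from vllm import LLM

    # If vLLM successfully loads the model and generates output,
    # the model is supported; otherwise an error is raised.
    llm = LLM(model=...)  # Name or path of your model
    output = llm.generate("Hello, my name is")
    print(output)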