Optimize Mixtral with expert parallelism (#2090)
@@ -74,8 +74,7 @@ Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for in
Alternatively, you can raise an issue on our `GitHub <https://github.com/vllm-project/vllm/issues>`_ project.

.. note::
-Currently, the ROCm version of vLLM does not support Mixtral.
-Additionally, it only supports Mistral for context lengths up to 4096.
+Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.

.. tip::
The easiest way to check if your model is supported is to run the program below: