Support expert parallel in Transformers backend (#26162)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
```diff
@@ -32,8 +32,9 @@ If the Transformers model implementation follows all the steps in [writing a cus
 - All the features listed in the [compatibility matrix](../features/README.md#feature-x-feature)
 - Any combination of the following vLLM parallelisation schemes:
     - Data parallel
     - Pipeline parallel
     - Tensor parallel
+    - Expert parallel
```
Checking if the modeling backend is Transformers is as simple as:
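The code example that originally followed this line did not survive extraction. A minimal sketch of such a check, assuming vLLM's `LLM.apply_model` helper and `model_impl` argument (the model name is a placeholder, not from the source):

```python
from vllm import LLM

# Placeholder model name (assumption): any model with a Transformers
# implementation can be forced onto the Transformers backend.
llm = LLM(model="...", model_impl="transformers")

# Print the class of the loaded model; a Transformers-backend model
# reports a Transformers* wrapper class rather than a native vLLM one.
llm.apply_model(lambda model: print(type(model)))
```

This is a sketch, not a verbatim reproduction of the lost snippet; it requires a real model name and a working vLLM install to run.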