Support expert parallel in Transformers backend (#26162)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
```diff
@@ -32,8 +32,9 @@ If the Transformers model implementation follows all the steps in [writing a cus
 - All the features listed in the [compatibility matrix](../features/README.md#feature-x-feature)
 - Any combination of the following vLLM parallelisation schemes:
     - Data parallel
     - Pipeline parallel
     - Tensor parallel
+    - Expert parallel
```
Checking if the modeling backend is Transformers is as simple as:
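The code example that originally followed this line did not survive extraction. A minimal sketch of such a check, assuming vLLM's `LLM.apply_model` helper and `model_impl` argument (the model name is a placeholder, not from the source):

```python
from vllm import LLM

# Placeholder model name (assumption): any model with a Transformers
# implementation can be forced onto the Transformers backend.
llm = LLM(model="...", model_impl="transformers")

# Print the class of the loaded model; a Transformers-backend model
# reports a Transformers* wrapper class rather than a native vLLM one.
llm.apply_model(lambda model: print(type(model)))
```

This is a sketch, not a verbatim reproduction of the lost snippet; it requires a real model name and a working vLLM install to run.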