Add pipeline parallel support to TransformersModel (#12832)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
@@ -73,7 +73,7 @@ The Transformers fallback explicitly supports the following features:
 
 - <project:#quantization-index> (except GGUF)
 - <project:#lora-adapter>
-- <project:#distributed-serving> (pipeline parallel coming soon <gh-pr:12832>!)
+- <project:#distributed-serving> (requires `transformers>=4.49.0`)
 
 #### Remote code
 
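The updated line ties distributed serving with the Transformers fallback to `transformers>=4.49.0`. As a rough illustration of what that constraint means in practice, here is a minimal, standard-library-only sketch of a version check. The helper names `release_tuple` and `meets_minimum` are hypothetical and are not part of vLLM or Transformers. The comparison looks only at the numeric release part, so a pre-release such as `4.49.0.dev0` is treated as satisfying the minimum, which is looser than strict PEP 440 ordering.

```python
import re
from importlib import metadata

# Minimum version taken from the updated docs line in this commit.
MIN_TRANSFORMERS = "4.49.0"


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release components of a version string.

    Example: "4.49.0.dev0" yields (4, 49, 0). Suffixes such as ".dev0"
    or "rc1" are ignored (an assumption of this sketch).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True if `installed` is at least `minimum`.

    Release tuples are zero-padded to equal length so that "4.49"
    compares equal to "4.49.0".
    """
    have, need = release_tuple(installed), release_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "ok" if meets_minimum(installed) else "too old"
        print(f"transformers {installed}: {status}")
```

Running the file as a script reports whether the locally installed `transformers` package satisfies the stated minimum. For strict pre-release handling, a dedicated version-parsing library would be the more careful choice.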