[Doc][3/N] Reorganize Serving section (#11766)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
docs/source/deployment/frameworks/lws.md
@@ -0,0 +1,11 @@
(deployment-lws)=

# LWS

LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads.
A major use case is multi-host/multi-node distributed inference.

vLLM can be deployed with [LWS](https://github.com/kubernetes-sigs/lws) on Kubernetes for distributed model serving.

Please see [this guide](https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm) for more details on
deploying vLLM on Kubernetes using LWS.
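
To make the leader/worker grouping concrete, here is a minimal sketch that builds the skeleton of a `LeaderWorkerSet` manifest as a Python dict and prints it as JSON. It is an illustration, not the manifest from the linked guide. The helper `build_lws_manifest`, the resource name `vllm`, the `role` labels, the image tag, and the GPU counts are all assumptions chosen for the example. The container command that starts vLLM on each pod is deliberately left out; take it from the guide.

```python
import json


def build_lws_manifest(name, image, groups, pods_per_group, gpus_per_pod):
    """Return a LeaderWorkerSet manifest skeleton as a plain dict.

    Hypothetical helper for illustration only. Each group consists of one
    leader pod plus (pods_per_group - 1) worker pods.
    """
    if groups < 1 or pods_per_group < 1:
        raise ValueError("groups and pods_per_group must be at least 1")

    def pod_template(role):
        # The container command is omitted on purpose: the actual startup
        # command for vLLM comes from the LWS vLLM example guide.
        return {
            "metadata": {"labels": {"role": role}},  # label is an assumption
            "spec": {
                "containers": [
                    {
                        "name": f"vllm-{role}",
                        "image": image,
                        "resources": {
                            "limits": {"nvidia.com/gpu": str(gpus_per_pod)}
                        },
                    }
                ]
            },
        }

    return {
        "apiVersion": "leaderworkerset.x-k8s.io/v1",
        "kind": "LeaderWorkerSet",
        "metadata": {"name": name},
        "spec": {
            # Number of independent leader/worker groups.
            "replicas": groups,
            "leaderWorkerTemplate": {
                # Pods per group, leader included.
                "size": pods_per_group,
                "leaderTemplate": pod_template("leader"),
                "workerTemplate": pod_template("worker"),
            },
        },
    }


if __name__ == "__main__":
    # Assumed values: 2 groups of 2 pods each, 8 GPUs per pod.
    manifest = build_lws_manifest("vllm", "vllm/vllm-openai:latest", 2, 2, 8)
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -` on a cluster where LWS is installed, once the container commands have been filled in.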