Signed-off-by: ChenqianCao <39755070+ChenqianCao@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Disaggregated Serving
This example contains scripts that demonstrate the disaggregated serving features of vLLM.
Files
disagg_proxy_demo.py - Demonstrates XpYd (X prefill instances, Y decode instances).
kv_events.sh - Demonstrates KV cache event publishing.
mooncake_connector - A proxy demo for MooncakeConnector.
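
To make the XpYd idea concrete, the sketch below shows one way a proxy could pair each request with one of X prefill instances and one of Y decode instances using round-robin selection. It is an illustration only and is not taken from disagg_proxy_demo.py: the class name XpYdRouter, the route method, and the localhost URLs are all hypothetical, and the real script may schedule instances differently.

```python
from itertools import cycle
from typing import Iterator, List, Tuple


class XpYdRouter:
    """Hypothetical round-robin router over X prefill and Y decode instances."""

    def __init__(self, prefill_urls: List[str], decode_urls: List[str]) -> None:
        if not prefill_urls or not decode_urls:
            raise ValueError("need at least one prefill and one decode instance")
        # cycle() yields the instances in order and wraps around indefinitely.
        self._prefill: Iterator[str] = cycle(prefill_urls)
        self._decode: Iterator[str] = cycle(decode_urls)

    def route(self) -> Tuple[str, str]:
        """Return the (prefill, decode) pair chosen for the next request."""
        return next(self._prefill), next(self._decode)


# Example: 2 prefill instances and 1 decode instance (2p1d).
# The URLs are placeholders, not addresses used by the demo scripts.
router = XpYdRouter(
    prefill_urls=["http://localhost:8100", "http://localhost:8101"],
    decode_urls=["http://localhost:8200"],
)
pairs = [router.route() for _ in range(3)]
```

In this sketch the two pools advance independently, so X and Y do not need to match: with 2 prefill instances and 1 decode instance, consecutive requests alternate between the prefill instances while sharing the single decode instance.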