Disaggregated Serving
This example contains scripts that demonstrate the disaggregated serving features of vLLM.
Files
disagg_proxy_demo.py - Demonstrates XpYd (X prefill instances, Y decode instances).
kv_events.sh - Demonstrates KV cache event publishing.
mooncake_connector - A proxy demo for MooncakeConnector.
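
To make the XpYd idea concrete, here is a minimal sketch of how a proxy might pair each request with one prefill instance and one decode instance. It is not the logic of disagg_proxy_demo.py: round-robin selection is an assumption chosen for simplicity, and the class name `XpYdRouter`, the method `pick_pair`, and the addresses are all hypothetical.

```python
from itertools import cycle


class XpYdRouter:
    """Hypothetical round-robin router over X prefill and Y decode instances."""

    def __init__(self, prefill_instances, decode_instances):
        if not prefill_instances or not decode_instances:
            raise ValueError("need at least one prefill and one decode instance")
        # cycle() repeats each list forever, which gives round-robin selection.
        self._prefill = cycle(prefill_instances)
        self._decode = cycle(decode_instances)

    def pick_pair(self):
        """Return the (prefill, decode) pair that would serve the next request."""
        return next(self._prefill), next(self._decode)


# Example: a 2P1D layout (X=2, Y=1) with made-up addresses.
router = XpYdRouter(["127.0.0.1:8100", "127.0.0.1:8101"], ["127.0.0.1:8200"])
pairs = [router.pick_pair() for _ in range(3)]
print(pairs)
```

With two prefill instances and one decode instance, consecutive requests alternate between the prefill instances while always landing on the single decode instance. A real proxy would likely also track instance health and load, which this sketch leaves out.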