vllm/docs/deployment/integrations/dynamo.md
2026-03-05 17:39:50 +08:00


NVIDIA Dynamo

NVIDIA Dynamo is an open-source framework for distributed LLM inference. It can run vLLM on Kubernetes in several serving architectures, for example aggregated or disaggregated serving, each with an optional router and planner.
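
To make the aggregated/disaggregated distinction concrete, here is a minimal conceptual sketch in Python. It is not Dynamo or vLLM code: `Worker`, `serve_aggregated`, and `serve_disaggregated` are hypothetical names invented for illustration, and the "KV cache" is a plain dictionary. It only assumes the usual meaning of the terms: aggregated serving runs prefill and decode on the same worker, while disaggregated serving runs them on separate workers and hands the KV cache from one to the other.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical stand-in for a serving worker; not a Dynamo or vLLM class."""

    name: str
    log: list = field(default_factory=list)

    def prefill(self, prompt: str) -> dict:
        # Prefill processes the whole prompt once and produces a (toy) KV cache.
        self.log.append("prefill")
        return {"tokens": prompt.split(), "owner": self.name}

    def decode(self, kv_cache: dict, max_new_tokens: int) -> list:
        # Decode generates tokens one at a time, reusing the KV cache it was given.
        self.log.append("decode")
        return [f"tok{i}" for i in range(max_new_tokens)]


def serve_aggregated(worker: Worker, prompt: str, max_new_tokens: int) -> list:
    # Aggregated: a single worker runs both phases.
    kv_cache = worker.prefill(prompt)
    return worker.decode(kv_cache, max_new_tokens)


def serve_disaggregated(
    prefill_worker: Worker,
    decode_worker: Worker,
    prompt: str,
    max_new_tokens: int,
) -> list:
    # Disaggregated: prefill and decode run on separate workers,
    # and the KV cache is handed from the first to the second.
    kv_cache = prefill_worker.prefill(prompt)
    return decode_worker.decode(kv_cache, max_new_tokens)


if __name__ == "__main__":
    agg = Worker("agg-0")
    print(serve_aggregated(agg, "hello world", 3), agg.log)

    p, d = Worker("prefill-0"), Worker("decode-0")
    print(serve_disaggregated(p, d, "hello world", 3), p.log, d.log)
```

In a real deployment the hand-off is a KV cache transfer between GPU workers rather than a function argument, and the two worker pools can be scaled independently. How Dynamo configures this is covered in its own Kubernetes guide.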

For Kubernetes deployment instructions and examples (including vLLM), see the Deploying Dynamo on Kubernetes guide.

Background reading: InfoQ news coverage of how NVIDIA Dynamo simplifies Kubernetes deployment for LLM inference.