KServe

vLLM can be deployed with KServe on Kubernetes for highly scalable distributed model serving.

You can use vLLM with KServe in two ways: through KServe's Hugging Face serving runtime, or through LLMInferenceService, which uses llm-d.
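
As a rough illustration of the Hugging Face runtime path, the sketch below builds an InferenceService manifest as a Python dictionary and prints it as JSON. The field layout (`serving.kserve.io/v1beta1`, `modelFormat: huggingface`, the `--model_name` and `--model_id` arguments) follows the KServe examples as best understood here and should be checked against the KServe version you have installed. The helper `build_inference_service`, the service name, the model ID, and the GPU count are all illustrative assumptions, not values taken from this page.

```python
import json


def build_inference_service(name, model_id, model_name, gpus=1):
    """Return a KServe InferenceService manifest as a plain dict.

    Hypothetical helper: the structure mirrors KServe's v1beta1 examples
    for the Hugging Face runtime, but verify it against your cluster's CRDs.
    """
    gpu_resources = {"nvidia.com/gpu": str(gpus)}
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "model": {
                    # Selects the Hugging Face serving runtime.
                    "modelFormat": {"name": "huggingface"},
                    "args": [
                        f"--model_name={model_name}",
                        f"--model_id={model_id}",
                    ],
                    "resources": {
                        "limits": dict(gpu_resources),
                        "requests": dict(gpu_resources),
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Placeholder values chosen for illustration only.
    manifest = build_inference_service(
        name="vllm-example",
        model_id="facebook/opt-125m",
        model_name="opt-125m",
    )
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the output can be piped to `kubectl apply -f -` once the values are adjusted for your cluster.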