[Frontend][Core] Re-add shutdown timeout - allowing in-flight requests to finish (#36666)

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Mark McLoughlin
2026-03-13 19:10:06 +00:00
committed by GitHub
parent 5a3f1eb62f
commit 7afe0faab1
14 changed files with 762 additions and 96 deletions
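
The grace-period behaviour described by the new `shutdown_timeout` field can be illustrated with a small standalone sketch. This is not the code from this commit: `drain_with_timeout` and `demo` are hypothetical names, in-flight requests are modelled as plain `asyncio` tasks, and the timeout is assumed to be in seconds. It only shows the general pattern of waiting up to a deadline and then aborting whatever is still running.

```python
import asyncio


async def drain_with_timeout(tasks, shutdown_timeout):
    """Wait up to `shutdown_timeout` seconds (assumed unit) for `tasks`,
    then cancel the ones still running.

    Hypothetical helper for illustration only.
    Returns (number completed, number aborted).
    """
    if not tasks:
        # asyncio.wait() raises ValueError on an empty collection.
        return 0, 0
    # With a timeout of 0 this returns almost immediately, so anything
    # still running is aborted without a grace period.
    done, pending = await asyncio.wait(tasks, timeout=shutdown_timeout)
    for task in pending:
        task.cancel()
    # Give the cancelled tasks a chance to unwind before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def demo():
    # One request that finishes inside the grace period, one that does not.
    fast = asyncio.create_task(asyncio.sleep(0.01))
    slow = asyncio.create_task(asyncio.sleep(10))
    return await drain_with_timeout({fast, slow}, shutdown_timeout=1)


if __name__ == "__main__":
    completed, aborted = asyncio.run(demo())
    print(f"completed={completed} aborted={aborted}")
```

In this sketch the short task completes within the one-second grace period and the long one is cancelled when the deadline passes.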

@@ -327,6 +327,12 @@ class VllmConfig:
     weight_transfer_config: WeightTransferConfig | None = None
     """The configurations for weight transfer during RL training."""
+    shutdown_timeout: int = Field(default=0, ge=0)
+    """Shutdown grace period for in-flight requests. Shutdown will be delayed for
+    up to this amount of time to allow already-running requests to complete. Any
+    remaining requests are aborted once the timeout is reached.
+    """
     def compute_hash(self) -> str:
         """
         WARNING: Whenever a new field is added to this config,