[Tests] Shutdown test RemoteVLLMServer cleanly (#36950)

PR #33949 recently changed the teardown logic of the RemoteVLLMServer test utility class to
send SIGTERM to all vllm (sub)processes at once. This breaks the clean, coordinated
shutdown logic, which assumes that only the top-level process receives a signal (for example,
when the container it runs in is shut down).

This caused errors and stack traces in some test logs, even though the affected tests
still pass. We should still attempt a normal shutdown first and only kill other processes if
they are still running after a few seconds.

Example: tests/v1/distributed/test_external_lb_dp.py::test_external_lb_completion_streaming

Signed-off-by: Nick Hill <nickhill123@gmail.com>
Nick Hill
2026-03-13 00:32:55 -07:00
committed by GitHub
parent f296a1966d
commit b373b5102a

@@ -235,13 +235,10 @@ class RemoteVLLMServer:
         except (ProcessLookupError, OSError):
             pgid = None

-        # Phase 1: graceful SIGTERM to the entire process group
-        if pgid is not None:
-            with contextlib.suppress(ProcessLookupError, OSError):
-                os.killpg(pgid, signal.SIGTERM)
-            print(f"[RemoteOpenAIServer] Sent SIGTERM to process group {pgid}")
-        else:
+        # Phase 1: graceful SIGTERM to the root process
+        with contextlib.suppress(ProcessLookupError, OSError):
             self.proc.terminate()
+        print(f"[RemoteOpenAIServer] Sent SIGTERM to process {pid}")

         try:
             self.proc.wait(timeout=15)
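
The teardown order described in the commit message (SIGTERM to the root process only, then force-kill whatever is left after a grace period) can be sketched in isolation as below. This is a minimal illustration, not the code in the test utility: the function name `stop_process_tree` and the `grace_seconds` parameter are hypothetical, the sketch is POSIX-only, and it assumes the child was started in its own process group (for example with `start_new_session=True`).

```python
import contextlib
import os
import signal
import subprocess
import sys


def stop_process_tree(proc: subprocess.Popen, grace_seconds: float = 15.0) -> bool:
    """Stop `proc` gracefully; force-kill leftovers only if needed.

    Hypothetical helper. Returns True if the root process exited within
    the grace period, False if it had to be killed.
    """
    # Record the process group up front, while the root is still alive.
    try:
        pgid = os.getpgid(proc.pid)
    except (ProcessLookupError, OSError):
        pgid = None

    # Phase 1: SIGTERM to the root process only, so that it can
    # coordinate the shutdown of its own children.
    with contextlib.suppress(ProcessLookupError, OSError):
        proc.terminate()

    try:
        proc.wait(timeout=grace_seconds)
        graceful = True
    except subprocess.TimeoutExpired:
        graceful = False
        # Phase 2: the root did not exit in time; force-kill it.
        with contextlib.suppress(ProcessLookupError, OSError):
            proc.kill()
        proc.wait()

    # Phase 3: sweep up any stragglers left in the child's process group.
    # Skip this when the child shares our own group (assumption violated),
    # otherwise we would kill the caller as well.
    if pgid is not None and pgid != os.getpgrp():
        with contextlib.suppress(ProcessLookupError, OSError):
            os.killpg(pgid, signal.SIGKILL)

    return graceful
```

A caller would start the server with `subprocess.Popen(..., start_new_session=True)` and pass the resulting object to `stop_process_tree`. The point of the ordering is that children never see a signal while the root is still running its own shutdown sequence; the group-wide SIGKILL is only a fallback for processes that outlive the root.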