Make AsyncLLMEngine more robust & fix batched abort (#969)

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Avnish Narayan <38871737+avnishn@users.noreply.github.com>
Antoni Baum
2023-09-07 13:43:45 -07:00
committed by GitHub
parent 7a9c20c715
commit c07ece5ca4
7 changed files with 345 additions and 55 deletions

@@ -14,6 +14,7 @@ from vllm.utils import random_uuid
TIMEOUT_KEEP_ALIVE = 5 # seconds.
TIMEOUT_TO_PREVENT_DEADLOCK = 1 # seconds.
app = FastAPI()
engine = None
@app.post("/generate")