[Frontend] don't block event loop in tokenization (preprocess) in OpenAI compatible server (#10635)

Signed-off-by: Tomer Asida <tomera@ai21.com>
tomeras91
2024-11-27 23:21:10 +02:00
committed by GitHub
parent 9b4b150395
commit 395b1c7454
7 changed files with 206 additions and 56 deletions
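
Note on the technique: the change makes the preprocessing call awaitable so that tokenization no longer stalls the server's event loop. The sketch below shows one common way to get that effect with `asyncio`, by pushing a blocking call onto a thread pool. It is an illustration under stated assumptions, not the code from this commit: `slow_tokenize`, `preprocess`, `demo`, and the single-worker executor are all hypothetical names and choices.

```python
import asyncio
import contextlib
import time
from concurrent.futures import ThreadPoolExecutor


def slow_tokenize(text: str) -> list[str]:
    """Hypothetical stand-in for a blocking tokenizer call."""
    time.sleep(0.05)  # simulate work that would stall the loop if run inline
    return text.split()


# Assumption: one worker, so tokenizer calls stay serialized in case the
# tokenizer is not thread-safe.
_executor = ThreadPoolExecutor(max_workers=1)


async def preprocess(text: str) -> list[str]:
    """Run the blocking tokenizer in the executor and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, slow_tokenize, text)


async def demo() -> tuple[list[str], int]:
    """Tokenize while a heartbeat task counts how often the loop gets to run."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.005)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    tokens = await preprocess("hello event loop")
    hb.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await hb
    return tokens, ticks


if __name__ == "__main__":
    tokens, ticks = asyncio.run(demo())
    print(tokens, "heartbeat ticks while tokenizing:", ticks)
```

If `slow_tokenize` were called directly inside `preprocess`, the heartbeat would not advance until tokenization finished. With the executor, the loop keeps serving other coroutines in the meantime.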


@@ -101,7 +101,7 @@ class OpenAIServingCompletion(OpenAIServing):
tokenizer = await self.engine_client.get_tokenizer(lora_request)
-request_prompts, engine_prompts = self._preprocess_completion(
+request_prompts, engine_prompts = await self._preprocess_completion(
request,
tokenizer,
request.prompt,