Commit Graph

11 Commits

Author SHA1 Message Date
Cyrus Leung
e808156f30 [Misc] Collect model support info in a single process per model (#9233) 2024-10-11 11:08:11 +00:00
Cyrus Leung
3b00b9c26c [Core] renamePromptInputs and inputs (#8876) 2024-09-26 20:35:15 -07:00
Simon Mo
4f1ba0844b Revert "rename PromptInputs and inputs with backward compatibility (#8760) (#8810) 2024-09-25 10:36:26 -07:00
科英
64840dfae4 [Frontend] MQLLMEngine supports profiling. (#8761) 2024-09-25 09:37:41 -07:00
Cyrus Leung
28e1299e60 rename PromptInputs and inputs with backward compatibility (#8760) 2024-09-25 09:36:47 -07:00
Joe Runde
6e0c9d6bd0 [Bugfix] Use heartbeats instead of health checks (#8583) 2024-09-24 20:37:38 -07:00
Simon Mo
3185fb0cca Revert "[Core] Rename PromptInputs to PromptType, and inputs to prompt" (#8750) 2024-09-24 05:45:20 +00:00
Alexander Matveev
1a2aef3e59 Add output streaming support to multi-step + async while ensuring RequestOutput obj reuse (#8335) 2024-09-23 15:38:04 -07:00
Cyrus Leung
0057894ef7 [Core] Rename PromptInputs and inputs(#8673) 2024-09-20 19:00:54 -07:00
Nick Hill
76515f303b [Frontend] Use MQLLMEngine for embeddings models too (#8584) 2024-09-19 12:51:06 -04:00
Alexander Matveev
7c7714d856 [Core][Bugfix][Perf] Introduce MQLLMEngine to avoid asyncio OH (#8157)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-09-18 13:56:58 +00:00