Joe Runde
|
ac2f3f7fee
|
[Bugfix] Validate lora adapters to avoid crashing server (#11727)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-01-10 15:56:36 +08:00 |
|
Wallas Henrique
|
c0292211ce
|
[CI/Build] Replaced some models on tests for smaller ones (#9570)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
|
2024-10-22 04:52:14 +00:00 |
|
Alexander Matveev
|
7c7714d856
|
[Core][Bugfix][Perf] Introduce MQLLMEngine to avoid asyncio OH (#8157)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2024-09-18 13:56:58 +00:00 |
|
Nick Hill
|
39178c7fbc
|
[Tests] Disable retries and use context manager for openai client (#7565)
|
2024-08-26 21:33:17 -07:00 |
|
Joe Runde
|
21b9c49aa3
|
[Frontend] Kill the server on engine death (#6594)
Signed-off-by: Joe Runde <joe@joerun.de>
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-08-08 09:47:48 -07:00 |
|