Nicolas Basile
66c54aa9c3
Check the max prompt length for the OpenAI completions API (#472)
2023-08-08 17:43:49 -07:00
YHPeter
e8ddc08ec8
[BUG FIX] upgrade fschat version to 0.2.23 (#650)
Co-authored-by: hao.yu <hao.yu@cn-c017.server.mila.quebec>
2023-08-02 14:05:59 -07:00
Zhuohan Li
58a072be15
[Fix] Add model sequence length into model config (#575)
2023-07-25 23:46:30 -07:00
Zhuohan Li
82ad323dee
[Fix] Add chat completion example and simplify dependencies (#576)
2023-07-25 23:45:48 -07:00
Ricardo Lu
8c4b2592fb
fix: enable trust-remote-code in api server & benchmark. (#509)
2023-07-19 17:06:15 -07:00
Ricardo Lu
b396cb4998
fix: send [DONE] only once when streaming responses. (#378)
2023-07-06 18:08:40 -07:00
akxxsb
3d64cf019e
[Server] use fastchat.model.model_adapter.get_conversation_template method to get model template (#357)
2023-07-04 21:39:59 -07:00
Zhuohan Li
98fe8cb542
[Server] Add option to specify chat template for chat endpoint (#345)
2023-07-03 23:01:56 -07:00
Zhuohan Li
42e0c1df78
[Quality] Add CI for formatting (#343)
2023-07-03 14:50:56 -07:00
Zhuohan Li
d6fa1be3a8
[Quality] Add code formatter and linter (#326)
2023-07-03 11:31:55 -07:00
Zhuohan Li
0ffded812a
[Fix] Better error message for batched prompts (#342)
2023-07-03 09:27:31 -07:00
Michele Catalano
0bd2a573a5
Allow sending a list of str as the prompt on the OpenAI demo endpoint /v1/completions (#323)
* allow str or List[str] for prompt
* Update vllm/entrypoints/openai/api_server.py
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
2023-07-03 09:17:50 -07:00
Ricardo Lu
49b26e2cec
feat: add ChatCompletion endpoint in OpenAI demo server. (#330)
2023-07-02 22:54:33 -07:00
Woosuk Kwon
998d9d1509
[Tokenizer] Add tokenizer mode (#298)
2023-06-28 14:19:22 -07:00
Woosuk Kwon
4338cc4750
[Tokenizer] Add an option to specify tokenizer (#284)
2023-06-28 09:46:58 -07:00
Woosuk Kwon
0b98ba15c7
Change the name to vLLM (#150)
2023-06-17 03:07:40 -07:00