Cyrus Leung
|
32e46e000f
|
[Frontend] Automatic detection of chat content format from AST (#9919)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-16 13:35:40 +08:00 |
|
Cyrus Leung
|
b311efd0bd
|
[Misc] Fix import error in tensorizer tests and cleanup some code (#10349)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-15 09:34:17 +00:00 |
|
Patrick von Platen
|
11cd1ae6ad
|
[Tool parsing] Improve / correct mistral tool parsing (#10333)
|
2024-11-15 00:42:49 +00:00 |
|
Zijin Xiao
|
554af9228d
|
[Bugfix] use AF_INET6 for OpenAI Compatible Server with ipv6 (#9583)
Signed-off-by: xiaozijin <xiaozijin@bytedance.com>
|
2024-11-14 16:38:53 -08:00 |
|
Guillaume Calmettes
|
52b48c1ead
|
[BugFix]: properly deserialize tool_calls iterator before processing by mistral-common when MistralTokenizer is used (#9951)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2024-11-14 04:48:16 +00:00 |
|
Mike Depinet
|
f67ce05d0b
|
[Frontend] Pythonic tool parser (#9859)
Signed-off-by: Mike Depinet <mike@fixie.ai>
|
2024-11-14 04:14:34 +00:00 |
|
Cyrus Leung
|
0b8bb86bf1
|
[1/N] Initial prototype for multi-modal processor (#10044)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-13 12:39:03 +00:00 |
|
zifeitong
|
47db6ec831
|
[Frontend] Add per-request number of cached token stats (#10174)
|
2024-11-12 16:42:28 +00:00 |
|
Guillaume Calmettes
|
36c513a076
|
[BugFix] Do not raise a ValueError when tool_choice is set to the supported none option and tools are not defined. (#10000)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2024-11-12 11:13:46 +00:00 |
|
Robert Shaw
|
6ace6fba2c
|
[V1] AsyncLLM Implementation (#9826)
Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2024-11-11 23:05:38 +00:00 |
|
yansh97
|
ad9a78bf64
|
[Doc] Fix typo error in vllm/entrypoints/openai/cli_args.py (#10196)
|
2024-11-11 00:14:22 +00:00 |
|
cjackal
|
d88bff1b96
|
[Frontend] add add_request_id middleware (#9594)
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
|
2024-11-09 10:18:29 +00:00 |
|
Maximilien de Bayser
|
ae62fd17c0
|
[Frontend] Tool calling parser for Granite 3.0 models (#9027)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2024-11-07 07:09:02 -08:00 |
|
Lei Yang
|
0dfba97b42
|
[Frontend] Fix multiple values for keyword argument error (#10075) (#10076)
Signed-off-by: Lei <ylxx@live.com>
|
2024-11-07 09:07:19 +00:00 |
|
Nick Hill
|
29862b884b
|
[Frontend] Adjust try/except blocks in API impl (#10056)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2024-11-06 20:07:51 -08:00 |
|
Aaron Pham
|
21063c11c7
|
[CI/Build] drop support for Python 3.8 EOL (#8464)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2024-11-06 07:11:55 +00:00 |
|
Russell Bryant
|
5952d81139
|
[Frontend] Fix tcp port reservation for api server (#10012)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-05 07:50:57 -08:00 |
|
Robert Shaw
|
04cef2c6ab
|
[Bugfix] Fix MQLLMEngine hanging (#9973)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
|
2024-11-04 16:01:43 -05:00 |
|
Cyrus Leung
|
06386a64dd
|
[Frontend] Chat-based Embeddings API (#9759)
|
2024-11-01 08:13:35 +00:00 |
|
Kevin H. Luu
|
890ca36072
|
Revert "[Bugfix] Use host argument to bind to interface (#9798)" (#9852)
|
2024-10-31 01:44:51 +00:00 |
|
Guillaume Calmettes
|
abbfb6134d
|
[Misc][OpenAI] deprecate max_tokens in favor of new max_completion_tokens field for chat completion endpoint (#9837)
|
2024-10-30 18:15:56 -07:00 |
|
Joe Runde
|
3b3f1e7436
|
[Bugfix][core] replace heartbeat with pid check (#9818)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-10-30 09:34:07 -07:00 |
|
Will Eaton
|
882a1ad0de
|
[Model] tool calling support for ibm-granite/granite-20b-functioncalling (#8339)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com>
|
2024-10-29 15:07:37 -07:00 |
|
Sven Seeberg
|
0f43387157
|
[Bugfix] Use host argument to bind to interface (#9798)
|
2024-10-29 10:37:59 -07:00 |
|
Zhong Qishuai
|
ef7865b4f9
|
[Frontend] re-enable multi-modality input in the new beam search implementation (#9427)
Signed-off-by: Qishuai Ferdinandzhong@gmail.com
|
2024-10-29 11:49:47 +00:00 |
|
Sam Stoelinga
|
067e77f9a8
|
[Bugfix] Streaming continuous_usage_stats default to False (#9709)
Signed-off-by: Sam Stoelinga <sammiestoel@gmail.com>
|
2024-10-26 05:05:47 +00:00 |
|
Vinay R Damodaran
|
33bab41060
|
[Bugfix]: Make chat content text allow type content (#9358)
Signed-off-by: Vinay Damodaran <vrdn@hey.com>
|
2024-10-24 05:05:49 +00:00 |
|
Yuhong Guo
|
434984e665
|
[Frontend] Support custom request_id from request (#9550)
Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>
|
2024-10-22 18:07:30 +00:00 |
|
Chen Zhang
|
5b59fe0f08
|
[Bugfix] Pass json-schema to GuidedDecodingParams and make test stronger (#9530)
|
2024-10-20 00:05:02 +00:00 |
|
Cyrus Leung
|
051eaf6db3
|
[Model] Add user-configurable task for models that support both generation and embedding (#9424)
|
2024-10-18 11:31:58 -07:00 |
|
Nick Hill
|
25aeb7d4c9
|
[BugFix] Fix and simplify completion API usage streaming (#9475)
|
2024-10-18 14:10:26 +00:00 |
|
tomeras91
|
d2b1bf55ec
|
[Frontend][Feature] Add jamba tool parser (#9154)
|
2024-10-18 10:27:48 +00:00 |
|
Cyrus Leung
|
390be74649
|
[Misc] Print stack trace using logger.exception (#9461)
|
2024-10-17 13:55:48 +00:00 |
|
Chang Su
|
ba30942240
|
[Bugfix] Fix vLLM UsageInfo and logprobs None AssertionError with empty token_ids (#9034)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2024-10-15 15:40:43 -07:00 |
|
Nick Hill
|
e9d517f276
|
[BugFix] Fix chat API continuous usage stats (#9357)
|
2024-10-14 23:19:48 -07:00 |
|
Brendan Wong
|
4d31cd424b
|
[Frontend] merge beam search implementations (#9296)
|
2024-10-14 15:05:52 -07:00 |
|
Maximilien de Bayser
|
ec10cb8511
|
[BugFix] Fix tool call finish reason in streaming case (#9209)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2024-10-11 18:24:26 -07:00 |
|
Cyrus Leung
|
cfaa6008e6
|
[Bugfix] Access get_vocab instead of vocab in tool parsers (#9188)
|
2024-10-09 08:59:57 -06:00 |
|
Daniele
|
9a94ca4a5d
|
[Bugfix] fix OpenAI API server startup with --disable-frontend-multiprocessing (#8537)
|
2024-10-08 09:38:40 -07:00 |
|
Alex Brooks
|
069d3bd8d0
|
[Frontend] Add Early Validation For Chat Template / Tool Call Parser (#9151)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2024-10-08 14:31:26 +00:00 |
|
Brendan Wong
|
8c746226c9
|
[Frontend] API support for beam search for MQLLMEngine (#9117)
|
2024-10-08 05:51:43 +00:00 |
|
youkaichao
|
18b296fdb2
|
[core] remove beam search from the core (#9105)
|
2024-10-07 05:47:04 +00:00 |
|
Yanyi Liu
|
fdf59d30ea
|
[Bugfix] fix tool_parser error handling when serving a model that does not support it (#8709)
|
2024-10-06 12:51:08 +00:00 |
|
Brendan Wong
|
168cab6bbf
|
[Frontend] API support for beam search (#9087)
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-10-05 23:39:03 -07:00 |
|
Flávia Béo
|
0dcc8cbe5a
|
Adds truncate_prompt_tokens param for embeddings creation (#8999)
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
|
2024-10-04 18:31:40 +00:00 |
|
代君
|
3dbb215b38
|
[Frontend][Feature] support tool calling for internlm/internlm2_5-7b-chat model (#8405)
|
2024-10-04 10:36:39 +08:00 |
|
Guillaume Calmettes
|
83caf35e08
|
[BugFix] Enforce Mistral ToolCall id constraint when using the Mistral tool call parser (#9020)
|
2024-10-03 16:44:52 +08:00 |
|
Sebastian Schoennenbeck
|
35bd215168
|
[Core] [Frontend] Priority scheduling for embeddings and in the OpenAI-API (#8965)
|
2024-10-01 09:58:06 +00:00 |
|
Joe Runde
|
062c89e7c9
|
[Frontend][Core] Move guided decoding params into sampling params (#8252)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2024-10-01 09:34:25 +08:00 |
|
danieljannai21
|
6c9ba48fde
|
[Frontend] Added support for HF's new continue_final_message parameter (#8942)
|
2024-09-29 17:59:47 +00:00 |
|