Harry Mellor
313ae8c16a
[Deprecation] Remove everything scheduled for removal in v0.10.0 ( #20979 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-07-15 15:57:53 +00:00
Alex Brooks
41060c6e08
[Core] Add Support for Default Modality Specific LoRAs [generate / chat completions] ( #19126 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com >
2025-07-10 21:09:37 +01:00
Chauncey
2155e95ef1
[Bugfix] Fix the issue where reasoning_content is None when Thinking is enabled and tool_choice is set to 'required'. ( #20662 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2025-07-09 07:39:58 +00:00
Gabriel Marinho
a4113b035c
[Platform] Add custom default max tokens ( #18557 )
...
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com >
2025-07-04 10:50:17 +08:00
Shintarou Okada
3d19d47d91
[Frontend] Expand tools even if tool_choice="none" ( #17177 )
...
Signed-off-by: okada shintarou <okada@preferred.jp >
2025-07-01 12:47:38 -04:00
Max Wittig
f59fc60fb3
[Feat][CLI] enforce-include-usage ( #19695 )
...
Signed-off-by: Max Wittig <max.wittig@siemens.com >
2025-06-25 01:43:04 -04:00
Chauncey
836d4ce140
[Bugfix] fix missing 'finish_reason': null in streaming chat ( #19662 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2025-06-16 14:10:39 +00:00
Chauncey
8fc57501d3
[Bugfix]: Fix the incompatibility issue with stream when Thinking is disabled ( #19135 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2025-06-05 06:24:24 +00:00
Chauncey
4de790fcad
[Bugfix]: Fix the incompatibility issue with tool_choice 'required' when Thinking is enabled ( #19075 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2025-06-03 23:27:24 +00:00
Simon Mo
02f0c7b220
[Misc] Add SPDX-FileCopyrightText ( #19100 )
...
Signed-off-by: simon-mo <simon.mo@hey.com >
2025-06-03 11:20:17 -07:00
Chauncey
77164dad5e
[Bugfix] Consistent ascii handling in tool parsers ( #18883 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2025-05-30 04:44:43 +00:00
Alex Brooks
321331b8ae
[Core] Add Lora Support to Beam Search ( #18346 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com >
2025-05-28 08:58:24 -07:00
Feng XiaoLong
4fc1bf813a
[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking ( #18454 )
...
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com >
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com >
2025-05-23 16:16:26 -07:00
David Xia
f25e0d1125
[Bugfix]: make most of test_openai_schema.py pass ( #17664 )
2025-05-14 17:04:35 -07:00
Robert Shaw
d19110204c
[P/D] NIXL Integration ( #17751 )
...
Signed-off-by: ApostaC <yihua98@uchicago.edu >
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com >
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com >
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com >
Signed-off-by: mgoin <mgoin64@gmail.com >
Signed-off-by: Nick Hill <nhill@redhat.com >
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com >
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com >
Co-authored-by: ApostaC <yihua98@uchicago.edu >
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com >
Co-authored-by: mgoin <mgoin64@gmail.com >
Co-authored-by: Nick Hill <nhill@redhat.com >
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com >
Co-authored-by: Brent Salisbury <bsalisbu@redhat.com >
2025-05-12 09:46:16 -07:00
Maximilien de Bayser
05a4324f8e
Initialize the delta tool call fields explicitly ( #17340 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com >
Co-authored-by: igmainc <igmainc@icloud.com >
2025-05-12 13:28:58 +00:00
Chauncey
5394ad7387
[Bugfix] fix KeyError on top logprobs are special tokens ( #17637 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2025-05-04 19:22:35 -07:00
Chauncey
98060b001d
[Feature][Frontend]: Deprecate --enable-reasoning ( #17452 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2025-05-01 06:46:16 -07:00
Guillaume Calmettes
1da6a09274
[Bugfix]: do not shutdown server if skip_special_use=False for MistralTokenizer ( #14094 )
...
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com >
2025-04-09 19:43:09 -07:00
Matthias Matt
cefb9e5a28
[Frontend] Implement Tool Calling with tool_choice='required' ( #13483 )
...
Signed-off-by: Liangfu Chen <liangfc@amazon.com >
Signed-off-by: Matt, Matthias <matthias.matt@tuwien.ac.at >
Co-authored-by: Liangfu Chen <liangfc@amazon.com >
Co-authored-by: mgoin <michael@neuralmagic.com >
2025-04-02 07:45:45 -07:00
Ce Gao
32b14baf8a
[Refactor][Frontend] Keep all logic about reasoning into one class ( #14428 )
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai >
2025-03-28 00:23:30 -07:00
Jason (Siyu) Zhu
cec8c7d7f8
Refactor error handling for multiple exceptions in preprocessing ( #15650 )
...
Signed-off-by: JasonZhu1313 <jasonchu13@outlook.com >
2025-03-28 03:27:20 +00:00
Robin
d6cd59f122
[Frontend] Support tool calling and reasoning parser ( #14511 )
...
Signed-off-by: WangErXiao <863579016@qq.com >
2025-03-23 14:00:07 -07:00
Guillaume Calmettes
fd8e055ffb
[BugFix]: properly catch templating error when preprocess input ( #13976 )
...
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com >
2025-03-14 05:58:34 -07:00
Harry Mellor
47512b3200
Default to generation_config from model ( #12622 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-03-08 14:46:15 +08:00
Benjamin Chislett
32985bed7c
[Frontend] Allow return_tokens_as_token_ids to be passed as a request param ( #14066 )
...
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai >
2025-03-05 06:30:40 +00:00
Harry Mellor
e5b2f1601a
[Frontend] Do prompt_logprobs clamping for chat as well as completions ( #14225 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-03-04 20:13:06 +00:00
Harry Mellor
9badee53de
Fix performance when --generation-config is not None ( #14223 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-03-04 20:59:22 +01:00
Harry Mellor
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
Keyun Tong
0ffdf8ce0c
[HTTP Server] Make model param optional in request ( #13568 )
2025-02-21 21:55:50 -08:00
Rafael Vasquez
314cfade02
[Frontend] Generate valid tool call IDs when using tokenizer-mode=mistral ( #12332 )
2025-02-12 08:29:56 -08:00
Russell Bryant
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files ( #12628 )
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com >
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com >
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com >
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com >
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com >
2025-02-02 11:58:18 -08:00
Ce Gao
a7e3eba66f
[Frontend] Support reasoning content for deepseek r1 ( #12473 )
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai >
Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com >
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
Co-authored-by: Michael Goin <mgoin@redhat.com >
2025-01-29 11:38:08 +08:00
Robert Shaw
33fc1e2e86
[Frontend] Improve StreamingResponse Exception Handling ( #11752 )
2025-01-05 16:35:01 -05:00
Joe Runde
4db72e57f6
[Bugfix][Refactor] Unify model management in frontend ( #11660 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com >
2025-01-01 02:21:51 +00:00
Yanyi Liu
5aef49806d
[Feature] Add load generation config from model ( #11164 )
...
Signed-off-by: liuyanyi <wolfsonliu@163.com >
Signed-off-by: Yanyi Liu <wolfsonliu@163.com >
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk >
2024-12-19 10:50:38 +00:00
Joe Runde
2d1b9baa8f
[Bugfix] Fix request cancellation without polling ( #11190 )
2024-12-17 12:26:32 -08:00
Brad Hilton
9c3dadd1c9
[Frontend] Add logits_processors as an extra completion argument ( #11150 )
...
Signed-off-by: Brad Hilton <brad.hilton.nw@gmail.com >
2024-12-14 16:46:42 +00:00
Jiaxin Shan
85362f028c
[Misc][LoRA] Ensure Lora Adapter requests return adapter name ( #11094 )
...
Signed-off-by: Jiaxin Shan <seedjeffwan@gmail.com >
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com >
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com >
2024-12-12 09:25:16 +00:00
Clayton
7439a8b5fc
[Bugfix] Multiple fixes to tool streaming with hermes and mistral ( #10979 )
...
Signed-off-by: cedonley <clayton@donley.io >
2024-12-12 01:10:12 +00:00
Joe Runde
980ad394a8
[Frontend] Use request id from header ( #10968 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com >
2024-12-10 13:46:29 +08:00
Chauncey
da7e702c6f
[Bug]: When apply continue_final_message for OpenAI server, the "echo":false is ignored ( #10180 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com >
2024-11-21 16:24:32 +00:00
Cyrus Leung
32e46e000f
[Frontend] Automatic detection of chat content format from AST ( #9919 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-11-16 13:35:40 +08:00
Patrick von Platen
11cd1ae6ad
[Tool parsing] Improve / correct mistral tool parsing ( #10333 )
2024-11-15 00:42:49 +00:00
Guillaume Calmettes
52b48c1ead
[BugFix]: properly deserialize tool_calls iterator before processing by mistral-common when MistralTokenizer is used ( #9951 )
...
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com >
2024-11-14 04:48:16 +00:00
Mike Depinet
f67ce05d0b
[Frontend] Pythonic tool parser ( #9859 )
...
Signed-off-by: Mike Depinet <mike@fixie.ai >
2024-11-14 04:14:34 +00:00
Cyrus Leung
0b8bb86bf1
[1/N] Initial prototype for multi-modal processor ( #10044 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-11-13 12:39:03 +00:00
zifeitong
47db6ec831
[Frontend] Add per-request number of cached token stats ( #10174 )
2024-11-12 16:42:28 +00:00
Cyrus Leung
06386a64dd
[Frontend] Chat-based Embeddings API ( #9759 )
2024-11-01 08:13:35 +00:00
Zhong Qishuai
ef7865b4f9
[Frontend] re-enable multi-modality input in the new beam search implementation ( #9427 )
...
Signed-off-by: Qishuai Ferdinandzhong@gmail.com
2024-10-29 11:49:47 +00:00