Commit Graph

178 Commits

Author SHA1 Message Date
Cyrus Leung
d31f7844f8 [Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-19 05:20:55 -07:00
Cyrus Leung
d2740fafbf [Chore] Separate out vllm.utils.collections (#26990)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-16 08:35:35 +00:00
Tao Hui
85a65e7f51 [Model] Add DeepSeek-V3.1 reasoning parser (split from PR #24972) (#25589)
Signed-off-by: taohui <taohui3@gmail.com>
Signed-off-by: Tao Hui <taohui3@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-10-15 11:09:52 +08:00
Max Wittig
fd85c9f426 [Bugfix][FE]: Always include usage with --enable-force-include-usage (#20983)
Signed-off-by: Max Wittig <max.wittig@siemens.com>
Signed-off-by: Antoine Auger <antoineauger@users.noreply.github.com>
Co-authored-by: Antoine Auger <antoineauger@users.noreply.github.com>
2025-10-14 09:17:39 +02:00
Harry Mellor
8fcaaf6a16 Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Chauncey
d0bed837ac [Refactor]Reduce duplicate code in serving_chat (#26627)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-10-11 12:04:49 +00:00
Chauncey
910abdbd08 [Bugfix] fixed top_logprobs: -1 does not appear to work as intended (#26470)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-10-11 00:41:17 +08:00
Cyrus Leung
4bdf7ac593 [Bugfix] Fix SHM cache initialization (#26427)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-09 02:48:04 -07:00
Harry Mellor
4e256cadc2 Remove all references to yapf as it's no longer used (#26251)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 09:18:11 -07:00
Harry Mellor
d6953beb91 Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Isotr0py
a42d2df75f [Frontend] Cache chat template kwargs resolution (#26227)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-04 15:32:30 +00:00
Aleksandr Samarin
73a99cc2a5 [Model] Fixed stream generator for gpt-oss + spec-decoding (#26027)
Signed-off-by: Aleksandr Samarin <astrlrd@nebius.com>
2025-10-03 13:43:41 +00:00
Yang Liu
812b7f54a8 [Renderer] Move Processor out of AsyncLLM (#24138)
Signed-off-by: Yang <lymailforjob@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-03 11:29:45 +00:00
kyt
2ed3f20dba [openai] Fix missing tool usage check (system message) (#24768)
Signed-off-by: kyt <eluban4532@gmail.com>
2025-10-03 18:55:44 +08:00
Russell Bryant
7977e5027c Add filtering for chat template kwargs (#25794)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-27 10:46:49 +00:00
Ben Browning
5caaeb714c [Bugfix] [Frontend] Cleanup gpt-oss non-streaming chat tool calls (#25514)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-09-24 03:20:38 +00:00
Aaron Pham
29283e8976 [Chore] Cleanup guided namespace, move to structured outputs config (#22772)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-18 09:20:27 +00:00
Shijun Yin
2b85697031 [BugFix] enable DOTALL to match multi-line tool_call parameters in extract_tool_call_required_streaming (#24668)
Signed-off-by: Shijun Yin <shijun.yin@outlook.com>
2025-09-17 09:21:18 +00:00
Zhuohan Li
6c47f6bfa4 [Core] Remove tokenizer group in vLLM (#24078)
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
2025-09-17 08:42:59 +00:00
Cheng Kuan Yong Jason
68dbde5dbb [Bugfix] remove duplicate tokens streamed in required tool choice streaming (#23312)
Signed-off-by: Jason Cheng <jasoncky96@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-09-16 15:16:32 +08:00
Harry Mellor
c1eda615ba Fix model name included in responses (#24663)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-11 10:47:51 -07:00
Aaron Pham
fb691ee4e7 [Fix] [gpt-oss] fix non-tool calling path for chat completion (#24324) 2025-09-06 19:10:32 +00:00
Aaron Pham
c29fb540ff [gpt-oss] tool parser supports for /chat/completions [1/n] (#22386)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2025-09-04 20:39:12 -07:00
Nicolò Lucchesi
65e038931d [Frontend] Skip unnecessary detokenization when token_id is requested (#24236)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-09-04 23:04:12 +00:00
Woosuk Kwon
5685370271 [Chore][V0 Deprecation] Move LogProb to a separate file (#24055)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-01 12:07:53 -07:00
Gabriel Marinho
5b8077b8ac Fix wrong truncate_prompt_tokens type hint (#22761)
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Gabriel Marinho <104592062+gmarinho2@users.noreply.github.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
2025-08-30 20:39:38 +00:00
Chen Zhang
3210264421 [Frontend] Add --log-error-stack to print stack trace for error response (#22960)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-08-27 04:58:59 +00:00
Federico
585e0bde36 [Bugfix] UnboundLocalError when GptOss reasoning specified (#23054)
Signed-off-by: Federico <65908512+coval3nte@users.noreply.github.com>
2025-08-27 00:29:52 +00:00
Yu Guo
49ab23b3cc [gpt-oss] use reasoning channel for reasoning text in serving_chat (#22920)
Signed-off-by: Yu Guo <yuguo@meta.com>
2025-08-25 06:29:34 +00:00
bigmoyan
582bbe6bd7 [Fix] correct tool_id for kimi-k2 when use tool_choice=required (#21259)
Co-authored-by: wangzhengtao <wangzhengtao@msh.team>
2025-08-20 12:59:54 -07:00
Marko Rosenmueller
80141bbf2f fix: use cache_salt for gpt-oss (#23186)
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
2025-08-19 18:12:25 +00:00
Yuge Zhang
24f4d1a224 Add return_token_ids parameter to OpenAI API endpoints (#22587)
Signed-off-by: Yuge Zhang <scottyugochang@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2025-08-19 09:48:31 -07:00
Nick Hill
f6b5040590 [Frontend] Avoid list copies in serving_chat.py (#22947)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-08-16 02:06:30 +00:00
Adrián García García
8e8e0b6af1 feat: Add --enable-log-outputs flag for logging model generations (#20707)
Signed-off-by: Adrian Garcia <adrian.garcia@inceptionai.ai>
2025-08-06 23:10:13 -07:00
Woosuk Kwon
f263a4b53f [gpt-oss] Support chat completion api (#22342) 2025-08-06 01:57:39 -07:00
Hongsheng Liu
7656cf4cf3 [Bugfix] [issue-21565] Fix the incompatibility issue with stream and named function calling when Thinking is disabled (#21573)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
2025-07-27 22:43:50 -07:00
Shintarou Okada
6eca337ce0 Replace --expand-tools-even-if-tool-choice-none with --exclude-tools-when-tool-choice-none for v0.10.0 (#20544)
Signed-off-by: okada <kokuzen@gmail.com>
Signed-off-by: okada shintarou <okada@preferred.jp>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 02:56:36 -07:00
Michael Goin
82ec66f514 [V0 Deprecation] Remove Prompt Adapters (#20588)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-23 16:36:48 -07:00
Asher
5a7fb3ab9e [Model] Add ToolParser and MoE Config for Hunyuan A13B (#20820)
Signed-off-by: Asher Zhang <asherszhang@tencent.com>
2025-07-17 09:10:09 +00:00
Harry Mellor
313ae8c16a [Deprecation] Remove everything scheduled for removal in v0.10.0 (#20979)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-15 15:57:53 +00:00
Alex Brooks
41060c6e08 [Core] Add Support for Default Modality Specific LoRAs [generate / chat completions] (#19126)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2025-07-10 21:09:37 +01:00
Chauncey
2155e95ef1 [Bugfix] Fix the issue where reasoning_content is None when Thinkng is enabled and tool_choice is set to 'required'. (#20662)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-07-09 07:39:58 +00:00
Gabriel Marinho
a4113b035c [Platform] Add custom default max tokens (#18557)
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
2025-07-04 10:50:17 +08:00
Shintarou Okada
3d19d47d91 [Frontend] Expand tools even if tool_choice="none" (#17177)
Signed-off-by: okada shintarou <okada@preferred.jp>
2025-07-01 12:47:38 -04:00
Max Wittig
f59fc60fb3 [Feat][CLI] enforce-include-usage (#19695)
Signed-off-by: Max Wittig <max.wittig@siemens.com>
2025-06-25 01:43:04 -04:00
Chauncey
836d4ce140 [Bugfix] fix missing 'finish_reason': null in streaming chat (#19662)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-06-16 14:10:39 +00:00
Chauncey
8fc57501d3 [Bugfix]: Fix the incompatibility issue with stream when Thinking is disabled (#19135)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-06-05 06:24:24 +00:00
Chauncey
4de790fcad [Bugfix]: Fix the incompatibility issue with tool_choice 'required' when Thinking is enabled (#19075)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-06-03 23:27:24 +00:00
Simon Mo
02f0c7b220 [Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Chauncey
77164dad5e [Bugfix] Consistent ascii handling in tool parsers (#18883)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-05-30 04:44:43 +00:00