Marko Rosenmueller
|
455949675d
|
[Frontend][Bug] allow tool calls in analysis channel (#28139)
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-19 10:47:44 +00:00 |
|
lif
|
086b96339f
|
[Bugfix] Add validation for tool requests when tool_parser is unavailable (#30613)
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
|
2025-12-19 18:23:28 +08:00 |
|
Wenqi Glantz
|
4924ac582c
|
Add hidden dimension validation for multimodal embedding inputs (#30968)
Signed-off-by: Wenqi Glantz <wglantz@nvidia.com>
|
2025-12-19 07:59:36 +00:00 |
|
PlatinumGod
|
6a09612b2e
|
[Bugfix] Fix tool_choice="none" being ignored by GPT-OSS/harmony models (#30867)
Signed-off-by: yujiepu <pyjapple@gmail.com>
Signed-off-by: PlatinumGod <pyjapple@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-19 09:34:27 +08:00 |
|
inkcherry
|
500f26e6d3
|
[Bugfix] fix DP-aware routing in OpenAI API requests (#29002)
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-12-18 09:50:42 -08:00 |
|
Matthew Bonanni
|
7eb6cb6c18
|
[Attention] Update tests to remove deprecated env vars (#30563)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-12-17 09:49:59 -08:00 |
|
Nicolò Lucchesi
|
9ca8cb38fd
|
[CI][Bugfix] Fix flaky tests/entrypoints/openai/test_audio.py::test_chat_streaming_audio (#30878)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-17 18:49:56 +01:00 |
|
Chauncey
|
9ad5b21710
|
[Refactor] [4/N] Move VLLM_SERVER_DEV endpoints into the serve directory (#30749)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-17 02:27:30 -08:00 |
|
Nicolò Lucchesi
|
ca702a14dc
|
[Frontend] Add max-completion-token option to transcription/translation endpoints (#30769)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-16 19:36:49 +00:00 |
|
Andrew Xia
|
0d0c929f23
|
[responsesAPI][8] input/output messages for ResponsesParser (#30158)
Signed-off-by: Andrew Xia <axia@fb.com>
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Andrew Xia <axia@fb.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-16 13:54:59 +08:00 |
|
penfree
|
bbd850e597
|
[Bugfix] fix streaming final output for non harmony (#30237)
Signed-off-by: penfree <qiupengfei@baidu.com>
Co-authored-by: penfree <qiupengfei@baidu.com>
|
2025-12-16 09:03:11 +08:00 |
|
Chauncey
|
2a1776b7ac
|
[Refactor] [2/N] Move tool parsers into the vLLM main directory (#30675)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-15 12:54:52 +00:00 |
|
Wenqi Glantz
|
84e23d103d
|
additional protection for CVE-2025-62164 (#30649)
Signed-off-by: Wenqi Glantz <wglantz@nvidia.com>
|
2025-12-15 03:07:10 +00:00 |
|
Cyrus Leung
|
dcb31196da
|
[Chore] Remove redundant RequestPrompt (#30612)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-14 09:22:37 +00:00 |
|
Cyrus Leung
|
64251f48df
|
[Chore] Adjust tokenizer import to avoid circular imports (#30601)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-13 04:42:39 -08:00 |
|
Cyrus Leung
|
b09806e28f
|
[Bugfix] Dictionary MM embeddings for online chat (#30507)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-13 15:48:56 +08:00 |
|
Benjamin Bartels
|
f3237f3f6b
|
[Frontend] Fixes anthropic streaming message_start usage nesting (#30266)
Signed-off-by: bbartels <benjamin@bartels.dev>
|
2025-12-12 16:28:54 +00:00 |
|
Ben Browning
|
8f8fda261a
|
[Bugfix] Multiple fixes for gpt-oss Chat Completion prompting (#28729)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-12 12:59:53 +08:00 |
|
Will Eaton
|
a9e4106f28
|
[P/D] KV Load Failure Recovery/Abort Configuration (#26813)
Signed-off-by: Will Eaton <weaton@redhat.com>
Signed-off-by: Will Eaton <me@wseaton.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-10 11:00:52 -08:00 |
|
Andrew Xia
|
c3487aca34
|
[responsesAPI][6] Fix multi turn MCP tokenization (#30230)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-10 10:13:13 +08:00 |
|
wang.yuqi
|
2e660c2434
|
[Frontend] Binary embedding response does not return metadata by setting encoding_format to bytes_only. (#30249)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-08 12:01:21 +00:00 |
|
daniel-salib
|
444f0e3f33
|
[Frontend] Add MCP type support infrastructure to Responses API (#30054)
Signed-off-by: Daniel Salib <danielsalib@meta.com>
|
2025-12-08 10:02:52 +08:00 |
|
Cyrus Leung
|
e83b7e379c
|
Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199)
|
2025-12-07 00:00:22 -08:00 |
|
Cyrus Leung
|
27f4c2fd46
|
[Renderer] Separate out RendererConfig from ModelConfig (#30145)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-06 23:15:42 -08:00 |
|
jeremyteboul
|
dce6d229f7
|
Support multiple image/audio embeddings per requests (#29988)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
|
2025-12-07 04:34:24 +00:00 |
|
Andrew Xia
|
421125d03a
|
[ez] move harmony utils to parser folder (#30117)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-06 17:34:34 -05:00 |
|
Viacheslav
|
21bb323542
|
Gigachat 3 tool parser and tests (#29905)
Signed-off-by: Viacheslav Barinov <viacheslav.teh@gmail.com>
|
2025-12-06 12:04:14 +00:00 |
|
Deboleina
|
02a4169193
|
[Tests] Tool call tests for openai/gpt-oss-20b (#26237)
Signed-off-by: Debolina Roy <debroy@redhat.com>
|
2025-12-05 19:03:29 -08:00 |
|
Nicolò Lucchesi
|
e23ca3a0e8
|
[CI] Re-use whisper_client for all tests (#30148)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-05 19:47:37 +00:00 |
|
Andrew Xia
|
da7bc54ea8
|
[responsesAPI][5] ResponsesParser with tools for full MCP python loop (#29798)
Signed-off-by: Andrew Xia <axia@fb.com>
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-05 11:11:50 -05:00 |
|
strinczer
|
b73b158ab0
|
[Bugfix] Fix parse_output_message crash on commentary with no recipient (#29972)
Signed-off-by: Shai Trinczer <strinczer@icloud.com>
Signed-off-by: strinczer <strinczer@icloud.com>
|
2025-12-05 10:51:12 +00:00 |
|
wang.yuqi
|
74c4d80c6c
|
[Model][6/N] Improve all pooling task | Support chunked prefill with ALL pooling (#27145)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-04 13:44:15 +00:00 |
|
Cyrus Leung
|
9ae2f60374
|
[Misc] Various cleanups for MM input processing (#29970)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-04 06:22:20 +00:00 |
|
Benjamin Bartels
|
fca3f46658
|
[Frontend] Fixes anthropic /v1/messages streaming not containing input_tokens on first chunk (#29971)
Signed-off-by: bbartels <benjamin@bartels.dev>
|
2025-12-04 05:50:27 +00:00 |
|
Chauncey
|
3f42b05fbc
|
[Refactor] [1/N] to simplify the vLLM serving architecture (#28040)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-03 01:26:39 -08:00 |
|
Andrew Xia
|
3a7751485b
|
[responsesAPI] support input output messages for non harmony models (#29549)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 23:59:23 -08:00 |
|
Andrew Xia
|
52cb349fc0
|
[responsesAPI][3] ResponsesParser to set up non harmony MCP (#29413)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 11:24:45 -05:00 |
|
ImaGoodFella
|
60c3d413af
|
[Multimodal][Core] Optimize multimodal preprocessing cache by hashing image bytes instead of pixel values (#29621)
Signed-off-by: Rahul Steiger <rasteiger@ethz.ch>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-02 21:49:02 +08:00 |
|
Cyrus Leung
|
653591d5e7
|
[Chore] Move tokenizer initialization methods (#29793)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 13:33:37 +08:00 |
|
Zuyi Zhao
|
53bf71b0f0
|
[Misc] Update conftest for entrypoints/sagemaker test folder (#29799)
Signed-off-by: Zuyi Zhao <zhaozuy@amazon.com>
|
2025-12-01 18:56:39 -09:00 |
|
Andrew Xia
|
fa8804ad9c
|
[responsesAPI][4] fix responseOutputItem Kimi K2 thinking bug (#29555)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 02:11:35 +00:00 |
|
sangbumlikeagod
|
092bb73b8a
|
[Frontend] add 'verbose_json' and 'timestamp' feature on Whisper Transcription/Translation (#24209)
Signed-off-by: sangbumlikeagod <oironese@naver.com>
Signed-off-by: sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>
|
2025-12-01 18:19:17 +01:00 |
|
Cyrus Leung
|
f0a28bf661
|
[Misc] Unify tokenizer registration (#29767)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-01 11:34:58 +00:00 |
|
daniel-salib
|
014ece97c7
|
[Frontend] Add tool filtering support to ToolServer (#29224)
Signed-off-by: Daniel Salib <danielsalib@meta.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-01 08:03:57 +00:00 |
|
wang.yuqi
|
62de4f4257
|
[Frontend] Resettle pooling entrypoints (#29634)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2025-12-01 15:30:43 +08:00 |
|
Jee Jee Li
|
b9d0504a36
|
[Bugfix] Revert test_tokenization.py (#29729)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-11-29 16:35:15 +00:00 |
|
Cyrus Leung
|
34a984274e
|
[Misc] Refactor tokenizer interface (#29693)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-29 04:02:21 -08:00 |
|
Jee Jee Li
|
39e63dec7c
|
[LoRA] Cleanup LoRA unused code (#29611)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-11-28 22:52:58 -08:00 |
|
Tsukasa OI
|
762a4a6ca9
|
[Frontend] Perform offline path replacement to tokenizer (#29706)
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
|
2025-11-28 18:32:08 -08:00 |
|
Cyrus Leung
|
b2c50eda50
|
[Bugfix] Fix wrong mock attribute (#29704)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-29 10:30:41 +08:00 |
|