Andreas Karatzas
|
4f9ce35afe
|
[CI][Bugfix] Fix token counting in chunked prefill compl test (#31630)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-01-03 14:28:49 +08:00 |
|
Andreas Karatzas
|
21de6d4b02
|
[CI][Bugfix] Fix token counting in chunked prefill streaming test (#31565)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2025-12-31 23:05:14 +00:00 |
|
Hojin Yang
|
dc837bc23e
|
feat(frontend): add --default-chat-template-kwargs CLI argument (#31343)
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
|
2025-12-30 03:38:47 +00:00 |
|
amittell
|
9c884faa95
|
[Bugfix] Preserve tool call id/type/name in streaming finish chunk (#31438)
Signed-off-by: amittell <mittell@me.com>
Signed-off-by: Alex Mittell <mittell@me.com>
|
2025-12-29 21:10:52 +08:00 |
|
Chauncey
|
48d5ca4e8b
|
[CI] fix test_chat_truncation_content_not_null test (#31488)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-29 12:47:08 +00:00 |
|
Andreas Karatzas
|
c79dbfa9ad
|
[CI] Fix flaky vision beam search test with flexible semantic validation (#31324)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2025-12-26 04:39:32 +00:00 |
|
Cyrus Leung
|
09dc7c690c
|
[Chore][1/2] Drop v0.14 deprecations (#31285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-24 09:54:01 -08:00 |
|
Cyrus Leung
|
aa3868ecfe
|
[Chore] Remove unused noqas (#31263)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-24 05:38:46 -08:00 |
|
Andreas Karatzas
|
0247a91e00
|
[ROCm][CI] Fix entrypoints tests and Python-only installation test on ROCm (#28979)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2025-12-23 22:42:30 -08:00 |
|
Cyrus Leung
|
bb62dda2c3
|
[Misc] Introduce encode_*_url utility function (#31208)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-23 13:45:21 +00:00 |
|
Nicolò Lucchesi
|
b1c3f96ae3
|
[CI][Bugfix] Fix entrypoints/openai/test_audio.py (#31151)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-22 07:21:40 -08:00 |
|
汪志鹏
|
3e92b2b7ac
|
[BugFix]fix gpt-oss v1/completions response bug (#30608)
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: bbrowning <bbrownin@redhat.com>
|
2025-12-21 10:39:31 +08:00 |
|
Marko Rosenmueller
|
455949675d
|
[Frontend][Bug] allow tool calls in analysis channel (#28139)
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-19 10:47:44 +00:00 |
|
lif
|
086b96339f
|
[Bugfix] Add validation for tool requests when tool_parser is unavailable (#30613)
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
|
2025-12-19 18:23:28 +08:00 |
|
Wenqi Glantz
|
4924ac582c
|
Add hidden dimension validation for multimodal embedding inputs (#30968)
Signed-off-by: Wenqi Glantz <wglantz@nvidia.com>
|
2025-12-19 07:59:36 +00:00 |
|
PlatinumGod
|
6a09612b2e
|
[Bugfix] Fix tool_choice="none" being ignored by GPT-OSS/harmony models (#30867)
Signed-off-by: yujiepu <pyjapple@gmail.com>
Signed-off-by: PlatinumGod <pyjapple@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-19 09:34:27 +08:00 |
|
inkcherry
|
500f26e6d3
|
[Bugfix] fix DP-aware routing in OpenAI API requests (#29002)
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-12-18 09:50:42 -08:00 |
|
Matthew Bonanni
|
7eb6cb6c18
|
[Attention] Update tests to remove deprecated env vars (#30563)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-12-17 09:49:59 -08:00 |
|
Nicolò Lucchesi
|
9ca8cb38fd
|
[CI][Bugfix] Fix flaky tests/entrypoints/openai/test_audio.py::test_chat_streaming_audio (#30878)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-17 18:49:56 +01:00 |
|
Chauncey
|
9ad5b21710
|
[Refactor] [4/N] Move VLLM_SERVER_DEV endpoints into the serve directory (#30749)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-17 02:27:30 -08:00 |
|
Nicolò Lucchesi
|
ca702a14dc
|
[Frontend] Add max-completion-token option to transcription/translation endpoints (#30769)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-16 19:36:49 +00:00 |
|
Andrew Xia
|
0d0c929f23
|
[responsesAPI][8] input/output messages for ResponsesParser (#30158)
Signed-off-by: Andrew Xia <axia@fb.com>
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Andrew Xia <axia@fb.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-16 13:54:59 +08:00 |
|
penfree
|
bbd850e597
|
[Bugfix] fix streaming final output for non harmony (#30237)
Signed-off-by: penfree <qiupengfei@baidu.com>
Co-authored-by: penfree <qiupengfei@baidu.com>
|
2025-12-16 09:03:11 +08:00 |
|
Chauncey
|
2a1776b7ac
|
[Refactor] [2/N] Move tool parsers into the vLLM main directory (#30675)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-15 12:54:52 +00:00 |
|
Wenqi Glantz
|
84e23d103d
|
additional protection for CVE-2025-62164 (#30649)
Signed-off-by: Wenqi Glantz <wglantz@nvidia.com>
|
2025-12-15 03:07:10 +00:00 |
|
Cyrus Leung
|
dcb31196da
|
[Chore] Remove redundant RequestPrompt (#30612)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-14 09:22:37 +00:00 |
|
Cyrus Leung
|
64251f48df
|
[Chore] Adjust tokenizer import to avoid circular imports (#30601)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-13 04:42:39 -08:00 |
|
Benjamin Bartels
|
f3237f3f6b
|
[Frontend] Fixes anthropic streaming message_start usage nesting (#30266)
Signed-off-by: bbartels <benjamin@bartels.dev>
|
2025-12-12 16:28:54 +00:00 |
|
Ben Browning
|
8f8fda261a
|
[Bugfix] Multiple fixes for gpt-oss Chat Completion prompting (#28729)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-12 12:59:53 +08:00 |
|
Will Eaton
|
a9e4106f28
|
[P/D] KV Load Failure Recovery/Abort Configuration (#26813)
Signed-off-by: Will Eaton <weaton@redhat.com>
Signed-off-by: Will Eaton <me@wseaton.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-10 11:00:52 -08:00 |
|
daniel-salib
|
444f0e3f33
|
[Frontend] Add MCP type support infrastructure to Responses API (#30054)
Signed-off-by: Daniel Salib <danielsalib@meta.com>
|
2025-12-08 10:02:52 +08:00 |
|
Cyrus Leung
|
e83b7e379c
|
Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199)
|
2025-12-07 00:00:22 -08:00 |
|
Cyrus Leung
|
27f4c2fd46
|
[Renderer] Separate out RendererConfig from ModelConfig (#30145)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-06 23:15:42 -08:00 |
|
Andrew Xia
|
421125d03a
|
[ez] move harmony utils to parser folder (#30117)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-06 17:34:34 -05:00 |
|
Viacheslav
|
21bb323542
|
Gigachat 3 tool parser and tests (#29905)
Signed-off-by: Viacheslav Barinov <viacheslav.teh@gmail.com>
|
2025-12-06 12:04:14 +00:00 |
|
Deboleina
|
02a4169193
|
[Tests] Tool call tests for openai/gpt-oss-20b (#26237)
Signed-off-by: Debolina Roy <debroy@redhat.com>
|
2025-12-05 19:03:29 -08:00 |
|
Nicolò Lucchesi
|
e23ca3a0e8
|
[CI] Re-use whisper_client for all tests (#30148)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-12-05 19:47:37 +00:00 |
|
Andrew Xia
|
da7bc54ea8
|
[responsesAPI][5] ResponsesParser with tools for full MCP python loop (#29798)
Signed-off-by: Andrew Xia <axia@fb.com>
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-05 11:11:50 -05:00 |
|
Cyrus Leung
|
9ae2f60374
|
[Misc] Various cleanups for MM input processing (#29970)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-04 06:22:20 +00:00 |
|
Benjamin Bartels
|
fca3f46658
|
[Frontend] Fixes anthropic /v1/messages streaming not containing input_tokens on first chunk (#29971)
Signed-off-by: bbartels <benjamin@bartels.dev>
|
2025-12-04 05:50:27 +00:00 |
|
Chauncey
|
3f42b05fbc
|
[Refactor] [1/N] to simplify the vLLM serving architecture (#28040)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-03 01:26:39 -08:00 |
|
Andrew Xia
|
3a7751485b
|
[responsesAPI] support input output messages for non harmony models (#29549)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 23:59:23 -08:00 |
|
Andrew Xia
|
52cb349fc0
|
[responsesAPI][3] ResponsesParser to set up non harmony MCP (#29413)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 11:24:45 -05:00 |
|
ImaGoodFella
|
60c3d413af
|
[Multimodal][Core] Optimize multimodal preprocessing cache by hashing image bytes instead of pixel values (#29621)
Signed-off-by: Rahul Steiger <rasteiger@ethz.ch>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-02 21:49:02 +08:00 |
|
Cyrus Leung
|
653591d5e7
|
[Chore] Move tokenizer initialization methods (#29793)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-02 13:33:37 +08:00 |
|
sangbumlikeagod
|
092bb73b8a
|
[Frontend] add 'verbose_json' and 'timestamp' feature on Whisper Transcription/Translation (#24209)
Signed-off-by: sangbumlikeagod <oironese@naver.com>
Signed-off-by: sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>
|
2025-12-01 18:19:17 +01:00 |
|
Cyrus Leung
|
f0a28bf661
|
[Misc] Unify tokenizer registration (#29767)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-01 11:34:58 +00:00 |
|
daniel-salib
|
014ece97c7
|
[Frontend] Add tool filtering support to ToolServer (#29224)
Signed-off-by: Daniel Salib <danielsalib@meta.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-12-01 08:03:57 +00:00 |
|
wang.yuqi
|
62de4f4257
|
[Frontend] Resettle pooling entrypoints (#29634)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2025-12-01 15:30:43 +08:00 |
|
Jee Jee Li
|
b9d0504a36
|
[Bugfix] Revert test_tokenization.py (#29729)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-11-29 16:35:15 +00:00 |
|