Chauncey
132bfd45b6
[Bugfix][ResponsesAPI] Fix crash when tool_choice=required exceeds max_output_tokens (#37258)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-03-17 08:54:52 +00:00
Flora Feng
3e3d320c1b
[Refactor] Relocate responses API tests (#37241)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
2026-03-17 05:14:52 +00:00
Flora Feng
384dc7f77b
[Refactor] Relocate completion and chat completion tests (#37125)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
2026-03-17 11:31:23 +08:00
Walter Beller-Morales
061980c36a
[Feature][Frontend] add support for Cohere Embed v2 API (#37074)
Signed-off-by: walterbm <walter.beller.morales@gmail.com>
2026-03-16 19:55:53 -04:00
Ben Browning
7a49742b88
[CI/Build] Add common tool call parser test suite (#27599)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
2026-03-16 19:46:20 -04:00
Flora Feng
dfa8852db2
[Refactor] Consolidate GPT-OSS reasoning parser tests (#36915)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: Flora Feng <4florafeng@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-03-16 15:53:07 -04:00
Max de Bayser
9f9ecff4cd
Add simple granite4 tool parser (#36827)
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2026-03-16 10:49:09 -07:00
Benjamin Bartels
0e5a9382af
[Bugfix] accept redacted thinking blocks in Anthropic messages (#36992)
Signed-off-by: Benjamin Bartels <benjaminba@tiglab-ubuntu.ilab.local>
Signed-off-by: bbartels <benjamin@bartels.dev>
Co-authored-by: Benjamin Bartels <benjaminba@tiglab-ubuntu.ilab.local>
2026-03-16 22:01:57 +08:00
Isotr0py
912fbe9555
[Bugfix] Fix Qwen2.5-Omni/Qwen3-Omni use_audio_in_video with multi-video inputs (#37147)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-16 08:56:06 +00:00
Andrew Xia
e9163b536e
[responsesAPI][ez] add a unit test for SimpleContext logprobs (#37126)
Signed-off-by: Andrew Xia <axia@meta.com>
2026-03-15 17:12:26 -07:00
Isotr0py
143e4dccdf
[Misc] Add online audio_in_video test (#36775)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-15 00:14:11 -07:00
Sergey Zinchenko
4a718e770d
[Bug] Fix Failure in /v1/chat/completions/render for Multimodal Requests (https://github.com/vllm-project/vllm/issues/35665) (#35684)
2026-03-14 14:10:11 +00:00
Flora Feng
bcfdadb1bc
[Refactor] Relocate chat completion and anthropic tests (#36919)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
2026-03-14 12:16:16 +08:00
Andrew Xia
f680dc1b39
[responsesAPI] prioritize content over summary in reasoning item input (#36516)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <mitandrewxia@gmail.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2026-03-14 09:20:30 +08:00
Mark McLoughlin
7afe0faab1
[Frontend][Core] Re-add shutdown timeout - allowing in-flight requests to finish (#36666)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
2026-03-13 12:10:06 -07:00
Sage
a2268617cf
[Frontend] Delegate preprocessing to OpenAIServingRender (#36483)
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
2026-03-13 00:39:43 -07:00
Eunkwang Jeon
bdc2343454
[Bugfix] Fix KeyError in parse_response_input for reasoning items with optional content (#34499)
Signed-off-by: jeonsworld <jeonsworld@gmail.com>
2026-03-13 00:13:36 +08:00
Kunshang Ji
53ec16a705
[Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145)
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-12 07:57:47 -07:00
Martin Hickey
7f1f36bf91
[CI] Fix mypy for vllm/reasoning (#35742)
Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-12 12:21:33 +00:00
Chauncey
5a71cdd76e
[Bugfix] Fix crash when tool_choice=required exceeds max_tokens (#36841)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-03-12 03:28:45 -07:00
Chauncey
9fe404ed04
[Frontend] OpenAI Responses API supports Tool/Function calling with streaming (#29947)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-03-12 15:03:50 +08:00
Nick Hill
262b76a09f
[Frontend] Exclude anthropic billing header to avoid prefix cache miss (#36829)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-03-12 01:20:34 +00:00
Wentao Ye
c34ba6b961
[Perf] Optimize compute maxsim using batched version, 3.2% E2E throughput improvement (#36710)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-03-12 08:37:01 +08:00
Isotr0py
c84b519cf3
[Bugfix] Fix negative max_tokens when input prompt is too long (#36789)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-11 16:30:51 +00:00
Ning Xie
fe714dd507
[openapi server] log exception in exception handler(2/N) (#36201)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2026-03-10 20:16:30 -07:00
Mark McLoughlin
234860399b
[Frontend][Core] Revert "Add shutdown timeout" (#34730 and #36270) (#36628)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2026-03-10 06:20:41 -07:00
Harry Mellor
c88510083b
Fix Qwen2.5-VL test for Transformers v5 (#36532)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-10 12:05:34 +00:00
Chang Su
507ddbe992
feat(grpc): extract gRPC servicer into smg-grpc-servicer package, add --grpc flag to vllm serve (#36169)
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2026-03-10 03:29:59 -07:00
Hojin Yang
0836be3b03
[Model] Add HyperCLOVAX-SEED-Think-32B vision-language model support (#31471)
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-03-10 10:59:19 +08:00
Micah Williamson
4ff9b045fe
[ROCm][CI] Prep Tests For Change To ROCM_ATTN As New Default Backend On ROCm (#36025)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2026-03-09 13:27:55 -05:00
Alex Brooks
65a4da1504
[Frontend] Add Support for MM Encoder/Decoder Beam Search (Online Transcriptions) (#36160)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
2026-03-09 05:46:23 +00:00
wang.yuqi
fff3711a24
[Frontend][2/n] Improve pooling entrypoints | embed. (#36110)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
2026-03-09 11:42:19 +08:00
Andreas Karatzas
40077ea3de
[CI] fix flaky empty responses and add diagnostic assertions in vision chat tests (#36341)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-08 14:42:24 +08:00
milesial
755356b3d1
feat: expose media_io_kwargs at runtime (#34778)
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
2026-03-07 04:27:04 +00:00
Andreas Karatzas
58928475e4
[ROCm][CI] Making entrypoints more deterministic on ROCm (#36293)
2026-03-06 19:04:40 -08:00
Chenguang Zheng
fcb73f306c
[bugfix] add api process rank in default multimodal request (#36150)
Signed-off-by: fake0fan <645327136@qq.com>
Signed-off-by: Chenguang ZHENG <645327136@qq.com>
2026-03-06 12:00:09 +00:00
Andreas Karatzas
807d680337
[ROCm][CI] Fix tool use test stability - disable skinny GEMM, prefix caching, eliminate batch variance (#35553)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-06 15:15:12 +08:00
Ajay Anubolu
43f10573c9
[Bugfix] Fix misleading context length error messages (#36197)
Signed-off-by: AjAnubolu <anuboluajay@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:15:12 -08:00
Mark McLoughlin
27066d1b2b
[Frontend][Core] Add shutdown timeout - allowing in-flight requests to finish (#34730)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-05 22:04:31 -08:00
Shiyan Deng
8e87cc57f1
[Bug] Fix a corner case in _process_simple_streaming_events (#34754)
Signed-off-by: Shiyan Deng <dsy842974287@meta.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
2026-03-05 20:57:32 -08:00
Ning Xie
176c799f4c
[openai api] log exception in exception handler (1/N) (#31164)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2026-03-05 16:00:12 +00:00
Andreas Karatzas
b03ff6a96b
[CI] Stabilize test_no_args_tool_call and add ROCm-specific server args (#36107)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-05 21:52:49 +08:00
Hyunkyun Moon
bc6be89d16
[Frontend] Add vllm launch command for GPU-less preprocessing serving (#34551)
Signed-off-by: HyunKyun Moon <mhg5303@gmail.com>
2026-03-04 18:41:52 +00:00
Kunshang Ji
16d2ad1d38
[Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-04 09:49:47 +00:00
JasonCohere
ae88468bcc
fix: Ensure invalid audio files return 400 error (#34715)
Signed-off-by: Jason Ozuzu <jasonozuzu@cohere.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
2026-03-03 08:47:39 -08:00
hallerite
b8401cde0e
add regression test (#35834)
Signed-off-by: hallerite <git@hallerite.com>
2026-03-03 07:32:15 +00:00
Martin Vit
95a395dbec
[Bugfix] Fix Anthropic API base64 image handling in Messages endpoint (#35557)
Signed-off-by: Martin Vit <martin@voipmonitor.org>
2026-02-28 20:57:08 +00:00
Wentao Ye
e113a30113
[Deprecation] Deprecate code in 0.17 as scheduled (#35441)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-02-28 17:32:37 +00:00
Andreas Karatzas
1e69c04887
[ROCm][CI] Parametrize vision score tests across attention backends with per-backend tolerances (#35571)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-28 08:59:26 +00:00
Chauncey
06254d4cbb
[CI] add trainer_send_weights for MockWeightTransferEngine (#35589)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-02-28 06:47:43 +00:00