Martin Vit
95a395dbec
[Bugfix] Fix Anthropic API base64 image handling in Messages endpoint ( #35557 )
...
Signed-off-by: Martin Vit <martin@voipmonitor.org >
2026-02-28 20:57:08 +00:00
Umut Polat
1d5ab5d603
[Bugfix] Move chat completion response_format validation to Pydantic model_validator ( #35510 )
...
Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com >
2026-02-27 21:26:19 -08:00
Umut Polat
b66a74649e
[Bugfix] Replace assert with ValueError for response_format validation in completions endpoint ( #35456 )
...
Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com >
2026-02-27 08:01:06 +00:00
daniel-salib
d43048ce05
[Bugfix] Emit reasoning_part events in simple streaming path for Resp… ( #35184 )
...
Signed-off-by: Daniel Salib <danielsalib@meta.com >
2026-02-27 09:49:06 +08:00
Krish Gupta
3827c8c55a
[Test] Add tests for n parameter in chat completions API ( #35283 )
...
Signed-off-by: KrxGu <krishom70@gmail.com >
2026-02-26 09:14:07 +00:00
Flora Feng
186ea22efe
[Misc][Harmony] Move Responses API only harmony utils to responses/harmony.py ( #35339 )
...
Signed-off-by: sfeng33 <4florafeng@gmail.com >
2026-02-26 14:35:16 +08:00
pushkar
5d18bf8b32
[Bugfix] Fix Harmony preamble visibility in Responses API ( #32114 )
...
Signed-off-by: Pushkar Patel <git@thepushkarp.com >
Signed-off-by: pupa <pupa@users.noreply.github.com >
2026-02-25 08:08:16 -08:00
Andreas Karatzas
2ff3e436ad
[Responses][CI] Filter negative token IDs in schema fuzz test to avoid 500 errors ( #35231 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-02-25 05:52:44 +00:00
Flora Feng
ec1d30c0f6
[Responses] Decouple SSE event helpers from Harmony context ( #35148 )
...
Signed-off-by: sfeng33 <4florafeng@gmail.com >
2026-02-24 20:05:25 -08:00
Pooya Davoodi
e3b2324ec4
[Frontend] Use init_app_state and FrontendArgs in run_batch ( #32967 )
...
Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io >
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk >
2026-02-24 19:40:39 -08:00
Harry Mellor
28c5e69ba0
Enforce that model is the first positional arg when --served-model-name is used ( #34973 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2026-02-23 08:38:05 -08:00
Robert Shaw
d13ece38d7
[CI] Skip Responses API ( #34990 )
...
Signed-off-by: Robert Shaw <robshaw@redhat.com >
Co-authored-by: Robert Shaw <robshaw@redhat.com >
2026-02-23 07:46:45 -08:00
Andreas Karatzas
dd8c3a7fb2
[ROCm][CI] Fix realtime test timeouts caused by aiter JIT compilation delays ( #35052 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-02-22 10:07:18 +00:00
Roman
98b0205c3c
[Frontend] Add automatic language detection for Whisper transcription ( #34342 )
...
Signed-off-by: space_check <roman.vuskov@rwth-aachen.de >
Signed-off-by: Roman <45857014+spacecheck@users.noreply.github.com >
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com >
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com >
2026-02-21 04:49:41 -08:00
Andreas Karatzas
991d6bff38
[CI][MCP][Harmony] Heavy refactoring Harmony & MCP response tests and stabilizing with deterministic test infrastructure ( #33949 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-02-20 20:03:32 -08:00
Micah Williamson
f5432e35a3
[ROCm][CI] Loosen RemoteOpenAIServer Startup Timeout ( #34922 )
...
Signed-off-by: Micah Williamson <micah.williamson@amd.com >
2026-02-20 05:37:49 +00:00
Varun Chawla
676f82ae81
Add validation to reject non-text content in system messages ( #34072 )
...
Signed-off-by: Varun Chawla <varun_6april@hotmail.com >
2026-02-19 21:30:33 -08:00
Tal Nir
f75b61a9e9
[Voxtral Realtime] Fix engine crash on empty multimodal embeddings ( #34862 )
...
Signed-off-by: Tal Nir <tal@nervexneurotech.com >
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-18 23:21:47 -08:00
Jaeyeon Kim(김재연)
9681068cf9
[Frontend] Fix reasoning_tokens for text-based parsers in Responses API ( #33513 )
...
Signed-off-by: Jaeyeon Kim <anencore94@gmail.com >
2026-02-18 23:16:41 -08:00
Flora Feng
1e4a084c8e
[CI] Fix flaky test_parsable_context ( #34717 )
...
Signed-off-by: sfeng33 <4florafeng@gmail.com >
2026-02-17 18:42:52 +00:00
Nicolò Lucchesi
6cc403e67d
[Bugfix][CI] Fix flaky entrypoints/openai/test_response_api_with_harmony.py::test_function_calling[openai/gpt-oss-20b] ( #34624 )
...
Signed-off-by: NickLucche <nlucches@redhat.com >
2026-02-16 16:11:07 +00:00
Almog Tavor
72d5951d02
[Bugfix] Treat generation_config max_tokens as default not ceiling ( #34063 )
...
Signed-off-by: almogtavor <almogtavor@gmail.com >
2026-02-16 07:58:24 -08:00
Andreas Karatzas
974d829b05
[CI][Frontend] Return 422 instead of 500 for invalid Anthropic tool_choice ( #34590 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-02-15 20:06:48 -08:00
Cyrus Leung
73391a1baa
[Renderer] Move InputPreprocessor into Renderer (1/2) ( #34510 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
2026-02-14 10:14:21 -08:00
Ben Browning
fd267bc7b7
[Bugfix]: Fix structured output in multi-turn gpt-oss ( #34454 )
...
Signed-off-by: Ben Browning <bbrownin@redhat.com >
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk >
2026-02-13 11:12:48 -08:00
Cyrus Leung
2f308214c0
[Refactor] Pass full VllmConfig to Renderer ( #34485 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-02-12 22:48:38 -08:00
Andreas Karatzas
6afa587d31
[ROCm][CI] Fix serving tokens test failures ( #34047 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-02-13 11:27:53 +08:00
Cyrus Leung
fc22cae4ac
[CI/Build] Update video URLs for testing ( #34446 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-02-12 18:15:36 -08:00
Alec S
be7370daf3
[Frontend] Enable generic structured_outputs for responses API ( #33709 )
...
Signed-off-by: Alec Solder <alecs@fb.com >
Co-authored-by: Alec Solder <alecs@fb.com >
2026-02-12 16:15:48 -08:00
Patrick von Platen
1100a97621
[Voxstral Realtime] Enable tests ( #33803 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com >
2026-02-12 09:43:24 -08:00
Cyrus Leung
fb455ed547
[V0 Deprecation] Remove code related to per-request logits processors ( #34400 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-02-12 20:44:28 +08:00
Cyrus Leung
b96f7314b4
[Refactor] Pass Renderer to Input Processor ( #34329 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-02-11 19:38:11 -08:00
Junseo Park
5458eb835d
[Bugfix] send None sentinel on final commit so server properly sends transcription.done ( #33963 )
...
Signed-off-by: pjs102793 <pjs102793@naver.com >
Co-authored-by: Nick Hill <nickhill123@gmail.com >
2026-02-11 21:01:53 +00:00
Adam Binford
1b8756562e
Responses harmony system message structured ( #34268 )
...
Signed-off-by: Adam Binford <adamq43@gmail.com >
2026-02-11 05:14:28 -08:00
wang.yuqi
dab1de9f38
[Frontend][CI] Consolidate instrumentator entrypoints ( #34123 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io >
2026-02-10 07:30:19 +00:00
Cyrus Leung
ab97bcf662
[CI/Build] Relax test_mcp_tool_call ( #34204 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-02-10 05:18:57 +00:00
Pooya Davoodi
2cb2340f7a
[Frontend]Add support for transcriptions and translations to run_batch ( #33934 )
...
Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io >
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
2026-02-07 05:24:57 -08:00
Sumanth R Hegde
ae2e93f89b
[Fix] Fix logprobs=0 handling for /inference/v1/generate endpoint ( #34010 )
...
Signed-off-by: SumanthRH <sumanthrh99@gmail.com >
2026-02-06 20:33:40 +00:00
Cyrus Leung
cd8b405bd0
[Refactor] Consolidate sequence normalization and enc-dec parsing ( #33928 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-02-06 15:43:47 +00:00
Harry Mellor
1887acca9e
Fix tokenizer test for renamed attr on Transformers v5 ( #33902 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2026-02-05 19:16:20 +00:00
Aaron Hao
c1858b7ec8
[Feat][RL][1/2] Native Weight Syncing API: NCCL ( #31943 )
...
Signed-off-by: ahao-anyscale <ahao@anyscale.com >
Signed-off-by: Aaron Hao <ahao@anyscale.com >
Co-authored-by: SumanthRH <sumanthrh99@gmail.com >
2026-02-05 12:13:23 -05:00
Andreas Karatzas
fb1270f1f8
[CI][Bugfix]: return McpCall for built-in MCP tools in non-streaming mode ( #32762 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com >
2026-02-05 11:14:06 +08:00
Andrew Xia
e1bf04b6c2
[1/N] Initial Implementation of Parser for ResponsesAPI ( #32712 )
...
Signed-off-by: Andrew Xia <axia@fb.com >
Co-authored-by: Andrew Xia <axia@fb.com >
2026-02-04 10:59:03 +08:00
Patrick von Platen
3f7662d650
[Voxtral Realtime] Change name ( #33716 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com >
2026-02-03 13:03:28 -08:00
Daniel Mescheder
4c4b6f7a97
[Frontend] Add sampling parameters to Responses API ( #32609 )
...
Signed-off-by: Daniel Mescheder <dmesch@amazon.com >
Co-authored-by: Daniel Mescheder <dmesch@amazon.com >
2026-02-03 13:51:10 +08:00
Patrick von Platen
5019c59dd2
[Voxtral Realtime] Introduce global log mel max ( #33574 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-02 17:01:47 -05:00
Harry Mellor
6141ebe0dd
Remove incorrect tokenizer info test ( #33565 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2026-02-02 17:11:44 +00:00
Cyrus Leung
f0a1c8453a
[Frontend] Use new Renderer for Completions and Tokenize API ( #32863 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2026-01-31 04:51:15 -08:00
Nicolò Lucchesi
8ece60768f
[CI] Qwen3-ASR transcriptios tests ( #33414 )
...
Signed-off-by: NickLucche <nlucches@redhat.com >
2026-01-30 16:17:56 +00:00
Harry Mellor
c5113f60f2
Remove deprecated reasoning_content message field ( #33402 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2026-01-30 11:48:15 +00:00