wang.yuqi
1ed963d43a
[Bugfix] Fix Qwen3-VL-Reranker load. (#33298)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
(cherry picked from commit abb34ac43a)
2026-02-02 00:13:12 -08:00
Chauncey
afb390ab02
[CI] Fix AssertionError: MCP tool call not found in output_messages (#33093)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
(cherry picked from commit a2393ed496)
2026-01-28 02:16:14 -08:00
Cyrus Leung
11b556878b
[Refactor] Use data parser for matching data items to multi-modal UUIDs (#32955)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-26 15:00:28 +08:00
sangbumlikeagod
9b77bb790d
[Frontend] add logprob, compression_rate to 'verbose_json' features (#31059)
...
Signed-off-by: sangbumlikeagod <oironese@naver.com>
Signed-off-by: sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>
2026-01-23 16:35:13 +00:00
wang.yuqi
05f3d714db
[Frontend][3/n] Make pooling entrypoints request schema consensus | EmbedRequest & ClassifyRequest (#32905)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-23 12:03:44 +00:00
Isotr0py
444e2e7e1f
[Misc] Bump opencv-python dependecy version to 4.13 (#32668)
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-22 15:51:15 +00:00
Cyrus Leung
d117a4d1a9
[Frontend] Introduce Renderer for processing chat messages (using ModelConfig) (#30200)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-22 12:44:22 +00:00
wang.yuqi
328cbb2773
[Frontend][2/n] Make pooling entrypoints request schema consensus | ChatRequest (#32574)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-22 10:32:44 +00:00
杨朱 · Kiki
bb9172030e
[Metrics] Complete removal of deprecated vllm:time_per_output_token_seconds metric (#32661)
...
This PR completes the removal of the deprecated vllm:time_per_output_token_seconds
metric that was deprecated in v0.11, hidden in v0.12, scheduled for removal in v0.13,
but delayed until v0.15.
Signed-off-by: carlory <baofa.fan@daocloud.io>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-20 12:28:41 +00:00
Jackmin801
12dab78f49
[Feat] allow inplace loading lora (#31326)
...
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2026-01-20 10:15:20 +08:00
wang.yuqi
c88860d759
[Frontend] Score entrypoint support data_1 & data_2 and queries & documents as inputs (#32577)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-19 14:07:46 +00:00
Nicolò Lucchesi
74c583bc50
[Core] Whisper support torch.compile (#30385)
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2026-01-19 10:02:31 +00:00
Hyunkyun Moon
3c8740aacb
[Frontend] Add render endpoints for prompt preprocessing (#32473)
...
Signed-off-by: HyunKyun Moon <mhg5303@gmail.com>
Signed-off-by: Hyunkyun Moon <mhg5303@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-19 12:21:46 +08:00
Alex Brooks
7518a3dc65
[CI/Build] Use Common Event Map Fixture in Harmony / MCP Server Tests (#32531)
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2026-01-19 04:05:51 +00:00
vanshil shah
037a6487af
apply _validate_input to MistralTokenizer token-id chat prompts (#32448)
...
Signed-off-by: Vanshil Shah <vanshilshah@gmail.com>
2026-01-17 03:23:45 +00:00
wang.yuqi
4ae77dfd42
[Frontend][1/n] Make pooling entrypoints request schema consensus | CompletionRequest (#32395)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-16 06:17:04 +00:00
cjackal
35bf5d08e8
[bugfix] Fix online serving crash when text type response_format is received (#26822)
...
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
Signed-off-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
Co-authored-by: j0shuajun <59368606+j0shuajun@users.noreply.github.com>
2026-01-16 12:23:54 +08:00
Micah Williamson
46f8a982b1
[ROCm][CI] Enable AITER Unified Attention On ROCm For gpt-oss Test (#32431)
...
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2026-01-16 00:55:57 +00:00
Cyrus Leung
28459785ff
[3/N] Group together media-related code (#32406)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-15 11:52:12 +00:00
Chauncey
707b44cc28
[Refactor] [11/N] to simplify the mcp architecture (#32396)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-01-15 18:49:31 +08:00
Chauncey
4c1c501a7e
[Refactor] [10/N] to simplify the vLLM openai completion serving architecture (#32369)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-01-15 07:41:34 +00:00
Aleksandr Samarin
d084e9fca7
[MODEL] Fix handling of multiple channels for gpt-oss with speculative decoding (#26291)
...
Signed-off-by: Aleksandr Samarin <astrlrd@nebius.com>
Signed-off-by: southfreebird <yvorott@gmail.com>
Co-authored-by: southfreebird <yvorott@gmail.com>
2026-01-14 13:20:52 -05:00
Chauncey
9312a6c03a
[Refactor] [8/N] to simplify the vLLM openai responsesapi_serving architecture (#32260)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-01-14 07:26:24 +00:00
Andrew Xia
af54d2e2d0
[responseAPI] support partial message generation (#32100)
...
Signed-off-by: Andrew Xia <axia@fb.com>
Signed-off-by: Andrew Xia <mitandrewxia@gmail.com>
Signed-off-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
Co-authored-by: Andrew Xia <axia@fb.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
2026-01-13 10:41:26 -08:00
Chauncey
4f02cb2eac
[Refactor] [7/N] to simplify the vLLM lora serving architecture (#32251)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-01-13 15:37:34 +00:00
Chauncey
fefce49807
[Refactor] [6/N] to simplify the vLLM openai chat_completion serving architecture (#32240)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2026-01-13 13:01:39 +00:00
Cyrus Leung
232214b2ae
[Bugfix] Replace PoolingParams.normalize with use_activation (#32243)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-13 10:45:42 +00:00
Andrew Xia
a307ac0734
[responsesAPI] add unit test for optional function tool call id (#32036)
...
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2026-01-12 16:14:54 -08:00
daniel-salib
d7b2e57097
[Frontend] Fix Flaky MCP Streaming Test (#32153)
...
Signed-off-by: Daniel Salib <danielsalib@meta.com>
2026-01-12 18:03:32 +08:00
Cyrus Leung
a374532111
[CI/Build] Separate out flaky responses API tests (#32110)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-11 05:01:12 -08:00
Andreas Karatzas
d83becd503
[ROCm][CI] Fix flaky test_function_calling_with_stream and reduce schema test examples (#32063)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-10 05:02:35 +00:00
Andrew Xia
1f8b7c536b
[responsesAPI] fix incomplete_messages for simple/parsable context (#31836)
...
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2026-01-09 21:00:57 +00:00
Andrew Xia
f32c629eb4
[Frontend][gpt-oss] Allow system message to overwrite model identity (#31737)
...
Signed-off-by: lacora <hyelacora@gmail.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: lacora <hyelacora@gmail.com>
Co-authored-by: Andrew Xia <axia@fb.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-01-09 14:03:57 -05:00
Andreas Karatzas
020732800c
[Bugfix] Fix OpenAPI schema test failures (#31921)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-09 10:56:20 +00:00
TJian
7a05d2dc65
[CI] [ROCm] Fix tests/entrypoints/test_grpc_server.py on ROCm (#31970)
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-09 12:54:20 +08:00
daniel-salib
a4ec0c5595
[Frontend] Add MCP tool streaming support to Responses API (#31761)
...
Signed-off-by: Daniel Salib <danielsalib@meta.com>
2026-01-09 09:19:34 +08:00
Cyrus Leung
aa125ecf0e
[Frontend] Improve error message (#31987)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-08 20:07:03 +00:00
Chang Su
791b2fc30a
[grpc] Support gRPC server entrypoint (#30190)
...
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Signed-off-by: njhill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: njhill <nickhill123@gmail.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2026-01-07 23:24:46 -08:00
wang.yuqi
96860af655
[Model] rename use_pad_token to use_sep_token (#31784)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-06 14:16:04 +00:00
Andreas Karatzas
4f9ce35afe
[CI][Bugfix] Fix token counting in chunked prefill compl test (#31630)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-03 14:28:49 +08:00
Andreas Karatzas
21de6d4b02
[CI][Bugfix] Fix token counting in chunked prefill streaming test (#31565)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-31 23:05:14 +00:00
Hojin Yang
dc837bc23e
feat(frontend): add --default-chat-template-kwargs CLI argument (#31343)
...
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
2025-12-30 03:38:47 +00:00
amittell
9c884faa95
[Bugfix] Preserve tool call id/type/name in streaming finish chunk (#31438)
...
Signed-off-by: amittell <mittell@me.com>
Signed-off-by: Alex Mittell <mittell@me.com>
2025-12-29 21:10:52 +08:00
Chauncey
48d5ca4e8b
[CI] fix test_chat_truncation_content_not_null test (#31488)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-12-29 12:47:08 +00:00
Andreas Karatzas
c79dbfa9ad
[CI] Fix flaky vision beam search test with flexible semantic validation (#31324)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-26 04:39:32 +00:00
Cyrus Leung
09dc7c690c
[Chore][1/2] Drop v0.14 deprecations (#31285)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 09:54:01 -08:00
wang.yuqi
1ff67df182
[CI] Reorganization pooling_mteb_test (#31265)
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-12-24 23:36:20 +08:00
Cyrus Leung
aa3868ecfe
[Chore] Remove unused noqas (#31263)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 05:38:46 -08:00
Andreas Karatzas
0247a91e00
[ROCm][CI] Fix entrypoints tests and Python-only installation test on ROCm (#28979)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-23 22:42:30 -08:00
Cyrus Leung
bb62dda2c3
[Misc] Introduce encode_*_url utility function (#31208)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-23 13:45:21 +00:00