liuzhenwei
0c63739135
[EPD] update EPD script arguments (#36742)
...
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
2026-03-31 12:02:09 +00:00
Matej Rojec
2908094567
Add /v1/chat/completions/batch endpoint for batched chat completions (#38011)
...
Signed-off-by: Matej Rojec <64556640+MatejRojec@users.noreply.github.com>
2026-03-26 12:13:33 +08:00
Aaron Hao
47a1f11bff
[docs] Add docs for new RL flows (#36188)
...
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-18 09:04:26 +00:00
Isotr0py
a836524d20
[Chore] Replace all base64 usages with faster pybase64 package (#37290)
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-03-17 14:44:19 +00:00
rasmith
0024f39a32
[ROCm][P/D][MORI][BugFix] Add transfer_id for moriio_connector so moriio_connector to restore P/D functionality (#34907)
...
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
2026-03-16 10:36:51 +08:00
Kunshang Ji
53ec16a705
[Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145)
...
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-12 07:57:47 -07:00
wang.yuqi
dcf8862fd4
[Examples][1/n] Resettle basic examples. (#35579)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-08 20:22:53 -07:00
Harry Mellor
a0f44bb616
Allow markdownlint to run locally (#36398)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-08 20:05:24 -07:00
Jiayi Yan
6a895197fa
[Bugfix][CI] fix typos (#34934)
...
Signed-off-by: 1195343015 <1195343015@qq.com>
Signed-off-by: Jiayi Yan <66017932+1195343015@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-05 17:05:46 +00:00
Qi Wang
6aa6ad8992
[BugFix] Fix implicit and incorrect assumption on ECConnector is_producer (#34783)
...
Signed-off-by: Qi Wang <qiwa@nvidia.com>
2026-03-04 15:01:30 +01:00
Aaron Hao
2ce6f3cf67
[Feat][RL][2/2] Native Weight Syncing API: IPC (#34171)
...
Signed-off-by: hao-aaron <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
2026-02-27 13:45:21 -07:00
Tyler Michael Smith
eb19955c37
[WideEP] Remove pplx all2all backend (#33724)
...
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 14:30:10 -08:00
zhongdaor-nv
a0fe7ea2f0
[feat] Add per-block extra_keys to KV events (#33304)
...
Signed-off-by: zhongdaor-nv <zhongdaor@nvidia.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-02-20 20:11:40 -08:00
Vlad Tiberiu Mihailescu
e739c29ea4
[CI/Build] Add opentelemetry libs in default vllm build (requirements/common.txt) (#34466)
...
Signed-off-by: Vlad Mihailescu <vtmihailescu@gmail.com>
2026-02-20 19:54:55 -08:00
junuxyz
c61a98f529
[CI][BugFix] ShellCheck cleanup to remove baseline and preserve runtime behavior (#34514)
...
Signed-off-by: junuxyz <216036880+junuxyz@users.noreply.github.com>
2026-02-17 12:22:56 +00:00
ChenqianCao
ad65177a19
[Bugfix] Fix 'remove_instance_endpoint' method logic in disagg_proxy_demo (#32922)
...
Signed-off-by: ChenqianCao <39755070+ChenqianCao@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-17 10:06:53 +00:00
Aaron Hao
dddbff4624
[Core] Move pause and resume functions into engine (#34125)
...
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Signed-off-by: hao-aaron <ahao@anyscale.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
2026-02-13 00:15:10 -08:00
AllenDou
21dfb842d7
[model] support FunASR model (#33247)
...
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
2026-02-11 07:37:09 +00:00
Cyrus Leung
25e48a3aae
[Doc] Update usage of --limit-mm-per-prompt (#34148)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-09 21:12:13 -08:00
Aaron Hao
c1858b7ec8
[Feat][RL][1/2] Native Weight Syncing API: NCCL (#31943)
...
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: SumanthRH <sumanthrh99@gmail.com>
2026-02-05 12:13:23 -05:00
Patrick von Platen
3f7662d650
[Voxtral Realtime] Change name (#33716)
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
2026-02-03 13:03:28 -08:00
dtc
0d6ccf68fa
[P/D] rework mooncake connector and introduce its bootstrap server (#31034)
...
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
2026-02-03 08:08:25 -08:00
Cyrus Leung
f0a1c8453a
[Frontend] Use new Renderer for Completions and Tokenize API (#32863)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-31 04:51:15 -08:00
Patrick von Platen
10152d2194
[Realtime API] Adds minimal realtime API based on websockets (#33187)
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
2026-01-30 18:41:29 +08:00
Harry Mellor
9432ed8c7e
Explicitly set return_dict for apply_chat_template (#33372)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-01-30 07:27:04 +00:00
Mark McLoughlin
1cb4341fbc
[ROCm][PD] Remove unused moriio connector proxy code (#32939)
...
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2026-01-23 15:59:04 +00:00
Cyrus Leung
09194b90a5
[Doc] Update docs for MM model development with context usage (#32691)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-20 10:37:35 -08:00
杨朱 · Kiki
bb9172030e
[Metrics] Complete removal of deprecated vllm:time_per_output_token_seconds metric (#32661)
...
This PR completes the removal of the deprecated vllm:time_per_output_token_seconds
metric that was deprecated in v0.11, hidden in v0.12, scheduled for removal in v0.13,
but delayed until v0.15.
Signed-off-by: carlory <baofa.fan@daocloud.io>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-20 12:28:41 +00:00
Cyrus Leung
4753f3bf69
[Model] Use context managers for encoder- and LM-only mode (#32605)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-20 11:43:38 +08:00
Tomas Ruiz
4a5299c93f
feat: spec decode with draft models (#24322)
...
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
2026-01-19 16:05:46 -05:00
HappyAmazonian
2f4a71daf2
[Misc] Add In-Container restart capability through supervisord for sagemaker entrypoint (#28502)
...
Signed-off-by: Shen Teng <sheteng@amazon.com>
Signed-off-by: HappyAmazonian <91216626+HappyAmazonian@users.noreply.github.com>
2026-01-13 13:06:10 -08:00
Isotr0py
2d0c5b630e
[Doc] Remove hardcoded Whisper in example openai translation client (#32027)
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-09 14:44:52 +00:00
inkcherry
4505849b30
[ROCm][PD] add moriio kv connector. (#29304)
...
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
2026-01-09 14:01:57 +00:00
labAxiaoming
a01f2faedf
Add multimodal input method in the documentation (#31601)
...
Signed-off-by: xiaoming <1259730330@qq.com>
2026-01-02 12:43:30 +00:00
Ekagra Ranjan
adcf682fc7
[Audio] Improve Audio Inference Scripts (offline/online) (#29279)
...
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
2025-12-31 23:34:18 +00:00
Sage
39512aba72
[Prefix Cache] Include lora_name in BlockStored event for deterministic KV-cache reconstruction (#27577)
...
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Co-authored-by: Sage <80211083+sagiahrac@users.noreply.github.com>
2025-12-30 00:17:16 +00:00
dengyunyang
8f8f469b1b
[BugFix] skip language model in Encoder (#30242)
...
Signed-off-by: dengyunyang <584797741@qq.com>
2025-12-22 05:25:59 -08:00
Elizabeth Thomas
41b6f9200f
Remove all2all backend envvar (#30363)
...
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-18 19:46:28 +00:00
Didier Durand
1a55cfafcb
[Doc]: fixing typos in various files (#30540)
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Didier Durand <2927957+didier-durand@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-12-14 02:14:37 -08:00
Ev Lacey
d527cf0b3d
[FIX]Patch run-cluster.sh (fix for #28328) (#30002)
...
Signed-off-by: elacey <elacey@nvidia.com>
Signed-off-by: Ev Lacey <github@everettlacey.com>
2025-12-11 23:36:31 +00:00
Or Ozeri
4c6fd25880
kv_transfer: Rename the shared storage connectors (#30201)
...
Signed-off-by: Or Ozeri <oro@il.ibm.com>
2025-12-08 20:46:09 -08:00
Andrew Xia
da7bc54ea8
[responsesAPI][5] ResponsesParser with tools for full MCP python loop (#29798)
...
Signed-off-by: Andrew Xia <axia@fb.com>
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2025-12-05 11:11:50 -05:00
Cyrus Leung
9ae2f60374
[Misc] Various cleanups for MM input processing (#29970)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-04 06:22:20 +00:00
wang.yuqi
2eb4fe9129
[examples] Resettle pooling examples. (#29365)
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-02 15:54:28 +00:00
daniel-salib
014ece97c7
[Frontend] Add tool filtering support to ToolServer (#29224)
...
Signed-off-by: Daniel Salib <danielsalib@meta.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-12-01 08:03:57 +00:00
Roger Wang
0ff70821c9
[Core] Deprecate xformers (#29262)
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Josh Moore
c309bb5245
[Bugfix] Update Gradio OpenAI Chatbot Webserver example to new Gradio message history format (#29249)
...
Signed-off-by: joshiemoore <joshiemoore98@gmail.com>
2025-11-24 00:47:54 +00:00
Andrew Xia
742e9ff6b3
[responsesAPI] parse reasoning item input (#28248)
...
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-22 15:42:11 +08:00
Pan Li
e5bfcb6a88
[BugFix][PD]: make example proxy usable with P2pNcclConnector (#26628)
...
Signed-off-by: PAN <1162953505@qq.com>
2025-11-20 17:38:31 +00:00
Didier Durand
7ed27f3cb5
[Doc]: fix typos in various files (#28945)
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-18 22:52:30 -08:00