Aaron Hao
|
cad21918e3
|
[BUG] Fix rlhf_async example (#35788)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-03-02 20:36:40 +00:00 |
|
Fynn Schmitt-Ulms
|
9433acb8df
|
[Spec Decode] Add hidden states extraction system (#33736)
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
|
2026-03-02 14:29:09 -05:00 |
|
Aaron Hao
|
2ce6f3cf67
|
[Feat][RL][2/2] Native Weight Syncing API: IPC (#34171)
Signed-off-by: hao-aaron <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-02-27 13:45:21 -07:00 |
|
Tyler Michael Smith
|
eb19955c37
|
[WideEP] Remove pplx all2all backend (#33724)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-02-26 14:30:10 -08:00 |
|
Jakub Zakrzewski
|
111d869069
|
[Model] Add nvidia/llama-nemotron-embed-vl-1b-v2 multimodal embedding model (#35297)
Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>
|
2026-02-26 14:17:17 +00:00 |
|
Aaron Hao
|
596ed1f02e
|
[RL] Validation for pause_mode='keep' (#34992)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-02-23 16:30:56 -05:00 |
|
Athrael Soju
|
970861ac0c
|
[New Model] Add ColModernVBERT (#34558)
Signed-off-by: Athrael Soju <athrael.soju@gmail.com>
Signed-off-by: athrael-soju <athrael-soju@users.noreply.github.com>
|
2026-02-22 12:23:41 +08:00 |
|
Nicolò Lucchesi
|
ab6f3487a6
|
[PD] Change kv_load_failure_policy Default from "recompute" to "fail" (#34896)
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-21 01:34:57 -08:00 |
|
zhongdaor-nv
|
a0fe7ea2f0
|
[feat] Add per-block extra_keys to KV events (#33304)
Signed-off-by: zhongdaor-nv <zhongdaor@nvidia.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-20 20:11:40 -08:00 |
|
Kata Coder
|
5719a4e4e6
|
[Frontend] Support multimodal inputs for late-interaction scoring (ColQwen3) + NewModel: nvidia/nemotron-colembed (#34574)
Signed-off-by: craftsangjae <craftsangjae@gmail.com>
|
2026-02-20 20:01:40 -08:00 |
|
Vlad Tiberiu Mihailescu
|
e739c29ea4
|
[CI/Build] Add opentelemetry libs in default vllm build (requirements/common.txt) (#34466)
Signed-off-by: Vlad Mihailescu <vtmihailescu@gmail.com>
|
2026-02-20 19:54:55 -08:00 |
|
junuxyz
|
c61a98f529
|
[CI][BugFix] ShellCheck cleanup to remove baseline and preserve runtime behavior (#34514)
Signed-off-by: junuxyz <216036880+junuxyz@users.noreply.github.com>
|
2026-02-17 12:22:56 +00:00 |
|
ChenqianCao
|
ad65177a19
|
[Bugfix] Fix 'remove_instance_endpoint' method logic in disagg_proxy_demo (#32922)
Signed-off-by: ChenqianCao <39755070+ChenqianCao@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-02-17 10:06:53 +00:00 |
|
Christian Pinto
|
6930becd45
|
(bugfix): Fixed encode in LLM entrypoint for IOProcessr plugin prompts (#34618)
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
|
2026-02-16 07:33:55 -08:00 |
|
Christian Pinto
|
342a7cda2d
|
[Misc] Update tests and examples for Prithvi/Terratorch models (#34416)
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-13 23:03:51 -08:00 |
|
Kata Coder
|
d1ea65d0a1
|
[new model] add COLQwen3 code & Inference (#34398)
Signed-off-by: craftsangjae <craftsangjae@gmail.com>
Signed-off-by: katacoder <craftsangjae@gmail.com>
|
2026-02-14 12:15:19 +08:00 |
|
Ilya Boytsov
|
071d863e20
|
Extend ColBERT support to non-standard BERT backbones (#34170)
Signed-off-by: Ilya Boytsov <ilya.boytsov@aleph-alpha.com>
|
2026-02-13 09:53:09 +00:00 |
|
Aaron Hao
|
dddbff4624
|
[Core] Move pause and resume functions into engine (#34125)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Signed-off-by: hao-aaron <ahao@anyscale.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
|
2026-02-13 00:15:10 -08:00 |
|
AllenDou
|
21dfb842d7
|
[model] support FunASR model (#33247)
Signed-off-by: zixiao <shunli.dsl@alibaba-inc.com>
Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
|
2026-02-11 07:37:09 +00:00 |
|
Lucas Wilkinson
|
ba0511fd80
|
[Misc] Add run one batch script that supports profiling (#32968)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-02-10 18:29:49 -08:00 |
|
zzaebok
|
cbea11c9f0
|
[Docs] Fix format error in KV load failure recovery doc (#34137)
Signed-off-by: Jaebok Lee <jaebok9541@naver.com>
|
2026-02-10 02:16:26 -08:00 |
|
Cyrus Leung
|
2c32558a3c
|
[Bugfix] Fix --trust-remote-code conflict (#34218)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-10 00:29:10 -08:00 |
|
Cyrus Leung
|
25e48a3aae
|
[Doc] Update usage of --limit-mm-per-prompt (#34148)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-09 21:12:13 -08:00 |
|
wang.yuqi
|
22b64948f6
|
[Frontend][last/5] Make pooling entrypoints request schema consensus. (#31127)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-02-09 06:42:38 +00:00 |
|
Aaron Hao
|
89a385d79f
|
[Feat][RL] Pause and Resume with keep requests for single engine (#32351)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-02-07 00:08:58 +00:00 |
|
SorenDreano
|
6e7b1c4b59
|
[Docs] Improve documentation (#33799)
Co-authored-by: Soren Dreano <soren@numind.ai>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-02-06 12:57:09 +00:00 |
|
Benjamin Chislett
|
af3162d3aa
|
[Spec Decode] Unified Parallel Drafting (#32887)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-02-05 12:37:18 -05:00 |
|
Aaron Hao
|
c1858b7ec8
|
[Feat][RL][1/2] Native Weight Syncing API: NCCL (#31943)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: SumanthRH <sumanthrh99@gmail.com>
|
2026-02-05 12:13:23 -05:00 |
|
Ilya Boytsov
|
439afa4eea
|
feat: Add ColBERT late interaction model support (#33686)
Signed-off-by: Ilya Boytsov <ilyaboytsov1805@gmail.com>
Signed-off-by: Ilya Boytsov <boytsovpanamera@mail.ru>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-02-05 08:05:13 +08:00 |
|
wang.yuqi
|
1b8fe6f7c4
|
[Frontend][4/n] Make pooling entrypoints request schema consensus | ScoreRequest (#33060)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-02-04 01:48:40 +00:00 |
|
Patrick von Platen
|
3f7662d650
|
[Voxtral Realtime] Change name (#33716)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
|
2026-02-03 13:03:28 -08:00 |
|
dtc
|
0d6ccf68fa
|
[P/D] rework mooncake connector and introduce its bootstrap server (#31034)
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
|
2026-02-03 08:08:25 -08:00 |
|
zxy
|
a3acfa1071
|
[Models] Intern-S1-Pro (#33636)
Signed-off-by: zxy <zhou0493@e.ntu.edu.sg>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-03 05:49:45 -08:00 |
|
Cyrus Leung
|
83449a5ff0
|
[Refactor] Clean up pooling serial utils (#33665)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-03 10:29:18 +00:00 |
|
Yang Liu
|
199e3cb476
|
[Model] Use mm_position to compute mrope positions for GLM-4.xV (#33039)
Signed-off-by: Yang <lymailforjob@gmail.com>
|
2026-02-02 16:55:48 +00:00 |
|
Isotr0py
|
4061dcf4c5
|
[Bugfix] Enable Kimi k25 processor test (#33562)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-02 14:25:25 +00:00 |
|
Komal Kumar Teru
|
ba871fb788
|
[Misc] support arbitrary MM datasets in spec dec bench (#33486)
Signed-off-by: kkt-cohere <komal@cohere.com>
Signed-off-by: Komal Kumar Teru <162363718+kkt-cohere@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2026-02-02 08:49:48 +00:00 |
|
RED
|
808dd87b30
|
[Model] Support DeepSeek-OCR-2 (#33165)
Signed-off-by: liuli <ll407707@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: liuli <ll407707@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-02 06:24:10 +00:00 |
|
Cyrus Leung
|
f0a1c8453a
|
[Frontend] Use new Renderer for Completions and Tokenize API (#32863)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-31 04:51:15 -08:00 |
|
Michael Goin
|
29fba76781
|
[UX] Use gguf repo_id:quant_type syntax for examples and docs (#33371)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-01-31 12:14:54 +08:00 |
|
Isotr0py
|
9df152bbf6
|
[Misc] Algin Qwen3-VL-embedding image example outputs with HF repo example (#33419)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-30 19:36:56 -08:00 |
|
Patrick von Platen
|
10152d2194
|
[Realtime API] Adds minimal realtime API based on websockets (#33187)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
|
2026-01-30 18:41:29 +08:00 |
|
hujiaxin0
|
ba45bedfd1
|
[model] Add support for openPangu7B-VL (#32449)
Signed-off-by: hujiaxin <524446785@qq.com>
Signed-off-by: Emilie1001 <79921183+Emilie1001@users.noreply.github.com>
Co-authored-by: Emilie1001 <79921183+Emilie1001@users.noreply.github.com>
|
2026-01-30 15:54:27 +08:00 |
|
Harry Mellor
|
9432ed8c7e
|
Explicitly set return_dict for apply_chat_template (#33372)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-01-30 07:27:04 +00:00 |
|
Ryan Rock
|
070c811d6f
|
[CI][AMD] Skip 4 GPUs testgroup ray tests (#33305)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
|
2026-01-29 21:39:53 -08:00 |
|
Wang Haoyu
|
c46b0cd0af
|
[Model][Multimodal] Add explicit MusicFlamingo adapter (#32696)
Signed-off-by: WangHaoyuuu <mailwhaoyu@gmail.com>
|
2026-01-30 11:01:29 +08:00 |
|
Roger Wang
|
8b3f0a99dd
|
[Models] Qwen3-ASR (#33312)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2026-01-29 19:27:15 +08:00 |
|
wang.yuqi
|
abb34ac43a
|
[Bugfix] Fix Qwen3-VL-Reranker load. (#33298)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-01-29 08:42:53 +00:00 |
|
ramos
|
36d450e3b8
|
Adds FunAudioChat multimodal audio model support (#2) (#33058)
Signed-off-by: ramos <49182011+nemoramo@users.noreply.github.com>
Signed-off-by: mayufeng <mayufeng@example.com>
Co-authored-by: mayufeng <mayufeng@example.com>
|
2026-01-28 05:18:09 +00:00 |
|
Harry Mellor
|
2eb673a088
|
Add flake8-implicit-str-concat rules to Ruff (#33191)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-01-28 04:56:10 +00:00 |
|