Aaron Hao
|
596ed1f02e
|
[RL] Validation for pause_mode='keep' (#34992)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-02-23 16:30:56 -05:00 |
|
Nicolò Lucchesi
|
ab6f3487a6
|
[PD] Change kv_load_failure_policy Default from "recompute" to "fail" (#34896)
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-21 01:34:57 -08:00 |
|
Lucas Wilkinson
|
ba0511fd80
|
[Misc] Add run one batch script that supports profiling (#32968)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-02-10 18:29:49 -08:00 |
|
zzaebok
|
cbea11c9f0
|
[Docs] Fix format error in KV load failure recovery doc (#34137)
Signed-off-by: Jaebok Lee <jaebok9541@naver.com>
|
2026-02-10 02:16:26 -08:00 |
|
Cyrus Leung
|
2c32558a3c
|
[Bugfix] Fix --trust-remote-code conflict (#34218)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-10 00:29:10 -08:00 |
|
Cyrus Leung
|
25e48a3aae
|
[Doc] Update usage of --limit-mm-per-prompt (#34148)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-09 21:12:13 -08:00 |
|
Aaron Hao
|
89a385d79f
|
[Feat][RL] Pause and Resume with keep requests for single engine (#32351)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-02-07 00:08:58 +00:00 |
|
SorenDreano
|
6e7b1c4b59
|
[Docs] Improve documentation (#33799)
Co-authored-by: Soren Dreano <soren@numind.ai>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-02-06 12:57:09 +00:00 |
|
Benjamin Chislett
|
af3162d3aa
|
[Spec Decode] Unified Parallel Drafting (#32887)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-02-05 12:37:18 -05:00 |
|
Aaron Hao
|
c1858b7ec8
|
[Feat][RL][1/2] Native Weight Syncing API: NCCL (#31943)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: SumanthRH <sumanthrh99@gmail.com>
|
2026-02-05 12:13:23 -05:00 |
|
zxy
|
a3acfa1071
|
[Models] Intern-S1-Pro (#33636)
Signed-off-by: zxy <zhou0493@e.ntu.edu.sg>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-03 05:49:45 -08:00 |
|
Yang Liu
|
199e3cb476
|
[Model] Use mm_position to compute mrope positions for GLM-4.xV (#33039)
Signed-off-by: Yang <lymailforjob@gmail.com>
|
2026-02-02 16:55:48 +00:00 |
|
Isotr0py
|
4061dcf4c5
|
[Bugfix] Enable Kimi k25 processor test (#33562)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-02 14:25:25 +00:00 |
|
Komal Kumar Teru
|
ba871fb788
|
[Misc] support arbitrary MM datasets in spec dec bench (#33486)
Signed-off-by: kkt-cohere <komal@cohere.com>
Signed-off-by: Komal Kumar Teru <162363718+kkt-cohere@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2026-02-02 08:49:48 +00:00 |
|
RED
|
808dd87b30
|
[Model] Support DeepSeek-OCR-2 (#33165)
Signed-off-by: liuli <ll407707@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: liuli <ll407707@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-02 06:24:10 +00:00 |
|
Michael Goin
|
29fba76781
|
[UX] Use gguf repo_id:quant_type syntax for examples and docs (#33371)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-01-31 12:14:54 +08:00 |
|
hujiaxin0
|
ba45bedfd1
|
[model] Add support for openPangu7B-VL (#32449)
Signed-off-by: hujiaxin <524446785@qq.com>
Signed-off-by: Emilie1001 <79921183+Emilie1001@users.noreply.github.com>
Co-authored-by: Emilie1001 <79921183+Emilie1001@users.noreply.github.com>
|
2026-01-30 15:54:27 +08:00 |
|
Harry Mellor
|
9432ed8c7e
|
Explicitly set return_dict for apply_chat_template (#33372)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-01-30 07:27:04 +00:00 |
|
Ryan Rock
|
070c811d6f
|
[CI][AMD] Skip 4 GPUs testgroup ray tests (#33305)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
|
2026-01-29 21:39:53 -08:00 |
|
Wang Haoyu
|
c46b0cd0af
|
[Model][Multimodal] Add explicit MusicFlamingo adapter (#32696)
Signed-off-by: WangHaoyuuu <mailwhaoyu@gmail.com>
|
2026-01-30 11:01:29 +08:00 |
|
Roger Wang
|
8b3f0a99dd
|
[Models] Qwen3-ASR (#33312)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2026-01-29 19:27:15 +08:00 |
|
ramos
|
36d450e3b8
|
Adds FunAudioChat multimodal audio model support (#2) (#33058)
Signed-off-by: ramos <49182011+nemoramo@users.noreply.github.com>
Signed-off-by: mayufeng <mayufeng@example.com>
Co-authored-by: mayufeng <mayufeng@example.com>
|
2026-01-28 05:18:09 +00:00 |
|
Harry Mellor
|
2eb673a088
|
Add flake8-implicit-str-concat rules to Ruff (#33191)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-01-28 04:56:10 +00:00 |
|
Yuxuan Zhang
|
bb17e8f11c
|
[GLM-OCR] GLM-OCR with MTP Support (#33005)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-26 06:24:43 -08:00 |
|
Itay Etelis
|
6ca2c91b96
|
[Model] Use mm_position to compute mrope positions for Qwen3-Omni (#33010)
Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
Co-authored-by: Itay Etelis <itay.etelis@ibm.com>
|
2026-01-26 13:48:07 +00:00 |
|
ltd0924
|
b40db4dfec
|
[StepVL] add step vl offline example (#33054)
Signed-off-by: luotingdan <luotingdan@stepfun.com>
Co-authored-by: luotingdan <luotingdan@stepfun.com>
|
2026-01-26 01:00:32 -08:00 |
|
Itay Etelis
|
a698e8e7ad
|
[Model] Use mm_position to compute mrope positions for Qwen2.5-Omni (#32772)
Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
Co-authored-by: Itay Etelis <itay.etelis@ibm.com>
|
2026-01-25 20:15:53 +08:00 |
|
Robert Shaw
|
cea3c754c4
|
[Quantization][Deprecation] Remove DeepSpeedFp8 (#32679)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-01-21 09:32:12 -05:00 |
|
Kim Hee Su
|
7727ce35c2
|
[Model] Add Eagle2.5-8B Vision-Language Model support (#32456)
Signed-off-by: kimheesu <wlskaka4@gmail.com>
|
2026-01-21 09:39:53 +00:00 |
|
Tomas Ruiz
|
4a5299c93f
|
feat: spec decode with draft models (#24322)
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
|
2026-01-19 16:05:46 -05:00 |
|
wang.yuqi
|
c88860d759
|
[Frontend] Score entrypoint support data_1 & data_2 and queries & documents as inputs (#32577)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-01-19 14:07:46 +00:00 |
|
Isotr0py
|
38bf2ffb21
|
[Bugfix] Fix GLM-ASR audio encoder RoPE dim (#32540)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-18 19:17:59 +08:00 |
|
sangho.lee
|
7e6f123810
|
Add Molmo2 multimodal model support (#30997)
Signed-off-by: sanghol <sanghol@allenai.org>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-14 15:33:09 +08:00 |
|
Jaehyun An
|
6bc9c8473e
|
[MODEL] New model support for kakaocorp/kanana-1.5-v-3b-instruct (#29384)
Signed-off-by: Jaehyun An <steve.ai@kakaocorp.com>
|
2026-01-12 16:39:02 +00:00 |
|
Isotr0py
|
9dbe1fe960
|
[Bugfix] Fix missing scale passing for encoder Triton Attention implementation (#32149)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-12 11:13:41 +00:00 |
|
Ning Xie
|
d74132ca3b
|
fix offline inference chat response prompt (#32088)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2026-01-11 14:01:18 +00:00 |
|
Ning Xie
|
14fc7a68c7
|
[Bugfix] fix offline chat output prompt (#32076)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2026-01-10 07:50:57 +00:00 |
|
Matthew Bonanni
|
2612ba9285
|
[1/N][Attention] Restructure attention: move files (#31916)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-01-09 13:10:24 -08:00 |
|
tianshu-Michael-yu
|
03fd76c570
|
[Model] Add LFM2-VL model support (#31758)
Signed-off-by: Tianshu Yu <tianshuyu.formal@gmail.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2026-01-08 05:00:27 -08:00 |
|
Cyrus Leung
|
da71d44410
|
[Doc] Show that use_audio_in_video is supported in docs (#30837)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-05 23:27:19 -08:00 |
|
Ekagra Ranjan
|
adcf682fc7
|
[Audio] Improve Audio Inference Scripts (offline/online) (#29279)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
|
2025-12-31 23:34:18 +00:00 |
|
baonudesifeizhai
|
d722e9e614
|
Add GLM-ASR multimodal support (#31436)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-31 23:12:24 +08:00 |
|
Isotr0py
|
40a8756224
|
[Chore]: Remove HF format Phi4-MM examples (#31405)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-27 13:42:02 +00:00 |
|
Lucas Wilkinson
|
7e065eba59
|
[CI] Fix "2 Node Tests (4 GPUs in total)" (#31090)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-12-22 10:32:40 +08:00 |
|
Lucas Wilkinson
|
ae0770fa6b
|
[CI] Fix H200 Distributed test (#31054)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-12-20 16:48:49 -05:00 |
|
汪志鹏
|
1adeb3b84c
|
[New Model] BAGEL support (AR only) (#28439)
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-15 14:58:23 +08:00 |
|
Lasha Koroshinadze
|
3a20450d31
|
Add AudioFlamingo3 model support (#30539)
Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com>
Signed-off-by: Lasha Koroshinadze <26011196+lashahub@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-14 02:14:55 -08:00 |
|
Ryan Rock
|
197473c4e7
|
[CI/Build] Use spawn subprocess for ROCm (#30272)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
|
2025-12-12 03:33:17 +00:00 |
|
Concurrensee
|
2cc5affc38
|
[ROCM][CI] Fix AMD Examples Test Group (#30276)
Signed-off-by: Yida Wu <yida.wu@amd.com>
Signed-off-by: Yida <yida.wu@amd.com>
|
2025-12-11 18:03:54 -05:00 |
|
Cyrus Leung
|
7e24e5d4d6
|
[Deprecation] Remove deprecated task, seed and MM settings (#30397)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-10 19:59:39 -08:00 |
|