Aaron Hao
|
47a1f11bff
|
[docs] Add docs for new RL flows (#36188)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-18 09:04:26 +00:00 |
|
Ekagra Ranjan
|
b5ca9c3557
|
[Models] Cohere ASR (#35809)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
|
2026-03-17 21:04:17 +00:00 |
|
sfeiqiang
|
8cb24d3aed
|
[KV Connector] Support using FlexKV as KV Cache Offloading option. (#34328)
Signed-off-by: phaedonsun <phaedonsun@tencent.com>
Co-authored-by: phaedonsun <phaedonsun@tencent.com>
|
2026-03-12 00:46:20 -07:00 |
|
Hongxin Xu
|
bea02cdf93
|
Fix routed experts capture for hybrid models (Mamba + Attention) (#35744)
Signed-off-by: arlenxu <arlenxu@tencent.com>
Signed-off-by: xhx1022 <1737006628@qq.com>
Co-authored-by: arlenxu <arlenxu@tencent.com>
|
2026-03-11 08:53:10 -07:00 |
|
Silvia Colabrese
|
f33251ffc8
|
[Bugfix] Fix Mistral-small --format (#36782)
Signed-off-by: 12010486 <silvia.colabrese@intel.com>
|
2026-03-11 04:47:52 -07:00 |
|
tunglinwood
|
42fadebecb
|
[Model] Add support for moonshotai/Kimi-Audio-7B-Instruct (#36127)
Signed-off-by: tunglinwood <tunglinwood@gmail.com>
Signed-off-by: tunglinwood <tomwu.tunglin@gmail.com>
Signed-off-by: tunglinwood <113751333+tunglinwood@users.noreply.github.com>
|
2026-03-10 21:24:48 -07:00 |
|
wang.yuqi
|
dcf8862fd4
|
[Examples][1/n] Resettle basic examples. (#35579)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-08 20:22:53 -07:00 |
|
Cyrus Leung
|
de00ebeac4
|
[Bugfix] Fix simple Mistral-Small example (#36156)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-03-05 20:25:11 -08:00 |
|
Kunshang Ji
|
66a2209645
|
[Hardware] Replace torch.cuda.synchronize() api with torch.accelerator.synchronize (#36085)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-03-05 10:36:39 +00:00 |
|
Dr Alex Mitre
|
3417ba5648
|
docs: add README for logits_processor examples (#35933)
|
2026-03-04 17:09:19 +00:00 |
|
Kunshang Ji
|
16d2ad1d38
|
[Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-04 09:49:47 +00:00 |
|
Andreas Karatzas
|
f7da9cdffc
|
[ROCm][CI] Support async weight transfer example with platform-aware determinism (#35710)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-03-04 09:44:14 +08:00 |
|
Aaron Hao
|
cad21918e3
|
[BUG] Fix rlhf_async example (#35788)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-03-02 20:36:40 +00:00 |
|
Fynn Schmitt-Ulms
|
9433acb8df
|
[Spec Decode] Add hidden states extraction system (#33736)
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
|
2026-03-02 14:29:09 -05:00 |
|
Aaron Hao
|
2ce6f3cf67
|
[Feat][RL][2/2] Native Weight Syncing API: IPC (#34171)
Signed-off-by: hao-aaron <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-02-27 13:45:21 -07:00 |
|
Aaron Hao
|
596ed1f02e
|
[RL] Validation for pause_mode='keep' (#34992)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-02-23 16:30:56 -05:00 |
|
Nicolò Lucchesi
|
ab6f3487a6
|
[PD] Change kv_load_failure_policy Default from "recompute" to "fail" (#34896)
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-21 01:34:57 -08:00 |
|
Lucas Wilkinson
|
ba0511fd80
|
[Misc] Add run one batch script that supports profiling (#32968)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-02-10 18:29:49 -08:00 |
|
zzaebok
|
cbea11c9f0
|
[Docs] Fix format error in KV load failure recovery doc (#34137)
Signed-off-by: Jaebok Lee <jaebok9541@naver.com>
|
2026-02-10 02:16:26 -08:00 |
|
Cyrus Leung
|
2c32558a3c
|
[Bugfix] Fix --trust-remote-code conflict (#34218)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-10 00:29:10 -08:00 |
|
Cyrus Leung
|
25e48a3aae
|
[Doc] Update usage of --limit-mm-per-prompt (#34148)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-09 21:12:13 -08:00 |
|
Aaron Hao
|
89a385d79f
|
[Feat][RL] Pause and Resume with keep requests for single engine (#32351)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-02-07 00:08:58 +00:00 |
|
SorenDreano
|
6e7b1c4b59
|
[Docs] Improve documentation (#33799)
Co-authored-by: Soren Dreano <soren@numind.ai>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-02-06 12:57:09 +00:00 |
|
Benjamin Chislett
|
af3162d3aa
|
[Spec Decode] Unified Parallel Drafting (#32887)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-02-05 12:37:18 -05:00 |
|
Aaron Hao
|
c1858b7ec8
|
[Feat][RL][1/2] Native Weight Syncing API: NCCL (#31943)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aaron Hao <ahao@anyscale.com>
Co-authored-by: SumanthRH <sumanthrh99@gmail.com>
|
2026-02-05 12:13:23 -05:00 |
|
zxy
|
a3acfa1071
|
[Models] Intern-S1-Pro (#33636)
Signed-off-by: zxy <zhou0493@e.ntu.edu.sg>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-03 05:49:45 -08:00 |
|
Yang Liu
|
199e3cb476
|
[Model] Use mm_position to compute mrope positions for GLM-4.xV (#33039)
Signed-off-by: Yang <lymailforjob@gmail.com>
|
2026-02-02 16:55:48 +00:00 |
|
Isotr0py
|
4061dcf4c5
|
[Bugfix] Enable Kimi k25 processor test (#33562)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-02 14:25:25 +00:00 |
|
Komal Kumar Teru
|
ba871fb788
|
[Misc] support arbitrary MM datasets in spec dec bench (#33486)
Signed-off-by: kkt-cohere <komal@cohere.com>
Signed-off-by: Komal Kumar Teru <162363718+kkt-cohere@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2026-02-02 08:49:48 +00:00 |
|
RED
|
808dd87b30
|
[Model] Support DeepSeek-OCR-2 (#33165)
Signed-off-by: liuli <ll407707@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: liuli <ll407707@alibaba-inc.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-02-02 06:24:10 +00:00 |
|
Michael Goin
|
29fba76781
|
[UX] Use gguf repo_id:quant_type syntax for examples and docs (#33371)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-01-31 12:14:54 +08:00 |
|
hujiaxin0
|
ba45bedfd1
|
[model] Add support for openPangu7B-VL (#32449)
Signed-off-by: hujiaxin <524446785@qq.com>
Signed-off-by: Emilie1001 <79921183+Emilie1001@users.noreply.github.com>
Co-authored-by: Emilie1001 <79921183+Emilie1001@users.noreply.github.com>
|
2026-01-30 15:54:27 +08:00 |
|
Harry Mellor
|
9432ed8c7e
|
Explicitly set return_dict for apply_chat_template (#33372)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-01-30 07:27:04 +00:00 |
|
Ryan Rock
|
070c811d6f
|
[CI][AMD] Skip 4 GPUs testgroup ray tests (#33305)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
|
2026-01-29 21:39:53 -08:00 |
|
Wang Haoyu
|
c46b0cd0af
|
[Model][Multimodal] Add explicit MusicFlamingo adapter (#32696)
Signed-off-by: WangHaoyuuu <mailwhaoyu@gmail.com>
|
2026-01-30 11:01:29 +08:00 |
|
Roger Wang
|
8b3f0a99dd
|
[Models] Qwen3-ASR (#33312)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2026-01-29 19:27:15 +08:00 |
|
ramos
|
36d450e3b8
|
Adds FunAudioChat multimodal audio model support (#2) (#33058)
Signed-off-by: ramos <49182011+nemoramo@users.noreply.github.com>
Signed-off-by: mayufeng <mayufeng@example.com>
Co-authored-by: mayufeng <mayufeng@example.com>
|
2026-01-28 05:18:09 +00:00 |
|
Harry Mellor
|
2eb673a088
|
Add flake8-implicit-str-concat rules to Ruff (#33191)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-01-28 04:56:10 +00:00 |
|
Yuxuan Zhang
|
bb17e8f11c
|
[GLM-OCR] GLM-OCR with MTP Support (#33005)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-26 06:24:43 -08:00 |
|
Itay Etelis
|
6ca2c91b96
|
[Model] Use mm_position to compute mrope positions for Qwen3-Omni (#33010)
Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
Co-authored-by: Itay Etelis <itay.etelis@ibm.com>
|
2026-01-26 13:48:07 +00:00 |
|
ltd0924
|
b40db4dfec
|
[StepVL] add step vl offline example (#33054)
Signed-off-by: luotingdan <luotingdan@stepfun.com>
Co-authored-by: luotingdan <luotingdan@stepfun.com>
|
2026-01-26 01:00:32 -08:00 |
|
Itay Etelis
|
a698e8e7ad
|
[Model] Use mm_position to compute mrope positions for Qwen2.5-Omni (#32772)
Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
Co-authored-by: Itay Etelis <itay.etelis@ibm.com>
|
2026-01-25 20:15:53 +08:00 |
|
Robert Shaw
|
cea3c754c4
|
[Quantization][Deprecation] Remove DeepSpeedFp8 (#32679)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-01-21 09:32:12 -05:00 |
|
Kim Hee Su
|
7727ce35c2
|
[Model] Add Eagle2.5-8B Vision-Language Model support (#32456)
Signed-off-by: kimheesu <wlskaka4@gmail.com>
|
2026-01-21 09:39:53 +00:00 |
|
Tomas Ruiz
|
4a5299c93f
|
feat: spec decode with draft models (#24322)
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
|
2026-01-19 16:05:46 -05:00 |
|
wang.yuqi
|
c88860d759
|
[Frontend] Score entrypoint support data_1 & data_2 and queries & documents as inputs (#32577)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-01-19 14:07:46 +00:00 |
|
Isotr0py
|
38bf2ffb21
|
[Bugfix] Fix GLM-ASR audio encoder RoPE dim (#32540)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-18 19:17:59 +08:00 |
|
sangho.lee
|
7e6f123810
|
Add Molmo2 multimodal model support (#30997)
Signed-off-by: sanghol <sanghol@allenai.org>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-14 15:33:09 +08:00 |
|
Jaehyun An
|
6bc9c8473e
|
[MODEL] New model support for kakaocorp/kanana-1.5-v-3b-instruct (#29384)
Signed-off-by: Jaehyun An <steve.ai@kakaocorp.com>
|
2026-01-12 16:39:02 +00:00 |
|
Isotr0py
|
9dbe1fe960
|
[Bugfix] Fix missing scale passing for encoder Triton Attention implementation (#32149)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-12 11:13:41 +00:00 |
|