Cyrus Leung
|
64ab3c7253
|
[Doc] Update V1 status of various pooling models (#23189)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-20 10:33:41 +08:00 |
|
Isotr0py
|
d6a1a20973
|
[CI/Build] Update transformers to v4.55.2 (#23093)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-19 10:06:17 -07:00 |
|
myselvess
|
b87cb97a53
|
[Model] support new model ovis2.5 (#23084)
Signed-off-by: myselvess <244285088@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-19 13:12:59 +00:00 |
|
wang.yuqi
|
f856c33ce9
|
[Model] Add multi_label_classification support (#23173)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-08-19 12:54:30 +00:00 |
|
Isotr0py
|
31fd3265c8
|
[Bugfix] Fix broken Minimax-01-VL model (#22116)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-19 08:49:29 +00:00 |
|
Woosuk Kwon
|
14006840ea
|
[V0 Deprecation] Remove V0 FlashInfer attention backend (#22776)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-08-18 19:54:16 -07:00 |
|
杨朱 · Kiki
|
569aefd134
|
chore: remove unnecessary patch_padding_side for the chatglm model (#23090)
Signed-off-by: carlory <baofa.fan@daocloud.io>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-08-18 12:32:13 +00:00 |
|
Cyrus Leung
|
27e8d1ea3e
|
[Refactor] Define MultiModalKwargsItems separate from MultiModalKwargs (#23053)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-18 09:52:00 +00:00 |
|
Michael Goin
|
4fc722eca4
|
[Kernel/Quant] Remove AQLM (#22943)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2025-08-16 19:38:21 +00:00 |
|
汪志鹏
|
829bbd7882
|
[New Model]mBART model (#22883)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
|
2025-08-16 12:16:58 +00:00 |
|
Isotr0py
|
2dbccce8a6
|
[CI][Bugfix] Skip Ovis2 generation test because of broken remote code (#22954)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-16 09:44:19 +00:00 |
|
Isotr0py
|
cc826a202b
|
[Multimodal] Update Tensor schema test to cover arbitrary shape mm inputs (#22867)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-16 00:44:50 -07:00 |
|
Thomas Parnell
|
75531a6c13
|
[V1] [Hybrid] Support using float32 for state in Hybrid Models (Mamba2, Mamba1, Minimax) (#22928)
Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Daniel Afrimi <danielafrimi8@gmail.com>
Co-authored-by: Burkhard Ringlein <ngl@zurich.ibm.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
|
2025-08-15 12:57:06 +00:00 |
|
amirai21
|
fe91ce9591
|
[V1] - Split Prefill and Decode for Mamba1 models (#22653)
Signed-off-by: amirk <amirk@ai21.com>
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
Co-authored-by: Asaf Joseph Gardin <39553475+Josephasafg@users.noreply.github.com>
|
2025-08-15 08:59:52 +00:00 |
|
wang.yuqi
|
5406ebf5c9
|
[CI] Pooling models mteb test uses enforce_eager (#22878)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-08-15 01:16:15 -07:00 |
|
Thomas Parnell
|
ab9f2cfd19
|
[CI] [Hybrid] Bump min transformers version for Bamba and Jamba (#22908)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-08-14 11:01:16 -07:00 |
|
Isotr0py
|
7c3a0741c6
|
[Bugfix] Fix PixtralHFImagePixelInputs dynamic shape check (#22827)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-14 02:35:43 -07:00 |
|
Cyrus Leung
|
0ca2393b47
|
[CI/Build] Increase pooling tolerance to pass CI (#22844)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-08-13 18:52:48 -04:00 |
|
Isotr0py
|
df0e0f023e
|
[CI/Build] Skip gpt_big model test because of broken HF model (#22848)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-13 20:36:28 +00:00 |
|
Cyrus Leung
|
c9232d41f4
|
[CI/Build] Update VLM common tests (#22841)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-13 10:03:05 -07:00 |
|
Duc-Viet Hoang
|
a01e0018b5
|
[Bugfix] Fix Nemotron VL image processing (#22739)
Co-authored-by: ducviet00-h2 <viet.d.hoang@h2corporation.jp>
|
2025-08-13 03:11:36 -07:00 |
|
Woosuk Kwon
|
71683ca6f6
|
[V0 Deprecation] Remove multi-step scheduling (#22138)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-08-12 20:18:39 -07:00 |
|
Harry Mellor
|
80bb1e8afe
|
Officially support SmolLM3 using the Transformers backend (#22665)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-12 05:38:48 -07:00 |
|
dongluw
|
9f909b8996
|
[New Model] Support Command-A-Vision (#22660)
Signed-off-by: donglu <donglu@cohere.com>
|
2025-08-12 01:39:54 -07:00 |
|
wang.yuqi
|
6d729c43fb
|
[Bugfix] Fix ModernBert load & Enable sliding window attention for bidirectional attention. (#22637)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-08-12 00:23:17 -07:00 |
|
22quinn
|
807d21b80d
|
[BugFix] [Spec Decode] Remove LlamaForCausalLMEagle3 to fix CI (#22611)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-08-11 10:31:36 -07:00 |
|
Isotr0py
|
c90fb03df5
|
[CI/Build] Skip Mllama HF runner tests with Transformers v4.55.0 (#22659)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-08-11 10:00:58 -07:00 |
|
wang.yuqi
|
84cf78acee
|
[Model] Pooling models default to using chunked prefill & prefix caching if supported. (#20930)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-08-11 09:41:37 -07:00 |
|
Maximilien de Bayser
|
39052dbca8
|
Support token_type_ids in V1 with less code changes (#21985)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-08-10 22:54:59 -07:00 |
|
Isotr0py
|
049c245143
|
[Misc] Replace flaky image urls in pixtral test (#22574)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-08-10 06:18:21 -07:00 |
|
Le Chen
|
3d7363e61c
|
[Config] add "qwen" as a native eagle3 target supported model (#22333)
Signed-off-by: lechen <lecself@163.com>
Signed-off-by: LeChen <lecself@163.com>
|
2025-08-09 20:21:05 -07:00 |
|
Thomas Parnell
|
61f67d8acd
|
[V1] [Hybrid] Enable Full CUDA Graph (decode-only) for Mamba layers (#21401)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-08-09 20:16:11 -07:00 |
|
Nicolò Lucchesi
|
5a16fa614c
|
[Model] Gemma3n MM (#20495)
Signed-off-by: ShriKode <shrikode@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: ShriKode <shrikode@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
|
2025-08-09 09:56:25 -07:00 |
|
Thomas Parnell
|
1bf5e1f25b
|
[CI] [Hybrid] Speed up hybrid models test by removing large models (#22563)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-08-09 02:04:42 -07:00 |
|
Yuxuan Zhang
|
a6022e6fbc
|
GLM-4.5V with new class name at transformers (#22520)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-09 00:50:21 -07:00 |
|
Isotr0py
|
7920e9b1c5
|
[Bugfix] Fix failing GPT-OSS initialization test (#22557)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-09 00:03:26 -07:00 |
|
Thomas Parnell
|
8a0ffd6285
|
Remove mamba_ssm from vLLM requirements; install inside test container using --no-build-isolation (#22541)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-08-08 23:05:32 -07:00 |
|
Isotr0py
|
429e4e2d42
|
[Bugfix] Fix ModernBert cuda graph capturing in v1 (#21901)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-08-08 22:17:22 -07:00 |
|
Harry Mellor
|
41b9655751
|
Skip Qwen 1 in CI because remote code is no longer compatible with Transformers (#22536)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-08 16:20:58 -07:00 |
|
Cyrus Leung
|
139d155781
|
[Frontend] Use engine argument to control MM cache size (#22441)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-07 09:47:10 -07:00 |
|
Cyrus Leung
|
766bc8162c
|
[Core] Store only the keys for multi-modal data in P0 (#22198)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-07 01:45:04 -07:00 |
|
wang.yuqi
|
2a4c825523
|
[CI] Skip the pooling models that do not support transformers v4.55 (#22411)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-08-06 23:05:03 -07:00 |
|
tc-mb
|
41b67f4263
|
[model] Support MiniCPM-V 4.0 (#22166)
Co-authored-by: imning3 <hbning@pku.edu.cn>
|
2025-08-06 18:35:46 -07:00 |
|
Asaf Joseph Gardin
|
46a13949d5
|
[v1] - Mamba1 Attention Metadata (#21249)
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
|
2025-08-06 17:03:42 -07:00 |
|
Isotr0py
|
fa00c5d75b
|
[Misc] Clean up duplicated hf overrides (#22311)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-08-06 07:50:25 +00:00 |
|
Woosuk Kwon
|
de98252f49
|
Add GPT-OSS model code and config [1/N] (#22327)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-08-05 23:26:00 -07:00 |
|
Harry Mellor
|
796bae07c5
|
Update transformers to v4.55 (#21931)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-08-05 22:56:14 -07:00 |
|
wang.yuqi
|
586f286789
|
[Model] Pooling model activation supports per request control by PoolingParams (#20538)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-08-05 00:37:00 -07:00 |
|
Yuxuan Zhang
|
6fa41e0c32
|
self.gate dtype update for GLM-4.5 (#22203)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
|
2025-08-04 19:12:38 -07:00 |
|
Raghav Ravishankar
|
a5fff3bd49
|
Fix Arcee model weight loading: Add custom load_weights (#21725)
Signed-off-by: alyosha-swamy <raghav@arcee.ai>
|
2025-08-04 04:09:56 -07:00 |
|