Cyrus Leung
|
9fb52e523a
|
[V1] Support any head size for FlexAttention backend (#20467)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-06 09:54:36 -07:00 |
|
Thomas Parnell
|
2f35a022e6
|
Enable V1 for Hybrid SSM/Attention Models (#20016)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Stanislaw Wozniak <stw@zurich.ibm.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
|
2025-07-04 17:46:53 +00:00 |
|
CSWYF3634076
|
e303dcf523
|
[Model] Add Ernie4.5 and Ernie4.5MoE Model Support (#20220)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
|
2025-07-02 03:37:01 -07:00 |
|
Kwai-Keye
|
8452946c06
|
[Model][VLM] Support Keye-VL-8B-Preview (#20126)
Signed-off-by: Kwai-Keye <Keye@kuaishou.com>
|
2025-07-01 23:35:04 -07:00 |
|
aiyiwang2025
|
ecad851cbd
|
[Model]Add Tencent HunYuanMoEV1 Model Support (#20114)
Signed-off-by: aiyiwang <aiyiwang@tencent.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: quinnrong <quinnrong@tencent.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-01 07:28:13 -07:00 |
|
Yuxuan Zhang
|
ed70f3c64f
|
Add GLM4.1V model (Draft) (#19331)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-01 12:48:26 +00:00 |
|
redmoe-moutain
|
65b1cbb138
|
[Model] support dots1 (#18254)
Signed-off-by: redmoe-moutain <agiredmoe@gmail.com>
|
2025-06-29 19:34:36 -07:00 |
|
Michael Goin
|
c329ceca6d
|
[CI Fix] Pin tests/models/registry.py MiniMaxText01ForCausalLM to revision due to model changes (#20199)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-06-28 13:43:06 +08:00 |
|
Robert Shaw
|
d1c956dc0f
|
Gemma3n (Text-only) (#20134)
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Roger Wang <hey@rogerw.me>
|
2025-06-27 07:16:26 +00:00 |
|
cascade
|
e6327c9b3e
|
[Feature] Support sequence parallelism for static fp8 quantization (#19181)
Signed-off-by: cascade812 <cascade812@outlook.com>
|
2025-06-23 16:09:02 -04:00 |
|
汪志鹏
|
c3bf9bad11
|
[New model support]Support Tarsier2 (#19887)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
|
2025-06-21 04:01:51 +00:00 |
|
Adrian
|
f1e840e842
|
[Model] GPT2ForSequenceClassification model (#19663)
Signed-off-by: nie3e <adrcwiek@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-06-20 12:07:41 +00:00 |
|
Maximilien de Bayser
|
799397ee4f
|
Support embedding models in V1 (#16188)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-06-18 21:36:33 -07:00 |
|
qscqesze
|
387bdf0ab9
|
[Model] Add support for MiniMaxM1ForCausalLM (shares architecture with MiniMaxText01ForCausalLM) (#19677)
Signed-off-by: QscQ <qscqesze@gmail.com>
|
2025-06-16 09:47:14 -07:00 |
|
wang.yuqi
|
3952731e8f
|
[New Model]: Support Qwen3 Embedding & Reranker (#19260)
|
2025-06-10 20:07:30 -07:00 |
|
Luis Vega
|
cb6d572e85
|
[Model] NemotronH support (#18863)
Signed-off-by: Luis Vega <2478335+vegaluisjose@users.noreply.github.com>
Co-authored-by: Luis Vega <2478335+vegaluisjose@users.noreply.github.com>
|
2025-06-05 21:29:28 +00:00 |
|
Cyrus Leung
|
01dc9a76db
|
[CI/Build][Bugfix] Ensure compatibility with transformers 4.52 (#18678)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-06-04 04:49:20 -07:00 |
|
Simon Mo
|
02f0c7b220
|
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-06-03 11:20:17 -07:00 |
|
汪志鹏
|
1282bd812e
|
Add tarsier model support (#18985)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
|
2025-06-03 13:13:13 +08:00 |
|
Shawn Huang
|
e1fadf1197
|
[Feature] minicpm eagle support (#18943)
Signed-off-by: huangyuxiang03 <huangyx0321@gmail.com>
Co-authored-by: huangyuxiang03 <huangyx0321@gmail.com>
|
2025-05-30 06:45:56 -07:00 |
|
wang.yuqi
|
de65fc8e1e
|
[CI] improve embed testing (#18747)
|
2025-05-28 00:16:35 -07:00 |
|
Cyrus Leung
|
57fd13a707
|
[Bugfix] Fix profiling dummy data for Pixtral (#18677)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-25 14:05:30 +00:00 |
|
Isotr0py
|
75f81750f3
|
[VLM] Initialize video input support for InternVL models (#18499)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-25 04:51:25 +00:00 |
|
Yuanhao WU
|
a859320575
|
[Model] Add support for Qwen2.5-Omni-7B-AWQ (Qwen2_5OmniForConditionalGeneration) (#18647)
|
2025-05-24 09:15:36 +00:00 |
|
Michael Goin
|
0ddf88e16e
|
[CI] Enable test_initialization to run on V1 (#16736)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-05-23 15:09:44 -07:00 |
|
Harry Mellor
|
ca86a7cf6e
|
[CI/Build] Update bamba test model location (#18544)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-22 06:01:07 -07:00 |
|
Dhia Eddine Rhaiem
|
eca18691d2
|
[MODEL] FalconH1 (#18406)
Signed-off-by: dhia.rhaiem <dhia.rhaiem@tii.ae>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Ilyas Chahed <ilyas.chahed@tii.ae>
Co-authored-by: Jingwei Zuo <jingwei.zuo@tii.ae>
|
2025-05-21 04:59:06 -07:00 |
|
rongfu.leng
|
82e7f9bb03
|
[Misc] replace does not exist model (#18119)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-05-14 02:13:47 -07:00 |
|
wang.yuqi
|
63ad622233
|
[New Model]: support GTE NewModel (#17986)
|
2025-05-14 01:31:31 -07:00 |
|
bwshen-mi
|
acee8f48aa
|
[Model] Support MiMo-7B inference with MTP (#17433)
Signed-off-by: wp-alpha <wangpeng66@xiaomi.com>
Co-authored-by: wangpeng66 <wangpeng66@xiaomi.com>
|
2025-05-12 23:25:33 +00:00 |
|
Isotr0py
|
021c16c7ca
|
[Model] Broadcast Ovis2 implementation to fit Ovis1.6 (#17861)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-05-11 17:56:30 -07:00 |
|
Cyrus Leung
|
96722aa81d
|
[Frontend] Chat template fallbacks for multimodal models (#17805)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-07 23:05:54 -07:00 |
|
Stan Wozniak
|
999328be0d
|
[Model] Add GraniteMoeHybrid 4.0 model (#17497)
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
Signed-off-by: Stanislaw Wozniak <stw@zurich.ibm.com>
Co-authored-by: Thomas Ortner <boh@zurich.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
|
2025-05-06 12:00:31 +08:00 |
|
Isotr0py
|
f66f1e0fa3
|
[Bugfix] Fix broken Qwen2.5-omni tests (#17613)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-05-03 17:08:14 +00:00 |
|
Isotr0py
|
88c8304104
|
[Model] Refactor Ovis2 to support original tokenizer (#17537)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-05-01 11:00:53 -07:00 |
|
Marco
|
54072f315f
|
[MODEL ADDITION] Ovis2 Model Addition (#15826)
Signed-off-by: Marco <121761685+mlinmg@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-04-30 07:33:29 +00:00 |
|
Harry Mellor
|
7489ec0bab
|
Remove Bamba 9B from CI (#17407)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 21:10:31 +00:00 |
|
Harry Mellor
|
0350809f3a
|
Remove Falcon3 2x7B from CI (#17404)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 19:52:25 +00:00 |
|
Jee Jee Li
|
890f104cdf
|
[Doc] Fix QWen3MOE info (#17381)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-04-29 12:38:32 +00:00 |
|
qscqesze
|
cde384cd92
|
[Model] support MiniMax-VL-01 model (#16328)
Signed-off-by: qingjun <qingjun@minimaxi.com>
|
2025-04-29 12:05:50 +08:00 |
|
Alex Brooks
|
fa93cd9f60
|
[Model] Add Granite Speech Support (#16246)
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-04-28 10:05:00 +00:00 |
|
Cyrus Leung
|
909fdaf152
|
[Bugfix] Fix standard models tests (#17217)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-26 02:26:41 -07:00 |
|
Benjamin Chislett
|
a0e619e62a
|
[V1][Spec Decode] EAGLE-3 Support (#16937)
Signed-off-by: Bryan Lu <yuzhelu@amazon.com>
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Co-authored-by: Bryan Lu <yuzhelu@amazon.com>
|
2025-04-25 15:43:07 -07:00 |
|
Harry Mellor
|
0bd7f8fca5
|
Bump Transformers to 4.51.3 (#17116)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-25 08:34:34 -07:00 |
|
Yang Fan
|
2c1bd848a6
|
[Model][VLM] Add Qwen2.5-Omni model support (thinker only) (#15130)
Signed-off-by: fyabc <suyang.fy@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Xiong Wang <wangxiongts@163.com>
|
2025-04-18 23:14:36 -07:00 |
|
wang.yuqi
|
3d3ab3689f
|
[New Model]: Snowflake Arctic Embed (Family) (#16649)
|
2025-04-18 08:11:57 -07:00 |
|
intervitens
|
5b1aca2ae3
|
[Bugfix] Fix GLM4 model (#16618)
Signed-off-by: intervitens <intervitens@tutanota.com>
|
2025-04-17 03:35:07 -07:00 |
|
Isotr0py
|
cb072ce93b
|
[Bugfix] Update Florence-2 tokenizer to make grounding tasks work (#16734)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-04-17 04:17:39 +00:00 |
|
xsank
|
ee378f3d49
|
[Model] support modernbert (#16648)
Signed-off-by: 唯勤 <xsank.mz@alibaba-inc.com>
Co-authored-by: 唯勤 <xsank.mz@alibaba-inc.com>
|
2025-04-16 05:30:15 -07:00 |
|
Shinichi Hemmi
|
3badb0213b
|
[Model] Add PLaMo2 (#14323)
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
Signed-off-by: shemmi <shemmi@preferred.jp>
Co-authored-by: Kento Nozawa <nzw0301@preferred.jp>
Co-authored-by: Hiroaki Mikami <mhiroaki@preferred.jp>
Co-authored-by: Calvin Metzger <metzger@preferred.jp>
|
2025-04-15 19:31:30 -07:00 |
|