Yakine Tahtah
|
4e67a8f616
|
[Bugfix] Fix GLM-4 MoE router logits dtype for data parallel chunking (#31055)
Signed-off-by: ReinforcedKnowledge <reinforced.knowledge@gmail.com>
|
2026-01-06 17:57:56 +00:00 |
|
roikoren755
|
28c94770ad
|
[NemotronH] Use ReplicatedLinear for fc1_latent_proj (#31807)
Signed-off-by: Roi Koren <roik@nvidia.com>
|
2026-01-06 16:00:40 +00:00 |
|
wang.yuqi
|
96860af655
|
[Model] rename use_pad_token to use_sep_token (#31784)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-01-06 14:16:04 +00:00 |
|
Jzz1943
|
2c1a4f2488
|
[Bugfix]: avoid overriding audio/text kwargs (Qwen3-Omni) (#31790)
Signed-off-by: Zhongze Jiang <jiangzhongze.jzz@ant-intl.com>
|
2026-01-06 12:59:17 +00:00 |
|
Isotr0py
|
7101e0851f
|
[Models]: Use MMEncoderAttention for MoonViT (#31738)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: h100 <h100@inferact.ai>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: h100 <h100@inferact.ai>
|
2026-01-06 08:00:25 +00:00 |
|
Cyrus Leung
|
da71d44410
|
[Doc] Show that use_audio_in_video is supported in docs (#30837)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-05 23:27:19 -08:00 |
|
maang
|
cd1245a184
|
[Cleanup] Remove redundant decoder_layer_type assignment in Qwen2 (#31760)
Signed-off-by: maang <maang_h@163.com>
|
2026-01-05 18:09:18 -08:00 |
|
maang
|
d386ab1412
|
[Docs] Improve malformed exception caused by backslash line continuations (#31694)
Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-01-05 17:51:54 -08:00 |
|
Michael Goin
|
9513029898
|
[Bugfix] Properly apply v_scale for mimo_v2_flash (#31175)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-01-05 23:20:46 +00:00 |
|
amitz-nv
|
ee21291825
|
[Model] Nemotron Parse 1.1 Support (#30864)
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-01-05 13:00:14 -08:00 |
|
Wang Kunpeng
|
5708297e4e
|
[Misc][Model][Refactor] Pass the prefix into Linear layers (#31669)
Signed-off-by: Wang Kunpeng <1289706727@qq.com>
|
2026-01-05 20:03:18 +00:00 |
|
wang.yuqi
|
911d38ed99
|
[Model] Let more models to support the score template. (#31335)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2026-01-05 11:54:26 +00:00 |
|
Qiping Pan
|
a2ad15c070
|
[Model] Enable LoRA support for BLIP2 (#31620)
Signed-off-by: Qiping Pan <panqiping@outlook.com>
|
2026-01-05 08:02:24 +00:00 |
|
wang.yuqi
|
8be6432bda
|
[CI Failure] Fix NomicBert max_model_len validation (#31662)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-01-05 11:06:52 +08:00 |
|
jeremyteboul
|
97a01308e9
|
Improve HF qwen3_omni: preserve audio_sample_rate in kwargs restructuring (#29255)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
|
2026-01-03 04:31:09 +00:00 |
|
Jay Hemnani
|
5ac55eb30f
|
[Model] Enable LoRA support for tower and connector in LLaVA (#31513)
Signed-off-by: Jay Hemnani <jayhemnani9910@gmail.com>
Co-authored-by: Jay Hemnani <jayhemnani9910@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-01-01 19:32:39 -08:00 |
|
zhima771
|
27864a851c
|
feat: support LoRA for DeepSeek-OCR(Language Model part) (#31569)
Signed-off-by: zhima771 <15836938703@163.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2026-01-01 19:32:11 -08:00 |
|
Fanjiang Ye
|
d8da76f3b7
|
[Bugfix] Fix BAGEL online serving for text and image understanding (#31546)
Signed-off-by: Dylan1229 <yvanphys@gmail.com>
Signed-off-by: UED <zxr3611244710@gmail.com>
Signed-off-by: mr-ye-cao <yecaoyc2019@gmail.com>
Co-authored-by: UED <zxr3611244710@gmail.com>
Co-authored-by: mr-ye-cao <yecaoyc2019@gmail.com>
Co-authored-by: Mr-Ye-Cao <60802056+Mr-Ye-Cao@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-31 14:46:10 -08:00 |
|
baonudesifeizhai
|
d722e9e614
|
Add GLM-ASR multimodal support (#31436)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-31 23:12:24 +08:00 |
|
danisereb
|
108a2728f7
|
Add get_expert_mapping to NemotronHModel (for LoRA support) (#31539)
Signed-off-by: Daniel Serebrenik <daserebrenik@nvidia.com>
|
2025-12-30 21:09:03 -08:00 |
|
yt0428
|
3f52fa5aa2
|
[Model] Add support for openPangu moe model (#28775)
Signed-off-by: yuantao <2422264527@qq.com>
Signed-off-by: yt0428 <51468697+yt0428@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-30 08:11:38 -08:00 |
|
Roger Young
|
5bc664110f
|
Optimize QKNorm for MiniMax-M2/M2.1 (#31493)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
|
2025-12-29 16:30:18 +00:00 |
|
Harry Mellor
|
e37e7349e6
|
Replace nn.ConvNd with vLLM's ConvNdLayer for Transformers modeling backend (#31498)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-29 16:20:01 +00:00 |
|
Jee Jee Li
|
ce1eafd1a5
|
[Core] Initialize LoRA support for tower and connector in multi-modal models (#26674)
Signed-off-by: bk-201 <joy25810@foxmail.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
Co-authored-by: bk-201 <joy25810@foxmail.com>
Co-authored-by: prashanth058 <prashanth.dannamaneni@uipath.com>
Co-authored-by: Anexdeus <5142168@mail.ru>
|
2025-12-26 04:48:20 -08:00 |
|
Jee Jee Li
|
c3666f56fd
|
[Misc] Fix Qwen2-MoE shared_expert_gate (#31339)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-12-26 05:10:39 +00:00 |
|
Shinichi Hemmi
|
9ee05cbe7f
|
Support LoRA and GPTQModel for PLaMo 2/3 (#31322)
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
|
2025-12-26 11:41:33 +08:00 |
|
oscardev256
|
b7165d53c6
|
Feature/isaac 0.1 (#28367)
Signed-off-by: oscardev256 <42308241+oscardev256@users.noreply.github.com>
Signed-off-by: Oscar Gonzalez <ogonzal6@alumni.jh.edu>
Signed-off-by: Yang <lymailforjob@gmail.com>
Co-authored-by: Yang <lymailforjob@gmail.com>
|
2025-12-25 18:49:11 -08:00 |
|
SongHe
|
2d6001f491
|
[Model][Ernie4.5-VL] Support video metadata for timestamp rendering (#31274)
Signed-off-by: dengsonghe <dengsonghe@baidu.com>
Co-authored-by: dengsonghe <dengsonghe@baidu.com>
|
2025-12-25 14:07:15 +00:00 |
|
Cyrus Leung
|
09dc7c690c
|
[Chore][1/2] Drop v0.14 deprecations (#31285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-24 09:54:01 -08:00 |
|
wang.yuqi
|
bd89ce16d2
|
[Model] Introduce verify_and_update_model_config for VerifyAndUpdateConfig. (#31131)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-12-24 09:54:57 +00:00 |
|
Pleaplusone
|
b41aeb3468
|
[Bugfix][ROCm] Fix load issue on deepseek quark quantization when shared expert enabled (#31261)
Signed-off-by: ganyi <ygan@amd.com>
|
2025-12-24 16:47:44 +08:00 |
|
Xiong Wang
|
bb24592d13
|
[Qwen3-Omni] fixed _get_feat_extract_output_lengths function (#31007)
Signed-off-by: Xiong Wang <wangxiongts@163.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-12-23 21:33:54 -08:00 |
|
Andreas Karatzas
|
e42894f5b5
|
[ROCm][CI][Bugfix] Fix Siglip2 rotary embedding dispatch and InternVL video test tolerance (#31235)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2025-12-24 02:56:58 +00:00 |
|
Wentao Ye
|
76e6a95192
|
[Bug] Fix Number of dimensions of tensors must match. for Deepseek V3.2 (#31160)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-24 10:41:09 +08:00 |
|
Cyrus Leung
|
dd424571c8
|
[Bugfix] Enable dynamic_dims for different embeds shape (#31223)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-24 10:15:47 +08:00 |
|
Patrick von Platen
|
3faa8bee57
|
adapt voxtral (#31095)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
|
2025-12-23 05:31:55 -08:00 |
|
Harry Mellor
|
b10d47e0e0
|
Add util function for checking nesting of rope parameters (#31146)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-23 11:41:49 +00:00 |
|
Jakub Zakrzewski
|
23daef548d
|
[Frontend] Support using chat template as custom score template for reranking models (#30550)
Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2025-12-23 11:19:16 +00:00 |
|
Jee Jee Li
|
6b16fff01b
|
[Bugfix] Fix Jais2ForCausalLM (#31198)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-12-23 07:44:01 +00:00 |
|
dengyunyang
|
8f8f469b1b
|
[BugFix] skip language model in Encoder (#30242)
Signed-off-by: dengyunyang <584797741@qq.com>
|
2025-12-22 05:25:59 -08:00 |
|
Li Wang
|
256a33ecb4
|
[Model] Fix bagel failed to run (#31132)
Signed-off-by: wangli <wangli858794774@gmail.com>
|
2025-12-22 02:15:54 -08:00 |
|
Kevin McKay
|
14c3e6ade3
|
[Misc] Fix spelling typos in model comments (#31117)
Signed-off-by: c0de128 <kevin.mckay@outlook.com>
|
2025-12-21 21:14:14 -08:00 |
|
baonudesifeizhai
|
54c8924384
|
[MoE Refactor][5/N] Isolate zero expert to LongCatFlash (#28891)
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Signed-off-by: Dongjie Zou <85092850+baonudesifeizhai@users.noreply.github.com>
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robertgshaw2@gmail.com>
|
2025-12-20 18:22:04 +00:00 |
|
Yuxuan Zhang
|
8a7a414374
|
GLM-4.7 Tool Parser and Doc Update (#30876)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
|
2025-12-20 00:09:58 +00:00 |
|
Wentao Ye
|
4cf9429897
|
[Bug] Fix error 'Dynamo failed to run FX node with fake tensors for Deepseek V3.2 (#31046)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-19 23:31:31 +00:00 |
|
Zhonghua Deng
|
969bbc7c61
|
[Model] Add MiMo-V2-Flash support (#30836)
Signed-off-by: Abatom <abzhonghua@gmail.com>
Signed-off-by: Jumiar <liuanqim10@126.com>
Signed-off-by: Zyann7 <zyann7@outlook.com>
Co-authored-by: Jumiar <liuanqim10@126.com>
Co-authored-by: Zyann7 <zyann7@outlook.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-12-19 17:17:03 +00:00 |
|
Andreas Karatzas
|
7b43db210c
|
[ROCm][CI][Bugfix] Multi-Modal Model Support Fixes and Attention Backend Improvements (#30270)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2025-12-19 02:17:27 +00:00 |
|
Isotr0py
|
700a5ad6c6
|
[MM Encoder]: Migrate legacy ViT MultiHeadAttention to new MMEncoderAttention interface (#30684)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-19 02:04:19 +08:00 |
|
sarathc-cerebras
|
28d15ab56b
|
adds jais 2 support (#30188)
Signed-off-by: sarathc-cerebras <sarath.chandran@cerebras.net>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-12-18 15:46:58 +00:00 |
|
zzhxxx
|
b166ef20e1
|
[refactor] Add prefix support to embed_tokens in DeepSeek MTP (#30788)
Signed-off-by: zzhx1 <zzh_201018@outlook.com>
|
2025-12-18 04:45:56 +00:00 |
|