Simon Mo
|
02f0c7b220
|
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-06-03 11:20:17 -07:00 |
|
Isotr0py
|
ec2dcd80bc
|
[Misc] Update WeightsMapper for qwen2-vl/qwen2.5-vl (#19054)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-06-03 09:08:20 +00:00 |
|
汪志鹏
|
1282bd812e
|
Add tarsier model support (#18985)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
|
2025-06-03 13:13:13 +08:00 |
|
jennyyyyzhen
|
ebb1ec9318
|
[Model] enable data parallel for Llama4 vision encoder (#18368)
Signed-off-by: yzhen <yzhen@devgpu093.cco2.facebook.com>
Co-authored-by: yZhen <yZhen@fb.com>
Co-authored-by: yzhen <yzhen@devgpu093.cco2.facebook.com>
|
2025-06-02 19:22:54 +08:00 |
|
Isotr0py
|
a35ca765a5
|
[LoRA] Support dynamically initialize packed_modules_mapping for VLM with arbitrary components (#18987)
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-06-01 11:06:57 +08:00 |
|
Benjamin Chislett
|
1bc86a3da1
|
[Bugfix] Fix EAGLE3 broken logits (#18909)
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
|
2025-05-31 19:58:07 -07:00 |
|
Isotr0py
|
5a8641638a
|
[VLM] Add PP support and fix GPTQ inference for Ovis models (#18958)
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-05-30 17:11:44 +00:00 |
|
Shawn Huang
|
e1fadf1197
|
[Feature] minicpm eagle support (#18943)
Signed-off-by: huangyuxiang03 <huangyx0321@gmail.com>
Co-authored-by: huangyuxiang03 <huangyx0321@gmail.com>
|
2025-05-30 06:45:56 -07:00 |
|
Lukas Geiger
|
c3bb9f2331
|
[Model] Use in-place adds in SigLIP (#18922)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-05-30 17:12:59 +08:00 |
|
Cyrus Leung
|
4f4a6b844a
|
[Deprecation] Remove mean pooling default for Qwen2EmbeddingModel (#18913)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-30 06:53:37 +00:00 |
|
iLeGend
|
3987e2ae96
|
[Model] Use AutoWeightsLoader for mamba2 (#18918)
Signed-off-by: iLeGend <824040212@qq.com>
|
2025-05-30 04:50:10 +00:00 |
|
Cyrus Leung
|
1aa2f81b43
|
[Misc] Update type annotation for rotary embedding base (#18914)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-30 10:17:01 +08:00 |
|
Jee Jee Li
|
34d6c447c4
|
[LoRA] Add LoRA support for InternVL (#18842)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-05-29 08:46:24 +00:00 |
|
Maximilien de Bayser
|
515b413ebf
|
Prevent the cross-encoder logic from being applied to classification tasks (#18838)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-28 19:16:17 -07:00 |
|
rongfu.leng
|
c68b5c63eb
|
[Misc] fix olmoe model layer can't laod in tp gt 1 (#18828)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-05-28 17:36:21 +00:00 |
|
wang.yuqi
|
3e9ce609bd
|
[Bugfix] Fix nomic max_model_len (#18755)
|
2025-05-27 20:29:53 -07:00 |
|
Cyrus Leung
|
696259ca01
|
[Core] Automatically cast multi-modal input dtype (#18756)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-27 23:45:48 +08:00 |
|
Hyogeun Oh (오효근)
|
a68e293cb9
|
[Doc] Convert Sphinx directives ( {class}, {meth}, {attr}, ...) to MkDocs format for better documentation linking (#18663)
Signed-off-by: Zerohertz <ohg3417@gmail.com>
|
2025-05-27 01:44:20 -07:00 |
|
Shawn Huang
|
6881107948
|
[BUG FIX] minicpm (#18739)
Signed-off-by: huangyuxiang03 <huangyx0321@gmail.com>
Co-authored-by: huangyuxiang03 <huangyx0321@gmail.com>
|
2025-05-27 01:04:49 -07:00 |
|
Lukas Geiger
|
b50602d5f0
|
[Model][Gemma3] Cast image pixel values already on CPU (#18732)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-05-27 05:42:54 +00:00 |
|
Lukas Geiger
|
0eebd74842
|
[Model][Gemma3] Simplify image input validation (#18710)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-05-27 11:13:37 +08:00 |
|
Cyrus Leung
|
a869baca73
|
[Bugfix] Fix Llama GGUF initialization (#18717)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 07:49:22 -07:00 |
|
Naveassaf
|
6d68030f1c
|
[Model] Add support for YARN in NemotronNAS models (#18427)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
|
2025-05-26 10:31:49 +00:00 |
|
Maximilien de Bayser
|
561b77a0d6
|
[Bugfix] Fix the lm_head in gpt_bigcode in lora mode (#6357)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
|
2025-05-26 14:52:25 +08:00 |
|
Cyrus Leung
|
57fd13a707
|
[Bugfix] Fix profiling dummy data for Pixtral (#18677)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-25 14:05:30 +00:00 |
|
Isotr0py
|
75f81750f3
|
[VLM] Initialize video input support for InternVL models (#18499)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-25 04:51:25 +00:00 |
|
ztang2370
|
2cd4d58df4
|
[Model] use AutoWeightsLoader for gpt2 (#18625)
Signed-off-by: zt2370 <ztang2370@gmail.com>
|
2025-05-24 13:36:13 +00:00 |
|
Yuanhao WU
|
a859320575
|
[Model] Add support for Qwen2.5-Omni-7B-AWQ (Qwen2_5OmniForConditionalGeneration) (#18647)
|
2025-05-24 09:15:36 +00:00 |
|
Feng XiaoLong
|
4fc1bf813a
|
[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454)
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com>
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>
|
2025-05-23 16:16:26 -07:00 |
|
Michael Goin
|
0ddf88e16e
|
[CI] Enable test_initialization to run on V1 (#16736)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-05-23 15:09:44 -07:00 |
|
Jiayi Yao
|
2628a69e35
|
[V1] Support Deepseek MTP (#18435)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Co-authored-by: Rui Qiao <ruisearch42@gmail.com>
|
2025-05-23 10:26:28 -07:00 |
|
Harry Mellor
|
2edb533af2
|
Replace {func} with mkdocs style links (#18610)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-23 05:51:38 -07:00 |
|
Mengqing Cao
|
4ce64e2df4
|
[Bugfix][Model] Fix baichuan model loader for tp (#18597)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2025-05-23 02:39:05 -07:00 |
|
Harry Mellor
|
a1fe24d961
|
Migrate docs from Sphinx to MkDocs (#18145)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-23 02:09:53 -07:00 |
|
Benjamin Chislett
|
583507d130
|
[Spec Decode] Make EAGLE3 draft token ID mapping optional (#18488)
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-05-22 20:17:39 -07:00 |
|
Harry Mellor
|
4b0da7b60e
|
Enable hybrid attention models for Transformers backend (#18494)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-23 10:12:08 +08:00 |
|
Mark McLoughlin
|
c6b636f9fb
|
[V1][Spec Decoding] Use model_loader.get_model() to load models (#18273)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-05-23 02:05:44 +00:00 |
|
Chenheli Hua
|
04eb88dc80
|
Re-submit: Fix: Proper RGBA -> RGB conversion for PIL images. (#18569)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
|
2025-05-23 01:59:18 +00:00 |
|
Shane A
|
51797775c3
|
[Bugfix][Model] Make Olmo2Model weight loading return loaded weights (#18504)
Signed-off-by: Shane A <shanea@allenai.org>
|
2025-05-21 21:17:03 -07:00 |
|
youngrok cha
|
d022115cc6
|
[Bugfix] Inconsistent token calculation compared to HF in llava family (#18479)
Signed-off-by: jaycha <jaycha@ncsoft.com>
|
2025-05-21 20:21:47 -07:00 |
|
Dhia Eddine Rhaiem
|
20bd6f4d2e
|
[FalconH1] Fix output dtype in RMSNorm fallback path for Falcon-H1 (e.g. 0.5B) (#18500)
Signed-off-by: dhia.rhaiem <dhia.rhaiem@tii.ae>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Ilyas Chahed <ilyas.chahed@tii.ae>
Co-authored-by: Jingwei Zuo <jingwei.zuo@tii.ae>
|
2025-05-21 19:23:59 -07:00 |
|
Dhia Eddine Rhaiem
|
eca18691d2
|
[MODEL] FalconH1 (#18406)
Signed-off-by: dhia.rhaiem <dhia.rhaiem@tii.ae>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Ilyas Chahed <ilyas.chahed@tii.ae>
Co-authored-by: Jingwei Zuo <jingwei.zuo@tii.ae>
|
2025-05-21 04:59:06 -07:00 |
|
Gregory Shtrasberg
|
0c15c2e486
|
[Bugfix] config.head_dim is now explicitly set to None (#18432)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-05-20 21:04:33 -07:00 |
|
Calvin Chen
|
e1f5a71ed7
|
[Model] use AutoWeightsLoader for bloom (#18300)
Signed-off-by: calvin chen <120380290@qq.com>
|
2025-05-20 09:40:05 -07:00 |
|
Isotr0py
|
f07a673eb2
|
[Misc] Allow AutoWeightsLoader to skip loading weights with specific substr in name (#18358)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-05-19 20:20:12 -07:00 |
|
wwl2755
|
9da1095daf
|
[Spec Decode][V0] Fix spec decode correctness test in V0 eagle/medusa (#18175)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-05-18 19:49:46 -07:00 |
|
Lifu Huang
|
4fb349f66a
|
Fix copy-paste error in phi4mm image processing (#18315)
Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>
|
2025-05-18 07:00:12 -07:00 |
|
rongfu.leng
|
9214e60631
|
[Model] use AutoWeightsLoader for solar (#18113)
|
2025-05-17 00:24:17 -07:00 |
|
learner0810
|
87d871470d
|
[Model] Use autoweightloader for dbrx (#18251)
Signed-off-by: learner0810 <zhongjun.li@daocloud.io>
|
2025-05-16 07:54:13 -07:00 |
|
Vadim Gimpelson
|
67da5720d4
|
[PERF] Speed up Qwen2.5-VL model by speed up rotary position embedding (#17973)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@centml.ai>
|
2025-05-15 23:31:02 -07:00 |
|