Pleaplusone
|
be6931ee27
|
[ROCm][Bugfix] Fix ptpc scale load issue for fused shared expert path in deepseek mtp (#33018)
Signed-off-by: ganyi <ygan@amd.com>
|
2026-01-26 23:19:04 +08:00 |
|
Cyrus Leung
|
dcd80206b7
|
[Chore] Update type annotation of input_ids in model forward (#33063)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-26 06:02:10 -08:00 |
|
Cyrus Leung
|
600aaab8d6
|
[Model] Remove incorrect SupportsPP from MTP models (#32150)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-12 01:19:30 -08:00 |
|
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
482914849c
|
[BugFix] LoRA: Support loading base_layer of experts (#31104)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
|
2026-01-07 14:49:39 +08:00 |
|
zzhxxx
|
b166ef20e1
|
[refactor] Add prefix support to embed_tokens in DeepSeek MTP (#30788)
Signed-off-by: zzhx1 <zzh_201018@outlook.com>
|
2025-12-18 04:45:56 +00:00 |
|
Matthew Bonanni
|
51c57b51dd
|
[Bugfix] Fix DeepSeek R1 MTP weight loading (#29545)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-12-02 15:52:18 +00:00 |
|
杰兮
|
8005e606bf
|
[Bugfix][Rocm] Fix shared expert weight loading failure in DeepSeek-MTP (#27563)
Signed-off-by: zhyajie <yajizhan@amd.com>
Co-authored-by: zhyajie <yajizhan@amd.com>
|
2025-11-24 10:16:52 +00:00 |
|
Harry Mellor
|
97d1c99302
|
Rename clashing method names for vLLM model protocol (#27583)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 19:14:33 -08:00 |
|
Ilya Markov
|
e50c454672
|
[BugFix] Support EP/DP + EPLB with MTP (#25311)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
|
2025-11-05 15:22:17 +00:00 |
|
Mengqing Cao
|
1004205795
|
[MTP] Refactor mtp predictor to avoid d2h operation (#27643)
Signed-off-by: MengqingCao <cmq0113@163.com>
|
2025-10-30 17:27:39 +00:00 |
|
Benjamin Chislett
|
f381cf2302
|
[Bugfix] Fix broken MTP weight loading for FP8 KV Scales (#27227)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-10-20 22:51:44 -07:00 |
|
Mengqing Cao
|
302ef403a2
|
[DSA][MLA] Tiny refactor on DeepSeek to make it reusable for different backends (#26656)
Signed-off-by: MengqingCao <cmq0113@163.com>
|
2025-10-15 00:16:44 -07:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Lucas Wilkinson
|
29255cfc3b
|
[Spec-Decode] Support piecewise cudagraphs for Eagle head (#25109)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>
|
2025-10-10 01:20:31 -04:00 |
|
Benjamin Chislett
|
2161efe978
|
[Bugfix] Allow skipping MoE in NVFP4 (fix for MTP) (#25987)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-10-06 16:16:30 -04:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
whx
|
cbf9221992
|
[Model] Supplement to PR 24862: Pass param prefix to LLMHead (#25805)
Signed-off-by: whx-sjtu <2952154980@qq.com>
|
2025-10-03 21:34:53 +08:00 |
|
Yongye Zhu
|
fa7e254a7f
|
[New Model] DeepSeek-V3.2 (Rebased to Main) (#25896)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Lucia Fang <fanglu@meta.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>
Co-authored-by: Lucia Fang <fanglu@meta.com>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Siyuan Fu <siyuanf@nvidia.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Xiaozhu Meng <mxz297@gmail.com>
Co-authored-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
|
2025-09-30 17:14:41 +08:00 |
|
Cyrus Leung
|
27d7638b94
|
[Bugfix] Merge MM embeddings by index instead of token IDs (#16229)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-27 08:15:12 +00:00 |
|
Woosuk Kwon
|
1c3ffdbecc
|
[V0 Deprecation] Remove V0 sampling metadata (#25345)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-09-21 10:37:11 -07:00 |
|
Tyler Michael Smith
|
955c624915
|
[Bugfix][Wide EP] Fix redundant work when using DeepEP, TP Attn, and EP MoE (#24134)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
|
2025-09-08 19:01:51 -07:00 |
|
Benjamin Chislett
|
fbd88728b3
|
[Bugfix] Fix DeepSeek MTP (#22934)
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
|
2025-08-16 01:25:06 +00:00 |
|
Benjamin Chislett
|
2dff2e21d9
|
[Bugfix] Fix MTP weight loading (#21941)
|
2025-07-31 16:33:53 -04:00 |
|
cjackal
|
8359f4c8d8
|
[V1][Speculative Decoding] Fix DeepSeek MTP (#20022)
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
|
2025-06-25 08:41:02 -07:00 |
|
Simon Mo
|
02f0c7b220
|
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-06-03 11:20:17 -07:00 |
|
Jiayi Yao
|
2628a69e35
|
[V1] Support Deepseek MTP (#18435)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Co-authored-by: Rui Qiao <ruisearch42@gmail.com>
|
2025-05-23 10:26:28 -07:00 |
|
Harry Mellor
|
26d0419309
|
Update deprecated type hinting in models (#18132)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-14 22:06:50 -07:00 |
|
Woosuk Kwon
|
b411418ff0
|
[Chore] Remove Sampler from Model Code (#17084)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-04-24 02:49:33 -07:00 |
|
Benjamin Chislett
|
9804145cac
|
[Model][Speculative Decoding] Expand DeepSeek MTP code to support k > n_predict (#13626)
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
|
2025-02-27 15:28:08 -08:00 |
|
Harry Mellor
|
cdc1fa12eb
|
Remove unused kwargs from model definitions (#13555)
|
2025-02-24 17:13:52 -08:00 |
|
Lucia Fang
|
f525c0be8b
|
[Model][Speculative Decoding] DeepSeek MTP spec decode (#12755)
Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
|
2025-02-19 17:06:23 +08:00 |
|