ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
482914849c
|
[BugFix] LoRA: Support loading base_layer of experts (#31104)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
|
2026-01-07 14:49:39 +08:00 |
|
Cyrus Leung
|
33b06a6f24
|
[Misc] Remove redundant attention var constants (#29650)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-28 04:35:19 -08:00 |
|
Zhewen Li
|
93c8672ceb
|
[Bugfix] Fix spec decode memory regression after #28549 (#28819)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-11-20 19:05:50 +08:00 |
|
Eldar Kurtić
|
e439c784fa
|
Add support for Eagle with separate lm-head and embed_tokens layers (#28549)
Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com>
|
2025-11-15 06:12:02 -08:00 |
|
Harry Mellor
|
97d1c99302
|
Rename clashing method names for vLLM model protocol (#27583)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 19:14:33 -08:00 |
|
Ilya Markov
|
e50c454672
|
[BugFix] Support EP/DP + EPLB with MTP (#25311)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
|
2025-11-05 15:22:17 +00:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Benjamin Chislett
|
2161efe978
|
[Bugfix] Allow skipping MoE in NVFP4 (fix for MTP) (#25987)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-10-06 16:16:30 -04:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Cyrus Leung
|
27d7638b94
|
[Bugfix] Merge MM embeddings by index instead of token IDs (#16229)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-27 08:15:12 +00:00 |
|
Woosuk Kwon
|
1c3ffdbecc
|
[V0 Deprecation] Remove V0 sampling metadata (#25345)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-09-21 10:37:11 -07:00 |
|
Chen Ding
|
1a0a04dae9
|
[Perf] Optimize memory peak during EAGLE model loading. (#24585)
Signed-off-by: Chen Ding <candy.dc@alibaba-inc.com>
|
2025-09-19 03:31:16 +00:00 |
|
whx
|
4a9375fe9d
|
[Model] Pass param prefix to LLMHead (#24862)
Signed-off-by: whx-sjtu <2952154980@qq.com>
|
2025-09-17 16:01:27 +08:00 |
|
Tyler Michael Smith
|
955c624915
|
[Bugfix][Wide EP] Fix redundant work when using DeepEP, TP Attn, and EP MoE (#24134)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
|
2025-09-08 19:01:51 -07:00 |
|
Xin Yang
|
83e69a09d6
|
[Model] Support deepseek with eagle (#21086)
Signed-off-by: Xin Yang <xyangx@amazon.com>
|
2025-08-20 19:01:31 +08:00 |
|