Michael Goin
|
eb5ed20743
|
[Bugfix] Define router_logits_dtype for remaining MoE models (#33737)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-02-04 13:24:14 +08:00 |
|
Matthew Bonanni
|
a608b4c6c2
|
[5/N][Attention] Finish eliminating vllm/attention folder (#32064)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-01-27 10:02:51 -05:00 |
|
Cyrus Leung
|
dcd80206b7
|
[Chore] Update type annotation of input_ids in model forward (#33063)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-01-26 06:02:10 -08:00 |
|
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
482914849c
|
[BugFix] LoRA: Support loading base_layer of experts (#31104)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
|
2026-01-07 14:49:39 +08:00 |
|
Harry Mellor
|
cf3eacfe58
|
Standardise get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim (#30389)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-11 20:45:23 +00:00 |
|
Harry Mellor
|
e10c84e06a
|
Access partial_rotary_factor from rope_parameters (#29966)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-04 18:42:49 +00:00 |
|
Matthew Bonanni
|
430dd4d9eb
|
[Attention] Remove imports from vllm/attention/__init__.py (#29342)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-11-26 10:53:15 -07:00 |
|
Harry Mellor
|
a8b70304d6
|
Update rope_scaling to rope_parameters in preparation for Transformers v5 (#28542)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-19 09:06:36 -08:00 |
|
hwhaokun
|
085a525332
|
[Model] Fix lmhead init bug of bailing_moe (#28777)
Signed-off-by: hwhaokun <haokun0405@163.com>
Co-authored-by: zhaozx-cn <zhaozx2116@163.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-11-15 05:44:12 -08:00 |
|
zhaozx-cn
|
433c0f8675
|
[Model] Fix bailing_moe accuracy problem (#28277)
Signed-off-by: zhaozx-cn <zhaozx2116@163.com>
|
2025-11-14 13:33:02 +00:00 |
|
Harry Mellor
|
97d1c99302
|
Rename clashing method names for vLLM model protocol (#27583)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 19:14:33 -08:00 |
|
Jee Jee Li
|
9d1c474704
|
[LoRA][1/N]Remove LoRA extra vocab (#28382)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-11-11 11:06:21 -08:00 |
|
ant-yy
|
5c3bae1a6a
|
[Fix] Remove divisibility requirement between num_kv_heads and tp_size in bailing_moe (#26876)
Signed-off-by: vito.yy <vito.yy@antgroup.com>
|
2025-10-15 16:44:04 +08:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
bnellnm
|
47e66c24e2
|
[Model] Apply shared experts overlap optimization to all models with shared experts (#26145)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2025-10-09 11:31:04 -04:00 |
|
Harry Mellor
|
6c04638214
|
Fix per file ruff ignores related to line length (#26262)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-06 05:12:40 +00:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Harry Mellor
|
61aedb5ffe
|
Move VllmConfig from config/__init__.py to config/vllm.py (#25271)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-29 19:49:49 -07:00 |
|
Woosuk Kwon
|
1c3ffdbecc
|
[V0 Deprecation] Remove V0 sampling metadata (#25345)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-09-21 10:37:11 -07:00 |
|
ant-yy
|
72c99f2a75
|
[Model]: support Ling2.0 (#24627)
Signed-off-by: vito.yy <vito.yy@antgroup.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-15 05:09:30 -07:00 |
|
Lukas Geiger
|
de533ab2a1
|
[Models] Improve iteration over layers (#19497)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-08-29 09:26:34 +08:00 |
|
Jinzhen Lin
|
c657369841
|
support torch.compile for bailing moe (#21664)
|
2025-07-26 23:54:32 +00:00 |
|
Jee Jee Li
|
466e878f2a
|
[Quantization] Enable BNB support for more MoE models (#21100)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-18 17:52:02 -07:00 |
|
Cyrus Leung
|
ac2bf41e53
|
[Model] Remove model sampler (#21059)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-16 19:03:37 +00:00 |
|
ant-yy
|
38efa28278
|
[Model] Add Ling implementation (#20680)
Signed-off-by: vito.yy <vito.yy@antgroup.com>
|
2025-07-14 22:10:32 +08:00 |
|