youkaichao
|
68ad4e3a8d
|
[Core] Support fully transparent sleep mode (#11743)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-22 14:39:32 +08:00 |
|
Mengqing Cao
|
4004f144f3
|
[Build] update requirements of no-device (#12299)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2025-01-22 14:29:31 +08:00 |
|
youkaichao
|
66818e5b63
|
[core] separate builder init and builder prepare for each batch (#12253)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-22 14:13:52 +08:00 |
|
Nick Hill
|
222a9dc350
|
[Benchmark] More accurate TPOT calc in benchmark_serving.py (#12288)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-01-22 13:46:14 +08:00 |
|
Cyrus Leung
|
cbdc4ad5a5
|
[Ci/Build] Fix mypy errors on main (#12296)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-22 12:06:54 +08:00 |
|
Liangfu Chen
|
016e3676e7
|
[CI] add docker volume prune to neuron CI (#12291)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
|
2025-01-22 10:47:49 +08:00 |
|
Kevin H. Luu
|
64ea24d0b3
|
[ci/lint] Add back default arg for pre-commit (#12279)
Signed-off-by: kevin <kevin@anyscale.com>
|
2025-01-22 01:15:27 +00:00 |
|
Cyrus Leung
|
df76e5af26
|
[VLM] Simplify post-processing of replacement info (#12269)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-21 16:48:13 -08:00 |
|
Hongxia Yang
|
09ccc9c8f7
|
[Documentation][AMD] Add information about prebuilt ROCm vLLM docker for perf validation purpose (#12281)
Signed-off-by: Hongxia Yang <hongxyan@amd.com>
|
2025-01-22 07:49:22 +08:00 |
|
Aleksandr Malyshev
|
69196a9bc7
|
[BUGFIX] When skip_tokenize_init and multistep are set, execution crashes (#12277)
Signed-off-by: maleksan85 <maleksan@amd.com>
Co-authored-by: maleksan85 <maleksan@amd.com>
|
2025-01-21 23:30:46 +00:00 |
|
Divakar Verma
|
2acba47d9b
|
[bugfix] moe tuning. rm is_navi() (#12273)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-01-21 22:47:32 +00:00 |
|
Jani Monoses
|
9c485d9e25
|
[Core] Free CPU pinned memory on environment cleanup (#10477)
|
2025-01-21 11:56:41 -08:00 |
|
wangxiyuan
|
fa9ee08121
|
[Misc] Set default backend to SDPA for get_vit_attn_backend (#12235)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-01-21 11:52:11 -08:00 |
|
Adrian Cole
|
347eeebe3b
|
[Misc] Remove experimental dep from tracing.py (#12007)
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
|
2025-01-21 11:51:55 -08:00 |
|
Andy Lo
|
18fd4a8331
|
[Bugfix] Multi-sequence broken (#11898)
Signed-off-by: Andy Lo <andy@mistral.ai>
|
2025-01-21 11:51:35 -08:00 |
|
Ricky Xu
|
132a132100
|
[v1][stats][1/n] Add RequestStatsUpdate and RequestStats types (#10907)
Signed-off-by: rickyx <rickyx@anyscale.com>
|
2025-01-21 11:51:13 -08:00 |
|
Jinzhen Lin
|
1e60f87bb3
|
[Kernel] fix moe_align_block_size error condition (#12239)
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
|
2025-01-21 10:30:28 -08:00 |
|
Jannis Schönleber
|
9705b90bcf
|
[Bugfix] fix race condition that leads to wrong order of token returned (#10802)
Signed-off-by: Jannis Schönleber <joennlae@gmail.com>
|
2025-01-21 09:47:04 -08:00 |
|
youkaichao
|
3aec49e56f
|
[ci/build] update nightly torch for gh200 test (#12270)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-21 23:03:17 +08:00 |
|
Mengqing Cao
|
c64612802b
|
[Platform] improve platforms getattr (#12264)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2025-01-21 14:42:41 +00:00 |
|
Thomas Parnell
|
9a7c3a0042
|
Remove pytorch comments for outlines + compressed-tensors (#12260)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-01-21 21:49:08 +08:00 |
|
Roger Wang
|
b197a5ccfd
|
[V1][Bugfix] Fix data item ordering in mixed-modality inference (#12259)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-01-21 13:18:43 +00:00 |
|
youkaichao
|
c81081fece
|
[torch.compile] transparent compilation with more logging (#12246)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-21 19:32:55 +08:00 |
|
Cyrus Leung
|
a94eee4456
|
[Bugfix] Fix mm_limits access for merged multi-modal processor (#12252)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-21 10:09:39 +00:00 |
|
Cyrus Leung
|
f2e9f2a3be
|
[Misc] Remove redundant TypeVar from base model (#12248)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-21 08:40:39 +00:00 |
|
Jee Jee Li
|
1f1542afa9
|
[Misc] Add BNB quantization for PaliGemmaForConditionalGeneration (#12237)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-01-21 07:49:08 +00:00 |
|
Cyrus Leung
|
96912550c8
|
[Misc] Rename MultiModalInputsV2 -> MultiModalInputs (#12244)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-21 07:31:19 +00:00 |
|
youkaichao
|
2fc6944c5e
|
[ci/build] disable failed and flaky tests (#12240)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-21 13:25:03 +08:00 |
|
Nicolò Lucchesi
|
5fe6bf29d6
|
[BugFix] Fix GGUF tp>1 when vocab_size is not divisible by 64 (#12230)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-01-21 12:23:14 +08:00 |
|
Gregory Shtrasberg
|
d4b62d4641
|
[AMD][Build] Porting dockerfiles from the ROCm/vllm fork (#11777)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-01-21 12:22:23 +08:00 |
|
Michael Goin
|
ecf67814f1
|
Add quantization and guided decoding CODEOWNERS (#12228)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2025-01-20 18:23:40 -07:00 |
|
Jinzhen Lin
|
750f4cabfa
|
[Kernel] optimize moe_align_block_size for cuda graph and large num_experts (e.g. DeepSeek-V3) (#12222)
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2025-01-20 16:42:16 -08:00 |
|
Cheng Kuan Yong Jason
|
06a760d6e8
|
[bugfix] catch xgrammar unsupported array constraints (#12210)
Signed-off-by: Jason Cheng <jasoncky96@gmail.com>
|
2025-01-20 16:42:02 -08:00 |
|
youkaichao
|
da7512215f
|
[misc] add cuda runtime version to usage data (#12190)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-01-21 00:31:01 +00:00 |
|
Işık
|
af69a6aded
|
fix: update platform detection for M-series arm based MacBook processors (#12227)
Signed-off-by: isikhi <huseyin.isik000@gmail.com>
|
2025-01-20 22:23:28 +00:00 |
|
Roger Wang
|
7bd3630067
|
[Misc] Update CODEOWNERS (#12229)
|
2025-01-20 22:19:09 +00:00 |
|
Chen Zhang
|
96663699b2
|
[CI] Pass local python version explicitly to pre-commit mypy.sh (#12224)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-01-20 23:49:18 +08:00 |
|
Cyrus Leung
|
18572e3384
|
[Bugfix] Fix HfExampleModels.find_hf_info (#12223)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 15:35:36 +00:00 |
|
wangxiyuan
|
86bfb6dba7
|
[Misc] Pass attention to impl backend (#12218)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-01-20 23:25:28 +08:00 |
|
Chen Zhang
|
5f0ec3935a
|
[V1] Remove _get_cache_block_size (#12214)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-01-20 21:54:16 +08:00 |
|
youkaichao
|
c222f47992
|
[core][bugfix] configure env var during import vllm (#12209)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-20 19:35:59 +08:00 |
|
youkaichao
|
170eb35079
|
[misc] print a message to suggest how to bypass commit hooks (#12217)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-20 18:06:24 +08:00 |
|
Cyrus Leung
|
b37d82791e
|
[Model] Upgrade Aria to transformers 4.48 (#12203)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 17:58:48 +08:00 |
|
Cyrus Leung
|
3127e975fb
|
[CI/Build] Make pre-commit faster (#12212)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 17:36:24 +08:00 |
|
Cyrus Leung
|
4001ea1266
|
[CI/Build] Remove dummy CI steps (#12208)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 16:41:57 +08:00 |
|
youkaichao
|
5c89a29c22
|
[misc] add placeholder format.sh (#12206)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-20 16:04:49 +08:00 |
|
Cyrus Leung
|
59a0192fb9
|
[Core] Interface for accessing model from VllmRunner (#10353)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 15:00:59 +08:00 |
|
Isotr0py
|
83609791d2
|
[Model] Add Qwen2 PRM model support (#12202)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-01-20 14:59:46 +08:00 |
|
Yuan Tang
|
0974c9bc5c
|
[Bugfix] Fix incorrect types in LayerwiseProfileResults (#12196)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-01-20 14:59:20 +08:00 |
|
Yuan Tang
|
d2643128f7
|
[DOC] Add missing docstring in LLMEngine.add_request() (#12195)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-01-20 14:59:00 +08:00 |
|