Commit Graph

44 Commits

Author SHA1 Message Date
Woosuk Kwon
c9b38be8aa [Spec Decode] Make propose_draft_token_ids non-blocking for lower TTFT (#23041)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-18 17:20:38 -07:00
H
24d1dffbeb [executor] feat: add supports_pp attr to executors (#21786)
Signed-off-by: Haibin Lin <haibin.lin@bytedance.com>
2025-08-03 18:04:45 +08:00
wuhang
e6680f9e25 [Bugfix] Add log prefix in non-dp mode engine core (#21889)
Signed-off-by: wuhang <wuhang6@huawei.com>
2025-08-01 09:04:16 +00:00
Nick Hill
7234fe2685 [Misc] Rework process titles (#21780)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-07-29 05:14:47 +00:00
wuhang
bccc43c033 [Bugfix]check health for engine core process exiting unexpectedly (#21728)
Signed-off-by: wuhang <wuhang6@huawei.com>
2025-07-28 06:17:31 -07:00
Chauncey
6da0078523 [Feat] Allow custom naming of vLLM processes (#21445)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-07-24 03:15:23 -07:00
kourosh hakhamaneshi
9f414a12ad [BugFix] Make PD work with Ray (#21072)
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2025-07-19 08:46:50 -07:00
David Ben-David
4fcef49ec4 [V1] [KVConnector] Fix MultiprocExecutor worker output aggregation (#21048)
Signed-off-by: David Ben-David <davidb@pliops.com>
Co-authored-by: David Ben-David <davidb@pliops.com>
2025-07-17 13:29:45 +08:00
Chen LI
10be209493 [Bug Fix] get_distributed_init_method should get the ip from get_ip i… (#20889)
Signed-off-by: Chen Li <lcpingping@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-07-15 21:23:52 +00:00
Woosuk Kwon
d4d309409f Implement Async Scheduling (#19970)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-14 23:01:46 -07:00
Nick Hill
574ad60db9 [KVConnector] Always call connector clear_metadata() at end of step (#20756)
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: David Ben-David <sdavidbd@gmail.com>
2025-07-10 22:37:27 +01:00
Or Ozeri
cc876d0f29 [KVConnector] Aggregate finished requests on the scheduler (#19555)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
2025-07-10 09:22:18 +01:00
jinqinn
f39ab2d4bd [Misc] Configurable timeout for execute_model RPC calls via env var (#19544)
Signed-off-by: jinqinn <goodqinjin@163.com>
2025-06-22 20:36:26 -07:00
jennyyyyzhen
cda10fa3e2 [Multi Modal] Add an env var for message queue max chunk bytes (#19242)
Signed-off-by: yZhen <yZhen@fb.com>
Co-authored-by: yZhen <yZhen@fb.com>
2025-06-08 21:39:12 +08:00
Simon Mo
02f0c7b220 [Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Nick Hill
93ecb8139c [BugFix] Increase TP execute_model timeout (#18558)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-05-23 10:22:11 +08:00
Rabi Mishra
acb54ca8e1 Intialize io_thread_pool attribute in the beginning. (#18331)
Signed-off-by: rabi <ramishra@redhat.com>
2025-05-21 20:21:14 -07:00
Nick Hill
ed5272cf21 [BugFix] Avoid secondary missing MultiprocExecutor.workers error (#17811)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-05-07 21:55:04 +00:00
Li, Jiang
a6fed02068 [V1][PP] Support PP for MultiprocExecutor (#14219)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Signed-off-by: jiang.li <jiang1.li@intel.com>
2025-05-06 07:58:05 -07:00
Nick Hill
5175b884f7 [BugFix] Remove default multiproc executor collective_rpc timeout (#17000)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-04-22 23:27:14 +00:00
Robert Shaw
2b05b8ce69 [V1][Frontend] Improve Shutdown And Logs (#11737)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Andrew Feldman <afeldman@neuralmagic.com>
Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-04-16 19:48:34 -07:00
yihong
04149cce27 [BugFix] fix some typos found by typos. (#16314)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-09 03:43:59 -07:00
youkaichao
a865bc1ca6 [core] do not send error across process (#16174)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-04-07 19:09:03 -07:00
Nick Hill
15dac210f0 [V1] AsyncLLM data parallel (#13923)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-03-27 16:14:41 -07:00
Mickaël Seznec
a597a57595 [Attention] Flash Attention 3 - fp8 (#14570)
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
2025-03-20 01:14:20 -04:00
Harry Mellor
cf069aa8aa Update deprecated Python 3.8 typing (#13971) 2025-03-02 17:34:51 -08:00
Li, Jiang
02296f420d [Bugfix][V1][Minor] Fix shutting_down flag checking in V1 MultiprocExecutor (#14053) 2025-02-28 22:31:01 -08:00
youkaichao
eb24dc4a45 [v1] torchrun compatibility (#13642)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-23 22:47:24 +08:00
youkaichao
3e472d882a [core] set up data parallel communication (#13591)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-22 19:28:59 +08:00
Russell Bryant
e489ad7a21 [Misc] Add SPDX-License-Identifier headers to python source files (#12628)
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**

commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:18:24 2025 -0500

    Add SPDX license headers to python source files
    
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
    also be easily used by tools to help manage license compliance.
    
The Linux Foundation runs license scans against the codebase to help
ensure
    we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
    
    More information can be found on the SPDX site:
    
    - https://spdx.dev/learn/handling-license-info/
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:36:32 2025 -0500

    Check for SPDX headers using pre-commit
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

---------

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
Cyrus Leung
59a0192fb9 [Core] Interface for accessing model from VllmRunner (#10353)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-20 15:00:59 +08:00
youkaichao
6d0e3d3724 [core] clean up executor class hierarchy between v1 and v0 (#12171)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-18 14:35:15 +08:00
youkaichao
87a0c076af [core] allow callable in collective_rpc (#12151)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-17 20:47:01 +08:00
Chen Zhang
69d765f5a5 [V1] Move more control of kv cache initialization from model_executor to EngineCore (#11960)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
2025-01-17 07:39:35 +00:00
youkaichao
bf53e0c70b Support torchrun and SPMD-style offline inference (#12071)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-16 19:58:53 +08:00
youkaichao
ad34c0df0f [core] platform agnostic executor via collective_rpc (#11256)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-15 13:45:21 +08:00
WangErXiao
e5d7ed0c53 [V1] log GPU blocks num for MultiprocExecutor (#11656) 2025-01-04 00:13:12 +00:00
Robert Shaw
1543914c04 [V1] Improve TP>1 Error Handling + Stack Trace (#11721)
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-01-03 21:29:11 +00:00
Robert Shaw
df04dffade [V1] [4/N] API Server: ZMQ/MP Utilities (#11541) 2024-12-28 01:45:08 +00:00
Tyler Michael Smith
2bfdbf2a36 [V1][Core] Use weakref.finalize instead of atexit (#11242)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2024-12-16 22:11:33 -08:00
Mark McLoughlin
6d917d0eeb Enable mypy checking on V1 code (#11105)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2024-12-14 09:54:04 -08:00
Russell Bryant
4863e5fba5 [Core] V1: Use multiprocessing by default (#11074)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-12-13 16:27:32 -08:00
Russell Bryant
ccede2b264 [Core] cleanup zmq ipc sockets on exit (#11115)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-12-11 19:12:24 -08:00
Tyler Michael Smith
28b3a1c7e5 [V1] Multiprocessing Tensor Parallel Support for v1 (#9856)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2024-12-10 06:28:14 +00:00