biondizzle/vllm - vllm

Author	SHA1	Message	Date
Shu Wang	2ea50e977a	Enable Allgather/ReduceScatter backend for NaiveAllToAll (#23964 ) Signed-off-by: Shu Wang. <shuw@nvidia.com> Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Signed-off-by: Shu Wang <shuw@nvidia.com> Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-09-18 15:52:58 +00:00
Hyogeun Oh (오효근)	b419937c78	[Docs] Fix warnings in mkdocs build (continued) (#25163 ) Signed-off-by: Zerohertz <ohg3417@gmail.com>	2025-09-18 08:23:26 -07:00
Punitvara	05b044e698	[Doc] Fix cross-reference warnings (#25058 ) Signed-off-by: Punit Vara <punitvara@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-18 02:05:16 -07:00
Sage Moore	567939953b	[Core/DBO][1/N] Add Dual-Batch Overlap mechanism to VLLM (#23693 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: Sage Moore <sage@neuralmagic.com> Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> Signed-off-by: yewentao256 <zhyanwentao@126.com> Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com> Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> Co-authored-by: yewentao256 <zhyanwentao@126.com> Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-09-16 12:21:48 -04:00
dongluw	a5b84f1cbf	[Core] Shared memory based object store for Multimodal data caching and IPC (#20452 ) Signed-off-by: donglu <donglu@cohere.com>	2025-09-12 07:54:17 -07:00
Ilya Markov	1fdd5c42d7	[Kernels] Enable Torch Symmetric Memory All-Reduce By Default (#24111 ) Signed-off-by: ilmarkov <markovilya197@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-09-11 09:45:31 -07:00
Woosuk Kwon	4172235ab7	[V0 deprecation] Deprecate V0 Neuron backend (#21159 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-06 16:15:18 -07:00
Didier Durand	35bf193864	[Doc]: fix typos in Python comments (#24294 ) Signed-off-by: Didier Durand <durand.didier@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-09-05 19:41:12 -07:00
bnellnm	e9b92dcd89	[Kernels] Overlap shared experts with send/recv (#23273 ) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-09-03 12:35:18 -04:00
Didier Durand	0235103cbb	[Doc]: fix typos in Python comments (#24042 ) Signed-off-by: Didier Durand <durand.didier@gmail.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-09-01 19:07:45 -07:00
Didier Durand	107284959a	[Doc]: fix typos in Python comments (#24026 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-09-01 09:38:20 +00:00
Chaojun Zhang	235c9db8a7	[XPU] support data parallel for MoE models on XPU (#22887 ) Signed-off-by: chzhang <chaojun.zhang@intel.com>	2025-08-29 09:23:04 +08:00
Yongye Zhu	082cc07ef8	DP/EP Support for gpt-oss with deepep-ht comm kernel on SM100 (#23608 )	2025-08-27 17:33:21 -04:00
yzds	c7c80af084	fix pynccl reduce_scatter (#23648 ) Co-authored-by: hongchao <hongchao@msh.team>	2025-08-26 18:21:11 -07:00
Ilya Markov	0313cf854d	[PERF] PyTorch Symmetric Memory All-Reduce (#20759 ) Signed-off-by: ilmarkov <imarkov@redhat.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: ilmarkov <imarkov@redhat.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-08-22 15:39:08 -06:00
Chengji Yao	e9d6a3db69	[TPU] make ptxla not imported when using tpu_commons (#23081 ) Signed-off-by: Chengji Yao <chengjiyao@gmail.com> Signed-off-by: Chengji Yao <chengjiyao@google.com> Co-authored-by: Chengji Yao <chengjiyao@gmail.com>	2025-08-19 11:46:42 +08:00
bnellnm	8ad7285ea2	[Kernels] Clean up FusedMoeMethodBase and modular kernel setup. Remove extra arguments from modular kernel methods. (#22035 ) Signed-off-by: Bill Nell <bnell@redhat.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-08-15 14:46:00 -04:00
Ilya Markov	1d20c34717	[CI] Fix `tests/distributed/test_ca_buffer_sharing.py` (#22849 ) Signed-off-by: ilmarkov <imarkov@redhat.com> Co-authored-by: ilmarkov <imarkov@redhat.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-08-13 20:09:30 -07:00
Shu Wang	b2c8ce57c6	Fix Flashinfer CUTLASS MOE Allgather (#21963 ) Signed-off-by: Shu Wang <shuw@nvidia.com>	2025-08-07 19:18:25 -07:00
Ning Xie	74333ae2f6	[Misc] correct static type check for GroupCoordinator (#21946 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-08-05 03:17:46 -07:00
Ning Xie	7de45db9a5	[Misc] update doc comment for send (#22026 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-08-03 00:55:20 -07:00
Rui Qiao	d331759488	Introduce RayPPCommunicator for ray-based PP (#21660 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-08-01 11:50:58 -07:00
Li, Jiang	a15a50fc17	[CPU] Enable shared-memory based pipeline parallel for CPU backend (#21289 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-07-21 09:07:08 -07:00
Woosuk Kwon	4de7146351	[V0 deprecation] Remove V0 HPU backend (#21131 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-17 16:37:36 -07:00
Trevor Morris	a8593237c0	Add pynccl all-gatherv and reducescatterv (#20154 ) Signed-off-by: Trevor Morris <tmorris@nvidia.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-07-11 18:59:23 -07:00
Varun Sundar Rabindranath	53fa457391	[Misc] Add unit tests for MoE ModularKernel combinations + Profiling utility (#20449 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-07-11 07:51:46 -07:00
Liangliang Ma	a3e4e85ece	[XPU][CI] enhance xpu test support (#20652 ) Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com> Co-authored-by: zhenwei-intel <zhenweiliu@habana.ai>	2025-07-09 16:53:09 +00:00
Wentao Ye	4d36693687	[Refactor] Create a function util and cache the results for `has_deepgemm`, `has_deepep`, `has_pplx` (#20187 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-06-28 22:06:38 +00:00
li haoyang	0740e29b66	[Feature] add quick all reduce (#19744 ) Signed-off-by: ilmarkov <imarkov@redhat.com> Signed-off-by: Haoyang Li <Haoyang.Li@amd.com> Co-authored-by: ilmarkov <imarkov@redhat.com>	2025-06-26 20:54:24 -07:00
Zhonghua Deng	eccdc8318c	[V1][P/D] An native implementation of xPyD based on P2P NCCL (#18242 ) Signed-off-by: Abatom <abzhonghua@gmail.com>	2025-06-18 06:32:36 +00:00
Varun Sundar Rabindranath	5cf2daea9a	[Misc] Fixes and Optimizations for DeepEP + DeepGEMM combination. (#19298 ) Signed-off-by: Varun <vsundarr@redhat.com> Co-authored-by: Varun <vsundarr@redhat.com>	2025-06-09 10:50:39 -04:00
Povilas Kanapickas	85e2b7bb13	[MISC][Bugfix] Use less CPU when message queue has been empty for some time (#16226 ) Signed-off-by: Povilas Kanapickas <povilas@radix.lt>	2025-06-05 16:53:08 +00:00
Tyler Michael Smith	d459fae0a2	[Bugfix][EP+DP] Fix internode check (#19112 ) Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>	2025-06-04 23:39:23 +08:00
Varun Sundar Rabindranath	fa98d77773	[Kernel] DeepEP dispatch-combine kernel integration (#18434 ) Signed-off-by: Varun <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-06-03 12:30:02 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
Tyler Michael Smith	8a57872b2a	[Bugfix][EP+DP] Use pplx-kernel internode instead of intranode (#19034 ) Signed-off-by: Tyler Michael Smith <tysmith@redhat.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-06-03 11:36:51 +08:00
youkaichao	6a7988c55b	Refactor pplx init logic to make it modular (prepare for deepep) (#18200 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-05-23 23:43:43 +08:00
Li, Jiang	93f71673ce	[BugFix][CPU] Fix x86 SHM distributed module initialization (#18536 ) Signed-off-by: jiang.li <jiang1.li@intel.com>	2025-05-22 07:35:00 -07:00
Russell Bryant	6e0fd34d3c	[CI] Fix race condition with StatelessProcessGroup.barrier (#18506 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-05-21 20:19:13 -07:00
Siyuan Liu	48ac2bed5b	[Hardware][TPU] Optionally import for TPU backend (#18269 ) Signed-off-by: Siyuan Liu <lsiyuan@google.com> Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com> Co-authored-by: Carol Zheng <cazheng@google.com> Co-authored-by: Jade Zheng <zheng.shoujian@outlook.com> Co-authored-by: Hongmin Fan <fanhongmin@google.com>	2025-05-17 15:23:12 +08:00
Lucia Fang	3d2779c29a	[Feature] Support Pipeline Parallism in torchrun SPMD offline inference for V1 (#17827 ) Signed-off-by: Lucia Fang <fanglu@fb.com>	2025-05-15 22:28:27 -07:00
Harry Mellor	dc372b9c8a	Update deprecated type hinting in `vllm/device_allocator` and `vllm/distributed` (#18126 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-14 04:07:57 -07:00
youkaichao	6266c57bae	[core][distributed] add ep group and all2all interface (#18077 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-05-14 10:46:49 +08:00
Isotr0py	6e4a93e3f7	[Bugfix][CPU] Fix broken AVX2 CPU TP support (#17252 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-09 08:55:14 +00:00
Tyler Michael Smith	68e1ee0072	[Bugfix][Easy] Fix whitespace in shm_broadcast.py logging (#17635 ) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-05-04 19:20:19 -07:00
Russell Bryant	a0304dc504	[Security] Don't bind tcp zmq socket to all interfaces (#17197 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-04-28 10:08:20 -07:00
cascade	690fe019f0	[Feature] support sequence parallelism using compilation pass (#16155 ) Signed-off-by: cascade812 <cascade812@outlook.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-04-27 06:29:35 -07:00
Robert Shaw	2b05b8ce69	[V1][Frontend] Improve Shutdown And Logs (#11737 ) Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Andrew Feldman <afeldman@neuralmagic.com> Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-04-16 19:48:34 -07:00
Chengji Yao	01b6113659	[TPU] optimize the all-reduce performance (#15903 ) Signed-off-by: Chengji Yao <chengjiyao@google.com>	2025-04-03 00:25:14 +00:00
Li, Jiang	550b2801ad	[CPU][Bugfix] Using custom allreduce for CPU backend (#15934 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-04-02 07:46:47 -07:00

140 Commits