biondizzle/vllm - vllm

Author	SHA1	Message	Date
zhrrr	68c09efc37	[Kernel][Perf] fuse QK Norm and RoPE into one cuda kernel for Qwen Model (#27165) Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>	2025-11-11 12:00:31 -05:00
Boyuan Feng	6ab183813c	[Graph Partition][Cache] Use inductor partition ops config (#27702) Signed-off-by: Boyuan Feng <boyuan@meta.com>	2025-11-05 13:04:48 +00:00
dongbo910220	a0003b56b0	[Chore] Separate out system utilities from vllm.utils (#27201) Signed-off-by: dongbo910220 <1275604947@qq.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-10-22 20:25:25 +00:00
Luka Govedič	bd7157a071	[torch.compile] Enable attention and allreduce fusion without custom ops enabled (#24604) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-17 08:10:23 -06:00
Richard Zou	9b6504c307	[BugFix] Work around graph partition x torch.compile cache issue (#26956) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-10-15 20:06:11 -07:00
Angela Yi	b59dd19b55	[compile] Enable sequence parallelism for full cuda graph without specifying compile sizes (#26681) Signed-off-by: angelayi <yiangela7@gmail.com>	2025-10-13 18:15:34 -07:00
Luka Govedič	d5e0fca264	[torch.compile] Cleanup compilation tests and custom passes, add debug utils, fix DCE bug (#23091), fix test (#24376), and prep for custom op matching (#24604) (#24542) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: luka <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-09-22 12:30:05 -07:00
wangxiyuan	6597d7a456	[Platform] import activation_quant_fusion for CUDA only (#23882) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-08-28 22:54:16 -07:00
Gregory Shtrasberg	031ca762d7	[ROCm][Bugfix] Compilation passes fix (#22202) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>	2025-08-04 19:12:28 -07:00
TJian	26b5f7bd2a	[BUG] [ROCm] Fix import bug on ROCm (#22083) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-08-01 05:25:20 -07:00
Chaojun Zhang	d9f9a3fd96	[XPU] Conditionally import CUDA-specific passes to avoid import errors on xpu platform (#21036) Signed-off-by: chzhang <chaojun.zhang@intel.com>	2025-07-24 23:23:36 +08:00
Ilya Markov	37a7d5d74a	[Misc] Refactor AllReduceFusionPass. Remove parameter (#20918) Signed-off-by: ilmarkov <imarkov@redhat.com> Co-authored-by: ilmarkov <imarkov@redhat.com>	2025-07-15 06:57:40 +00:00
Ilya Markov	fc0f41d10a	Integration SM100 FlashInfer fused allreduce RMSNorm (#20691) Signed-off-by: ilmarkov <imarkov@redhat.com> Co-authored-by: ilmarkov <imarkov@redhat.com>	2025-07-11 18:58:15 -07:00
cascade	e6327c9b3e	[Feature] Support sequence parallelism for static fp8 quantization (#19181) Signed-off-by: cascade812 <cascade812@outlook.com>	2025-06-23 16:09:02 -04:00
Luka Govedič	f98548b9da	[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Sage Moore <sage@neuralmagic.com>	2025-06-12 08:31:04 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
cascade	71ea614d4a	[Feature]Add async tensor parallelism using compilation pass (#17882) Signed-off-by: cascade812 <cascade812@outlook.com>	2025-05-23 01:03:34 -07:00
Harry Mellor	19324d660c	Update deprecated type hinting in `vllm/compilation` (#18072) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-13 08:32:48 -07:00
Sage Moore	460a2b1100	[torch.compile] Add torch inductor pass for fusing silu_and_mul with subsequent scaled_fp8_quant operations (#10867) Signed-off-by: Sage Moore <sage@neuralmagic.com>	2025-05-01 07:59:28 -07:00
cascade	690fe019f0	[Feature] support sequence parallelism using compilation pass (#16155) Signed-off-by: cascade812 <cascade812@outlook.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-04-27 06:29:35 -07:00
Luka Govedič	f622dbcf39	[Fix] [torch.compile] Improve UUID system for custom passes (#15249) Signed-off-by: luka <luka@neuralmagic.com>	2025-03-24 01:54:07 +00:00
Luka Govedič	bd56c983d6	[torch.compile] Fix RMSNorm + quant fusion in the non-cutlass-fp8 case, rename RedundantReshapesPass to NoopEliminationPass (#10902) Signed-off-by: luka <luka@neuralmagic.com>	2025-02-28 16:20:11 -07:00
youkaichao	09b95e36ab	[torch.compile] PyTorch 2.6 and nightly compatibility (#12393) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-07 01:09:07 +08:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628) - Add SPDX license headers to python source files - Check for SPDX headers using pre-commit commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:18:24 2025 -0500 Add SPDX license headers to python source files This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance. The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job. More information can be found on the SPDX site: - https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com> commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:36:32 2025 -0500 Check for SPDX headers using pre-commit Signed-off-by: Russell Bryant <rbryant@redhat.com> --------- Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Lucas Tucker	dbeac95dbb	Mypy checking for vllm/compilation (#11496) Signed-off-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: lucast2021 <lucast2021@headroyce.org>	2024-12-26 05:04:07 +00:00
Luka Govedič	8b0fe06c89	[torch.compile] Inductor code caching fix (#10273) Signed-off-by: luka <luka@neuralmagic.com> Signed-off-by: Luka Govedic <luka.govedic@gmail.com>	2024-11-20 21:44:57 -08:00

26 Commits