biondizzle/vllm

Author	SHA1	Message	Date
Kunshang Ji	e10604480b	[XPU][1/N] Deprecate ipex and switch to vllm-xpu-kernels for xpu platform (#33379) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2026-02-02 22:46:10 -08:00
csy0225	c3b40dc3e7	[Models] Step-3.5-Flash (#33523) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: i-zhangmingming <i-zhangmingming@stepfun.com> Co-authored-by: xiewuxun <xiewuxun@stepfun.com> Co-authored-by: zetaohong <i-hongzetao@stepfun.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2026-02-02 10:21:18 +08:00
Luka Govedič	5e4e0e51f4	[torch.compile] Compile `CustomOp.forward_native` for `SiluAndMul` and `QuantFP8` to avoid raw torch ops inside opaque custom ops (#32806) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2026-01-22 19:52:26 -08:00
Shanshan Shen	08d954f036	[Doc] Add developer guide for CustomOp (#30886) Signed-off-by: shen-shanshan <467638484@qq.com>	2026-01-09 16:21:11 +00:00
Divakar Verma	4b40924998	[ROCm] Fallback pytorch GELU with tanh approximation to GELU() (#29244) Signed-off-by: Divakar Verma <divakar.verma@amd.com> Signed-off-by: Divakar Verma <137818590+divakar-amd@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-02 02:02:22 +00:00
Jiangyun Zhu	ab3e80042e	[torch.compile] Enable silu_mul_fp8_quant fusion without custom ops enabled (#27146) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>	2025-10-22 00:22:39 -04:00
Cyrus Leung	d31f7844f8	[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-19 05:20:55 -07:00
Cyrus Leung	d2740fafbf	[Chore] Separate out `vllm.utils.collections` (#26990) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 08:35:35 +00:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Konrad Zawora	4aa23892d6	[Bugfix] Fix platform-specific routing in CustomOp implementations (#24444) Signed-off-by: Konrad Zawora <kzawora@habana.ai>	2025-09-11 17:15:01 +00:00
Woosuk Kwon	4172235ab7	[V0 deprecation] Deprecate V0 Neuron backend (#21159) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-06 16:15:18 -07:00
co63oc	1bd007f234	fix some typos (#24071) Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-09-02 20:44:50 -07:00
EduardDurech	1cf3753b90	[MODEL] `Apertus` and `XIELU` (#23068) Signed-off-by: EduardDurech <39579228+EduardDurech@users.noreply.github.com> Co-authored-by: AllenHaoHuang <allenhuangdd@gmail.com>	2025-08-29 20:29:18 +08:00
LIYIFAN_liyifan	c9abb10489	[Bugfix] Fix Dense module loading for sentence-transformers embedding models (simplified V2) (#23408) Signed-off-by: FFFfff1FFFfff <yifanli0919@gmail.com>	2025-08-25 05:39:24 +00:00
Jee Jee Li	4d4061b6e7	[Kernel] Add cuda kernel for gpt_oss activation (#22951) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-17 05:03:24 +00:00
Simon Mo	f1f0d2fab8	Revert "[Kernel] Add cuda kernel for gpt_oss activation" (#22948)	2025-08-14 17:38:10 -07:00
Jee Jee Li	81f4b96481	[Kernel] Add cuda kernel for gpt_oss activation (#22538) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-14 17:21:29 -07:00
Li, Jiang	b5dfb94fa0	[CI/Build][Bugfix] Fix Qwen2.5 tests in CPU CI via fallback silu_and_mul to torch native implementation (#22145) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-08-03 05:34:04 -07:00
Robert Shaw	d1c956dc0f	Gemma3n (Text-only) (#20134) Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com> Signed-off-by: Roger Wang <hey@rogerw.me> Co-authored-by: Roger Wang <hey@rogerw.me>	2025-06-27 07:16:26 +00:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
wang.yuqi	63ad622233	[New Model]: support GTE NewModel (#17986)	2025-05-14 01:31:31 -07:00
wang.yuqi	3d3ab3689f	[New Model]: Snowflake Arctic Embed (Family) (#16649)	2025-04-18 08:11:57 -07:00
Liangfu Chen	f75aa72732	[Neuron] Add custom_ops for neuron backend (#13246) Signed-off-by: Liangfu Chen <liangfc@amazon.com> Co-authored-by: George Novack <gnovack@amazon.com> Co-authored-by: Aoyu Zhang <aoyuzhan@amazon.com>	2025-02-25 11:47:49 -08:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628) Squash of two commits by Russell Bryant <rbryant@redhat.com> dated 2025-01-31: 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 (Add SPDX license headers to python source files) and 5a1cf1cb3b80759131c73f6a9dddebccac039dea (Check for SPDX headers using pre-commit). The headers are added as recommended to the project by the Linux Foundation. They communicate license information for each source file in a concise form that is both human and machine readable, avoid ambiguity about the license of the code, and can be used by tools to help manage license compliance, including the license scans the Linux Foundation runs against the codebase and its dependencies. More information: https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Li, Jiang	d4e6194570	[CI/Build][CPU][Bugfix] Fix CPU CI (#12150) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-01-17 19:39:52 +08:00
Jee Jee Li	42f5e7c52a	[Kernel] Support MulAndSilu (#11624) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-01-15 02:29:53 +00:00
cennn	d907be7dc7	[misc] remove python function call for custom activation op (#11885) Co-authored-by: youkaichao <youkaichao@gmail.com>	2025-01-10 17:18:25 +08:00
Yan Ma	78f4590b60	[Bugfix][XPU] fix silu_and_mul (#11823) Signed-off-by: yan ma <yan.ma@intel.com>	2025-01-09 00:11:50 +08:00
Li, Jiang	2f7024987e	[CI/Build][Bugfix] Fix CPU CI image clean up (#11836) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-01-08 15:18:28 +00:00
youkaichao	869579a702	[optimization] remove python function call for custom op (#11750) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-07 17:04:28 +00:00
Michael Goin	399c798608	Remove ScaledActivation for AWQ (#10057) Signed-off-by: mgoin <michael@neuralmagic.com>	2024-11-06 14:27:06 +00:00
Michael Goin	a53046b16f	[Model] Support quantization of PixtralHFTransformer for PixtralHF (#9921) Signed-off-by: mgoin <michael@neuralmagic.com>	2024-11-05 10:42:20 -08:00
Jee Jee Li	295a061fb3	[Kernel] add kernel for FATReLU (#9610) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-10-24 16:18:27 +08:00
Michael Goin	3921a2f29e	[Model] Support Pixtral models in the HF Transformers format (#9036)	2024-10-18 13:29:56 -06:00
Luka Govedič	0f41fbe5a3	[torch.compile] Fine-grained CustomOp enabling mechanism (#9300)	2024-10-17 18:36:37 +00:00
Junhao Li	5b8a1fde84	[Model][Bugfix] Add FATReLU activation and support for openbmb/MiniCPM-S-1B-sft (#9396)	2024-10-16 16:40:24 +00:00
Kunshang Ji	851725202a	[Hardware][intel GPU] bump up ipex version to 2.3 (#8365) Co-authored-by: Yan Ma <yan.ma@intel.com>	2024-09-13 16:54:34 -07:00
Michael Goin	281977bd6e	[Doc] Add Nemotron to supported model docs (#6843)	2024-07-26 17:32:44 -04:00
Michael Goin	07278c37dd	[Model] Support Nemotron models (Nemotron-3, Nemotron-4, Minitron) (#6611)	2024-07-26 14:33:42 -04:00
Roger Wang	bd620b01fb	[Kernel][CPU] Add Quick `gelu` to CPU (#5717)	2024-06-21 06:39:40 +00:00
Roger Wang	ad137cd111	[Model] Port over CLIPVisionModel for VLMs (#5591)	2024-06-20 11:52:09 +00:00
Kunshang Ji	728c4c8a06	[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814) Co-authored-by: Jiang Li <jiang1.li@intel.com> Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com> Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>	2024-06-17 11:01:25 -07:00
Woosuk Kwon	41ca62cf03	[Misc] Add CustomOp interface for device portability (#5255)	2024-06-05 09:18:19 -07:00
Jee Li	d6f4bd7cdd	[Misc]Add customized information for models (#4132)	2024-04-30 21:18:14 -07:00
Kunshang Ji	e9da5a40c6	[Misc] Add indirection layer for custom ops (#3913)	2024-04-10 20:26:07 -07:00
youkaichao	63e7176f26	[Core][Refactor] move parallel_utils into vllm/distributed (#3950) [WIP][Core][Refactor] move vllm/model_executor/parallel_utils into vllm/distributed and vllm/device_communicators (#3950)	2024-04-10 15:33:30 -07:00
SangBin Cho	6e435de766	[1/n][Chunked Prefill] Refactor input query shapes (#3236)	2024-03-20 14:46:05 -07:00
Woosuk Kwon	602358f8a8	Add kernel for GeGLU with approximate GELU (#3337)	2024-03-12 22:06:17 -07:00
Woosuk Kwon	fd5dcc5c81	Optimize GeGLU layer in Gemma (#2975)	2024-02-21 20:17:52 -08:00

62 Commits