biondizzle/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
汪志鹏	1adeb3b84c	[New Model] BAGEL support (AR only) (#28439 ) Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-15 14:58:23 +08:00
Harry Mellor	cf3eacfe58	Standardise `get_rope` to use `rope_parameters["partial_rotary_factor"]`, not `rotary_dim` (#30389 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-11 20:45:23 +00:00
Harry Mellor	f5b0846ba0	Fix some Transformers nightly tests (#29802 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-02 07:05:27 +00:00
Matthew Bonanni	430dd4d9eb	[Attention] Remove imports from `vllm/attention/__init__.py` (#29342 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-11-26 10:53:15 -07:00
Laith Sakka	7a228b5305	Add option to use unbacked, and backed size obl dynamic shapes for more sounds compilation. (#26199 ) Signed-off-by: Laith Sakka <lsakka@meta.com>	2025-11-24 10:12:41 -05:00
Harry Mellor	a8b70304d6	Update `rope_scaling` to `rope_parameters` in preparation for Transformers v5 (#28542 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-19 09:06:36 -08:00
Harry Mellor	97d1c99302	Rename clashing method names for vLLM model protocol (#27583 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 19:14:33 -08:00
Jee Jee Li	9d1c474704	[LoRA][1/N]Remove LoRA extra vocab (#28382 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-11-11 11:06:21 -08:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Woosuk Kwon	1c3ffdbecc	[V0 Deprecation] Remove V0 sampling metadata (#25345 ) Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>	2025-09-21 10:37:11 -07:00
Roger Wang	0f7acdd73c	[Model] Support Qwen3-VL Model Series (#24727 ) Signed-off-by: Roger Wang <hey@rogerw.io> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com> Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-17 05:01:04 +00:00
Lukas Geiger	de533ab2a1	[Models] Improve iteration over layers (#19497 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-08-29 09:26:34 +08:00
Cyrus Leung	7d67a9d9f9	[mypy] Fix incorrect type hint for EAGLE3 support (#23617 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-25 23:50:17 -07:00
PapaGoose	88491c1b6b	[Speculators][Speculative Decoding] Fix Qwen 2 Eagle3 Support (#23337 )	2025-08-22 16:39:19 +00:00
Chen Zhang	17373dcd93	[Attention] Refactor AttentionMetadata Preparation for Encoder-only Models (#23154 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-22 05:05:59 +00:00
Harry Mellor	c49848396d	Refactor sliding window configuration to Transformers best practice (#21927 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-09 20:50:48 -07:00
Zhiyu	d57dc2364e	Add ModelOpt Qwen3 nvfp4 support (#20101 ) Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>	2025-08-07 19:18:19 -07:00
Dipika Sikka	9f9c38c392	[Speculators][Speculative Decoding] Add Qwen Eagle3 Support (#21835 ) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>	2025-08-01 19:43:37 -07:00
wang.yuqi	ca4eb82bcb	[Model] Re-add the implicit conversion feature for as_seq_cls_model (#21103 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-18 07:15:07 +00:00
wang.yuqi	6f1229f91d	[Model][2/N] Automatic conversion of CrossEncoding model (#19978 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-03 13:59:23 +00:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
Cyrus Leung	4f4a6b844a	[Deprecation] Remove mean pooling default for `Qwen2EmbeddingModel` (#18913 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-30 06:53:37 +00:00
inkcherry	dd2a94596a	[Model] Allow the use of sliding window in Qwen2 (#17772 ) Signed-off-by: inkcherry <mingzhi.liu@intel.com>	2025-05-14 22:29:38 -07:00
Harry Mellor	26d0419309	Update deprecated type hinting in `models` (#18132 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-14 22:06:50 -07:00
Cyrus Leung	d62a076e84	[Model] GritLM supports other attention backends (#18109 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-14 03:33:19 -07:00
Tao He	60f7624334	Implements dual-chunk-flash-attn backend for dual chunk attention with sparse attention support (#11844 )	2025-05-12 19:52:47 -07:00
Woosuk Kwon	b411418ff0	[Chore] Remove Sampler from Model Code (#17084 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-04-24 02:49:33 -07:00
YamPengLi	7699258ef0	[Model] Add Qwen3 and Qwen3MoE (#15289 ) Signed-off-by: YamPengLi <yampayne.lyp@alibaba-inc.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-04-07 04:06:41 -07:00
Tyler Michael Smith	4f5b059f14	Clean up unused padding_idx variables across many model definitions (#13240 ) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-03-04 21:27:00 +00:00
Harry Mellor	cdc1fa12eb	Remove unused kwargs from model definitions (#13555 )	2025-02-24 17:13:52 -08:00
Jee Jee Li	105b8ce4c0	[Misc] Reduce LoRA-related static variable (#13166 )	2025-02-22 00:21:30 -08:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628 ) - Add SPDX license headers to python source files - Check for SPDX headers using pre-commit commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:18:24 2025 -0500 Add SPDX license headers to python source files This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance. The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job. More information can be found on the SPDX site: - https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com> commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:36:32 2025 -0500 Check for SPDX headers using pre-commit Signed-off-by: Russell Bryant <rbryant@redhat.com> --------- Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Roger Wang	81763c58a0	[V1] Add V1 support of Qwen2-VL (#12128 ) Signed-off-by: Roger Wang <ywang@roblox.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: imkero <kerorek@outlook.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-19 19:52:13 +08:00
Michael Goin	9aa1519f08	Various cosmetic/comment fixes (#12089 ) Signed-off-by: mgoin <michael@neuralmagic.com>	2025-01-16 09:59:06 +00:00
kewang-xlnx	de0526f668	[Misc][Quark] Upstream Quark format to VLLM (#10765 ) Signed-off-by: kewang-xlnx <kewang@xilinx.com> Signed-off-by: kewang2 <kewang2@amd.com> Co-authored-by: kewang2 <kewang2@amd.com> Co-authored-by: Michael Goin <michael@neuralmagic.com>	2025-01-15 11:05:15 -05:00
Jee Jee Li	a3a3ee4e6f	[Misc] Merge bitsandbytes_stacked_params_mapping and packed_modules_mapping (#11924 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-01-15 07:49:49 +08:00
jiangjiadi	c994223d56	[Bugfix] update the prefix for qwen2 (#11795 ) Co-authored-by: jiadi.jjd <jiadi.jjd@antgroup.com>	2025-01-07 18:36:34 +00:00
Chen Zhang	e20c92bb61	[Kernel] Move attn_type to Attention.__init__() (#11690 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-01-07 00:11:28 +08:00
Cyrus Leung	3f3e92e1f2	[Model] Automatic conversion of classification and reward models (#11469 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-24 18:22:22 +00:00
Jee Jee Li	196c34b0ac	[Misc] Move weights mapper (#11443 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-12-24 13:05:25 +00:00
Cyrus Leung	bf0e382e16	[Model] Composite weight loading for multimodal Qwen2 (#10944 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-07 07:22:52 -07:00
Cyrus Leung	133707123e	[Model] Replace embedding models with pooling adapter (#10769 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-01 08:02:54 +08:00
Cyrus Leung	9b4b150395	[Bugfix] Ignore `lm_head` when loading embedding models (#10719 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-27 19:05:29 +00:00
Jee Jee Li	15cc2a9f1a	[Misc]Further reduce BNB static variable (#10597 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-11-26 22:54:12 -08:00
Cyrus Leung	cf73f0c95e	[Model] Enable optional prefix when loading embedding models (#10639 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-25 18:14:33 +00:00
Cyrus Leung	ed46f14321	[Model] Support `is_causal` HF config field for Qwen2 model (#10621 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-25 09:51:20 +00:00
Isotr0py	c4e464333e	[Misc] Add uninitialized params tracking for `AutoWeightsLoader` (#10327 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-18 09:07:46 +08:00
Roger Wang	643ecf7b11	[V1] Refactor model executable interface for all text-only language models (#10374 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-11-17 05:18:46 +00:00
Cyrus Leung	b40cf6402e	[Model] Support Qwen2 embeddings and use tags to select model tests (#10184 )	2024-11-14 20:23:09 -08:00

93 Commits