biondizzle/vllm - vllm

Author	SHA1	Message	Date
Lucas Wilkinson	6cdf015c3c	[Misc] Fix `Current vLLM config is not set.` warnings, assert to avoid issues in the future (#31747 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2026-01-08 15:20:49 -08:00
amitz-nv	ee21291825	[Model] Nemotron Parse 1.1 Support (#30864 ) Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2026-01-05 13:00:14 -08:00
wang.yuqi	911d38ed99	[Model] Let more models to support the score template. (#31335 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2026-01-05 11:54:26 +00:00
Cyrus Leung	aa3868ecfe	[Chore] Remove unused `noqa`s (#31263 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 05:38:46 -08:00
Nicolò Lucchesi	57e9bf1864	[CI] Whisper logprobs tests (#30504 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-12-13 10:49:11 +08:00
Lucas Wilkinson	3e41992fec	[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-12-12 05:57:47 -08:00
Cyrus Leung	7e24e5d4d6	[Deprecation] Remove deprecated task, seed and MM settings (#30397 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 19:59:39 -08:00
Yu Jiaqi	43e7593031	Support tokenization_kwargs override (#29794 ) Signed-off-by: piood <2477084691@qq.com>	2025-12-06 09:12:53 +00:00
Ilya Markov	4e26d3b09e	[Compile] Conditional compilation. Introduce compile_ranges (#24252 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Signed-off-by: Luka Govedič <luka.govedic@gmail.com> Signed-off-by: ProExpertProg <lgovedic@redhat.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by: Luka Govedič <luka.govedic@gmail.com>	2025-12-05 18:17:32 +00:00
Chukwuma Nwaugha	6e865b6a83	Refactor example prompts fixture (#29854 ) Signed-off-by: nwaughac@gmail.com	2025-12-05 06:44:32 +00:00
ImaGoodFella	60c3d413af	[Multimodal][Core] Optimize multimodal preprocessing cache by hashing image bytes instead of pixel values (#29621 ) Signed-off-by: Rahul Steiger <rasteiger@ethz.ch> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-02 21:49:02 +08:00
Boyuan Feng	3b221cb661	[BugFix] respect VLLM_LOGGING_LEVEL in logger (#29761 ) Signed-off-by: Boyuan Feng <boyuan@meta.com>	2025-12-02 07:49:16 +00:00
Chukwuma Nwaugha	ad7f714d62	hfrunner.classify should return list[list[float]] not list[str] (#29671 ) Signed-off-by: Chukwuma Nwaugha <nwaughac@gmail.com>	2025-11-29 13:57:00 +00:00
Angela Yi	4b17ce6815	Add gpu memory wait before test_async_tp (#28893 ) Signed-off-by: angelayi <yiangela7@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-28 20:19:05 -08:00
Nick Hill	7df331c66b	[BugFix] Fix chunked prompt logprobs + preemption (#29071 )	2025-11-22 16:07:18 -05:00
Lucas Wilkinson	30d6466238	[BugFix] Fix Eagle `IndexError: list index out of range` for even `num_speculative_tokens` (#29102 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-11-22 00:47:05 +00:00
Varun Sundar Rabindranath	fe1cd7704d	[Performance][B200] silu_mul_quant: pack scales in int32 (#28358 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-11-13 10:16:55 -08:00
wangxiyuan	428bc7bf1c	[V0 deprecation] Remove VLLM_USE_V1 usage in most modules (#27955 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-04 20:51:16 -08:00
Nick Hill	0cdbe7b744	[Core] Async scheduling + structured outputs compatibility (#26866 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-01 00:35:04 +00:00
Nick Hill	4fe5895361	[AsyncScheduling] Make async overlap work with logprobs (#27615 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-10-28 22:35:54 +00:00
Cyrus Leung	d31f7844f8	[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-19 05:20:55 -07:00
Isotr0py	6ac5e06f7c	[Chore] Clean up pytorch helper functions in `vllm.utils` (#26908 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: isotr0py <2037008807@qq.com>	2025-10-18 09:48:22 -07:00
Luka Govedič	bd7157a071	[torch.compile] Enable attention and allreduce fusion without custom ops enabled (#24604 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-17 08:10:23 -06:00
Cyrus Leung	d2740fafbf	[Chore] Separate out `vllm.utils.collections` (#26990 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 08:35:35 +00:00
wangxiyuan	8f4b313c37	[Misc] rename torch_dtype to dtype (#26695 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-10-15 12:11:48 +00:00
wang.yuqi	f54f85129e	[Model][2/N] Improve all pooling task | Support multi-vector retrieval (#25370 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-10-15 11:14:41 +00:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Yannick Schnider	6431be808f	[Tests] conftest: Extending VllmRunner and HfRunner to accept token_ids as input (#26295 ) Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-10-06 17:19:34 +00:00
Thomas Parnell	d3c84297c3	[CI] Add comment about the single cudagraph capture size that is used (#26252 )	2025-10-06 02:35:37 +00:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Cyrus Leung	b7e8e4e6be	[Bugfix] Always apply MM processor even when no MM items are passed (#26240 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-05 10:10:20 +00:00
Yannick Schnider	5446ad1d24	[test utils] correct wrong typing (#26159 ) Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>	2025-10-03 02:11:49 -07:00
Thomas Parnell	be8921fbba	Change size of single CUDA graph for CI to 4 (#26089 ) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>	2025-10-02 14:14:28 +00:00
Harry Mellor	a332b84578	[CI] Only capture a single CUDA graph size in CI by default (#25951 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-01 10:03:44 +01:00
Isotr0py	27ec3c78f3	[CI/Build] Fix v1 OOT registration test (#25547 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-24 08:03:13 +00:00
Woosuk Kwon	26e673fe93	[V0 Deprecation] Remove V0 Sequence class & Sampler (#25332 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>	2025-09-21 08:52:15 -07:00
Woosuk Kwon	52c2a8d4ad	[V0 Deprecation] Remove LLMEngine (#25033 ) Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai> Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-20 17:56:30 -07:00
Cyrus Leung	bef180f009	[V0 Deprecation] Enable the remaining multimodal tests in V1 (#25307 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-20 17:50:58 +00:00
Cyrus Leung	3d9a1d2de5	[V1] Support `LLM.apply_model` (#18465 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-20 07:14:35 +00:00
Harry Mellor	aed16879a9	Move `ModelConfig` from `config/__init__.py` to `config/model.py` (#25252 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-19 16:22:33 +00:00
Isotr0py	f2718d2948	[Misc] Cleanup test conftest for deprecated encoder-decoder models (#25231 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-19 07:44:56 +00:00
Nick Hill	4db4426404	[CI] Fail subprocess tests with root-cause error (#23795 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-09-10 13:53:21 -07:00
dsinghvi	70549c1245	[CI/Build] Serve images used by multimodal tests through local HTTP Server (#23907 ) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-09-03 16:13:11 +08:00
Christian Pinto	1cb39dbcdd	[Misc] IO Processor plugins for pooling models (#22820 ) Signed-off-by: Christian Pinto <christian.pinto@ibm.com> Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: Max de Bayser <mbayser@br.ibm.com>	2025-08-31 23:07:12 -07:00
wang.yuqi	11a7fafaa8	[New Model]: Support GteNewModelForSequenceClassification (#23524 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-08-28 15:36:42 +08:00
Kyle Sayers	22feac8e95	[Transform] [Quantization] Add transforms to compressed tensors (#22486 )	2025-08-28 02:43:48 -04:00
Chen Zhang	142ac08030	[Frontend] Optimize beam search performance by limiting concurrency (#23599 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-27 04:59:14 +00:00
wang.yuqi	f856c33ce9	[Model] Add multi_label_classification support (#23173 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-08-19 12:54:30 +00:00
Isotr0py	3dddbf1f25	[Misc] Add tensor schema test coverage for multimodal models (#21754 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Isotr0py <2037008807@qq.com>	2025-08-03 00:52:14 -07:00
wang.yuqi	65f311ce59	[Frontend] Add LLM.reward specific to reward models (#21720 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-29 20:56:03 -07:00

198 Commits