biondizzle/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
elvischenv	83156c7b89	[NVIDIA] Support Flashinfer TRT-LLM Prefill Attention Kernel (#22095 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-08-05 02:45:34 -07:00
lkchen	f4f4e7ef27	[V0 deprecation][P/D] Deprecate v0 `KVConnectorBase` code (1/2) (#21785 ) Signed-off-by: Linkun Chen <github@lkchen.net>	2025-08-04 19:11:33 -07:00
Isotr0py	3dddbf1f25	[Misc] Add tensor schema test coverage for multimodal models (#21754 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Isotr0py <2037008807@qq.com>	2025-08-03 00:52:14 -07:00
Michael Goin	88faa466d7	[CI] Initial tests for SM100 Blackwell runner (#21877 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-08-01 16:18:38 -07:00
Charent	ad57f23f6a	[Bugfix] Fix: Fix multi loras with tp >=2 and LRU cache (#20873 ) Signed-off-by: charent <19562666+charent@users.noreply.github.com>	2025-07-31 19:48:13 -07:00
Ilya Markov	6e672daf62	Add FlashInfer allreduce RMSNorm Quant fusion (#21069 ) Signed-off-by: ilmarkov <imarkov@redhat.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Co-authored-by: ilmarkov <imarkov@redhat.com>	2025-07-31 13:58:38 -07:00
Alexei-V-Ivanov-AMD	0780bb5783	Removing amdproduction Tests (#22027 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>	2025-07-31 09:53:27 -07:00
Simon Mo	452b2a3180	[ci] mark blackwell test optional for now (#21878 )	2025-07-29 18:03:27 -07:00
Simon Mo	0d0cc9e150	[ci] add b200 test placeholder (#21866 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-07-29 17:11:50 -07:00
Reza Barazesh	37efc63b64	[V0 deprecation] Guided decoding (#21347 ) Signed-off-by: Reza Barazesh <rezabarazesh@meta.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-29 03:15:30 -07:00
Michael Goin	afa2607596	[CI] Parallelize Kernels MoE Test (#21764 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-28 18:56:24 -07:00
Robert Shaw	d5b981f8b1	[DP] Internal Load Balancing Per Node [`one-pod-per-node`] (#21238 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-07-23 20:57:32 -07:00
Ming Yang	772ce5af97	[Misc] Add dummy maverick test to CI (#21324 ) Signed-off-by: Ming Yang <minos.future@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-07-23 20:22:42 -07:00
Nick Hill	316b1bf706	[Tests] Add tests for headless internal DP LB (#21450 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-23 07:49:25 -07:00
Alexei-V-Ivanov-AMD	107111a859	Changing "amdproduction" allocation. (#21409 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>	2025-07-22 20:48:31 -07:00
Cyrus Leung	c401c64b4c	[CI/Build] Fix model executor tests (#21387 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-22 20:25:37 -07:00
Michael Goin	005ae9be6c	Fix bad lm-eval fork (#21318 )	2025-07-21 10:47:51 -07:00
Seiji Eicher	d1fb65bde3	Enable v1 metrics tests (#20953 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2025-07-20 03:22:02 +00:00
Woosuk Kwon	dd572c0ab3	[V0 Deprecation] Remove V0 Spec Decode workers (#21152 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-18 21:47:50 -07:00
Cyrus Leung	c847e34b39	[CI/Build] Fix wrong path in Transformers Nightly Models Test (#20994 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-15 08:53:16 -07:00
Michael Goin	946aadb4a0	[CI/Build] Split Entrypoints Test into LLM and API Server (#20945 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-15 02:44:18 +00:00
Isotr0py	6d0cf239c6	[CI/Build] Add Transformers nightly tests in CI (#20924 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-07-14 16:33:17 +00:00
shineran96	4bed167768	[Model][VLM] Support JinaVL Reranker (#20260 ) Signed-off-by: shineran96 <shinewang96@gmail.com>	2025-07-10 10:43:43 -07:00
Alexei-V-Ivanov-AMD	536fd33003	[CI] Trimming some failing test groups from AMDPRODUCTION. (#20390 )	2025-07-03 08:21:31 -07:00
Nick Hill	657f2f301a	[DP] Support external DP Load Balancer mode (#19790 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-02 10:21:52 -07:00
Thomas Parnell	8615d9776f	[CI/Build] Add new CI job to validate Hybrid Models for every PR (#20147 ) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>	2025-06-27 23:00:25 -07:00
Yang Wang	8b64c895c0	[CI] Sync test dependency with test.in for torch nightly (#19632 ) Signed-off-by: Yang Wang <elainewy@meta.com> Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Concurrensee <yida.wu@amd.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-06-26 20:55:25 -07:00
Bowen Wang	e9fd658a73	[Feature] Expert Parallelism Load Balancer (EPLB) (#18343 ) Signed-off-by: Bowen Wang <abmfy@icloud.com>	2025-06-26 15:30:21 -07:00
Nick Hill	c40692bf9a	[Misc] Add parallel state `node_count` function (#20045 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-06-25 13:38:53 -07:00
Nick Hill	8619e7158c	[BugFix] Fix multi-node offline data parallel (#19937 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-06-24 12:45:20 -07:00
kourosh hakhamaneshi	5e666f72cd	[Bugfix][Ray] Set the cuda context eagerly in the ray worker (#19583 )	2025-06-19 22:01:16 -07:00
Alexei-V-Ivanov-AMD	4719460644	Fixing Chunked Prefill Test. (#19762 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>	2025-06-19 01:36:16 -07:00
Concurrensee	d65668b4e8	Adding "AMD: Multi-step Tests" to amdproduction. (#19508 ) Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-06-13 17:08:51 -07:00
kourosh hakhamaneshi	e6aab5de29	Revert "[Build/CI] Add tracing deps to vllm container image (#15224 )" (#19378 )	2025-06-12 17:26:40 -07:00
Luka Govedič	f98548b9da	[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Sage Moore <sage@neuralmagic.com>	2025-06-12 08:31:04 -07:00
Jerry Zhang	c8134bea15	Fix AOPerModuleConfig name changes (#18869 ) Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>	2025-06-05 18:51:32 -07:00
Woosuk Kwon	b124e1085b	[Bugfix] Fix FA3 full cuda graph correctness (#19106 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-06-03 23:10:15 -07:00
Yan Ru Pei	b712be98c7	feat: add data parallel rank to KVEventBatch (#18925 )	2025-06-03 17:14:20 -07:00
Concurrensee	4ce42f9204	Adding "LoRA Test %N" to AMD production tests (#18929 ) Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>	2025-06-02 20:46:44 -07:00
Nick Hill	2dbe8c0774	[Perf] API-server scaleout with many-to-many server-engine comms (#17546 )	2025-05-30 08:17:00 -07:00
Rabi Mishra	5f1d0c8118	[Bugfix][Failing Test] Fix test_vllm_port.py (#18618 ) Signed-off-by: rabi <ramishra@redhat.com>	2025-05-30 17:13:47 +08:00
Rabi Mishra	b78f844a67	[Bugfix][FailingTest]Fix test_model_load_with_params.py (#18758 ) Signed-off-by: rabi <ramishra@redhat.com>	2025-05-28 05:42:54 +00:00
Mark McLoughlin	06a0338015	[V1][Metrics] Add API for accessing in-memory Prometheus metrics (#17010 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-27 09:37:06 +00:00
Cyrus Leung	82e2339b06	[Doc] Move examples and further reorganize user guide (#18666 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-26 07:38:04 -07:00
Isotr0py	0877750029	[CI/Build] Split pooling and generation extended language models tests in CI (#18705 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-26 04:00:08 -07:00
Michael Goin	0ddf88e16e	[CI] Enable test_initialization to run on V1 (#16736 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-05-23 15:09:44 -07:00
Cyrus Leung	6dd51c7ef1	[CI/Build] Fix V1 flag being set in entrypoints tests (#18598 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-23 05:51:53 -07:00
Harry Mellor	a1fe24d961	Migrate docs from Sphinx to MkDocs (#18145 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-23 02:09:53 -07:00
cascade	71ea614d4a	[Feature]Add async tensor parallelism using compilation pass (#17882 ) Signed-off-by: cascade812 <cascade812@outlook.com>	2025-05-23 01:03:34 -07:00
Sanger Steel	c32e249a23	[Frontend] [Core] Add Tensorizer support for V1, LoRA adapter serialization and deserialization (#17926 ) Signed-off-by: Sanger Steel <sangersteel@gmail.com>	2025-05-22 18:44:18 -07:00

528 Commits