biondizzle/vllm - vllm

Author	SHA1	Message	Date
Michael Goin	eb4205fee5	[UX] Integrate DeepGEMM into vLLM wheel via CMake (#37980) Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Claude <noreply@anthropic.com>	2026-04-08 18:56:32 -07:00
Andrey Talman	2111997f96	[release 2.11] Update to torch 2.11 (#34644)	2026-04-07 18:55:48 -07:00
Woosuk Kwon	4f85bae9d6	[Docs][Model Runner V2] Add Design Docs (#35819) Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>	2026-03-02 19:58:14 -08:00
Kyle Sayers	64ac1395e8	[Docs] Clean up speculators docs (#34065) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2026-02-18 13:48:11 -08:00
Michael Goin	c39ee9ee2b	[Docs] Add sections on process architecture and minimum CPU resources (#33940) It seems users can be confused about vLLM's performance when running with very small amounts of CPU cores available. We are missing a clear overview of what vLLM's process architecture is, so I added this along with some diagrams in arch_overview.md, and included a section on CPU resource recommendations in optimization.md Signed-off-by: mgoin <mgoin64@gmail.com>	2026-02-06 15:26:43 +00:00
Orion Reblitz-Richardson	68b0a6c1ba	[CI][torch nightlies] Use main Dockerfile with flags for nightly torch tests (#30443) Signed-off-by: Orion Reblitz-Richardson <orionr@meta.com> Signed-off-by: Orion Reblitz-Richardson <orionr@gmail.com> Co-authored-by: Kevin H. Luu <khluu000@gmail.com>	2026-01-23 10:22:56 -08:00
Michael Goin	6b2a672e47	[Doc] Add Claude code usage example (#31188) Signed-off-by: mgoin <mgoin64@gmail.com>	2026-01-08 13:50:23 +08:00
Amr Mahdi	ff21a0fc85	[docker] Restructure Dockerfile for more efficient and cache-friendly builds (#30626) Signed-off-by: Amr Mahdi <amrmahdi@meta.com>	2025-12-15 18:52:19 -08:00
Amr Mahdi	f5d3d93c40	[docker] Build CUDA kernels in separate Docker stage for faster rebuilds (#29452) Signed-off-by: Amr Mahdi <amrmahdi@meta.com>	2025-12-03 11:41:53 +00:00
Benjamin Bartels	4d6afcaddc	[CI/Build] Moves to cuda-base runtime image while retaining minimal JIT dependencies (#29270) Signed-off-by: bbartels <benjamin@bartels.dev> Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>	2025-11-24 11:40:54 -08:00
Benjamin Bartels	eb5352a770	[CI/build] Removes source compilation from runtime image (#26966) Signed-off-by: bbartels <benjamin@bartels.dev>	2025-11-22 10:23:09 -08:00
Chenguang Zheng	4ccffe561f	[Core] Encoder separation for Encode-Prefill-Decode Disaggregation (#25233) Signed-off-by: n00909098 <nguyen.kha.long@huawei.com> Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com> Signed-off-by: herotai214 <herotai214@gmail.com> Signed-off-by: Khuong Le <khuong.le.manh@huawei.com> Signed-off-by: Khuong Le <lemanhkhuong2611@gmail.com> Co-authored-by: n00909098 <nguyen.kha.long@huawei.com> Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com> Co-authored-by: herotai214 <herotai214@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Khuong Le <khuong.le.manh@huawei.com> Co-authored-by: Khuong Le <lemanhkhuong2611@gmail.com>	2025-11-11 18:58:33 -08:00
Richard Zou	65ac8d8dc4	[Docs] Add guide to debugging vLLM-torch.compile integration (#28094) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-11-05 21:31:46 +00:00
Matvei Pashkovskii	130aa8cbcf	Add load pattern configuration guide to benchmarks (#26886) Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com> Signed-off-by: Matvei Pashkovskii <matvei.pashkovskii@amd.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-28 10:49:15 -07:00
Huy Do	becb7de40b	Update PyTorch to 2.9.0+cu129 (#24994) Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-21 17:20:18 -04:00
fhl2000	63773a6200	[Docs] add docs for cuda graph v1 (#24374) Signed-off-by: fhl <2410591650@qq.com> Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-07 05:25:05 -07:00
Tyler Michael Smith	27edd2aeb4	[Build/CI] Revert back to Ubuntu 20.04, install python 3.12 with uv (#26103) Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Simon Mo <simon.mo@hey.com>	2025-10-02 22:21:01 -07:00
Huy Do	d4e7a1152d	Update base image to 22.04 (jammy) (#26065) Signed-off-by: Huy Do <huydhn@gmail.com>	2025-10-02 05:48:04 -07:00
Sergio Paniego Blanco	099aaee536	Add Hugging Face Inference Endpoints guide to Deployment docs (#25886) Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-30 14:35:06 +00:00
Chen Zhang	d696f86e7b	[doc] Hybrid KV Cache Manager design doc (#22688) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 20:19:05 +00:00
WeiQing Chen	289b18e670	[Docs] Update features/disagg_prefill, add v1 examples and development (#22165) Signed-off-by: David Chen <530634352@qq.com>	2025-08-07 00:59:23 -07:00
Cyrus Leung	fcfd1eb9c5	[Doc] Remove vLLM prefix and add citation for PagedAttention (#21910) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-30 06:36:34 -07:00
Chen Zhang	76080cff79	[DOC] Fix path of v1 related figures (#21868) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-07-29 19:45:18 -07:00
Varun Sundar Rabindranath	f03e9cf2bb	[Doc] Add FusedMoE Modular Kernel Documentation (#21623) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-07-29 10:32:30 -07:00
Brittany	759b87ef3e	[TPU] Add an optimization doc on TPU (#21155) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-29 07:23:19 -07:00
Michael Yao	260127ea54	[Docs] Add intro and fix 1-2-3 list in frameworks/open-webui.md (#19199) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-07-16 06:11:38 -07:00
Nick Hill	9907fc4494	[Docs] Data Parallel deployment documentation (#20768) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-11 09:42:10 -07:00
Harry Mellor	a1fe24d961	Migrate docs from Sphinx to MkDocs (#18145) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-23 02:09:53 -07:00

28 Commits