biondizzle/vllm - vllm

Author	SHA1	Message	Date
Ben Browning	8477fe427d	[Tool] `adjust_request` to reasoning parser, and Gemma4 fixes (#39027 ) Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-04-08 19:04:04 +00:00
Vedant V Jhaveri	2e56975657	Generative Scoring (#34539 ) Signed-off-by: Vedant Jhaveri <vjhaveri@linkedin.com> Co-authored-by: Vedant Jhaveri <vjhaveri@linkedin.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2026-03-31 16:02:11 -07:00
wang.yuqi	ed359c497a	[Model] Deprecate the score task (this will not affect users). (#37537 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2026-03-20 08:07:56 +00:00
Flora Feng	9040151fe1	[V0 Deprecation] Deprecate --disable-frontend-multiprocessing (#37612 ) Signed-off-by: sfeng33 <4florafeng@gmail.com>	2026-03-20 11:31:43 +08:00
Sage	00f8e0d211	[Frontend] Delegate tokenization serving preprocessing to OpenAIServingRender (#37266 ) Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>	2026-03-17 11:22:54 +00:00
Chauncey	6682c231fa	[Bugfix] Add error handling for FINISHED_ERROR in OpenAIServing (#37148 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-03-16 16:27:47 +00:00
Sergey Zinchenko	4a718e770d	[Bug] Fix Failure in /v1/chat/completions/render for Multimodal Requests (https://github.com/vllm-project/vllm/issues/35665 ) (#35684 )	2026-03-14 14:10:11 +00:00
Sage	06e0bc21d2	[Frontend] Split `OpenAIServingModels` into `OpenAIModelRegistry` + `OpenAIServingModels` (#36536 ) Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>	2026-03-12 03:29:37 -07:00
Sage	4497431df6	[Frontend] Add GPU-less render serving path (`vllm launch render`) (#36166 )	2026-03-08 16:35:09 +01:00
Ning Xie	176c799f4c	[openai api] log exception in exception handler (1/N) (#31164 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2026-03-05 16:00:12 +00:00
Hyunkyun Moon	bc6be89d16	[Frontend] Add vllm launch command for GPU-less preprocessing serving (#34551 ) Signed-off-by: HyunKyun Moon <mhg5303@gmail.com>	2026-03-04 18:41:52 +00:00
pougetat	1659b2e058	[Feature] Add basic metrics for /realtime endpoint (#35500 ) Signed-off-by: Thomas Pouget-Abadie <thomaspou@microsoft.com> Signed-off-by: pougetat <thomas.pougetabadie@gmail.com> Co-authored-by: Thomas Pouget-Abadie <thomaspou@microsoft.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-03-04 19:56:32 +08:00
wang.yuqi	dab1de9f38	[Frontend][CI] Consolidate instrumentator entrypoints (#34123 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2026-02-10 07:30:19 +00:00
kourosh hakhamaneshi	a75a5b54c7	[bug-fix] supported_tasks is breaking backward compatibility at init_app_state (#34027 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2026-02-09 09:46:46 +08:00
emricksini-h	325ab6b0a8	[Feature] OTEL tracing during loading (#31162 )	2026-02-05 16:59:28 -08:00
Nicolò Lucchesi	20f5d185a6	[Misc] Rename `translations` to `speech_to_text` for OAI serving component (#33904 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2026-02-05 19:16:52 +00:00
Patrick von Platen	10152d2194	[Realtime API] Adds minimal realtime API based on websockets (#33187 ) Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Nick Hill <nickhill123@gmail.com>	2026-01-30 18:41:29 +08:00
wang.yuqi	7cbbca9aaa	[Frontend] Cleanup api server (#33158 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com>	2026-01-27 15:18:10 +00:00
wang.yuqi	76139d0801	[Frontend] Frontend will only attach supported tasks corresponding entrypoints. (#33139 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2026-01-27 12:15:43 +00:00
Jared Wen	6ee7f18f33	[Logging] add `--disable-access-log-for-endpoints` CLI option (#30011 ) Add a new CLI option --disable-access-log-for-endpoints to suppress uvicorn access logs for specified endpoints (e.g., /health, /metrics, /ping). This addresses the need to reduce log noise in production environments where health check endpoints are frequently polled by load balancers or monitoring systems, generating excessive log entries that obscure meaningful request logs. Fixes #29982 Signed-off-by: JaredforReal <w13431838023@gmail.com>	2026-01-26 21:49:03 +00:00
7. Sun	0f19427db5	[Perf] Cache exc.errors() result in validation exception handler (#32984 ) Signed-off-by: 7. Sun <jhao.sun@gmail.com>	2026-01-24 02:01:35 -08:00
Nick Hill	7fe255889e	[Misc] Log vLLM logo when starting server (#32796 ) Signed-off-by: Nick Hill <nickhill123@gmail.com>	2026-01-23 11:15:12 +08:00
RickyChen / 陳昭儒	69d09fdd6c	[Feature] Add --ssl-ciphers CLI argument for TLS cipher control (#30937 ) Signed-off-by: rickychen-infinirc <ricky.chen@infinirc.com>	2026-01-22 09:53:24 -08:00
Cyrus Leung	d117a4d1a9	[Frontend] Introduce Renderer for processing chat messages (using `ModelConfig`) (#30200 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-22 12:44:22 +00:00
wang.yuqi	4ae77dfd42	[Frontend][1/n] Make pooling entrypoints request schema consensus | CompletionRequest (#32395 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2026-01-16 06:17:04 +00:00
Chauncey	707b44cc28	[Refactor] [11/N] to simplify the mcp architecture (#32396 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-01-15 18:49:31 +08:00
Chauncey	4c1c501a7e	[Refactor] [10/N] to simplify the vLLM openai completion serving architecture (#32369 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-01-15 07:41:34 +00:00
Chauncey	00e6402d56	[Frontend] track responsesAPI server_load (#32323 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-01-14 12:00:37 +00:00
Cyrus Leung	3f28174c6a	[Frontend] Standardize use of `create_error_response` (#32319 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-14 11:22:26 +00:00
Chauncey	769d0629e1	[Refactor] [9/N] to simplify the vLLM openai translations serving architecture (#32313 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-01-14 10:20:58 +00:00
Chauncey	9312a6c03a	[Refactor] [8/N] to simplify the vLLM openai responsesapi_serving architecture (#32260 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-01-14 07:26:24 +00:00
Chauncey	fefce49807	[Refactor] [6/N] to simplify the vLLM openai chat_completion serving architecture (#32240 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2026-01-13 13:01:39 +00:00
Kevin Šuc	ac9f9330e6	Rename --exclude-log-deltas to --enable-log-deltas (#32020 ) Signed-off-by: Catacomba <kevinsuc16@gmail.com>	2026-01-09 15:30:40 +00:00
Cyrus Leung	aa125ecf0e	[Frontend] Improve error message (#31987 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-01-08 20:07:03 +00:00
R3hankhan	1ab055efe6	[OpenAI] Extend VLLMValidationError to additional validation parameters (#31870 ) Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>	2026-01-07 14:45:49 +00:00
Kevin Šuc	79ed460dd5	[Frontend] [Doc] Exclude log deltas feature (#30322 ) Signed-off-by: Catacomba <kevinsuc16@gmail.com> Signed-off-by: Kevin Šuc <kevinsuc16@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2026-01-05 16:34:35 +00:00
Hojin Yang	dc837bc23e	feat(frontend): add --default-chat-template-kwargs CLI argument (#31343 ) Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>	2025-12-30 03:38:47 +00:00
RickyChen / 陳昭儒	b3a2bdf1ac	[Feature] Add offline FastAPI documentation support for air-gapped environments (#30184 ) Signed-off-by: rickychen-infinirc <ricky.chen@infinirc.com> Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-29 16:22:39 +00:00
R3hankhan	769f27e701	[OpenAI] Add parameter metadata to validation errors (#30134 ) Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>	2025-12-23 11:30:12 +00:00
Jakub Zakrzewski	23daef548d	[Frontend] Support using chat template as custom score template for reranking models (#30550 ) Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com> Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-23 11:19:16 +00:00
Nathan Price	05a83dc6ee	feat(api): Eager chat template warmup to eliminate first-request latency (#30700 ) Signed-off-by: Nathan Price <nathan@abridge.com>	2025-12-18 00:01:29 +00:00
Chauncey	9ad5b21710	[Refactor] [4/N] Move VLLM_SERVER_DEV endpoints into the serve directory (#30749 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-17 02:27:30 -08:00
Chauncey	2a1776b7ac	[Refactor] [2/N] Move tool parsers into the vLLM main directory (#30675 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-15 12:54:52 +00:00
Cyrus Leung	e83b7e379c	Revert "[Renderer] Separate out `RendererConfig` from `ModelConfig` (#30145 )" (#30199 )	2025-12-07 00:00:22 -08:00
Cyrus Leung	27f4c2fd46	[Renderer] Separate out `RendererConfig` from `ModelConfig` (#30145 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 23:15:42 -08:00
Tova Movshovitz	adb315060c	[KVConnector][Feature] Support KV connector cache reset via /reset_prefix_cache (#27170 ) Signed-off-by: tovam <tovam@pliops.com> Signed-off-by: Tova Movshovitz <tovam@pliops.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-05 18:33:26 +00:00
Chauncey	3f42b05fbc	[Refactor] [1/N] to simplify the vLLM serving architecture (#28040 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-03 01:26:39 -08:00
Zhuohan Li	d0cd728907	[Core] Support reseting all running requests' KV while calling `reset_prefix_cache` (#28827 ) Signed-off-by: Zhuohan Li <zhuohan123@gmail.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-12-02 02:25:05 +00:00
sangbumlikeagod	092bb73b8a	[Frontend] add 'verbose_json' and 'timestamp' feature on Whisper Transcription/Translation (#24209 ) Signed-off-by: sangbumlikeagod <oironese@naver.com> Signed-off-by: sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>	2025-12-01 18:19:17 +01:00
wang.yuqi	62de4f4257	[Frontend] Resettle pooling entrypoints (#29634 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-01 15:30:43 +08:00


339 Commits