biondizzle/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Andrew Bennett	f243abc92d	Fix various typos found in `docs` (#32212 ) Signed-off-by: Andrew Bennett <potatosaladx@meta.com>	2026-01-13 03:41:47 +00:00
Andy Zhang	e68b0dad8b	doc: Update model name for Qwen3-Coder in documentation (#32185 ) Signed-off-by: Andy Zhang <xiazhang@microsoft.com>	2026-01-12 07:10:50 -08:00
Or Ozeri	9cddbdba6d	OffloadingConnector: Add cpu_bytes_to_use configuration (#24498 ) Signed-off-by: Or Ozeri <oro@il.ibm.com>	2026-01-12 15:00:43 +00:00
Jee Jee Li	05e8981234	[Doc] Improve LoRA docs (#32159 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-01-12 02:19:17 -08:00
Jeremy Teboul	657e9c0e18	[Fix] Introduce audio channels spec (#31595 ) Signed-off-by: Jeremy Teboul <jeremyte@meta.com>	2026-01-09 19:34:51 +00:00
vSeamar	6f351548b2	[Frontend] Implement robust video frame recovery for corrupted videos (#29197 ) Signed-off-by: cmartinez <cmartinez@roblox.com> Signed-off-by: vSeamar <cmartinez@roblox.com>	2026-01-07 01:13:24 +00:00
Jee Jee Li	cbd4690a03	[LoRA]Disable linear LoRA kernel PDL (#31777 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2026-01-06 23:12:25 +08:00
BlankR	6ebb66ccea	[Doc] Fix format of multimodal_inputs.md (#31800 ) Signed-off-by: BlankR <hjyblanche@gmail.com>	2026-01-06 03:30:24 -08:00
labAxiaoming	a01f2faedf	Add multimodal input method in the documentation (#31601 ) Signed-off-by: xiaoming <1259730330@qq.com>	2026-01-02 12:43:30 +00:00
Hojin Yang	dc837bc23e	feat(frontend): add --default-chat-template-kwargs CLI argument (#31343 ) Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>	2025-12-30 03:38:47 +00:00
qli88	0f35429a0c	[CI]Test Group 'NixlConnector PD accuracy tests' is fixed (#31460 ) Signed-off-by: qli88 <qiang.li2@amd.com>	2025-12-29 23:48:56 +00:00
Harry Mellor	decc244767	[Docs] Use relative `md` links instead of absolute `html` links for cross referencing (#31494 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-29 13:33:44 +00:00
Jee Jee Li	ce1eafd1a5	[Core] Initialize LoRA support for tower and connector in multi-modal models (#26674 ) Signed-off-by: bk-201 <joy25810@foxmail.com> Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com> Co-authored-by: bk-201 <joy25810@foxmail.com> Co-authored-by: prashanth058 <prashanth.dannamaneni@uipath.com> Co-authored-by: Anexdeus <5142168@mail.ru>	2025-12-26 04:48:20 -08:00
Mark Gatere	ba25a65992	[Frontend] add FunctionGemma tool parser support (#31218 ) Signed-off-by: gateremark <gateremg@gmail.com>	2025-12-25 15:29:25 +08:00
Amith KK	42826bbccd	[Doc] Add tool call parser documentation for GPT-OSS models (#31212 ) Signed-off-by: Amith KK <amithkumaran@gmail.com>	2025-12-25 05:29:10 +00:00
Cyrus Leung	d201807339	[Chore] Bump `lm-eval` version (#31264 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 05:39:13 -08:00
Yan Ma	f1c2c20136	[XPU] decrease IGC_ForceOCLSIMDWidth for speculative decoding triton-xpu kernel compilation (#30538 ) Signed-off-by: Yan Ma <yan.ma@intel.com>	2025-12-23 05:22:15 +00:00
CedricHuang	19cc9468fd	[Feature]: Support NVIDIA ModelOpt HF FP8 variants FP8_PER_CHANNEL_PER_TOKEN and FP8_PB_WO in vLLM (#30957 )	2025-12-21 22:34:49 -05:00
Steve Westerhouse	9d701e90d8	[Doc] Clarify FP8 KV cache computation workflow (#31071 ) Signed-off-by: westers <steve.westerhouse@origami-analytics.com>	2025-12-22 08:41:37 +08:00
Yuxuan Zhang	8a7a414374	GLM-4.7 Tool Parser and Doc Update (#30876 ) Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>	2025-12-20 00:09:58 +00:00
Chauncey	2a1776b7ac	[Refactor] [2/N] Move tool parsers into the vLLM main directory (#30675 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-15 12:54:52 +00:00
Xu Song	25221b44bb	Add more docs for regex (#30106 ) Signed-off-by: Xu Song <xusong.vip@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-11 00:12:21 +00:00
Wilson Wu	3bdd426636	Fix typos in comments across multiple files (#30345 ) Signed-off-by: Wilson Wu <iwilsonwu@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-12-09 20:05:28 -08:00
Hubert de La Jonquiere	c72ea10723	[Structured Output][Reasoning] Improves decoding throughput for models using single-token reasoning endings. (#30056 )	2025-12-09 18:54:08 +08:00
Fanli Lin	c2e1987a6e	[Doc] update Intel GPU MM status in Feature x Hardware matrix (#30294 ) Signed-off-by: Lin, Fanli <fanli.lin@intel.com>	2025-12-09 05:16:44 +00:00
Or Ozeri	4c6fd25880	kv_transfer: Rename the shared storage connectors (#30201 ) Signed-off-by: Or Ozeri <oro@il.ibm.com>	2025-12-08 20:46:09 -08:00
Ming Yang	60d17251c9	[Disagg] Support large batch size in proxy server and update NixlConnector doc for DP (#28782 ) Signed-off-by: Ming Yang <minos.future@gmail.com>	2025-12-09 00:01:08 +00:00
Zhiyu	cd00c443d2	[Misc] Rename TensorRT Model Optimizer to Model Optimizer (#30091 ) Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>	2025-12-08 07:05:27 +00:00
jeremyteboul	dce6d229f7	Support multiple image/audio embeddings per requests (#29988 ) Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com> Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>	2025-12-07 04:34:24 +00:00
Viacheslav	21bb323542	Gigachat 3 tool parser and tests (#29905 ) Signed-off-by: Viacheslav Barinov <viacheslav.teh@gmail.com>	2025-12-06 12:04:14 +00:00
Hubert de La Jonquiere	befb59e5b1	[Model] Add Holo2 reasoning parser (#30048 ) Signed-off-by: hdlj-h <hubert@hcompany.ai>	2025-12-05 10:38:45 +08:00
Tao Yun	6dcb07f676	support qwen3-vl handle requests with embeddings (#30037 ) Signed-off-by: taoyun <1069423820@qq.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-12-04 17:34:06 +00:00
wang.yuqi	74c4d80c6c	[Model][6/N] Improve all pooling task | Support chunked prefill with ALL pooling (#27145 ) Signed-off-by: wang.yuqi <noooop@126.com> Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-04 13:44:15 +00:00
dtc	842aba501d	[P/D] Introduce Mooncake Transfer Engine as kv_connector (#24718 ) Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com> Signed-off-by: dtc <dtcccc@linux.alibaba.com> Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>	2025-12-04 09:51:36 +00:00
Cyrus Leung	9ae2f60374	[Misc] Various cleanups for MM input processing (#29970 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-04 06:22:20 +00:00
Cyrus Leung	34a984274e	[Misc] Refactor tokenizer interface (#29693 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-29 04:02:21 -08:00
Wilson Wu	18523b87f6	[Docs] Update supported models for Olmo 3 in tool calling documentation (#29411 ) Signed-off-by: Wilson Wu <iwilsonwu@gmail.com>	2025-11-28 02:53:55 +00:00
Harry Mellor	316c8492bf	Scheduled removal of `guided_*` config fields (#29326 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-25 05:24:05 +00:00
Tyler Michael Smith	4dd42db566	Remove VLLM_SKIP_WARMUP tip (#29331 ) Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>	2025-11-24 22:16:05 +00:00
Julien Denize	57430fc95c	Default model load/config/tokenizer to `mistral` format if relevant files exist (#28659 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-11-21 13:58:59 -08:00
jeremyteboul	0730414999	[Core] Add audio_embeds support to chat completions (#29059 ) Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com> Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>	2025-11-21 11:39:47 +08:00
Rob Mulla	dd39f91edb	[Doc] cleanup TPU documentation and remove outdated examples (#29048 ) Signed-off-by: Rob Mulla <rob.mulla@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-21 00:05:59 +00:00
Didier Durand	09540cd918	[Doc]: fix typos in various files (#29010 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-19 04:56:21 -08:00
Didier Durand	7ed27f3cb5	[Doc]: fix typos in various files (#28945 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-18 22:52:30 -08:00
Uranus	6a25ea5f0e	[Docs] Update oneshot imports (#28188 ) Signed-off-by: UranusSeven <109661872+UranusSeven@users.noreply.github.com>	2025-11-19 05:30:08 +00:00
Kevin H. Luu	c64c0b78de	[chore] Move the rest of wikimedia url to S3 (#28921 ) Signed-off-by: Kevin H. Luu <khluu000@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-11-18 09:44:18 -08:00
Didier Durand	083cf326dc	[Doc]: fix typos in various files (#28863 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-17 20:32:14 -08:00
Didier Durand	63fed55506	[Doc]: fix typos in various files (#28811 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-16 14:30:06 +00:00
Didier Durand	2bb4435cb7	[Doc]: fix typos in various files (#28567 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-15 19:27:50 +00:00
Chauncey	5c9ad138d5	[Frontend] supports interleaved thinking (#28531 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-11-13 16:14:13 +08:00

178 Commits