Jee Jee Li
|
cbd4690a03
|
[LoRA]Disable linear LoRA kernel PDL (#31777)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2026-01-06 23:12:25 +08:00 |
|
BlankR
|
6ebb66ccea
|
[Doc] Fix format of multimodal_inputs.md (#31800)
Signed-off-by: BlankR <hjyblanche@gmail.com>
|
2026-01-06 03:30:24 -08:00 |
|
labAxiaoming
|
a01f2faedf
|
Add multimodal input method in the documentation (#31601)
Signed-off-by: xiaoming <1259730330@qq.com>
|
2026-01-02 12:43:30 +00:00 |
|
Hojin Yang
|
dc837bc23e
|
feat(frontend): add --default-chat-template-kwargs CLI argument (#31343)
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
|
2025-12-30 03:38:47 +00:00 |
|
qli88
|
0f35429a0c
|
[CI]Test Group 'NixlConnector PD accuracy tests' is fixed (#31460)
Signed-off-by: qli88 <qiang.li2@amd.com>
|
2025-12-29 23:48:56 +00:00 |
|
Harry Mellor
|
decc244767
|
[Docs] Use relative md links instead of absolute html links for cross referencing (#31494)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-29 13:33:44 +00:00 |
|
Jee Jee Li
|
ce1eafd1a5
|
[Core] Initialize LoRA support for tower and connector in multi-modal models (#26674)
Signed-off-by: bk-201 <joy25810@foxmail.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
Co-authored-by: bk-201 <joy25810@foxmail.com>
Co-authored-by: prashanth058 <prashanth.dannamaneni@uipath.com>
Co-authored-by: Anexdeus <5142168@mail.ru>
|
2025-12-26 04:48:20 -08:00 |
|
Mark Gatere
|
ba25a65992
|
[Frontend] add FunctionGemma tool parser support (#31218)
Signed-off-by: gateremark <gateremg@gmail.com>
|
2025-12-25 15:29:25 +08:00 |
|
Amith KK
|
42826bbccd
|
[Doc] Add tool call parser documentation for GPT-OSS models (#31212)
Signed-off-by: Amith KK <amithkumaran@gmail.com>
|
2025-12-25 05:29:10 +00:00 |
|
Cyrus Leung
|
d201807339
|
[Chore] Bump lm-eval version (#31264)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-24 05:39:13 -08:00 |
|
Yan Ma
|
f1c2c20136
|
[XPU] decrease IGC_ForceOCLSIMDWidth for speculative decoding triton-xpu kernel compilation (#30538)
Signed-off-by: Yan Ma <yan.ma@intel.com>
|
2025-12-23 05:22:15 +00:00 |
|
CedricHuang
|
19cc9468fd
|
[Feature]: Support NVIDIA ModelOpt HF FP8 variants FP8_PER_CHANNEL_PER_TOKEN and FP8_PB_WO in vLLM (#30957)
|
2025-12-21 22:34:49 -05:00 |
|
Steve Westerhouse
|
9d701e90d8
|
[Doc] Clarify FP8 KV cache computation workflow (#31071)
Signed-off-by: westers <steve.westerhouse@origami-analytics.com>
|
2025-12-22 08:41:37 +08:00 |
|
Yuxuan Zhang
|
8a7a414374
|
GLM-4.7 Tool Parser and Doc Update (#30876)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
|
2025-12-20 00:09:58 +00:00 |
|
Chauncey
|
2a1776b7ac
|
[Refactor] [2/N] Move tool parsers into the vLLM main directory (#30675)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-15 12:54:52 +00:00 |
|
Xu Song
|
25221b44bb
|
Add more docs for regex (#30106)
Signed-off-by: Xu Song <xusong.vip@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-11 00:12:21 +00:00 |
|
Wilson Wu
|
3bdd426636
|
Fix typos in comments across multiple files (#30345)
Signed-off-by: Wilson Wu <iwilsonwu@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2025-12-09 20:05:28 -08:00 |
|
Hubert de La Jonquiere
|
c72ea10723
|
[Structured Output][Reasoning] Improves decoding throughput for models using single-token reasoning endings. (#30056)
|
2025-12-09 18:54:08 +08:00 |
|
Fanli Lin
|
c2e1987a6e
|
[Doc] update Intel GPU MM status in Feature x Hardware matrix (#30294)
Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
|
2025-12-09 05:16:44 +00:00 |
|
Or Ozeri
|
4c6fd25880
|
kv_transfer: Rename the shared storage connectors (#30201)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
|
2025-12-08 20:46:09 -08:00 |
|
Ming Yang
|
60d17251c9
|
[Disagg] Support large batch size in proxy server and update NixlConnector doc for DP (#28782)
Signed-off-by: Ming Yang <minos.future@gmail.com>
|
2025-12-09 00:01:08 +00:00 |
|
Zhiyu
|
cd00c443d2
|
[Misc] Rename TensorRT Model Optimizer to Model Optimizer (#30091)
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
|
2025-12-08 07:05:27 +00:00 |
|
jeremyteboul
|
dce6d229f7
|
Support multiple image/audio embeddings per requests (#29988)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
|
2025-12-07 04:34:24 +00:00 |
|
Viacheslav
|
21bb323542
|
Gigachat 3 tool parser and tests (#29905)
Signed-off-by: Viacheslav Barinov <viacheslav.teh@gmail.com>
|
2025-12-06 12:04:14 +00:00 |
|
Hubert de La Jonquiere
|
befb59e5b1
|
[Model] Add Holo2 reasoning parser (#30048)
Signed-off-by: hdlj-h <hubert@hcompany.ai>
|
2025-12-05 10:38:45 +08:00 |
|
Tao Yun
|
6dcb07f676
|
support qwen3-vl handle requests with embeddings (#30037)
Signed-off-by: taoyun <1069423820@qq.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-12-04 17:34:06 +00:00 |
|
wang.yuqi
|
74c4d80c6c
|
[Model][6/N] Improve all pooling task | Support chunked prefill with ALL pooling (#27145)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-04 13:44:15 +00:00 |
|
dtc
|
842aba501d
|
[P/D] Introduce Mooncake Transfer Engine as kv_connector (#24718)
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: dtc <dtcccc@linux.alibaba.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
|
2025-12-04 09:51:36 +00:00 |
|
Cyrus Leung
|
9ae2f60374
|
[Misc] Various cleanups for MM input processing (#29970)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-04 06:22:20 +00:00 |
|
Cyrus Leung
|
34a984274e
|
[Misc] Refactor tokenizer interface (#29693)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-29 04:02:21 -08:00 |
|
Wilson Wu
|
18523b87f6
|
[Docs] Update supported models for Olmo 3 in tool calling documentation (#29411)
Signed-off-by: Wilson Wu <iwilsonwu@gmail.com>
|
2025-11-28 02:53:55 +00:00 |
|
Harry Mellor
|
316c8492bf
|
Scheduled removal of guided_* config fields (#29326)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-25 05:24:05 +00:00 |
|
Tyler Michael Smith
|
4dd42db566
|
Remove VLLM_SKIP_WARMUP tip (#29331)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
|
2025-11-24 22:16:05 +00:00 |
|
Julien Denize
|
57430fc95c
|
Default model load/config/tokenizer to mistral format if relevant files exist (#28659)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-11-21 13:58:59 -08:00 |
|
jeremyteboul
|
0730414999
|
[Core] Add audio_embeds support to chat completions (#29059)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
|
2025-11-21 11:39:47 +08:00 |
|
Rob Mulla
|
dd39f91edb
|
[Doc] cleanup TPU documentation and remove outdated examples (#29048)
Signed-off-by: Rob Mulla <rob.mulla@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-21 00:05:59 +00:00 |
|
Didier Durand
|
09540cd918
|
[Doc]: fix typos in various files (#29010)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-11-19 04:56:21 -08:00 |
|
Didier Durand
|
7ed27f3cb5
|
[Doc]: fix typos in various files (#28945)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-11-18 22:52:30 -08:00 |
|
Uranus
|
6a25ea5f0e
|
[Docs] Update oneshot imports (#28188)
Signed-off-by: UranusSeven <109661872+UranusSeven@users.noreply.github.com>
|
2025-11-19 05:30:08 +00:00 |
|
Kevin H. Luu
|
c64c0b78de
|
[chore] Move the rest of wikimedia url to S3 (#28921)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-11-18 09:44:18 -08:00 |
|
Didier Durand
|
083cf326dc
|
[Doc]: fix typos in various files (#28863)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-11-17 20:32:14 -08:00 |
|
Didier Durand
|
63fed55506
|
[Doc]: fix typos in various files (#28811)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-11-16 14:30:06 +00:00 |
|
Didier Durand
|
2bb4435cb7
|
[Doc]: fix typos in various files (#28567)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-11-15 19:27:50 +00:00 |
|
Chauncey
|
5c9ad138d5
|
[Frontend] supports interleaved thinking (#28531)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-11-13 16:14:13 +08:00 |
|
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
4ca5cd5740
|
[Core][AMD] Migrate fully transparent sleep mode to ROCm platform (#12695)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kliuae <kuanfu.liu@embeddedllm.com>
|
2025-11-12 15:24:12 -08:00 |
|
Chenguang Zheng
|
4ccffe561f
|
[Core] Encoder separation for Encode-Prefill-Decode Disaggregation (#25233)
Signed-off-by: n00909098 <nguyen.kha.long@huawei.com>
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Signed-off-by: herotai214 <herotai214@gmail.com>
Signed-off-by: Khuong Le <khuong.le.manh@huawei.com>
Signed-off-by: Khuong Le <lemanhkhuong2611@gmail.com>
Co-authored-by: n00909098 <nguyen.kha.long@huawei.com>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: herotai214 <herotai214@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Khuong Le <khuong.le.manh@huawei.com>
Co-authored-by: Khuong Le <lemanhkhuong2611@gmail.com>
|
2025-11-11 18:58:33 -08:00 |
|
xuebwang-amd
|
05576df85c
|
[ROCm][Quantization] extend AMD Quark to support mixed-precision quantized model (#24239)
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Co-authored-by: fxmarty-amd <felmarty@amd.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-11-11 12:05:22 -05:00 |
|
iAmir97
|
a7adbc6c6b
|
[Doc] Sleep mode documentation (#28357)
Signed-off-by: Amir Balwel <amir.balwel@embeddedllm.com>
Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com>
Co-authored-by: Amir Balwel <amir.balwel@embeddedllm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-11-10 22:44:35 -08:00 |
|
Kevin H. Luu
|
05f8d69077
|
[chore] Move some wikimedia images to S3 (#28351)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2025-11-09 01:58:26 +00:00 |
|
Harry Mellor
|
d9ab1ad9d1
|
reasoning_content -> reasoning (#27752)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-08 12:15:08 +00:00 |
|