Harry Mellor
|
67187554dd
|
[Docs] Enable some more markdown lint rules for the docs (#28731)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-14 18:39:19 +00:00 |
|
Chen Wang
|
9261eb3dc1
|
docs(lora_resolvers): clarify multi-resolver order and storage path requirement (#28153)
Signed-off-by: Chen Wang <Chen.Wang1@ibm.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-14 18:08:30 +00:00 |
|
Julien Denize
|
085424808e
|
Remove audio optional dependency for mistral-common (#28722)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-11-14 09:54:38 -08:00 |
|
Harry Mellor
|
5f3cd7f7f2
|
[Docs] Update the name of Transformers backend -> Transformers modeling backend (#28725)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-14 16:34:14 +00:00 |
|
Fasal Shah
|
8d3748d3c7
|
[Doc] Fix macOS installation dependency resolution issue (#26721)
Signed-off-by: faisal shah <fashah@redhat.com>
|
2025-11-14 12:43:56 +00:00 |
|
Chauncey
|
5c9ad138d5
|
[Frontend] supports interleaved thinking (#28531)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-11-13 16:14:13 +08:00 |
|
Harry Mellor
|
97d1c99302
|
Rename clashing method names for vLLM model protocol (#27583)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 19:14:33 -08:00 |
|
Harry Mellor
|
3226283461
|
[Docs] Add some details about what the MoE block needs for the Transformers backend (#28588)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-13 03:12:14 +00:00 |
|
Michael Goin
|
52eadcec9e
|
[Docs] Update meetups.md description (#28583)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-11-13 00:00:23 +00:00 |
|
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
4ca5cd5740
|
[Core][AMD] Migrate fully transparent sleep mode to ROCm platform (#12695)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kliuae <kuanfu.liu@embeddedllm.com>
|
2025-11-12 15:24:12 -08:00 |
|
Benjamin Chislett
|
304419576a
|
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-11-13 01:56:40 +09:00 |
|
Harry Mellor
|
a742134cc5
|
Remove deprecated fields from CompilationConfig (#27593)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-12 16:10:28 +00:00 |
|
Chenguang Zheng
|
4ccffe561f
|
[Core] Encoder separation for Encode-Prefill-Decode Disaggregation (#25233)
Signed-off-by: n00909098 <nguyen.kha.long@huawei.com>
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Signed-off-by: herotai214 <herotai214@gmail.com>
Signed-off-by: Khuong Le <khuong.le.manh@huawei.com>
Signed-off-by: Khuong Le <lemanhkhuong2611@gmail.com>
Co-authored-by: n00909098 <nguyen.kha.long@huawei.com>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: herotai214 <herotai214@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Khuong Le <khuong.le.manh@huawei.com>
Co-authored-by: Khuong Le <lemanhkhuong2611@gmail.com>
|
2025-11-11 18:58:33 -08:00 |
|
Li, Jiang
|
7f829be7d3
|
[CPU] Refactor CPU attention backend (#27954)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-11-12 09:43:06 +08:00 |
|
Michael Goin
|
28534b92b9
|
Add Zurich vLLM Meetup (#28488)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-11-11 14:53:59 -08:00 |
|
xuebwang-amd
|
05576df85c
|
[ROCm][Quantization] extend AMD Quark to support mixed-precision quantized model (#24239)
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Co-authored-by: fxmarty-amd <felmarty@amd.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-11-11 12:05:22 -05:00 |
|
the-codeboy
|
287bbbeb06
|
[Doc] Fix typo in serving docs (#28474)
Signed-off-by: the-codeboy <71213855+the-codeboy@users.noreply.github.com>
|
2025-11-11 16:45:49 +00:00 |
|
Maryam Tahhan
|
fa1970201d
|
[Docs] Fix grammar in CPU installation guide (#28461)
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
|
2025-11-11 14:01:11 +00:00 |
|
iAmir97
|
a7adbc6c6b
|
[Doc] Sleep mode documentation (#28357)
Signed-off-by: Amir Balwel <amir.balwel@embeddedllm.com>
Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com>
Co-authored-by: Amir Balwel <amir.balwel@embeddedllm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-11-10 22:44:35 -08:00 |
|
vllmellm
|
f080a83511
|
[RFC][ROCm][AITER] Keep all AITER kernels in _aiter_ops class like _custom_ops and _ipex_ops (#24490)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-11-10 08:20:53 -08:00 |
|
Kevin H. Luu
|
05f8d69077
|
[chore] Move some wikimedia images to S3 (#28351)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2025-11-09 01:58:26 +00:00 |
|
Benjamin Chislett
|
975676d174
|
[Feat] Drop-in Torch CUDA Profiler (#27841)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
|
2025-11-08 14:07:37 -08:00 |
|
Harry Mellor
|
d9ab1ad9d1
|
reasoning_content -> reasoning (#27752)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-08 12:15:08 +00:00 |
|
Hamid Mukhtar
|
61d25dc44b
|
Update gpu.rocm.inc.md to add support for AMD Ryzen AI MAX / AI 300 Series (gfx1151, gfx1150) (#28308)
Signed-off-by: Hamid Mukhtar <15519013+hammmmy@users.noreply.github.com>
|
2025-11-08 02:09:21 +00:00 |
|
youkaichao
|
155ad56d7b
|
[doc] add guide about the provided PTX was compiled with an unsupported toolchain (#28305)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-11-08 00:26:34 +08:00 |
|
Fadi Arafeh
|
5fb4137c99
|
[README] Add Arm CPUs to the list of supported targets (#28290)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
|
2025-11-07 15:41:47 +00:00 |
|
Fang Han
|
da855b42d2
|
[Doc]: Make extraInit containers fully configurable in helm chart (#27497)
Signed-off-by: Fang Han <fhan0520@gmail.com>
|
2025-11-06 20:27:16 +00:00 |
|
StanHatko
|
e52e4da971
|
[HARDWARE][CPU] Add Option for Disabling Binding to Specific CPU Cores (#27953)
Signed-off-by: Stan Hatko <stan_hatko@live.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
|
2025-11-06 23:47:11 +08:00 |
|
Milos Puzovic
|
2176778cd3
|
[Doc] Add Arm CPUs are on the list of supported targets in vLLM (#26018)
Signed-off-by: Milos Puzovic <milos.puzovic@arm.com>
|
2025-11-06 15:30:26 +00:00 |
|
Richard Zou
|
65ac8d8dc4
|
[Docs] Add guide to debugging vLLM-torch.compile integration (#28094)
Signed-off-by: Richard Zou <zou3519@gmail.com>
|
2025-11-05 21:31:46 +00:00 |
|
Jiaju Zhang
|
6fd0df8132
|
[misc] add vLLM Beijing Meetup (#28127)
Signed-off-by: Jiaju Zhang <jjzhang@redhat.com>
|
2025-11-05 17:12:59 +00:00 |
|
Isotr0py
|
3f5a4b6473
|
[Bugfix] Validate custom logits processor xargs for online serving (#27560)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-05 16:53:33 +00:00 |
|
Chauncey
|
e261d37c9a
|
[Refactor] Lazy-loaded reasoning_parser (#28092)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-11-05 15:37:02 +08:00 |
|
Alex Brooks
|
b7cbc25416
|
[Model, Core] Support Granite Speech & LoRA for STT (#24455)
|
2025-11-05 08:33:48 +01:00 |
|
wangxiyuan
|
428bc7bf1c
|
[V0 deprecation] Remove VLLM_USE_V1 usage in most modules (#27955)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-11-04 20:51:16 -08:00 |
|
yt0428
|
05cae69f0f
|
[model] Add support for openPangu_Ultra_MoE (#27521)
Signed-off-by: yuantao <2422264527@qq.com>
Signed-off-by: yt0428 <51468697+yt0428@users.noreply.github.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-11-04 08:17:20 -08:00 |
|
Chauncey
|
c02fccdbd2
|
[Refactor] Lazy import tool_parser (#27974)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-11-04 10:10:10 +08:00 |
|
Aurick Qiao
|
2c19d96777
|
[Spec Decode] Integrate Suffix Decoding from Arctic Inference (#25784)
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
|
2025-11-03 09:23:31 -08:00 |
|
ahao-anyscale
|
cac4c10ef0
|
[BUG] Make 'binary' default option for saving torch compile artifacts when using standalone_compile (#27616)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2025-11-03 11:13:51 -05:00 |
|
zhang-prog
|
40b69e33e7
|
[Model] Add PaddleOCR-VL Model Support (#27758)
Signed-off-by: zhangyue <zhangyue66@baidu.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: zhangyue66 <zhangyue66@baidu.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-03 19:04:22 +08:00 |
|
Harry Mellor
|
799ce45cc1
|
[Docs] Mock all imports for docs (#27873)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-01 10:02:23 +00:00 |
|
Bram Wasti
|
0e0a638c3b
|
Batch invariance doc (#27839)
Signed-off-by: Bram Wasti <bwasti@meta.com>
Signed-off-by: Bram Wasti <bwasti@fb.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-31 17:22:19 -04:00 |
|
Rob Mulla
|
70bfbd7b16
|
Docs update tpu install instructions (#27824)
Signed-off-by: Rob Mulla <rob.mulla@gmail.com>
Signed-off-by: Rob Mulla <RobMulla@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-31 10:29:55 -07:00 |
|
GuanLuo
|
d6517be3cd
|
[Bugfix] Missing NIXL metadata for handshake initialization if instance spans multi-node (#26338)
Signed-off-by: Guan Luo <gluo@nvidia.com>
Signed-off-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
Signed-off-by: Guan Luo <41310872+GuanLuo@users.noreply.github.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
|
2025-10-31 10:16:00 -07:00 |
|
Kebe
|
33a0ea5f32
|
[Docs] add Shanghai Meetup - 2025/10 (#27545)
Signed-off-by: Kebe <mail@kebe7jun.com>
Signed-off-by: esmeetu <jasonailu87@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: esmeetu <jasonailu87@gmail.com>
|
2025-10-31 00:33:13 +08:00 |
|
Fan Yin
|
9956aae4ea
|
[Model][Ouro] Support Ouro Model (#27794)
Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: yinfan.1024 <yinfan.1024@bytedance.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-30 22:34:41 +08:00 |
|
Zhiyuan Li
|
4e68cc9b6a
|
[Model] Introduce Kimi Linear to vLLM (#27809)
Signed-off-by: lizhiyuan <lizhiyuan@moonshot.cn>
Signed-off-by: Zhiyuan Li <uniartisan2017@gmail.com>
|
2025-10-30 21:02:27 +08:00 |
|
wang.yuqi
|
4464723f22
|
[Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-10-30 12:13:05 +00:00 |
|
yitingdc
|
31b55ffc62
|
use stringData in secret yaml to store huggingface token (#25685)
Signed-off-by: yiting.jiang <yiting.jiang@daocloud.io>
|
2025-10-30 00:47:36 -07:00 |
|
Kuntai Du
|
8bff831f0a
|
[Benchmark] Cleanup deprecated nightly benchmark and adjust the docstring for performance benchmark (#25786)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
|
2025-10-30 04:43:37 +00:00 |
|