biondizzle/vllm - vllm

Author	SHA1	Message	Date
Varun Sundar Rabindranath	7b80cd8ac3	[Docs] Add Phi-4-reasoning-vision to supported models + examples (#39232 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2026-04-08 02:02:26 +00:00
bsliu	c0817e4d39	[Model] Add support for Cheers multimodal model (#38788 ) Signed-off-by: bsliu <1187291748@qq.com> Signed-off-by: 吴炳贤 <wubingxian24@mails.ucas.ac.cn>	2026-04-02 21:01:40 +08:00
Fynn Schmitt-Ulms	fa246d5231	Fix shape comment in extract_hidden_states example (#38723 ) Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>	2026-04-01 07:29:33 -07:00
liuzhenwei	0c63739135	[EPD] update EPD script arguments (#36742 ) Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>	2026-03-31 12:02:09 +00:00
Maosheng Liao	aae3e688f8	Fix document of torchrun_example.py (#31113 )	2026-03-31 10:54:23 +00:00
haosdent	d39b8daf5f	[Feature] Add Qwen3-ForcedAligner support via token classification pooling (#35367 ) Signed-off-by: haosdent <haosdent@gmail.com>	2026-03-29 00:27:52 +00:00
Matej Rojec	2908094567	Add `/v1/chat/completions/batch` endpoint for batched chat completions (#38011 ) Signed-off-by: Matej Rojec <64556640+MatejRojec@users.noreply.github.com>	2026-03-26 12:13:33 +08:00
Ekagra Ranjan	7b54f60db0	[Cohere] Enable Cohere-Transcribe (#38120 ) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>	2026-03-25 16:13:51 -07:00
Cyrus Leung	ba2f0acc2d	[Misc] Reorganize inputs (#35182 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-03-25 10:22:54 -07:00
Harry Mellor	d215d1efca	[Mypy] Better fixes for the `mypy` issues in `vllm/config` (#37902 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-25 06:14:43 -07:00
Lasha Koroshinadze	e7767eccae	Fix AudioFlamingo3/MusicFlamingo HF parity and RoTE handling (#37643 ) Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com>	2026-03-23 10:29:07 +08:00
Aaron Hao	4ee847e400	Comment fix for async rl example (#35244 ) Signed-off-by: hao-aaron <ahao@anyscale.com>	2026-03-19 19:46:07 +00:00
Aaron Hao	5f82706a21	[BUG] Exclude SKIP_TENSORS from get_layer_size() + new weight sync example for dpep (#37334 ) Signed-off-by: ahao-anyscale <ahao@anyscale.com>	2026-03-19 00:45:10 +00:00
Aaron Hao	47a1f11bff	[docs] Add docs for new RL flows (#36188 ) Signed-off-by: ahao-anyscale <ahao@anyscale.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-18 09:04:26 +00:00
Athrael Soju	c0745a851a	[Model] Add ColQwen3.5 4.5B support (#36887 ) Signed-off-by: Athrael Soju <athrael.soju@gmail.com> Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>	2026-03-17 21:17:02 +00:00
Ekagra Ranjan	b5ca9c3557	[Models] Cohere ASR (#35809 ) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>	2026-03-17 21:04:17 +00:00
Isotr0py	a836524d20	[Chore] Replace all base64 usages with faster pybase64 package (#37290 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2026-03-17 14:44:19 +00:00
rasmith	0024f39a32	[ROCm][P/D][MORI][BugFix] Add transfer_id for moriio_connector so moriio_connector to restore P/D functionality (#34907 ) Signed-off-by: Randall Smith <Randall.Smith@amd.com>	2026-03-16 10:36:51 +08:00
Kunshang Ji	53ec16a705	[Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145 ) Signed-off-by: Kunshang Ji <jikunshang95@gmail.com> Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2026-03-12 07:57:47 -07:00
sfeiqiang	8cb24d3aed	[KV Connector] Support using FlexKV as KV Cache Offloading option. (#34328 ) Signed-off-by: phaedonsun <phaedonsun@tencent.com> Co-authored-by: phaedonsun <phaedonsun@tencent.com>	2026-03-12 00:46:20 -07:00
Hongxin Xu	bea02cdf93	Fix routed experts capture for hybrid models (Mamba + Attention) (#35744 ) Signed-off-by: arlenxu <arlenxu@tencent.com> Signed-off-by: xhx1022 <1737006628@qq.com> Co-authored-by: arlenxu <arlenxu@tencent.com>	2026-03-11 08:53:10 -07:00
Silvia Colabrese	f33251ffc8	[Bugfix] Fix Mistral-small `--format` (#36782 ) Signed-off-by: 12010486 <silvia.colabrese@intel.com>	2026-03-11 04:47:52 -07:00
tunglinwood	42fadebecb	[Model] Add support for moonshotai/Kimi-Audio-7B-Instruct (#36127 ) Signed-off-by: tunglinwood <tunglinwood@gmail.com> Signed-off-by: tunglinwood <tomwu.tunglin@gmail.com> Signed-off-by: tunglinwood <113751333+tunglinwood@users.noreply.github.com>	2026-03-10 21:24:48 -07:00
Harry Mellor	c88510083b	Fix Qwen2.5-VL test for Transformers v5 (#36532 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-10 12:05:34 +00:00
wang.yuqi	dcf8862fd4	[Examples][1/n] Resettle basic examples. (#35579 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-08 20:22:53 -07:00
Harry Mellor	a0f44bb616	Allow `markdownlint` to run locally (#36398 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-08 20:05:24 -07:00
Cyrus Leung	de00ebeac4	[Bugfix] Fix simple Mistral-Small example (#36156 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2026-03-05 20:25:11 -08:00
Jiayi Yan	6a895197fa	[Bugfix][CI] fix typos (#34934 ) Signed-off-by: 1195343015 <1195343015@qq.com> Signed-off-by: Jiayi Yan <66017932+1195343015@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-05 17:05:46 +00:00
Kunshang Ji	66a2209645	[Hardware] Replace `torch.cuda.synchronize()` api with `torch.accelerator.synchronize` (#36085 ) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2026-03-05 10:36:39 +00:00
Dr Alex Mitre	3417ba5648	docs: add README for logits_processor examples (#35933 )	2026-03-04 17:09:19 +00:00
Qi Wang	6aa6ad8992	[BugFix] Fix implicit and incorrect assumption on ECConnector is_producer (#34783 ) Signed-off-by: Qi Wang <qiwa@nvidia.com>	2026-03-04 15:01:30 +01:00
Kunshang Ji	16d2ad1d38	[Hardware] Replace `torch.cuda.empty_cache` with `torch.accelerator.empty_cache` (#30681 ) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com> Signed-off-by: Kunshang Ji <jikunshang95@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-03-04 09:49:47 +00:00
Andreas Karatzas	f7da9cdffc	[ROCm][CI] Support async weight transfer example with platform-aware determinism (#35710 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2026-03-04 09:44:14 +08:00
Jakub Zakrzewski	c8b678e53e	[Model] Add support for nvidia/llama-nemotron-rerank-vl-1b-v2 (#35735 ) Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>	2026-03-03 08:32:14 +08:00
Aaron Hao	cad21918e3	[BUG] Fix rlhf_async example (#35788 ) Signed-off-by: ahao-anyscale <ahao@anyscale.com>	2026-03-02 20:36:40 +00:00
Fynn Schmitt-Ulms	9433acb8df	[Spec Decode] Add hidden states extraction system (#33736 ) Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>	2026-03-02 14:29:09 -05:00
Aaron Hao	2ce6f3cf67	[Feat][RL][2/2] Native Weight Syncing API: IPC (#34171 ) Signed-off-by: hao-aaron <ahao@anyscale.com> Signed-off-by: Aaron Hao <ahao@anyscale.com> Signed-off-by: ahao-anyscale <ahao@anyscale.com>	2026-02-27 13:45:21 -07:00
Tyler Michael Smith	eb19955c37	[WideEP] Remove pplx all2all backend (#33724 ) Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 14:30:10 -08:00
Jakub Zakrzewski	111d869069	[Model] Add nvidia/llama-nemotron-embed-vl-1b-v2 multimodal embedding model (#35297 ) Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>	2026-02-26 14:17:17 +00:00
Aaron Hao	596ed1f02e	[RL] Validation for pause_mode='keep' (#34992 ) Signed-off-by: ahao-anyscale <ahao@anyscale.com>	2026-02-23 16:30:56 -05:00
Athrael Soju	970861ac0c	[New Model] Add ColModernVBERT (#34558 ) Signed-off-by: Athrael Soju <athrael.soju@gmail.com> Signed-off-by: athrael-soju <athrael-soju@users.noreply.github.com>	2026-02-22 12:23:41 +08:00
Nicolò Lucchesi	ab6f3487a6	[PD] Change kv_load_failure_policy Default from "recompute" to "fail" (#34896 ) Signed-off-by: NickLucche <nlucches@redhat.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2026-02-21 01:34:57 -08:00
zhongdaor-nv	a0fe7ea2f0	[feat] Add per-block extra_keys to KV events (#33304 ) Signed-off-by: zhongdaor-nv <zhongdaor@nvidia.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2026-02-20 20:11:40 -08:00
Kata Coder	5719a4e4e6	[Frontend] Support multimodal inputs for late-interaction scoring (ColQwen3) + NewModel: nvidia/nemotron-colembed (#34574 ) Signed-off-by: craftsangjae <craftsangjae@gmail.com>	2026-02-20 20:01:40 -08:00
Vlad Tiberiu Mihailescu	e739c29ea4	[CI/Build] Add opentelemetry libs in default vllm build (requirements/common.txt) (#34466 ) Signed-off-by: Vlad Mihailescu <vtmihailescu@gmail.com>	2026-02-20 19:54:55 -08:00
junuxyz	c61a98f529	[CI][BugFix] ShellCheck cleanup to remove baseline and preserve runtime behavior (#34514 ) Signed-off-by: junuxyz <216036880+junuxyz@users.noreply.github.com>	2026-02-17 12:22:56 +00:00
ChenqianCao	ad65177a19	[Bugfix] Fix 'remove_instance_endpoint' method logic in disagg_proxy_demo (#32922 ) Signed-off-by: ChenqianCao <39755070+ChenqianCao@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2026-02-17 10:06:53 +00:00
Christian Pinto	6930becd45	(bugfix): Fixed encode in LLM entrypoint for IOProcessr plugin prompts (#34618 ) Signed-off-by: Christian Pinto <christian.pinto@ibm.com>	2026-02-16 07:33:55 -08:00
Christian Pinto	342a7cda2d	[Misc] Update tests and examples for Prithvi/Terratorch models (#34416 ) Signed-off-by: Christian Pinto <christian.pinto@ibm.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2026-02-13 23:03:51 -08:00
Kata Coder	d1ea65d0a1	[new model] add COLQwen3 code & Inference (#34398 ) Signed-off-by: craftsangjae <craftsangjae@gmail.com> Signed-off-by: katacoder <craftsangjae@gmail.com>	2026-02-14 12:15:19 +08:00

833 Commits