Russell Bryant
|
f8acd01ff7
|
[V1] Add structural_tag support using xgrammar (#17085)
|
2025-04-26 14:06:37 +00:00 |
|
Nick Hill
|
70116459c3
|
[BugFix][Frontend] Fix LLM.chat() tokenization (#16081)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-04-25 22:20:05 +00:00 |
|
Jasmond L
|
d5615af9ae
|
[Bugfix] Fix Mistral ChatCompletionRequest Body Exception (#16769)
Signed-off-by: Jasmond Loh <Jasmond.Loh@hotmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-25 07:26:30 -07:00 |
|
Cyrus Leung
|
19dcc02a72
|
[Bugfix] Fix mistral model tests (#17181)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-25 06:03:34 -07:00 |
|
Alex Brooks
|
7feae92c1f
|
[Doc] Move todo out of beam search docstring (#17183)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-04-25 04:44:58 -07:00 |
|
Maximilien de Bayser
|
05e1fbfc52
|
Add chat template for Llama 4 models (#16428)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-04-24 20:19:36 +00:00 |
|
Harry Mellor
|
0a05ed57e6
|
Simplify TokenizerGroup (#16790)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-24 04:43:56 -07:00 |
|
Chauncey
|
8c87a9ad46
|
[Bugfix] Fix AssertionError: skip_special_tokens=False is not supported for Mistral tokenizers (#16964)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-23 07:24:09 +00:00 |
|
Guillaume Calmettes
|
36fe78769f
|
[Bugfix] validate urls object for multimodal content parts (#16990)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-23 09:43:06 +08:00 |
|
Reid
|
f34410715f
|
[frontend] enhance tool_calls type check (#16882)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-22 15:40:24 +00:00 |
|
Isotr0py
|
83f3c3bd91
|
[Model] Refactor Phi-4-multimodal to use merged processor and support V1 (#15477)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-19 02:26:11 -07:00 |
|
Nicolò Lucchesi
|
2ef0dc53b8
|
[Frontend] Add sampling params to v1/audio/transcriptions endpoint (#16591)
Signed-off-by: Jannis Schönleber <joennlae@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Jannis Schönleber <joennlae@gmail.com>
|
2025-04-19 07:03:54 +00:00 |
|
Yang Fan
|
2c1bd848a6
|
[Model][VLM] Add Qwen2.5-Omni model support (thinker only) (#15130)
Signed-off-by: fyabc <suyang.fy@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Xiong Wang <wangxiongts@163.com>
|
2025-04-18 23:14:36 -07:00 |
|
rongfu.leng
|
7bdfd29a35
|
[Misc] add collect_env to cli and docker image (#16759)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-04-17 22:13:35 -07:00 |
|
Mark McLoughlin
|
e4755f7fac
|
[V1][Metrics] Fix http metrics middleware (#15894)
|
2025-04-17 19:52:18 +00:00 |
|
Robin
|
6211b92273
|
[Bugfix]Fix index out of range error in api server log (#16787)
Signed-off-by: WangErXiao <863579016@qq.com>
|
2025-04-17 09:01:07 -07:00 |
|
Nick Hill
|
05fcd1b430
|
[V1][Perf] Faster incremental detokenization (#15137)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-04-17 07:45:24 -07:00 |
|
Robert Shaw
|
2b05b8ce69
|
[V1][Frontend] Improve Shutdown And Logs (#11737)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Andrew Feldman <afeldman@neuralmagic.com>
Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-04-16 19:48:34 -07:00 |
|
Angky William
|
fdcb850f14
|
[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server (#10546)
Signed-off-by: Angky William <angkywilliam@Angkys-MacBook-Pro.local>
Co-authored-by: Angky William <angkywilliam@Angkys-MacBook-Pro.local>
|
2025-04-15 22:31:38 +00:00 |
|
Xihui Cang
|
1666e66443
|
Add "/server_info" endpoint in api_server to retrieve the vllm_config. (#16572)
Signed-off-by: Xihui Cang <xihuicang@gmail.com>
|
2025-04-15 11:50:38 +00:00 |
|
Michael Goin
|
b4fe16c75b
|
Add vllm bench [latency, throughput] CLI commands (#16508)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-14 23:10:35 -07:00 |
|
Alex Brooks
|
6b40996ae8
|
[Core][Bugfix] Fix Offline MM Beam Search (#16390)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-15 10:33:02 +08:00 |
|
courage17340
|
b1308b84a3
|
[Model][VLM] Add Kimi-VL model support (#16387)
Signed-off-by: courage17340 <courage17340@163.com>
|
2025-04-14 21:41:48 +00:00 |
|
Harry Mellor
|
e51929ebca
|
Improve configs - SchedulerConfig (#16533)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-14 17:24:16 +08:00 |
|
Cyrus Leung
|
d9fc8cd9da
|
[V1] Enable multi-input by default (#15799)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-12 08:52:39 +00:00 |
|
wang.yuqi
|
fbf722c6e6
|
[Frontend] support matryoshka representation / support embedding API dimensions (#16331)
|
2025-04-11 23:23:10 -07:00 |
|
Ye (Charlotte) Qi
|
16eda8c43a
|
[Frontend] Added chat templates for LLaMa4 pythonic tool calling (#16463)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Co-authored-by: Kai Wu <kaiwu@meta.com>
|
2025-04-12 06:26:17 +08:00 |
|
Benjamin Kitor
|
82eb61dd4c
|
[misc] use tqdm.auto where appropriate (#16290)
Signed-off-by: Benjamin Kitor <bkitor@gigaio.com>
|
2025-04-09 21:54:54 -07:00 |
|
Guillaume Calmettes
|
1da6a09274
|
[Bugfix]: do not shutdown server if skip_special_use=False for MistralTokenizer (#14094)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-09 19:43:09 -07:00 |
|
Guillaume Calmettes
|
c3b5189137
|
[Bugfix] catch AssertionError in MistralTokenizer as ValueError (#16344)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-09 17:33:24 +00:00 |
|
Guillaume Calmettes
|
98d01d3ce2
|
[Bugfix][Frontend] respect provided default guided decoding backend (#15476)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-09 05:11:10 -07:00 |
|
yihong
|
04149cce27
|
[BugFix] fix some typos found by typos. (#16314)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
|
2025-04-09 03:43:59 -07:00 |
|
Chauncey
|
102bf967f0
|
[Model] Add smolvlm support (#16017)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-08 19:12:17 -07:00 |
|
Alex Brooks
|
69ecaa7c79
|
[Misc] Add warning for multimodal data in LLM.beam_search (#16241)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-04-08 04:05:27 -07:00 |
|
Michael Goin
|
b99733d092
|
[Bugfix] Do not skip "empty" parts of chats that are parsable (#16219)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-08 05:14:15 +00:00 |
|
Reid
|
fad6e2538e
|
[Misc] add description attribute in CLI (#15921)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-07 22:30:35 +00:00 |
|
Lu Fang
|
55dcce91df
|
Upstream Llama4 Support to Main (#16113)
Signed-off-by: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
Signed-off-by: drisspg <drisspguessous@gmail.com>
Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Xiaodong Wang <xdwang@meta.com>
Signed-off-by: Yang Chen <yangche@fb.com>
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Lu Fang <lufang@fb.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-07 08:06:27 -07:00 |
|
Isotr0py
|
7c80368710
|
[VLM] Florence-2 supports online serving (#16164)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-04-07 04:04:02 -07:00 |
|
paolovic
|
da224daaa9
|
[Bugfix] add hf_token to EngineArgs (#16093)
Signed-off-by: paolovic <paul-philipp.luley@uzh.ch>
Co-authored-by: paolovic <paul-philipp.luley@uzh.ch>
|
2025-04-06 14:47:33 +00:00 |
|
Chauncey
|
13affc432d
|
[Misc] Remove redundant code (#16098)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-05 20:03:50 -07:00 |
|
Matthias Matt
|
cefb9e5a28
|
[Frontend] Implement Tool Calling with tool_choice='required' (#13483)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
Signed-off-by: Matt, Matthias <matthias.matt@tuwien.ac.at>
Co-authored-by: Liangfu Chen <liangfc@amazon.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
|
2025-04-02 07:45:45 -07:00 |
|
Chauncey
|
594a8b9030
|
[Bugfix] Fix the issue where the model name is empty string, causing no response with the model name. (#15938)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-02 06:33:52 -07:00 |
|
Eric Tang
|
ddb94c2605
|
[core] Add tags parameter to wake_up() (#15500)
Signed-off-by: Eric <erictang000@gmail.com>
|
2025-04-02 01:59:27 -07:00 |
|
Chauncey
|
cdb57015a7
|
[Misc] Replace print with logger (#15923)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-02 01:37:38 -07:00 |
|
yihong
|
93491aefc7
|
[BugFix] make sure socket close (#15875)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
|
2025-04-01 13:10:24 -07:00 |
|
Jennifer Zhao
|
38327cf454
|
[Model] Aya Vision (#15441)
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-04-01 16:30:43 +00:00 |
|
Michael Goin
|
51d7c6a2b2
|
[Model] Support Mistral3 in the HF Transformers format (#15505)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-01 06:10:05 -07:00 |
|
Wei Zeng
|
30d6a015e0
|
[Feature] specify model in config.yaml (#15798)
Signed-off-by: weizeng <weizeng@roblox.com>
|
2025-04-01 01:20:06 -07:00 |
|
Kinfey
|
a164aea35d
|
[Frontend] Add Phi-4-mini function calling support (#14886)
Signed-off-by: Kinfey <kinfeylo@microsoft.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-03-31 22:50:05 -07:00 |
|
wwl2755
|
94744ba41a
|
[V1] [Feature] Collective RPC (#15444)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-03-29 03:39:14 -07:00 |
|