Benjamin Kitor
|
82eb61dd4c
|
[misc] use tqdm.auto where appropriate (#16290)
Signed-off-by: Benjamin Kitor <bkitor@gigaio.com>
|
2025-04-09 21:54:54 -07:00 |
|
Guillaume Calmettes
|
1da6a09274
|
[Bugfix]: do not shutdown server if skip_special_use=False for MistralTokenizer (#14094)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-09 19:43:09 -07:00 |
|
Guillaume Calmettes
|
c3b5189137
|
[Bugfix] catch AssertionError in MistralTokenizer as ValueError (#16344)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-09 17:33:24 +00:00 |
|
Guillaume Calmettes
|
98d01d3ce2
|
[Bugfix][Frontend] respect provided default guided decoding backend (#15476)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-09 05:11:10 -07:00 |
|
yihong
|
04149cce27
|
[BugFix] fix some typos found by typos. (#16314)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
|
2025-04-09 03:43:59 -07:00 |
|
Chauncey
|
102bf967f0
|
[Model] Add smolvlm support (#16017)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-08 19:12:17 -07:00 |
|
Alex Brooks
|
69ecaa7c79
|
[Misc] Add warning for multimodal data in LLM.beam_search (#16241)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-04-08 04:05:27 -07:00 |
|
Michael Goin
|
b99733d092
|
[Bugfix] Do not skip "empty" parts of chats that are parsable (#16219)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-08 05:14:15 +00:00 |
|
Reid
|
fad6e2538e
|
[Misc] add description attribute in CLI (#15921)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-07 22:30:35 +00:00 |
|
Lu Fang
|
55dcce91df
|
Upstream Llama4 Support to Main (#16113)
Signed-off-by: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
Signed-off-by: drisspg <drisspguessous@gmail.com>
Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Xiaodong Wang <xdwang@meta.com>
Signed-off-by: Yang Chen <yangche@fb.com>
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Lu Fang <lufang@fb.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-07 08:06:27 -07:00 |
|
Isotr0py
|
7c80368710
|
[VLM] Florence-2 supports online serving (#16164)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-04-07 04:04:02 -07:00 |
|
paolovic
|
da224daaa9
|
[Bugfix] add hf_token to EngineArgs (#16093)
Signed-off-by: paolovic <paul-philipp.luley@uzh.ch>
Co-authored-by: paolovic <paul-philipp.luley@uzh.ch>
|
2025-04-06 14:47:33 +00:00 |
|
Chauncey
|
13affc432d
|
[Misc] Remove redundant code (#16098)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-05 20:03:50 -07:00 |
|
Matthias Matt
|
cefb9e5a28
|
[Frontend] Implement Tool Calling with tool_choice='required' (#13483)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
Signed-off-by: Matt, Matthias <matthias.matt@tuwien.ac.at>
Co-authored-by: Liangfu Chen <liangfc@amazon.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
|
2025-04-02 07:45:45 -07:00 |
|
Chauncey
|
594a8b9030
|
[Bugfix] Fix the issue where the model name is empty string, causing no response with the model name. (#15938)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-02 06:33:52 -07:00 |
|
Eric Tang
|
ddb94c2605
|
[core] Add tags parameter to wake_up() (#15500)
Signed-off-by: Eric <erictang000@gmail.com>
|
2025-04-02 01:59:27 -07:00 |
|
Chauncey
|
cdb57015a7
|
[Misc] Replace print with logger (#15923)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-02 01:37:38 -07:00 |
|
yihong
|
93491aefc7
|
[BugFix] make sure socket close (#15875)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
|
2025-04-01 13:10:24 -07:00 |
|
Jennifer Zhao
|
38327cf454
|
[Model] Aya Vision (#15441)
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-04-01 16:30:43 +00:00 |
|
Michael Goin
|
51d7c6a2b2
|
[Model] Support Mistral3 in the HF Transformers format (#15505)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-01 06:10:05 -07:00 |
|
Wei Zeng
|
30d6a015e0
|
[Feature] specify model in config.yaml (#15798)
Signed-off-by: weizeng <weizeng@roblox.com>
|
2025-04-01 01:20:06 -07:00 |
|
Kinfey
|
a164aea35d
|
[Frontend] Add Phi-4-mini function calling support (#14886)
Signed-off-by: Kinfey <kinfeylo@microsoft.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-03-31 22:50:05 -07:00 |
|
wwl2755
|
94744ba41a
|
[V1] [Feature] Collective RPC (#15444)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-03-29 03:39:14 -07:00 |
|
Jinzhen Lin
|
5b800f0932
|
[Bugfix] set VLLM_WORKER_MULTIPROC_METHOD=spawn for vllm.entrypoints.openai.api_server (#15700)
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
|
2025-03-28 21:12:26 -07:00 |
|
Varun Sundar Rabindranath
|
1286211f57
|
[Bugfix] LoRA V1: add and fix entrypoints tests (#15715)
Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2025-03-28 21:10:41 -07:00 |
|
pengyuange
|
de1cb38769
|
[Model] Support Skywork-R1V (#15397)
Signed-off-by: jiacai.liu <932997367@qq.com>
Co-authored-by: jiacai.liu <932997367@qq.com>
|
2025-03-28 20:39:21 -07:00 |
|
daniel-salib
|
f3f8d8fff4
|
implement prometheus fast-api-instrumentor for http service metrics (#15657)
|
2025-03-29 00:12:02 +00:00 |
|
Reid
|
26df46ee59
|
[Misc] cli auto show default value (#15582)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-03-28 22:23:00 +00:00 |
|
Reid
|
fd5fd26902
|
[Frontend] update priority for --api-key and VLLM_API_KEY (#15588)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-03-28 19:40:12 +08:00 |
|
Ce Gao
|
32b14baf8a
|
[Refactor][Frontend] Keep all logic about reasoning into one class (#14428)
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
|
2025-03-28 00:23:30 -07:00 |
|
Jason (Siyu) Zhu
|
cec8c7d7f8
|
Refactor error handling for multiple exceptions in preprocessing (#15650)
Signed-off-by: JasonZhu1313 <jasonchu13@outlook.com>
|
2025-03-28 03:27:20 +00:00 |
|
Yuan Tang
|
66aa4c0bf4
|
[Feature] Add middleware to log API Server responses (#15593)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-03-27 17:49:38 +00:00 |
|
Alex Brooks
|
1711b929b6
|
[Model] Add Reasoning Parser for Granite Models (#14202)
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Co-authored-by: Joe Runde <joe@joerun.de>
|
2025-03-26 14:28:07 +00:00 |
|
wwl2755
|
99f536f830
|
[Misc] Enhance warning information to user-defined chat template (#15408)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-03-26 02:21:15 -07:00 |
|
daniel-salib
|
5aefd6ac31
|
Fix raw_request extraction in load_aware_call decorator (#15382)
Signed-off-by: Daniel Salib <danielsalib@meta.com>
|
2025-03-25 22:29:54 -07:00 |
|
Maximilien de Bayser
|
e977c11111
|
Add workaround for shared field_names in pydantic model class (#13925)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-03-25 20:31:08 +00:00 |
|
Chauncey
|
10b34e36b9
|
[Bugfix] Fixed the issue of not being able to input video and image simultaneously (#15387)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-03-25 03:48:08 +00:00 |
|
Cyrus Leung
|
cbcdf2c609
|
[Bugfix] Fix chat template loading (#15143)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-03-24 13:50:09 +00:00 |
|
Robin
|
d6cd59f122
|
[Frontend] Support tool calling and reasoning parser (#14511)
Signed-off-by: WangErXiao <863579016@qq.com>
|
2025-03-23 14:00:07 -07:00 |
|
Cyrus Leung
|
baec0d4de9
|
Revert "[Feature] specify model in config.yaml (#14855)" (#15293)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-21 08:30:23 -07:00 |
|
Wei Zeng
|
0fa3970deb
|
[Feature] specify model in config.yaml (#14855)
Signed-off-by: weizeng <weizeng@roblox.com>
|
2025-03-21 00:26:03 -07:00 |
|
Chauncey
|
ae65f3e237
|
[Misc]fixed disable these http request logs (#14754)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-03-19 21:53:40 -07:00 |
|
maobaolong
|
26dd972adb
|
[FEAT]Support reset prefix cache by specified device (#15003)
|
2025-03-19 10:54:41 -07:00 |
|
Simon Mo
|
3b457143d2
|
[Bugfix] Register serializers for V0 MQ Engine (#15009)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-03-18 09:14:47 -04:00 |
|
Sebastian Schoennenbeck
|
dd732028f5
|
[Bugfix][Frontend] Fix validation of logprobs in ChatCompletionRequest (#14352)
Signed-off-by: Sebastian Schönnenbeck <sebastian.schoennenbeck@comma-soft.com>
|
2025-03-18 05:50:05 -07:00 |
|
Jun Duan
|
74bc397b0a
|
[Core] Expose API endpoint /is_sleeping (#14312)
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
|
2025-03-15 06:28:14 -07:00 |
|
Robert Shaw
|
d4d93db2c5
|
[V1] V1 Enablement Oracle (#13726)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2025-03-14 22:02:20 -07:00 |
|
daniel-salib
|
73deea2fdb
|
[Frontend] track server_load (#13950)
|
2025-03-14 09:53:17 -07:00 |
|
Russell Bryant
|
0b0d6421b2
|
[Frontend] Fix log message to use http vs https (#14774)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-03-14 09:21:09 -07:00 |
|
Guillaume Calmettes
|
fd8e055ffb
|
[BugFix]: properly catch templating error when preprocess input (#13976)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-03-14 05:58:34 -07:00 |
|