Cyrus Leung
|
6738e4a093
|
[Bugfix] Fix SLA tuner initialization (#27355)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-22 20:43:04 -07:00 |
|
Cyrus Leung
|
ceacedc1f9
|
[Benchmark] Add plot utility for parameter sweep (#27168)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-21 20:30:03 -07:00 |
|
Cyrus Leung
|
d31f7844f8
|
[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-19 05:20:55 -07:00 |
|
Cyrus Leung
|
b3aba04e5a
|
[Benchmark] Convenience script for multiple parameter combinations (#27085)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-18 23:57:01 -07:00 |
|
Harry Mellor
|
6c9fdbf725
|
[Docs] Replace rst style double-backtick with md single-backtick (#27091)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-17 02:47:34 -07:00 |
|
Tomas Ruiz
|
965c5f4914
|
vllm bench serve shows num of failed requests (#26478)
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
|
2025-10-16 19:55:09 -07:00 |
|
Cyrus Leung
|
4d4d6bad19
|
[Chore] Separate out vllm.utils.importlib (#27022)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-17 00:48:59 +00:00 |
|
Wentao Ye
|
23583ee28c
|
[Bug] Add Assertion for random-input-len / random-output-len (#26834)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-16 21:36:39 +00:00 |
|
kimbochen
|
013abde6ef
|
Adding Warmup to Benchmark Serving (#26943)
Signed-off-by: Kimbo Chen <chentenghung@gmail.com>
|
2025-10-16 12:44:32 -07:00 |
|
Cyrus Leung
|
334535b6fb
|
[Benchmark] Show E2EL by default for pooling models (#27014)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-16 12:47:09 +00:00 |
|
Cyrus Leung
|
17838e50ef
|
[Benchmark] Use truncation by default for pooling benchmarks (#26992)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-16 16:02:39 +08:00 |
|
Cyrus Leung
|
f6cdc9a02f
|
[Chore] Rename utils submodules (#26920)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-16 03:58:13 +00:00 |
|
Cyrus Leung
|
828523ad8e
|
[Chore] Separate out vllm.utils.async_utils (#26913)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-15 15:33:00 +00:00 |
|
wangxiyuan
|
8f4b313c37
|
[Misc] rename torch_dtype to dtype (#26695)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-10-15 12:11:48 +00:00 |
|
rongfu.leng
|
a27b288e4a
|
[Feature] default --extra-body param to disable thinking in vllm bench serve (#26784)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-10-15 04:23:44 +00:00 |
|
kourosh hakhamaneshi
|
a2986b3e33
|
[Bugfix] Fixes prefix-repetition benchmark script (#26828)
Signed-off-by: Kourosh Hakhamaneshi <Kourosh@anyscale.com>
|
2025-10-15 02:54:43 +00:00 |
|
Maximilien de Bayser
|
fe3edb4cf0
|
Add support for the /rerank endpoint in vllm bench serve (#26602)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-10-14 04:25:43 +00:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Cyrus Leung
|
5be7ca1b99
|
[Benchmark] Support Infinity API (#26641)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-12 01:45:32 +08:00 |
|
Cyrus Leung
|
4bdf7ac593
|
[Bugfix] Fix SHM cache initialization (#26427)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 02:48:04 -07:00 |
|
Cyrus Leung
|
dc7976dd9f
|
[Misc] Upgrade more code to Python 3.10 (#26463)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 10:43:53 +01:00 |
|
Huy Do
|
8bd696fa53
|
[Bugfix] Incorrect another MM data format in vllm bench throughput (#26462)
Signed-off-by: Huy Do <huydhn@gmail.com>
|
2025-10-09 05:58:46 +00:00 |
|
Cyrus Leung
|
0d4f48fa10
|
[Bugfix] Incorrect MM data format in vllm bench throughput (#26395)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-08 13:52:19 +08:00 |
|
Cyrus Leung
|
44b9af5bb2
|
[Benchmark] Enable MM Embedding benchmarks (#26310)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-06 19:51:58 +00:00 |
|
Roger Wang
|
43c146ca42
|
[Misc] Clean up unnecessary E501 ignore (#26274)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-10-06 07:29:18 +00:00 |
|
Yasmin Moslem
|
7c2ec0fe87
|
[Benchmarking] Add disable_shuffle option for dataset loading (#26258)
Signed-off-by: Yasmin Moslem <48152713+ymoslem@users.noreply.github.com>
|
2025-10-06 07:05:44 +00:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Sergei Skvortsov
|
b71fcd4905
|
[Misc] Add penalties sampling parameters to serve tool (#25974)
Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
|
2025-10-03 15:43:14 -07:00 |
|
Ekagra Ranjan
|
ad2d788016
|
[Bug][Benchmark] Fix duplicate req in oversampling (#26140)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-03 02:55:24 +00:00 |
|
Ekagra Ranjan
|
1cab2f9cad
|
EAGLE 3: Fix preamble so that measured speedup over Eagle 1 becomes 32% instead of 5% on MTBench (#25916)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
|
2025-10-02 11:29:35 -07:00 |
|
Nathan Scott
|
f9e714813a
|
[Benchmark] Finish documented v0.11.0 deprecation of --endpoint-type (#26007)
Signed-off-by: Nathan Scott <nathans@redhat.com>
|
2025-10-01 12:41:57 +00:00 |
|
Zhuohan Li
|
d3bd171123
|
[Benchmark] Support benchmark throughput for external launcher DP (#25913)
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
|
2025-09-30 01:43:57 +00:00 |
|
weiliang
|
f4e4088c99
|
Fix random dataset mismatched token length with config. (#24937)
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-28 08:23:44 +00:00 |
|
WeiQing Chen
|
f1d53d150c
|
[Multimodal][Speculative Decoding]Eagle Eagle3 mm support, enablement on qwen2.5vl (#22872)
Signed-off-by: Junhong <liujunhong11@huawei.com>
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Junhong <liujunhong11@huawei.com>
Co-authored-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
|
2025-09-27 03:35:47 +00:00 |
|
Lucia Fang
|
eea1783989
|
[benchmarks]allow skip ready check for bench serve (#25420)
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>
Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>
|
2025-09-23 03:21:48 +00:00 |
|
samzong
|
ce75e15373
|
refactor(benchmarks): add type annotations to wait_for_endpoint parameters (#25218)
Signed-off-by: samzong <samzong.lu@gmail.com>
|
2025-09-19 16:36:52 +00:00 |
|
Roger Wang
|
21da73343a
|
[Misc] Clean up flags in vllm bench serve (#25138)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-09-18 12:43:33 +00:00 |
|
Punitvara
|
05b044e698
|
[Doc] Fix cross-reference warnings (#25058)
Signed-off-by: Punit Vara <punitvara@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-18 02:05:16 -07:00 |
|
Simon Mo
|
a904ea78ea
|
[benchmark] add peak throughput metrics and plot (#23867)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-09-17 22:30:02 -07:00 |
|
samzong
|
4a2d33e371
|
[Docs] vllm/benchmarks/datasets.py fix docstring param format. (#24970)
Signed-off-by: samzong <samzong.lu@gmail.com>
|
2025-09-17 08:11:51 -07:00 |
|
samzong
|
47f670b03b
|
[Docs] improve code formatting and comments for eliminate griffe build warning. (#25010)
Signed-off-by: samzong <samzong.lu@gmail.com>
|
2025-09-17 07:31:20 -07:00 |
|
Zhuohan Li
|
6c47f6bfa4
|
[Core] Remove tokenizer group in vLLM (#24078)
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
|
2025-09-17 08:42:59 +00:00 |
|
Isotr0py
|
5a411ef6c4
|
[Benchmarks] Add MMVU video dataset support and clean up deprecated datasets (#24719)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-17 03:29:43 +00:00 |
|
Ye (Charlotte) Qi
|
ff68035932
|
[Benchmarks] Throw usage error when using dataset-name random and dataset-path together (#24819)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-09-14 17:50:01 +00:00 |
|
Clayton Coleman
|
bc636f21a6
|
[Benchmark] Allow arbitrary headers to be passed to benchmarked endpoints (#23937)
Signed-off-by: Clayton Coleman <smarterclayton@gmail.com>
|
2025-09-12 13:57:53 -07:00 |
|
Didier Durand
|
bcb06d7baf
|
[Doc]: fix typos in various files (#24726)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-09-12 06:43:12 -07:00 |
|
Tomas Ruiz
|
ee0bc5e1b4
|
Enable --profile in 'vllm bench throughput' (#24575)
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
|
2025-09-10 23:06:19 -07:00 |
|
Ekagra Ranjan
|
fb1a8f932a
|
[Benchmark] Add option to skip oversampling in benchmark (#24457)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
|
2025-09-09 22:00:17 +00:00 |
|
Ming Yang
|
1823a00d67
|
[Misc] Support bench serve long context (#24373)
Signed-off-by: Ming Yang <minos.future@gmail.com>
|
2025-09-08 22:53:10 -07:00 |
|
Ekagra Ranjan
|
41183c1fe0
|
[Spec Decode] Fix offline spec_decode.py (#24257)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-08 20:44:13 +00:00 |
|