biondizzle/vllm - vllm

Author	SHA1	Message	Date
Jialin Ouyang	b30372cbd0	[Perf] Move gc.freeze logic from EngineCoreProc to EngineCore for better coverage (#27896 ) Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>	2025-11-10 15:34:18 -08:00
Wentao Ye	4b1ff13221	[Feature] Default `ignore_eos` True for `random` dataset (#28227 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-07 07:35:33 -05:00
汪志鹏	315068eb4a	[FixBug]Aeala/ShareGPT_Vicuna_unfiltered marked as multimodal benchmark (#28265 ) Signed-off-by: princepride <wangzhipeng628@gmail.com>	2025-11-07 09:35:22 +00:00
Jacob Zhong	d72299d47b	Make the cv2 dependency optional (#27780 ) Signed-off-by: Jacob <cmpute@qq.com>	2025-11-06 05:08:55 +00:00
Sophie du Couédic	a4398fbb5e	[Feature][Benchmarks] Support `inf` burstiness (#26941 ) Signed-off-by: Sophie du Couédic <sop@zurich.ibm.com>	2025-11-03 18:33:17 +00:00
Seiji Eicher	b2e65cb4a7	[benchmark] Make request IDs unique across clients by default (#27723 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2025-10-30 17:40:35 -07:00
Cyrus Leung	ecca3fee76	[Frontend] Add `vllm bench sweep` to CLI (#27639 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-29 05:59:48 -07:00
Eugene Khvedchenya	5e72216d17	Feature/video support in random mm dataset (#25963 ) Signed-off-by: Eugene Khvedchenia <ekhvedchenia@nvidia.com> Signed-off-by: Eugene Khvedchenya <ekhvedchenia@nvidia.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-29 18:24:52 +08:00
Yeshwanth N	71b1c8b667	[Chore]:Extract math and argparse utilities to separate modules (#27188 ) Signed-off-by: Yeshwanth Surya <yeshsurya@gmail.com> Signed-off-by: Yeshwanth N <yeshsurya@gmail.com> Signed-off-by: yeshsurya <yeshsurya@gmail.com>	2025-10-26 04:03:32 -07:00
Lucia Fang	315b860abe	[bugfix]fix empty prompts for async-engine mode in benchmark throughput (#27494 ) Signed-off-by: Lucia Fang <fanglu@fb.com>	2025-10-26 08:16:35 +00:00
Cyrus Leung	b7030d962b	[Benchmark] Enable benchmark to run with `encoding_format="bytes"` (#27467 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-24 11:16:50 +00:00
Cyrus Leung	6738e4a093	[Bugfix] Fix SLA tuner initialization (#27355 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-22 20:43:04 -07:00
Cyrus Leung	ceacedc1f9	[Benchmark] Add plot utility for parameter sweep (#27168 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-21 20:30:03 -07:00
Cyrus Leung	d31f7844f8	[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-19 05:20:55 -07:00
Cyrus Leung	b3aba04e5a	[Benchmark] Convenience script for multiple parameter combinations (#27085 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-18 23:57:01 -07:00
Harry Mellor	6c9fdbf725	[Docs] Replace `rst` style double-backtick with `md` single-backtick (#27091 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-17 02:47:34 -07:00
Tomas Ruiz	965c5f4914	vllm bench serve shows num of failed requests (#26478 ) Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>	2025-10-16 19:55:09 -07:00
Cyrus Leung	4d4d6bad19	[Chore] Separate out `vllm.utils.importlib` (#27022 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-17 00:48:59 +00:00
Wentao Ye	23583ee28c	[Bug] Add Assertion for `random-input-len` / `random-output-len` (#26834 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-10-16 21:36:39 +00:00
kimbochen	013abde6ef	Adding Warmup to Benchmark Serving (#26943 ) Signed-off-by: Kimbo Chen <chentenghung@gmail.com>	2025-10-16 12:44:32 -07:00
Cyrus Leung	334535b6fb	[Benchmark] Show E2EL by default for pooling models (#27014 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 12:47:09 +00:00
Cyrus Leung	17838e50ef	[Benchmark] Use truncation by default for pooling benchmarks (#26992 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 16:02:39 +08:00
Cyrus Leung	f6cdc9a02f	[Chore] Rename `utils` submodules (#26920 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 03:58:13 +00:00
Cyrus Leung	828523ad8e	[Chore] Separate out `vllm.utils.async_utils` (#26913 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-15 15:33:00 +00:00
wangxiyuan	8f4b313c37	[Misc] rename torch_dtype to dtype (#26695 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-10-15 12:11:48 +00:00
rongfu.leng	a27b288e4a	[Feature] default --extra-body param to disable thinking in vllm bench serve (#26784 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-10-15 04:23:44 +00:00
kourosh hakhamaneshi	a2986b3e33	[Bugfix] Fixes prefix-repetition benchmark script (#26828 ) Signed-off-by: Kourosh Hakhamaneshi <Kourosh@anyscale.com>	2025-10-15 02:54:43 +00:00
Maximilien de Bayser	fe3edb4cf0	Add support for the /rerank endpoint in vllm bench serve (#26602 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2025-10-14 04:25:43 +00:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Cyrus Leung	5be7ca1b99	[Benchmark] Support Infinity API (#26641 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-12 01:45:32 +08:00
Cyrus Leung	4bdf7ac593	[Bugfix] Fix SHM cache initialization (#26427 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-09 02:48:04 -07:00
Cyrus Leung	dc7976dd9f	[Misc] Upgrade more code to Python 3.10 (#26463 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-09 10:43:53 +01:00
Huy Do	8bd696fa53	[Bugfix] Incorrect another MM data format in vllm bench throughput (#26462 ) Signed-off-by: Huy Do <huydhn@gmail.com>	2025-10-09 05:58:46 +00:00
Cyrus Leung	0d4f48fa10	[Bugfix] Incorrect MM data format in `vllm bench throughput` (#26395 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-08 13:52:19 +08:00
Cyrus Leung	44b9af5bb2	[Benchmark] Enable MM Embedding benchmarks (#26310 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-06 19:51:58 +00:00
Roger Wang	43c146ca42	[Misc] Clean up unnecessary E501 ignore (#26274 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-10-06 07:29:18 +00:00
Yasmin Moslem	7c2ec0fe87	[Benchmarking] Add disable_shuffle option for dataset loading (#26258 ) Signed-off-by: Yasmin Moslem <48152713+ymoslem@users.noreply.github.com>	2025-10-06 07:05:44 +00:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Sergei Skvortsov	b71fcd4905	[Misc] Add penalties sampling parameters to serve tool (#25974 ) Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com> Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>	2025-10-03 15:43:14 -07:00
Ekagra Ranjan	ad2d788016	[Bug][Benchmark] Fix duplicate req in oversampling (#26140 ) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-03 02:55:24 +00:00
Ekagra Ranjan	1cab2f9cad	EAGLE 3: Fix preamble so that measured speedup over Eagle 1 becomes 32% instead of 5% on MTBench (#25916 ) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>	2025-10-02 11:29:35 -07:00
Nathan Scott	f9e714813a	[Benchmark] Finish documented v0.11.0 deprecation of --endpoint-type (#26007 ) Signed-off-by: Nathan Scott <nathans@redhat.com>	2025-10-01 12:41:57 +00:00
Zhuohan Li	d3bd171123	[Benchmark] Support benchmark throughput for external launcher DP (#25913 ) Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>	2025-09-30 01:43:57 +00:00
weiliang	f4e4088c99	Fix random dataset mismatched token length with config. (#24937 ) Signed-off-by: Weiliang Liu <weiliangl@nvidia.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-28 08:23:44 +00:00
WeiQing Chen	f1d53d150c	[Multimodal][Speculative Decoding]Eagle Eagle3 mm support, enablement on qwen2.5vl (#22872 ) Signed-off-by: Junhong <liujunhong11@huawei.com> Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com> Co-authored-by: Junhong <liujunhong11@huawei.com> Co-authored-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>	2025-09-27 03:35:47 +00:00
Lucia Fang	eea1783989	[benchmarks]allow skip ready check for bench serve (#25420 ) Signed-off-by: Lu Fang <fanglu@fb.com> Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com> Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>	2025-09-23 03:21:48 +00:00
samzong	ce75e15373	refactor(benchmarks): add type annotations to wait_for_endpoint parameters (#25218 ) Signed-off-by: samzong <samzong.lu@gmail.com>	2025-09-19 16:36:52 +00:00
Roger Wang	21da73343a	[Misc] Clean up flags in `vllm bench serve` (#25138 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-09-18 12:43:33 +00:00
Punitvara	05b044e698	[Doc] Fix cross-reference warnings (#25058 ) Signed-off-by: Punit Vara <punitvara@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-18 02:05:16 -07:00
Simon Mo	a904ea78ea	[benchmark] add peak throughput metrics and plot (#23867 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-09-17 22:30:02 -07:00

116 Commits