[CI/Build][Doc] Fully deprecate old bench scripts for serving / throughput / latency (#24411)

Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
2025-09-09 03:02:35 -07:00
parent 3d2a2de8f7
commit 6fb2788163
4 changed files with 41 additions and 2246 deletions
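
Migration note (a sketch, not part of the commit): the hunks below replace `python benchmarks/benchmark_serving.py` with `vllm bench serve` and leave the flags untouched. Assuming the throughput and latency scripts named in the title map the same way to `vllm bench throughput` and `vllm bench latency`, and that the legacy file names were `benchmark_throughput.py` and `benchmark_latency.py` (neither appears in the hunks shown here), a hypothetical helper for rewriting saved invocations could look like this:

```bash
# Hypothetical helper, not shipped with vLLM: rewrites the leading
# "python benchmarks/benchmark_*.py" of a saved command to the CLI form.
# Only the serving mapping is confirmed by this diff; the throughput and
# latency mappings are assumptions based on the commit title.
migrate_bench_cmd() {
  printf '%s\n' "$1" | sed -E \
    -e 's#^python3? +benchmarks/benchmark_serving\.py#vllm bench serve#' \
    -e 's#^python3? +benchmarks/benchmark_throughput\.py#vllm bench throughput#' \
    -e 's#^python3? +benchmarks/benchmark_latency\.py#vllm bench latency#'
}

# Example: flags after the script name pass through unchanged.
migrate_bench_cmd "python benchmarks/benchmark_serving.py --backend openai-chat"
```

Commands that do not start with one of the three legacy script paths are printed back unmodified, so the helper is safe to run over a mixed list of commands.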
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -694,7 +694,7 @@ python -m vllm.entrypoints.openai.api_server \
 Send requests with images:

 ```bash
-python benchmarks/benchmark_serving.py \
+vllm bench serve \
  --backend openai-chat \
  --model Qwen/Qwen2.5-VL-7B-Instruct \
  --dataset-name sharegpt \
@@ -721,7 +721,7 @@ python -m vllm.entrypoints.openai.api_server \
 Send requests with videos:

 ```bash
-python benchmarks/benchmark_serving.py \
+vllm bench serve \
  --backend openai-chat \
  --model Qwen/Qwen2.5-VL-7B-Instruct \
  --dataset-name sharegpt \