[CI/Build][Doc] Fully deprecate old bench scripts for serving / throughput / latency (#24411)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
This commit is contained in:
committed by
GitHub
parent
3d2a2de8f7
commit
6fb2788163
@@ -694,7 +694,7 @@ python -m vllm.entrypoints.openai.api_server \
|
||||
Send requests with images:
|
||||
|
||||
```bash
|
||||
python benchmarks/benchmark_serving.py \
|
||||
vllm bench serve \
|
||||
--backend openai-chat \
|
||||
--model Qwen/Qwen2.5-VL-7B-Instruct \
|
||||
--dataset-name sharegpt \
|
||||
@@ -721,7 +721,7 @@ python -m vllm.entrypoints.openai.api_server \
|
||||
Send requests with videos:
|
||||
|
||||
```bash
|
||||
python benchmarks/benchmark_serving.py \
|
||||
vllm bench serve \
|
||||
--backend openai-chat \
|
||||
--model Qwen/Qwen2.5-VL-7B-Instruct \
|
||||
--dataset-name sharegpt \
|
||||
|
||||
Reference in New Issue
Block a user