[ASR] Fix audio benchmark and add RTFx metric (#32300)

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
This commit is contained in:
Ekagra Ranjan
2026-02-09 05:02:37 -05:00
committed by GitHub
parent 3025b3cebb
commit 1d5922fade
4 changed files with 90 additions and 19 deletions

View File

@@ -30,6 +30,7 @@ th {
| HuggingFace-Other | ✅ | ✅ | `lmms-lab/LLaVA-OneVision-Data`, `Aeala/ShareGPT_Vicuna_unfiltered` |
| HuggingFace-MTBench | ✅ | ✅ | `philschmid/mt-bench` |
| HuggingFace-Blazedit | ✅ | ✅ | `vdaita/edit_5k_char`, `vdaita/edit_10k_char` |
| HuggingFace-ASR | ✅ | ✅ | `openslr/librispeech_asr`, `facebook/voxpopuli`, `LIUM/tedlium`, `edinburghcstr/ami`, `speechcolab/gigaspeech`, `kensho/spgispeech` |
| Spec Bench | ✅ | ✅ | `wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl` |
| Custom | ✅ | ✅ | Local file: `data.jsonl` |
| Custom MM | ✅ | ✅ | Local file: `mm_data.jsonl` |
@@ -299,6 +300,22 @@ vllm bench serve \
--blazedit-max-distance 0.99
```
`openslr/librispeech_asr`, `facebook/voxpopuli`, `LIUM/tedlium`, `edinburghcstr/ami`, `speechcolab/gigaspeech`, `kensho/spgispeech`
```bash
vllm bench serve \
--model openai/whisper-large-v3-turbo \
--backend openai-audio \
--dataset-name hf \
--dataset-path facebook/voxpopuli --hf-subset en --hf-split test --no-stream --trust-remote-code \
--num-prompts 99999999 \
--no-oversample \
--endpoint /v1/audio/transcriptions \
--ready-check-timeout-sec 600 \
--save-result \
--max-concurrency 512
```
#### Running With Sampling Parameters
When using OpenAI-compatible backends such as `vllm`, optional sampling