vllm/tests/evals/gsm8k/configs/Qwen3-30B-A3B-MXFP4A16.yaml
Dipika Sikka 361dfdc9d8 [Quant] Support MXFP4 W4A16 for compressed-tensors MoE models (#32285)
Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-15 07:25:55 -08:00

model_name: nm-testing/Qwen3-30B-A3B-MXFP4A16
accuracy_threshold: 0.88
num_questions: 1319
num_fewshot: 5
server_args: "--enforce-eager --max-model-len 4096"
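
The config above pairs a model with an accuracy floor and the extra server flags to use for the run; `num_questions: 1319` matches the size of the GSM8K test split. The following is a minimal sketch of how a harness might consume these fields, not the actual vLLM test code. The helpers `parse_flat_config`, `build_server_command`, and `passes` are hypothetical names, the `vllm serve <model> <args>` launch shape is an assumption about how `server_args` gets used, and a real harness would presumably use a proper YAML library rather than the small flat parser shown here.

```python
import shlex

# Contents of the config above, inlined so the sketch is self-contained.
CONFIG_TEXT = """\
model_name: nm-testing/Qwen3-30B-A3B-MXFP4A16
accuracy_threshold: 0.88
num_questions: 1319
num_fewshot: 5
server_args: "--enforce-eager --max-model-len 4096"
"""


def parse_flat_config(text):
    """Parse flat `key: value` lines into a dict of strings.

    Hypothetical helper: handles only the flat layout used above,
    not general YAML.
    """
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip().strip('"')
    return config


def build_server_command(config):
    """Assumed launch shape: `vllm serve <model>` plus the extra flags."""
    extra_args = shlex.split(config["server_args"])
    return ["vllm", "serve", config["model_name"], *extra_args]


def passes(config, measured_accuracy):
    """Return True when the measured accuracy meets the configured floor."""
    return measured_accuracy >= float(config["accuracy_threshold"])


config = parse_flat_config(CONFIG_TEXT)
command = build_server_command(config)
```

Under these assumptions, `command` holds the argument list for launching the server, and `passes(config, measured_accuracy)` is the final check a run would apply once GSM8K accuracy has been measured with 5-shot prompting.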