biondizzle/vllm
Commit: c68e69f1449cc6d84f43137fcc36c142de1c8fd3
Path: vllm/tests/models/quantization
Latest commit: b129136c7a by xuebwang-amd (2026-02-10 10:08:05 -05:00)
[ROCm][Quantization] GPT_OSS in amd-quark format model loading and emulations (#29008)
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
..
__init__.py           …
test_awq.py           [Renderer] Define render_cmpl and render_chat (#34039)  2026-02-07 05:24:40 -08:00
test_bitsandbytes.py  …
test_fp8.py           [1/N][Attention] Restructure attention: move files (#31916)  2026-01-09 13:10:24 -08:00
test_gguf.py          …
test_gpt_oss.py       [ROCm][Quantization] GPT_OSS in amd-quark format model loading and emulations (#29008)  2026-02-10 10:08:05 -05:00
test_gptq_marlin.py   …
test_modelopt.py      …
test_mxfp4.py         …
test_nvfp4.py         [Kernel][Performance] Enable smaller Scaling Factor tiling for NVFP4 small-batch decoding (#30885)  2026-01-13 15:22:53 -08:00