vllm/tests/models/quantization at 3a4e10c8477c329b9e75ba55ff205a1f258cbd01 - vllm

Files

Roberto L. Castro 8ef50d9a6b [Kernel][Performance] Enable smaller Scaling Factor tiling for NVFP4 small-batch decoding (#30885)

Signed-off-by: LopezCastroRoberto <roberto.lopez.castro@udc.es>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>

2026-01-13 15:22:53 -08:00

__init__.py

…

test_awq.py

…

test_bitblas.py

…

test_bitsandbytes.py

…

test_fp8.py

[1/N][Attention] Restructure attention: move files (#31916)

2026-01-09 13:10:24 -08:00

test_gguf.py

…

test_gpt_oss_attn_quantization.py

…

test_gptq_bitblas.py

…

test_gptq_marlin_24.py

[Quantization] Deprecate Long Tail of Schemes (#31688)

2026-01-08 19:07:45 -05:00

test_gptq_marlin.py

…

test_modelopt.py

…

test_mxfp4.py

…

test_nvfp4.py

[Kernel][Performance] Enable smaller Scaling Factor tiling for NVFP4 small-batch decoding (#30885)

2026-01-13 15:22:53 -08:00