biondizzle/vllm
c5030c439db3944f2cdbdfbc1283b431e863f73f
vllm/tests/models/quantization
EdalatiAli | e5b807607c | [Quant][Feature] Support online MXFP8 quantization for MoE and dense models (#35448) | Signed-off-by: EdalatiAli <aliedalati@cohere.com> | 2026-03-16 18:07:39 -04:00
..
__init__.py | …
test_awq.py | [Renderer] Define render_cmpl and render_chat (#34039) | 2026-02-07 05:24:40 -08:00
test_bitsandbytes.py | [bnb] Skip moe + bnb test (#36896) | 2026-03-12 18:03:25 +00:00
test_fp8.py | [1/N][Attention] Restructure attention: move files (#31916) | 2026-01-09 13:10:24 -08:00
test_gguf.py | …
test_gpt_oss.py | [ROCm][CI] Enable AITER for failing test_gpt_oss test case on MI355 (#36174) | 2026-03-07 13:50:17 -08:00
test_gptq_marlin.py | …
test_modelopt.py | …
test_mxfp4.py | …
test_mxfp8.py | [Quant][Feature] Support online MXFP8 quantization for MoE and dense models (#35448) | 2026-03-16 18:07:39 -04:00
test_nvfp4.py | [Kernel][Performance] Enable smaller Scaling Factor tiling for NVFP4 small-batch decoding (#30885) | 2026-01-13 15:22:53 -08:00