Add support for ModelOpt MXFP8 dense models (#33786)

Signed-off-by: Daniel Serebrenik <daserebrenik@nvidia.com>
Committed by danisereb via GitHub, 2026-02-08 21:16:48 +02:00
parent 1ecfabe525
commit 084aa19f02
6 changed files with 375 additions and 14 deletions

@@ -17,6 +17,7 @@ following `quantization.quant_algo` values:
- `FP8_PER_CHANNEL_PER_TOKEN`: per-channel weight scale and dynamic per-token activation quantization.
- `FP8_PB_WO` (ModelOpt may emit `fp8_pb_wo`): block-scaled FP8 weight-only (typically 128×128 blocks).
- `NVFP4`: ModelOpt NVFP4 checkpoints (use `quantization="modelopt_fp4"`).
- `MXFP8`: ModelOpt MXFP8 checkpoints (use `quantization="modelopt_mxfp8"`).
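The list above pairs each ModelOpt `quant_algo` value with the vLLM `quantization` argument where the docs state one explicitly. A minimal sketch of that mapping follows; only the `NVFP4` and `MXFP8` targets come from the text above, and the assumption that the FP8 variants route to the generic `"modelopt"` backend is illustrative, not confirmed by this diff.

```python
# Illustrative mapping from a checkpoint's `quantization.quant_algo` value
# to the vLLM `quantization` argument. Only NVFP4 -> "modelopt_fp4" and
# MXFP8 -> "modelopt_mxfp8" are stated in the docs above; the FP8 entries
# assuming the generic "modelopt" backend are a hypothetical placeholder.
QUANT_ALGO_TO_VLLM_ARG = {
    "FP8": "modelopt",                        # assumption
    "FP8_PER_CHANNEL_PER_TOKEN": "modelopt",  # assumption
    "FP8_PB_WO": "modelopt",                  # assumption
    "NVFP4": "modelopt_fp4",                  # from the docs above
    "MXFP8": "modelopt_mxfp8",                # from the docs above
}

def vllm_quantization_arg(quant_algo: str) -> str:
    """Return the vLLM `quantization` argument for a ModelOpt quant_algo."""
    try:
        return QUANT_ALGO_TO_VLLM_ARG[quant_algo]
    except KeyError:
        raise ValueError(f"unsupported quant_algo: {quant_algo!r}") from None
```

For an MXFP8 checkpoint this would resolve to `quantization="modelopt_mxfp8"` when constructing the engine (e.g. `LLM(model=..., quantization=vllm_quantization_arg("MXFP8"))`).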
## Quantizing HuggingFace Models with PTQ