deepseek-v4-quant

Files

biondizzle b5d14aa8b8 Add proper FP8→BF16 dequantization script

Unlike the naive upcast, this script properly dequantizes block-wise FP8 weights:
bf16 = fp8_weight * scale_expanded, where scale_expanded repeats each per-block scale across its 128x128 block.

Also removes the now-unnecessary scale tensors and updates the config.
Because the weights are now BF16 (2 bytes per element), FP8Linear.forward() sees element_size() > 1 and falls back to F.linear().
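
The block-wise formula above can be illustrated with a minimal, dependency-free sketch. This is not the contents of dequant_fp8_to_bf16.py: the function names are hypothetical, plain Python lists of floats stand in for the FP8 weight and scale tensors, and the example uses 2x2 blocks instead of 128x128 so the result is easy to check by hand.

```python
def expand_scales(scales, rows, cols, block=128):
    """Repeat each per-block scale so it covers every element of its block.

    scales[i][j] applies to weight rows i*block..(i+1)*block-1 and
    columns j*block..(j+1)*block-1. Partial blocks at the edges are
    handled because the index is computed per element.
    """
    return [[scales[r // block][c // block] for c in range(cols)]
            for r in range(rows)]


def dequantize_blockwise(weight, scales, block=128):
    """Compute weight * scale_expanded element by element."""
    rows, cols = len(weight), len(weight[0])
    scale_expanded = expand_scales(scales, rows, cols, block)
    return [[w * s for w, s in zip(w_row, s_row)]
            for w_row, s_row in zip(weight, scale_expanded)]


# Toy 4x4 weight with 2x2 blocks (assumption: real weights use 128x128).
weight = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0],
          [1.0, 1.0, 1.0, 1.0],
          [2.0, 2.0, 2.0, 2.0]]
scales = [[2.0, 10.0],
          [0.5, 1.0]]

dequantized = dequantize_blockwise(weight, scales, block=2)
```

On real tensors the same expansion would typically be done with vectorized operations (for example, torch.repeat_interleave along both dimensions, cropped to the weight's shape) before casting the product to BF16. A naive upcast skips the multiplication entirely, which is why it yields wrongly scaled weights.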

2026-05-07 15:45:46 +00:00

dequant_fp8_to_bf16.py

Add proper FP8→BF16 dequantization script

2026-05-07 15:45:46 +00:00

model_opt_nvfp4_experts_only.py

Add model_opt_nvfp4_experts_only.py

2026-05-07 15:16:08 +00:00

run_modelopt_nvfp4.sh

Add ModelOpt NVFP4 pipeline: patch, run script, README

2026-05-07 07:22:54 +00:00

upcast_to_bf16.py

Add BF16 upcast script and Blackwell DeepGEMM patch

2026-05-07 14:25:30 +00:00