deepseek-v4-quant/scripts
biondizzle b5d569218c Add full nvfp4 quantization script + complete dequant script
- model_opt_nvfp4_full.py: Full NVFP4 quantization (not experts-only)
  Uses --gpu_max_mem_percentage 0.9 instead of --use_seq_device_map
- dequant_fp8_to_bf16.py: Now handles INT4-packed experts, FP8 shared
  experts, and FP8 attention, for a complete dequant to pure BF16.
2026-05-08 01:50:53 +00:00