deepseek-v4-quant/scripts at 3d38e1d5cd0f84fbc351c676f64528a331929064

biondizzle/deepseek-v4-quant

biondizzle 3d38e1d5cd nvfp4_full: drop calib to 128, gpu_max_mem to 0.7 for VRAM headroom

2026-05-08 06:24:45 +00:00

File | Last commit | Date
dequant_fp8_to_bf16.py | Add resume capability to dequant script (skip already-done shards) | 2026-05-08 02:58:24 +00:00
model_opt_nvfp4_experts_only.py | Update nvfp4_experts_only to use dequantized BF16 model | 2026-05-07 16:34:37 +00:00
model_opt_nvfp4_full.py | nvfp4_full: drop calib to 128, gpu_max_mem to 0.7 for VRAM headroom | 2026-05-08 06:24:45 +00:00
run_modelopt_nvfp4.sh | Add ModelOpt NVFP4 pipeline: patch, run script, README | 2026-05-07 07:22:54 +00:00
upcast_to_bf16.py | Add BF16 upcast script and Blackwell DeepGEMM patch | 2026-05-07 14:25:30 +00:00