deepseek-v4-quant/scripts
biondizzle efc111a11f Add Patch 4+5: get_weight_scaling_factor and get_weight_scaling_factor_2 CPU safety
Run 10 completed calibration (128/128) but crashed during export in
get_weight_scaling_factor: the weight tensor on the GPU was stale after
5+ hours of calibration, and weight_scaling_factor_2.to(weight.device)
triggered cudaErrorIllegalAddress.

Patches 4+5 move the weight and quantizer state to CPU before computing
the scaling factors, following the same pattern as Patch 3
(get_activation_scaling_factor).
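
The patch code itself is not shown in this message. As a rough
illustration of the "force to CPU first" pattern, here is a minimal
sketch of a wrapper that moves every tensor-like argument to CPU before
calling the wrapped function. The names force_cpu_inputs, _to_cpu and
export_module are hypothetical and are not taken from the actual
patches; the real functions may take modules rather than raw tensors.

```python
import functools


def _to_cpu(obj):
    """Return a CPU copy of anything that exposes .cpu(); pass others through.

    Hypothetical helper: duck-typed so it works for tensors (and any
    object with a compatible .cpu() method) without importing torch here.
    """
    if hasattr(obj, "cpu"):
        if hasattr(obj, "detach"):
            obj = obj.detach()  # drop autograd history before the copy
        return obj.cpu()
    return obj


def force_cpu_inputs(fn):
    """Wrap fn so its positional and keyword arguments are moved to CPU first."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        cpu_args = tuple(_to_cpu(a) for a in args)
        cpu_kwargs = {k: _to_cpu(v) for k, v in kwargs.items()}
        return fn(*cpu_args, **cpu_kwargs)

    return wrapper


# Assumed usage (export_module is a placeholder, not a real import path):
#   export_module.get_weight_scaling_factor = force_cpu_inputs(
#       export_module.get_weight_scaling_factor
#   )
```

With this shape, the scaling-factor arithmetic and any later
.to(weight.device) call see CPU tensors only, so the export step no
longer depends on device memory that has been resident since
calibration started.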

Calibrated state saved successfully (721.4 GB, 47,696 amax tensors).
Amax snapshot saved (15.4 MB). Re-running with new patches.
2026-05-09 22:43:48 +00:00