The previous approach tried to reconstruct hf_ptq's pipeline by importing
individual functions and building a fake argparse.Namespace. This caused
repeated crashes from missing args (KV_QUANT_CFG_CHOICES, dataset,
calib_with_images, etc.).
New approach:
- Call hf_ptq.parse_args() with sys.argv temporarily replaced, so every
  argument gets its default
- Call hf_main(args), the exact same entry point the shell script uses
- Hook export_quantized to add amax snapshot + state save before export
- Result: no more missing args, and no divergence from the example script.
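
A minimal sketch of the sys.argv replacement step, under stated assumptions:
parse_args() below is a hypothetical stand-in for hf_ptq.parse_args(), the
--model flag and the default values are illustrative only, and swapped_argv is
a helper name invented for this example. The point is that the parser itself
fills in every default, so nothing has to be copied into a hand-built Namespace.

```python
import argparse
import sys
from contextlib import contextmanager


@contextmanager
def swapped_argv(argv):
    """Temporarily replace sys.argv and restore the original on exit."""
    saved = sys.argv
    sys.argv = list(argv)
    try:
        yield
    finally:
        sys.argv = saved


def parse_args():
    # Hypothetical stand-in for hf_ptq.parse_args(); the real parser
    # defines many more options, and these defaults are assumptions.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", required=True)
    parser.add_argument("--dataset", default=None)
    parser.add_argument("--calib_with_images", action="store_true")
    return parser.parse_args()


# Only the arguments we care about are passed; the parser supplies the rest.
with swapped_argv(["hf_ptq.py", "--model", "/tmp/model"]):
    args = parse_args()

print(vars(args))
```

Restoring sys.argv in a finally block keeps the swap from leaking into later
code if parsing raises or exits.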
The only changes from the stock pipeline:
1. Runtime patches (load_calib_amax CPU, export_amax CPU, clamp)
2. Post-calibration hook (snapshot amax, save state, force CPU)
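
The post-calibration hook can be sketched as a wrapper that runs extra work
before delegating to the original export function. Everything here is an
assumption made for illustration: pipeline is a stand-in namespace rather than
the real hf_ptq module, the export_quantized signature is invented, and the
"model" is a plain dict holding amax values.

```python
import functools
import types


def with_pre_hook(func, hook):
    """Return a wrapper that calls hook(*args, **kwargs) before func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        hook(*args, **kwargs)
        return func(*args, **kwargs)
    return wrapper


events = []  # records call order so the effect of the hook is visible


def export_quantized(model, export_path):
    # Stand-in for the stock export step (hypothetical signature).
    events.append(("export", export_path))
    return export_path


# Stand-in for the module whose attribute gets patched.
pipeline = types.SimpleNamespace(export_quantized=export_quantized)


def pre_export(model, export_path):
    # Copy the amax values so later mutation cannot change the snapshot.
    events.append(("snapshot", dict(model["amax"])))


# Patch the attribute on the namespace; callers that look up
# pipeline.export_quantized at call time now get the wrapped version.
pipeline.export_quantized = with_pre_hook(pipeline.export_quantized, pre_export)

result = pipeline.export_quantized({"amax": {"layer0": 1.5}}, "/tmp/out")
print(events)
```

The patch only takes effect for code that resolves the name through the patched
module at call time; a caller that imported the function directly beforehand
keeps the unwrapped original.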