biondizzle
  • Joined on 2025-12-10
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 14:54:53 +00:00
1a36a655ea Fix: use full argparse flag names (--calib_size, --kv_cache_qformat)
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 14:52:03 +00:00
b2849a8944 Fundamental rewrite: call hf_main() instead of rewriting the pipeline
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 13:40:01 +00:00
a70593d886 Update run history: Run 6 (dataloader crash), Run 7 running on 25b4d8d
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 13:37:29 +00:00
25b4d8da06 Fix: add missing args for make_calib_dataloader (dataset, calib_with_images, auto_quantize, specdec)
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 09:29:22 +00:00
d1e15178b2 Update run history: Runs 4-5 (import bugs), Run 6 running on 6c1bff6
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 09:26:25 +00:00
6c1bff6997 Clean rewrite: verified all imports against runtime, removed dead code
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 09:17:13 +00:00
86dd8df302 Fix: KV_QUANT_CFG_CHOICES is in hf_ptq, not mtq
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 08:10:07 +00:00
99f861f48a Update README and memory: Run 3 OOM crash, Run 4 running on f9bbef8
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 08:04:05 +00:00
f9bbef8e91 Fix: patch load_calib_amax instead of amax property setter (can't patch readonly descriptor)
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 08:02:10 +00:00
94179ed9d0 Fix typo: store_only → store_true
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 08:00:53 +00:00
03c10ab3b6 Fix model loading: use modelopt get_model() instead of raw AutoModelForCausalLM
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 06:47:27 +00:00
9438af5a8c Add commit hashes to run history table
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 06:44:18 +00:00
d7593fc1dd Update README: run history table, bug #1 already fixed, cost note, don't-repeat mistakes
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 06:31:10 +00:00
6eaba26914 Defensive quantization: snapshot amax to CPU immediately after calibration
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 06:10:19 +00:00
3907838409 Remove ModuleList patch (already fixed in modelopt 0.45), fix numbering
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 06:09:19 +00:00
382c1d872f Fix quant_module import path
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 06:08:36 +00:00
9291165ba0 Fix imports: QUANT_CFG_CHOICES is in hf_ptq, not modelopt config
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-09 06:07:25 +00:00
a0bacb3cf6 Replace shell wrapper with in-process quantize script
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-08 23:28:34 +00:00
04304fdae6 Add export crash fix patches, update README with bug #5 (repr CUDA crash)
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-08 17:31:37 +00:00
50348989b2 Clarify: V4 is NOT BF16, dequantize first