biondizzle

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 14:54:53 +00:00

1a36a655ea Fix: use full argparse flag names (--calib_size, --kv_cache_qformat)

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 14:52:03 +00:00

b2849a8944 Fundamental rewrite: call hf_main() instead of rewriting the pipeline

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 13:40:01 +00:00

a70593d886 Update run history: Run 6 (dataloader crash), Run 7 running on 25b4d8d

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 13:37:29 +00:00

25b4d8da06 Fix: add missing args for make_calib_dataloader (dataset, calib_with_images, auto_quantize, specdec)

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 09:29:22 +00:00

d1e15178b2 Update run history: Runs 4-5 (import bugs), Run 6 running on 6c1bff6

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 09:26:25 +00:00

6c1bff6997 Clean rewrite: verified all imports against runtime, removed dead code

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 09:17:13 +00:00

86dd8df302 Fix: KV_QUANT_CFG_CHOICES is in hf_ptq, not mtq

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 08:10:07 +00:00

99f861f48a Update README and memory: Run 3 OOM crash, Run 4 running on f9bbef8

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 08:04:05 +00:00

f9bbef8e91 Fix: patch load_calib_amax instead of amax property setter (can't patch readonly descriptor)

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 08:02:10 +00:00

94179ed9d0 Fix typo: store_only → store_true

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 08:00:53 +00:00

03c10ab3b6 Fix model loading: use modelopt get_model() instead of raw AutoModelForCausalLM

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 06:47:27 +00:00

9438af5a8c Add commit hashes to run history table

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 06:44:18 +00:00

d7593fc1dd Update README: run history table, bug #1 already fixed, cost note, don't-repeat mistakes

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 06:31:10 +00:00

6eaba26914 Defensive quantization: snapshot amax to CPU immediately after calibration

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 06:10:19 +00:00

3907838409 Remove ModuleList patch (already fixed in modelopt 0.45), fix numbering

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 06:09:19 +00:00

382c1d872f Fix quant_module import path

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 06:08:36 +00:00

9291165ba0 Fix imports: QUANT_CFG_CHOICES is in hf_ptq, not modelopt config

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-09 06:07:25 +00:00

a0bacb3cf6 Replace shell wrapper with in-process quantize script

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-08 23:28:34 +00:00

04304fdae6 Add export crash fix patches, update README with bug #5 (repr CUDA crash)

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-08 17:31:37 +00:00

50348989b2 Clarify: V4 is NOT BF16, dequantize first