biondizzle

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-07 15:16:09 +00:00

6008cf128d Add model_opt_nvfp4_experts_only.py

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-07 14:29:55 +00:00

a7664aee7d Add BF16 upcast script and Blackwell DeepGEMM patch

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-07 14:27:40 +00:00

7a3b81e833 Add BF16 upcast script and Blackwell DeepGEMM patch

biondizzle created branch modelopt-nvfp4 in biondizzle/deepseek-v4-quant

2026-05-07 07:23:06 +00:00

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant

2026-05-07 07:23:06 +00:00

ef89ceffbd Add ModelOpt NVFP4 pipeline: patch, run script, README

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-07 03:38:05 +00:00

a0bcabac5a NVFP4-everything: quantize all 2D Linear weights including attention and lm_head

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant

2026-05-07 03:06:35 +00:00

116933dcf6 Fix: skip .cuda() when low_memory_mode; switch default to nvfp4

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant

2026-05-07 02:49:26 +00:00

b8bdd00d19 Lower GPU max_memory to 100GiB, add CPU-only fallback for low_memory_mode

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant

2026-05-07 02:40:50 +00:00

717151b98c Add CPU offloading and max_memory caps for FP8 model loading

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant

2026-05-07 02:08:10 +00:00

aff12c6951 Fix forward_loop: pass as callable, not via create_forward_loop

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant

2026-05-07 02:04:55 +00:00

492e44c0f6 Fix dataloader API: max_sample_length not seq_len, proper create_forward_loop

biondizzle created branch nvidia-modelopt in biondizzle/deepseek-v4-quant

2026-05-07 00:11:33 +00:00

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant

2026-05-07 00:11:33 +00:00

b32bb2e84d NVIDIA Model Optimizer branch: nvfp4_experts_only PTQ for DeepSeek V4 Pro

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-07 00:06:02 +00:00

c40607053b Fix remaining gate_proj/up_proj -> w1/w3 references in paired_names

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-07 00:05:28 +00:00

771e42cef3 Fix expert pair dict keys: w1/w3 not gate_proj/up_proj

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-07 00:04:30 +00:00

5f35a5d2b3 Gracefully handle missing scale tensors (BF16 weights with stale index entries)

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-07 00:03:21 +00:00

4470653e15 Fix V4 tensor naming: .scale companions, w1/w3 expert pairs, ffn.gate, hc_* preserve

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-06 23:51:58 +00:00

2b7f063e39 7 commit

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-06 23:50:55 +00:00

be16bd023e sixth commit

biondizzle pushed to master at biondizzle/deepseek-v4-quant

2026-05-06 23:49:38 +00:00

97e7638abc sixth commit