biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories
25
Public Activity
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant (2026-05-07 15:16:09 +00:00)
6008cf128d Add model_opt_nvfp4_experts_only.py

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant (2026-05-07 14:29:55 +00:00)
a7664aee7d Add BF16 upcast script and Blackwell DeepGEMM patch

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant (2026-05-07 14:27:40 +00:00)
7a3b81e833 Add BF16 upcast script and Blackwell DeepGEMM patch

biondizzle created branch modelopt-nvfp4 in biondizzle/deepseek-v4-quant (2026-05-07 07:23:06 +00:00)

biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant (2026-05-07 07:23:06 +00:00)
ef89ceffbd Add ModelOpt NVFP4 pipeline: patch, run script, README

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-07 03:38:05 +00:00)
a0bcabac5a NVFP4-everything: quantize all 2D Linear weights including attention and lm_head

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant (2026-05-07 03:06:35 +00:00)
116933dcf6 Fix: skip .cuda() when low_memory_mode; switch default to nvfp4

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant (2026-05-07 02:49:26 +00:00)
b8bdd00d19 Lower GPU max_memory to 100GiB, add CPU-only fallback for low_memory_mode

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant (2026-05-07 02:40:50 +00:00)
717151b98c Add CPU offloading and max_memory caps for FP8 model loading

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant (2026-05-07 02:08:10 +00:00)
aff12c6951 Fix forward_loop: pass as callable, not via create_forward_loop

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant (2026-05-07 02:04:55 +00:00)
492e44c0f6 Fix dataloader API: max_sample_length not seq_len, proper create_forward_loop

biondizzle created branch nvidia-modelopt in biondizzle/deepseek-v4-quant (2026-05-07 00:11:33 +00:00)

biondizzle pushed to nvidia-modelopt at biondizzle/deepseek-v4-quant (2026-05-07 00:11:33 +00:00)
b32bb2e84d NVIDIA Model Optimizer branch: nvfp4_experts_only PTQ for DeepSeek V4 Pro

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-07 00:06:02 +00:00)
c40607053b Fix remaining gate_proj/up_proj -> w1/w3 references in paired_names

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-07 00:05:28 +00:00)
771e42cef3 Fix expert pair dict keys: w1/w3 not gate_proj/up_proj

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-07 00:04:30 +00:00)
5f35a5d2b3 Gracefully handle missing scale tensors (BF16 weights with stale index entries)

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-07 00:03:21 +00:00)
4470653e15 Fix V4 tensor naming: .scale companions, w1/w3 expert pairs, ffn.gate, hc_* preserve

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-06 23:51:58 +00:00)
2b7f063e39 7 commit

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-06 23:50:55 +00:00)
be16bd023e sixth commit

biondizzle pushed to master at biondizzle/deepseek-v4-quant (2026-05-06 23:49:38 +00:00)
97e7638abc sixth commit