biondizzle/deepseek-v4-quant
f2656dcf6d4013adf8ee7e05ab5be36dc3ef4604
deepseek-v4-quant/patches
biondizzle
f2656dcf6d
sync B200 deployment files: Dockerfile, docker-compose, patches
2026-05-14 14:13:18 +00:00
File | Last commit | Date
deepseek_v4_attention.py | sync B200 deployment files: Dockerfile, docker-compose, patches | 2026-05-14 14:13:18 +00:00
deepseek_v4.py | sync B200 deployment files: Dockerfile, docker-compose, patches | 2026-05-14 14:13:18 +00:00
deepseek_v4.py.bak | S11: Fixed substr mapping, stacking, suffix, and o_a_proj - loads weights but attention forward uses FP8 einsum incompatible with NVFP4 | 2026-05-10 17:45:53 +00:00
deepseek_v4.py.s11 | S11: Fixed substr mapping, stacking, suffix, and o_a_proj - loads weights but attention forward uses FP8 einsum incompatible with NVFP4 | 2026-05-10 17:45:53 +00:00
patch_finegrained_fp8_blackwell.py | Add BF16 upcast script and Blackwell DeepGEMM patch | 2026-05-07 14:25:30 +00:00
patch_vllm_weights.py | vLLM serving: patched deepseek_v4.py, disabled mega_moe, updated docs | 2026-05-10 16:14:17 +00:00
quant_module_patched.py | Add ModelOpt NVFP4 pipeline: patch, run script, README | 2026-05-07 07:22:54 +00:00
staging_kernel.py | sync B200 deployment files: Dockerfile, docker-compose, patches | 2026-05-14 14:13:18 +00:00