deepseek-v4-quant

master

a0bcabac5a · NVFP4-everything: quantize all 2D Linear weights including attention and lm_head · Updated 2026-05-07 03:38:02 +00:00

modelopt-nvfp4 0c77a88757 · sync: latest Dockerfile + nvfp4_linear.py patch from B200 · Updated 2026-05-14 16:47:27 +00:00 biondizzle	1 69
mega-moe-nvfp4 f08bcd456b · tweax n shit · Updated 2026-05-12 23:16:33 +00:00 biondizzle	1 134
nvidia-modelopt 116933dcf6 · Fix: skip .cuda() when low_memory_mode; switch default to nvfp4 · Updated 2026-05-07 03:06:33 +00:00 biondizzle	1 6