deepseek-v4-quant/patches
biondizzle e963325b61 WIP: MegaMoE NVFP4 kernel + diagnostics
- Force use_mega_moe=True for NVFP4 pipeline
- DeepseekV4MegaMoEExperts: load NVFP4 params (float8 block scales,
  float32 global/input scales), convert NVFP4→BF16→MXFP4 in
  finalize_weights for the DeepGEMM mega_moe kernel
- Add _nvfp4_to_bf16 and _bf16_to_mxfp4 conversion methods
- Remove expert_dtype check blocking mega_moe
- Add diagnostics for wo_a and bf16 layer conversion
- Still WIP: attention layer bugs under investigation
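
The NVFP4→BF16→MXFP4 path named above can be illustrated with a minimal NumPy sketch. It is not the code from this commit: the function names below are hypothetical and are not the `_nvfp4_to_bf16` / `_bf16_to_mxfp4` methods. It makes several assumptions: float32 stands in for BF16 (NumPy has no native bfloat16), 4-bit codes are kept unpacked (one per byte), and block scales are passed as already-decoded floats. It also assumes NVFP4 dequantizes as code value × per-16-element block scale × global scale, and that MXFP4 uses a power-of-two scale shared by 32 elements. Some checkpoints store the global scale as a reciprocal, so check the convention before reusing this.

```python
import numpy as np

# The 8 non-negative magnitudes representable in E2M1 (FP4); bit 3 of a code is the sign.
E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)


def decode_e2m1(codes):
    """Map unpacked 4-bit codes (0..15) to float32 values."""
    codes = np.asarray(codes, dtype=np.uint8)
    sign = np.where(codes & 0x8, -1.0, 1.0).astype(np.float32)
    return sign * E2M1_VALUES[codes & 0x7]


def encode_e2m1(x):
    """Round to the nearest E2M1 magnitude, saturating at 6.0.

    Ties resolve to the smaller magnitude here (argmin picks the first match),
    which is a simplification of round-to-nearest-even.
    """
    x = np.asarray(x, dtype=np.float32)
    idx = np.abs(np.abs(x)[..., None] - E2M1_VALUES).argmin(axis=-1).astype(np.uint8)
    return idx | (np.signbit(x).astype(np.uint8) << 3)


def nvfp4_to_float(codes, block_scales, global_scale, block=16):
    """Dequantize: value = e2m1(code) * block_scale * global_scale (assumed convention)."""
    vals = decode_e2m1(codes).reshape(-1, block)
    scales = np.asarray(block_scales, dtype=np.float32).reshape(-1, 1)
    return (vals * scales * np.float32(global_scale)).reshape(np.shape(codes))


def float_to_mxfp4(x, block=32):
    """Quantize to E2M1 codes plus one power-of-two exponent per block of 32."""
    x = np.asarray(x, dtype=np.float32)
    blocks = x.reshape(-1, block)
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    # Shared exponent = floor(log2(amax)) - 2, where 2 is the largest E2M1 exponent
    # (6.0 = 1.5 * 2**2). All-zero blocks fall back to exponent 0.
    safe = np.where(amax > 0, amax, 1.0)
    exp = np.where(amax > 0, np.floor(np.log2(safe)) - 2, 0.0)
    exp = np.clip(exp, -127, 127)
    codes = encode_e2m1(blocks / np.exp2(exp).astype(np.float32))
    return codes.reshape(x.shape), exp.astype(np.int8).reshape(-1)


def mxfp4_to_float(codes, exps, block=32):
    """Dequantize MXFP4: value = e2m1(code) * 2**exp for the code's block."""
    vals = decode_e2m1(codes).reshape(-1, block)
    scales = np.exp2(np.asarray(exps, dtype=np.float32)).reshape(-1, 1)
    return (vals * scales).reshape(np.shape(codes))
```

Because the two formats differ in block size (16 vs. 32) and scale type (FP8 vs. power of two), the round trip through the intermediate dtype is lossy in general. It is exact only when every dequantized value lands on the MXFP4 grid of its block.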
2026-05-11 05:19:49 +00:00