biondizzle
  • Joined on 2025-12-10
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 06:15:12 +00:00
a2e9b5f17f fix: add --enable-expert-parallel to compose command
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 06:09:43 +00:00
c8564caf9d fix: patch vLLM deepseek_v4.py directly in image
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 06:06:53 +00:00
7c8c6cd67f fix: add PYTHONPATH for deep_gemm import
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 06:04:26 +00:00
cffb373759 fix: symlink NVRTC lib into cuda/lib64 for linker
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 06:02:15 +00:00
983ba02c5b fix: add CUDA/NVRTC lib paths to Dockerfile
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 05:59:11 +00:00
f0471ed1c2 fix: correct CR URL to atl.vultrcr.com
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 05:57:55 +00:00
c234190a80 feat: add Dockerfile + build/push script for NVFP4 container
biondizzle pushed to nvfp4-mega-moe at biondizzle/DeepGEMM 2026-05-11 05:53:07 +00:00
aa9e53d5b2 feat: add build script for in-container compilation
biondizzle pushed to nvfp4-mega-moe at biondizzle/DeepGEMM 2026-05-11 05:52:46 +00:00
328a352119 feat: add Dockerfile for NVFP4 mega moe build
biondizzle created branch nvfp4-mega-moe in biondizzle/DeepGEMM 2026-05-11 05:46:43 +00:00
biondizzle pushed to nvfp4-mega-moe at biondizzle/DeepGEMM 2026-05-11 05:46:43 +00:00
bbf9a5f46a feat: fold weight_scale_2 into block scales in NVFP4 transform
42c215d49b docs: add NVFP4 mega MoE kernel README
36b439ee26 feat: NVFP4 mega MoE kernel (scale_vec::4X, UE4M3 block scales)
891d57b4db Add various optimizations and Mega MoE benchmarks (#316)
7f2a703ed5 [Public release 26/04] Introducing Mega MoE, FP4 Indexer and other features/fixes (#304)
biondizzle created repository biondizzle/DeepGEMM 2026-05-11 05:44:01 +00:00
biondizzle created branch mega-moe-nvfp4 in biondizzle/deepseek-v4-quant 2026-05-11 05:20:03 +00:00
biondizzle pushed to mega-moe-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 05:20:03 +00:00
e963325b61 WIP: MegaMoE NVFP4 kernel + diagnostics
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 04:57:25 +00:00
7e2f219259 fix: banner uses _os instead of os (not yet imported)
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 04:37:34 +00:00
cf54b4755a fix CRITICAL #7: UE8M0 block scale misinterpreted as E4M3
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 04:28:40 +00:00
7febeaeb71 README: document bugs #5 (input_scale) and #6 (fused_skip_regex), add version banner section, update status
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 04:28:11 +00:00
26aaaba4a2 Add version banner to patch — prints commit, arch, bugs fixed at startup
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 02:23:28 +00:00
67f9086a26 Fix critical dequantization bug: remove input_scale from weight dequant
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-11 02:02:55 +00:00
02b8ea536f Update MEMORY.md and memory files with vLLM NVFP4 serving progress