biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 19:47:11 +00:00
bf17bd3fc4 more fixes3
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 19:43:27 +00:00
c68f4e9d6e more fixes2
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 19:39:25 +00:00
4749a92fca more fixes
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 19:35:43 +00:00
1ceff541b0 more fixes
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 19:29:49 +00:00
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 19:20:12 +00:00
57512d5f0d clean up
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 18:47:32 +00:00
0d8e1bd035 restructure: move Dockerfile and docker-compose to root, docker/ → vllm/
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 18:44:09 +00:00
878ad4fc5b fix Dockerfile patch paths and add explicit env vars for debug suppression
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 18:40:18 +00:00
072a1d4a0b clean up
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 18:20:27 +00:00
1150e325bb Consolidate serving into kernel repo
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 17:44:00 +00:00
2687d1fc53 fix: convert global expert IDs to local before GEMM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 17:34:00 +00:00
128ff84358 fix: 384 experts (not 256), clarify cross-rank reduce is in caller
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 17:28:13 +00:00
1c15dadaa5 cleanup: remove dead _pack_ue4m3_to_uint32, fix data format docs
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 17:02:53 +00:00
008f8cccbd docs: comprehensive README with SF remap probe data, bug history, coordinate table
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 16:57:34 +00:00
1e0cea055c cleanup: remove all debug printfs from CUDA kernel and weight_transform
biondizzle pushed to modelopt-nvfp4 at biondizzle/deepseek-v4-quant 2026-05-14 16:47:29 +00:00
0c77a88757 sync: latest Dockerfile + nvfp4_linear.py patch from B200
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 16:40:49 +00:00
839835cba4 fix: correct SF remap coordinate extraction for flat_rank=8
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 16:26:16 +00:00
1ef2fbc2fd debug: more indices for SF layout dump
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 16:13:07 +00:00
c4b5b52a33 debug: single-thread SF layout dump at specific indices
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-14 15:58:02 +00:00
17e6033ade debug: print specific indices for SF layout coordinate decomposition