biondizzle
Joined on 2025-12-10
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 05:19:49 +00:00
5e6d459145 Fix MHC custom op registration
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 05:07:44 +00:00
9ff1679064 Replace MHC TileLang kernels with pure PyTorch
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 04:50:32 +00:00
5c770c68ca Keep MoE scale tensors: framework warmup needs them
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 04:42:23 +00:00
e0f385ac45 Fix workspace_shapes: output dim is hidden_dim, not K*2
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 04:36:02 +00:00
cfd8ec741d Debug: add shape mismatch logging in MoE apply
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 04:28:09 +00:00
ffc1a5c6a8 Fix workspace_shapes: remove wrong assertion, compute output dim from K
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 04:21:37 +00:00
f023b3b2c6 Fix: wrap dummy MoE weights in nn.Parameter
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 04:17:12 +00:00
b06dcb40dc Fix MoE w1=None crash: keep shape-preserving dummy weights on CPU
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 04:10:04 +00:00
c289c44920 Fix BF16 wo_a: per-group BMM instead of flat linear
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:58:27 +00:00
6f9a400ae0 Fix hc_head mapping: checkpoint uses hc_head.hc_fn, model params are flat hc_head_fn
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:54:16 +00:00
909a2710e4 Fix double lm_head mapping: NVFP4 checkpoint already uses correct names
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:47:28 +00:00
4cf5b8b751 Fix compressor path: attn.mla_attn.compressor (not attn.compressor)
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:44:41 +00:00
9d41419e9f Debug: print compressor params to diagnose KeyError
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:35:16 +00:00
db5192fe41 Patch from Docker image's vLLM (0.20.2rc1) instead of newer upstream
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:29:07 +00:00
df5a496f5d Fix: make eager_break_during_capture import conditional for older vLLM
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:22:11 +00:00
4ed91b81d0 Fix inverse RoPE formula: swap signs on cross terms
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:22:02 +00:00
fece06f746 Add unit tests for NVFP4 weight mapper and inverse RoPE BF16
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:20:43 +00:00
b0b5113467 Fix weight mapper: compressor → attn.compressor (not mla_attn), quant weights_proj
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 03:13:56 +00:00
396a83ea56 Clean vLLM integration: use official paths, BF16 wo_a, proper weight mapper
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 02:47:32 +00:00
b856ee9315 Clean up debug scripts