biondizzle

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 05:19:49 +00:00

5e6d459145 Fix MHC custom op registration

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 05:07:44 +00:00

9ff1679064 Replace MHC TileLang kernels with pure PyTorch

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 04:50:32 +00:00

5c770c68ca Keep MoE scale tensors: framework warmup needs them

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 04:42:23 +00:00

e0f385ac45 Fix workspace_shapes: output dim is hidden_dim, not K*2

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 04:36:02 +00:00

cfd8ec741d Debug: add shape mismatch logging in MoE apply

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 04:28:09 +00:00

ffc1a5c6a8 Fix workspace_shapes: remove wrong assertion, compute output dim from K

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 04:21:37 +00:00

f023b3b2c6 Fix: wrap dummy MoE weights in nn.Parameter

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 04:17:12 +00:00

b06dcb40dc Fix MoE w1=None crash: keep shape-preserving dummy weights on CPU

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 04:10:04 +00:00

c289c44920 Fix BF16 wo_a: per-group BMM instead of flat linear

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:58:27 +00:00

6f9a400ae0 Fix hc_head mapping: checkpoint uses hc_head.hc_fn, model params are flat hc_head_fn

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:54:16 +00:00

909a2710e4 Fix double lm_head mapping: NVFP4 checkpoint already uses correct names

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:47:28 +00:00

4cf5b8b751 Fix compressor path: attn.mla_attn.compressor (not attn.compressor)

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:44:41 +00:00

9d41419e9f Debug: print compressor params to diagnose KeyError

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:35:16 +00:00

db5192fe41 Patch from Docker image's vLLM (0.20.2rc1) instead of newer upstream

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:29:07 +00:00

df5a496f5d Fix: make eager_break_during_capture import conditional for older vLLM

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:22:11 +00:00

4ed91b81d0 Fix inverse RoPE formula: swap signs on cross terms

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:22:02 +00:00

fece06f746 Add unit tests for NVFP4 weight mapper and inverse RoPE BF16

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:20:43 +00:00

b0b5113467 Fix weight mapper: compressor → attn.compressor (not mla_attn), quant weights_proj

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 03:13:56 +00:00

396a83ea56 Clean vLLM integration: use official paths, BF16 wo_a, proper weight mapper

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 02:47:32 +00:00

b856ee9315 Clean up debug scripts