biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:22:13 +00:00
1acf01fc1a Fix token_indices: repeat each token ID top_k times, not arange
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:21:48 +00:00
a478ca4746 Debug: trace runner logic step by step, test L1 GEMM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:20:43 +00:00
a100bd11c1 Simplify pipeline test: BF16 ref + bridge ref + full runner
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:19:51 +00:00
6eade5e7f8 Fix: gs values are floats not tensors
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:19:25 +00:00
b05a38a9bd Test stages 1-2 first: sort + L1 GEMM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:19:19 +00:00
9728604ea1 Pipeline test: stage-by-stage with BF16 reference comparison
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:18:22 +00:00
7fff5fd39b Fix: correct intermediate_size=3072, weight key prefix, dequantize shapes
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:17:19 +00:00
4ef345773d Rewrite pipeline test: load real weights, step-by-step vs BF16 reference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 21:00:02 +00:00
b43541afdd Fix test path setup
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 20:58:08 +00:00
490ddfa294 Pipeline test: use synthetic weights at 256x512 (JIT at 7168x18432 hangs for hours)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:15:52 +00:00
c1bb551446 Fix weight loading: skip already-loaded experts correctly
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:13:43 +00:00
955d7533f2 Use system Python for pipeline test (CuTeDSL in system site-packages)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:13:00 +00:00
925e390b93 Fix import: use direct import from vllm/ subdirectory
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:11:05 +00:00
cd6144b832 Fix imports: all functions are in cutedsl.bridge, not separate modules
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:10:07 +00:00
5e63a0d8a3 Rewrite pipeline test: use raw checkpoint weights, compare runner vs dynamic-gs reference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:08:36 +00:00
e51eafe288 Rewrite pipeline test: compare runner vs reference with real weights, step-by-step
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:07:45 +00:00
e38d60a6e8 Add pipeline test with real model weights, add swiglu_limit to reference moe_pipeline
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 18:06:46 +00:00
22e0370e6e Fix AttributeError: DeepseekV4MegaMoEExperts has no swiglu_limit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 17:56:06 +00:00
6692166d0f Update CURRENT_BUG.md: Bug 25 (swiglu_limit), shared expert path verification, variable padded offsets
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-17 17:52:17 +00:00
a10c582cf4 Add swiglu_limit=10.0 activation clamping (was missing)
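
Note on commit 1acf01fc1a ("Fix token_indices: repeat each token ID top_k times, not arange"): the snippet below is a minimal sketch of the difference that commit message describes, not the repository's actual code. It assumes the routing entries are flattened token-major, so the top_k expert slots of one token sit next to each other. The function names are hypothetical and plain Python lists stand in for tensors.

```python
def token_indices_repeat(num_tokens: int, top_k: int) -> list[int]:
    """Source token ID for every flattened (token, expert-slot) entry.

    Assumes a token-major layout: entry i belongs to token i // top_k.
    """
    return [token for token in range(num_tokens) for _ in range(top_k)]


def token_indices_arange(num_tokens: int, top_k: int) -> list[int]:
    """The pattern the commit message calls wrong: one running index per entry."""
    return list(range(num_tokens * top_k))


if __name__ == "__main__":
    num_tokens, top_k = 3, 2
    print(token_indices_repeat(num_tokens, top_k))  # [0, 0, 1, 1, 2, 2]
    print(token_indices_arange(num_tokens, top_k))  # [0, 1, 2, 3, 4, 5]
```

With 3 tokens and top_k=2, the arange variant yields indices 3, 4, and 5, which do not correspond to any token, while the repeated variant maps every entry back to a valid token ID.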