biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 13:19:30 +00:00
0c3796966d Add BF16 fallback for shared expert: dequantize NVFP4 → BF16 F.linear
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 12:56:54 +00:00
2866eb92e7 Fix W_gate device: ensure .to(dev) after transpose
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 12:42:20 +00:00
bd10bdbbd9 Fix router gate W_gate shape: must be (H, E) not (E, H)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 12:30:37 +00:00
dc5a24687e Switch router gate from NVFP4 to BF16 (dequantize)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 12:19:39 +00:00
cfea22cd6f Update PyTorch reference with official DSV4 encoding + batched prefill
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 11:37:44 +00:00
bdd9ab9669 Switch lm_head from NVFP4 to BF16 GEMM
biondizzle pushed tag pure-nvfp4 to biondizzle/nvfp4-megamoe-kernel 2026-06-03 11:37:17 +00:00
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 11:17:52 +00:00
3320abfe24 Fix two correctness bugs: compressor pos bias on KV + SwiGLU clamp ordering
biondizzle pushed tag v-official-encoding-path to biondizzle/nvfp4-megamoe-kernel 2026-06-03 11:06:13 +00:00
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 10:53:47 +00:00
7901470e63 doc clean up
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 10:25:26 +00:00
ca7c309463 Add reference/ dir: vLLM tokenizers, reasoning parsers, tool parsers, official inference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 10:23:05 +00:00
8cfc1cae58 Canonical encoding: derive special token IDs from official encoding module + tokenizer
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 09:59:06 +00:00
a86d6d90a5 Replace hand-rolled prompt with official DSV4 encoder (canonical path)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 09:31:02 +00:00
284fc9ca86 Fix: thread comp_rope_cos/comp_rope_sin through forward_attention
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 09:19:14 +00:00
6a3374da18 Cross-check 2 complete: block-aligned comp_pos + compress_rope_theta wired through
5003e756e2 WIP: cross-check 2 fix — block-aligned compressed RoPE positions + compress_rope_theta support
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 09:01:04 +00:00
572bdd2840 auto: pre-test commit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 08:47:55 +00:00
3c06fd5591 Test 2: fix topk tensor shape (flatten before iterating)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 08:36:05 +00:00
89f6e64057 README: document test harness gotchas (timeout arg, stale procs, screen names)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 08:21:43 +00:00
29d6986dd4 Test 2: fix quantize_to_nvfp4 import
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-06-03 08:20:24 +00:00
60b9bbd470 Test 2: fix import - use mHCLayer from dsv4.layers.mhc, fixed prompt encoding