biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 01:27:19 +00:00
7fbbdc5204
diag: validate router output before MoE
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 01:26:53 +00:00
f5fa84016e
diag: sync+error check after each layer on first token
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 01:14:41 +00:00
91b3929605
fix: call moe_runner.run() and se_runner.run() (not __call__)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 01:08:05 +00:00
03c45d4bfb
fix: pass int32 token_ids to hash router (was int64)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 01:01:16 +00:00
62efde5c9f
fix: router — use cuBLAS BF16 GEMM + activation_topk CUDA kernel (production path, not CuTeDSL fused)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:59:21 +00:00
5591a725e1
fix: router kernel — infer OperandMajorMode from tensor layout (same pattern as MoE GEMM)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:56:02 +00:00
0ab5d8c317
fix: disable broken CuTeDSL fused router — use BF16 linear + activation_topk (both are production paths)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:54:20 +00:00
c339fe7ad9
fix: router A operand major mode MN (not K) — fixes CuTeDSL local_tile coord error
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:42:13 +00:00
b7a8c44d26
single_shot: eager MoE/SE weight processing, stale GPU cleanup, --prefill-tokens flag
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:32:48 +00:00
15f45b57c3
fix: correct Nvfp4Linear dimension inference from checkpoint weights
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:30:40 +00:00
e671780008
fix: transpose checkpoint weights before make_b_k_major in Nvfp4Linear/SharedExpert
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:18:37 +00:00
e8a7a9256f
fix: convert uint8 checkpoint weights to float4_e2m1fn_x2 for CuTeDSL GEMM
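For context on what those uint8 bytes hold: an FP4 E2M1 value has 1 sign bit, 2 exponent bits, and 1 mantissa bit, so two values fit in one byte. The sketch below decodes one packed byte in plain Python. It is illustrative only and not taken from the repository; the helper names are hypothetical, and the low-nibble-first ordering is an assumption that should be checked against the checkpoint's packing convention.

```python
# Magnitudes representable by E2M1 (exponent bias 1), indexed by the low 3 bits.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def decode_e2m1(nibble):
    """Decode one 4-bit E2M1 code (bit 3 is the sign) into a Python float."""
    sign = -1.0 if nibble & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[nibble & 0x7]

def unpack_fp4_byte(byte):
    """Split one uint8 into two FP4 values.

    Assumption: the low nibble holds the first element, the high nibble the second.
    """
    return decode_e2m1(byte & 0xF), decode_e2m1(byte >> 4)

first, second = unpack_fp4_byte(0x72)  # low nibble 0x2, high nibble 0x7
```

The decoded values still need the per-block and global scale factors applied before they represent actual weights; this sketch covers only the element encoding.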
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:10:54 +00:00
172448514c
fix: fold weight_scale_2 into global_scale_b for NVFP4 GEMM
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:04:50 +00:00
563df02aef
fix: import SF_VEC_SIZE from quantize in gemm_runner (was NameError)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-01 00:00:09 +00:00
be476b2ce2
router: catch CuTeDSL warmup failures fast, don't let MLIR errors slow down init
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-31 23:55:18 +00:00
56dff8d185
fix: W_gate is (H, E) but F.linear expects (E, H), transpose before linear
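The mismatch in this commit follows from how `torch.nn.functional.linear` treats its weight: it computes `x @ weight.T` and expects `weight` stored as `(out_features, in_features)`. Below is a minimal pure-Python sketch of that convention, assuming H is the hidden size and E the number of experts. The helpers `linear` and `transpose` are illustrative stand-ins, not code from the repository.

```python
def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def linear(x, weight):
    """Mimic F.linear without bias: y = x @ weight.T, weight is (out, in)."""
    in_features = len(x[0])
    if any(len(row) != in_features for row in weight):
        raise ValueError("weight must be (out_features, in_features)")
    return [[sum(a * b for a, b in zip(xr, wr)) for wr in weight] for xr in x]

H, E = 3, 2
x = [[1.0, 2.0, 3.0]]                            # (tokens=1, H)
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # stored as (H, E)

# Passing w_gate directly would fail the shape check; transpose to (E, H) first.
logits = linear(x, transpose(w_gate))            # shape (1, E)
```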
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-31 23:54:17 +00:00
5396a04c28
router: broaden except to catch all CuTeDSL errors, fall through to cuBLAS+activation_topk path
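The fall-through behaviour described here can be sketched as a generic try-then-fallback wrapper. This is a plain-Python illustration under stated assumptions: `route_with_fallback`, `broken_fused`, and `reference` are hypothetical names, and the real router paths (CuTeDSL fused kernel, cuBLAS plus `activation_topk`) are replaced by trivial stand-ins.

```python
def route_with_fallback(x, fused_path, reference_path):
    """Try the fused path; on any failure, use the reference path instead."""
    try:
        return fused_path(x), "fused"
    except Exception as exc:  # deliberately broad, as the commit describes
        print(f"fused router failed ({type(exc).__name__}: {exc}); falling back")
        return reference_path(x), "fallback"

def broken_fused(x):
    # Stand-in for a kernel that fails during compilation or warmup.
    raise RuntimeError("compile error")

def reference(x):
    # Stand-in for the reference path: keep the two largest scores.
    return sorted(x, reverse=True)[:2]

result, path = route_with_fallback([0.1, 0.7, 0.2], broken_fused, reference)
```

A broad `except` keeps initialization moving but can also hide unrelated bugs, so logging the exception type, as above, is worth keeping.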
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-31 23:53:15 +00:00
3b5b9f487c
fix: compute num_tma_load_bytes inside cute.compile context
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-31 23:51:44 +00:00
1bc0da0f35
fix: properly scope swap code inside else/guard blocks, replace continue with if guard
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-31 23:50:41 +00:00
d0d765e1f2
fix: replace break statements with flag-based loops in router kernel (CuTeDSL restriction)
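The rewrite named in this commit, replacing `break` with a flag that guards the loop body, can be illustrated in ordinary Python. This is a sketch of the general pattern only, not the router kernel itself; both function names are hypothetical.

```python
def first_index_with_break(values, target):
    """Early-exit search using break."""
    found = -1
    for i, v in enumerate(values):
        if v == target:
            found = i
            break
    return found

def first_index_with_flag(values, target):
    """Same search without break: the loop runs to completion,
    and a flag disables the body once the match is recorded."""
    found = -1
    done = False
    for i, v in enumerate(values):
        if not done and v == target:
            found = i
            done = True
    return found
```

The flag version always executes every iteration, so it trades the early exit for control flow with a fixed trip count.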