biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 03:26:28 +00:00
b1778eedf8
wip: Step 2 gate/up pairing — SiLU validated, runtime conditionals blocked by CuTeDSL
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 03:16:37 +00:00
842bb42ed1
wip: Step 1 SiLU validation complete, Step 2 gate/up pairing planning
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 03:12:25 +00:00
77cc28cc92
fix: cutlass.Float32 not cutlass.float32_t in fused epilogue
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 03:10:58 +00:00
ed89e678be
wip: add run_fused_swiglu_grouped_gemm bridge + step1 test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 03:07:09 +00:00
2fcd5f1902
wip: fused SwiGLU Stage 1 - SiLU in registers (full acc_vec)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 03:04:40 +00:00
9cdf79fd9c
wip: fused SwiGLU kernel scaffold + bridge interleave + plan
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 02:17:45 +00:00
2f8b26c176
chore: remove unused _expert_id_range after bincount migration
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 02:17:24 +00:00
7e2adb7e85
perf: replace expert counting O(n*E) comparison with torch.bincount O(n)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 02:16:51 +00:00
d59b10e170
fix: zero out x_norm for underflow blocks before division in NVFP4 quantization
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 02:14:59 +00:00
c8fa87fac7
fix: detect zero blocks in NVFP4 quantization, force FP4+FP8 to exact zero
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 02:11:43 +00:00
3c6b5a0522
chore: deprecate prepare_weights_from_dequantized and prepare_weights_direct
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 02:08:28 +00:00
3181f74c86
fix: correct scale factor dimensions in warmup (K_sf = ceil_div(K_packed,8) not ceil_div(K_packed,16))
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 02:08:09 +00:00
cc6b094450
fix: root-cause JIT memory corruption myth, add eager warmup, remove _needs_token_refill
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 01:36:31 +00:00
039a9e27d6
fix: handle 3D swa_indices and correct kv_bf16 expand dims
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 01:28:07 +00:00
b3f6f260ce
feat: add native CuTeDSL SWA decode attention kernel stub + batched SDPA fallback
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 00:02:18 +00:00
268dc251c1
fix: replace _allocate_buffers with _ensure_buffer_size for dynamic sizing
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-20 00:00:02 +00:00
09669dded4
fix: dynamic buffer sizing in nvfp4_linear for varying token counts
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-19 23:19:01 +00:00
02b9c1ac20
nuke vllm because this keep confusing people
02b57071be
Update README.md and CURRENT_BUG.md: eliminate stale issues, document NaN investigation, clarify our kernels are clean
7070fadf72
Add full layer NaN test (attention + MoE, multi-layer chain)
152b0749df
Use 16 experts for MoE runner test (fits in memory)
daa59a7c75
Add MoE runner NaN test (grouped GEMM with real weights)
Compare 166 commits »
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel, 2026-05-19 23:04:39 +00:00
02b9c1ac20
nuke vllm because this keep confusing people
biondizzle created repository biondizzle/dsv4-nvfp4-workspace, 2026-05-19 22:43:52 +00:00