biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 09:02:54 +00:00)
851ec9b4d5 P3 WIP: fused RMSNorm + quantize kernel skeleton (not yet integrated)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:43:42 +00:00)
b13c1057f5 test: verify GEMM shape with production weight format

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:41:23 +00:00)
40fb49d670 test: verify GEMM output shape

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:41:08 +00:00)
f01d3f3eac wip: SE fused SwiGLU deinterleave fix

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:29:06 +00:00)
1726cb64a9 fix: interleave_l1_weights granularity_bf16 (not granularity) in SE

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:25:55 +00:00)
553275d810 feat: P1 — add eager warmup_fused_swiglu_compilation for SharedExpert (1-group)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:24:38 +00:00)
5ed4c86137 fix: expert_offsets for 4-expert fused SwiGLU test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:23:31 +00:00)
53362d2579 test: isolate fused SwiGLU — test no-clamp first

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:22:31 +00:00)
ae4506d722 fix: w_gs is scalar not iterable

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:21:34 +00:00)
b0c71b947e test: fused SwiGLU — smoke test + correctness comparison with graceful degradation

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:20:29 +00:00)
2cfca36095 fix: compute correct gs from data in fused SwiGLU test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:19:41 +00:00)
4a05a40cf0 fix: fused SwiGLU test — proper weight quant + 128-token alignment

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:18:28 +00:00)
fa769b6214 fix: pad activation as uint8 view for float4 dtype

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:17:37 +00:00)
024be1a60b fix: test weight quantization dtype for fused SwiGLU test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:16:42 +00:00)
19afa52e80 fix: use cute.where() directly for clamp in fused SwiGLU

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:16:04 +00:00)
5c746bbdf2 fix: TensorSSA-compatible clamp in fused SwiGLU kernel

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:12:56 +00:00)
3a30f35c68 fix: cute.math.fmin/fmax → cute.arch.fmin/fmax in fused SwiGLU kernel

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:11:23 +00:00)
fca72427ea fix: add fp4_out/sf_out/l2_global_scale params to fused_swiglu kernel() signature

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:10:01 +00:00)
55ea109cca test: fused SwiGLU kernel compilation + correctness (P0/P1 gate)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-02 08:00:07 +00:00)
7904cf05c4 Add set_fused_swiglu() method to Nvfp4MoE