biondizzle
0 Followers · 0 Following · Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:45:24 +00:00
9264023e3b Update STAGE_D.md with D5b results: merge cos 0.961, LSE err=0.0

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:43:05 +00:00
2ced9d0da7 D5b: Fix reference computation - use logsumexp for stable LSE, fix o_unnorm definition

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:36:28 +00:00
2883e042ca D5b MILESTONE: SWA+sink merge works! cos 0.969

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:35:41 +00:00
70763030c0 D5b: Use normalized O + LSE for merge (correct formula), always output LSE

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:33:46 +00:00
28949da6e4 D5b: Clean up merge test - stable formula for both ref and kernel

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:31:54 +00:00
9e1859827f D5b: Use reference per-row LSE for proper O normalization

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:31:06 +00:00
48d37d652e D5b: Fix kernel_obj reference

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:30:02 +00:00
caf89c65bf D5b: Fix syntax error

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:19:27 +00:00
3dd9cd6a94 D5b: Debug reference formula mismatch, add numerically stable merge

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:18:07 +00:00
98390df27e D5b: Python SWA+sink merge test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:16:54 +00:00
60e03fe84a Update STAGE_D.md: D5a done, CG-2/CG-3 status updated, tOrP0 offset rule added

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:15:15 +00:00
edc283e6c1 D5a: Fix LSE formula - lse = ln(row_sum) + row_max * ln(2)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:14:00 +00:00
6ca294ed6d D5a: Use tensor indexing for LSE write

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:13:09 +00:00
7e91d76669 D5a: Use cute.store for LSE write

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:12:32 +00:00
751abd9b18 D5a: Fix LSE - compute row_max_safe from final row_max, remove mLSE None check

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:11:41 +00:00
d6ea7f3ebd D5a: Fix - add normalize param to __init__

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:10:41 +00:00
c80f223d08 D5a: Add normalize flag + LSE output

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:08:00 +00:00
542bc7b1b0 D1.3: Use const_expr if for tOrP0 compile-time selection

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:07:11 +00:00
37edd783ce D1.3: Pre-compute tOrP0_offset in _setup, use const_expr for compile-time selection

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 21:06:18 +00:00
972fbd48b9 D1.3: Use const_expr for tOrP0 offset (compile-time conditional)
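
Several of the D5a/D5b commit messages above refer to an LSE formula (`lse = ln(row_sum) + row_max * ln(2)`) and to merging normalized outputs using their LSEs. The repository's code is not shown on this page, so the following is only a minimal pure-Python sketch of what those messages appear to describe, assuming the kernel runs its softmax in the base-2 domain and that the merge is the standard log-sum-exp combination of two partial attention results. All function names here (`lse_base2`, `partial_attention`, `merge`) are hypothetical and not taken from the repository.

```python
import math

def lse_base2(scores):
    """Natural-log LSE of `scores`, computed in the base-2 domain.

    Assumption: scores are pre-scaled by log2(e), so row_max and row_sum
    live in the log2 domain, as the D5a commit message suggests.
    """
    log2e = math.log2(math.e)
    scaled = [s * log2e for s in scores]
    row_max = max(scaled)
    row_sum = sum(2.0 ** (x - row_max) for x in scaled)
    # ln(sum_i exp(s_i)) = ln(row_sum * 2**row_max)
    return math.log(row_sum) + row_max * math.log(2.0)

def partial_attention(scores, values):
    """Normalized output and LSE for one block of keys (single query row)."""
    lse = lse_base2(scores)
    weights = [math.exp(s - lse) for s in scores]
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, lse

def merge(o_a, lse_a, o_b, lse_b):
    """Combine two normalized partial outputs using their LSEs."""
    m = max(lse_a, lse_b)  # subtract the max first for numerical stability
    lse = m + math.log(math.exp(lse_a - m) + math.exp(lse_b - m))
    w_a = math.exp(lse_a - lse)
    w_b = math.exp(lse_b - lse)
    return [w_a * x + w_b * y for x, y in zip(o_a, o_b)], lse

if __name__ == "__main__":
    scores = [0.5, -1.0, 2.0, 0.3]
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 3.0]]
    o_a, lse_a = partial_attention(scores[:2], values[:2])
    o_b, lse_b = partial_attention(scores[2:], values[2:])
    print(merge(o_a, lse_a, o_b, lse_b))
    print(partial_attention(scores, values))
```

Under these assumptions, merging the two halves should reproduce attention over the full key set up to floating-point error. That is the kind of check the "merge cos" and "LSE err" figures in the commit messages seem to report, although the actual test in the repository may differ.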