biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 06:52:00 +00:00)
4f28673bec debug: disable sinks in SDPA to check |X| impact

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 06:40:03 +00:00)
e3db90b56c switch back to original prompt

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 06:39:34 +00:00)
d2cf5ccc32 CRITICAL FIX: use SDPA for short sequences (FMHA padding bug)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 06:28:48 +00:00)
5f98855141 test with simpler prompt

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 06:17:01 +00:00)
152af7295a debug: compare FMHA vs SDPA output at layer 0

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 06:07:07 +00:00)
59c75ca4e9 fix: cast attn_out back to BF16 after sink correction

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 06:03:14 +00:00)
e5245ea34e fix: V tensor must be (B, n_h, hd, N) for FMHA — was transposed wrong

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 05:58:03 +00:00)
91abf0f921 FMHA + analytic sink bias correction using LSE

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 05:55:11 +00:00)
fac269c938 fix verify_attention: proper multi-head SDPA + GQA

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 05:53:50 +00:00)
2333fc8b4b fix verify_attention.py: proper nvfp4_linear calls

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 05:51:37 +00:00)
c09f68c867 add verify_attention.py: single-layer attention component test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:51:19 +00:00)
04dd7545b3 switch to production FMHA for full run

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:51:17 +00:00)
738088cf49 revert: K=V with RoPE + inverse RoPE is the correct DSV4 approach

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:46:16 +00:00)
781ee43521 try separate K (RoPE'd) and V (raw) — no inverse RoPE needed

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:45:59 +00:00)
889521009b re-enable inverse RoPE (confirmed necessary — without it output is garbage)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:40:43 +00:00)
92e465ca04 debug: disable inverse RoPE to check impact on output

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:38:43 +00:00)
c69dc51b3b switch to SDPA with sinks (better residual control)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:32:03 +00:00)
3ed8f3cc44 switch back to production FMHA kernel (with FP4 LUT fix)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:25:04 +00:00)
ae79bd8fce debug: add top-5 logit predictions

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 04:16:15 +00:00)
aafe2eee12 CRITICAL FIX: FP4 LUT was 4x too large!