biondizzle
0 Followers
·
0 Following
Joined on
2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 22:00:56 +00:00
12c166245d Use kt directly as TMA GMEM coordinate

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:59:55 +00:00
8c9d4eb1ef Try Int32(0) + kv_coord += 1 (matching reference pattern)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:58:44 +00:00
7e832aa527 Use kvh.count for GMEM tile coordinate (pipeline-tracked SSA value)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:56:57 +00:00
254f7be884 Fix diag: remove .rank

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:56:24 +00:00
4c1dbfd0f3 Add cute.printf shape diagnostics to example9

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:55:43 +00:00
a476324682 Fix TMA shape diag: use ct tensors for LayoutEnum

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:55:05 +00:00
fd6b1e82d8 TMA shape diag: pure Python, no JIT

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:54:24 +00:00
2f670e33d1 TMA shape diagnostic: exact setup from example9 + shape prints

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:53:25 +00:00
b64227e5b6 Fix group_modes range in TMA shape diag

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:52:57 +00:00
be27720cb2 Add TMA shape diagnostic

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:51:35 +00:00
845ad98b22 Fix TMA indexing: 4-mode tensors, kt at mode 2 (GMEM tile dim)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:50:40 +00:00
61b0501a8b Fix test_fmha_v3_stage_c.py: 8-mode TMA indexing + O rescale (from example9)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:43:12 +00:00
0996ffc1ba Add fmha_v3_stage_c_example10: 8-mode TMA + O rescale + paired-atom epilogue

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:36:25 +00:00
328f9b0080 Test n=384

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:36:02 +00:00
2da0a452d1 Quick test n=128,256

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:35:41 +00:00
0d3caced47 Add: O rescale (correction_rescale) in softmax loop + remove pk from TMA/MMA

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:32:19 +00:00
c47d229e6a Sweep test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:31:21 +00:00
a751b3baf7 Sweep test: n=128,256,384,512,1024

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:29:05 +00:00
beaf60db5c DOCUMENT: TMA 8-mode indexing — the bug that cost us a full day. README + inline comments.

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-22 21:21:54 +00:00
27116110ab Fix identity diag: same 8D TMA indexing fix