biondizzle
0 Followers
·
0 Following
Joined on
2025-12-10
Repositories
25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 03:19:55 +00:00)
791cd8b9c7 merge: keep our fmha.py (coordinate-indexed SMEM-P + epilogue_tma_store)
6313974fba D1.5: Fix SMEM-P - use coordinate-indexed store (same proven pattern)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 03:19:19 +00:00)
3fcb7a0a48 feat: SMEM-P with make_tiled_copy_tv + partition_S
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 03:18:40 +00:00)
153db24be2 D1.5: Always output un-normalized O + LSE (epilogue_tma_store only, no TMEM round-trip normalize)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 03:16:35 +00:00)
d68ab348bb feat: SMEM-P using make_tiled_copy_A from PV MMA
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:45:09 +00:00)
b4a985631b fix: fence_proxy not fence
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:44:15 +00:00)
95f0898c64 merge: resolve conflict (keep our version)
228ec3c638 D1.5: Replace broken make_cotiled_copy SMEM-P with coordinate-indexed store
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:43:13 +00:00)
c58ca550ae feat: SMEM-P with make_tiled_copy_tv + manual fill
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:42:30 +00:00)
8faac948fc feat: SMEM-P using make_tiled_copy_tv + logical sP view
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:41:40 +00:00)
eda7d40df2 Merge branch 'master' of ssh://sweetapi.com:2222/biondizzle/nvfp4-megamoe-kernel
952c25e227 D1.5: Use tCtO_fake layout for epilogue_tma_store (needs STAGE dim)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:41:23 +00:00)
0a980de7ad feat: SMEM-P using make_cotiled_copy (one-row-per-thread)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:40:49 +00:00)
85eb2bc4bb D1.5: Remove duplicate tTMrO definition (keep unconditional one)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:40:17 +00:00)
83077db55e merge
86ff386ea8 D1.5: Move tTMrO after O rescale atoms (fix tTMEM_LOADcO reference)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:36:16 +00:00)
cd223e1b98 fix: reorder tTMrO definition after tTMEM_LOADcO
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:34:54 +00:00)
54e94d44ef fix: tTMrO scoping + restore SMEM-P coordinate write
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:32:53 +00:00)
6ead708c7d D1.5: Move tTMrO def before softmax loop (CuTeDSL scoping)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:31:14 +00:00)
5a34865062 debug: zero-fill sP to check deadlock
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:30:33 +00:00)
81652629e3 D1.5: Use proven Stage C approach - normalize via TMEM round-trip + epilogue_tma_store
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 02:15:09 +00:00)
974cddbf7b test: add try/except for SMEM-P coord test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 01:59:26 +00:00)
5fd556db63 test: use FmhaKernel for SMEM-P coord test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 01:58:39 +00:00)
e50ba7212c test: SMEM-P coordinate verification test