biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:38:26 +00:00
059c2e6cd9 D1: P store as BF16 using PV A-fragment layout
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:36:42 +00:00
2efd6be8af D1: P store uses tOrP0.layout (PV A-fragment TMEM layout)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:35:51 +00:00
7751eab711 D1 fix: P store uses PV A-fragment layout (p_tmem_s.outer)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:34:00 +00:00
fe1826b0de D1: test raw unnormalized output via epilogue_tma_store
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:32:55 +00:00
091cb59be5 test: paired atoms epilog from old commit 6ee28d8
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:29:52 +00:00
f23d55fd3f D1: paired atoms epilogue (no TMEM round-trip)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:26:17 +00:00
7df3c7c952 d1: sweep hd=64,128,256
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:25:31 +00:00
81378133cc fix: use mV.iterator
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:25:00 +00:00
a66a9efd4c fix: use mQ not q for LayoutEnum
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:24:17 +00:00
d2aaab5a32 d1: add diagnostic script
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:22:26 +00:00
a2d063a48b D1: N-tile support for HEAD_DIM>256
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:20:49 +00:00
7bc097163d d1: add hd=512 test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:20:13 +00:00
32995c2ba3 d1: add quick regression test (hd=64 only)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:19:56 +00:00
eed981bee5 D1: Parameterize HEAD_DIM in FmhaKernel (64→512)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:10:43 +00:00
1a6c5e3822 docs: revised Stage D/E plan — indexer removes paged TMA, one kernel for CSA/HCA/SWA, sink merge
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:05:10 +00:00
a846193c4a cleanup: remove archive/ (240 stale files), stale example9/10, fix test table, add Stage D plan
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 03:01:04 +00:00
f3d0d67ae9 docs: update README with Stage C TMEM layout mismatch findings and status
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 02:54:55 +00:00
9c331de7ba fix: revert to composition layout for hand-constructed atoms (matching CUTLASS)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 02:54:02 +00:00
3a2d3c66da fix: use logical_divide (not composition) for O rescale/normalize atoms to match get_tmem_load_op layout
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-23 02:50:55 +00:00
3aba5cc6da fix: add NO-OP TMEM round-trip to re-map O from MMA to epilog layout