biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories
25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:57:51 +00:00
fe0588d906 fix: simplify UMMA dump script
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:55:44 +00:00
948a3f8a7a add UMMA descriptor dump script
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:53:37 +00:00
e5ba0ca119 debug: clean QK verify with scalar sanity + MMA result
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:49:38 +00:00
a04d794979 debug: skip TMEM alloc — test SMEM loads only
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:48:10 +00:00
72c97f2546 debug: minimal UMMA descriptor (just start_addr + version)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:47:00 +00:00
9a51bfa578 fix: align SMEM layout properly (128B aligned tmem + Q)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:44:56 +00:00
2a765be715 fix: correct SMEM size for row-major (not swizzled)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:43:41 +00:00
c64bd7b875 debug: read Q/K directly from SMEM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:41:32 +00:00
58b610c96c fix: proper early return for SMEM load test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:40:19 +00:00
82bc2c4a49 debug: verify SMEM loads + scalar QK sanity check
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:39:13 +00:00
53139d24bf debug: verify TMEM r/w works before MMA
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:38:10 +00:00
a9d71ff6ab debug: print TMEM values after MMA
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:37:03 +00:00
bfb1e177ce debug: try all-lane MMA + print tmem_base
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:35:32 +00:00
d3510980e4 feat: SWIZZLE_NONE UMMA descriptors with row-major SMEM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:23:39 +00:00
8c67c31497 add CuTe descriptor printing script
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:20:57 +00:00
d29d6b575f add UMMA descriptor diagnostic script
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:18:48 +00:00
ab84ad0f86 feat: implement canonical UMMA SMEM layout with SWIZZLE_128B
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:07:55 +00:00
ecbc75255c fix: correct UMMA descriptor format from CUTLASS source
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:03:55 +00:00
fe7d561143 debug: print UMMA descriptor values for diagnosis
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-28 08:02:56 +00:00
c5f7a9a15c fix: align SMEM buffers to 16 bytes for UMMA descriptors