biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:52:15 +00:00)
7a21fa4bd8 test: add 2nd tmem_store to column 1
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:51:24 +00:00)
4b129c146e test: add 1 tmem_load back
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:50:50 +00:00)
61f19ce891 test: skip tmem_load, only store+dealloc
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:50:11 +00:00)
2513e1a692 test: use 64 threads, fence outside warp guard, 1 store
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:49:22 +00:00)
abfe9dbaa1 test: only 1 tmem_store to verify single column works
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:48:29 +00:00)
5795589abc test: TMEM 4 columns, individual store calls + loop load
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:46:50 +00:00)
8a428f6127 test: TMEM column addressing test (128 cols, store+load)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:45:37 +00:00)
ee3fe6d6b2 test: tmem_load column 1 only
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:44:31 +00:00)
6c38c6e442 test: read 8 TMEM columns individually (no loop)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:43:18 +00:00)
bcc6ed114d test: add 8KB padding after sQ to prevent MMA read overrun
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:41:53 +00:00)
764ed01d6f test: try M=64 in descriptor + idesc to debug 4x factor
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:40:20 +00:00)
4cb656e583 test: try idesc=0 (same as gau-nernst)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:39:26 +00:00)
cfba8484da test: try idesc with N=128 (full extent) + 128 TMEM cols
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:38:28 +00:00)
30f0056b11 test: clean rewrite with SMEM Q/K verification and dot product check
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:37:08 +00:00)
7eb85a71fc test: add Q SMEM verification output + bf16_to_f32_host
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:35:59 +00:00)
8f23c2aaf6 test: verify SMEM Q layout by reading back canonical data
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:35:04 +00:00)
004046a6a8 test: read only 1 TMEM column after MMA
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:33:47 +00:00)
41128122e3 test: clean rewrite, 32 TMEM cols, MMA N=32, tmem_load loop
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:32:36 +00:00)
58be79957d test: 32 TMEM cols, add MMA call with N=32, read S from TMEM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 09:30:38 +00:00)
22fb861447 test: 2 tmem_stores with syncwarp between