biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 13:02:04 +00:00)
834d682443 test: full FMHA HD=16 pipeline (QK→softmax→PV→epilogue)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 13:00:40 +00:00)
3b8be4b2db test: FMHA softmax (QK→read S→softmax→write P→read P→verify)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:57:41 +00:00)
c936940428 test: separate (128,16) SMEM per K-tile with correct source stride

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:56:41 +00:00)
f244c4fdd2 test: single-thread MMA (tid==0) for Layout D

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:55:54 +00:00)
ba2e390e1e test: debug single K-tile from full (128,64) SMEM

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:55:03 +00:00)
a7e8b483cd test: HD=64 multi-K-tile with correct source stride in SMEM writes

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:54:04 +00:00)
926ae5d7bf test: fix K source stride mismatch in manual SMEM write

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:53:15 +00:00)
7d16a30cb6 test: exact HD=16 pattern with HD=64 data

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:52:22 +00:00)
db4f661843 test: debug with (128,16) SMEM matching HD=16 exactly

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:51:34 +00:00)
b703dc0a50 test: debug single K-tile with offset descriptor

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:50:45 +00:00)
435ca037cf test: use accumulate=false for first K-tile, skip TMEM zero

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:50:08 +00:00)
e8ac2120ad test: HD=64 QK with contiguous SMEM + offset descriptors

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:48:47 +00:00)
1c01e8e412 test: fix inline asm line continuation for nvcc

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:47:58 +00:00)
71c774027c test: fix HD=64 QK — zero TMEM, fence after MMA, single-thread MMA call

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:23:50 +00:00)
1bf76388c8 test: always accumulate, separate SMEM per K-tile, TMEM starts at 0

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:20:03 +00:00)
8707f555c2 test: add extra syncwarp + syncthreads for MMA safety

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:18:07 +00:00)
5a65d46c26 test: HD=64 with separate SMEM per K-tile — no offset descriptors needed

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:16:10 +00:00)
526fafb808 test: revert volatile, fix wid==0, full 4 K-tiles

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:13:25 +00:00)
de879342dd test: 1 K-tile, volatile writes, verify SMEM

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-28 12:11:48 +00:00)
bd6440fd83 test: volatile SMEM writes + 2 K-tiles