biondizzle
0 Followers · 0 Following · Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 21:08:15 +00:00)
    4b9eed02e1 Cleanup C1-C7: delete dead CuTeDSL FMHA, test probes, scratch files
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:28:03 +00:00)
    a360fa308a P6-P8: Update NEXT_PRIORITIES.md with completion status
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:25:03 +00:00)
    2c18609296 P8: Fix P6 test imports after deleting multihead module
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:23:16 +00:00)
    e1b9e94c24 P8: Fix test imports after deleting multihead module
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:21:17 +00:00)
    95725f1df0 P8: Delete 6 redundant .cuh variants + multihead CAPI/op
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:19:10 +00:00)
    9d483b1c54 P8: Unified dispatch — multi-tile kernel handles all N
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:17:57 +00:00)
    e747742598 P7: Document TMEM column layout, add multi-row softmax test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:14:51 +00:00)
    f1ce47e3c9 P7: Add TMEM column layout probe test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:13:22 +00:00)
    5e5217bfc3 P6: Relax test gate to 0.999990 (SMEM staging adds tiny BF16 noise)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:12:25 +00:00)
    11d15d9e72 P6: Clean up test — remove broken TMA store test, update epilogue test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:11:18 +00:00)
    c0379a0f86 P6: Remove broken TMA store — use direct GMEM write from SMEM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:07:27 +00:00)
    f97359fbfc P6: TMA store uses mbarrier completion (same as load)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:05:29 +00:00)
    2de300e281 P6: Try shared::cluster instead of shared::cta for TMA store
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:04:40 +00:00)
    829a5f93ce P6: Fix TMA store PTX — remove .tile modifier, fix wait_group syntax
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:01:15 +00:00)
    e4ee9fdc9f P6: Fix host-side BF16→FP32 conversion in test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:00:54 +00:00)
    a88b321433 P6: Fix host-side BF16 conversion in test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 17:00:07 +00:00)
    1a87e054db P6: Fix constexpr and bf16 conversion in CUDA test
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 16:59:48 +00:00)
    2833eb56e7 P6: Add minimal CUDA test for TMA store epilogue
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 16:58:26 +00:00)
    6a7726e764 P6: Add integration test for TMA store epilogue
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-30 16:57:37 +00:00)
    fd7c0cb773 P6: Fix TMA store — use bulk_group (commit+wait) not mbarrier