biondizzle
Joined on 2025-12-10
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 17:26:56 +00:00)
8740ab5b27 Stage C: manual kv_coord + correct K GMEM slice + O rescale fence

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 17:24:39 +00:00)
f39c3a8e38 Add example4: manual kv_coord Int32 for GMEM tile indexing

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 17:09:54 +00:00)
711480a4fb README: add test harness instructions

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 17:08:15 +00:00)
83f8a13add run_test.sh: SIGKILL all children of screen session on cleanup

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 17:07:25 +00:00)
f4cfeae262 Add check_log.sh convenience script

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 17:06:02 +00:00)
61122e10c6 Fix quoting in run_test.sh

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 17:05:45 +00:00)
af475affab Add run_test.sh harness (screen + log)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:57:35 +00:00)
d0626e0434 FIX: only slice GMEM tensors (SMEM already 2D from tma_partition)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:56:41 +00:00)
15a2fdadbc FIX: consistent GMEM/SMEM slicing for K and V TMA partitions

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:51:59 +00:00)
5c1423b2c5 FIX: keep GMEM iteration dimension FREE in TMA K/V partition slices

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:47:10 +00:00)
018c09644b Add diagnostic test for multi-tile TMA pipeline (identity softmax)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:42:50 +00:00)
ec895e831e FIX: acc_scale was double-multiplying by scale_log2

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:39:45 +00:00)
2147cce95d Stage C: integrate example3 multi-tile fixes into unit test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:32:35 +00:00)
dad35818f3 README + MEMORY: update Stage C status to single-tile only, document multi-tile blocker

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 16:23:39 +00:00)
e4c82873bb FMHA Stage-C multi-tile: combined K+V barrier, final_o_bar, acc_pipe producer

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 15:54:05 +00:00)
39fa5b96b0 restore tBgK to kh.count indexing (single-tile working), add TODO for multi-tile

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 15:52:58 +00:00)
f1854bab26 FIX: use unsliced tBgK with (None, kt, None, 0) for proper GMEM tile indexing

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 15:52:09 +00:00)
3d2cb0e52b CRITICAL FIX: keep GMEM iteration dim free in tBgK/tVgV slice

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 15:49:49 +00:00)
a04b219f0f add explicit acc_pipe.consumer_wait before final normalize

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-22 15:41:21 +00:00)
ff27a261b1 FMHA Stage-C multi-tile: Fix 1 (s_k=n), Fix 2 (TMA kt indexing), Fix 3 (O rescale)