biondizzle
0 Followers
·
0 Following
Joined on
2025-12-10
Public Activity
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 18:23:47 +00:00
f3503fc1ee
FIX: TMEM offset bug in O rescale/normalize — use tOtO0.iterator not tOtO.iterator
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 18:07:09 +00:00
0b7ae7c969
Diag: test n=384 (3 tiles) to find crash boundary
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 18:06:30 +00:00
640ec3e96e
Diag: test all sizes 128-1024
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 18:05:47 +00:00
02d993ecac
DEBUG: disable O rescale to isolate NaN cause
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 18:01:14 +00:00
1c3970fe58
Add NaN/inf checking to stage C test
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 18:00:02 +00:00
d7a0fc2bc2
CRITICAL FIX: K GMEM slice (None,None,0,0) not (None,0,None,0)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:59:03 +00:00
b6a2904e93
Diag: try K slice (None,None,0,0) keeping mode 1 (CUTLASS ref style)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:58:00 +00:00
01621e1520
Diag: try runtime Int32(0+0) for kv_coord with cutlass.range
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:57:01 +00:00
beecc4df47
Diag: use Python range() unrolling like stage C test
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:56:16 +00:00
200430bd3f
Fix diagnostic test: same Int32(kt) + n_kv_tiles fixes
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:51:27 +00:00
c23ebd5b57
Try cutlass.range with Int32(kt) — now n_kv_tiles is Python int
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:50:09 +00:00
4a41df51c4
FIX: n_kv_tiles as Python int (s_k//128) for range() unrolling
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:47:46 +00:00
70409636f7
Option 2: Python range() with Int32(kt) for TMA GMEM coord
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:47:11 +00:00
b55a38c4c3
Add example5: use cutlass.range induction variable as TMA GMEM coord
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:41:24 +00:00
aacad257ea
README: add fire_b200_test docs, update multi-tile blocker with real findings
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:39:29 +00:00
93c28b9c29
Clean up debug prints, set kv_coord as Int32(0)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:34:31 +00:00
1bba851911
DEBUG: try plain Python int kv_coord (like CUTLASS ref)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:32:54 +00:00
15b2a28d29
DEBUG: hardcode kv_coord=1 to test if TMA uses it
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:30:07 +00:00
ff9ef6dcde
DEBUG: try K slice (None,0,None,0) keeping mode 2 free
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-22 17:28:48 +00:00
cec6f59d66
DEBUG: print tBgK/tVgV shapes before/after slice