biondizzle
0 Followers
·
0 Following
Joined on
2025-12-10
Repositories
25
Public Activity
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 21:05:15 +00:00
a762820352
D1.3: Use MLIR-compatible expression for tOrP0 offset (same as Stage C)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 21:03:55 +00:00
4fa4239f95
D1.3: Initialize tOrP0 before conditional for CuTeDSL scoping
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 21:03:02 +00:00
2bb3eb95ed
D1.3: Fix tOrP0 for SMEM-P - skip make_tensor when offset is 0
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 21:02:05 +00:00
eabea91b64
D1.3: Fix tOrP0 offset - scale FP32 columns to BF16 elements
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 21:01:20 +00:00
47eade4afc
D1.3: Fix CuTeDSL scoping - define tOrP0 unconditionally with p0 offset
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 21:00:31 +00:00
0e81fc18aa
D1.3: Fix critical bug - add TMEM column offset for P0 in PV GEMM
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:58:07 +00:00
29f4480e26
D1.3: Revert to d1.3-pre-sm100-helpers baseline for testing
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:57:14 +00:00
adb4398505
D1.3: DIAGNOSTIC - test epilogue_tma_store raw PV without any round-trips
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:56:15 +00:00
0e41816636
D1.3: Remove NO-OP round-trip, keep normalize + epilogue_tma_store
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:54:29 +00:00
8bc8b21470
D1.3: Full correction_epilog with TMA store, normalize in reg before SMEM write
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:52:45 +00:00
d769e01a16
D1.3: Apply transform_partitioned_tensor_layout before epilogue helpers
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:50:31 +00:00
cc18fddc7e
D1.3: Replace NO-op TMEM round-trip with correction_epilog using epilogue_tmem_copy_and_partition + epilogue_smem_copy_and_partition
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:20:15 +00:00
993ec32567
SMEM-P: test permutation 4 (swap m↔n2)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:19:24 +00:00
c7a299d7d9
SMEM-P: add iterator offset debug print
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:14:34 +00:00
4943af749d
SMEM-P: add tCrP debug print, reset permute to 0
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:14:22 +00:00
5a11f7c09a
SMEM-P: test permutation 1 (swap m↔n0)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:13:49 +00:00
d5081fe6f0
auto: pre-test commit
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:13:46 +00:00
fd54d657b2
SMEM-P: add debug_permute flag for coordinate permutation testing
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:12:29 +00:00
06409401ca
SMEM-P: disable debug flags, revert to original mapping
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 20:11:51 +00:00
b8f0f0890a
SMEM-P: fix scoping error, disable debug_p_one, enable debug_swap_mn