biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories
25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 06:43:32 +00:00)
8cff68a28f D1: Use cutlass.range for k_sub loops (CuTeDSL immutable handle)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 06:42:27 +00:00)
14c9000997 D1: Fix kvh scoping - define before loops, consume V via pipeline

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 06:41:26 +00:00)
553ee7be57 D1: Fix kvb→kvh typo in PV GEMM

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 06:40:13 +00:00)
9c0dbab280 D1: Remove qh.commit() - pipeline handles commit internally

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 06:38:16 +00:00)
5c267fd2ad D1: TMA producer uses acquire_and_advance + commit (no wait_and_advance)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 06:36:20 +00:00)
3e00c8e1bd D1: Use same pipeline API as working code (acquire_and_advance) for k_sub path

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 05:02:19 +00:00)
fcc69a5c56 D1: Add PipelineState for k_sub TMA path

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:59:49 +00:00)
22fedc4ed9 D1: Fix pipeline API for K sub-tile path (producer_acquire/commit)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:57:10 +00:00)
e93dabe43c D1: K sub-tile MMA path using pipeline barriers

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:53:47 +00:00)
fd28718483 D1: Fix TMA copies in k_sub path (no mbarrier, use cp_async wait)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:51:54 +00:00)
170b483c2f D1: Add K sub-tile loop for hd=512 (const_expr guarded, hd≤256 path unchanged)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:43:13 +00:00)
bc5240c740 D1: Debug TMA partition shapes at hd=512

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:41:13 +00:00)
2e732ce3a7 D1: K sub-tiling - qk_mma_tiler K-dim = k_tile=256, SMEM fits at hd=512

biondizzle pushed tag v0.4-d1-hd256 to biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:33:00 +00:00)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:32:45 +00:00)
4564a264db Docs: Update STAGE_D.md, README.md status for D1 hd≤256 milestone

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:07:41 +00:00)
085d72ea8f D1: Full test with TMEM-P at hd=64,128,256,512

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:06:28 +00:00)
2b3435f97c D1: Remove debug prints, clean up

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:05:19 +00:00)
38c6486fc7 D1: const_expr for sP layout selection (CuTeDSL)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:04:30 +00:00)
a945edea79 D1: Python if for sP layout (trace-time, not MLIR)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-24 04:03:30 +00:00)
955b023164 D1: Tiny 4-mode sP placeholder for TMEM-P path