biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:37:01 +00:00
e0aa7ccd19
auto: pre-test commit
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:37:01 +00:00
4f8559ae2e
SMEM-P: implement full 128-value write in softmax loop using coordinate mapping
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:33:37 +00:00
63f68eda52
SMEM-P: fix BF16 value creation (use constant)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:32:14 +00:00
aa82a0faf5
SMEM-P: implement CUTLASS LLM coordinate mapping pattern (minimal test)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:30:10 +00:00
c9b44e6bf9
SMEM-P: fix thread_idx tuple access
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:29:31 +00:00
97e97b63ea
auto: pre-test commit
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:29:28 +00:00
dee046287e
SMEM-P: add debug to understand thread partitioning
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:22:31 +00:00
5b6a4fbef9
Update STAGE_D.md: manual SMEM addressing blocked on layout mapping
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:21:31 +00:00
060cea5d0f
SMEM-P: implement simple test pattern instead of coord lookup
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:20:44 +00:00
56bed1066d
auto: pre-test commit
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:20:41 +00:00
6c08a95620
Start implementing manual SMEM-P addressing (helpers are a trap)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:14:46 +00:00
7bf69a0265
Implement manual SMEM-P copy instead of cute.copy (helpers are a trap)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:14:08 +00:00
944fa9b155
auto: pre-test commit
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:14:08 +00:00
e765685951
Try flattening sP and rP_bf16_qk with group_modes to fix rank mismatch
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:13:09 +00:00
5ee0c20736
Add debug prints for SMEM-P partition layouts
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:12:14 +00:00
55dcee2d29
Fix SMEM-P: use BF16 copy atom and BF16 source with QK C-fragment layout
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:11:10 +00:00
77e01acd13
Fix SMEM-P copy: use tcgen05.copy.St32x32bOp with Float32 and copy from rP_words (Float32) not rP_bf16
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:09:57 +00:00
01fd6d03db
Update STAGE_D.md with current action plan - starting NVFP4-0 verification and D1.3 validation on B200
biondizzle
pushed tag
d1.3-pre-sm100-helpers
to
biondizzle/nvfp4-megamoe-kernel
2026-05-23 19:00:18 +00:00
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-23 18:37:54 +00:00
5756b6e4ec
📋 Update STAGE_D.md: D1.3 ✅ SOLVED, D1.4 ✅ IMPLEMENTED, D1.5 🟡 complex refactor, checklist updated