biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories
25
Public Activity
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:45:23 +00:00
d36dbba01c
Fix B2 indexer: increase TMEM_COLS to 512 for full 128-row MMA output
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:44:11 +00:00
797345dfe9
Add B2 score debug test
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:39:53 +00:00
afb82b9c89
Fix B2 indexer: replace broken 16x256b TMEM read with proven 32x32b.x8
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:36:12 +00:00
99e50fcb58
Add B2 minimal debug test to find hang point
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:25:56 +00:00
e21bd14408
Fix B1 test LSE reference shape handling
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:24:21 +00:00
4fe7f9dc37
Fix B1 FMHA: swap V matrix canonical layout args (dd, kk) not (kk, dd)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:23:47 +00:00
29a95a3db6
Add B1 QK vs PV isolation test
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:22:03 +00:00
c322e3f301
Add B1 FMHA debug test for cosine failure investigation
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:21:31 +00:00
5447d1d1dc
Add comprehensive B2 FP8 indexer unit test
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:20:12 +00:00
38eecb28d8
Add comprehensive B1 mixed FP8 FMHA unit test
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:09:39 +00:00
f2063c0588
B1: minimal debug test for mixed FP8 FMHA (1 head, N=128)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:07:39 +00:00
0cea0b33ff
B1 test: fix BF16 reference to use PyTorch SDPA
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-03 00:06:28 +00:00
a51d19a7fc
B1: add mixed FP8 FMHA cosine verification test (HD=512, N=128-2048)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-02 23:19:10 +00:00
b9243fe40a
B2: FP8 tensor-core indexer scoring + weighted ReLU + top-k
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-02 22:53:18 +00:00
a9d5e09f4c
B1: mixed FP8/BF16 decode FMHA integration
biondizzle
pushed tag
pre-b1
to
biondizzle/nvfp4-megamoe-kernel
2026-06-02 22:48:59 +00:00
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-02 22:31:17 +00:00
2eb4f0886e
things
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-02 21:52:45 +00:00
9d4a014fad
Fix NameError: dequantize_nvfp4 not in scope in forward_attention
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-02 21:39:04 +00:00
9ba6476d3f
auto: pre-test commit
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-06-02 21:35:03 +00:00
845227c06c
Fix stale lock file in CUDA loader — prevents infinite spin on crash recovery