biondizzle
0 Followers · 0 Following · Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 09:02:28 +00:00)
eb5c538c9b Complete multi-PV-tile fixes: pv_n_tile, v_fmha layout, MMA construction, n_corr_tiles

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 09:02:03 +00:00)
eedcfd7d21 Fix v_fmha layout to use pv_n_tile instead of head_dim for multi-PV-tile support

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 09:00:20 +00:00)
fcdfc4239c D1.4: Add pv_n_tile and n_pv_tiles for multi-PV-tile support (tcgen05 MMA max N=256)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:45:27 +00:00)
b13da6b7a0 diag: add 2-CTA check + fix LayoutEnum in MMA test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:44:39 +00:00)
c34291843b fix: remove bad import in NVFP4 diag

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:40:26 +00:00)
8a8e0c5ed6 fix: import ceil_div in quantize.py (was NameError at runtime)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:39:14 +00:00)
538dbb0643 fix: use quantize_activation_nvfp4 in diag

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:38:22 +00:00)
e2f599e4af fix: use correct API for NVFP4-0 diag (sf_vec_size + mma_tiler_mn)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:30:44 +00:00)
5572b74591 fix: use Sm100BlockScaledPersistentDenseGemmKernel in diag

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:29:17 +00:00)
6b1330ba47 fix: use randint+view for FP4/FP8 tensors in diag

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:28:16 +00:00)
3733927f28 fix: NVFP4-0 diag script — import SF_VEC_SIZE from quantize.py

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 08:26:59 +00:00)
6d8f7db2dd diag: NVFP4-0 primitive verification script

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 07:39:11 +00:00)
d9780c0a0c docs: add NVFP4 precision roadmap to STAGE_D.md (3 honest buckets + speculative bucket)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 06:55:26 +00:00)
74d0822214 shit carmine left dangling

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 06:43:03 +00:00)
3b167a4362 D1.2: TMEM budget verified on B200. Split-PV mandatory at hd=512 (MMA max N=256)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 06:41:44 +00:00)
99000cba8d D1.2: fix probe for hd=512 (MMA max N=256, use pv_n_tile)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 06:40:57 +00:00)
60824b62db D1.2: simplify TMEM budget probe, fix printf args

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 06:40:07 +00:00)
de439bcd75 fix: cuda.CUstream import

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 06:39:30 +00:00)
1c20b826d9 D1.2: TMEM budget probe using @cute.jit for MLIR context

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-23 06:38:09 +00:00)
6575e83f6d fix: remove unused v_fmha_layout from probe