biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:09:51 +00:00
db17d8db9a fix: cvta.to.shared PTX for SMEM address

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:09:30 +00:00
e12a81ae36 fix: include cstdint

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:09:09 +00:00
0c73a024ba fix: guard CUTLASS includes with __CUDA_ARCH__ for host compilation

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:08:28 +00:00
41e59a2423 FMHA SM100: Add SMEM descriptor construction for tcgen05.mma

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:06:49 +00:00
3eb432d064 fix: CUTLASS path /root/cutlass

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:06:14 +00:00
66d9f5c60f fix: --x cu for .cuh compilation

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:05:56 +00:00
4dcd80ea0d fix: use full nvcc path

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:05:33 +00:00
fac7275f2b test: nvcc compilation test for FMHA SM100 kernel

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 05:04:46 +00:00
230c350c77 FMHA SM100: Raw CUDA C++ decode kernel — initial skeleton

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:59:03 +00:00
b2d0417a46 NVFP4-1.1: Mark fp4_quant.py as toolchain-blocked, clean up test files

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:57:48 +00:00
650bcdcccf test: f32 vs i32 GMEM store

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:56:18 +00:00
cc37ce6dbf test: absolute minimum CuTeDSL int store + float cmp

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:55:25 +00:00
c4fdfc7789 test: isolate which fp4_quant function causes LLVM ERROR

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:54:28 +00:00
b3eb46d4ec NVFP4-1.1: Restore threshold RNE approach — inline PTX blocked by toolchain

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:50:23 +00:00
71ee1485ea test: constraints runner

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:50:10 +00:00
c55c237fcd test: different constraint strings + bitcast approach

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:49:04 +00:00
4806e9ba11 test: llvm.inline_asm with Int32._mlir_type matching cvt_i8_bf16 pattern

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:47:05 +00:00
ade49d964d fix: test_ptx_runner path

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:46:55 +00:00
dc9596c6bc test: sub-process isolation for each f32→i32 approach

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel
2026-05-28 04:46:07 +00:00
136a89f4e3 test: compare nvvm.inline_ptx approaches + arith.fptosi