biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories
25
Public Activity
biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:25:22 +00:00
    1eb9c43217  Rewrite CUTLASS kernel based on NVIDIA example 72b (nv_float4_t, CollectiveBuilder, OpClassBlockScaledTensorOp)

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:23:03 +00:00
    8a9af441dc  Fix includes: use cutlass/float_subbyte.h (has float_e2m1_t and float_ue4m3_t), point to latest CUTLASS

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:18:28 +00:00
    d789f5e3e0  Add CCCL include path for CUTLASS 3.x

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:17:31 +00:00
    12588047fd  Fix setup.py: use include_dirs and extra_compile_args (correct PyTorch extension API)

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:14:07 +00:00
    1b1c3a42fe  Fix setup.py source paths

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:12:48 +00:00
    f375c80bfe  feat: CUTLASS NVFP4 block-scaled GEMM kernel (native SM100 Blackwell)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:11:17 +00:00
    f375c80bfe  feat: CUTLASS NVFP4 block-scaled GEMM kernel (native SM100 Blackwell)
    56c7880296  Native NVFP4 TileLang kernel: tcgen05 block-scaled MMA
    bf13665dbe  Implement TileLang NVFP4 mega_moe L1/L2 kernels
    ebc0ab0cac  Fix: keep scales as float8_e4m3fn, don't pack to uint32 (min_all_cuda unsupported)
    94233c4dd3  Fix __init__.py: remove private imports
    Compare 6 commits »

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 23:02:08 +00:00
    56c7880296  Native NVFP4 TileLang kernel: tcgen05 block-scaled MMA

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 22:37:00 +00:00
    bf13665dbe  Implement TileLang NVFP4 mega_moe L1/L2 kernels

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 21:54:42 +00:00
    ebc0ab0cac  Fix: keep scales as float8_e4m3fn, don't pack to uint32 (min_all_cuda unsupported)

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 21:43:48 +00:00
    94233c4dd3  Fix __init__.py: remove private imports

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 21:41:45 +00:00
    1a452ffabd  Fix weight_transform signature to match nightly vLLM finalize_weights call

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 16:11:11 +00:00
    47ca5631d8  Fix __init__.py: only import from package modules
    c2b752c2fe  Initial: TileLang NVFP4 mega_moe kernel package
    Compare 2 commits »

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 16:08:39 +00:00
    47ca5631d8  Fix __init__.py: only import from package modules

biondizzle created branch master in biondizzle/nvfp4-megamoe-kernel, 2026-05-13 15:44:58 +00:00

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 15:44:58 +00:00
    c2b752c2fe  Initial: TileLang NVFP4 mega_moe kernel package

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 15:22:36 +00:00
    673f67681f  Add vLLM integration layer and packaging

biondizzle created branch main in biondizzle/nvfp4-megamoe-kernel, 2026-05-13 14:51:14 +00:00

biondizzle pushed to main at biondizzle/nvfp4-megamoe-kernel, 2026-05-13 14:51:14 +00:00
    a4b90b5780  Initial commit: NVFP4 mega_moe kernel in TileLang

biondizzle created repository biondizzle/nvfp4-megamoe-kernel, 2026-05-13 14:49:50 +00:00