biondizzle/DeepGEMM
DeepGEMM/csrc/jit_kernels @ 75f1c8544b6649b41f4b0908225cb954cfae9b13
Latest commit 75f1c8544b by biondizzle:
fix: remove smem_inner_dim doubling for packed FP4 TMA — must match MMA row width (BLOCK_K/2)
2026-05-12 17:14:44 +00:00
heuristics
NVFP4: fix SF pipeline — 2 K-cols per BLOCK_K for group=16
2026-05-12 08:08:17 +00:00
impls
fix: remove smem_inner_dim doubling for packed FP4 TMA — must match MMA row width (BLOCK_K/2)
2026-05-12 17:14:44 +00:00