biondizzle/nvfp4-megamoe-kernel
Branch: master
nvfp4-megamoe-kernel/dsv4/layers
Latest commit f259d63930 by biondizzle:
CRITICAL FIX: SE swizzled buffers were allocated then overwritten with None — graph capture would fall through to broken Python path
2026-06-06 07:01:52 +00:00
__init__.py
Restructure: cutedsl/ -> dsv4/ with proper layering
2026-05-21 17:30:44 +00:00
grouped_linear.py
Restore A/B split + gsa scalar fix (error is pre-existing, not regression)
2026-06-04 01:03:36 +00:00
linear.py
Add safety check for swizzled buffers: fall through to Python path if None
2026-06-04 04:32:00 +00:00
mhc.py
CUDA graph: Fix per-step allocations in decode loop
2026-06-03 16:38:35 +00:00
moe.py
Fix NameError: add rows/cols variables to MoE swizzle
2026-06-04 03:14:27 +00:00
router.py
Fix dense router BF16 dispatch: use torch.matmul instead of F.linear
2026-06-04 05:58:24 +00:00
shared_expert.py
CRITICAL FIX: SE swizzled buffers were allocated then overwritten with None — graph capture would fall through to broken Python path
2026-06-06 07:01:52 +00:00