biondizzle/nvfp4-megamoe-kernel
5487a58df4cc0d05ebf44603c61fc0b4121fbf07
nvfp4-megamoe-kernel/dsv4/layers
biondizzle | 5487a58df4 | Fix NameError: add rows/cols variables to MoE swizzle | 2026-06-04 03:14:27 +00:00
__init__.py | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00
grouped_linear.py | Restore A/B split + gsa scalar fix (error is pre-existing, not regression) | 2026-06-04 01:03:36 +00:00
linear.py | Blackwell swizzle CUDA kernel for CUDA graph capture | 2026-06-04 03:03:02 +00:00
mhc.py | CUDA graph: Fix per-step allocations in decode loop | 2026-06-03 16:38:35 +00:00
moe.py | Fix NameError: add rows/cols variables to MoE swizzle | 2026-06-04 03:14:27 +00:00
router.py | CRITICAL FIX: runtime activation global scale to prevent E4M3 overflow | 2026-06-01 14:21:16 +00:00
shared_expert.py | Blackwell swizzle CUDA kernel for CUDA graph capture | 2026-06-04 03:03:02 +00:00