biondizzle/nvfp4-megamoe-kernel
8546ed725f9ef321fe3323df71edecb17b750cf0
nvfp4-megamoe-kernel/dsv4/ops
Latest commit: e77455c3ba by biondizzle, 2026-06-04 01:05:47 +00:00
DEBUG: add sync inside quantize_nvfp4_gpu_fused to catch async errors
File            Last commit date            Last commit message
__init__.py     2026-05-21 17:30:44 +00:00  Restructure: cutedsl/ -> dsv4/ with proper layering
custom_ops.py   2026-05-27 15:15:03 +00:00  Stage E: head-packed MQA/GQA, batch dim, custom_op, integration API
gemm_runner.py  2026-06-03 21:45:15 +00:00  Fix: Add out= parameter to run_fused_swiglu_grouped_gemm signature
layouts.py      2026-05-21 17:30:44 +00:00  Restructure: cutedsl/ -> dsv4/ with proper layering
quantize.py     2026-06-04 01:05:47 +00:00  DEBUG: add sync inside quantize_nvfp4_gpu_fused to catch async errors
rope_cuda.py    2026-06-02 09:06:36 +00:00  fix: rope_cuda path — kernels/cuda not ops/cuda
router.py       2026-06-01 00:00:07 +00:00  router: catch CuTeDSL warmup failures fast, don't let MLIR errors slow down init