biondizzle/nvfp4-megamoe-kernel
edc8e7ee8d91da9a9530e49d44ccdad585a62416
nvfp4-megamoe-kernel/dsv4/kernels
Latest commit: f566b9b748 by biondizzle, "Fix FP8 quantize return type (2-tuple not 3)", 2026-06-02 10:02:01 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| attention | perf: skip MQA GQA expansion in FMHA (stride=0, no 128x K/V copy) | 2026-06-02 03:54:03 +00:00 |
| cache | fix: correct gather.py kernel_dir path | 2026-05-30 21:12:09 +00:00 |
| compressor | KV-1/KV-2/KV-3: NVFP4 compressed KV + FP8 indexer keys | 2026-06-02 10:00:50 +00:00 |
| cuda | Fix FP8 quantize return type (2-tuple not 3) | 2026-06-02 10:02:01 +00:00 |
| gemm | fix: use cute.where() directly for clamp in fused SwiGLU | 2026-06-02 08:16:41 +00:00 |
| indexer | P0 COMPLETE: Eliminate ALL .item() CPU-GPU syncs from NVFP4 activation path | 2026-06-01 21:05:03 +00:00 |
| router | Switch router to Nvfp4Linear production GEMM (custom CuTeDSL kernel crashes MLIR) | 2026-06-01 11:17:54 +00:00 |
| __init__.py | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00 |