nvfp4-megamoe-kernel/dsv4 at efe63caea97366fc6f6ffb184e64ada38eef8af7

biondizzle/nvfp4-megamoe-kernel

biondizzle 62efde5c9f fix: router — use cuBLAS BF16 GEMM + activation_topk CUDA kernel (production path, not CuTeDSL fused)

2026-06-01 01:01:15 +00:00

E1: Wire LayerCacheHandle gather methods + CUDA gather kernels

2026-05-30 21:09:21 +00:00

fix: router — use cuBLAS BF16 GEMM + activation_topk CUDA kernel (production path, not CuTeDSL fused)

2026-06-01 01:01:15 +00:00

fix: transpose checkpoint weights before make_b_k_major in Nvfp4Linear/SharedExpert

2026-06-01 00:30:37 +00:00

Restructure: cutedsl/ -> dsv4/ with proper layering

2026-05-21 17:30:44 +00:00

Fix remaining mHC API references: layer_compare.py, layer.py comment

2026-05-31 18:38:34 +00:00

fix: import SF_VEC_SIZE from quantize in gemm_runner (was NameError)

2026-06-01 00:04:48 +00:00

Restructure: cutedsl/ -> dsv4/ with proper layering

2026-05-21 17:30:44 +00:00

__init__.py

Restructure: cutedsl/ -> dsv4/ with proper layering

2026-05-21 17:30:44 +00:00