nvfp4-megamoe-kernel/dsv4/layers at be476b2ce2fa08dc9ef02b56c33bfe6c2edc60ca

biondizzle/nvfp4-megamoe-kernel

biondizzle 2a886fe0f2 Add --no-thinking mode to skip thinking tokens and use second-best

2026-05-31 19:24:21 +00:00

__init__.py | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00
attention.py | E2/E3: compressor bridge, indexer bridge, flush pipeline wiring | 2026-05-30 21:16:54 +00:00
embedding.py | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00
ffn.py | Layer dispatch: config, schedule, attention/FFN sub-blocks, TransformerLayer | 2026-05-21 23:11:09 +00:00
grouped_linear.py | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00
linear.py | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00
mhc.py | Add --no-thinking mode to skip thinking tokens and use second-best | 2026-05-31 19:24:21 +00:00
moe.py | NVFP4-1.1 integration: GPU-only quantize kernel + MoE pipeline wiring | 2026-05-25 16:19:07 +00:00
norm.py | Fix layer construction: match existing API signatures, add RMSNorm impl | 2026-05-21 23:31:58 +00:00
router.py | Router: full kernel stack — hash, topk, activation+topk, dense decode/prefill | 2026-05-21 21:54:05 +00:00
shared_expert.py | Restructure: cutedsl/ -> dsv4/ with proper layering | 2026-05-21 17:30:44 +00:00