[Model] add optimal triton fused moe configs for NemotronH MoE (#27967)

Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
This commit is contained in:
tomeras91
2025-11-04 14:59:43 +02:00
committed by GitHub
parent 77f8001f53
commit e4ee658672
5 changed files with 589 additions and 0 deletions


@@ -590,6 +590,7 @@ def main(args: argparse.Namespace):
"DeepseekV3ForCausalLM",
"DeepseekV32ForCausalLM",
"Glm4MoeForCausalLM",
"NemotronHForCausalLM",
):
E = config.n_routed_experts
topk = config.num_experts_per_tok
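
The hunk above extends the tuple of architectures whose HuggingFace config exposes the routed-expert count as `n_routed_experts` and the per-token expert count as `num_experts_per_tok`, so the MoE tuning script can derive the kernel shape for NemotronH the same way it does for DeepseekV3 and GLM-4-MoE. A minimal sketch of that lookup, using a stand-in config object (the attribute names mirror the diff; the numeric values are illustrative, not NemotronH's real hyperparameters):

```python
from types import SimpleNamespace

# Architectures whose config uses these attribute names (mirrors the
# tuple in the diff above); other model families may name them differently.
ROUTED_EXPERT_ARCHS = (
    "DeepseekV3ForCausalLM",
    "DeepseekV32ForCausalLM",
    "Glm4MoeForCausalLM",
    "NemotronHForCausalLM",
)


def get_moe_shape(config):
    """Return (num_experts, top_k) for a supported MoE config.

    Sketch only: assumes `architectures`, `n_routed_experts`, and
    `num_experts_per_tok` attributes as seen in the diff.
    """
    arch = config.architectures[0]
    if arch in ROUTED_EXPERT_ARCHS:
        return config.n_routed_experts, config.num_experts_per_tok
    raise ValueError(f"unsupported architecture: {arch}")


# Stand-in config object with illustrative values.
cfg = SimpleNamespace(
    architectures=["NemotronHForCausalLM"],
    n_routed_experts=64,
    num_experts_per_tok=6,
)
print(get_moe_shape(cfg))  # -> (64, 6)
```

These two numbers (`E` and `topk`) are what the benchmark then sweeps Triton tile configurations over to produce the fused-MoE config JSON files added by this commit.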