[Kernel] Expand MoE weight loading + Add Fused Marlin MoE Kernel (#7527)

Co-authored-by: ElizaWszola <eliza@neuralmagic.com>
Dipika Sikka
2024-08-21 19:17:10 -04:00
committed by GitHub
parent 5844017285
commit 8678a69ab5
15 changed files with 2375 additions and 85 deletions

@@ -920,7 +920,7 @@ class JambaForCausalLM(nn.Module, HasInnerState):
                     weight_loader = param.weight_loader
                     weight_loader(param,
                                   loaded_weight,
-                                  weight_name,
+                                  name,
                                   shard_id=shard_id,
                                   expert_id=expert_id)
                     break