Add support for Mistral Large 3 inference with Flashinfer MoE (#33174)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Committed by: GitHub
Parent: 73419abfae
Commit: f0bca83ee4
```diff
@@ -135,6 +135,8 @@ class TestData:
             layer.w2_input_scale,
         )
+        layer.custom_routing_function = Llama4MoE.custom_routing_function
+        layer.routing_method_type = RoutingMethodType.Llama4
         layer.renormalize = False
         layer.intermediate_size_per_partition = n
         layer.ep_rank = 0
         layer.local_num_experts = e
```
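For context on the routing settings in the diff: the layer is pointed at Llama4-style routing (`custom_routing_function` from `Llama4MoE`, `RoutingMethodType.Llama4`) with `renormalize = False`. A minimal sketch of that routing style follows, assuming (as in vLLM's Llama4 implementation) that the router selects the top-k expert logits first and then applies sigmoid gating without softmax renormalization; the function name here is illustrative, not the actual vLLM API.

```python
import torch


def llama4_style_routing(gating_output: torch.Tensor, topk: int):
    """Illustrative sketch of Llama4-style MoE routing.

    Selects the top-k expert logits per token, then applies sigmoid
    gating to the selected logits. Note there is no softmax
    renormalization over the selected experts, which is consistent
    with `layer.renormalize = False` in the diff above.
    """
    # Pick the k largest router logits and their expert indices.
    scores, indices = torch.topk(gating_output, topk, dim=-1)
    # Sigmoid gating on the selected logits (weights do not sum to 1).
    weights = torch.sigmoid(scores.float())
    return weights, indices
```

With `renormalize = False`, the sigmoid outputs are used directly as expert weights rather than being rescaled to sum to one across the selected experts.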