[BugFix] Fix DeepGEMM over-allocating workspace (#28254)

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Lucas Wilkinson
2025-11-10 17:01:17 -05:00
committed by GitHub
parent bf6a3d0ff5
commit 6dec9f6109

@@ -215,7 +215,7 @@ class DeepGemmExperts(mk.FusedMoEPermuteExpertsUnpermute):
)
assert M_sum % block_m == 0
-workspace1 = (M_sum, max(N, K))
+workspace1 = (M_sum, N)
workspace2 = (M_sum, max(N // 2, K))
output = (M, K)
return (workspace1, workspace2, output)
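
To illustrate the size difference, here is a minimal sketch. It assumes the first `workspace1` line in the hunk (`max(N, K)`) is the removed one and the second (`N`) is its replacement, which matches the commit title. The helper name `workspace_shapes` and the sample sizes are hypothetical and are not part of the patched file.

```python
from math import prod


def workspace_shapes(M_sum: int, N: int, K: int, *, fixed: bool = True):
    """Return (workspace1, workspace2) shapes as written in the hunk.

    fixed=False reproduces the assumed pre-fix workspace1 shape.
    """
    workspace1 = (M_sum, N) if fixed else (M_sum, max(N, K))
    workspace2 = (M_sum, max(N // 2, K))
    return workspace1, workspace2


# Hypothetical sizes chosen so that K > N, the only case where the shapes differ.
M_sum, N, K = 1024, 4096, 7168

old_ws1, _ = workspace_shapes(M_sum, N, K, fixed=False)
new_ws1, _ = workspace_shapes(M_sum, N, K, fixed=True)

# Number of workspace1 elements no longer requested after the change.
saved = prod(old_ws1) - prod(new_ws1)
print(old_ws1, new_ws1, saved)
```

When `N >= K` the two shapes are identical, so the change only reduces the request when `K > N`.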