[Bugfix][EPLB] Disabled shared expert overlap when EPLB is enabled (#28377)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Signed-off-by: Sage Moore <sagemoore@utexas.edu>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
@@ -28,13 +28,18 @@ class SharedFusedMoE(FusedMoE):
         super().__init__(**kwargs)
 
         self._shared_experts = shared_experts
 
-        # Disable shared expert overlap if we are not using
-        # flashinfer + DP since there is nothing to be gained in this case.
-        # Disabling the overlap optimization also prevents the shared experts
-        # from being hidden from torch.compile.
+        # Disable shared expert overlap if we are using eplb, because of
+        # correctness issues, or if using flashinfer with DP, since there
+        # is nothing to be gained in this case. Disabling the overlap
+        # optimization also prevents the shared experts from being hidden
+        # from torch.compile.
         self.use_overlapped = (
             use_overlapped
-            and not (self.use_flashinfer_cutlass_kernels and self.dp_size > 1)
+            and not (
+                # TODO(wentao): find the root cause and remove this condition
+                self.enable_eplb
+                or (self.use_flashinfer_cutlass_kernels and self.dp_size > 1)
+            )
             and self._shared_experts is not None
         )
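For reference, a minimal standalone sketch of the new gating logic (not the actual SharedFusedMoE class; the parameter names simply mirror the attributes used in the diff):

# Sketch of the new use_overlapped condition, assuming the attribute names
# from the diff (enable_eplb, use_flashinfer_cutlass_kernels, dp_size).
def resolve_use_overlapped(
    use_overlapped: bool,
    enable_eplb: bool,
    use_flashinfer_cutlass_kernels: bool,
    dp_size: int,
    has_shared_experts: bool,
) -> bool:
    return (
        use_overlapped
        and not (
            # EPLB currently forces overlap off (see TODO in the diff).
            enable_eplb
            or (use_flashinfer_cutlass_kernels and dp_size > 1)
        )
        and has_shared_experts
    )

# With EPLB enabled, overlap is disabled even if requested.
assert resolve_use_overlapped(True, True, False, 1, True) is False
# Without EPLB and without flashinfer + DP, the request is honored.
assert resolve_use_overlapped(True, False, False, 1, True) is True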