[Bugfix] Allow skipping MoE in NVFP4 (fix for MTP) (#25987)

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Benjamin Chislett
2025-10-06 16:16:30 -04:00
committed by GitHub
parent f23b4c04fd
commit 2161efe978
5 changed files with 18 additions and 5 deletions
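
Illustrative note: the FusedMoE hunk in this commit handles a quantization config whose get_quant_method returns None for a layer, by falling back to the unquantized MoE method. The sketch below mirrors that selection logic in isolation. It is a minimal sketch, not the vLLM implementation: every class here is a simplified stand-in, and SkippingQuantConfig, QuantizedFusedMoEMethod, select_quant_method, and the "mtp." prefix are hypothetical names introduced only for illustration.

```python
from typing import Optional


class FusedMoEMethodBase:
    """Stand-in for the base class named in the diff (not the real vLLM class)."""


class UnquantizedFusedMoEMethod(FusedMoEMethodBase):
    """Stand-in for the unquantized MoE method."""

    def __init__(self, moe_config: object) -> None:
        self.moe_config = moe_config


class QuantizedFusedMoEMethod(FusedMoEMethodBase):
    """Hypothetical quantized method, for illustration only."""


class SkippingQuantConfig:
    """Hypothetical config that returns None for layers it should skip."""

    def __init__(self, skipped_prefixes: tuple[str, ...]) -> None:
        self.skipped_prefixes = skipped_prefixes

    def get_quant_method(
        self, layer: object, prefix: str
    ) -> Optional[FusedMoEMethodBase]:
        # Returning None signals "do not quantize this layer".
        if prefix.startswith(self.skipped_prefixes):
            return None
        return QuantizedFusedMoEMethod()


def select_quant_method(
    quant_config: Optional[SkippingQuantConfig],
    layer: object,
    prefix: str,
    moe_config: object,
) -> FusedMoEMethodBase:
    """Mirror the selection in the hunk, including the None fallback."""
    quant_method = (
        UnquantizedFusedMoEMethod(moe_config)
        if quant_config is None
        else quant_config.get_quant_method(layer, prefix)
    )
    # Fallback: a config that skips this layer yields None, so use the
    # unquantized method instead of failing the assertion below.
    if quant_method is None:
        quant_method = UnquantizedFusedMoEMethod(moe_config)
    assert isinstance(quant_method, FusedMoEMethodBase)
    return quant_method
```

Without the fallback, a skipped layer would reach the assertion with quant_method set to None and fail. With it, a skipped layer simply runs unquantized.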

@@ -1194,6 +1194,8 @@ class FusedMoE(CustomOp):
            if quant_config is None
            else quant_config.get_quant_method(self, prefix)
        )
        if quant_method is None:
            quant_method = UnquantizedFusedMoEMethod(moe)
        assert quant_method is not None
        assert isinstance(quant_method, FusedMoEMethodBase)