[Bugfix] Disable RoutingMethodType.[Renormalize,RenormalizeNaive] TRTLLM per-tensor FP8 MoE (#33620)

Signed-off-by: mgoin <mgoin64@gmail.com>
Michael Goin
2026-02-03 05:37:15 -05:00
committed by GitHub
parent 83449a5ff0
commit e346e2d056

@@ -72,8 +72,10 @@ def _supports_routing_method(
         # NOTE(dbari): as above, potentially allow others here.
         return routing_method in [
             RoutingMethodType.Llama4,
-            RoutingMethodType.Renormalize,
-            RoutingMethodType.RenormalizeNaive,
+            # NOTE(mgoin): Disabled to investigate accuracy issues.
+            # See https://github.com/vllm-project/vllm/issues/33532
+            # RoutingMethodType.Renormalize,
+            # RoutingMethodType.RenormalizeNaive,
         ]
     else:
         raise ValueError("Unsupported quantization scheme.")
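
The effect of this hunk can be sketched in isolation. The snippet below is a minimal illustration, not the vLLM implementation: the `RoutingMethodType` enum here is a hypothetical stand-in with made-up member values, and `supports_per_tensor_fp8_routing` is an assumed helper name that mirrors only the per-tensor FP8 branch shown above.

```python
from enum import Enum, auto


class RoutingMethodType(Enum):
    # Hypothetical stand-in for the real enum; values are illustrative only.
    Llama4 = auto()
    Renormalize = auto()
    RenormalizeNaive = auto()


def supports_per_tensor_fp8_routing(routing_method: RoutingMethodType) -> bool:
    """Mirror the per-tensor FP8 branch after this commit (assumed helper name)."""
    return routing_method in [
        RoutingMethodType.Llama4,
        # Disabled pending the accuracy investigation in issue 33532:
        # RoutingMethodType.Renormalize,
        # RoutingMethodType.RenormalizeNaive,
    ]


if __name__ == "__main__":
    for method in RoutingMethodType:
        print(method.name, supports_per_tensor_fp8_routing(method))
```

Under these assumptions, only `Llama4` still passes the check; the two renormalize variants are rejected on this path, so a caller would presumably have to select a different backend for them.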