[Frontend][torch.compile] CompilationConfig Overhaul (#20283): Set up -O infrastructure (#26847)

Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: adabeyta <aabeyta@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: adabeyta <aabeyta@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-27 04:55:58 -05:00
parent 00d3310d2d
commit 0838b52e2e
13 changed files with 735 additions and 64 deletions
--- a/vllm/config/model.py
+++ b/vllm/config/model.py
@@ -1752,6 +1752,14 @@ class ModelConfig:
        logger.info("Using max model len %s", max_model_len)
        return max_model_len

+    def is_model_moe(
+        self,
+    ) -> bool:
+        return self.get_num_experts() > 1
+
+    def is_quantized(self) -> bool:
+        return getattr(self.hf_config, "quantization_config", None) is not None
+

 def get_served_model_name(model: str, served_model_name: str | list[str] | None):
    """