[V1] [Hybrid] Enable compile and piecewise CUDA graph for MiniMax-Text models (#22589)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
@@ -339,6 +339,7 @@ class CompilationConfig:
         "vllm.mamba_mixer2",
         "vllm.mamba_mixer",
         "vllm.short_conv",
+        "vllm.linear_attention",
     ]
 
     def compute_hash(self) -> str:
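
A minimal sketch of why adding an op name to this list matters, assuming the list names the ops at which the compiled graph is split for piecewise CUDA graph capture. The helper `split_at_ops` and the example op sequence are hypothetical and are not vLLM's implementation; only the four `vllm.*` names come from the diff above.

```python
# Hypothetical illustration only; vLLM's real splitting works on an FX graph.
SPLITTING_OPS = [
    "vllm.mamba_mixer2",
    "vllm.mamba_mixer",
    "vllm.short_conv",
    "vllm.linear_attention",  # the entry added by this commit
]


def split_at_ops(op_sequence, splitting_ops):
    """Partition a linear op sequence into pieces.

    Each splitting op becomes its own single-op piece; the runs of ops
    between splitting ops become the pieces that could be captured.
    """
    pieces, current = [], []
    for op in op_sequence:
        if op in splitting_ops:
            if current:
                pieces.append(current)
                current = []
            pieces.append([op])
        else:
            current.append(op)
    if current:
        pieces.append(current)
    return pieces


# Assumed op order for one layer; the non-vllm names are placeholders.
layer_ops = ["rms_norm", "linear", "vllm.linear_attention", "linear", "rms_norm"]
pieces = split_at_ops(layer_ops, set(SPLITTING_OPS))
for piece in pieces:
    print(piece)
```

In this sketch, an op missing from the list would stay inside a neighbouring piece instead of being isolated, which is the situation the added `"vllm.linear_attention"` entry avoids for MiniMax-Text models.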