[V1] v1 engine + full CUDA graph support for PLaMo2 (#23998)

Signed-off-by: Hemmi Shinichi <shemmi@preferred.jp>
Signed-off-by: nopperl <54780682+nopperl@users.noreply.github.com>
Co-authored-by: Hemmi Shinichi <shemmi@preferred.jp>
Co-authored-by: Thomas Parnell <tom.parnell@gmail.com>
@@ -340,6 +340,7 @@ class CompilationConfig:
         "vllm.mamba_mixer",
         "vllm.short_conv",
         "vllm.linear_attention",
+        "vllm.plamo2_mamba_mixer",
     ]
 
     def compute_hash(self) -> str: