[V1] v1 engine + full CUDA graph support for PLaMo2 (#23998)

Signed-off-by: Hemmi Shinichi <shemmi@preferred.jp>
Signed-off-by: nopperl <54780682+nopperl@users.noreply.github.com>
Co-authored-by: Hemmi Shinichi <shemmi@preferred.jp>
Co-authored-by: Thomas Parnell <tom.parnell@gmail.com>
nopperl
2025-09-04 00:24:02 +09:00
committed by GitHub
parent 6d80ae83e1
commit fa4311d85f
6 changed files with 349 additions and 125 deletions

@@ -340,6 +340,7 @@ class CompilationConfig:
         "vllm.mamba_mixer",
         "vllm.short_conv",
         "vllm.linear_attention",
+        "vllm.plamo2_mamba_mixer",
     ]
 
     def compute_hash(self) -> str: