[FEAT][ROCm]: Support AITER MLA (#15893)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: qli88 <qiang.li2@amd.com>
@@ -1248,7 +1248,7 @@ class ModelConfig:
 or getattr(self.hf_config, "is_matryoshka", False))


-BlockSize = Literal[8, 16, 32, 64, 128]
+BlockSize = Literal[1, 8, 16, 32, 64, 128]
 CacheDType = Literal["auto", "fp8", "fp8_e4m3", "fp8_e5m2"]
 PrefixCachingHashAlgo = Literal["builtin", "sha256"]