[Bugfix] Fix DeepseekV32 AssertionError: num_kv_heads == 1 (#33090)
Signed-off-by: NickLucche <nlucches@redhat.com>
(cherry picked from commit 492a7983dd)
This commit is contained in:
@@ -322,7 +322,7 @@ class TpKVTopology:
|
||||
# Figure out whether the first dimension of the cache is K/V
|
||||
# or num_blocks. This is used to register the memory regions correctly.
|
||||
kv_cache_shape = self.attn_backend.get_kv_cache_shape(
|
||||
num_blocks=1, block_size=16, num_kv_heads=4, head_size=1
|
||||
num_blocks=1, block_size=16, num_kv_heads=1, head_size=1
|
||||
)
|
||||
# Non-MLA backends caches have 5 dims [2, num_blocks, H,N,D],
|
||||
# we just mock num_blocks to 1 for the dimension check below.
|
||||
|
||||
Reference in New Issue
Block a user