hotfix attn alibi wo head mapping (#496)
Co-authored-by: oliveryuan <oliveryuan@basemind.com>
@@ -408,6 +408,7 @@ class PagedAttentionWithALiBi(PagedAttention):
             query,
             key_cache,
             value_cache,
+            self.head_mapping,
             self.scale,
             input_metadata.block_tables,
             input_metadata.context_lens,