[Bugfix] Fix ModernBert load & Enable sliding window attention for bidirectional attention. (#22637)

Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
wang.yuqi
2025-08-12 15:23:17 +08:00
committed by GitHub
parent 2f4657952b
commit 6d729c43fb
4 changed files with 101 additions and 59 deletions


```diff
@@ -384,6 +384,8 @@ class FlashAttentionImpl(AttentionImpl):
         self.alibi_slopes = alibi_slopes
         if sliding_window is None:
             self.sliding_window = (-1, -1)
+        elif attn_type == AttentionType.ENCODER_ONLY:
+            self.sliding_window = (sliding_window - 1, sliding_window - 1)
         else:
             self.sliding_window = (sliding_window - 1, 0)
         self.kv_cache_dtype = kv_cache_dtype
```
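The change hinges on FlashAttention's `(left, right)` window convention: `-1` means unbounded on that side, `(w - 1, 0)` is a causal sliding window, and for encoder-only (bidirectional) models like ModernBert the window must extend `w - 1` tokens on *both* sides. The sketch below is an illustrative standalone helper (not vLLM or flash-attn code) that builds the attention-visibility mask implied by each tuple:

```python
def window_mask(seq_len: int, window: tuple[int, int]) -> list[list[bool]]:
    """Return mask[q][k] = True if query position q may attend key position k.

    window = (left, right), following the flash-attn convention:
    -1 on a side means that side is unbounded.
    """
    left, right = window
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            ok_left = left < 0 or q - k <= left    # k may be at most `left` behind q
            ok_right = right < 0 or k - q <= right  # k may be at most `right` ahead of q
            row.append(ok_left and ok_right)
        mask.append(row)
    return mask

# Causal sliding window of size 4: each query sees itself and 3 previous tokens.
causal = window_mask(6, (3, 0))
# Bidirectional window of size 4 (the ENCODER_ONLY case): 3 tokens on each side.
bidir = window_mask(6, (3, 3))
# No window at all: (-1, -1) lets every position attend everywhere.
full = window_mask(6, (-1, -1))
```

This is why `(sliding_window - 1, 0)` is wrong for encoder-only attention: it silently re-imposes causality, masking out the future tokens a bidirectional model is supposed to see.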