[Core] Support Local Chunked Attention for Hybrid KV Cache (#19351)

Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Lu Fang <fanglu@meta.com>
Lucia Fang
2025-07-19 11:48:38 +08:00
committed by GitHub
parent 466e878f2a
commit 9a9fda1423
9 changed files with 351 additions and 19 deletions

@@ -120,6 +120,7 @@ class AttentionMetadataBuilder(abc.ABC, Generic[M]):
         num_kv_heads: int,
         use_alibi: bool,
         use_sliding_window: bool,
+        use_local_attention: bool,
         num_sms: int,
     ) -> bool:
         return False