[ROCm] AITER fused RoPE+KVCache (#33443)

Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
Signed-off-by: charlifu <charlifu@amd.com>
Signed-off-by: Rohan Potdar <66227218+Rohan138@users.noreply.github.com>
Co-authored-by: charlifu <charlifu@amd.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Douglas Lehr <91553416+dllehr-amd@users.noreply.github.com>
This commit is contained in:
Rohan Potdar
2026-02-23 21:06:00 -06:00
committed by GitHub
parent 95642441d0
commit 2ff4e51152
19 changed files with 1211 additions and 83 deletions

View File

@@ -179,7 +179,7 @@ def create_and_prepopulate_kv_cache(
block_table[i, :num_blocks_for_seq] = inv_perm[start:end]
start_block_idx += num_blocks_for_seq
# Create a realistic slot mapping that corresponds to the block table
# Create a realistic slot mapping that corresponds to the block table
for i in range(batch_size):
token_offsets = torch.arange(int(query_lens[i])) + int(context_lens[i])
block_indices = token_offsets // block_size