Stop using title frontmatter and fix doc that can only be reached by search (#20623)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
This commit is contained in:
Harry Mellor
2025-07-08 11:27:40 +01:00
committed by GitHub
parent b4bab81660
commit b942c094e3
81 changed files with 82 additions and 238 deletions

View File

@@ -1,6 +1,4 @@
---
title: Automatic Prefix Caching
---
# Automatic Prefix Caching
The core idea of [PagedAttention](https://blog.vllm.ai/2023/06/20/vllm.html) is to partition the KV cache of each request into KV Blocks. Each block contains the attention keys and values for a fixed number of tokens. The PagedAttention algorithm allows these blocks to be stored in non-contiguous physical memory so that we can eliminate memory fragmentation by allocating the memory on demand.