[CPU] Support FP8 KV cache (#14741)

Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-03-15 13:07:36 +08:00
parent 877e352262
commit a2ae496589
8 changed files with 122 additions and 36 deletions
--- a/docs/source/getting_started/installation/cpu.md
+++ b/docs/source/getting_started/installation/cpu.md
@@ -189,7 +189,7 @@ vLLM CPU backend supports the following vLLM features:
 - Model Quantization (`INT8 W8A8, AWQ, GPTQ`)
 - Chunked-prefill
 - Prefix-caching
-- FP8-E5M2 KV-Caching (TODO)
+- FP8-E5M2 KV cache

 ## Related runtime environment variables