biondizzle/vllm
Path: vllm/vllm/attention/ops
Commit: 983a40a8bb2ef2a0ed9c5134d49358c38d6b03ae
Latest commit: d0a7a2769d [Hardware][Gaudi][Feature] Support Contiguous Cache Fetch (#12139) by Yu-Zhou, 2025-02-18 19:40:19 -08:00
Signed-off-by: yuzhou <yuzhou@habana.ai>
Signed-off-by: zhouyu5 <yu.zhou@intel.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
File                          Last commit                                                                                                    Date
blocksparse_attention/        [Misc] Add SPDX-License-Identifier headers to python source files (#12628)                                    2025-02-02 11:58:18 -08:00
__init__.py                   [Core] Refactor Attention Take 2 (#3462)                                                                      2024-03-25 04:39:33 +00:00
hpu_paged_attn.py             [Hardware][Gaudi][Feature] Support Contiguous Cache Fetch (#12139)                                            2025-02-18 19:40:19 -08:00
ipex_attn.py                  [Misc] Add SPDX-License-Identifier headers to python source files (#12628)                                    2025-02-02 11:58:18 -08:00
nki_flash_attn.py             [Neuron][Kernel] Support Longer Sequences in NKI-based Flash PagedAttention and Improve Efficiency (#12921)  2025-02-11 21:12:37 -08:00
paged_attn.py                 [Misc] Add SPDX-License-Identifier headers to python source files (#12628)                                    2025-02-02 11:58:18 -08:00
prefix_prefill.py             [ROCm][V1] Add initial ROCm support to V1 (#12790)                                                            2025-02-13 22:21:50 -08:00
triton_decode_attention.py    [Perf] Mem align KV caches for CUDA devices (MLA perf improvement) (#12676)                                   2025-02-04 18:22:24 -08:00
triton_flash_attention.py     [Misc] Fix improper placement of SPDX header in scripts (#12694)                                              2025-02-03 11:16:59 -08:00
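
The modules listed above hold vLLM's backend-specific attention kernels (HPU, IPEX, Neuron NKI, ROCm/CUDA Triton), all built around a paged ("block-table") KV cache. As rough orientation only, the sketch below is a generic, hypothetical illustration of the block-table indexing idea behind paged attention; it is not vLLM's API, and every name in it (gather_keys, BLOCK_SIZE, etc.) is invented for this example.

```python
# Generic illustration (not the vLLM API): a paged KV cache maps a sequence's
# logical token positions to physical cache blocks through a per-sequence
# block table, so the cache for one sequence need not be contiguous in memory.
import numpy as np

BLOCK_SIZE = 4    # tokens per cache block (hypothetical value)
NUM_BLOCKS = 8    # physical blocks in the cache pool
HEAD_DIM = 2      # toy head dimension

# Physical key cache: [num_blocks, block_size, head_dim]
key_cache = np.random.rand(NUM_BLOCKS, BLOCK_SIZE, HEAD_DIM).astype(np.float32)

# Block table for one sequence: logical block i lives in physical block block_table[i].
block_table = np.array([5, 2, 7])   # this sequence occupies 3 non-contiguous blocks
seq_len = 10                        # tokens actually written for this sequence

def gather_keys(key_cache, block_table, seq_len, block_size=BLOCK_SIZE):
    """Collect the keys of one sequence in logical token order."""
    keys = []
    for pos in range(seq_len):
        phys_block = block_table[pos // block_size]  # which physical block holds this token
        offset = pos % block_size                    # slot within that block
        keys.append(key_cache[phys_block, offset])
    return np.stack(keys)                            # [seq_len, head_dim]

keys = gather_keys(key_cache, block_table, seq_len)
print(keys.shape)  # (10, 2)
```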