vllm/vllm/platforms at 95a935fc48563ec63de02a65d41fd2d7cb1d9ea5 - vllm

Files

Woosuk Kwon 98a3a81024 [ROCm] Add attention sink to use_rocm_custom_paged_attention (#22329)

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: Minseok Lee <47620120+minseokl@users.noreply.github.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>

2025-08-05 23:30:38 -07:00

__init__.py

[TPU] Support Pathways in vLLM (#21417)

2025-07-30 10:02:12 -07:00

cpu.py

[CPU] Enable shared-memory based pipeline parallel for CPU backend (#21289)

2025-07-21 09:07:08 -07:00

cuda.py

[V1] port xformers backend to v1 (#21342)

2025-08-05 10:04:46 -07:00

interface.py

[V1] port xformers backend to v1 (#21342)