vllm/tests/v1/attention at 5caaeb714ce3fd08de9c2e87848b4825bb4b676d - vllm

Files

Benjamin Chislett c30b405b8f [Spec Decode] Enable FlashInfer Spec Decoding (#25196)

Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Co-authored-by: lhsjohn <huashuoli@tencent.com>

2025-09-23 22:29:58 -04:00

test_attention_backends_selection.py

[Attention] Unify mamba and attention backend selection (#23171)

2025-08-25 09:09:36 +00:00

test_attention_backends.py

[V1] Add sliding window support to Flex Attention backend (#24089)

2025-09-21 05:08:07 +00:00

test_attention_splitting.py

[Spec Decode] Enable FlashInfer Spec Decoding (#25196)

2025-09-23 22:29:58 -04:00

test_chunked_local_attention.py

fix some typos (#24071)

2025-09-02 20:44:50 -07:00

test_mla_backends.py

[Attention] FlashAttention MLA cudagraph support (#23958)

2025-09-08 22:05:26 +00:00

utils.py

Add FLASHINFER_MLA to backend selector test (#24753)

2025-09-12 22:30:07 +00:00