vllm/tests/v1/attention at 53d2420b4447fbcab572dc23d2c3bb9224a8a561 - vllm

Files

Lucas Wilkinson abe93bce59 [Attention] Make seq_lens_cpu optional in CommonAttentionMetadata to enable true async spec-decode (#29624)

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>

2025-12-09 17:18:10 -08:00

test_attention_backends_selection.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_attention_backends.py

[v1] Add real sliding window calculation to FlexAttention direct BlockMask building (#26015)

2025-12-01 13:12:51 +00:00

test_attention_splitting.py

[Attention] Make split_decodes_and_prefills(..., require_uniform=True) support padding (#29644)

2025-12-09 07:24:01 +00:00

test_batch_reordering.py

[BugFix] Reordering extend logic fix (#27739)

2025-10-29 21:39:34 -07:00

test_chunked_local_attention.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_mla_backends.py

[Attention] Refactor FA block_size limitations to hybrid models only (#29084)

2025-11-22 06:38:44 -08:00

test_rocm_attention_backends_selection.py

[Attention] Update attention imports (#29540)

2025-11-27 11:19:09 -05:00

test_sparse_mla_backends.py

Add TP parameter to attention tests (#27683)

2025-11-03 13:04:40 -08:00

utils.py

[Attention] Make seq_lens_cpu optional in CommonAttentionMetadata to enable true async spec-decode (#29624)

2025-12-09 17:18:10 -08:00