vllm/tests/v1/spec_decode at 8c32c6e4b485f1cae1a1dc8a3f9895cf63f3e7af - vllm

Matthew Bonanni b30dfa03c5 [Attention] Refactor CUDA attention backend selection logic (#24794)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

2025-11-11 07:40:44 -05:00

test_eagle.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00

test_max_len.py

[Bugfix] Spec decode + structured output + spec model max len edge case (#28298)

2025-11-08 19:44:25 +00:00

test_mtp.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00

test_ngram.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_speculators_eagle3.py

[Speculators] Move tests + fix integration (#27308)

2025-10-29 00:54:21 -07:00

test_tree_attention.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00