vllm/tests/v1/spec_decode at 4cd332f3cffc6a5137c57a49cc1f66773ae375da - vllm

Files

Lucas Wilkinson abe93bce59 [Attention] Make seq_lens_cpu optional in CommonAttentionMetadata to enable true async spec-decode (#29624)

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>

2025-12-09 17:18:10 -08:00

test_eagle.py

[ROCm][CI] Fix test_max_len.py for Rocm (#29916)

2025-12-08 16:58:30 -05:00

test_max_len.py

[CI] Fix Flaky test_eagle_max_len Test (#30306)

2025-12-09 07:33:34 +00:00

test_mtp.py

Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199)

2025-12-07 00:00:22 -08:00

test_ngram.py

Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199)

2025-12-07 00:00:22 -08:00

test_speculators_eagle3.py

[Rocm][CI] Fix test_speculator_eagle3 by skipping the CompressedTensorw4a16 Model (#30001)

2025-12-04 07:52:28 +00:00

test_tree_attention.py

[Attention] Make seq_lens_cpu optional in CommonAttentionMetadata to enable true async spec-decode (#29624)

2025-12-09 17:18:10 -08:00