vllm/tests/v1/spec_decode at b7d59ffce2f951e0ec8d1dc3a2f1e3d27f779906 - vllm

Files

Lucas Wilkinson 28ef9ba399 [BugFix] Add support for MTP num_speculative_tokens > 1 with sparse MLA (#34552)

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>

2026-03-03 07:21:57 -08:00

__init__.py

Test: added acceptance length tests (#32030)

2026-01-20 18:55:15 +00:00

test_acceptance_length.py

move spec decode slow test to test_areas.yaml (#33365)

2026-02-02 06:28:36 -08:00

test_eagle.py

[BugFix] Add support for MTP num_speculative_tokens > 1 with sparse MLA (#34552)

2026-03-03 07:21:57 -08:00

test_extract_hidden_states.py

[Spec Decode] Add hidden states extraction system (#33736)

2026-03-02 14:29:09 -05:00

test_max_len.py

[Attention] Update tests to remove deprecated env vars (#30563)

2025-12-17 09:49:59 -08:00

test_mtp.py

[BugFix] Add support for MTP num_speculative_tokens > 1 with sparse MLA (#34552)

2026-03-03 07:21:57 -08:00

test_ngram.py

[Cleanup] Remove obsolete spec decoding compatibility logic (#32003)

2026-01-09 05:44:18 +00:00

test_speculators_eagle3.py

[Rocm][CI] Fix test_speculator_eagle3 by skipping the CompressedTensorw4a16 Model (#30001)

2025-12-04 07:52:28 +00:00

test_tree_attention.py

[ROCm][CI] Fix spec decode logprobs flakiness and parametrize tree attention backends (#34599)

2026-02-20 20:25:23 -08:00