vllm/tests/v1/spec_decode at 738d0a281fab2e151a67b370c26b4e4360362f8f

biondizzle/vllm

Fynn Schmitt-Ulms 04bf5a35fa [Spec Decode] Update extract_hidden_states to use deferred kv_connector clear (#37013)

2026-03-16 14:53:45 +01:00

__init__.py

Test: added acceptance length tests (#32030)

2026-01-20 18:55:15 +00:00

test_acceptance_length.py

[Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145)

2026-03-12 07:57:47 -07:00

test_eagle_step_kernel.py

feat(spec_decode): fuse EAGLE step slot mapping and metadata updates (#33503)

2026-03-11 04:35:33 +00:00

test_eagle.py

Reapply [Attention] Refactor check_and_update_config (#35122)

2026-03-09 07:17:14 -07:00

test_extract_hidden_states.py

[Spec Decode] Update extract_hidden_states to use deferred kv_connector clear (#37013)

2026-03-16 14:53:45 +01:00

test_max_len.py

[Attention] Update tests to remove deprecated env vars (#30563)

2025-12-17 09:49:59 -08:00

test_mtp.py

[BugFix] Add support for MTP num_speculative_tokens > 1 with sparse MLA (#34552)

2026-03-03 07:21:57 -08:00

test_ngram.py

[Cleanup] Remove obsolete spec decoding compatibility logic (#32003)

2026-01-09 05:44:18 +00:00

test_speculators_eagle3.py

[Rocm][CI] Fix test_speculator_eagle3 by skipping the CompressedTensorw4a16 Model (#30001)

2025-12-04 07:52:28 +00:00

test_tree_attention.py

[ROCm][CI] Fix spec decode logprobs flakiness and parametrize tree attention backends (#34599)

2026-02-20 20:25:23 -08:00