vllm/tests/v1/worker at 86d15bfd8d681a2ca2f3b2e550149a5ba3282ef1 - vllm

Matthew Bonanni b30dfa03c5 [Attention] Refactor CUDA attention backend selection logic (#24794)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

2025-11-11 07:40:44 -05:00

__init__.py

[V1] Adding min tokens/repetition/presence/frequence penalties to V1 sampler (#10681)

2024-12-26 19:02:58 +09:00

test_gpu_input_batch.py

[Chore] Separate out vllm.utils.platform_utils.py (#27374)

2025-10-23 19:08:06 +00:00

test_gpu_model_runner.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00

test_utils.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_worker_memory_snapshot.py

[Chore] Separate out vllm.utils.mem_utils (#27143)

2025-10-18 10:06:59 +00:00