vllm/tests/v1/worker at 02a41691932683aa544b8a0139586f43e2f8b4bd - vllm

Files

Matthew Bonanni 66e674cdd5 [Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>

2025-12-05 09:48:43 -08:00

__init__.py

[V1] Adding min tokens/repetition/presence/frequence penalties to V1 sampler (#10681)

2024-12-26 19:02:58 +09:00

test_gpu_input_batch.py

[Chore] Separate out vllm.utils.platform_utils.py (#27374)

2025-10-23 19:08:06 +00:00

test_gpu_model_runner.py

[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315)

2025-12-05 09:48:43 -08:00

test_gpu_profiler.py

[Feat] Iteration-level profiling for Torch and CUDA profiler (#28987)