vllm/vllm/worker at 59e17dd4a0f058e0bef38a371316b87fee3e882e - vllm

Boyuan Feng 94e6b2d55f Allow users to specify kv cache memory size (#21489)

Signed-off-by: Boyuan Feng <boyuan@meta.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2025-09-11 13:41:07 +00:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

cache_engine.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

enc_dec_model_runner.py

[V0 Deprecation] Remove pooling model support in V0 (#23434)

2025-08-29 00:04:02 -07:00

model_runner_base.py

[V0 Deprecation] Remove pooling model support in V0 (#23434)

2025-08-29 00:04:02 -07:00

model_runner.py

Allow users to specify kv cache memory size (#21489)

2025-09-11 13:41:07 +00:00

utils.py

[V0 Deprecation] Remove Prompt Adapters (#20588)

2025-07-23 16:36:48 -07:00

worker_base.py

[P/D] Add a shutdown method to the Connector API (#22699)

2025-09-07 23:07:00 -07:00

worker.py

Allow users to specify kv cache memory size (#21489)

2025-09-11 13:41:07 +00:00