vllm/vllm/executor at dac6a3f6ed14ea4061b672f9290bfdf8bcdd996d - vllm

Files

Cody Yu f942efb5a3 [Dynamic Spec Decoding] Auto-disable by the running queue size (#4592)

Co-authored-by: Cade Daniel <edacih@gmail.com>

2024-05-08 21:44:00 +00:00

__init__.py

2024-03-11 11:03:45 -07:00

cpu_executor.py

2024-05-03 17:47:07 -07:00

distributed_gpu_executor.py

2024-04-27 11:17:45 -07:00

executor_base.py

2024-05-03 17:47:07 -07:00

gpu_executor.py

2024-05-08 21:44:00 +00:00

multiproc_worker_utils.py

2024-05-02 11:13:25 -07:00

neuron_executor.py

2024-05-03 17:47:07 -07:00

ray_gpu_executor.py

2024-05-03 17:47:07 -07:00

ray_utils.py

2024-04-26 00:16:58 -07:00