vllm/vllm/engine at 5aef49806da2e6cc8a92c948d44e8a722469135f - vllm

Yanyi Liu 5aef49806d [Feature] Add load generation config from model (#11164)

Signed-off-by: liuyanyi <wolfsonliu@163.com>
Signed-off-by: Yanyi Liu <wolfsonliu@163.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

2024-12-19 10:50:38 +00:00

multiprocessing

[Doc] Update docs to refer to pooling models (#11093)

2024-12-11 13:36:27 +00:00

output_processor

[Doc] Create a new "Usage" section (#10827)

2024-12-05 11:19:35 +08:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

arg_utils.py

[Feature] Add load generation config from model (#11164)

2024-12-19 10:50:38 +00:00

async_llm_engine.py

[Bugfix] Fix request cancellation without polling (#11190)

2024-12-17 12:26:32 -08:00

async_timeout.py

[Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654)

2024-06-19 13:57:12 -07:00

llm_engine.py

[Feature] Add load generation config from model (#11164)

2024-12-19 10:50:38 +00:00

metrics_types.py

monitor metrics of tokens per step using cudagraph batchsizes (#11031)

2024-12-09 22:35:36 -08:00

metrics.py

monitor metrics of tokens per step using cudagraph batchsizes (#11031)

2024-12-09 22:35:36 -08:00

protocol.py

[Doc] Update docs to refer to pooling models (#11093)

2024-12-11 13:36:27 +00:00