biondizzle/vllm
vllm/vllm/v1/engine at 4d251ad00ea11ecdf369aa10b9abecf96505b8dc

Latest commit: bc32bc73aa [V1][Metrics] Implement vllm:lora_requests_info metric (#13504) by Mark McLoughlin, 2025-02-24 20:01:33 -08:00
File | Last commit | Date
__init__.py | [V1][Core] Generic mechanism for handling engine utility (#13060) | 2025-02-19 17:09:22 +08:00
async_llm.py | [V1] V1 engine implements parallel sampling (AsyncLLM and LLMEngine) (#10980) | 2025-02-24 08:29:41 -08:00
core_client.py | [V1][BugFix] Fix engine core client shutdown hangs (#13298) | 2025-02-23 13:07:43 -08:00
core.py | [v1] torchrun compatibility (#13642) | 2025-02-23 22:47:24 +08:00
detokenizer.py | [V1] Logprobs and prompt logprobs support (#9880) | 2025-02-07 07:26:20 -08:00
llm_engine.py | [V1] V1 engine implements parallel sampling (AsyncLLM and LLMEngine) (#10980) | 2025-02-24 08:29:41 -08:00
logprobs.py | [V1] Logprobs and prompt logprobs support (#9880) | 2025-02-07 07:26:20 -08:00
mm_input_cache.py | [V1] Consolidate MM cache size to vllm.envs (#13239) | 2025-02-13 20:19:03 -08:00
output_processor.py | [V1][Metrics] Implement vllm:lora_requests_info metric (#13504) | 2025-02-24 20:01:33 -08:00
parallel_sampling.py | [V1] V1 engine implements parallel sampling (AsyncLLM and LLMEngine) (#10980) | 2025-02-24 08:29:41 -08:00
processor.py | [v1] Support allowed_token_ids in v1 Sampler (#13210) | 2025-02-22 14:13:05 +08:00