vllm/vllm/v1/core at 5d91d2b292be9b1d6b121d36d242d5077a031e4b - vllm

Files

maang-h 5d91d2b292 [Doc] Add allocate_slots parameter docs (#29777)

Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>

2025-12-02 23:23:09 +00:00

sched

[Bugfix] fix --scheduling-policy=priority & n>1 crashes engine (#29764)

2025-12-02 22:42:28 +00:00

__init__.py

[V1] Implement vLLM V1 [1/N] (#9289)

2024-10-22 01:24:07 -07:00

block_pool.py

[Core][Observability] Add KV cache residency metrics (#27793)

2025-12-01 18:27:53 +00:00

encoder_cache_manager.py

[Misc] Simplify max tokens in multimodal registry (#27500)

2025-10-24 23:56:01 -07:00

kv_cache_coordinator.py

[Core][Observability] Add KV cache residency metrics (#27793)

2025-12-01 18:27:53 +00:00

kv_cache_manager.py

[Doc] Add allocate_slots parameter docs (#29777)

2025-12-02 23:23:09 +00:00

kv_cache_metrics.py

[Core][Observability] Add KV cache residency metrics (#27793)

2025-12-01 18:27:53 +00:00

kv_cache_utils.py

[Hybrid Allocator] Support KV cache groups with different block_size (#29143)

2025-11-25 10:30:57 -05:00

single_type_kv_cache_manager.py

[Hybrid Allocator] Support KV cache groups with different block_size (#29143)

2025-11-25 10:30:57 -05:00