vllm/vllm/v1/core at bd2b52fc2dd09b401991835c8a2a6f2ef940b2e4 - vllm

Files

SungMinCho a0b782f9cc [Metrics] Model FLOPs Utilization estimation (#30738)

Signed-off-by: SungMinCho <tjdals4565@gmail.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>

2025-12-18 01:40:51 +00:00

sched

[Metrics] Model FLOPs Utilization estimation (#30738)

2025-12-18 01:40:51 +00:00

__init__.py

[V1] Implement vLLM V1 [1/N] (#9289)

2024-10-22 01:24:07 -07:00

block_pool.py

[P/D] KV Load Failure Recovery/Abort Configuration (#26813)

2025-12-10 11:00:52 -08:00

encoder_cache_manager.py

[Core][MM] Optimize encoder cache manager by operating with embeddings only (#30475)

2025-12-16 14:18:17 -08:00

kv_cache_coordinator.py

[Core][Observability] Add KV cache residency metrics (#27793)

2025-12-01 18:27:53 +00:00

kv_cache_manager.py

[P/D] KV Load Failure Recovery/Abort Configuration (#26813)

2025-12-10 11:00:52 -08:00

kv_cache_metrics.py

[Core][Observability] Add KV cache residency metrics (#27793)

2025-12-01 18:27:53 +00:00

kv_cache_utils.py

[Bugfix] fix confusing OOM errors during v1 init (#28051)

2025-12-10 23:17:41 +00:00

single_type_kv_cache_manager.py

[Hybrid Allocator] Support KV cache groups with different block_size (#29143)

2025-11-25 10:30:57 -05:00