biondizzle/vllm
Commit: 4c16ba617f76b342dd0e62deba1f96ed6cee74fa
Path: vllm/vllm/v1/core
Latest commit: 028599739d [BugFix] scheduler: Fix resuming of preempted requests after async load (#31583) by Or Ozeri
Signed-off-by: Or Ozeri <oro@il.ibm.com>
Date: 2026-01-10 12:39:25 -08:00
Name                             Last commit date            Last commit message
sched/                           2026-01-10 12:39:25 -08:00  [BugFix] scheduler: Fix resuming of preempted requests after async load (#31583)
__init__.py                      2024-10-22 01:24:07 -07:00  [V1] Implement vLLM V1 [1/N] (#9289)
block_pool.py                    2025-12-30 00:17:16 +00:00  [Prefix Cache] Include lora_name in BlockStored event for deterministic KV-cache reconstruction (#27577)
encoder_cache_manager.py         2025-12-16 14:18:17 -08:00  [Core][MM] Optimize encoder cache manager by operating with embeddings only (#30475)
kv_cache_coordinator.py          2026-01-09 10:53:20 -08:00  [Feat][Core] Support multiple KV cache groups in Hybrid KV Coordinator (#31707)
kv_cache_manager.py              2025-12-26 18:25:46 -08:00  [Core][Hybrid allocator + connector] Support hybrid allocator + kv cache connector (#30166)
kv_cache_metrics.py              2025-12-01 18:27:53 +00:00  [Core][Observability] Add KV cache residency metrics (#27793)
kv_cache_utils.py                2026-01-07 18:37:31 +00:00  [refactor] refactor memory constants usage (#31865)
single_type_kv_cache_manager.py  2025-12-30 08:11:38 -08:00  [Model] Add support for openPangu moe model (#28775)