biondizzle/vllm
vllm/vllm/v1/core @ 66e86f1dbd565292a253e7d2d6851f65dc4f14ba
Latest commit: 91e4521f9f by Yifan Qiao, 2026-03-31 17:58:37 -07:00
[Feat][v1] Simple yet General CPU KV Cache Offloading (#37160)
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>
sched                              [Feat][v1] Simple yet General CPU KV Cache Offloading (#37160)  2026-03-31 17:58:37 -07:00
__init__.py                        [V1] Implement vLLM V1 [1/N] (#9289)  2024-10-22 01:24:07 -07:00
block_pool.py                      [feat] Add per-block extra_keys to KV events (#33304)  2026-02-20 20:11:40 -08:00
encoder_cache_manager.py           [Refactor] Move profiling methods to MM budget (#33559)  2026-02-02 23:27:00 +08:00
kv_cache_coordinator.py            [BugFix] Avoid prefix cache hit in the same schedule step for mamba layers (#29387)  2026-02-10 07:41:16 +00:00
kv_cache_manager.py                [Core] add option to schedule requests based on full ISL (#37307)  2026-03-24 13:01:12 -04:00
kv_cache_metrics.py                [Core][Observability] Add KV cache residency metrics (#27793)  2025-12-01 18:27:53 +00:00
kv_cache_utils.py                  fix: disambiguate multimodal prefix cache keys (#36708)  2026-03-20 10:33:20 +08:00
single_type_kv_cache_manager.py    [BUGFIX][Mamba][Qwen3.5] Zero freed SSM cache blocks on GPU (#35219)  2026-03-10 03:32:20 -07:00