vllm / vllm / v1 / core
Commit: eec6942014c4408d8d9e4c3a37324f7ff35fc5aa
Latest commit: f38ee34a0a "[feat] Enable mm caching for transformers backend (#21358)"
Author: Raushan Turganbay (Signed-off-by: raushan <raushan@huggingface.co>)
Date: 2025-07-22 08:18:46 -07:00

Name | Last commit | Last commit date
sched/ | Implement Async Scheduling (#19970) | 2025-07-14 23:01:46 -07:00
__init__.py | [V1] Implement vLLM V1 [1/N] (#9289) | 2024-10-22 01:24:07 -07:00
block_pool.py | [Core] Introduce popleft_n and append_n in FreeKVCacheBlockQueue to further optimize block_pool (#21222) | 2025-07-22 06:17:47 -07:00
encoder_cache_manager.py | [V1] Add API docs for EncoderCacheManager (#19294) | 2025-06-18 13:37:01 +08:00
kv_cache_coordinator.py | [V1] Hybrid allocator without prefix caching (#20661) | 2025-07-13 16:55:14 +00:00
kv_cache_manager.py | [v1][core] Support for attention free models (#20811) | 2025-07-15 14:20:01 +00:00
kv_cache_utils.py | [feat] Enable mm caching for transformers backend (#21358) | 2025-07-22 08:18:46 -07:00
single_type_kv_cache_manager.py | [Core] Support Local Chunked Attention for Hybrid KV Cache (#19351) | 2025-07-18 20:48:38 -07:00
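
One entry above names a concrete technique: PR #21222 adds batch popleft_n and append_n operations to FreeKVCacheBlockQueue so that block_pool.py can take or return n KV cache blocks in one call instead of n single-block calls. Below is a minimal Python sketch of that idea only; the deque backing, the class name, and all method internals are illustrative assumptions, not vLLM's actual implementation.

from collections import deque


class FreeBlockQueueSketch:
    """Hypothetical free-block queue. This is NOT vLLM's
    FreeKVCacheBlockQueue; it only illustrates batched pop/append."""

    def __init__(self, num_blocks: int) -> None:
        # All blocks start out free, identified here by integer IDs.
        self.free_blocks = deque(range(num_blocks))

    def popleft_n(self, n: int) -> list:
        """Pop the n least-recently-freed blocks in one batched call,
        instead of n separate popleft() calls at the call site."""
        if n > len(self.free_blocks):
            raise ValueError("not enough free KV cache blocks")
        return [self.free_blocks.popleft() for _ in range(n)]

    def append_n(self, blocks: list) -> None:
        """Return a batch of freed blocks to the tail of the queue."""
        self.free_blocks.extend(blocks)


queue = FreeBlockQueueSketch(num_blocks=8)
allocated = queue.popleft_n(3)  # [0, 1, 2]
queue.append_n(allocated)       # the blocks are reusable again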