biondizzle/vllm
eb5819b2d9ff4e5a019de97c333bbedf2a2def1a
vllm/vllm/v1/core
Lily Liu  f49e5aff11  [V1][Spec Decode] KV cache slots for eagle heads (#16370)
Signed-off-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
2025-04-12 19:42:51 -07:00
sched                     [V1][Spec Decode] KV cache slots for eagle heads (#16370)  2025-04-12 19:42:51 -07:00
__init__.py               [V1] Implement vLLM V1 [1/N] (#9289)  2024-10-22 01:24:07 -07:00
block_pool.py             [V1][Minor] Optimize get_cached_block (#16135)  2025-04-06 20:48:14 +00:00
encoder_cache_manager.py  Enforce valid max_num_batched_tokens when disable_chunked_mm_input=True (#16447)  2025-04-11 08:09:52 +00:00
kv_cache_manager.py       [V1][Spec Decode] KV cache slots for eagle heads (#16370)  2025-04-12 19:42:51 -07:00
kv_cache_utils.py         [Feature] Estimate max-model-len use available KV cache memory (#16168)  2025-04-08 19:12:51 -07:00
specialized_manager.py    [V1] Implement sliding window attention in kv_cache_manager (#14097)  2025-04-01 00:33:17 -07:00