biondizzle/vllm
Files at commit 45a1a69b9841a4cb7cc70788cf7dea1a2d3ec3d6
Path: vllm/vllm/core

Latest commit: 4238bc82f2 by afeldman-nm, [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837), 2024-05-29 16:09:13 +00:00
Name                              Last change                 Last commit
block/                            2024-05-29 16:09:13 +00:00  [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)
__init__.py                       2023-06-17 03:07:40 -07:00  Change the name to vLLM (#150)
block_manager_v1.py               2024-05-29 16:09:13 +00:00  [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)
block_manager_v2.py               2024-05-29 16:09:13 +00:00  [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)
embedding_model_block_manager.py  2024-05-11 11:30:37 -07:00  [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)
evictor_v1.py                     2024-05-01 11:20:32 -07:00  [Core] Enable prefix caching with block manager v2 enabled (#4142)
evictor_v2.py                     2024-05-02 03:01:00 +00:00  [mypy][6/N] Fix all the core subdirectory typing (#4450)
interfaces.py                     2024-05-11 11:30:37 -07:00  [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)
policy.py                         2024-04-05 10:17:58 -07:00  [Chunked Prefill][4/n] Chunked prefill scheduler. (#3853)
scheduler.py                      2024-05-20 17:48:32 -07:00  [Core] Fix scheduler considering "no LoRA" as "LoRA" (#4897)
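
The block_manager_v1.py and block_manager_v2.py modules handle vLLM's paged KV-cache memory management: each sequence is given fixed-size cache blocks from a bounded GPU pool, tracked in a per-sequence block table. As a rough illustration of that idea only (ToyBlockAllocator and every name below are hypothetical, not vLLM's actual classes), a minimal sketch might look like this:

from typing import Dict, List


class ToyBlockAllocator:
    """Hands out fixed-size KV-cache blocks from a bounded pool (toy sketch)."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        self.free_blocks: List[int] = list(range(num_blocks))
        # Per-sequence "block table": seq_id -> physical blocks it owns.
        self.block_tables: Dict[int, List[int]] = {}

    def _blocks_needed(self, num_tokens: int) -> int:
        # Ceiling division: a partially filled block still occupies a slot.
        return (num_tokens + self.block_size - 1) // self.block_size

    def can_allocate(self, num_tokens: int) -> bool:
        return len(self.free_blocks) >= self._blocks_needed(num_tokens)

    def allocate(self, seq_id: int, num_tokens: int) -> List[int]:
        needed = self._blocks_needed(num_tokens)
        if len(self.free_blocks) < needed:
            raise MemoryError("not enough free KV-cache blocks")
        blocks = [self.free_blocks.pop() for _ in range(needed)]
        self.block_tables[seq_id] = blocks
        return blocks

    def free(self, seq_id: int) -> None:
        # Return a finished sequence's blocks to the pool.
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))


allocator = ToyBlockAllocator(num_blocks=8, block_size=16)
table = allocator.allocate(seq_id=0, num_tokens=40)   # 40 tokens -> 3 blocks
assert len(table) == 3 and allocator.can_allocate(num_tokens=80)
allocator.free(seq_id=0)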
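
evictor_v1.py and evictor_v2.py decide which cached blocks to reclaim when the pool fills up, which matters once prefix caching (see #4142 above) keeps blocks alive after their sequences finish. A minimal least-recently-used sketch, again with hypothetical names rather than vLLM's API:

from collections import OrderedDict


class ToyLRUEvictor:
    """Reclaims the least recently used evictable block (toy sketch)."""

    def __init__(self) -> None:
        # block_id -> content hash; insertion order tracks recency.
        self._blocks: "OrderedDict[int, int]" = OrderedDict()

    def add(self, block_id: int, content_hash: int) -> None:
        # A block becomes evictable once no sequence references it.
        self._blocks[block_id] = content_hash
        self._blocks.move_to_end(block_id)

    def touch(self, block_id: int) -> None:
        # A prefix-cache hit refreshes the block's recency.
        self._blocks.move_to_end(block_id)

    def evict(self) -> int:
        # Pop from the front: the least recently used block.
        block_id, _ = self._blocks.popitem(last=False)
        return block_id


ev = ToyLRUEvictor()
for bid in (1, 2, 3):
    ev.add(bid, content_hash=hash(("prefix", bid)))
ev.touch(1)             # block 1 is now most recently used
assert ev.evict() == 2  # block 2 is the LRU victim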
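
Finally, policy.py supplies the ordering that the scheduler in scheduler.py uses when picking waiting requests to admit. A first-come-first-served ordering can be sketched in a few lines (Request and fcfs_order are illustrative stand-ins, not vLLM's types):

from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    request_id: str
    arrival_time: float  # seconds since epoch, set on enqueue


def fcfs_order(waiting: List[Request]) -> List[Request]:
    """Return waiting requests sorted oldest-first (FCFS priority)."""
    return sorted(waiting, key=lambda r: r.arrival_time)


waiting = [
    Request("b", arrival_time=2.0),
    Request("a", arrival_time=1.0),
    Request("c", arrival_time=3.0),
]
assert [r.request_id for r in fcfs_order(waiting)] == ["a", "b", "c"]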