Announce paper release (#1036)

Woosuk Kwon
2023-09-13 17:38:13 -07:00
committed by GitHub
parent f04908cae7
commit eda1a7cad3
2 changed files with 15 additions and 1 deletion

@@ -43,6 +43,7 @@ vLLM is flexible and easy to use with:
For more information, check out the following:
* `vLLM announcing blog post <https://vllm.ai>`_ (intro to PagedAttention)
* `vLLM paper <https://arxiv.org/abs/2309.06180>`_ (SOSP 2023)
* `How continuous batching enables 23x throughput in LLM inference while reducing p50 latency <https://www.anyscale.com/blog/continuous-batching-llm-inference>`_ by Cade Daniel et al.