Announce paper release (#1036)

Woosuk Kwon
2023-09-13 17:38:13 -07:00
committed by GitHub
parent f04908cae7
commit eda1a7cad3
2 changed files with 15 additions and 1 deletion

@@ -43,6 +43,7 @@ vLLM is flexible and easy to use with:
For more information, check out the following:
* `vLLM announcing blog post <https://vllm.ai>`_ (intro to PagedAttention)
* `vLLM paper <https://arxiv.org/abs/2309.06180>`_ (SOSP 2023)
* `How continuous batching enables 23x throughput in LLM inference while reducing p50 latency <https://www.anyscale.com/blog/continuous-batching-llm-inference>`_ by Cade Daniel et al.