Add logo and polish readme (#156)

2023-06-19 16:31:13 +08:00
parent 5822ede66e
commit a255885f83
14 changed files with 264 additions and 48 deletions
--- a/docs/source/assets/figures/perf_a100_n1_dark.png
+++ b/docs/source/assets/figures/perf_a100_n1_dark.png
--- a/docs/source/assets/figures/perf_a100_n1_light.png
+++ b/docs/source/assets/figures/perf_a100_n1_light.png
--- a/docs/source/assets/figures/perf_a100_n3_dark.png
+++ b/docs/source/assets/figures/perf_a100_n3_dark.png
--- a/docs/source/assets/figures/perf_a100_n3_light.png
+++ b/docs/source/assets/figures/perf_a100_n3_light.png
--- a/docs/source/assets/figures/perf_a10g_n1_dark.png
+++ b/docs/source/assets/figures/perf_a10g_n1_dark.png
--- a/docs/source/assets/figures/perf_a10g_n1_light.png
+++ b/docs/source/assets/figures/perf_a10g_n1_light.png
--- a/docs/source/assets/figures/perf_a10g_n3_dark.png
+++ b/docs/source/assets/figures/perf_a10g_n3_dark.png
--- a/docs/source/assets/figures/perf_a10g_n3_light.png
+++ b/docs/source/assets/figures/perf_a10g_n3_light.png
--- a/docs/source/assets/logos/vllm-logo-only-light.png
+++ b/docs/source/assets/logos/vllm-logo-only-light.png
--- a/docs/source/assets/logos/vllm-logo-text-dark.png
+++ b/docs/source/assets/logos/vllm-logo-text-dark.png
--- a/docs/source/assets/logos/vllm-logo-text-light.png
+++ b/docs/source/assets/logos/vllm-logo-text-light.png
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -1,18 +1,43 @@
 Welcome to vLLM!
 ================

-**vLLM** is a fast and easy-to-use library for LLM inference and serving.
-Its core features include:
+.. figure:: ./assets/logos/vllm-logo-text-light.png
+  :width: 60%
+  :align: center
+  :alt: vLLM
+  :class: no-scaled-link

- State-of-the-art performance in serving throughput
- Efficient management of attention key and value memory with **PagedAttention**
- Seamless integration with popular HuggingFace models
- Dynamic batching of incoming requests
- Optimized CUDA kernels
- High-throughput serving with various decoding algorithms, including *parallel sampling* and *beam search*
- Tensor parallelism support for distributed inference
- Streaming outputs
- OpenAI-compatible API server
+.. raw:: html
+
+   <p style="text-align:center">
+   <strong>Easy, fast, and cheap LLM serving for everyone
+   </strong>
+   </p>
+
+   <p style="text-align:center">
+   <a class="github-button" href="https://github.com/WoosukKwon/vllm" data-show-count="true" data-size="large" aria-label="Star WoosukKwon/vllm on GitHub">Star</a>
+   <a class="github-button" href="https://github.com/WoosukKwon/vllm/subscription" data-icon="octicon-eye" data-size="large" aria-label="Watch WoosukKwon/vllm on GitHub">Watch</a>
+   <a class="github-button" href="https://github.com/WoosukKwon/vllm/fork" data-icon="octicon-repo-forked" data-size="large" aria-label="Fork WoosukKwon/vllm on GitHub">Fork</a>
+   </p>
+
+
+
+vLLM is a fast and easy-to-use library for LLM inference and serving.
+
+vLLM is fast with:
+
+* State-of-the-art serving throughput
+* Efficient management of attention key and value memory with **PagedAttention**
+* Dynamic batching of incoming requests
+* Optimized CUDA kernels
+
+vLLM is flexible and easy to use with:
+
+* Seamless integration with popular HuggingFace models
+* High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
+* Tensor parallelism support for distributed inference
+* Streaming outputs
+* OpenAI-compatible API server

 For more information, please refer to our `blog post <>`_.