[Doc] Improve GitHub links (#11491)

Author: Cyrus Leung
Date: 2024-12-26 06:49:26 +08:00
Committed by: GitHub
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
parent b689ada91e
commit 6ad909fdda
31 changed files with 147 additions and 136 deletions


@@ -55,7 +55,7 @@ for output in outputs:
More API details can be found in the {doc}`Offline Inference
</dev/offline_inference/offline_index>` section of the API docs.
-The code for the `LLM` class can be found in [vllm/entrypoints/llm.py](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/llm.py).
+The code for the `LLM` class can be found in <gh-file:vllm/entrypoints/llm.py>.
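
The `<gh-file:...>` form is a MyST autolink resolved by a custom URL scheme at docs build time. A minimal sketch of how such a scheme can be registered in the Sphinx `conf.py`, assuming MyST's `myst_url_schemes` option; the exact template vLLM registers is not shown in this diff:

```python
# docs/source/conf.py (sketch) -- expand <gh-file:...> autolinks to GitHub URLs.
# The exact mapping is an assumption; the format follows MyST's documented
# myst_url_schemes option.
myst_url_schemes = {
    "http": None,    # keep default handling for ordinary links
    "https": None,
    "gh-file": {
        "url": "https://github.com/vllm-project/vllm/blob/main/{{path}}",
        "title": "{{path}}",
    },
}
```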
### OpenAI-compatible API server
@@ -66,7 +66,7 @@ This server can be started using the `vllm serve` command.
```console
vllm serve <model>
```
-The code for the `vllm` CLI can be found in [vllm/scripts.py](https://github.com/vllm-project/vllm/blob/main/vllm/scripts.py).
+The code for the `vllm` CLI can be found in <gh-file:vllm/scripts.py>.
Sometimes you may see the API server entrypoint used directly instead of via the
`vllm` CLI command. For example:
@@ -75,7 +75,7 @@ Sometimes you may see the API server entrypoint used directly instead of via the
```console
python -m vllm.entrypoints.openai.api_server --model <model>
```
-That code can be found in [vllm/entrypoints/openai/api_server.py](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/api_server.py).
+That code can be found in <gh-file:vllm/entrypoints/openai/api_server.py>.
More details on the API server can be found in the {doc}`OpenAI Compatible
Server </serving/openai_compatible_server>` document.
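
Once the server is up, it speaks the OpenAI API, so any OpenAI client can query it. A minimal sketch using the `openai` Python package; the base URL, port, and model name are illustrative assumptions:

```python
# Query a running vLLM OpenAI-compatible server (a sketch; base URL, port,
# and model name are assumptions -- use whatever `vllm serve` was started with).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.completions.create(
    model="facebook/opt-125m",      # must match the served model
    prompt="San Francisco is a",
    max_tokens=32,
)
print(response.choices[0].text)
```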
@@ -105,7 +105,7 @@ processing.
- **Output Processing**: Processes the outputs generated by the model, decoding the
token IDs from a language model into human-readable text.
-The code for `LLMEngine` can be found in [vllm/engine/llm_engine.py].
+The code for `LLMEngine` can be found in <gh-file:vllm/engine/llm_engine.py>.
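
To make the input-processing/scheduling/execution/output-processing cycle concrete, here is a minimal sketch that drives `LLMEngine` directly with an add-request/step loop; the model name, prompt, and sampling values are illustrative:

```python
# A minimal sketch of driving LLMEngine directly; the model, prompt, and
# sampling values are illustrative.
from vllm import EngineArgs, LLMEngine, SamplingParams

engine = LLMEngine.from_engine_args(EngineArgs(model="facebook/opt-125m"))
engine.add_request(request_id="0",
                   prompt="The capital of France is",
                   params=SamplingParams(max_tokens=16))

# Each step() performs one iteration of scheduling, model execution, and
# output processing, returning outputs for requests that made progress.
while engine.has_unfinished_requests():
    for request_output in engine.step():
        if request_output.finished:
            print(request_output.outputs[0].text)
```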
### AsyncLLMEngine
@@ -115,10 +115,9 @@ incoming requests. The `AsyncLLMEngine` is designed for online serving, where it
can handle multiple concurrent requests and stream outputs to clients.
The OpenAI-compatible API server uses the `AsyncLLMEngine`. There is also a demo
-API server that serves as a simpler example in
-[vllm/entrypoints/api_server.py].
+API server that serves as a simpler example in <gh-file:vllm/entrypoints/api_server.py>.
-The code for `AsyncLLMEngine` can be found in [vllm/engine/async_llm_engine.py].
+The code for `AsyncLLMEngine` can be found in <gh-file:vllm/engine/async_llm_engine.py>.
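
For a concrete picture of the async interface, a minimal streaming sketch; the model name, prompt, and request id are illustrative:

```python
# A minimal streaming sketch for AsyncLLMEngine; model, prompt, and request id
# are illustrative.
import asyncio
from vllm import AsyncEngineArgs, AsyncLLMEngine, SamplingParams

async def main():
    engine = AsyncLLMEngine.from_engine_args(
        AsyncEngineArgs(model="facebook/opt-125m"))
    # generate() is an async generator that yields incremental RequestOutputs,
    # which is what lets the API server stream tokens to clients.
    async for output in engine.generate("Hello, my name is",
                                        SamplingParams(max_tokens=16),
                                        request_id="0"):
        print(output.outputs[0].text)

asyncio.run(main())
```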
## Worker
@@ -252,7 +251,3 @@ big problem.
In summary, the complete config object `VllmConfig` can be treated as an
engine-level global state that is shared among all vLLM classes.
-[vllm/engine/async_llm_engine.py]: https://github.com/vllm-project/vllm/tree/main/vllm/engine/async_llm_engine.py
-[vllm/engine/llm_engine.py]: https://github.com/vllm-project/vllm/tree/main/vllm/engine/llm_engine.py
-[vllm/entrypoints/api_server.py]: https://github.com/vllm-project/vllm/tree/main/vllm/entrypoints/api_server.py
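
To illustrate the shared-config pattern described above: components take the same engine-level `VllmConfig` at construction, so a new option added to the config is visible everywhere without threading extra parameters through call chains. A sketch; the class is hypothetical and the constructor shape is an assumption for illustration:

```python
# Sketch of the shared-config pattern; MyComponent is hypothetical, and the
# keyword-only constructor shape is an assumption for illustration.
from vllm.config import VllmConfig

class MyComponent:
    def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
        # Every component receives the same engine-level config object, so a
        # field added to VllmConfig is visible here without changing any
        # constructor signatures along the call chain.
        self.vllm_config = vllm_config
        self.prefix = prefix
```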