[Doc] Improve GitHub links (#11491)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Cyrus Leung
2024-12-26 06:49:26 +08:00
committed by GitHub
parent b689ada91e
commit 6ad909fdda
31 changed files with 147 additions and 136 deletions


@@ -46,7 +46,7 @@ for output in outputs:
     print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
 ```
-A code example can be found in [examples/offline_inference.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference.py).
+A code example can be found here: <gh-file:examples/offline_inference.py>
 ### `LLM.beam_search`
@@ -103,7 +103,7 @@ for output in outputs:
     print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
 ```
-A code example can be found in [examples/offline_inference_chat.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_chat.py).
+A code example can be found here: <gh-file:examples/offline_inference_chat.py>
 If the model doesn't have a chat template or you want to specify another one,
 you can explicitly pass a chat template:
@@ -120,7 +120,7 @@ outputs = llm.chat(conversation, chat_template=custom_template)
 ## Online Inference
-Our [OpenAI Compatible Server](../serving/openai_compatible_server) provides endpoints that correspond to the offline APIs:
+Our [OpenAI Compatible Server](../serving/openai_compatible_server.md) provides endpoints that correspond to the offline APIs:
 - [Completions API](#completions-api) is similar to `LLM.generate` but only accepts text.
 - [Chat API](#chat-api) is similar to `LLM.chat`, accepting both text and [multi-modal inputs](#multimodal-inputs) for models with a chat template.
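For context on the link change above: the diff replaces full GitHub blob URLs with a `<gh-file:path>` shorthand, which the docs build presumably expands back into a URL of the form seen in the removed lines. A minimal sketch of such a resolver, assuming the `vllm-project/vllm` repository on the `main` branch (the function name is hypothetical; the real docs use their own link-resolution extension):

```python
# Base URL matching the links removed in this commit.
GH_BASE = "https://github.com/vllm-project/vllm/blob/main/"


def resolve_gh_file(link: str) -> str:
    """Expand a gh-file shorthand into a full GitHub blob URL.

    Hypothetical helper illustrating the mapping; the actual docs
    build performs this expansion via its own machinery.
    """
    prefix = "gh-file:"
    if not link.startswith(prefix):
        raise ValueError(f"not a gh-file link: {link!r}")
    # Append the repo-relative path to the blob URL base.
    return GH_BASE + link[len(prefix):]


print(resolve_gh_file("gh-file:examples/offline_inference.py"))
```

Under this assumption, the shorthand in the `+` lines resolves to exactly the URLs present in the corresponding `-` lines.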