[Doc] Improve GitHub links (#11491)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Cyrus Leung
2024-12-26 06:49:26 +08:00
committed by GitHub
parent b689ada91e
commit 6ad909fdda
31 changed files with 147 additions and 136 deletions


@@ -46,7 +46,7 @@ for output in outputs:
     print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
 ```
-A code example can be found in [examples/offline_inference.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference.py).
+A code example can be found here: <gh-file:examples/offline_inference.py>
 ### `LLM.beam_search`
@@ -103,7 +103,7 @@ for output in outputs:
     print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
 ```
-A code example can be found in [examples/offline_inference_chat.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_chat.py).
+A code example can be found here: <gh-file:examples/offline_inference_chat.py>
 If the model doesn't have a chat template or you want to specify another one,
 you can explicitly pass a chat template:
@@ -120,7 +120,7 @@ outputs = llm.chat(conversation, chat_template=custom_template)
 ## Online Inference
-Our [OpenAI Compatible Server](../serving/openai_compatible_server) provides endpoints that correspond to the offline APIs:
+Our [OpenAI Compatible Server](../serving/openai_compatible_server.md) provides endpoints that correspond to the offline APIs:
 - [Completions API](#completions-api) is similar to `LLM.generate` but only accepts text.
 - [Chat API](#chat-api) is similar to `LLM.chat`, accepting both text and [multi-modal inputs](#multimodal-inputs) for models with a chat template.
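For context on the link change above: the diff replaces full GitHub blob URLs with a `<gh-file:path>` shorthand, which the docs build presumably expands back into a URL of the form seen in the removed lines. A minimal sketch of such a resolver, assuming the `vllm-project/vllm` repository on the `main` branch (the function name is hypothetical; the real docs use their own link-resolution extension):

```python
# Base URL matching the links removed in this commit.
GH_BASE = "https://github.com/vllm-project/vllm/blob/main/"


def resolve_gh_file(link: str) -> str:
    """Expand a gh-file shorthand into a full GitHub blob URL.

    Hypothetical helper illustrating the mapping; the actual docs
    build performs this expansion via its own machinery.
    """
    prefix = "gh-file:"
    if not link.startswith(prefix):
        raise ValueError(f"not a gh-file link: {link!r}")
    # Append the repo-relative path to the blob URL base.
    return GH_BASE + link[len(prefix):]


print(resolve_gh_file("gh-file:examples/offline_inference.py"))
```

Under this assumption, the shorthand in the `+` lines resolves to exactly the URLs present in the corresponding `-` lines.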