[Docs] Reduce custom syntax used in docs (#27009)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 04:05:34 +01:00
parent 965c5f4914
commit 4ffd6e8942
65 changed files with 381 additions and 402 deletions
--- a/docs/features/multimodal_inputs.md
+++ b/docs/features/multimodal_inputs.md
@@ -3,7 +3,7 @@
 This page teaches you how to pass multi-modal inputs to [multi-modal models][supported-mm-models] in vLLM.

 !!! note
-    We are actively iterating on multi-modal support. See [this RFC](gh-issue:4194) for upcoming changes,
+    We are actively iterating on multi-modal support. See [this RFC](https://github.com/vllm-project/vllm/issues/4194) for upcoming changes,
    and [open an issue on GitHub](https://github.com/vllm-project/vllm/issues/new/choose) if you have any feedback or feature requests.

 !!! tip
@@ -129,7 +129,7 @@ You can pass a single image to the `'image'` field of the multi-modal dictionary
        print(generated_text)
    ```

-Full example: <gh-file:examples/offline_inference/vision_language.py>
+Full example: [examples/offline_inference/vision_language.py](../../examples/offline_inference/vision_language.py)

 To substitute multiple images inside the same text prompt, you can pass in a list of images instead:

@@ -162,7 +162,7 @@ To substitute multiple images inside the same text prompt, you can pass in a lis
        print(generated_text)
    ```

-Full example: <gh-file:examples/offline_inference/vision_language_multi_image.py>
+Full example: [examples/offline_inference/vision_language_multi_image.py](../../examples/offline_inference/vision_language_multi_image.py)

 If using the [LLM.chat](../models/generative_models.md#llmchat) method, you can pass images directly in the message content using various formats: image URLs, PIL Image objects, or pre-computed embeddings:

@@ -346,13 +346,13 @@ Instead of NumPy arrays, you can also pass `'torch.Tensor'` instances, as shown
    !!! note
        'process_vision_info' is only applicable to Qwen2.5-VL and similar models.

-Full example: <gh-file:examples/offline_inference/vision_language.py>
+Full example: [examples/offline_inference/vision_language.py](../../examples/offline_inference/vision_language.py)

 ### Audio Inputs

 You can pass a tuple `(array, sampling_rate)` to the `'audio'` field of the multi-modal dictionary.

-Full example: <gh-file:examples/offline_inference/audio_language.py>
+Full example: [examples/offline_inference/audio_language.py](../../examples/offline_inference/audio_language.py)

 ### Embedding Inputs

@@ -434,11 +434,11 @@ Our OpenAI-compatible server accepts multi-modal data via the [Chat Completions
    A chat template is **required** to use Chat Completions API.
    For HF format models, the default chat template is defined inside `chat_template.json` or `tokenizer_config.json`.

-    If no default chat template is available, we will first look for a built-in fallback in <gh-file:vllm/transformers_utils/chat_templates/registry.py>.
+    If no default chat template is available, we will first look for a built-in fallback in [vllm/transformers_utils/chat_templates/registry.py](../../vllm/transformers_utils/chat_templates/registry.py).
    If no fallback is available, an error is raised and you have to provide the chat template manually via the `--chat-template` argument.

-    For certain models, we provide alternative chat templates inside <gh-dir:examples>.
-    For example, VLM2Vec uses <gh-file:examples/template_vlm2vec_phi3v.jinja> which is different from the default one for Phi-3-Vision.
+    For certain models, we provide alternative chat templates inside [examples](../../examples).
+    For example, VLM2Vec uses [examples/template_vlm2vec_phi3v.jinja](../../examples/template_vlm2vec_phi3v.jinja) which is different from the default one for Phi-3-Vision.

 ### Image Inputs

@@ -524,7 +524,7 @@ Then, you can use the OpenAI client as follows:
    print("Chat completion output:", chat_response.choices[0].message.content)
    ```

-Full example: <gh-file:examples/online_serving/openai_chat_completion_client_for_multimodal.py>
+Full example: [examples/online_serving/openai_chat_completion_client_for_multimodal.py](../../examples/online_serving/openai_chat_completion_client_for_multimodal.py)

 !!! tip
    Loading from local file paths is also supported on vLLM: You can specify the allowed local media path via `--allowed-local-media-path` when launching the API server/engine,
@@ -595,7 +595,7 @@ Then, you can use the OpenAI client as follows:
    print("Chat completion output from image url:", result)
    ```

-Full example: <gh-file:examples/online_serving/openai_chat_completion_client_for_multimodal.py>
+Full example: [examples/online_serving/openai_chat_completion_client_for_multimodal.py](../../examples/online_serving/openai_chat_completion_client_for_multimodal.py)

 !!! note
    By default, the timeout for fetching videos through HTTP URL is `30` seconds.
@@ -719,7 +719,7 @@ Alternatively, you can pass `audio_url`, which is the audio counterpart of `imag
    print("Chat completion output from audio url:", result)
    ```

-Full example: <gh-file:examples/online_serving/openai_chat_completion_client_for_multimodal.py>
+Full example: [examples/online_serving/openai_chat_completion_client_for_multimodal.py](../../examples/online_serving/openai_chat_completion_client_for_multimodal.py)

 !!! note
    By default, the timeout for fetching audios through HTTP URL is `10` seconds.