[Doc] Improve GitHub links (#11491)

Author: Cyrus Leung
Date: 2024-12-26 06:49:26 +08:00
Committed by: GitHub
Commit: 6ad909fdda (parent: b689ada91e)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

31 changed files with 147 additions and 136 deletions
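The shorthand link targets introduced throughout this diff (`gh-issue:`, `gh-pr:`, `gh-file:`, `gh-dir:`) expand to full `vllm-project/vllm` GitHub URLs when the docs are built. The configuration that registers them is not part of the hunks shown below, so the following `conf.py` sketch only illustrates how such schemes are typically wired up with MyST-Parser's `myst_url_schemes` option; the exact names and URL templates are assumptions.

```python
# docs/source/conf.py -- illustrative sketch only; not taken from this diff.
# Assumes the docs are built with Sphinx + MyST-Parser, which supports
# custom URL schemes via the `myst_url_schemes` option.
myst_url_schemes = {
    # Keep ordinary web links working as-is.
    "http": None,
    "https": None,
    # gh-issue:1234  ->  https://github.com/vllm-project/vllm/issues/1234
    "gh-issue": {
        "url": "https://github.com/vllm-project/vllm/issues/{{path}}",
        "title": "Issue #{{path}}",
    },
    # gh-pr:1234  ->  https://github.com/vllm-project/vllm/pull/1234
    "gh-pr": {
        "url": "https://github.com/vllm-project/vllm/pull/{{path}}",
        "title": "Pull Request #{{path}}",
    },
    # gh-file:path/to/file  ->  blob view on the main branch
    "gh-file": {
        "url": "https://github.com/vllm-project/vllm/blob/main/{{path}}",
        "title": "{{path}}",
    },
    # gh-dir:path/to/dir  ->  tree view on the main branch
    "gh-dir": {
        "url": "https://github.com/vllm-project/vllm/tree/main/{{path}}",
        "title": "{{path}}",
    },
}
```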

View File

@@ -82,7 +82,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
-
-
* - [LoRA](#lora-adapter)
- [✗](https://github.com/vllm-project/vllm/pull/9057)
- [✗](gh-pr:9057)
- ✅
-
-
@@ -168,10 +168,10 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
-
* - <abbr title="Encoder-Decoder Models">enc-dec</abbr>
- ✗
- [✗](https://github.com/vllm-project/vllm/issues/7366)
- [✗](gh-issue:7366)
- ✗
- ✗
- [✗](https://github.com/vllm-project/vllm/issues/7366)
- [✗](gh-issue:7366)
- ✅
- ✅
-
@@ -205,7 +205,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
- ✅
- [✗](https://github.com/vllm-project/vllm/pull/8199)
- [✗](gh-pr:8199)
- ✅
- ✗
- ✅
@@ -244,7 +244,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✗
- ✗
- ✅
- [✗](https://github.com/vllm-project/vllm/issues/8198)
- [✗](gh-issue:8198)
- ✅
-
-
@@ -253,8 +253,8 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
-
* - <abbr title="Multimodal Inputs">mm</abbr>
- ✅
- [✗](https://github.com/vllm-project/vllm/pull/8348)
- [✗](https://github.com/vllm-project/vllm/pull/7199)
- [✗](gh-pr:8348)
- [✗](gh-pr:7199)
- ?
- ?
- ✅
@@ -273,14 +273,14 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
- ✅
- [✗](https://github.com/vllm-project/vllm/issues/6137)
- [✗](gh-issue:6137)
- ✅
- ✗
- ✅
- ✅
- ✅
- ?
- [✗](https://github.com/vllm-project/vllm/issues/7968)
- [✗](gh-issue:7968)
- ✅
-
-
@@ -290,14 +290,14 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
- ✅
- [✗](https://github.com/vllm-project/vllm/issues/6137)
- [✗](gh-issue:6137)
- ✅
- ✗
- ✅
- ✅
- ✅
- ?
- [✗](https://github.com/vllm-project/vllm/issues/7968>)
- [✗](gh-issue:7968)
- ?
- ✅
-
@@ -314,7 +314,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
- ✅
- [✗](https://github.com/vllm-project/vllm/issues/9893)
- [✗](gh-issue:9893)
- ?
- ✅
- ✅
@@ -338,7 +338,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- CPU
- AMD
* - [CP](#chunked-prefill)
- [✗](https://github.com/vllm-project/vllm/issues/2729)
- [✗](gh-issue:2729)
- ✅
- ✅
- ✅
@@ -346,7 +346,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
* - [APC](#apc)
- [✗](https://github.com/vllm-project/vllm/issues/3687)
- [✗](gh-issue:3687)
- ✅
- ✅
- ✅
@@ -359,7 +359,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
- ✅
- [✗](https://github.com/vllm-project/vllm/pull/4830)
- [✗](gh-pr:4830)
- ✅
* - <abbr title="Prompt Adapter">prmpt adptr</abbr>
- ✅
@@ -367,7 +367,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
- ✅
- [✗](https://github.com/vllm-project/vllm/issues/8475)
- [✗](gh-issue:8475)
- ✅
* - [SD](#spec_decode)
- ✅
@@ -439,7 +439,7 @@ Check the '✗' with links to see tracking issue for unsupported feature/hardwar
- ✅
- ✅
- ✅
- [✗](https://github.com/vllm-project/vllm/issues/8477)
- [✗](gh-issue:8477)
- ✅
* - best-of
- ✅

View File

@@ -47,8 +47,7 @@ outputs = llm.generate(
)
```
Check out [examples/multilora_inference.py](https://github.com/vllm-project/vllm/blob/main/examples/multilora_inference.py)
for an example of how to use LoRA adapters with the async engine and how to use more advanced configuration options.
Check out <gh-file:examples/multilora_inference.py> for an example of how to use LoRA adapters with the async engine and how to use more advanced configuration options.
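For readers following that link, here is a minimal offline sketch of attaching a LoRA adapter per request. It uses the synchronous `LLM` class rather than the async engine, and the base model and adapter names are placeholders, not values taken from this diff.

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholder base model and adapter; substitute your own.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True)
sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

outputs = llm.generate(
    ["Write a SQL query listing all active users."],
    sampling_params,
    # LoRARequest(name, unique integer id, path or HF repo of the adapter)
    lora_request=LoRARequest("sql-lora", 1, "yard1/llama-2-7b-sql-lora-test"),
)
print(outputs[0].outputs[0].text)
```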
## Serving LoRA Adapters

View File

@@ -5,7 +5,7 @@
This page teaches you how to pass multi-modal inputs to [multi-modal models](#supported-mm-models) in vLLM.
```{note}
We are actively iterating on multi-modal support. See [this RFC](https://github.com/vllm-project/vllm/issues/4194) for upcoming changes,
We are actively iterating on multi-modal support. See [this RFC](gh-issue:4194) for upcoming changes,
and [open an issue on GitHub](https://github.com/vllm-project/vllm/issues/new/choose) if you have any feedback or feature requests.
```
@@ -60,7 +60,7 @@ for o in outputs:
print(generated_text)
```
A code example can be found in [examples/offline_inference_vision_language.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_vision_language.py).
Full example: <gh-file:examples/offline_inference_vision_language.py>
To substitute multiple images inside the same text prompt, you can pass in a list of images instead:
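The code that normally follows this sentence is elided from the hunk. As a rough sketch (the model name, placeholder tokens, and image files are illustrative assumptions, not taken from the diff):

```python
from PIL import Image
from vllm import LLM

# Illustrative model; any multi-image-capable model works similarly.
llm = LLM(
    model="microsoft/Phi-3.5-vision-instruct",
    trust_remote_code=True,
    limit_mm_per_prompt={"image": 2},  # raise the per-prompt image limit
)

outputs = llm.generate({
    # Placeholder tokens follow this model's convention (<|image_1|>, <|image_2|>).
    "prompt": (
        "<|user|>\n<|image_1|>\n<|image_2|>\n"
        "What do these images have in common?<|end|>\n<|assistant|>\n"
    ),
    "multi_modal_data": {"image": [Image.open("a.jpg"), Image.open("b.jpg")]},
})
print(outputs[0].outputs[0].text)
```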
@@ -91,7 +91,7 @@ for o in outputs:
print(generated_text)
```
A code example can be found in [examples/offline_inference_vision_language_multi_image.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_vision_language_multi_image.py).
Full example: <gh-file:examples/offline_inference_vision_language_multi_image.py>
Multi-image input can be extended to perform video captioning. We show this with [Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct) as it supports videos:
@@ -125,13 +125,13 @@ for o in outputs:
You can pass a list of NumPy arrays directly to the {code}`'video'` field of the multi-modal dictionary
instead of using multi-image input.
Please refer to [examples/offline_inference_vision_language.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_vision_language.py) for more details.
Full example: <gh-file:examples/offline_inference_vision_language.py>
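A rough sketch of the NumPy-array route is shown below; the frame shape and prompt format are assumptions, so consult the linked example for the exact Qwen2-VL prompt.

```python
import numpy as np
from vllm import LLM

llm = LLM(model="Qwen/Qwen2-VL-2B-Instruct")

# Assumed per-frame layout: (height, width, 3) uint8 RGB; 16 frames here.
frames = [np.zeros((224, 224, 3), dtype=np.uint8) for _ in range(16)]

outputs = llm.generate({
    # Placeholder prompt using Qwen2-VL's video tokens (assumption).
    "prompt": (
        "<|im_start|>user\n<|vision_start|><|video_pad|><|vision_end|>"
        "Describe this video.<|im_end|>\n<|im_start|>assistant\n"
    ),
    "multi_modal_data": {"video": frames},
})
print(outputs[0].outputs[0].text)
```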
### Audio
You can pass a tuple {code}`(array, sampling_rate)` to the {code}`'audio'` field of the multi-modal dictionary.
Please refer to [examples/offline_inference_audio_language.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_audio_language.py) for more details.
Full example: <gh-file:examples/offline_inference_audio_language.py>
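A rough sketch of the audio tuple, assuming an Ultravox-style audio model and `librosa` for decoding the file (both are illustrative choices; the real example builds the prompt from the model's chat template):

```python
import librosa
from vllm import LLM

llm = LLM(model="fixie-ai/ultravox-v0_3")  # placeholder audio model

# Decode to a 1-D float array plus its sampling rate.
audio, sampling_rate = librosa.load("question.wav", sr=None)

outputs = llm.generate({
    # Placeholder prompt; in practice, render it via the model's chat template.
    "prompt": "<|user|>\n<|audio|>\nWhat is being asked?<|end|>\n<|assistant|>\n",
    "multi_modal_data": {"audio": (audio, sampling_rate)},
})
print(outputs[0].outputs[0].text)
```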
### Embedding
@@ -208,7 +208,7 @@ A chat template is **required** to use Chat Completions API.
Although most models come with a chat template, for others you have to define one yourself.
The chat template can be inferred based on the documentation on the model's HuggingFace repo.
For example, LLaVA-1.5 (`llava-hf/llava-1.5-7b-hf`) requires a chat template that can be found [here](https://github.com/vllm-project/vllm/blob/main/examples/template_llava.jinja).
For example, LLaVA-1.5 (`llava-hf/llava-1.5-7b-hf`) requires a chat template that can be found here: <gh-file:examples/template_llava.jinja>
```
### Image
@@ -271,7 +271,7 @@ chat_response = client.chat.completions.create(
print("Chat completion output:", chat_response.choices[0].message.content)
```
A full code example can be found in [examples/openai_chat_completion_client_for_multimodal.py](https://github.com/vllm-project/vllm/blob/main/examples/openai_chat_completion_client_for_multimodal.py).
Full example: <gh-file:examples/openai_chat_completion_client_for_multimodal.py>
```{tip}
Loading from local file paths is also supported on vLLM: You can specify the allowed local media path via `--allowed-local-media-path` when launching the API server/engine,
@@ -296,7 +296,7 @@ $ export VLLM_IMAGE_FETCH_TIMEOUT=<timeout>
Instead of {code}`image_url`, you can pass a video file via {code}`video_url`.
You can use [these tests](https://github.com/vllm-project/vllm/blob/main/tests/entrypoints/openai/test_video.py) as reference.
You can use [these tests](gh-file:entrypoints/openai/test_video.py) as reference.
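A rough sketch of a `video_url` request against a running server (the model name and URL are placeholders; the content-part shape mirrors the `image_url` form shown elsewhere on this page):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

chat_response = client.chat.completions.create(
    model="Qwen/Qwen2-VL-2B-Instruct",  # placeholder; use the served model's name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this video."},
            {"type": "video_url", "video_url": {"url": "https://example.com/clip.mp4"}},
        ],
    }],
)
print("Chat completion output:", chat_response.choices[0].message.content)
```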
````{note}
By default, the timeout for fetching videos through HTTP URL is `30` seconds.
@@ -399,7 +399,7 @@ result = chat_completion_from_url.choices[0].message.content
print("Chat completion output from audio url:", result)
```
A full code example can be found in [examples/openai_chat_completion_client_for_multimodal.py](https://github.com/vllm-project/vllm/blob/main/examples/openai_chat_completion_client_for_multimodal.py).
Full example: <gh-file:examples/openai_chat_completion_client_for_multimodal.py>
````{note}
By default, the timeout for fetching audios through HTTP URL is `10` seconds.
@@ -435,7 +435,7 @@ Since VLM2Vec has the same model architecture as Phi-3.5-Vision, we have to expl
to run this model in embedding mode instead of text generation mode.
The custom chat template is completely different from the original one for this model,
and can be found [here](https://github.com/vllm-project/vllm/blob/main/examples/template_vlm2vec.jinja).
and can be found here: <gh-file:examples/template_vlm2vec.jinja>
```
Since the request schema is not defined by OpenAI client, we post a request to the server using the lower-level `requests` library:
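The request itself is elided from this hunk; a rough sketch follows, with the endpoint path, model name, and payload fields mirroring the linked example but best treated as assumptions here.

```python
import requests

response = requests.post(
    "http://localhost:8000/v1/embeddings",
    json={
        "model": "TIGER-Lab/VLM2Vec-Full",  # the served VLM2Vec model (assumption)
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
                {"type": "text", "text": "Represent the given image."},
            ],
        }],
        "encoding_format": "float",
    },
)
response.raise_for_status()
# Print the first few dimensions of the returned embedding vector.
print("Embedding output:", response.json()["data"][0]["embedding"][:8])
```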
@@ -475,7 +475,7 @@ vllm serve MrLight/dse-qwen2-2b-mrl-v1 --task embed \
Like with VLM2Vec, we have to explicitly pass `--task embed`.
Additionally, `MrLight/dse-qwen2-2b-mrl-v1` requires an EOS token for embeddings, which is handled
by [this custom chat template](https://github.com/vllm-project/vllm/blob/main/examples/template_dse_qwen2_vl.jinja).
by a custom chat template: <gh-file:examples/template_dse_qwen2_vl.jinja>
```
```{important}
@@ -483,4 +483,4 @@ Also important, `MrLight/dse-qwen2-2b-mrl-v1` requires a placeholder image of th
example below for details.
```
A full code example can be found in [examples/openai_chat_embedding_client_for_multimodal.py](https://github.com/vllm-project/vllm/blob/main/examples/openai_chat_embedding_client_for_multimodal.py).
Full example: <gh-file:examples/openai_chat_embedding_client_for_multimodal.py>

View File

@@ -4,8 +4,8 @@
```{warning}
Please note that speculative decoding in vLLM is not yet optimized and does
not usually yield inter-token latency reductions for all prompt datasets or sampling parameters. The work
to optimize it is ongoing and can be followed in [this issue.](https://github.com/vllm-project/vllm/issues/4630)
not usually yield inter-token latency reductions for all prompt datasets or sampling parameters.
The work to optimize it is ongoing and can be followed here: <gh-issue:4630>
```
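For context, a minimal sketch of enabling speculative decoding offline; the draft model and argument names reflect the documented usage around this release and may differ in later versions.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="facebook/opt-6.7b",
    speculative_model="facebook/opt-125m",  # small draft model proposing tokens
    num_speculative_tokens=5,               # tokens proposed per step
)
outputs = llm.generate("The future of AI is", SamplingParams(temperature=0.8))
print(outputs[0].outputs[0].text)
```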
```{warning}
@@ -176,7 +176,7 @@ speculative decoding, breaking down the guarantees into three key areas:
> distribution. [View Test Code](https://github.com/vllm-project/vllm/blob/47b65a550866c7ffbd076ecb74106714838ce7da/tests/samplers/test_rejection_sampler.py#L252)
> - **Greedy Sampling Equality**: Confirms that greedy sampling with speculative decoding matches greedy sampling
> without it. This verifies that vLLM's speculative decoding framework, when integrated with the vLLM forward pass and the vLLM rejection sampler,
> provides a lossless guarantee. Almost all of the tests in [this directory](https://github.com/vllm-project/vllm/tree/b67ae00cdbbe1a58ffc8ff170f0c8d79044a684a/tests/spec_decode/e2e)
> provides a lossless guarantee. Almost all of the tests in <gh-dir:tests/spec_decode/e2e>
> verify this property using [this assertion implementation](https://github.com/vllm-project/vllm/blob/b67ae00cdbbe1a58ffc8ff170f0c8d79044a684a/tests/spec_decode/e2e/conftest.py#L291)
3. **vLLM Logprob Stability**
@@ -202,4 +202,4 @@ For mitigation strategies, please refer to the FAQ entry *Can the output of a pr
- [A Hacker's Guide to Speculative Decoding in vLLM](https://www.youtube.com/watch?v=9wNAgpX6z_4)
- [What is Lookahead Scheduling in vLLM?](https://docs.google.com/document/d/1Z9TvqzzBPnh5WHcRwjvK2UEeFeq5zMZb5mFE8jR0HCs/edit#heading=h.1fjfb0donq5a)
- [Information on batch expansion](https://docs.google.com/document/d/1T-JaS2T1NRfdP51qzqpyakoCXxSXTtORppiwaj5asxA/edit#heading=h.kk7dq05lc6q8)
- [Dynamic speculative decoding](https://github.com/vllm-project/vllm/issues/4565)
- [Dynamic speculative decoding](gh-issue:4565)

View File

@@ -131,7 +131,7 @@ completion = client.chat.completions.create(
print(completion.choices[0].message.content)
```
The complete code of the examples can be found on [examples/openai_chat_completion_structured_outputs.py](https://github.com/vllm-project/vllm/blob/main/examples/openai_chat_completion_structured_outputs.py).
Full example: <gh-file:examples/openai_chat_completion_structured_outputs.py>
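One of the options covered there, shown as a brief sketch (the served model name is a placeholder; `guided_choice` is passed through the client's `extra_body` mechanism):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-3B-Instruct",  # placeholder; use the served model's name
    messages=[{"role": "user", "content": "Classify this sentiment: vLLM is wonderful!"}],
    extra_body={"guided_choice": ["positive", "negative"]},
)
print(completion.choices[0].message.content)
```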
## Experimental Automatic Parsing (OpenAI API)
@@ -257,4 +257,4 @@ outputs = llm.generate(
print(outputs[0].outputs[0].text)
```
A complete example with all options can be found in [examples/offline_inference_structured_outputs.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_structured_outputs.py).
Full example: <gh-file:examples/offline_inference_structured_outputs.py>
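And the offline counterpart, sketched with `GuidedDecodingParams`; the model name and choice values are illustrative.

```python
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

guided = GuidedDecodingParams(choice=["Positive", "Negative"])
llm = LLM(model="Qwen/Qwen2.5-3B-Instruct")  # placeholder model

outputs = llm.generate(
    "Classify this sentiment: vLLM is wonderful!",
    sampling_params=SamplingParams(guided_decoding=guided),
)
print(outputs[0].outputs[0].text)
```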

View File

@@ -4,7 +4,7 @@ vLLM collects anonymous usage data by default to help the engineering team bette
## What data is collected?
You can see the up to date list of data collected by vLLM in the [usage_lib.py](https://github.com/vllm-project/vllm/blob/main/vllm/usage/usage_lib.py).
The list of data collected by the latest version of vLLM can be found here: <gh-file:vllm/usage/usage_lib.py>
Here is an example as of v0.4.0: