[Docs] Reduce custom syntax used in docs (#27009)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Author: Harry Mellor
Date: 2025-10-17 04:05:34 +01:00
Committed by GitHub
parent 965c5f4914
commit 4ffd6e8942
65 changed files with 381 additions and 402 deletions

@@ -4,19 +4,19 @@ vLLM is a Python library that supports the following CPU variants. Select your C
=== "Intel/AMD x86"
--8<-- "docs/getting_started/installation/cpu/x86.inc.md:installation"
--8<-- "docs/getting_started/installation/cpu.x86.inc.md:installation"
=== "ARM AArch64"
--8<-- "docs/getting_started/installation/cpu/arm.inc.md:installation"
--8<-- "docs/getting_started/installation/cpu.arm.inc.md:installation"
=== "Apple silicon"
--8<-- "docs/getting_started/installation/cpu/apple.inc.md:installation"
--8<-- "docs/getting_started/installation/cpu.apple.inc.md:installation"
=== "IBM Z (S390X)"
--8<-- "docs/getting_started/installation/cpu/s390x.inc.md:installation"
--8<-- "docs/getting_started/installation/cpu.s390x.inc.md:installation"
## Requirements
@@ -24,19 +24,19 @@ vLLM is a Python library that supports the following CPU variants. Select your C
=== "Intel/AMD x86"
--8<-- "docs/getting_started/installation/cpu/x86.inc.md:requirements"
--8<-- "docs/getting_started/installation/cpu.x86.inc.md:requirements"
=== "ARM AArch64"
--8<-- "docs/getting_started/installation/cpu/arm.inc.md:requirements"
--8<-- "docs/getting_started/installation/cpu.arm.inc.md:requirements"
=== "Apple silicon"
--8<-- "docs/getting_started/installation/cpu/apple.inc.md:requirements"
--8<-- "docs/getting_started/installation/cpu.apple.inc.md:requirements"
=== "IBM Z (S390X)"
--8<-- "docs/getting_started/installation/cpu/s390x.inc.md:requirements"
--8<-- "docs/getting_started/installation/cpu.s390x.inc.md:requirements"
## Set up using Python
@@ -52,19 +52,19 @@ Currently, there are no pre-built CPU wheels.
=== "Intel/AMD x86"
--8<-- "docs/getting_started/installation/cpu/x86.inc.md:build-wheel-from-source"
--8<-- "docs/getting_started/installation/cpu.x86.inc.md:build-wheel-from-source"
=== "ARM AArch64"
--8<-- "docs/getting_started/installation/cpu/arm.inc.md:build-wheel-from-source"
--8<-- "docs/getting_started/installation/cpu.arm.inc.md:build-wheel-from-source"
=== "Apple silicon"
--8<-- "docs/getting_started/installation/cpu/apple.inc.md:build-wheel-from-source"
--8<-- "docs/getting_started/installation/cpu.apple.inc.md:build-wheel-from-source"
=== "IBM Z (s390x)"
--8<-- "docs/getting_started/installation/cpu/s390x.inc.md:build-wheel-from-source"
--8<-- "docs/getting_started/installation/cpu.s390x.inc.md:build-wheel-from-source"
## Set up using Docker
@@ -72,24 +72,24 @@ Currently, there are no pre-built CPU wheels.
=== "Intel/AMD x86"
--8<-- "docs/getting_started/installation/cpu/x86.inc.md:pre-built-images"
--8<-- "docs/getting_started/installation/cpu.x86.inc.md:pre-built-images"
### Build image from source
=== "Intel/AMD x86"
--8<-- "docs/getting_started/installation/cpu/x86.inc.md:build-image-from-source"
--8<-- "docs/getting_started/installation/cpu.x86.inc.md:build-image-from-source"
=== "ARM AArch64"
--8<-- "docs/getting_started/installation/cpu/arm.inc.md:build-image-from-source"
--8<-- "docs/getting_started/installation/cpu.arm.inc.md:build-image-from-source"
=== "Apple silicon"
--8<-- "docs/getting_started/installation/cpu/arm.inc.md:build-image-from-source"
--8<-- "docs/getting_started/installation/cpu.arm.inc.md:build-image-from-source"
=== "IBM Z (S390X)"
--8<-- "docs/getting_started/installation/cpu/s390x.inc.md:build-image-from-source"
--8<-- "docs/getting_started/installation/cpu.s390x.inc.md:build-image-from-source"
## Related runtime environment variables
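
Review note: every hunk in this file applies the same rewrite, moving each snippet include from a nested `cpu/<name>.inc.md` path to a flat `cpu.<name>.inc.md` path. A minimal sketch of that rewrite follows. It assumes the pattern is limited to the `cpu` and `gpu` installation directories seen in this diff; the function name `flatten_include_path` is hypothetical and not part of the commit.

```python
import re

# Matches the old nested include path, e.g.
# "docs/getting_started/installation/cpu/x86.inc.md".
# Assumption: only the "cpu" and "gpu" directories are affected.
OLD_INCLUDE = re.compile(
    r"(docs/getting_started/installation/)(cpu|gpu)/([\w-]+\.inc\.md)"
)


def flatten_include_path(line: str) -> str:
    """Rewrite 'installation/cpu/x86.inc.md' to 'installation/cpu.x86.inc.md'."""
    return OLD_INCLUDE.sub(r"\1\2.\3", line)


old_line = '--8<-- "docs/getting_started/installation/cpu/x86.inc.md:installation"'
new_line = flatten_include_path(old_line)
print(new_line)
```

Lines that already use the flat form no longer contain `cpu/` or `gpu/` after `installation/`, so running the function a second time leaves them unchanged.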

@@ -157,7 +157,7 @@ See [deployment-docker-pre-built-image][deployment-docker-pre-built-image] for i
### Build image from source
You can use <gh-file:docker/Dockerfile.tpu> to build a Docker image with TPU support.
You can use [docker/Dockerfile.tpu](../../../docker/Dockerfile.tpu) to build a Docker image with TPU support.
```bash
docker build -f docker/Dockerfile.tpu -t vllm-tpu .

@@ -11,7 +11,7 @@ vLLM contains pre-compiled C++ and CUDA (12.8) binaries.
# --8<-- [start:set-up-using-python]
!!! note
PyTorch installed via `conda` statically links the `NCCL` library, which can cause issues when vLLM tries to use `NCCL`. See <gh-issue:8420> for more details.
PyTorch installed via `conda` statically links the `NCCL` library, which can cause issues when vLLM tries to use `NCCL`. See <https://github.com/vllm-project/vllm/issues/8420> for more details.
In order to be performant, vLLM has to compile many CUDA kernels. The compilation unfortunately introduces binary incompatibility with other CUDA versions and PyTorch versions, even for the same PyTorch version with different build configurations.

@@ -4,15 +4,15 @@ vLLM is a Python library that supports the following GPU variants. Select your G
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:installation"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:installation"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:installation"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:installation"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:installation"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:installation"
## Requirements
@@ -24,15 +24,15 @@ vLLM is a Python library that supports the following GPU variants. Select your G
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:requirements"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:requirements"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:requirements"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:requirements"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:requirements"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:requirements"
## Set up using Python
@@ -42,29 +42,29 @@ vLLM is a Python library that supports the following GPU variants. Select your G
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:set-up-using-python"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:set-up-using-python"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:set-up-using-python"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:set-up-using-python"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:set-up-using-python"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:set-up-using-python"
### Pre-built wheels
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:pre-built-wheels"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:pre-built-wheels"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:pre-built-wheels"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:pre-built-wheels"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:pre-built-wheels"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:pre-built-wheels"
[](){ #build-from-source }
@@ -72,15 +72,15 @@ vLLM is a Python library that supports the following GPU variants. Select your G
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:build-wheel-from-source"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:build-wheel-from-source"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:build-wheel-from-source"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:build-wheel-from-source"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:build-wheel-from-source"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:build-wheel-from-source"
## Set up using Docker
@@ -88,40 +88,40 @@ vLLM is a Python library that supports the following GPU variants. Select your G
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:pre-built-images"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:pre-built-images"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:pre-built-images"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:pre-built-images"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:pre-built-images"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:pre-built-images"
### Build image from source
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:build-image-from-source"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:build-image-from-source"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:build-image-from-source"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:build-image-from-source"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:build-image-from-source"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:build-image-from-source"
## Supported features
=== "NVIDIA CUDA"
--8<-- "docs/getting_started/installation/gpu/cuda.inc.md:supported-features"
--8<-- "docs/getting_started/installation/gpu.cuda.inc.md:supported-features"
=== "AMD ROCm"
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:supported-features"
--8<-- "docs/getting_started/installation/gpu.rocm.inc.md:supported-features"
=== "Intel XPU"
--8<-- "docs/getting_started/installation/gpu/xpu.inc.md:supported-features"
--8<-- "docs/getting_started/installation/gpu.xpu.inc.md:supported-features"

@@ -146,7 +146,7 @@ Building the Docker image from source is the recommended way to use vLLM with RO
#### (Optional) Build an image with ROCm software stack
Build a Docker image from <gh-file:docker/Dockerfile.rocm_base>, which sets up the ROCm software stack needed by vLLM.
Build a Docker image from [docker/Dockerfile.rocm_base](https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile.rocm_base), which sets up the ROCm software stack needed by vLLM.
**This step is optional, as the rocm_base image is usually prebuilt and stored on [Docker Hub](https://hub.docker.com/r/rocm/vllm-dev) under the tag `rocm/vllm-dev:base` to speed up the user experience.**
If you choose to build this rocm_base image yourself, the steps are as follows.
@@ -170,7 +170,7 @@ DOCKER_BUILDKIT=1 docker build \
#### Build an image with vLLM
First, build a docker image from <gh-file:docker/Dockerfile.rocm> and launch a docker container from the image.
First, build a docker image from [docker/Dockerfile.rocm](https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile.rocm) and launch a docker container from the image.
It is important that the user kicks off the docker build using buildkit. Either the user put `DOCKER_BUILDKIT=1` as environment variable when calling docker build command, or the user needs to set up buildkit in the docker daemon configuration /etc/docker/daemon.json as follows and restart the daemon:
```bash
@@ -181,10 +181,10 @@ It is important that the user kicks off the docker build using buildkit. Either
}
```
<gh-file:docker/Dockerfile.rocm> uses ROCm 6.3 by default, but also supports ROCm 5.7, 6.0, 6.1, and 6.2, in older vLLM branches.
[docker/Dockerfile.rocm](https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile.rocm) uses ROCm 6.3 by default, but also supports ROCm 5.7, 6.0, 6.1, and 6.2, in older vLLM branches.
It provides flexibility to customize the build of docker image using the following arguments:
- `BASE_IMAGE`: specifies the base image used when running `docker build`. The default value `rocm/vllm-dev:base` is an image published and maintained by AMD. It is built using <gh-file:docker/Dockerfile.rocm_base>.
- `BASE_IMAGE`: specifies the base image used when running `docker build`. The default value `rocm/vllm-dev:base` is an image published and maintained by AMD. It is built using [docker/Dockerfile.rocm_base](https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile.rocm_base).
- `ARG_PYTORCH_ROCM_ARCH`: allows overriding the gfx architecture values from the base Docker image.
Their values can be passed in when running `docker build` with `--build-arg` options.
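
Review note: the `--build-arg` usage described above can be sketched as a small helper that assembles the `docker build` command line. This is only an illustration under stated assumptions: the helper name `rocm_build_command`, the image tag `vllm-rocm`, and the example gfx values are hypothetical and do not come from this diff; the Dockerfile path and the two argument names do.

```python
import shlex


def rocm_build_command(build_args: dict[str, str], tag: str = "vllm-rocm") -> list[str]:
    """Assemble a `docker build` argv for docker/Dockerfile.rocm.

    `tag` is a hypothetical image name chosen for this example.
    """
    cmd = ["docker", "build", "-f", "docker/Dockerfile.rocm"]
    for name, value in build_args.items():
        # Each build argument becomes its own `--build-arg NAME=VALUE` pair.
        cmd += ["--build-arg", f"{name}={value}"]
    cmd += ["-t", tag, "."]
    return cmd


cmd = rocm_build_command(
    {
        "BASE_IMAGE": "rocm/vllm-dev:base",          # default named in the docs
        "ARG_PYTORCH_ROCM_ARCH": "gfx90a;gfx942",    # example values, an assumption
    }
)
# BuildKit is enabled through the environment, as the docs require.
print("DOCKER_BUILDKIT=1 " + shlex.join(cmd))
```

Building the command as a list and quoting it with `shlex.join` keeps values that contain shell metacharacters, such as the `;` separator in the architecture list, intact when the line is pasted into a shell.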

@@ -75,7 +75,7 @@ vllm serve facebook/opt-13b \
-tp=8
```
By default, a Ray instance is launched automatically if no existing one is detected in the system, with `num-gpus` equal to `parallel_config.world_size`. We recommend properly starting a Ray cluster before execution, referring to the <gh-file:examples/online_serving/run_cluster.sh> helper script.
By default, a Ray instance is launched automatically if no existing one is detected in the system, with `num-gpus` equal to `parallel_config.world_size`. We recommend properly starting a Ray cluster before execution, referring to the [examples/online_serving/run_cluster.sh](https://github.com/vllm-project/vllm/blob/main/examples/online_serving/run_cluster.sh) helper script.
# --8<-- [end:supported-features]
# --8<-- [start:distributed-backend]
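
Review note: the custom-syntax replacements in the hunks above (`<gh-file:...>` becoming a Markdown link, `<gh-issue:...>` becoming an autolinked URL) follow a mechanical pattern. The sketch below reproduces the absolute-URL form only; the TPU hunk uses a relative link instead, which this sketch does not cover. The function name `expand_custom_links` is hypothetical and is not claimed to be how the commit was produced.

```python
import re

# Repository URL as it appears in the rewritten links in this diff.
REPO_URL = "https://github.com/vllm-project/vllm"


def expand_custom_links(text: str) -> str:
    """Replace <gh-file:PATH> and <gh-issue:N> with standard Markdown."""
    # <gh-file:PATH> -> [PATH](REPO_URL/blob/main/PATH)
    text = re.sub(
        r"<gh-file:([^>]+)>",
        lambda m: f"[{m.group(1)}]({REPO_URL}/blob/main/{m.group(1)})",
        text,
    )
    # <gh-issue:N> -> <REPO_URL/issues/N>
    text = re.sub(
        r"<gh-issue:(\d+)>",
        lambda m: f"<{REPO_URL}/issues/{m.group(1)}>",
        text,
    )
    return text


file_result = expand_custom_links("First, build a docker image from <gh-file:docker/Dockerfile.rocm>.")
issue_result = expand_custom_links("See <gh-issue:8420> for more details.")
print(file_result)
print(issue_result)
```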