Make distinct code and console admonitions so readers are less likely to miss them (#20585)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-08 03:55:28 +01:00
parent 31c5d0a1b7
commit af107d5a0e
52 changed files with 192 additions and 162 deletions
--- a/docs/getting_started/installation/cpu.md
+++ b/docs/getting_started/installation/cpu.md
@@ -76,7 +76,7 @@ Currently, there are no pre-built CPU wheels.

 ### Build image from source

-??? Commands
+??? console "Commands"

    ```bash
    docker build -f docker/Dockerfile.cpu \
@@ -149,7 +149,7 @@ vllm serve facebook/opt-125m

 - If using vLLM CPU backend on a machine with hyper-threading, it is recommended to bind only one OpenMP thread on each physical CPU core using `VLLM_CPU_OMP_THREADS_BIND` or using auto thread binding feature by default. On a hyper-threading enabled platform with 16 logical CPU cores / 8 physical CPU cores:

-??? Commands
+??? console "Commands"

    ```console
    $ lscpu -e # check the mapping between logical CPU cores and physical CPU cores
--- a/docs/getting_started/installation/gpu/rocm.inc.md
+++ b/docs/getting_started/installation/gpu/rocm.inc.md
@@ -95,7 +95,7 @@ Currently, there are no pre-built ROCm wheels.

 4. Build vLLM. For example, vLLM on ROCM 6.3 can be built with the following steps:

-    ??? Commands
+    ??? console "Commands"

        ```bash
        pip install --upgrade pip
@@ -206,7 +206,7 @@ DOCKER_BUILDKIT=1 docker build \

 To run the above docker image `vllm-rocm`, use the below command:

-??? Command
+??? console "Command"

    ```bash
    docker run -it \
--- a/docs/getting_started/installation/intel_gaudi.md
+++ b/docs/getting_started/installation/intel_gaudi.md
@@ -237,7 +237,7 @@ As an example, if a request of 3 sequences, with max sequence length of 412 come

 Warmup is an optional, but highly recommended step occurring before vLLM server starts listening. It executes a forward pass for each bucket with dummy data. The goal is to pre-compile all graphs and not incur any graph compilation overheads within bucket boundaries during server runtime. Each warmup step is logged during vLLM startup:

-??? Logs
+??? console "Logs"

    ```text
    INFO 08-01 22:26:47 hpu_model_runner.py:1066] [Warmup][Prompt][1/24] batch_size:4 seq_len:1024 free_mem:79.16 GiB
@@ -286,7 +286,7 @@ When there's large amount of requests pending, vLLM scheduler will attempt to fi

 Each described step is logged by vLLM server, as follows (negative values correspond to memory being released):

-??? Logs
+??? console "Logs"

    ```text
    INFO 08-02 17:37:44 hpu_model_runner.py:493] Prompt bucket config (min, step, max_warmup) bs:[1, 32, 4], seq:[128, 128, 1024]