Make distinct code and console admonitions so readers are less likely to miss them (#20585)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Harry Mellor
2025-07-08 03:55:28 +01:00
committed by GitHub
parent 31c5d0a1b7
commit af107d5a0e
52 changed files with 192 additions and 162 deletions

@@ -57,7 +57,7 @@ By default, we optimize model inference using CUDA graphs which take up extra me
You can adjust `compilation_config` to achieve a better balance between inference speed and memory usage:
-??? Code
+??? code
```python
from vllm import LLM
@@ -129,7 +129,7 @@ reduce the size of the processed multi-modal inputs, which in turn saves memory.
Here are some examples:
-??? Code
+??? code
```python
from vllm import LLM

@@ -7,7 +7,7 @@ vLLM uses the following environment variables to configure the system:
All environment variables used by vLLM are prefixed with `VLLM_`. **Special care should be taken for Kubernetes users**: please do not name the service as `vllm`, otherwise environment variables set by Kubernetes might conflict with vLLM's environment variables, because [Kubernetes sets environment variables for each service with the capitalized service name as the prefix](https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables).
-??? Code
+??? code
```python
--8<-- "vllm/envs.py:env-vars-definition"