Make distinct code and console admonitions so readers are less likely to miss them (#20585)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -12,7 +12,7 @@ vllm serve unsloth/Llama-3.2-1B-Instruct

 Then query the endpoint to get the latest metrics from the server:

-??? Output
+??? console "Output"

     ```console
     $ curl http://0.0.0.0:8000/metrics
@@ -33,7 +33,7 @@ Then query the endpoint to get the latest metrics from the server:

 The following metrics are exposed:

-??? Code
+??? code

     ```python
     --8<-- "vllm/engine/metrics.py:metrics-definitions"
@@ -60,7 +60,7 @@ To identify the particular CUDA operation that causes the error, you can add `--

 If GPU/CPU communication cannot be established, you can use the following Python script and follow the instructions below to confirm whether the GPU/CPU communication is working correctly.

-??? Code
+??? code

     ```python
     # Test PyTorch NCCL
@@ -170,7 +170,7 @@ WARNING 12-11 14:50:37 multiproc_worker_utils.py:281] CUDA was previously

 or an error from Python that looks like this:

-??? Logs
+??? console "Logs"

     ```console
     RuntimeError:
@@ -214,7 +214,7 @@ if __name__ == '__main__':

 vLLM heavily depends on `torch.compile` to optimize the model for better performance, which introduces the dependency on the `torch.compile` functionality and the `triton` library. By default, we use `torch.compile` to [optimize some functions](gh-pr:10406) in the model. Before running vLLM, you can check if `torch.compile` is working as expected by running the following script:

-??? Code
+??? code

     ```python
     import torch
@@ -10,7 +10,7 @@ The list of data collected by the latest version of vLLM can be found here: <gh-

 Here is an example as of v0.4.0:

-??? Output
+??? console "Output"

     ```json
     {