Make distinct code and console admonitions so readers are less likely to miss them (#20585)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -57,7 +57,7 @@ By default, we optimize model inference using CUDA graphs which take up extra me
 
 You can adjust `compilation_config` to achieve a better balance between inference speed and memory usage:
 
-??? Code
+??? code
 
     ```python
     from vllm import LLM
@@ -129,7 +129,7 @@ reduce the size of the processed multi-modal inputs, which in turn saves memory.
 
 Here are some examples:
 
-??? Code
+??? code
 
     ```python
     from vllm import LLM
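
Both hunks make the same one-line change: the collapsible admonition type `Code` becomes lowercase `code`. As an illustration only, the sketch below applies that rename to a Markdown string. The function name `lowercase_code_admonitions` is hypothetical, and support for the `???+` (expanded by default) opener is an assumption. Nothing here is taken from the commit, which may have been produced differently.

```python
import re

# Matches a collapsible admonition opener such as "??? Code" or "???+ Code"
# at any indentation, capturing everything before the type name.
# Assumption: only the exact type name "Code" needs to be lowercased.
ADMONITION = re.compile(r"^([ \t]*\?\?\?\+?[ \t]+)Code\b", flags=re.MULTILINE)


def lowercase_code_admonitions(markdown: str) -> str:
    """Rewrite '??? Code' admonition openers to '??? code' (hypothetical helper)."""
    return ADMONITION.sub(r"\1code", markdown)


if __name__ == "__main__":
    sample = (
        "Here are some examples:\n"
        "\n"
        "??? Code\n"
        "\n"
        "    ```python\n"
        "    from vllm import LLM\n"
    )
    print(lowercase_code_admonitions(sample))
```

The pattern is anchored to the start of a line and requires the `???` marker, so ordinary prose containing the word "Code" is left alone.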