[Doc] Convert docs to use colon fences (#12471)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -1,8 +1,8 @@
 # Built-in Extensions
 
-```{toctree}
+:::{toctree}
 :maxdepth: 1
 
 runai_model_streamer
 tensorizer
-```
+:::
@@ -48,6 +48,6 @@ You can read further about CPU buffer memory limiting [here](https://github.com/
 vllm serve /home/meta-llama/Llama-3.2-3B-Instruct --load-format runai_streamer --model-loader-extra-config '{"memory_limit":5368709120}'
 ```
 
-```{note}
+:::{note}
 For further instructions about tunable parameters and additional parameters configurable through environment variables, read the [Environment Variables Documentation](https://github.com/run-ai/runai-model-streamer/blob/master/docs/src/env-vars.md).
-```
+:::
@@ -11,6 +11,6 @@ For more information on CoreWeave's Tensorizer, please refer to
 [CoreWeave's Tensorizer documentation](https://github.com/coreweave/tensorizer). For more information on serializing a vLLM model, as well as a general usage guide to using Tensorizer with vLLM, see
 the [vLLM example script](https://docs.vllm.ai/en/stable/getting_started/examples/offline_inference/tensorize_vllm_model.html).
 
-```{note}
+:::{note}
 Note that to use this feature you will need to install `tensorizer` by running `pip install vllm[tensorizer]`.
-```
+:::
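
The pattern applied throughout these hunks: with `myst-parser`'s `colon_fence` extension enabled, a directive can be fenced with `:::` instead of backticks, which avoids fence ambiguity when the directive body itself contains a backtick code block. A minimal sketch of the converted style (the admonition contents here are illustrative, not from the commit):

````markdown
:::{note}
Colon-fenced admonitions can hold backtick code blocks directly:

```console
pip install vllm[tensorizer]
```
:::
````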
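An aside on the `memory_limit` value carried as context in the second hunk: the limit is given in bytes, and `5368709120` is exactly 5 GiB (5 × 1024³), consistent with the CPU buffer memory limiting described in the line referenced by that hunk's header.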