Make distinct code and console admonitions so readers are less likely to miss them (#20585)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-08 03:55:28 +01:00
parent 31c5d0a1b7
commit af107d5a0e
52 changed files with 192 additions and 162 deletions
--- a/docs/features/quantization/modelopt.md
+++ b/docs/features/quantization/modelopt.md
@@ -14,7 +14,7 @@ You can quantize HuggingFace models using the example scripts provided in the Te

 Below is an example showing how to quantize a model using modelopt's PTQ API:

-??? Code
+??? code

    ```python
    import modelopt.torch.quantization as mtq
@@ -50,7 +50,7 @@ with torch.inference_mode():

 The quantized checkpoint can then be deployed with vLLM. As an example, the following code shows how to deploy `nvidia/Llama-3.1-8B-Instruct-FP8`, which is the FP8 quantized checkpoint derived from `meta-llama/Llama-3.1-8B-Instruct`, using vLLM:

-??? Code
+??? code

    ```python
    from vllm import LLM, SamplingParams