Make distinct code and console admonitions so readers are less likely to miss them (#20585)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Author: Harry Mellor
Date: 2025-07-08 03:55:28 +01:00
Committed by: GitHub
Parent: 31c5d0a1b7
Commit: af107d5a0e
52 changed files with 192 additions and 162 deletions

@@ -42,7 +42,7 @@ The Quark quantization process can be listed for 5 steps as below:
Quark uses [Transformers](https://huggingface.co/docs/transformers/en/index)
to fetch model and tokenizer.
-??? Code
+??? code
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
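
For context, the truncated snippet in this hunk fetches a model and tokenizer. A minimal sketch of that step, assuming a Llama checkpoint (the model ID below is illustrative, not the doc's exact value):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumption: any causal-LM checkpoint works; the docs use a Llama model.
MODEL_ID = "meta-llama/Llama-2-70b-chat-hf"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto",
                                             device_map="auto")
model.eval()
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, model_max_length=512)
```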
@@ -65,7 +65,7 @@ Quark uses the [PyTorch Dataloader](https://pytorch.org/tutorials/beginner/basic
to load calibration data. For more details about how to use calibration datasets efficiently, please refer
to [Adding Calibration Datasets](https://quark.docs.amd.com/latest/pytorch/calibration_datasets.html).
-??? Code
+??? code
```python
from datasets import load_dataset
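
The elided body of this snippet loads calibration text and wraps it in a DataLoader. A minimal sketch, assuming a small Pile-style validation slice (the dataset name, subset size, and sequence length are illustrative assumptions):

```python
from datasets import load_dataset
from torch.utils.data import DataLoader

# Assumption: a small Pile validation slice serves as calibration text.
dataset = load_dataset("mit-han-lab/pile-val-backup", split="validation")
text_data = dataset["text"][:128]

# Llama tokenizers ship without a pad token; reuse EOS for batching.
tokenizer.pad_token = tokenizer.eos_token
batched = tokenizer(text_data, return_tensors="pt", padding=True,
                    truncation=True, max_length=512)
calib_dataloader = DataLoader(batched["input_ids"], batch_size=1,
                              drop_last=True)
```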
@@ -98,7 +98,7 @@ kv-cache and the quantization algorithm is AutoSmoothQuant.
AutoSmoothQuant config file for Llama is
`examples/torch/language_modeling/llm_ptq/models/llama/autosmoothquant_config.json`.
-??? Code
+??? code
```python
from quark.torch.quantization import (Config, QuantizationConfig,
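
The rest of this hunk assembles a `Config` that applies an FP8 per-tensor scheme to weights, activations, and kv-cache, and attaches the AutoSmoothQuant settings from the JSON file named above. A sketch under the assumption that the spec class and loader follow the import shown; exact field names may differ across Quark versions:

```python
from quark.torch.quantization import (Config, QuantizationConfig,
                                      FP8E4M3PerTensorSpec,
                                      load_quant_algo_config_from_file)

# Static FP8 per-tensor spec, shared by weights and activations.
FP8_PER_TENSOR_SPEC = FP8E4M3PerTensorSpec(
    observer_method="min_max", is_dynamic=False).to_quantization_spec()

global_quant_config = QuantizationConfig(input_tensors=FP8_PER_TENSOR_SPEC,
                                         weight=FP8_PER_TENSOR_SPEC)

# AutoSmoothQuant settings come from the config file referenced above.
algo_config = load_quant_algo_config_from_file(
    "examples/torch/language_modeling/llm_ptq/models/llama/autosmoothquant_config.json")

quant_config = Config(global_quant_config=global_quant_config,
                      algo_config=algo_config)
```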
@@ -145,7 +145,7 @@ HuggingFace `safetensors`, you can refer to
[HuggingFace format exporting](https://quark.docs.amd.com/latest/pytorch/export/quark_export_hf.html)
for more exporting format details.
-??? Code
+??? code
```python
import torch
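
The omitted body of this snippet exports the quantized model to HuggingFace `safetensors`. A sketch assuming Quark's exporter API; the class names, kv-cache grouping, and output directory below are assumptions, so check the exporting guide linked above for the exact calls:

```python
import torch
from quark.torch import ModelExporter
from quark.torch.export import ExporterConfig, JsonExporterConfig

# Assumption: group the kv-cache projection layers for export, as for Llama.
export_config = ExporterConfig(json_export_config=JsonExporterConfig())
export_config.json_export_config.kv_cache_group = ["*k_proj", "*v_proj"]

EXPORT_DIR = "llama-quantized"  # hypothetical output directory
exporter = ModelExporter(config=export_config, export_dir=EXPORT_DIR)
with torch.inference_mode():
    exporter.export_safetensors_model(model, quant_config=quant_config,
                                      tokenizer=tokenizer)
```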
@@ -176,7 +176,7 @@ for more exporting format details.
Now, you can load and run the Quark quantized model directly through the LLM entrypoint:
-??? Code
+??? code
```python
from vllm import LLM, SamplingParams
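
The usage that follows is the standard vLLM entrypoint. A minimal sketch, assuming the exported checkpoint directory from the previous step; `quantization="quark"` and `kv_cache_dtype="fp8"` match the FP8 kv-cache scheme configured above:

```python
from vllm import LLM, SamplingParams

# Assumption: "llama-quantized" is the path produced by the export step.
llm = LLM(model="llama-quantized",
          kv_cache_dtype="fp8",
          quantization="quark")

sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
outputs = llm.generate(["What is AMD Quark?"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```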