[Doc][2/N] Reorganize Models and Usage sections (#11755)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
 docs/source/features/quantization/index.md | 19 +++++++++++++++++++
@@ -0,0 +1,19 @@
(quantization-index)=

# Quantization

Quantization trades off model precision for a smaller memory footprint, allowing large models to be run on a wider range of devices.
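To illustrate the trade-off described above (a toy sketch only, not how vLLM or any of the listed methods actually quantize weights), a symmetric int8 scheme maps each float to an 8-bit integer, cutting storage roughly 4x versus float32 at the cost of a small reconstruction error:

```python
def quantize(values):
    """Symmetric int8 quantization: scale the largest magnitude to 127.

    Toy sketch for illustration; assumes at least one nonzero value.
    """
    scale = max(abs(v) for v in values) / 127
    return [round(v / scale) for v in values], scale

def dequantize(qvalues, scale):
    """Recover approximate floats from the int8 codes."""
    return [q * scale for q in qvalues]

weights = [0.02, -1.27, 0.8, 0.5]
q, s = quantize(weights)
restored = dequantize(q, s)
# The reconstruction is close but not exact: precision is traded
# for a smaller footprint (8 bits per value instead of 32).
error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(error, 4))
```

Each code fits in one byte (the range [-127, 127]), which is the source of the memory savings; the per-value error is bounded by half the scale step.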
```{toctree}
:caption: Contents
:maxdepth: 1

supported_hardware
auto_awq
bnb
gguf
int8
fp8
fp8_e5m2_kvcache
fp8_e4m3_kvcache
```