[Docs] Adding links and intro to Speculators and LLM Compressor (#32849)
Signed-off-by: Aidan Reilly <aireilly@redhat.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -2,7 +2,10 @@
Quantization trades off model precision for a smaller memory footprint, allowing large models to be run on a wider range of devices.
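The memory saving can be sketched with back-of-the-envelope arithmetic (an illustrative estimate only; the 7B parameter count is a hypothetical example, and real serving memory also includes activations, the KV cache, and runtime overheads):

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight-only memory in GiB at a given precision."""
    return num_params * bits_per_weight / 8 / 2**30

params = 7e9  # hypothetical 7B-parameter model
fp16 = weight_memory_gib(params, 16)  # unquantized half precision
int4 = weight_memory_gib(params, 4)   # 4-bit quantized weights
print(f"FP16: {fp16:.1f} GiB, INT4: {int4:.1f} GiB")
```

Going from 16-bit to 4-bit weights cuts weight memory roughly 4x, which is what lets a model that needs a data-center GPU at FP16 fit on a consumer card.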
Contents:
!!! tip
    To get started with quantization, see [LLM Compressor](llm_compressor.md), a library for optimizing models for deployment with vLLM that supports FP8, INT8, INT4, and other quantization formats.
vLLM supports the following quantization formats:
- [AutoAWQ](auto_awq.md)
- [BitsAndBytes](bnb.md)