diff --git a/docs/features/quantization/README.md b/docs/features/quantization/README.md
index 7b5287bad..8b4dcf019 100644
--- a/docs/features/quantization/README.md
+++ b/docs/features/quantization/README.md
@@ -14,7 +14,7 @@ Contents:
 - [INT4 W4A16](int4.md)
 - [INT8 W8A8](int8.md)
 - [FP8 W8A8](fp8.md)
-- [NVIDIA TensorRT Model Optimizer](modelopt.md)
+- [NVIDIA Model Optimizer](modelopt.md)
 - [AMD Quark](quark.md)
 - [Quantized KV Cache](quantized_kvcache.md)
 - [TorchAO](torchao.md)
diff --git a/docs/features/quantization/modelopt.md b/docs/features/quantization/modelopt.md
index c48ccb719..b02d5ba9e 100644
--- a/docs/features/quantization/modelopt.md
+++ b/docs/features/quantization/modelopt.md
@@ -1,6 +1,6 @@
-# NVIDIA TensorRT Model Optimizer
+# NVIDIA Model Optimizer
 
-The [NVIDIA TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) is a library designed to optimize models for inference with NVIDIA GPUs. It includes tools for Post-Training Quantization (PTQ) and Quantization Aware Training (QAT) of Large Language Models (LLMs), Vision Language Models (VLMs), and diffusion models.
+The [NVIDIA Model Optimizer](https://github.com/NVIDIA/Model-Optimizer) is a library designed to optimize models for inference with NVIDIA GPUs. It includes tools for Post-Training Quantization (PTQ) and Quantization Aware Training (QAT) of Large Language Models (LLMs), Vision Language Models (VLMs), and diffusion models.
 
 We recommend installing the library with:
 
@@ -10,7 +10,7 @@ pip install nvidia-modelopt
 
 ## Quantizing HuggingFace Models with PTQ
 
-You can quantize HuggingFace models using the example scripts provided in the TensorRT Model Optimizer repository. The primary script for LLM PTQ is typically found within the `examples/llm_ptq` directory.
+You can quantize HuggingFace models using the example scripts provided in the Model Optimizer repository. The primary script for LLM PTQ is typically found within the `examples/llm_ptq` directory.
Below is an example showing how to quantize a model using modelopt's PTQ API: