Update nm to rht in doc links + refine fp8 doc (#17678)

Signed-off-by: mgoin <mgoin64@gmail.com>
2025-05-05 20:41:14 -04:00
parent 90bd2ae172
commit 98834fefaa
2 changed files with 16 additions and 72 deletions
--- a/docs/source/serving/offline_inference.md
+++ b/docs/source/serving/offline_inference.md
@@ -95,7 +95,7 @@ You can convert the model checkpoint to a sharded checkpoint using <gh-file:exam

 Quantized models take less memory at the cost of lower precision.

-Statically quantized models can be downloaded from HF Hub (some popular ones are available at [Neural Magic](https://huggingface.co/neuralmagic))
+Statically quantized models can be downloaded from HF Hub (some popular ones are available at [Red Hat AI](https://huggingface.co/RedHatAI))
 and used directly without extra configuration.

 Dynamic quantization is also supported via the `quantization` option -- see [here](#quantization-index) for more details.
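
Note on the hunk above: the two paths it describes differ only in whether the `quantization` option is passed. The sketch below is a minimal illustration, not part of this commit. `build_llm_kwargs` and `generate_once` are hypothetical helpers, and both model IDs are placeholders rather than real repositories. It assumes vLLM's `LLM` constructor accepts `model` and `quantization`, and that `"fp8"` is a valid value for dynamic quantization on supported hardware.

```python
from typing import Optional


def build_llm_kwargs(model: str, quantization: Optional[str] = None) -> dict:
    """Hypothetical helper: assemble keyword arguments for vllm.LLM.

    A statically quantized checkpoint needs only the model ID, because its
    quantization settings ship with the checkpoint. Dynamic quantization is
    requested explicitly through the `quantization` option.
    """
    kwargs = {"model": model}
    if quantization is not None:
        kwargs["quantization"] = quantization
    return kwargs


def generate_once(prompt: str, llm_kwargs: dict) -> str:
    """Hypothetical helper: run one prompt through vLLM offline inference."""
    # Imported lazily so the kwargs logic above can be read and checked
    # on a machine without vLLM or a GPU.
    from vllm import LLM, SamplingParams

    llm = LLM(**llm_kwargs)
    outputs = llm.generate([prompt], SamplingParams(max_tokens=32))
    return outputs[0].outputs[0].text


# Placeholder model IDs (assumptions, not real repositories).
static_kwargs = build_llm_kwargs("RedHatAI/<statically-quantized-model>")
dynamic_kwargs = build_llm_kwargs("<org>/<unquantized-model>", quantization="fp8")
```

Calling `generate_once("Hello", dynamic_kwargs)` would then load the unquantized weights and quantize them at load time, whereas `static_kwargs` relies on the checkpoint's own quantization config.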