Stop using title frontmatter and fix doc that can only be reached by search (#20623)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
This commit is contained in:
@@ -1,6 +1,4 @@
|
||||
---
|
||||
title: INT8 W8A8
|
||||
---
|
||||
# INT8 W8A8
|
||||
|
||||
vLLM supports quantizing weights and activations to INT8 for memory savings and inference acceleration.
|
||||
This quantization method is particularly useful for reducing model size while maintaining good performance.
|
||||
|
||||
Reference in New Issue
Block a user