[Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107)

Signed-off-by: yan ma <yan.ma@intel.com>
Author: Yan Ma
Date: 2024-11-19 02:18:05 +08:00
Committed by: GitHub
Parent: 281cc4b3cd
Commit: 6b2d25efc7
7 changed files with 146 additions and 52 deletions

@@ -27,7 +27,7 @@ The table below shows the compatibility of various quantization implementations
   - ✅︎
   - ✅︎
   - ✗
-  -
+  - ✅︎
   - ✅︎
   - ✗
   - ✗
@@ -38,8 +38,8 @@ The table below shows the compatibility of various quantization implementations
   - ✅︎
   - ✅︎
   - ✗
-  -
-  -
+  - ✅︎
+  - ✅︎
   - ✗
   - ✗
 * - Marlin (GPTQ/AWQ/FP8)
@@ -129,4 +129,4 @@ Notes:
 Please note that this compatibility chart may be subject to change as vLLM continues to evolve and expand its support for different hardware platforms and quantization methods.
-For the most up-to-date information on hardware support and quantization methods, please check the `quantization directory <https://github.com/vllm-project/vllm/tree/main/vllm/model_executor/layers/quantization>`_ or consult with the vLLM development team.
+For the most up-to-date information on hardware support and quantization methods, please check the `quantization directory <https://github.com/vllm-project/vllm/tree/main/vllm/model_executor/layers/quantization>`_ or consult with the vLLM development team.
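The substance of this commit is that the AWQ and GPTQ rows of the compatibility table now show support on the xpu backend. A minimal sketch of that change as a lookup (the map and helper below are hypothetical illustrations, not vLLM's actual API; only the awq/gptq-on-xpu entries reflect this commit):

```python
# Hypothetical compatibility map; the "xpu" entries for awq and gptq
# reflect what this commit adds to the documented table.
SUPPORTED_DEVICES = {
    "awq": {"cuda", "xpu"},
    "gptq": {"cuda", "xpu"},
}


def is_supported(method: str, device: str) -> bool:
    """Return True if the quantization method is documented for the device."""
    return device in SUPPORTED_DEVICES.get(method, set())


print(is_supported("awq", "xpu"))   # True after this commit
print(is_supported("gptq", "xpu"))  # True after this commit
```

Before this commit, the corresponding table cells for the xpu column were empty, i.e. the lookup above would have returned False for both methods.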