[ Misc ] non-uniform quantization via compressed-tensors for Llama (#6515)

This commit is contained in:
Robert Shaw
2024-07-18 22:39:18 -04:00
committed by GitHub
parent d4201e06d5
commit dbe5588554
11 changed files with 301 additions and 91 deletions

View File

@@ -2,4 +2,5 @@ Meta-Llama-3-8B-Instruct.yaml
Meta-Llama-3-8B-Instruct-FP8.yaml
Meta-Llama-3-8B-Instruct-FP8-compressed-tensors.yaml
Meta-Llama-3-8B-Instruct-INT8-compressed-tensors.yaml
Meta-Llama-3-8B-Instruct-nonuniform-compressed-tensors.yaml
Qwen2-1.5B-Instruct-INT8-compressed-tensors.yaml