[ Kernel ] Enable Dynamic Per Token fp8 (#6547)

This commit is contained in:
Robert Shaw
2024-07-19 19:08:15 -04:00
committed by GitHub
parent 07eb6f19f3
commit 4cc24f01b1
7 changed files with 67 additions and 38 deletions

View File

@@ -3,4 +3,5 @@ Meta-Llama-3-8B-Instruct-FP8.yaml
Meta-Llama-3-8B-Instruct-FP8-compressed-tensors.yaml
Meta-Llama-3-8B-Instruct-INT8-compressed-tensors.yaml
Meta-Llama-3-8B-Instruct-nonuniform-compressed-tensors.yaml
Meta-Llama-3-8B-Instruct-Channelwise-compressed-tensors.yaml
Qwen2-1.5B-Instruct-INT8-compressed-tensors.yaml