Add chat template for Llama 4 models (#16428)

Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
This commit is contained in:
Maximilien de Bayser
2025-04-24 17:19:36 -03:00
committed by GitHub
parent fe92176321
commit 05e1fbfc52
4 changed files with 139 additions and 1 deletions

View File

@@ -152,10 +152,11 @@ Recommended flags: `--tool-call-parser mistral --chat-template examples/tool_cha
Supported models:
All Llama 3.1 and 3.2 models should be supported.
All Llama 3.1, 3.2 and 4 models should be supported.
* `meta-llama/Llama-3.1-*`
* `meta-llama/Llama-3.2-*`
* `meta-llama/Llama-4-*`
The tool calling that is supported is the [JSON based tool calling](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/#json-based-tool-calling). For [pythonic tool calling](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/text_prompt_format.md#zero-shot-function-calling) introduced by the Llama-3.2 models, see the `pythonic` tool parser below.
@@ -176,6 +177,12 @@ images.
Recommended flags: `--tool-call-parser llama3_json --chat-template {see_above}`
VLLM also provides a JSON based chat template for Llama 4:
* `examples/tool_chat_template_llama4_json.jinja` - this is based on the "official" chat template for the Llama 4
models, but tweaked so that it works better with vLLM.
For Llama 4 use `--tool-call-parser llama4_json examples/tool_chat_template_llama4_json.jinja`.
#### IBM Granite
Supported models: