Add chat template for Llama 4 models (#16428)

Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2025-04-24 17:19:36 -03:00
parent fe92176321
commit 05e1fbfc52
4 changed files with 139 additions and 1 deletions
--- a/docs/source/features/tool_calling.md
+++ b/docs/source/features/tool_calling.md
@@ -152,10 +152,11 @@ Recommended flags: `--tool-call-parser mistral --chat-template examples/tool_cha

 Supported models:

-All Llama 3.1 and 3.2 models should be supported.
+All Llama 3.1, 3.2 and 4 models should be supported.

 * `meta-llama/Llama-3.1-*`
 * `meta-llama/Llama-3.2-*`
+* `meta-llama/Llama-4-*`

 The tool calling that is supported is the [JSON based tool calling](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/#json-based-tool-calling). For [pythonic tool calling](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/text_prompt_format.md#zero-shot-function-calling) introduced by the Llama-3.2 models, see the `pythonic` tool parser below.

@@ -176,6 +177,12 @@ images.

 Recommended flags: `--tool-call-parser llama3_json --chat-template {see_above}`

+vLLM also provides a JSON based chat template for Llama 4:
+* `examples/tool_chat_template_llama4_json.jinja` - this is based on the "official" chat template for the Llama 4
+models, but tweaked so that it works better with vLLM.
+
+For Llama 4, use `--tool-call-parser llama4_json --chat-template examples/tool_chat_template_llama4_json.jinja`.
+
 #### IBM Granite

 Supported models: