vLLM GLM-5.x Tool Calling Patches

Fixes two critical bugs that prevent GLM models from working correctly with OpenAI-compatible tool calling in vLLM.

Summary

GLM-5.x models would either crash or silently drop tool response content when using the OpenAI chat completions API with tools. Two separate bugs were responsible:

  1. Tool parser regex mismatch — the parser expected a newline between the function name and the arguments, but GLM's template does not emit one
  2. Content format detection failure — vLLM incorrectly auto-detected "openai" content format, causing tool response content to be dropped

Bug #1: Tool Parser Regex Mismatch

Problem

The func_detail_regex in glm4_moe_tool_parser.py required a literal newline between the function name and the first argument tag.

GLM-5.x's chat template emits tool calls without that newline: the function name is immediately followed by the first argument tag. The regex never matched, so tool call extraction failed silently.

Fix

Changed the regex to use \s* (optional whitespace) instead of a mandatory \n, and made the arguments group optional to support zero-argument calls:

# Before
r"\[TOOL_START\]([^\n]*)\n(.*)\[TOOL_END\]"

# After  
r"\[TOOL_START\]\s*([\w.\-]+)\s*((?:\[ARG_KEY\].*)?)\s*\[TOOL_END\]"

Also fixed tc_args_raw to default to empty string, preventing crashes on zero-argument tool calls.
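A quick sanity check of the old and new patterns against GLM-5.x-style output. The [TOOL_START], [TOOL_END], and [ARG_KEY] tokens come from the regexes above; the [ARG_VALUE] tag in the sample payload is illustrative, not taken from the parser.

```python
import re

# Old pattern: mandatory newline after the function name.
OLD = re.compile(r"\[TOOL_START\]([^\n]*)\n(.*)\[TOOL_END\]")
# Patched pattern: optional whitespace, optional arguments group.
NEW = re.compile(
    r"\[TOOL_START\]\s*([\w.\-]+)\s*((?:\[ARG_KEY\].*)?)\s*\[TOOL_END\]",
    re.DOTALL,
)

# GLM-5.x output: the name is immediately followed by the first argument tag.
call = "[TOOL_START]get_weather[ARG_KEY]city[ARG_VALUE]Berlin[TOOL_END]"
zero_arg = "[TOOL_START]list_tools[TOOL_END]"

assert OLD.search(call) is None             # old regex fails silently
m = NEW.search(call)
print(m.group(1))                           # function name
print(m.group(2))                           # raw arguments
assert NEW.search(zero_arg).group(2) == ""  # zero-argument calls now match
```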

File: glm4_moe_tool_parser.py


Bug #2: Content Format Detection Failure

Problem

vLLM's _detect_content_format() function analyzes Jinja templates to determine whether message content should be formatted as strings or OpenAI-style arrays.

For GLM-5.x, the template contains a loop {% for tr in m.content %} for handling tool responses with multiple results. vLLM saw this loop and detected "openai" format, converting tool message content to:

[{"type": "text", "text": "the actual content"}]

However, the GLM template's first branch checks {% if m.content is string %} before using that loop. Since arrays are not strings, the template took the wrong branch and the content was lost.

The model would respond: "The function returned no output" even though valid content was provided.
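The conversion that "openai" format triggers can be sketched as follows (the helper name is hypothetical; the message shape follows the OpenAI chat API):

```python
def to_openai_parts(message: dict) -> dict:
    """Wrap string content in a one-element text-part array,
    mimicking what vLLM does when it detects "openai" format."""
    if isinstance(message.get("content"), str):
        message = {
            **message,
            "content": [{"type": "text", "text": message["content"]}],
        }
    return message

msg = {"role": "tool", "tool_call_id": "call_1", "content": "result: 42"}
print(to_openai_parts(msg)["content"])
# [{'type': 'text', 'text': 'result: 42'}]
```

It is this wrapped array, not the original string, that reaches the chat template's `is string` check.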

Root Cause

The template has two branches for tool messages:

{%- if m.content is string %}
    {{ '<observations>' + m.content + '</observations>' }}
{%- else %}
    {% for tr in m.content %}  {# expects objects with a .name property #}
    ...
    {% endfor %}
{% endif %}

vLLM's detection saw the for loop and chose "openai" format. But the is string check failed for arrays, and the else branch expected objects with .name properties that {"type": "text"} objects don't have.
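A minimal jinja2 reconstruction of that branching (heavily simplified relative to the real template) demonstrates the silent drop: Jinja renders missing attributes like `.name` as empty strings rather than raising.

```python
from jinja2 import Template

# Simplified stand-in for the GLM tool-message branch.
tmpl = Template(
    "{% if content is string %}"
    "<observations>{{ content }}</observations>"
    "{% else %}"
    "{% for tr in content %}{{ tr.name }}: {{ tr.content }}{% endfor %}"
    "{% endif %}"
)

# "string" format: content survives.
print(tmpl.render(content="result: 42"))

# "openai" format: the array fails the `is string` test, and the loop
# body looks up .name/.content keys that {"type": "text"} objects lack,
# so the actual text is silently dropped.
print(tmpl.render(content=[{"type": "text", "text": "result: 42"}]))
```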

Fix

Added _is_glm_model() detection function to vllm/renderers/hf.py that forces "string" content format for GLM models, bypassing the incorrect auto-detection:

def _is_glm_model(tokenizer: HfTokenizer, model_config: "ModelConfig") -> bool:
    """Check if this is a GLM model that requires string content format."""
    name_or_path = tokenizer.name_or_path.lower()
    glm_indicators = ["glm-4", "glm-5", "glm4", "glm5", "zai-org/glm"]
    return any(ind in name_or_path for ind in glm_indicators)

Called in _resolve_chat_template_content_format() before auto-detection.

File: vllm_patches/hf.py


Files

File                     Description
glm4_moe_tool_parser.py  Fixed tool parser (regex fix)
utils.py                 Utility functions for partial JSON/tag handling
vllm_patches/hf.py       Patched renderer (content format fix)
Dockerfile               Overlays patched files onto base vLLM image

Deployment

Docker Build

docker build -t your-registry/vllm-glm51-patched:latest .
docker push your-registry/vllm-glm51-patched:latest

Kubernetes

Update your deployment to use the patched image and ensure these vLLM args:

extraArgs:
  - "--tool-call-parser=glm47"
  - "--enable-auto-tool-choice"

Verification

Tool response content is now properly passed to the model:

Model response: The test function was called successfully! It returned the value **42**.
PASS: Model referenced the tool result (42)
