# vLLM GLM Tool Parser Patch

Patches vLLM's GLM-4/GLM-5.1 tool parser to fix multiple issues with tool call handling.

## Issues Fixed

### Issue 1: Tool Response Content Ignored (CRITICAL)
**Symptom:** When the model makes a tool call and receives a response, it acts as if the response was empty ("The function returned no output") even though valid content was provided.

**Root Cause:** The `func_detail_regex` required a newline between the function name and the first argument tag, but GLM-5.1's chat template does NOT emit that newline. The regex silently failed to match, tool call extraction failed, and the tool response content was lost along that failure path.
Model output format (no newline after the name):

```
[TOOL_CALL_START]function_name[ARG_KEY]value[ARG_END]...[TOOL_CALL_END]
```
Old regex (broken):

```python
r"\[TOOL_CALL_START\]([^\n]*)\n(.*)\[TOOL_CALL_END\]"  # requires \n after the name
```

Fixed regex:

```python
r"\[TOOL_CALL_START\]\s*([\w.\-]+)\s*((?:\[ARG_KEY\].*)?)\s*\[TOOL_CALL_END\]"
```
The fix:

- Uses `\s*` instead of a mandatory `\n`
- Makes the arguments group optional for zero-argument calls
- Accepts word chars, dots, and hyphens in function names
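A minimal repro of the mismatch, using the two regexes above. The function name and argument values are illustrative; the tag layout follows the model output format shown earlier.

```python
import re

OLD = re.compile(r"\[TOOL_CALL_START\]([^\n]*)\n(.*)\[TOOL_CALL_END\]")
NEW = re.compile(r"\[TOOL_CALL_START\]\s*([\w.\-]+)\s*((?:\[ARG_KEY\].*)?)\s*\[TOOL_CALL_END\]")

# GLM-5.1 emits no newline after the function name (payload values are illustrative)
out = "[TOOL_CALL_START]get_weather[ARG_KEY]Paris[ARG_END][TOOL_CALL_END]"

print(OLD.search(out))  # None: the old regex silently fails to match
m = NEW.search(out)
print(m.group(1))       # get_weather
print(m.group(2))       # [ARG_KEY]Paris[ARG_END]
```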
### Issue 2: Zero-Argument Tool Calls Crash

**Symptom:** `TypeError: 'NoneType' object is not iterable` when a tool call has no arguments.

**Fix:** `tc_args_raw` now defaults to an empty string: `tc_args_raw = tc_detail.group(2) or ""`
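A short sketch of the zero-argument case with the fixed regex (the `list_files` tool name is illustrative):

```python
import re

FIXED = re.compile(r"\[TOOL_CALL_START\]\s*([\w.\-]+)\s*((?:\[ARG_KEY\].*)?)\s*\[TOOL_CALL_END\]")

# Zero-argument call: nothing between the function name and the closing tag
tc_detail = FIXED.search("[TOOL_CALL_START]list_files[TOOL_CALL_END]")
tc_args_raw = tc_detail.group(2) or ""  # defensive default: never None, never iterated as None

print(tc_detail.group(1), repr(tc_args_raw))
```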
### Issue 3: Streaming Path vs. Non-Streaming Path Inconsistency

**Fix:** Both paths now use the same robust extraction helpers for consistency.
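A minimal sketch of the shared-helper pattern. The helper name `extract_tool_call` and its signature are hypothetical, not the parser's actual API; the point is that the streaming path re-runs the same extraction on its growing buffer that the non-streaming path runs once.

```python
import re
from typing import Optional, Tuple

TOOL_CALL_RE = re.compile(
    r"\[TOOL_CALL_START\]\s*([\w.\-]+)\s*((?:\[ARG_KEY\].*)?)\s*\[TOOL_CALL_END\]"
)

def extract_tool_call(buffer: str) -> Optional[Tuple[str, str]]:
    """Hypothetical shared helper: returns (name, raw_args), or None if no complete call yet."""
    m = TOOL_CALL_RE.search(buffer)
    if m is None:
        return None  # streaming caller waits for more tokens; batch caller reports no tool call
    return m.group(1), m.group(2) or ""

# Non-streaming path: parse the full completion once
full = extract_tool_call("[TOOL_CALL_START]ping[TOOL_CALL_END]")

# Streaming path: call the same helper on the growing buffer each step
buf, streamed = "", None
for chunk in ["[TOOL_CALL_START]pi", "ng[TOOL_CALL", "_END]"]:
    buf += chunk
    streamed = extract_tool_call(buf)

print(full, streamed)  # both paths agree once the call is complete
```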
## Files

| File | Description |
|---|---|
| `glm4_moe_tool_parser.py` | Fixed tool parser |
| `utils.py` | Utility functions for partial JSON/tag handling |
| `Dockerfile` | Overlays patched files onto the base image |
| `Jenkinsfile` | CI/CD pipeline for building and pushing |
| `tests/` | Test suite for tool call validation |
## Testing

### Requirements

```shell
pip install httpx regex
```
### Running Tests

```shell
export VLLM_API_BASE="https://api.vultrinference.com/v1"
export VLLM_API_KEY="your-api-key"
export VLLM_MODEL="zai-org/GLM-5.1-FP8"
python tests/test_tool_diagnosis.py
```
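The core regression (Issue 1) concerns whether the model sees tool response content in the follow-up turn. A sketch of the message sequence the tests exercise; the tool name, payload values, and tool call id are illustrative, and the message shapes follow the OpenAI-compatible chat API that vLLM serves.

```python
import json

tool_call_id = "call_0"  # illustrative; real ids come back from the server

messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    # Assistant turn carrying the tool call the server's parser extracted
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": tool_call_id,
            "type": "function",
            "function": {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})},
        }],
    },
    # Tool response turn: the content the unpatched parser effectively dropped
    {"role": "tool", "tool_call_id": tool_call_id, "content": json.dumps({"temp_c": 18})},
]

# With the patch, the follow-up completion should reference the returned data,
# not claim "The function returned no output".
print(messages[-1]["content"])
```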
### Test Cases

| Test | Description |
|---|---|
| `test_simple_tool_response` | Verifies the model can see tool response content |
| `test_without_tools_param` | Tests behavior without the `tools` param in the follow-up request |
| `test_different_content_formats` | String vs. array content formats |
## Deployment

### Jenkins Pipeline

```shell
curl -X POST "https://jenkins.sweetapi.com/job/vllm-glm-build/buildWithParameters" \
  -u "admin:TOKEN" \
  -d "IMAGE_TAG=latest"
```
### Manual Build

```shell
docker build -t atl.vultrcr.com/vllm/vllm-glm51-patched:latest .
docker push atl.vultrcr.com/vllm/vllm-glm51-patched:latest
```
### Images

- Base: `vllm/vllm-openai:glm51-cu130`
- Output: `atl.vultrcr.com/vllm/vllm-glm51-patched:<tag>`
## Related

- vLLM Issue #32829 (streaming long string parameters)
- GLM-5.1 chat template: https://huggingface.co/zai-org/GLM-5.1-FP8/raw/main/chat_template.jinja