GLM-5.x models would either crash or silently drop tool response content when using the OpenAI chat completions API with tools. Two separate bugs were responsible:
## Bug #1: Tool Call Parser Regex Mismatch
### Problem
The `func_detail_regex` in `glm4_moe_tool_parser.py` required a literal newline between the function name and the first argument tag.
The GLM-5.x chat template emits tool calls without that newline: the function name is immediately followed by the first argument tag. The regex therefore failed to match, and tool call extraction failed silently.
### Fix
Changed the regex to use `\\s*` (optional whitespace) instead of a mandatory `\\n`, and made the arguments group optional so that zero-argument calls also match.
Also changed `tc_args_raw` to default to an empty string, which prevents crashes on zero-argument tool calls.
**File:** `glm4_moe_tool_parser.py`
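The difference between the strict and relaxed patterns can be illustrated with a minimal sketch. The tag names (`<tool_call>`, `<arg_key>`, `<arg_value>`), the three patterns, and the `parse_call` helper are assumptions made for illustration; the actual expressions in `glm4_moe_tool_parser.py` may differ.

```python
import re

# Hypothetical, simplified patterns. The tag names are assumptions and the
# real expressions in glm4_moe_tool_parser.py may differ.
STRICT = re.compile(r"<tool_call>([^\n<]+)\n(.*)</tool_call>", re.DOTALL)
RELAXED = re.compile(
    r"<tool_call>([^\n<]+)\s*(<arg_key>.*)?</tool_call>", re.DOTALL
)
ARG = re.compile(
    r"<arg_key>(.*?)</arg_key>\s*<arg_value>(.*?)</arg_value>", re.DOTALL
)


def parse_call(pattern, text):
    """Return (name, args) for one tool call, or None if the pattern misses."""
    match = pattern.search(text)
    if match is None:
        return None
    name = match.group(1).strip()
    # The arguments group is optional, so fall back to an empty string.
    args_raw = match.group(2) or ""
    return name, dict(ARG.findall(args_raw))


with_newline = (
    "<tool_call>get_weather\n<arg_key>city</arg_key>\n"
    "<arg_value>Paris</arg_value>\n</tool_call>"
)
no_newline = (
    "<tool_call>get_weather<arg_key>city</arg_key>"
    "<arg_value>Paris</arg_value></tool_call>"
)
zero_args = "<tool_call>get_time</tool_call>"

print(parse_call(STRICT, no_newline))   # strict pattern misses this call
print(parse_call(RELAXED, no_newline))  # relaxed pattern extracts it
print(parse_call(RELAXED, zero_args))   # zero-argument call still parses
```

In this sketch the relaxed pattern still accepts the newline-separated form, so older output keeps parsing.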
---
## Bug #2: Content Format Detection Failure
### Problem
vLLM's `_detect_content_format()` function analyzes Jinja templates to determine whether message content should be formatted as strings or OpenAI-style arrays.
For GLM-5.x, the template contains a loop `{% for tr in m.content %}` for handling tool responses with multiple results. vLLM saw this loop and detected "openai" format, converting tool message content to:
```json
[{"type": "text", "text": "the actual content"}]
```
However, the GLM template's first branch checks `{% if m.content is string %}` before using that loop. Since arrays are not strings, the template took the wrong branch and the content was lost.
The model would respond: *"The function returned no output"* even though valid content was provided.
In short, the `for` loop made vLLM's detection choose the "openai" format, the resulting array failed the template's `is string` check, and the `else` branch expected objects with a `.name` property, which `{"type": "text", ...}` parts do not have.
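The branch mismatch can be reproduced with a small pure-Python stand-in for the template logic. This is a sketch under assumptions: `render_tool_content` is a hypothetical helper, the `output` field is invented for illustration (only `.name` is mentioned above), and missing fields are treated as empty text, which is how undefined values render in Jinja by default.

```python
def render_tool_content(content):
    """Mimic the template's two branches for one tool message (simplified)."""
    if isinstance(content, str):  # {% if m.content is string %}
        return content
    rendered = []
    for tr in content:  # {% for tr in m.content %}
        # Assumed shape: the else branch reads GLM-style fields such as
        # `name`; `output` is a hypothetical field used for illustration.
        fields = (tr.get("name", ""), tr.get("output", ""))
        rendered.append(" ".join(f for f in fields if f))
    return "".join(rendered)


# "string" format: the content reaches the prompt unchanged.
print(repr(render_tool_content("22 C and sunny")))

# "openai" format: the text part has none of the expected fields,
# so nothing is rendered and the model sees an empty tool result.
print(repr(render_tool_content([{"type": "text", "text": "22 C and sunny"}])))
```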
### Fix
Added an `_is_glm_model()` detection function to `vllm/renderers/hf.py` that forces the "string" content format for GLM models, bypassing the incorrect auto-detection.
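The general shape of such an override might look like the sketch below. It is not the code in `vllm/renderers/hf.py`: both function names, their signatures, the name-based check, and the example model names are assumptions chosen only to show the idea of overriding the detected format.

```python
def is_glm_model_sketch(model_name: str) -> bool:
    """Hypothetical check: treat any model whose name contains 'glm' as GLM."""
    return "glm" in model_name.lower()


def choose_content_format(model_name: str, detected: str) -> str:
    """Force 'string' for GLM models, otherwise keep the detected format."""
    if is_glm_model_sketch(model_name):
        return "string"
    return detected


# Made-up model names, used only to exercise both paths.
print(choose_content_format("example-org/GLM-5-demo", "openai"))
print(choose_content_format("example-org/other-model", "openai"))
```

A name-based check is only one possible heuristic; keying off the model config or the template itself would be alternatives with different trade-offs.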