Commit Graph

9 Commits

Author SHA1 Message Date
7c1ed0408b fix: recursive _fix_schema to handle nested properties=[] at any depth 2026-04-12 20:52:44 +00:00
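A recursive repair like the one this commit names might look like the sketch below. It is an illustration, not the repository's `_fix_schema`: the function name `fix_schema` is hypothetical, and it assumes the intended fix is to replace an empty `properties` array with an empty object wherever it occurs in the schema.

```python
from typing import Any


def fix_schema(node: Any) -> Any:
    """Return a copy of a JSON-schema-like structure in which every
    `properties: []` has been replaced by `properties: {}`.

    Assumption: only the empty-array case needs repair; a non-empty
    `properties` list is recursed into and otherwise left alone.
    """
    if isinstance(node, dict):
        fixed = {}
        for key, value in node.items():
            if key == "properties" and isinstance(value, list) and not value:
                fixed[key] = {}  # empty array becomes empty object
            else:
                fixed[key] = fix_schema(value)  # descend to any depth
        return fixed
    if isinstance(node, list):
        return [fix_schema(item) for item in node]
    return node  # scalars pass through unchanged
```

Because the function walks both dicts and lists, it also reaches `properties` keys nested under `items`, `anyOf`, and similar containers without listing them explicitly.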
a9911386e0 strip guided_json, guided_regex too; fix parameters.properties array 2026-04-12 20:27:44 +00:00
ccedd3ecee fix: add chat_template_kwargs to STRIP_PARAMS, fix parameters.properties array 2026-04-12 20:23:10 +00:00
c66511e16f fix: handle parameters.properties being array, not just parameters itself 2026-04-12 20:17:06 +00:00
e03e41eb4f fix vLLM/SGLang schema mismatch 2026-04-12 19:57:47 +00:00
7ecbac2dc0 Fix UnboundLocalError in health(), switch from on_event to lifespan 2026-04-12 19:41:08 +00:00
774964a4db Add error dump logging: capture full request+response on 4xx/5xx from SGLang 2026-04-12 19:28:04 +00:00
db9231f796 Fix middleware: handle SGLang startup lag gracefully
- Add /health endpoint that returns 503 until SGLang is ready
- Background task polls SGLang until it accepts connections
- Catch ConnectError/TimeoutException instead of crashing
- Return 503 JSON error when SGLang backend is unavailable
- haproxy health-checks middleware /health, which reflects SGLang state
2026-04-12 19:06:38 +00:00
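The readiness behaviour listed in this commit can be sketched with the standard library alone. Everything named here is hypothetical (`BackendState`, `poll_until_ready`, `health`, and the injected `probe` callable), and the built-in `ConnectionError` and `TimeoutError` stand in for the HTTP client's `ConnectError` and `TimeoutException` mentioned in the message. The real middleware runs the poll as a background task; this sketch runs it synchronously to keep the logic visible.

```python
import time
from typing import Callable, Optional, Tuple


class BackendState:
    """Shared flag: has the backend accepted a connection yet?"""

    def __init__(self) -> None:
        self.ready = False


def poll_until_ready(
    state: BackendState,
    probe: Callable[[], None],
    interval: float = 1.0,
    max_attempts: Optional[int] = None,
) -> int:
    """Call `probe` until it succeeds; return the number of attempts made.

    `probe` is assumed to raise ConnectionError or TimeoutError while the
    backend is still starting up.
    """
    attempts = 0
    while not state.ready:
        attempts += 1
        try:
            probe()
            state.ready = True
        except (ConnectionError, TimeoutError):
            if max_attempts is not None and attempts >= max_attempts:
                break
            time.sleep(interval)
    return attempts


def health(state: BackendState) -> Tuple[int, dict]:
    """Status code and JSON body for a /health endpoint."""
    if state.ready:
        return 200, {"status": "ok"}
    return 503, {"status": "backend starting"}
```

With this shape, a load balancer that health-checks the middleware sees 503 until the first successful probe and 200 afterwards, so it only routes traffic once the backend can serve it.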
bbe40ac8c0 Add middleware to strip vLLM-only params (logprobs/top_logprobs) before forwarding to SGLang
SGLang's Mistral tool-call parser rejects logprobs/top_logprobs with 422,
while vLLM accepts them. Clients like OpenClaw send these by default.

New architecture: haproxy (port N) → middleware (port N+2) → SGLang (port N+1)
The middleware is a thin FastAPI app that strips incompatible params from
chat completion request bodies and passes everything else through unchanged.
2026-04-12 18:58:37 +00:00
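The parameter stripping described in this commit could look roughly like the following. This is a minimal sketch rather than the repository's code: the function name `strip_params` and the constant `VLLM_ONLY_PARAMS` are hypothetical, and it assumes the offending keys sit at the top level of a JSON object. Bodies that are not JSON objects are passed through untouched, in keeping with the "passes everything else through unchanged" goal.

```python
import json

# Hypothetical constant; the two names come from the commit message.
VLLM_ONLY_PARAMS = frozenset({"logprobs", "top_logprobs"})


def strip_params(raw_body: bytes) -> bytes:
    """Return the request body with vLLM-only top-level keys removed."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Not valid JSON (or not valid UTF-8): forward as-is.
        return raw_body
    if not isinstance(payload, dict):
        # Valid JSON but not an object: nothing to strip.
        return raw_body
    cleaned = {k: v for k, v in payload.items() if k not in VLLM_ONLY_PARAMS}
    return json.dumps(cleaned).encode("utf-8")
```

In the proxy itself, a function like this would be applied only to chat completion request bodies before they are forwarded to the backend port.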