6476c9c12a
fix: Content-Length is 16, not 15; remove 'timeout check' (not valid on a haproxy 2.4 server line)
2026-04-12 17:29:08 +00:00
725e61d792
fix: haproxy 2.4 compat — use errorfile instead of http-request return
...
haproxy 2.4 (Ubuntu 22.04) doesn't support http-request return with
payload/content-type syntax (that's 2.8+). Switch to errorfile-based
stub responses: http-request deny deny_status N + errorfile N path.
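
An errorfile is a raw HTTP response stored on disk, so its Content-Length
header has to match the body byte for byte (a trailing newline counts).
The sketch below shows one way to generate such a file so the length is
computed rather than typed by hand. It is an illustration, not the file
used in this repo: the function name build_errorfile, the header set, and
the 503 body text are assumptions.

```python
def build_errorfile(status: int, reason: str, body: str = "",
                    content_type: str = "text/plain") -> bytes:
    """Return a raw HTTP response suitable for use as an haproxy errorfile.

    Hypothetical helper: the name and the chosen headers are illustrative.
    """
    payload = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        # Computed from the encoded body, so a trailing newline is counted.
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + payload


# Empty 200 stub (as for /metrics) and a 503 stub with an assumed body.
metrics_stub = build_errorfile(200, "OK")
down_stub = build_errorfile(503, "Service Unavailable", "backend down\n")
```

Writing each result to disk in binary mode keeps the CRLF line endings
intact; the path then goes on the matching errorfile line.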
2026-04-12 17:26:45 +00:00
1ddc08c88b
haproxy: intercept /health too — instant response based on backend state
...
SGLang's /health takes ~1.001s, racing the 1s k8s probe timeout.
Now haproxy health-checks SGLang in the background (5s interval, 3s check timeout)
and responds to /health probes instantly: 200 if backend is up, 503 if not.
2026-04-12 17:21:04 +00:00
7fb373fdfc
Add haproxy proxy: /metrics returns 200 empty, everything else proxies to SGLang
...
SGLang now runs on port+1, haproxy binds the original vLLM port.
haproxy serves a stub /metrics endpoint (200, empty body) and
passes all other traffic through to SGLang via raw TCP proxy.
2026-04-12 17:09:58 +00:00
dd3a981497
Log all received args to /tmp/vllm-shim.log
2026-04-12 04:37:24 +00:00
513f8bb5dd
We don't need to compile aiter
2026-04-12 04:16:50 +00:00
2ac7778c15
Rewrite README: explain the shim, current state, and how to adapt for other models
2026-04-12 03:07:43 +00:00
71f7fe0881
fix aiter
2026-04-12 02:56:27 +00:00
b6151ba5db
fix aiter
2026-04-12 02:47:33 +00:00
4d444bebbb
use a shim
2026-04-12 02:19:55 +00:00
c86fbe0166
Fix Jenkinsfile: agent any, nightly default, proper quoting
2026-04-12 00:22:29 +00:00
d71248d0f6
Initial commit
2026-04-11 23:39:36 +00:00