biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 19:57:50 +00:00
e03e41eb4f fix vLLM/SGLang schema mismatch
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 19:41:39 +00:00
7ecbac2dc0 Fix UnboundLocalError in health(), switch from on_event to lifespan
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 19:28:31 +00:00
774964a4db Add error dump logging: capture full request+response on 4xx/5xx from SGLang
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 19:06:40 +00:00
db9231f796 Fix middleware: handle SGLang startup lag gracefully
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 18:58:44 +00:00
bbe40ac8c0 Add middleware to strip vLLM-only params (logprobs/top_logprobs) before forwarding to SGLang
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 18:29:15 +00:00
359aa94337 Update README: haproxy proxy layer, /health probe fix, current state
6476c9c12a fix: content-length 16 not 15, remove 'timeout check' (not valid in haproxy 2.4 server line)
725e61d792 fix: haproxy 2.4 compat — use errorfile instead of http-request return
1ddc08c88b haproxy: intercept /health too — instant response based on backend state
7fb373fdfc Add haproxy proxy: /metrics returns 200 empty, everything else proxies to SGLang
biondizzle pushed to metrics at biondizzle/vllm-to-sglang 2026-04-12 18:27:07 +00:00
359aa94337 Update README: haproxy proxy layer, /health probe fix, current state
biondizzle pushed to metrics at biondizzle/vllm-to-sglang 2026-04-12 17:29:09 +00:00
6476c9c12a fix: content-length 16 not 15, remove 'timeout check' (not valid in haproxy 2.4 server line)
biondizzle pushed to metrics at biondizzle/vllm-to-sglang 2026-04-12 17:27:02 +00:00
725e61d792 fix: haproxy 2.4 compat — use errorfile instead of http-request return
biondizzle created branch metrics in biondizzle/vllm-to-sglang 2026-04-12 17:21:11 +00:00
biondizzle pushed to metrics at biondizzle/vllm-to-sglang 2026-04-12 17:21:11 +00:00
1ddc08c88b haproxy: intercept /health too — instant response based on backend state
7fb373fdfc Add haproxy proxy: /metrics returns 200 empty, everything else proxies to SGLang
dd3a981497 Log all received args to /tmp/vllm-shim.log
biondizzle pushed to cmm at biondizzle/vllm 2026-04-12 06:56:54 +00:00
013b73e9b2 Fix managed KV cache: use __cuda_array_interface__ instead of UntypedStorage.from_blob
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 04:16:53 +00:00
513f8bb5dd we dont need to compile aiter
2ac7778c15 Rewrite README: explain the shim, current state, and how to adapt for other models
biondizzle pushed to cmm at biondizzle/vllm 2026-04-12 03:44:18 +00:00
c77342da87 KV cache: prefer CPU placement, zero via CPU not GPU
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 02:56:30 +00:00
71f7fe0881 fix aiter
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 02:47:36 +00:00
b6151ba5db fix aiter
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 02:20:01 +00:00
4d444bebbb use a shim
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-12 00:22:29 +00:00
c86fbe0166 Fix Jenkinsfile: agent any, nightly default, proper quoting
biondizzle created branch master in biondizzle/vllm-to-sglang 2026-04-11 23:39:41 +00:00
biondizzle pushed to master at biondizzle/vllm-to-sglang 2026-04-11 23:39:41 +00:00
d71248d0f6 init commit