biondizzle
  • Joined on 2025-12-10
biondizzle pushed to master at biondizzle/vllm-glm 2026-04-15 07:27:04 +00:00
2cfd5f5027 fix git
biondizzle pushed to master at biondizzle/vllm-glm 2026-04-15 07:25:27 +00:00
64784741de fix lmcache
biondizzle pushed to master at biondizzle/vllm-glm 2026-04-15 04:43:31 +00:00
0b70c975bd feat: add pip install lmcache for KV cache offloading
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 11:26:21 +00:00
3ee933951c Tool parser: fallback to <|tool_call_begin|> when no section marker
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 10:17:03 +00:00
120c8d9d8d keep everything .py
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 10:16:08 +00:00
6999ed8a3a Fix finish_reason_ variable name in non-streaming path
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 09:51:54 +00:00
3f2708a095 keep everything .py
043f51322f Patch vLLM serving layer to flush reasoning on finish_reason=length
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 08:40:22 +00:00
778f1bfe66 Fix is_reasoning_end to handle multi-turn prompt tokens
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 08:22:12 +00:00
f5266646eb Make is_reasoning_end() always return False
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 07:49:09 +00:00
055b14cb67 fix reasoning parser
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 07:48:30 +00:00
9051c610d2 Fix reasoning parser for multi-turn conversations
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 07:09:58 +00:00
c5e6414daf Fix last empty content delta in Case 5 (post-section-close)
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 06:47:48 +00:00
a404735b2d Fix empty content deltas and leaked section markers in streaming
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 06:18:37 +00:00
fcf8fd134e we need to forward the context using the old way
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 05:48:09 +00:00
d0c9c5c482 we actually need the empty deltas to keep the stream going
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 05:06:33 +00:00
d4568f1d80 more speculative decoding fixes
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 03:49:43 +00:00
d4813de98f fix empty content deltas
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 03:31:01 +00:00
72339bfe20 Document speculative decoding tool parser bug and re-parse-and-diff fix
biondizzle pushed to master at biondizzle/vllm-kimi25-eagle 2026-04-14 03:13:30 +00:00
9be82d3574 add the tool call parser fixes for eagle decode
biondizzle pushed to master at biondizzle/vllm-deepseek-v32-mtp 2026-04-14 00:51:14 +00:00
5182342e03 need partial overlap function from newer utils. just inlined it