vllm-kimi25-eagle/kimi_k2_reasoning_parser.py
biondizzle 778f1bfe66 Fix is_reasoning_end to handle multi-turn prompt tokens
Instead of always returning False (which broke tool-call streaming),
use a heuristic: if the think-end token appears in the token IDs but is
followed by more than 3 tokens (chat-template wrapping such as
<|im_end|>, user markers, etc.), it came from a prior turn's prompt
and reasoning has not started in the current generation, so return
False. If the think-end token is at or near the end of the sequence,
it was generated in the current turn and reasoning has ended, so
return True.
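
The heuristic above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual code from kimi_k2_reasoning_parser.py: the token ID constant, the threshold name, and the function signature are all assumptions for the example.

```python
from typing import List

THINK_END_TOKEN_ID = 151668  # hypothetical placeholder for the think-end token ID
MAX_TRAILING_TOKENS = 3      # threshold described in the commit message


def is_reasoning_end(token_ids: List[int]) -> bool:
    """Return True only if the think-end token appears at or near the end
    of token_ids, i.e. it was produced by the current generation rather
    than echoed from a prior turn's prompt."""
    try:
        # Locate the most recent occurrence of the think-end token.
        last_idx = len(token_ids) - 1 - token_ids[::-1].index(THINK_END_TOKEN_ID)
    except ValueError:
        # No think-end token at all: reasoning has not ended.
        return False

    trailing = len(token_ids) - 1 - last_idx
    # Many trailing tokens suggest chat-template wrapping (<|im_end|>,
    # user markers, etc.) from a previous turn's prompt, so reasoning
    # has not ended in the current generation.
    return trailing <= MAX_TRAILING_TOKENS
```

A think-end deep inside the sequence (more than 3 tokens before the end) is treated as prompt residue; one at or within 3 tokens of the end is treated as freshly generated.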
2026-04-14 08:39:18 +00:00

16 KiB