vllm-kimi25-eagle/kimi_k2_reasoning_parser.py at master

biondizzle 778f1bfe66 Fix is_reasoning_end to handle multi-turn prompt tokens

Instead of always returning False (which broke tool-call streaming),
use a heuristic: if think-end appears in the token IDs but is
followed by more than 3 tokens (chat-template wrapping such as
<|im_end|>, user markers, etc.), it comes from a prior turn's prompt
and reasoning hasn't started in the current generation. Return False.
If think-end is at or near the end, it comes from generated tokens and
reasoning has ended. Return True.
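
The heuristic described above can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the code from kimi_k2_reasoning_parser.py: the function signature, the `MAX_TRAILING_TOKENS` constant name, and the choice to inspect the last think-end occurrence are all hypothetical. Only the threshold of 3 trailing tokens comes from the commit message.

```python
from typing import Sequence

# The commit message gives the threshold "more than 3 tokens";
# the constant name itself is an assumption made for this sketch.
MAX_TRAILING_TOKENS = 3


def is_reasoning_end(input_ids: Sequence[int], think_end_token_id: int) -> bool:
    """Guess whether reasoning has ended in the current generation.

    Assumption: only the last think-end token in input_ids is inspected.
    """
    # Walk backwards; offset equals the number of tokens after think-end.
    for offset, token_id in enumerate(reversed(input_ids)):
        if token_id == think_end_token_id:
            # Few trailing tokens: think-end was just generated.
            # Many trailing tokens: it belongs to a prior turn's prompt.
            return offset <= MAX_TRAILING_TOKENS
    # No think-end anywhere, so reasoning cannot have ended.
    return False
```

With a hypothetical think-end ID of 99, `is_reasoning_end([1, 2, 99], 99)` returns True, while `is_reasoning_end([1, 99, 2, 3, 4, 5], 99)` returns False because four tokens follow the marker.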

2026-04-14 08:39:18 +00:00

16 KiB
