# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project

from collections.abc import Sequence

from vllm.entrypoints.openai.protocol import DeltaMessage
from vllm.reasoning.basic_parsers import BaseThinkingReasoningParser

class DeepSeekR1ReasoningParser(BaseThinkingReasoningParser):
    """Reasoning parser for the DeepSeek R1 model.

    The DeepSeek R1 model uses <think>...</think> tokens to denote reasoning
    text. This parser extracts the reasoning content from the model output.
    """

    @property
    def start_token(self) -> str:
        """The token that starts reasoning content."""
        return "<think>"

    @property
    def end_token(self) -> str:
        """The token that ends reasoning content."""
        return "</think>"

    def extract_reasoning_streaming(
        self,
        previous_text: str,
        current_text: str,
        delta_text: str,
        previous_token_ids: Sequence[int],
        current_token_ids: Sequence[int],
        delta_token_ids: Sequence[int],
    ) -> DeltaMessage | None:
        # Delegate the common start/end-token bookkeeping to the base parser.
        base_delta = super().extract_reasoning_streaming(
            previous_text,
            current_text,
            delta_text,
            previous_token_ids,
            current_token_ids,
            delta_token_ids,
        )

        # If the base parser produced nothing, or the explicit start token
        # has already been seen (in history or in this delta), the base
        # result is authoritative — return it unchanged.
        start_seen = (
            self.start_token_id in previous_token_ids
            or self.start_token_id in delta_token_ids
        )
        if base_delta is None or start_seen:
            return base_delta

        # From here on, no start token was ever emitted: treat the stream
        # as implicitly inside a reasoning section and classify the delta
        # by where (if anywhere) the end token appears.
        if self.end_token_id in delta_token_ids:
            # End token arrives inside this delta, possibly with trailing
            # tokens: split the delta into reasoning and regular content.
            split_at = delta_text.find(self.end_token)
            trailing = delta_text[split_at + len(self.end_token) :]
            return DeltaMessage(
                reasoning=delta_text[:split_at],
                # Emit None rather than an empty content string.
                content=trailing if trailing else None,
            )

        if self.end_token_id in previous_token_ids:
            # Reasoning already ended in an earlier delta; everything in
            # this delta is ordinary content.
            return DeltaMessage(content=delta_text)

        # No end token seen yet anywhere: still streaming reasoning.
        return DeltaMessage(reasoning=delta_text)
|