[Model][Speculative Decoding] DeepSeek MTP spec decode (#12755)

Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
Lucia Fang
2025-02-19 01:06:23 -08:00
committed by GitHub
parent 983a40a8bb
commit f525c0be8b
14 changed files with 727 additions and 46 deletions

@@ -1307,6 +1307,8 @@ class ExecuteModelRequest(
     previous_hidden_states: Optional[HiddenStates] = None
     # The number of forward steps to run.
     num_steps: int = 1
+    # The step index for spec model input.
+    spec_step_idx: Optional[int] = None
     # Finished request ids since last step.
     finished_requests_ids: List[str] = msgspec.field(default_factory=list)
     # The last sampled token ids for multi step decoding.