[BugFix] [DP/EP] Fix slow execution when BS <= DP (#25407)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Chris Bamford <chrisbam4d@gmail.com>
This commit is contained in:
Matthew Bonanni
2025-09-22 20:26:17 -04:00
committed by GitHub
parent 090197034f
commit ac0048c0ae
2 changed files with 5 additions and 4 deletions


@@ -487,7 +487,7 @@ class Worker(WorkerBase):
                 sort_by="self_cuda_time_total"))
 
     def execute_dummy_batch(self) -> None:
-        self.model_runner._dummy_run(1)
+        self.model_runner._dummy_run(1, uniform_decode=True)
 
     def add_lora(self, lora_request: LoRARequest) -> bool:
         return self.model_runner.add_lora(lora_request)
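
Note: the commit message does not spell out the mechanism, so the following is only a toy sketch of one plausible reading of the title. It assumes that with data parallelism (DP) every rank takes part in each step, that ranks without a real request run a dummy batch, and that the step finishes only when the slowest rank does. `ToyRunner`, `step_latency`, and the millisecond costs are hypothetical illustrations, not vLLM code or measurements.

```python
# Hypothetical per-step costs in milliseconds (illustrative, not measured).
FAST_PATH_MS = 1.0   # assumed cost when the batch can use the fast decode path
SLOW_PATH_MS = 5.0   # assumed cost when the batch falls back to a slower path


class ToyRunner:
    """Toy stand-in for a model runner; not vLLM's implementation."""

    def dummy_run(self, num_tokens: int, uniform_decode: bool = False) -> float:
        # Assumption: only a batch flagged as uniform decode takes the fast path.
        return FAST_PATH_MS if uniform_decode else SLOW_PATH_MS

    def real_decode_step(self) -> float:
        # Assumption: a real single-token decode step always takes the fast path.
        return FAST_PATH_MS


def step_latency(num_ranks: int, num_requests: int,
                 uniform_decode_dummy: bool) -> float:
    """Return the latency of one synchronized step across all DP ranks."""
    runner = ToyRunner()
    per_rank = []
    for rank in range(num_ranks):
        if rank < num_requests:
            # This rank has a real request to decode.
            per_rank.append(runner.real_decode_step())
        else:
            # Idle rank: runs a dummy batch to stay in lockstep with the others.
            per_rank.append(
                runner.dummy_run(1, uniform_decode=uniform_decode_dummy))
    # Ranks synchronize, so the step is as slow as the slowest rank.
    return max(per_rank)


if __name__ == "__main__":
    # Batch size 2 on 4 DP ranks: two ranks are idle and run dummy batches.
    print(step_latency(4, 2, uniform_decode_dummy=False))
    print(step_latency(4, 2, uniform_decode_dummy=True))
```

Under these assumptions, a slow dummy batch on any idle rank sets the pace for the whole step whenever there are fewer requests than ranks, which would be consistent with the "BS <= DP" condition in the title.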