[Model] Interface to enable batch-level DP support (#23733)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-08-27 21:41:22 +08:00
parent 16dc4052b0
commit fe8d7b6f03
8 changed files with 38 additions and 4 deletions
--- a/docs/configuration/optimization.md
+++ b/docs/configuration/optimization.md
@@ -168,8 +168,11 @@ llm = LLM(
    Batch-level DP is not to be confused with API request-level DP
    (which is instead controlled by `data_parallel_size`).

-The availability of batch-level DP is based on model implementation.
-Currently, the following models support `mm_encoder_tp_mode="data"`:
+Batch-level DP needs to be implemented on a per-model basis,
+and enabled by setting `supports_encoder_tp_data = True` in the model class.
+Regardless, you need to set `mm_encoder_tp_mode="data"` in engine arguments to use this feature.
+
+Known supported models:

 - Llama4 (<gh-pr:18368>)
 - MiniCPM-V-4 (<gh-pr:23327>)