[Misc] Add tensor schema test coverage for multimodal models (#21754)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Isotr0py
2025-08-03 15:52:14 +08:00
committed by GitHub
parent 337eb23bcc
commit 3dddbf1f25
7 changed files with 222 additions and 15 deletions

@@ -51,13 +51,14 @@ class DeepseekVL2ImagePixelInputs(TensorSchema):
     """
     Dimensions:
         - bn: Batch size * number of images
+        - p: Number of patches
         - c: Number of channels (3)
         - h: Height of each image
         - w: Width of each image
     """
     type: Literal["pixel_values"]
     data: Annotated[Union[torch.Tensor, list[torch.Tensor]],
-                    TensorShape("bn", 3, "h", "w")]
+                    TensorShape("bn", "p", 3, "h", "w", dynamic_dims={"p"})]
     images_spatial_crop: Annotated[torch.Tensor, TensorShape("bn", 2)]
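
Note on the shape change: the hunk replaces `TensorShape("bn", 3, "h", "w")` with `TensorShape("bn", "p", 3, "h", "w", dynamic_dims={"p"})`, so each image now carries a leading patch dimension that may differ between images. The sketch below illustrates that idea in plain Python and is not vLLM's `TensorSchema` implementation. `check_shapes` is a hypothetical helper, and the assumption that a dynamic dimension is simply exempt from the cross-element consistency check is mine.

```python
from typing import Iterable, Sequence, Union

Dim = Union[int, str]


def check_shapes(
    spec: Sequence[Dim],
    shapes: Iterable[Sequence[int]],
    dynamic_dims: frozenset = frozenset(),
) -> bool:
    """Hypothetical per-element shape check (not the vLLM API).

    - An int in `spec` must match the actual size exactly.
    - A named dim must have the same size in every shape,
      unless its name is listed in `dynamic_dims`.
    """
    bound = {}  # first size seen for each named, non-dynamic dim
    for shape in shapes:
        if len(shape) != len(spec):
            return False  # rank mismatch
        for want, got in zip(spec, shape):
            if isinstance(want, int):
                if want != got:
                    return False
            elif want in dynamic_dims:
                continue  # size may vary between elements
            elif bound.setdefault(want, got) != got:
                return False
    return True


# Per-image shapes with the leading "bn" axis dropped: (p, c, h, w).
# The two images have different patch counts (3 and 5).
per_image = [(3, 3, 384, 384), (5, 3, 384, 384)]

old_spec = (3, "h", "w")        # pre-change layout, no patch axis
new_spec = ("p", 3, "h", "w")   # post-change layout

print(check_shapes(old_spec, per_image))                   # rank mismatch
print(check_shapes(new_spec, per_image))                   # "p" must agree
print(check_shapes(new_spec, per_image, frozenset({"p"})))  # "p" may vary
```

Under these assumptions only the last call accepts the batch, which is the case the `dynamic_dims={"p"}` argument in the diff appears intended to cover.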