[Core][VLM] Stack multimodal tensors to represent multiple images within each prompt (#7902)

Peter Salas
2024-08-27 18:53:56 -07:00
committed by GitHub
parent 9c71c97ae2
commit fab5f53e2d
15 changed files with 214 additions and 60 deletions

@@ -45,8 +45,6 @@ Base Classes
.. autodata:: vllm.multimodal.NestedTensors
.. autodata:: vllm.multimodal.BatchedTensors
.. autodata:: vllm.multimodal.BatchedTensorInputs
.. autoclass:: vllm.multimodal.MultiModalDataBuiltins