[Doc] Guide for adding multi-modal plugins (#6205)

This commit is contained in:
Cyrus Leung
2024-07-10 14:55:34 +08:00
committed by GitHub
parent 5ed3505d82
commit 8a924d2248
7 changed files with 64 additions and 23 deletions


@@ -7,17 +7,21 @@ Multi-Modality
vLLM provides experimental support for multi-modal models through the :mod:`vllm.multimodal` package.
Multi-modal inputs can be passed alongside text and token prompts to :ref:`supported models <supported_vlms>`
via the ``multi_modal_data`` field in :class:`vllm.inputs.PromptStrictInputs`.
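As an illustration of the shape of such a prompt, here is a minimal sketch of a dict carrying both text and image data. The helper ``build_mm_prompt`` is hypothetical (not part of vLLM); ``"image"`` is the built-in modality key, and the image value would normally be e.g. a ``PIL.Image``:

```python
def build_mm_prompt(text: str, image) -> dict:
    # Hypothetical helper: assembles the prompt dict that would be passed
    # to LLM.generate(). 'image' stands in for a PIL.Image (or any object
    # the registered plugin for that modality knows how to process).
    return {
        "prompt": text,
        "multi_modal_data": {"image": image},
    }

prompt = build_mm_prompt(
    "USER: <image>\nWhat is shown in this image?\nASSISTANT:",
    object(),  # placeholder for real image data
)
```

With a registered plugin, additional modality keys could appear alongside ``"image"`` in the ``multi_modal_data`` dict.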
.. note::
    Currently, vLLM only has built-in support for image data. You can extend vLLM to process additional modalities
    by following :ref:`this guide <adding_multimodal_plugin>`.
Looking to add your own multi-modal model? Please follow the instructions listed :ref:`here <enabling_multimodal_inputs>`.
Guides
++++++

.. toctree::
   :maxdepth: 1

   adding_multimodal_plugin
Module Contents
+++++++++++++++
@@ -36,10 +40,14 @@ Registry
Base Classes
------------
.. autodata:: vllm.multimodal.BatchedTensors
.. autoclass:: vllm.multimodal.MultiModalDataBuiltins
    :members:
    :show-inheritance:
.. autodata:: vllm.multimodal.MultiModalDataDict
.. autoclass:: vllm.multimodal.MultiModalInputs
    :members:
    :show-inheritance: