[Doc] Guide for adding multi-modal plugins (#6205)

This commit is contained in:
Cyrus Leung
2024-07-10 14:55:34 +08:00
committed by GitHub
parent 5ed3505d82
commit 8a924d2248
7 changed files with 64 additions and 23 deletions


@@ -7,17 +7,21 @@ Multi-Modality
vLLM provides experimental support for multi-modal models through the :mod:`vllm.multimodal` package.
Multi-modal inputs can be passed alongside text and token prompts to :ref:`supported models <supported_vlms>`
via the ``multi_modal_data`` field in :class:`vllm.inputs.PromptStrictInputs`.
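As an illustration of the shape of such a prompt, here is a minimal sketch of a dict carrying both text and image data. The helper ``build_mm_prompt`` is hypothetical (not part of vLLM); ``"image"`` is the built-in modality key, and the image value would normally be e.g. a ``PIL.Image``:

```python
def build_mm_prompt(text: str, image) -> dict:
    # Hypothetical helper: assembles the prompt dict that would be passed
    # to LLM.generate(). 'image' stands in for a PIL.Image (or any object
    # the registered plugin for that modality knows how to process).
    return {
        "prompt": text,
        "multi_modal_data": {"image": image},
    }

prompt = build_mm_prompt(
    "USER: <image>\nWhat is shown in this image?\nASSISTANT:",
    object(),  # placeholder for real image data
)
```

With a registered plugin, additional modality keys could appear alongside ``"image"`` in the ``multi_modal_data`` dict.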
.. note::
    Currently, vLLM only has built-in support for image data. You can extend vLLM to process additional modalities
    by following :ref:`this guide <adding_multimodal_plugin>`.
Looking to add your own multi-modal model? Please follow the instructions listed :ref:`here <enabling_multimodal_inputs>`.
Guides
++++++

.. toctree::
   :maxdepth: 1

   adding_multimodal_plugin
Module Contents
+++++++++++++++
@@ -36,10 +40,14 @@ Registry
Base Classes
------------
.. autodata:: vllm.multimodal.BatchedTensors
.. autoclass:: vllm.multimodal.MultiModalDataBuiltins
    :members:
    :show-inheritance:
.. autodata:: vllm.multimodal.MultiModalDataDict
.. autoclass:: vllm.multimodal.MultiModalInputs
    :members:
    :show-inheritance: