[Doc] Expand Multimodal API Reference (#11852)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Cyrus Leung
2025-01-09 01:14:14 +08:00
committed by GitHub
parent ca47e176af
commit 5984499e47
9 changed files with 141 additions and 73 deletions


@@ -2,10 +2,6 @@
# Multi-Modality
```{eval-rst}
.. currentmodule:: vllm.multimodal
```
vLLM provides experimental support for multi-modal models through the {mod}`vllm.multimodal` package.
Multi-modal inputs can be passed alongside text and token prompts to [supported models](#supported-mm-models)
@@ -13,61 +9,20 @@ via the `multi_modal_data` field in {class}`vllm.inputs.PromptType`.
Looking to add your own multi-modal model? Please follow the instructions listed [here](#enabling-multimodal-inputs).
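As a rough illustration of the `multi_modal_data` field described above, the sketch below builds a prompt dict that pairs text with a single image. The helper name `build_multimodal_prompt`, the `"image"` modality key, the prompt template, and the placeholder object standing in for a loaded image are all assumptions made for this example; they are not part of this commit.

```python
from typing import Any


def build_multimodal_prompt(text: str, image: Any) -> dict[str, Any]:
    """Bundle a text prompt with one image under the `multi_modal_data` key.

    Hypothetical helper for illustration only; it is not a vLLM function.
    """
    return {
        "prompt": text,
        # Assumed layout: modality name mapped to the input data.
        "multi_modal_data": {"image": image},
    }


# Hypothetical stand-in for a loaded image (for example a PIL image object).
placeholder_image = object()

prompt = build_multimodal_prompt(
    "USER: <image>\nWhat is in this picture?\nASSISTANT:",
    placeholder_image,
)
```

With vLLM installed, a dict of this shape would typically be passed to the engine in place of a plain text prompt. The exact prompt template and the supported modality keys depend on the model, so check the supported models list before relying on this layout.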
## Module Contents
```{eval-rst}
.. automodule:: vllm.multimodal
```
### Registry
```{eval-rst}
.. autodata:: vllm.multimodal.MULTIMODAL_REGISTRY
```
```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalRegistry
:members:
:show-inheritance:
```
### Base Classes
```{eval-rst}
.. automodule:: vllm.multimodal.base
:members:
:show-inheritance:
```
### Input Classes
```{eval-rst}
.. automodule:: vllm.multimodal.inputs
:members:
:show-inheritance:
```
### Audio Classes
```{eval-rst}
.. automodule:: vllm.multimodal.audio
:members:
:show-inheritance:
```
### Image Classes
```{eval-rst}
.. automodule:: vllm.multimodal.image
:members:
:show-inheritance:
```
### Video Classes
```{eval-rst}
.. automodule:: vllm.multimodal.video
:members:
:show-inheritance:
## Submodules
```{toctree}
:maxdepth: 1
inputs
parse
processing
profiling
registry
```