[feat] Enable mm caching for transformers backend (#21358)
Signed-off-by: raushan <raushan@huggingface.co>
committed by GitHub
parent b194557a6c
commit f38ee34a0a
@@ -18,7 +18,7 @@ These models are what we list in [supported-text-models][supported-text-models]
### Transformers
- vLLM also supports model implementations that are available in Transformers. This does not currently work for all models, but most decoder language models and common vision language models are supported! Vision-language models currently accept only image inputs, and require setting `--disable_mm_preprocessor_cache` when running. Support for video inputs and caching of multi-modal preprocessors will be added in future releases.
+ vLLM also supports model implementations that are available in Transformers. This does not currently work for all models, but most decoder language models and common vision language models are supported! Vision-language models currently accept only image inputs. Support for video inputs will be added in future releases.
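As a rough illustration of what this change enables, below is a minimal sketch of running a vision-language model through the Transformers backend without any cache-related workaround. The model name is an illustrative placeholder, and `model_impl="transformers"` is assumed to be the argument that selects the Transformers backend:

```python
# Minimal sketch (illustrative model name; `model_impl` assumed to select the backend).
# With multi-modal preprocessor caching now enabled, no extra flag such as the
# former `--disable_mm_preprocessor_cache` workaround is required.
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2-VL-2B-Instruct",  # assumed vision-language checkpoint
    model_impl="transformers",          # force the Transformers modeling backend
)
```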
To check if the modeling backend is Transformers, you can simply do this:
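A minimal sketch of such a check, assuming the `LLM.apply_model` helper and using an illustrative model name:

```python
# Minimal sketch (assumes the `LLM.apply_model` helper; model name is illustrative).
from vllm import LLM

llm = LLM(model="meta-llama/Llama-3.2-1B-Instruct", model_impl="transformers")

# Print the class of the model actually instantiated; a Transformers-backed
# model reports a class along the lines of `TransformersForCausalLM`.
llm.apply_model(lambda model: print(type(model)))
```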