[feat] Enable mm caching for transformers backend (#21358)
Signed-off-by: raushan <raushan@huggingface.co>
committed by GitHub
parent b194557a6c
commit f38ee34a0a
@@ -18,7 +18,7 @@ These models are what we list in [supported-text-models][supported-text-models]
### Transformers
- vLLM also supports model implementations that are available in Transformers. This does not currently work for all models, but most decoder language models and common vision language models are supported! Vision-language models currently accept only image inputs, and require setting `--disable_mm_preprocessor_cache` when running. Support for video inputs and caching of multi-modal preprocessors will be added in future releases.
+ vLLM also supports model implementations that are available in Transformers. This does not currently work for all models, but most decoder language models and common vision language models are supported! Vision-language models currently accept only image inputs. Support for video inputs will be added in future releases.
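As a rough illustration of what this change enables, below is a minimal sketch of running a vision-language model through the Transformers backend without any cache-related workaround. The model name is an illustrative placeholder, and `model_impl="transformers"` is assumed to be the argument that selects the Transformers backend:

```python
# Minimal sketch (illustrative model name; `model_impl` assumed to select the backend).
# With multi-modal preprocessor caching now enabled, no extra flag such as the
# former `--disable_mm_preprocessor_cache` workaround is required.
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2-VL-2B-Instruct",  # assumed vision-language checkpoint
    model_impl="transformers",          # force the Transformers modeling backend
)
```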
To check if the modeling backend is Transformers, you can simply do this:
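A minimal sketch of such a check, assuming the `LLM.apply_model` helper and using an illustrative model name:

```python
# Minimal sketch (assumes the `LLM.apply_model` helper; model name is illustrative).
from vllm import LLM

llm = LLM(model="meta-llama/Llama-3.2-1B-Instruct", model_impl="transformers")

# Print the class of the model actually instantiated; a Transformers-backed
# model reports a class along the lines of `TransformersForCausalLM`.
llm.apply_model(lambda model: print(type(model)))
```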