[VLM] Separate text-only and vision variants of the same model architecture (#13157)
This commit is contained in:
@@ -699,10 +699,10 @@ See [this page](#generative-models) for more information on how to use generativ
   *
   * ✅︎
   * ✅︎
-- * `DeepseekVLV2ForCausalLM`
+- * `DeepseekVLV2ForCausalLM`<sup>^</sup>
   * DeepSeek-VL2
   * T + I<sup>+</sup>
-  * `deepseek-ai/deepseek-vl2-tiny`, `deepseek-ai/deepseek-vl2-small`, `deepseek-ai/deepseek-vl2` etc. (see note)
+  * `deepseek-ai/deepseek-vl2-tiny`, `deepseek-ai/deepseek-vl2-small`, `deepseek-ai/deepseek-vl2` etc.
   *
   * ✅︎
   * ✅︎
@@ -713,10 +713,10 @@ See [this page](#generative-models) for more information on how to use generativ
   *
   * ✅︎
   * ✅︎
-- * `ChatGLMModel`
+- * `GLM4VForCausalLM`<sup>^</sup>
   * GLM-4V
   * T + I
-  * `THUDM/glm-4v-9b` etc.
+  * `THUDM/glm-4v-9b`, `THUDM/cogagent-9b-20241220` etc.
   * ✅︎
   * ✅︎
   * ✅︎
@@ -825,7 +825,7 @@ See [this page](#generative-models) for more information on how to use generativ
   *
   * ✅︎
   * ✅︎
-- * `QWenLMHeadModel`
+- * `QwenVLForConditionalGeneration`<sup>^</sup>
   * Qwen-VL
   * T + I<sup>E+</sup>
   * `Qwen/Qwen-VL`, `Qwen/Qwen-VL-Chat`, etc.
@@ -862,13 +862,12 @@ See [this page](#generative-models) for more information on how to use generativ
   * ✅︎
 :::
 
+<sup>^</sup> You need to set the architecture name via `--hf-overrides` to match the one in vLLM.
+    • For example, to use DeepSeek-VL2 series models:
+    `--hf-overrides '{"architectures": ["DeepseekVLV2ForCausalLM"]}'`
 <sup>E</sup> Pre-computed embeddings can be inputted for this modality.
 <sup>+</sup> Multiple items can be inputted per text prompt for this modality.
 
-:::{note}
-To use DeepSeek-VL2 series models, you have to pass `--hf_overrides '{"architectures": ["DeepseekVLV2ForCausalLM"]}'` when running vLLM.
-:::
-
 :::{note}
 H2O-VL series models will be available in V1 once we support backends other than FlashAttention.
 :::
||||