[Bugfix] Make Gemma3 MM V0 only for now (#14971)
Some checks failed
Create Release / Create Release (push) Has been cancelled
Some checks failed
Create Release / Create Release (push) Has been cancelled
Signed-off-by: Roger Wang <ywang@roblox.com>
This commit is contained in:
@@ -763,7 +763,7 @@ See [this page](#generative-models) for more information on how to use generativ
|
||||
* `google/gemma-3-4b-it`, `google/gemma-3-27b-it`, etc.
|
||||
* ✅︎
|
||||
* ✅︎
|
||||
* ⚠️
|
||||
*
|
||||
- * `GLM4VForCausalLM`<sup>^</sup>
|
||||
* GLM-4V
|
||||
* T + I
|
||||
@@ -948,8 +948,11 @@ V1 currently uses a simplified attention pattern:
|
||||
- Uses causal attention for all tokens, including image tokens
|
||||
- Generates reasonable outputs but does not match the original model's attention for text + image inputs
|
||||
- Will be updated in the future to support the correct behavior
|
||||
- Does not support `"do_pan_and_scan": True`
|
||||
|
||||
This limitation exists because the model's mixed attention pattern (bidirectional for images, causal otherwise) is not yet supported by vLLM's attention backends.
|
||||
|
||||
For these reasons, `Gemma3ForConditionalGeneration` is supported only on V0 at the moment.
|
||||
:::
|
||||
|
||||
:::{note}
|
||||
|
||||
Reference in New Issue
Block a user