[Model] Support NVLM-D and fix QK Norm in InternViT (#9045)
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
@@ -315,6 +315,9 @@ Multimodal Language Models
 
 .. _supported_vlms:
 
+Text Generation
+---------------
+
 .. list-table::
   :widths: 25 25 25 25 5 5
   :header-rows: 1
@@ -384,7 +387,13 @@ Multimodal Language Models
     - Image
     - :code:`meta-llama/Llama-3.2-90B-Vision-Instruct`, :code:`meta-llama/Llama-3.2-11B-Vision`, etc.
     -
     -
+  * - :code:`NVLM_D_Model`
+    - NVLM-D 1.0
+    - Image\ :sup:`E+`
+    - :code:`nvidia/NVLM-D-72B`, etc.
+    -
+    - ✅︎
   * - :code:`PaliGemmaForConditionalGeneration`
     - PaliGemma
     - Image\ :sup:`E`