[Bugfix] Proper input validation for multi-modal encoder-decoder models (#16156)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Cyrus Leung
2025-04-09 00:45:21 +08:00
committed by GitHub
parent dc96fd54c6
commit 4ebc0b9640
10 changed files with 113 additions and 62 deletions

@@ -56,7 +56,7 @@ def run_florence2():
 def run_mllama():
     engine_args = EngineArgs(
         model="meta-llama/Llama-3.2-11B-Vision-Instruct",
-        max_model_len=4096,
+        max_model_len=8192,
         max_num_seqs=2,
         limit_mm_per_prompt={"image": 1},
         dtype="half",