vllm/examples/pooling at be0a3f7570726ca49cc9b53f9b48175418bddda0 - vllm

biondizzle/vllm

Jakub Zakrzewski c8b678e53e [Model] Add support for nvidia/llama-nemotron-rerank-vl-1b-v2 (#35735)

Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>

2026-03-03 08:32:14 +08:00

[Doc] Update usage of --limit-mm-per-prompt (#34148)

2026-02-09 21:12:13 -08:00

[Model] Add nvidia/llama-nemotron-embed-vl-1b-v2 multimodal embedding model (#35297)

2026-02-26 14:17:17 +00:00

(bugfix): Fixed encode in LLM entrypoint for IOProcessr plugin prompts (#34618)

2026-02-16 07:33:55 -08:00

[Frontend][last/5] Make pooling entrypoints request schema consensus. (#31127)

2026-02-09 06:42:38 +00:00

[Model] Add support for nvidia/llama-nemotron-rerank-vl-1b-v2 (#35735)

2026-03-03 08:32:14 +08:00

[Frontend][2/n] Make pooling entrypoints request schema consensus | ChatRequest (#32574)

2026-01-22 10:32:44 +00:00

[new model] add COLQwen3 code & Inference (#34398)

2026-02-14 12:15:19 +08:00