[New Model]: nomic-embed-text-v2-moe (#17785)
@@ -622,7 +622,7 @@ Specified using `--task embed`.
 - * `BertModel`
   * BERT-based
-  * `BAAI/bge-base-en-v1.5`, etc.
+  * `BAAI/bge-base-en-v1.5`, `Snowflake/snowflake-arctic-embed-xs`, etc.
   *
   *
 - * `Gemma2Model`
@@ -635,6 +635,16 @@ Specified using `--task embed`.
   * `parasail-ai/GritLM-7B-vllm`.
   * ✅︎
   * ✅︎
+- * `GteModel`
+  * GteModel
+  * `Snowflake/snowflake-arctic-embed-m-v2.0`.
+  *
+  *
+- * `NomicBertModel`
+  * NomicBertModel
+  * `nomic-ai/nomic-embed-text-v1`, `nomic-ai/nomic-embed-text-v2-moe`, `Snowflake/snowflake-arctic-embed-m-long`, etc.
+  *
+  *
 - * `LlamaModel`, `LlamaForCausalLM`, `MistralModel`, etc.
   * Llama-based
   * `intfloat/e5-mistral-7b-instruct`, etc.
@@ -647,12 +657,12 @@ Specified using `--task embed`.
   * ✅︎
 - * `RobertaModel`, `RobertaForMaskedLM`
   * RoBERTa-based
-  * `sentence-transformers/all-roberta-large-v1`, `sentence-transformers/all-roberta-large-v1`, etc.
+  * `sentence-transformers/all-roberta-large-v1`, etc.
   *
   *
 - * `XLMRobertaModel`
   * XLM-RoBERTa-based
-  * `intfloat/multilingual-e5-large`, `jinaai/jina-reranker-v2-base-multilingual`, etc.
+  * `intfloat/multilingual-e5-large`, `jinaai/jina-reranker-v2-base-multilingual`, `Snowflake/snowflake-arctic-embed-l-v2.0`, `jinaai/jina-embeddings-v3` (see note), etc.
   *
   *
 :::
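All of the rows above describe models that map text to dense embedding vectors; downstream, two such vectors are usually compared by cosine similarity. A minimal self-contained NumPy sketch of that comparison (the vectors here are invented for illustration, not real model outputs):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented 4-dimensional "embeddings", for illustration only.
query = np.array([1.0, 0.0, 1.0, 0.0])
doc_a = np.array([1.0, 0.0, 1.0, 0.0])  # same direction as the query
doc_b = np.array([0.0, 1.0, 0.0, 1.0])  # orthogonal to the query

print(cosine_similarity(query, doc_a))  # 1.0
print(cosine_similarity(query, doc_b))  # 0.0
```

Note that many of the models listed already emit unit-length vectors, in which case the dot product alone gives the same ranking.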
@@ -670,6 +680,10 @@ For both the 1.5B and 7B variants, you also need to enable `--trust-remote-code`
 See [relevant issue on HF Transformers](https://github.com/huggingface/transformers/issues/34882).
 :::
 
+:::{note}
+`jinaai/jina-embeddings-v3` supports multiple tasks through LoRA adapters, but vLLM currently supports only the text-matching task by merging the LoRA weights.
+:::
+
 If your model is not in the above list, we will try to automatically convert the model using
 {func}`~vllm.model_executor.models.adapters.as_embedding_model`. By default, the embeddings
 of the whole prompt are extracted from the normalized hidden state corresponding to the last token.

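The default conversion described in the last hunk pools the hidden state of the final token and L2-normalizes it. A minimal NumPy sketch of that pooling step, assuming per-token hidden states are already available (the values below are made up, not produced by a real model):

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray) -> np.ndarray:
    """Pool per-token hidden states into a single prompt embedding.

    hidden_states: [num_tokens, hidden_size] array for one prompt.
    Returns the L2-normalized hidden state of the last token,
    mirroring the default pooling applied when a generative model
    is converted into an embedding model.
    """
    last = hidden_states[-1]            # hidden state of the last token
    return last / np.linalg.norm(last)  # L2 normalization -> unit vector

# Toy input: 4 tokens, hidden size 3 (arbitrary values).
states = np.array([
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [3.0, 0.0, 4.0],  # last token
])
print(last_token_pool(states))  # [0.6 0.  0.8], a unit-length vector
```

Because the result is normalized, dot products between embeddings produced this way are directly cosine similarities.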