[Model] Deprecate the score task (this will not affect users). (#37537)

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
This commit is contained in:
wang.yuqi
2026-03-20 16:07:56 +08:00
committed by GitHub
parent dcee9be95a
commit ed359c497a
22 changed files with 184 additions and 163 deletions

View File

@@ -31,28 +31,29 @@ Of course, we also have "plugin" tasks that allow users to customize input and o
### Pooling Tasks
| Pooling Tasks | Granularity | Outputs |
|--------------------|---------------|-------------------------------------------------|
| `classify` | Sequence-wise | probability vector of classes for each sequence |
| `score` (see note) | Sequence-wise | reranker score for each sequence |
| `embed` | Sequence-wise | vector representations for each sequence |
| `token_classify` | Token-wise | probability vector of classes for each token |
| `token_embed` | Token-wise | vector representations for each token |
| Pooling Tasks | Granularity | Outputs |
|-----------------------|---------------|-------------------------------------------------|
| `classify` (see note) | Sequence-wise | probability vector of classes for each sequence |
| `embed` | Sequence-wise | vector representations for each sequence |
| `token_classify` | Token-wise | probability vector of classes for each token |
| `token_embed` | Token-wise | vector representations for each token |
!!! note
Within classification tasks, there is a specialized subcategory: Cross-encoder (aka reranker) models. These models are a subset of classification models that accept two prompts as input and output num_labels equal to 1.
### Score Types
| Pooling Tasks | Granularity | Outputs | Score Types | scoring function |
|--------------------|---------------|-------------------------------------------------|--------------------|--------------------------|
| `classify` | Sequence-wise | probability vector of classes for each sequence | nan | nan |
| `score` (see note) | Sequence-wise | reranker score for each sequence | `cross-encoder` | linear classifier |
| `embed` | Sequence-wise | vector representations for each sequence | `bi-encoder` | cosine similarity |
| `token_classify` | Token-wise | probability vector of classes for each token | nan | nan |
| `token_embed` | Token-wise | vector representations for each token | `late-interaction` | late interaction(MaxSim) |
The scoring models is designed to compute similarity scores between two input prompts. It supports three model types (aka `score_type`): `cross-encoder`, `late-interaction`, and `bi-encoder`.
The score models is designed to compute similarity scores between two input prompts. It supports three model types (aka `score_type`): `cross-encoder`, `late-interaction`, and `bi-encoder`.
| Pooling Tasks | Granularity | Outputs | Score Types | scoring function |
|-----------------------|---------------|----------------------------------------------|--------------------|--------------------------|
| `classify` (see note) | Sequence-wise | reranker score for each sequence | `cross-encoder` | linear classifier |
| `embed` | Sequence-wise | vector representations for each sequence | `bi-encoder` | cosine similarity |
| `token_classify` | Token-wise | probability vector of classes for each token | nan | nan |
| `token_embed` | Token-wise | vector representations for each token | `late-interaction` | late interaction(MaxSim) |
!!! note
Only when a classification model outputs num_labels equal to 1 can it be used as a scoring model and have its scoring API enabled.
### Pooling Usages
@@ -85,14 +86,16 @@ enabling the corresponding APIs.
### Offline APIs corresponding to pooling tasks
| Task | APIs |
|------------------|----------------------------------------------------------------------------|
| `embed` | `LLM.embed(...)`,`LLM.encode(..., pooling_task="embed")`, `LLM.score(...)` |
| `classify` | `LLM.classify(...)`, `LLM.encode(..., pooling_task="classify")` |
| `score` | `LLM.score(...)` |
| `token_classify` | `LLM.reward(...)`, `LLM.encode(..., pooling_task="token_classify")` |
| `token_embed` | `LLM.encode(..., pooling_task="token_embed")`, `LLM.score(...)` |
| `plugin` | `LLM.encode(..., pooling_task="plugin")` |
| Task | APIs |
|------------------|---------------------------------------------------------------------------------------|
| `embed` | `LLM.embed(...)`, `LLM.encode(..., pooling_task="embed")`, `LLM.score(...)`(see note) |
| `classify` | `LLM.classify(...)`, `LLM.encode(..., pooling_task="classify")`, `LLM.score(...)` |
| `token_classify` | `LLM.reward(...)`, `LLM.encode(..., pooling_task="token_classify")` |
| `token_embed` | `LLM.encode(..., pooling_task="token_embed")`, `LLM.score(...)` |
| `plugin` | `LLM.encode(..., pooling_task="plugin")` |
!!! note
Only when a classification model outputs num_labels equal to 1 can it be used as a scoring model and have its scoring API enabled.
### `LLM.classify`
@@ -206,11 +209,11 @@ If `--runner pooling` has been set (manually or automatically) but the model doe
vLLM will attempt to automatically convert the model according to the architecture names
shown in the table below.
| Architecture | `--convert` | Supported pooling tasks |
| ----------------------------------------------- | ----------- | ------------------------------------- |
| `*ForTextEncoding`, `*EmbeddingModel`, `*Model` | `embed` | `token_embed`, `embed` |
| `*ForRewardModeling`, `*RewardModel` | `embed` | `token_embed`, `embed` |
| `*For*Classification`, `*ClassificationModel` | `classify` | `token_classify`, `classify`, `score` |
| Architecture | `--convert` | Supported pooling tasks |
|-------------------------------------------------|-------------|------------------------------|
| `*ForTextEncoding`, `*EmbeddingModel`, `*Model` | `embed` | `token_embed`, `embed` |
| `*ForRewardModeling`, `*RewardModel` | `embed` | `token_embed`, `embed` |
| `*For*Classification`, `*ClassificationModel` | `classify` | `token_classify`, `classify` |
!!! tip
You can explicitly set `--convert <type>` to specify how to convert the model.
@@ -251,3 +254,7 @@ Pooling models now default support all pooling, you can use it without any setti
- Extracting hidden states prefers using `token_embed` task.
- Named Entity Recognition (NER) and reward models prefers using `token_classify` task.
### Score task
`score` task is deprecated and will be removed in v0.20. Please use `classify` instead. Only when a classification model outputs num_labels equal to 1 can it be used as a scoring model and have its scoring API enabled.