[Frontend][3/n] Make pooling entrypoints request schema consensus | EmbedRequest & ClassifyRequest (#32905)

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
This commit is contained in:
wang.yuqi
2026-01-23 20:03:44 +08:00
committed by GitHub
parent 3f3f89529d
commit 05f3d714db
11 changed files with 330 additions and 265 deletions

View File

@@ -197,7 +197,7 @@ The following [sampling parameters](../api/README.md#inference-parameters) are s
??? code
```python
--8<-- "vllm/entrypoints/openai/protocol.py:completion-sampling-params"
--8<-- "vllm/entrypoints/openai/completion/protocol.py:completion-sampling-params"
```
The following extra parameters are supported:
@@ -205,7 +205,7 @@ The following extra parameters are supported:
??? code
```python
--8<-- "vllm/entrypoints/openai/protocol.py:completion-extra-params"
--8<-- "vllm/entrypoints/openai/completion/protocol.py:completion-extra-params"
```
### Chat API
@@ -228,7 +228,7 @@ The following [sampling parameters](../api/README.md#inference-parameters) are s
??? code
```python
--8<-- "vllm/entrypoints/openai/protocol.py:chat-completion-sampling-params"
--8<-- "vllm/entrypoints/openai/chat_completion/protocol.py:chat-completion-sampling-params"
```
The following extra parameters are supported:
@@ -236,7 +236,7 @@ The following extra parameters are supported:
??? code
```python
--8<-- "vllm/entrypoints/openai/protocol.py:chat-completion-extra-params"
--8<-- "vllm/entrypoints/openai/chat_completion/protocol.py:chat-completion-extra-params"
```
### Responses API
@@ -253,7 +253,7 @@ The following extra parameters in the request object are supported:
??? code
```python
--8<-- "vllm/entrypoints/openai/protocol.py:responses-extra-params"
--8<-- "vllm/entrypoints/openai/responses/protocol.py:responses-extra-params"
```
The following extra parameters in the response object are supported:
@@ -261,7 +261,7 @@ The following extra parameters in the response object are supported:
??? code
```python
--8<-- "vllm/entrypoints/openai/protocol.py:responses-response-extra-params"
--8<-- "vllm/entrypoints/openai/responses/protocol.py:responses-response-extra-params"
```
### Embeddings API
@@ -378,23 +378,53 @@ The following [pooling parameters][vllm.PoolingParams] are supported.
```python
--8<-- "vllm/pooling_params.py:common-pooling-params"
--8<-- "vllm/pooling_params.py:embedding-pooling-params"
--8<-- "vllm/pooling_params.py:embed-pooling-params"
```
The following extra parameters are supported by default:
The following Embeddings API parameters are supported:
??? code
```python
--8<-- "vllm/entrypoints/pooling/embed/protocol.py:embedding-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:completion-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:encoding-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:embed-params"
```
For chat-like input (i.e. if `messages` is passed), these extra parameters are supported instead:
The following extra parameters are supported:
??? code
```python
--8<-- "vllm/entrypoints/pooling/embed/protocol.py:chat-embedding-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:completion-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:encoding-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:embed-extra-params"
```
For chat-like input (i.e. if `messages` is passed), the following parameters are supported:
The following parameters are supported by default:
??? code
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:chat-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:encoding-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:embed-params"
```
these extra parameters are supported instead:
??? code
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:chat-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:encoding-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:embed-extra-params"
```
### Transcriptions API
@@ -659,14 +689,48 @@ The following [pooling parameters][vllm.PoolingParams] are supported.
```python
--8<-- "vllm/pooling_params.py:common-pooling-params"
--8<-- "vllm/pooling_params.py:classification-pooling-params"
--8<-- "vllm/pooling_params.py:classify-pooling-params"
```
The following Classification API parameters are supported:
??? code
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:completion-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:classify-params"
```
The following extra parameters are supported:
```python
--8<-- "vllm/entrypoints/pooling/classify/protocol.py:classification-extra-params"
```
??? code
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:completion-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:classify-extra-params"
```
For chat-like input (i.e. if `messages` is passed), the following parameters are supported:
??? code
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:chat-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:classify-params"
```
these extra parameters are supported instead:
??? code
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:chat-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:classify-extra-params"
```
### Score API
@@ -882,12 +946,21 @@ The following [pooling parameters][vllm.PoolingParams] are supported.
```python
--8<-- "vllm/pooling_params.py:common-pooling-params"
--8<-- "vllm/pooling_params.py:classification-pooling-params"
--8<-- "vllm/pooling_params.py:classify-pooling-params"
```
The following Score API parameters are supported:
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-params"
--8<-- "vllm/entrypoints/pooling/score/protocol.py:score-extra-params"
```
The following extra parameters are supported:
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:classify-extra-params"
--8<-- "vllm/entrypoints/pooling/score/protocol.py:score-extra-params"
```
@@ -963,12 +1036,22 @@ The following [pooling parameters][vllm.PoolingParams] are supported.
```python
--8<-- "vllm/pooling_params.py:common-pooling-params"
--8<-- "vllm/pooling_params.py:classification-pooling-params"
--8<-- "vllm/pooling_params.py:classify-pooling-params"
```
The following Re-rank API parameters are supported:
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:classify-extra-params"
--8<-- "vllm/entrypoints/pooling/score/protocol.py:score-extra-params"
```
The following extra parameters are supported:
```python
--8<-- "vllm/entrypoints/pooling/base/protocol.py:pooling-common-extra-params"
--8<-- "vllm/entrypoints/pooling/base/protocol.py:classify-extra-params"
--8<-- "vllm/entrypoints/pooling/score/protocol.py:rerank-extra-params"
```