Allow markdownlint to run locally (#36398)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-09 03:05:24 +00:00
parent fde4771bbd
commit a0f44bb616
47 changed files with 394 additions and 392 deletions
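
Every hunk in the diff below makes the same kind of change: a compact table delimiter row such as `|---|---|---|` becomes a padded one such as `| - | - | - |`, with the column widths unchanged. A minimal sketch of that transformation follows, assuming delimiter cells consist only of dashes (no alignment colons). The helper name `pad_delimiter_row` is hypothetical, and this is not necessarily how the commit produced its changes.

```python
import re

# A compact delimiter row: pipes separating runs of three or more dashes,
# with no padding spaces and no alignment colons (an assumption of this sketch).
_COMPACT_DELIMITER_ROW = re.compile(r"^\|(?:-{3,}\|)+$")


def pad_delimiter_row(line: str) -> str:
    """Return `line` with each delimiter cell padded by one space per side.

    The first and last dash of every cell are replaced by spaces, so the
    row keeps its original width. Lines that are not compact delimiter
    rows are returned unchanged.
    """
    stripped = line.rstrip("\n")
    if not _COMPACT_DELIMITER_ROW.match(stripped):
        return line
    cells = stripped[1:-1].split("|")
    padded = [" " + cell[1:-1] + " " for cell in cells]
    # Re-attach the trailing newline, if the input had one.
    return "|" + "|".join(padded) + "|" + line[len(stripped):]


if __name__ == "__main__":
    print(pad_delimiter_row("|---|---|---|"))
```

Replacing dashes rather than inserting spaces keeps each delimiter row aligned with the header and body rows, which matches the width-preserving pattern visible in the hunks.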
--- a/docs/models/pooling_models.md
+++ b/docs/models/pooling_models.md
@@ -31,7 +31,7 @@ vLLM will attempt to automatically convert the model according to the architectu
 shown in the table below.

 | Architecture                                    | `--convert` | Supported pooling tasks               |
-|-------------------------------------------------|-------------|---------------------------------------|
+| ----------------------------------------------- | ----------- | ------------------------------------- |
 | `*ForTextEncoding`, `*EmbeddingModel`, `*Model` | `embed`     | `token_embed`, `embed`                |
 | `*ForRewardModeling`, `*RewardModel`            | `embed`     | `token_embed`, `embed`                |
 | `*For*Classification`, `*ClassificationModel`   | `classify`  | `token_classify`, `classify`, `score` |
@@ -46,7 +46,7 @@ Each pooling model in vLLM supports one or more of these tasks according to
 enabling the corresponding APIs:

 | Task             | APIs                                                                          |
-|------------------|-------------------------------------------------------------------------------|
+| ---------------- | ----------------------------------------------------------------------------- |
 | `embed`          | `LLM.embed(...)`, `LLM.score(...)`\*, `LLM.encode(..., pooling_task="embed")` |
 | `classify`       | `LLM.classify(...)`, `LLM.encode(..., pooling_task="classify")`               |
 | `score`          | `LLM.score(...)`                                                              |
@@ -69,7 +69,7 @@ If the model has been converted via `--convert` (see above),
 the pooler assigned to each task has the following attributes by default:

 | Task       | Pooling Type | Normalization | Softmax |
-|------------|--------------|---------------|---------|
+| ---------- | ------------ | ------------- | ------- |
 | `embed`    | `LAST`       | ✅︎            | ❌      |
 | `classify` | `LAST`       | ❌            | ✅︎      |

@@ -314,7 +314,7 @@ An OpenAI client example can be found here: [examples/pooling/embed/openai_embed
 vLLM supports ColBERT models with multiple encoder backbones:

 | Architecture | Backbone | Example HF Models |
-|---|---|---|
+| - | - | - |
 | `HF_ColBERT` | BERT | `answerdotai/answerai-colbert-small-v1`, `colbert-ir/colbertv2.0` |
 | `ColBERTModernBertModel` | ModernBERT | `lightonai/GTE-ModernColBERT-v1` |
 | `ColBERTJinaRobertaModel` | Jina XLM-RoBERTa | `jinaai/jina-colbert-v2` |
@@ -379,7 +379,7 @@ An example can be found here: [examples/pooling/score/colbert_rerank_online.py](
 ColQwen3 is based on [ColPali](https://arxiv.org/abs/2407.01449), which extends ColBERT's late interaction approach to **multi-modal** inputs. While ColBERT operates on text-only token embeddings, ColPali/ColQwen3 can embed both **text and images** (e.g. PDF pages, screenshots, diagrams) into per-token L2-normalized vectors and compute relevance via MaxSim scoring. ColQwen3 specifically uses Qwen3-VL as its vision-language backbone.

 | Architecture | Backbone | Example HF Models |
-|---|---|---|
+| - | - | - |
 | `ColQwen3` | Qwen3-VL | `TomoroAI/tomoro-colqwen3-embed-4b`, `TomoroAI/tomoro-colqwen3-embed-8b` |
 | `OpsColQwen3Model` | Qwen3-VL | `OpenSearch-AI/Ops-Colqwen3-4B`, `OpenSearch-AI/Ops-Colqwen3-8B` |
 | `Qwen3VLNemotronEmbedModel` | Qwen3-VL | `nvidia/nemotron-colembed-vl-4b-v2`, `nvidia/nemotron-colembed-vl-8b-v2` |
@@ -507,7 +507,7 @@ Llama Nemotron VL Embedding models combine the bidirectional Llama embedding bac
 single-vector embeddings from text and/or images.

 | Architecture | Backbone | Example HF Models |
-|---|---|---|
+| - | - | - |
 | `LlamaNemotronVLModel` | Bidirectional Llama + SigLIP | `nvidia/llama-nemotron-embed-vl-1b-v2` |

 Start the server:
@@ -567,7 +567,7 @@ Llama Nemotron VL reranker models combine the same bidirectional Llama + SigLIP
 backbone with a sequence-classification head for cross-encoder scoring and reranking.

 | Architecture | Backbone | Example HF Models |
-|---|---|---|
+| - | - | - |
 | `LlamaNemotronVLForSequenceClassification` | Bidirectional Llama + SigLIP | `nvidia/llama-nemotron-rerank-vl-1b-v2` |

 Start the server: