Commit Graph

191 Commits

Author SHA1 Message Date
Divakar Verma
b9dbc5c4ab [Mamba][APC] Add test case to compare apc outputs (#34977)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2026-03-26 16:40:35 +00:00
wang.yuqi
dcdc145893 [CI] Reorganize scoring tests (#38207)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-03-26 12:07:01 +00:00
Ilya Boytsov
8b6c6b9505 [Model] Add LFM2-ColBERT-350M support (#37528)
Signed-off-by: Ilya Boytsov <ilyaboytsov1805@gmail.com>
2026-03-20 14:57:57 +00:00
bigshanedogg
2390d44209 [Model] Add HyperCLOVAX-SEED-Think-14B language model support (#37107)
Signed-off-by: bigshanedogg <bigshane319@gmail.com>
2026-03-16 06:40:05 +00:00
whyiug
1ce13cf992 [Model] Add support for BERT-like Chinese ERNIE pooling models (#36385)
Signed-off-by: whyiug <whyiug@hotmail.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-03-13 03:23:53 +00:00
wang.yuqi
6ecabe4936 [CI Failure] Fix Language Models Test (Extended Pooling) daily CI Failure (#36761)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-03-12 12:22:05 +08:00
Harry Mellor
65986db6ba Make Gemma and Gemma 2 accept inputs_embeds like Gemma 3 (#36787)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-11 18:12:43 +00:00
Angela Yi
13e79fc811 [ci] Update rtol for test_classification (#36556)
Signed-off-by: angelayi <yiangela7@gmail.com>
Co-authored-by: Richard Zou <zou3519@users.noreply.github.com>
2026-03-11 03:08:16 -07:00
Micah Williamson
4ff9b045fe [ROCm][CI] Prep Tests For Change To ROCM_ATTN As New Default Backend On ROCm (#36025)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2026-03-09 13:27:55 -05:00
Andreas Karatzas
1e0f917b34 [ROCm][CI] Fix logprob divergence for TitanML/tiny-mixtral under AITER rms_norm (#36101)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-09 12:07:44 -05:00
Andreas Karatzas
a1ffa56a1e [CI] Fix bge-m3 similarity reference values after *Defination* typo fix (#36208)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-06 05:07:29 +00:00
Jiayi Yan
6a895197fa [Bugfix][CI] fix typos (#34934)
Signed-off-by: 1195343015 <1195343015@qq.com>
Signed-off-by: Jiayi Yan <66017932+1195343015@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-05 17:05:46 +00:00
Kunshang Ji
66a2209645 [Hardware] Replace torch.cuda.synchronize() api with torch.accelerator.synchronize (#36085)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-03-05 10:36:39 +00:00
wang.yuqi
fd4a90f337 [CI] And PPL test for Qwen3.5. (#35853)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-03-03 13:15:51 +00:00
Andreas Karatzas
c34963f138 [ROCm][CI] Disable skinny GEMMs in language model standard tests to fix non-determinism (#35152)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-03-02 15:04:18 +08:00
Wentao Ye
d24bdd7c4b [CI] Bump mteb version to mteb[bm25s]>=2, <3 for pooling model unit tests (#34961)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-02-21 20:23:24 -08:00
Andreas Karatzas
89358f0d35 [CI] Fix ColBERT HF comparison tests on AMD CI + refactor (#34567)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-20 20:12:05 -08:00
Ilya Boytsov
071d863e20 Extend ColBERT support to non-standard BERT backbones (#34170)
Signed-off-by: Ilya Boytsov <ilya.boytsov@aleph-alpha.com>
2026-02-13 09:53:09 +00:00
Cyrus Leung
b96f7314b4 [Refactor] Pass Renderer to Input Processor (#34329)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-11 19:38:11 -08:00
Chen Zhang
97fa8f6590 [BugFix] Avoid prefix cache hit in the same schedule step for mamba layers (#29387)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2026-02-10 07:41:16 +00:00
chengchengpei
965525667b Onboard voyage-4-nano (#33720)
Signed-off-by: Chengcheng Pei <chengchengpei@outlook.com>
Signed-off-by: chengchengpei <5881383+chengchengpei@users.noreply.github.com>
Co-authored-by: chengchengpei <5881383+chengchengpei@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-06 06:23:34 +00:00
wang.yuqi
1c3a221d3b [Bugfix] Fix corner case of sparse embedding (#33886)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-02-05 02:51:22 -08:00
Andreas Karatzas
3e472e81f9 [ROCm][Bugfix][CI] Fix hybrid models and their tests (Mamba/Jamba/Bamba) (#32710)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com>
2026-02-05 10:01:23 +00:00
Ilya Boytsov
439afa4eea feat: Add ColBERT late interaction model support (#33686)
Signed-off-by: Ilya Boytsov <ilyaboytsov1805@gmail.com>
Signed-off-by: Ilya Boytsov <boytsovpanamera@mail.ru>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-02-05 08:05:13 +08:00
Cyrus Leung
80f921ba4b [Bugfix] Fix normalize still being passed to PoolerConfig (#33794)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-04 23:56:02 +08:00
Cyrus Leung
f0a1c8453a [Frontend] Use new Renderer for Completions and Tokenize API (#32863)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-31 04:51:15 -08:00
Andreas Karatzas
6c00645712 [CI][Pooling] Stabilize ModernBERT test (#32909)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-27 05:26:48 +00:00
Maximilien de Bayser
ff365eea94 Support bge-m3 sparse embeddings and colbert embeddings (#14526)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
2026-01-22 23:52:57 +08:00
Andreas Karatzas
eb1629da24 [ROCm][CI] Fix AITER test flakiness by using explicit attention backend (#32346)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com>
2026-01-22 13:55:25 +08:00
Lucas Wilkinson
2261340806 [Misc] Remove pad_for_cudagraphs from config (#30143)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
2026-01-20 15:05:48 -05:00
wang.yuqi
c88860d759 [Frontend] Score entrypoint support data_1 & data_2 and queries & documents as inputs (#32577)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-19 14:07:46 +00:00
Isotr0py
8cc26acd8b [Performance] Improve Triton prefill attention kernel's performance (#32403)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-17 20:19:59 -08:00
Cyrus Leung
232214b2ae [Bugfix] Replace PoolingParams.normalize with use_activation (#32243)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-13 10:45:42 +00:00
Cyrus Leung
583a90e005 [Refactor] Separate sequence and token pooling types (#32026)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-01-10 04:53:24 +00:00
Divakar Verma
a1648c4045 [ROCm][CI] Fix test_token_classification.py::test_bert_models (#31993)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2026-01-09 04:04:33 +00:00
Bijaya Dangol
59d260f5e4 [Model] Add Grok-2 (#31847)
Signed-off-by: dangoldbj <dangoldbj23@gmail.com>
2026-01-08 04:59:48 -08:00
Andreas Karatzas
2a42ae790d [ROCm][CI] Fix ModernBERT token classification test numerical accuracy on ROCm (#31820)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-06 23:21:15 +00:00
wang.yuqi
43d384bab4 [CI] Increase the MTEB_EMBED_TOL threshold to 5e-4. (#31797)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-06 19:30:05 +08:00
Isotr0py
ee2e69d6cd [Bugfix][CI/Build] Fix failing pooling models test due to Triton kernel accuracy diff (#31776)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-06 00:44:22 -08:00
wang.yuqi
911d38ed99 [Model] Let more models to support the score template. (#31335)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2026-01-05 11:54:26 +00:00
Isotr0py
367856de14 [CI/Build] Revive skipped reward models e2e test (#31665)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-05 02:33:46 +00:00
Andreas Karatzas
f2b6dfd237 [ROCm][CI] Fix language generation test accuracy by disabling HF flash_sdp and mem_efficient_sdp (#31597)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-05 02:17:05 +00:00
Andreas Karatzas
89f1f25310 [CI] Skip Phi-MoE test due to old API util (#31632)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-05 08:52:07 +08:00
Andreas Karatzas
013b54088c [ROCm][CI] Fix ModernBERT token classification test (#31612)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-02 04:19:08 +00:00
Andreas Karatzas
cf16342d43 [ROCm][CI] Update MiniCPM model test: MiniCPM3-4B to MiniCPM4.1-8B and simplify attention backend testing (#31551)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-31 00:12:01 -08:00
wang.yuqi
1ff67df182 [CI] Reorganization pooling_mteb_test (#31265)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-12-24 23:36:20 +08:00
Jakub Zakrzewski
23daef548d [Frontend] Support using chat template as custom score template for reranking models (#30550)
Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-12-23 11:19:16 +00:00
Chauncey
2a1776b7ac [Refactor] [2/N] Move tool parsers into the vLLM main directory (#30675)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-12-15 12:54:52 +00:00
wang.yuqi
4429d934de [Model] Automatic conversion of TokenClassification model (#30666)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-12-15 08:13:00 +00:00
Cyrus Leung
64251f48df [Chore] Adjust tokenizer import to avoid circular imports (#30601)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-13 04:42:39 -08:00