Commit Graph

4003 Commits

Author SHA1 Message Date
Andreas Karatzas
f2b6dfd237 [ROCm][CI] Fix language generation test accuracy by disabling HF flash_sdp and mem_efficient_sdp (#31597)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-05 02:17:05 +00:00
Andreas Karatzas
89f1f25310 [CI] Skip Phi-MoE test due to old API util (#31632)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-05 08:52:07 +08:00
Andreas Karatzas
4f9ce35afe [CI][Bugfix] Fix token counting in chunked prefill compl test (#31630)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-03 14:28:49 +08:00
jeremyteboul
97a01308e9 Improve HF qwen3_omni: preserve audio_sample_rate in kwargs restructuring (#29255)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
2026-01-03 04:31:09 +00:00
Xingyu Liu
0eee877f67 [Core] Parse vLLM engine required fields from hf_config to model_arch_config (#28454)
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Signed-off-by: Xingyu Liu <38244988+charlotte12l@users.noreply.github.com>
2026-01-02 15:13:15 -08:00
Nick Hill
bd877162eb [BugFix] Support online dense model DP without overhead (#30739)
Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: njhill <nickhill123@gmail.com>
2026-01-02 23:36:38 +08:00
Xinyu Chen
08f425bad1 CustomOp: test forward dispatch for grouped_topk (#31530)
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
2026-01-02 10:04:01 -05:00
Andreas Karatzas
013b54088c [ROCm][CI] Fix ModernBERT token classification test (#31612)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-02 04:19:08 +00:00
Andreas Karatzas
21de6d4b02 [CI][Bugfix] Fix token counting in chunked prefill streaming test (#31565)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-31 23:05:14 +00:00
baonudesifeizhai
d722e9e614 Add GLM-ASR multimodal support (#31436)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-31 23:12:24 +08:00
Andreas Karatzas
cf16342d43 [ROCm][CI] Update MiniCPM model test: MiniCPM3-4B to MiniCPM4.1-8B and simplify attention backend testing (#31551)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-31 00:12:01 -08:00
B-201
ecd49ce7e6 [Fix] Align fused moe lora_b shape with peft (#31534)
Signed-off-by: bk-201 <joy25810@foxmail.com>
2025-12-31 09:44:59 +08:00
yt0428
3f52fa5aa2 [Model] Add support for openPangu moe model (#28775)
Signed-off-by: yuantao <2422264527@qq.com>
Signed-off-by: yt0428 <51468697+yt0428@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-30 08:11:38 -08:00
Nicolò Lucchesi
ab1af6aa3e [CI][NIXL] Split DPEP tests (#31491)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-12-30 07:26:12 -05:00
ZT-AIA
f84bf7d79b Add Loraconfig parameter to get_punica_wrapper function (#31408)
Signed-off-by: ZT-AIA <1028681969@qq.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-12-29 22:27:31 -08:00
Hojin Yang
dc837bc23e feat(frontend): add --default-chat-template-kwargs CLI argument (#31343)
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
2025-12-30 03:38:47 +00:00
wangln19
358bfd315c fix: update kimi k2 tool parser logic (#31207)
Signed-off-by: wangln19 <wanglinian@dev.wanglinian.msh-dev.svc.cluster.local>
Signed-off-by: Wang Linian <wanglinian@stu.pku.edu.cn>
Co-authored-by: wangln19 <wanglinian@dev.wanglinian.msh-dev.svc.cluster.local>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-12-30 10:01:58 +08:00
Sage
39512aba72 [Prefix Cache] Include lora_name in BlockStored event for deterministic KV-cache reconstruction (#27577)
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Co-authored-by: Sage <80211083+sagiahrac@users.noreply.github.com>
2025-12-30 00:17:16 +00:00
Alexei-V-Ivanov-AMD
d63b969675 [CI/ROCm] Fixing "V1 Test attention (H100)" test group. (#31187)
Signed-off-by: DCCS-4560 <alivanov@chi-mi325x-pod1-108.ord.vultr.cpe.ice.amd.com>
Signed-off-by: <>
Co-authored-by: DCCS-4560 <alivanov@chi-mi325x-pod1-108.ord.vultr.cpe.ice.amd.com>
Co-authored-by: root <root@chi-mi325x-pod1-108.ord.vultr.cpe.ice.amd.com>
2025-12-29 16:53:59 -05:00
amittell
9c884faa95 [Bugfix] Preserve tool call id/type/name in streaming finish chunk (#31438)
Signed-off-by: amittell <mittell@me.com>
Signed-off-by: Alex Mittell <mittell@me.com>
2025-12-29 21:10:52 +08:00
Chauncey
48d5ca4e8b [CI] fix test_chat_truncation_content_not_null test (#31488)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-12-29 12:47:08 +00:00
twj
bf73a3e4d7 [Bugfix][Frontend] Fix Jina reranker multimodal input compatibility (#31445)
Signed-off-by: tianwenjing <tianwenjing@jfgenius.com>
Signed-off-by: twj <151701930+twjww@users.noreply.github.com>
Co-authored-by: tianwenjing <tianwenjing@jfgenius.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-29 01:13:18 -08:00
Andreas Karatzas
45c1ca1ca1 [ROCm][CI] Skip DeepGemm-dependent test on ROCm platform (#31462)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-29 16:31:10 +09:00
Boyuan Feng
2f12cd32c0 [BugFix] Fix cache issue in compilation_config (#31376)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
2025-12-27 09:30:39 -05:00
Isotr0py
3d024985ab [CI/Build] Ignore max transformers version for more common tests (#31401)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-27 13:06:26 +00:00
baonudesifeizhai
8711b21676 Fix/get raw stream patch #30905 (#30912)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-12-26 20:08:47 -08:00
Yifan Qiao
52bf066516 [Core][Hybrid allocator + connector] Support hybrid allocator + kv cache connector (#30166)
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: KuntaiDu <kuntai@uchicago.edu>
2025-12-26 18:25:46 -08:00
Kunshang Ji
5326c89803 [XPU][CI]skip test_preprocess_error_handling due to fork/spawn issue (#31381)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-12-26 21:40:44 +00:00
Jee Jee Li
ce1eafd1a5 [Core] Initialize LoRA support for tower and connector in multi-modal models (#26674)
Signed-off-by: bk-201 <joy25810@foxmail.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: prashanth058 <prashanth.dannamaneni@uipath.com>
Co-authored-by: bk-201 <joy25810@foxmail.com>
Co-authored-by: prashanth058 <prashanth.dannamaneni@uipath.com>
Co-authored-by: Anexdeus <5142168@mail.ru>
2025-12-26 04:48:20 -08:00
Andreas Karatzas
c79dbfa9ad [CI] Fix flaky vision beam search test with flexible semantic validation (#31324)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-26 04:39:32 +00:00
Isotr0py
2cd94259c8 [CI/Build] Ignore max transformers version skipping for initialization tests (#30619)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-26 10:50:32 +08:00
oscardev256
b7165d53c6 Feature/isaac 0.1 (#28367)
Signed-off-by: oscardev256 <42308241+oscardev256@users.noreply.github.com>
Signed-off-by: Oscar Gonzalez <ogonzal6@alumni.jh.edu>
Signed-off-by: Yang <lymailforjob@gmail.com>
Co-authored-by: Yang <lymailforjob@gmail.com>
2025-12-25 18:49:11 -08:00
Nick Hill
81786c8774 [BugFix] Fix async scheduling + reasoning with struct output (#31332)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
2025-12-25 23:01:02 +00:00
SongHe
2d6001f491 [Model][Ernie4.5-VL] Support video metadata for timestamp rendering (#31274)
Signed-off-by: dengsonghe <dengsonghe@baidu.com>
Co-authored-by: dengsonghe <dengsonghe@baidu.com>
2025-12-25 14:07:15 +00:00
Amir Samani
030fc44914 use the same stream for cuda graph catpure and replay for NCCL (#29207)
Signed-off-by: Amir Samani <asamani@nvidia.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2025-12-25 19:10:03 +08:00
Mark Gatere
ba25a65992 [Frontend] add FunctionGemma tool parser support (#31218)
Signed-off-by: gateremark <gateremg@gmail.com>
2025-12-25 15:29:25 +08:00
Richard Zou
254f6b9867 [Bugfix] Fix eagle dp tests on A100 (#31241)
Signed-off-by: Richard Zou <zou3519@gmail.com>
2025-12-25 00:05:04 +00:00
Cyrus Leung
09dc7c690c [Chore][1/2] Drop v0.14 deprecations (#31285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 09:54:01 -08:00
wang.yuqi
1ff67df182 [CI] Reorganization pooling_mteb_test (#31265)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-12-24 23:36:20 +08:00
Cyrus Leung
aa3868ecfe [Chore] Remove unused noqas (#31263)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 05:38:46 -08:00
Andreas Karatzas
0247a91e00 [ROCm][CI] Fix entrypoints tests and Python-only installation test on ROCm (#28979)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-23 22:42:30 -08:00
Michael Goin
8ee90c83f8 Add --max-model-len auto to auto-fit context to available memory (#29431)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-12-23 21:37:14 -08:00
Micah Williamson
3ce791ac77 [ROCm][CI] Set VLLM_FLOAT32_MATMUL_PRECISION="tf32" For terratorch Tests In AMD CI (#31242)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-12-24 03:21:50 +00:00
Andreas Karatzas
e42894f5b5 [ROCm][CI][Bugfix] Fix Siglip2 rotary embedding dispatch and InternVL video test tolerance (#31235)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-24 02:56:58 +00:00
Chen Zhang
538e830caa [KVEvent] User request.block_hash for parent block_hash (#30544)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: Yifan Qiao <yifanqiao@berkeley.edu>
2025-12-23 18:23:43 -08:00
rongfu.leng
4ed11105d7 [Misc] Remove unused custom ops copy_blocks and copy_blocks_mla (#30967)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-12-23 18:22:35 -08:00
Vadim Gimpelson
bc0a5a0c08 [CI] Add Qwen3-Next-FP8 to Blackwell model tests (#31049)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
2025-12-23 17:21:50 -08:00
Andreas Karatzas
bfa2c0bbb9 [ROCm][Bugfix] Fix RuntimeError in MMEncoderAttention by replacing .view() with .reshape() (#31203)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-23 21:48:01 +00:00
Mark McLoughlin
f790068600 [Core] Add a random suffix to frontend-provided request IDs (#27987)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-12-23 13:05:39 -08:00
Joachim Studnia
38c361f99d Fix edge case Mistral tool parser (#30724)
Signed-off-by: Joachim Studnia <joachim@mistral.ai>
Signed-off-by: Joachim Studnia <studniajoachim@gmail.com>
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: juliendenize <julien.denize@mistral.ai>
Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2025-12-23 14:19:58 +00:00