Wentao Ye
|
d24bdd7c4b
|
[CI] Bump mteb version to mteb[bm25s]>=2, <3 for pooling model unit tests (#34961)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-02-21 20:23:24 -08:00 |
|
Roman
|
98b0205c3c
|
[Frontend] Add automatic language detection for Whisper transcription (#34342)
Signed-off-by: space_check <roman.vuskov@rwth-aachen.de>
Signed-off-by: Roman <45857014+spacecheck@users.noreply.github.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
|
2026-02-21 04:49:41 -08:00 |
|
Nicolò Lucchesi
|
ab6f3487a6
|
[PD] Change kv_load_failure_policy Default from "recompute" to "fail" (#34896)
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-21 01:34:57 -08:00 |
|
BADAOUI Abdennacer
|
8dc8a99b56
|
[ROCm] Enable bitsandbytes quantization support on ROCm (#34688)
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
|
2026-02-21 00:34:55 -08:00 |
|
Andreas Karatzas
|
54254f7a61
|
[ROCm][CI] Fix spec decode logprobs flakiness and parametrize tree attention backends (#34599)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-02-20 20:25:23 -08:00 |
|
Andreas Karatzas
|
89358f0d35
|
[CI] Fix ColBERT HF comparison tests on AMD CI + refactor (#34567)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-02-20 20:12:05 -08:00 |
|
zhongdaor-nv
|
a0fe7ea2f0
|
[feat] Add per-block extra_keys to KV events (#33304)
Signed-off-by: zhongdaor-nv <zhongdaor@nvidia.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-20 20:11:40 -08:00 |
|
Andreas Karatzas
|
991d6bff38
|
[CI][MCP][Harmony] Heavy refactoring Harmony & MCP response tests and stabilizing with deterministic test infrastructure (#33949)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-02-20 20:03:32 -08:00 |
|
Kata Coder
|
5719a4e4e6
|
[Frontend] Support multimodal inputs for late-interaction scoring (ColQwen3) + NewModel: nvidia/nemotron-colembed (#34574)
Signed-off-by: craftsangjae <craftsangjae@gmail.com>
|
2026-02-20 20:01:40 -08:00 |
|
pougetat
|
11be2c74dc
|
[Realtime] Add Qwen3-ASR realtime streaming support (#34613)
Signed-off-by: Thomas Pouget-Abadie <thomaspou@microsoft.com>
Co-authored-by: Thomas Pouget-Abadie <thomaspou@microsoft.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
|
2026-02-20 19:59:42 -08:00 |
|
Xin Yang
|
7a5adad480
|
[Kernel] Optimize sample_recovered_tokens_kernel (#34974)
Signed-off-by: Xin Yang <xyangx@amazon.com>
|
2026-02-20 19:59:06 -08:00 |
|
Yanan Cao
|
9d7577b2bd
|
[Kernel] [Helion] [9/N] Canonicalize GPU variant names to base model names (#34928)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-02-20 19:55:51 -08:00 |
|
Ryan Rock
|
0632ed8778
|
[AMD][CI] Fix test_custom_allreduce for A100 testgroup (#34735)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
|
2026-02-20 21:33:04 +00:00 |
|
Lucas Wilkinson
|
aaefc58ee0
|
[CI] Revert PRs 34818 and 33600 (#34979)
|
2026-02-20 13:25:50 -08:00 |
|
Wei Zhao
|
f24b2de3d3
|
[Test] Add FP8 KV Cache Testing for MLA Backends (#34473)
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
|
2026-02-20 18:51:58 +00:00 |
|
Yanan Cao
|
a6d0299c75
|
[Kernel] [Helion] [6/N] Add num_tokens dimension to silu_mul autotuning and dispatching (#34185)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
|
2026-02-20 08:36:51 -08:00 |
|
Xin Yang
|
b1c4f0b265
|
[Kernel] Optimize grouped topk kernel (#34206)
Signed-off-by: Xin Yang <xyangx@amazon.com>
|
2026-02-20 01:34:45 -08:00 |
|
Micah Williamson
|
f5432e35a3
|
[ROCm][CI] Loosen RemoteOpenAIServer Startup Timeout (#34922)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-02-20 05:37:49 +00:00 |
|
rasmith
|
0c1dc42748
|
[CI][AMD][BugFix][P/D] Add default_vllm_config to test_moriio_connector.py so tests pass (#33739)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2026-02-19 21:32:40 -08:00 |
|
Varun Chawla
|
676f82ae81
|
Add validation to reject non-text content in system messages (#34072)
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
|
2026-02-19 21:30:33 -08:00 |
|
Matthias Gehre
|
4e2c7caf2d
|
[Bugfix] Add regression test for MoE quant_config under torch.compile (#34335)
Signed-off-by: Matthias Gehre <matthias.gehre@amd.com>
|
2026-02-20 13:27:26 +08:00 |
|
Matthew Bonanni
|
662205d34e
|
[Bugfix] Fix Basic Models Test (#34818)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-02-19 14:49:07 -08:00 |
|
Cyrus Leung
|
23210a911e
|
[CI/Build] Try to make beam search test less flaky (#34885)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-19 19:16:58 +08:00 |
|
Cyrus Leung
|
1391378861
|
[Bugfix] Fix edge case in UUID data parsing (#34884)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-19 02:24:30 -08:00 |
|
Andreas Karatzas
|
f6220f9877
|
[ROCm][Test] Fix beam search determinism failures from batch-size-dependent FP divergence and removed wrong marker (#34878)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-02-19 08:25:26 +00:00 |
|
Tal Nir
|
f75b61a9e9
|
[Voxtral Realtime] Fix engine crash on empty multimodal embeddings (#34862)
Signed-off-by: Tal Nir <tal@nervexneurotech.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-02-18 23:21:47 -08:00 |
|
Jaeyeon Kim(김재연)
|
9681068cf9
|
[Frontend] Fix reasoning_tokens for text-based parsers in Responses API (#33513)
Signed-off-by: Jaeyeon Kim <anencore94@gmail.com>
|
2026-02-18 23:16:41 -08:00 |
|
rasmith
|
2b84ac669c
|
[CI][AMD][BugFix] Use torch.testing.assert_close instead of assert torch.allclose in test_rocm_skinny_gemms.py (#34181)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2026-02-18 23:10:19 +00:00 |
|
Aaron Hao
|
e99ba957ec
|
[BUG] Fixing Weight Sync unit test (#34841)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2026-02-18 17:20:10 -05:00 |
|
Kyle Sayers
|
64ac1395e8
|
[Docs] Clean up speculators docs (#34065)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
|
2026-02-18 13:48:11 -08:00 |
|
Cyrus Leung
|
61cf087680
|
[Bugfix] Fix lora tests (#34834)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-02-18 13:22:31 -08:00 |
|
Wenlong Wang
|
847a57cd12
|
[Bugfix][MoE Kernel] Fix incorrect routing selection for models without expert groups (e.g., MiniMax-M2.1) (#34673)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-02-18 13:03:24 -08:00 |
|
rasmith
|
fcd6ac97ed
|
[CI][AMD][BugFix] Skip tests in test_unquantized_backend_selection that should not run on ROCm (#34655)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2026-02-18 15:00:40 -05:00 |
|
Michael Goin
|
caeb887bf6
|
[Bugfix] Fix NVFP4 TRTLLM MoE non-gated support; add gsm8k for Nemotron-3-Nano FP8+NVFP4 (#34725)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-02-18 09:39:22 -08:00 |
|
Burkhard Ringlein
|
e24663c5a9
|
Add unit tests for fp8 output fusion of triton_attn (#34228)
Signed-off-by: Burkhard Ringlein <ngl@zurich.ibm.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-02-18 06:22:49 -05:00 |
|
ElizaWszola
|
a88b3be7c4
|
[Bugfix] Fix quant RMS norm fusion for quantization with TMA-aligned scales (#33255)
Signed-off-by: ElizaWszola <ewszola@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2026-02-17 23:35:04 -08:00 |
|
Cyrus Leung
|
30ebe0dc3c
|
[CI/Build] Remove use of skip_v1 (#34699)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-18 12:19:11 +08:00 |
|
Andreas Karatzas
|
cef65f0715
|
[ROCm][CI] Removed hard-coded attn backend requirement for Qwen VL (#34753)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-02-18 03:59:53 +00:00 |
|
Russell Bryant
|
6f3b2047ab
|
[Core] Fix SSRF bypass via backslash-@ URL parsing inconsistency (#34743)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: isotr0py <2037008807@qq.com>
|
2026-02-18 03:53:35 +00:00 |
|
Cyrus Leung
|
a0d8d944e2
|
[Renderer] Move MM Hash parsing into Renderer (#34711)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-17 19:18:55 -08:00 |
|
Jongseok Park
|
c656ba3b4d
|
[Kernel] Triton-based Top-k and Top-p sampler kernels (#33538)
Signed-off-by: js_park <cakeng@naver.com>
Signed-off-by: Jongseok Park <37990712+cakeng@users.noreply.github.com>
Signed-off-by: Sunga Kim <sunga.kim@berkeley.edu>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Sunga Kim <sunga.kim@berkeley.edu>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
|
2026-02-17 23:14:30 +00:00 |
|
Flora Feng
|
1e4a084c8e
|
[CI] Fix flaky test_parsable_context (#34717)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-02-17 18:42:52 +00:00 |
|
Richard Zou
|
7967e854da
|
[BugFix] Fix sp tests (#34716)
Signed-off-by: Richard Zou <zou3519@gmail.com>
|
2026-02-17 17:07:56 +00:00 |
|
almayne
|
6bd6d0c3c1
|
Fixed whisper CPU test that does not spawn properly. (#34324)
Signed-off-by: Anna Mayne <anna.mayne@arm.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-02-17 06:46:23 -08:00 |
|
Cyrus Leung
|
574fe75245
|
[Renderer] Move InputPreprocessor into Renderer (2/2) (#34560)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2026-02-17 05:29:01 -08:00 |
|
junuxyz
|
c61a98f529
|
[CI][BugFix] ShellCheck cleanup to remove baseline and preserve runtime behavior (#34514)
Signed-off-by: junuxyz <216036880+junuxyz@users.noreply.github.com>
|
2026-02-17 12:22:56 +00:00 |
|
kourosh hakhamaneshi
|
c464b57374
|
[Ray] Propagate third-party env vars to Ray workers via prefix matching (#34383)
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
|
2026-02-17 01:08:42 -08:00 |
|
haosdent
|
b68fd899d1
|
[Bugfix] Fix fused MoE int32 overflow in stride*offset without perf regression (#34507)
Signed-off-by: haosdent <haosdent@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-02-16 17:58:49 -08:00 |
|
Nicolò Lucchesi
|
6cc403e67d
|
[Bugfix][CI] Fix flaky entrypoints/openai/test_response_api_with_harmony.py::test_function_calling[openai/gpt-oss-20b] (#34624)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2026-02-16 16:11:07 +00:00 |
|
Almog Tavor
|
72d5951d02
|
[Bugfix] Treat generation_config max_tokens as default not ceiling (#34063)
Signed-off-by: almogtavor <almogtavor@gmail.com>
|
2026-02-16 07:58:24 -08:00 |
|