Nicolò Lucchesi
c212202d93
[Misc] Bound NIXL upper bound version (#35495)
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2026-03-02 16:57:07 +08:00
Hongxia Yang
f26650d649
[ROCm] add amd-quark package in requirements for rocm to use quantized models (#35658)
...
Signed-off-by: Hongxia Yang <hongxiay.yang@amd.com>
Co-authored-by: Hongxia Yang <hongxiay.yang@amd.com>
2026-03-02 06:02:43 +00:00
Lucas Wilkinson
8b5014d3dd
[Attention] FA4 integration (#32974)
...
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2026-03-01 23:44:57 +00:00
Sage Moore
9e2cabdf9c
[ROCm] Update the torch version in rocm_build.txt to use the official 2.10 release (#34387)
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2026-02-26 16:28:45 +00:00
Wentao Ye
d24bdd7c4b
[CI] Bump mteb version to mteb[bm25s]>=2, <3 for pooling model unit tests (#34961)
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-02-21 20:23:24 -08:00
Andreas Karatzas
d403c1da1c
[CI] Stabilizing ROCm amd-ci signal and minor name fix in upstream (#35008)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-22 04:01:10 +00:00
Cyrus Leung
965fe45935
[CI/Build] Fix gRPC version mismatch (#35013)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2026-02-21 12:14:41 -07:00
BADAOUI Abdennacer
8dc8a99b56
[ROCm] Enable bitsandbytes quantization support on ROCm (#34688)
...
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
2026-02-21 00:34:55 -08:00
Vlad Tiberiu Mihailescu
e739c29ea4
[CI/Build] Add opentelemetry libs in default vllm build (requirements/common.txt) (#34466)
...
Signed-off-by: Vlad Mihailescu <vtmihailescu@gmail.com>
2026-02-20 19:54:55 -08:00
Wei Zhao
ea5f903f80
Bump Flashinfer Version and Re-enable DeepSeek NVFP4 AR+Norm Fusion (#34899)
...
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-02-20 13:37:31 -08:00
Harry Mellor
6ce80f7071
Ensure that MkDocs v2 does not get installed (#34958)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-20 15:38:11 +00:00
Kyle Sayers
64ac1395e8
[Docs] Clean up speculators docs (#34065)
...
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2026-02-18 13:48:11 -08:00
Teng Ma
d7ff22204a
[Misc] Add mooncake-transfer-engine to kv_connectors requirements (#34826)
...
Signed-off-by: Teng Ma <teng-ma@linux.alibaba.com>
2026-02-18 18:26:24 +00:00
Andreas Karatzas
03a8770a6d
[ROCm][CI] Fix plugins test group; updating terratorch and dependencies (#34589)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-16 07:33:42 -08:00
Harry Mellor
a21cedf4ff
Bump lm-eval version for Transformers v5 compatibility (#33994)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-16 05:24:35 -08:00
Christian Pinto
342a7cda2d
[Misc] Update tests and examples for Prithvi/Terratorch models (#34416)
...
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-02-13 23:03:51 -08:00
Andreas Karatzas
de42abb366
[CI] Heavy refactoring of Voxtral multimodal audio model tests (#34294)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-13 20:04:29 -08:00
Patrick von Platen
6c0baee610
[Voxtral Realtime] Refactor & Improve buffering logic (#34428)
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-12 09:46:43 -08:00
Seiji Eicher
5045d5c983
Patch protobuf for CVE-2026-0994 (#34253)
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: Kevin H. Luu <khluu000@gmail.com>
2026-02-11 02:25:04 -08:00
Nick Hill
79504027ef
[Misc] Bump fastsafetensors version for latest fixes (#34273)
...
Signed-off-by: Nick Hill <nickhill123@gmail.com>
2026-02-11 00:30:09 -08:00
zofia
b482f71e9f
[XPU][7/N] enable xpu fp8 moe (#34202)
...
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
2026-02-11 03:33:59 +00:00
Andrey Talman
f97ca67176
[Release 2.10] Update to Torch 2.10 - final release (#30525)
2026-02-08 13:51:09 -08:00
wang.yuqi
6ed5eda300
[CI][Build] Pin grpcio-tools==1.78.0 (#34048)
...
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2026-02-07 05:24:35 -08:00
Andreas Karatzas
c490d8cc73
[ROCm][CI] Pinning lm-eval version to resolve multi-modal small eval bug (#34038)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-06 22:21:08 -08:00
Dimitrios Bariamis
207c3a0c20
Fix RoutingMethodType logic (#33919)
...
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2026-02-06 14:03:34 -08:00
Harry Mellor
ba5cbbf107
Bump HF Hub client to get bug fix (#33984)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-06 11:25:33 +00:00
emricksini-h
325ab6b0a8
[Feature] OTEL tracing during loading (#31162)
2026-02-05 16:59:28 -08:00
Cyrus Leung
32a02c7ca2
Apply #33621 to main (#33758)
...
Signed-off-by: Zachary Aristei <zaristei@nvidia.com>
Co-authored-by: zaristei2 <zaristei2@gmail.com>
Co-authored-by: Zachary Aristei <zaristei@nvidia.com>
2026-02-04 05:35:39 -08:00
Kunshang Ji
f79f777803
[XPU][2/N] add support unquantized moe support for xpu (#33659)
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-02-04 02:12:25 -08:00
Wentao Ye
655efb3e69
[Dependency] Remove comments of ray in dependency files (#33351)
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-02-03 15:30:47 -08:00
Kunshang Ji
e10604480b
[XPU][1/N] Deprecate ipex and switch to vllm-xpu-kernels for xpu platform (#33379)
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-02-02 22:46:10 -08:00
Harry Mellor
8b7346d5f1
Update huggingface-hub again (#33567)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-02 09:20:54 -08:00
Greg Pereira
d6416fdde9
pin LMCache to v0.3.9 or greater with vLLM v0.15.0 (#33440)
...
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-31 20:50:38 -07:00
Andreas Karatzas
0fb3157267
[ROCm][CI] Update huggingface-hub pin (#33492)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-01 02:51:54 +00:00
Harry Mellor
ce0afe2451
Update huggingface-hub pin for the last time before Transformers v5 (#33473)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-01-31 09:14:24 -08:00
Dimitrios Bariamis
f0bca83ee4
Add support for Mistral Large 3 inference with Flashinfer MoE (#33174)
...
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-01-30 22:48:27 -08:00
Patrick von Platen
40c35038d2
[Voxtral] Streaming example (#33042)
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2026-01-29 03:22:49 -08:00
Jeffrey Wang
a97b5e206d
Relax protobuf library version constraints (#33202)
...
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
2026-01-28 04:15:53 +00:00
Fadi Arafeh
10e94c84f6
[CPU][Feat] Update PyTorch to v2.10 for CPU Backend (#32869)
...
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-01-23 21:13:06 +08:00
Isotr0py
444e2e7e1f
[Misc] Bump opencv-python dependecy version to 4.13 (#32668)
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-22 15:51:15 +00:00
Huy Do
f5fdec8ce2
Upgrade transformers-4.57.5 (#32287)
...
Signed-off-by: Huy Do <huydhn@gmail.com>
2026-01-22 05:19:19 +00:00
elvischenv
808d6fd7b9
Bump Flashinfer to v0.6.1 (#30993)
...
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2026-01-21 08:49:50 -08:00
Daniel Mescheder
cdd03d25d3
[CI/Build] Fix dependency conflict between model-hosting-container-standards and starlette (#32560)
...
Signed-off-by: Daniel Mescheder <dmesch@amazon.com>
Co-authored-by: Daniel Mescheder <dmesch@amazon.com>
2026-01-19 03:27:08 -08:00
TJian
41c544f78a
[ROCm] [CI] [Release] Rocm wheel pipeline with sccache (#32264)
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-16 02:56:18 +08:00
Andreas Karatzas
ae1eba6a9a
[ROCm][CI] Pin transformers 4.57.3 to fix jina test failures (#32350)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-15 15:19:34 +08:00
David
6b176095e3
[Build] Relax anthropic version pin from ==0.71.0 to >=0.71.0 (#32289)
...
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-13 23:21:39 -08:00
HappyAmazonian
2f4a71daf2
[Misc] Add In-Container restart capability through supervisord for sagemaker entrypoint (#28502)
...
Signed-off-by: Shen Teng <sheteng@amazon.com>
Signed-off-by: HappyAmazonian <91216626+HappyAmazonian@users.noreply.github.com>
2026-01-13 13:06:10 -08:00
Isotr0py
cee7436a26
[Misc] Make scipy as optional audio/benchmark dependency (#32096)
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-11 00:18:57 -08:00
TJian
7a05d2dc65
[CI] [ROCm] Fix tests/entrypoints/test_grpc_server.py on ROCm (#31970)
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-09 12:54:20 +08:00
Chang Su
791b2fc30a
[grpc] Support gRPC server entrypoint (#30190)
...
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Signed-off-by: njhill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: njhill <nickhill123@gmail.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2026-01-07 23:24:46 -08:00