Commit Graph

341 Commits

Author SHA1 Message Date
Andrey Talman
f97ca67176 [Release 2.10] Update to Torch 2.10 - final release (#30525) 2026-02-08 13:51:09 -08:00
wang.yuqi
6ed5eda300 [CI][Build] Pin grpcio-tools==1.78.0 (#34048)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2026-02-07 05:24:35 -08:00
Andreas Karatzas
c490d8cc73 [ROCm][CI] Pinning lm-eval version to resolve multi-modal small eval bug (#34038)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-06 22:21:08 -08:00
Dimitrios Bariamis
207c3a0c20 Fix RoutingMethodType logic (#33919)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2026-02-06 14:03:34 -08:00
Harry Mellor
ba5cbbf107 Bump HF Hub client to get bug fix (#33984)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-06 11:25:33 +00:00
emricksini-h
325ab6b0a8 [Feature] OTEL tracing during loading (#31162) 2026-02-05 16:59:28 -08:00
Cyrus Leung
32a02c7ca2 Apply #33621 to main (#33758)
Signed-off-by: Zachary Aristei <zaristei@nvidia.com>
Co-authored-by: zaristei2 <zaristei2@gmail.com>
Co-authored-by: Zachary Aristei <zaristei@nvidia.com>
2026-02-04 05:35:39 -08:00
Kunshang Ji
f79f777803 [XPU][2/N] add support unquantized moe support for xpu (#33659)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-02-04 02:12:25 -08:00
Wentao Ye
655efb3e69 [Dependency] Remove comments of ray in dependency files (#33351)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-02-03 15:30:47 -08:00
Kunshang Ji
e10604480b [XPU][1/N] Deprecate ipex and switch to vllm-xpu-kernels for xpu platform (#33379)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-02-02 22:46:10 -08:00
Harry Mellor
8b7346d5f1 Update huggingface-hub again (#33567)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-02 09:20:54 -08:00
Greg Pereira
d6416fdde9 pin LMCache to v0.3.9 or greater with vLLM v0.15.0 (#33440)
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-31 20:50:38 -07:00
Andreas Karatzas
0fb3157267 [ROCm][CI] Update huggingface-hub pin (#33492)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-02-01 02:51:54 +00:00
Harry Mellor
ce0afe2451 Update huggingface-hub pin for the last time before Transformers v5 (#33473)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-01-31 09:14:24 -08:00
Dimitrios Bariamis
f0bca83ee4 Add support for Mistral Large 3 inference with Flashinfer MoE (#33174)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-01-30 22:48:27 -08:00
Patrick von Platen
40c35038d2 [Voxtral] Streaming example (#33042)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2026-01-29 03:22:49 -08:00
Jeffrey Wang
a97b5e206d Relax protobuf library version constraints (#33202)
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
2026-01-28 04:15:53 +00:00
Fadi Arafeh
10e94c84f6 [CPU][Feat] Update PyTorch to v2.10 for CPU Backend (#32869)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-01-23 21:13:06 +08:00
Isotr0py
444e2e7e1f [Misc] Bump opencv-python dependecy version to 4.13 (#32668)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-22 15:51:15 +00:00
Huy Do
f5fdec8ce2 Upgrade transformers-4.57.5 (#32287)
Signed-off-by: Huy Do <huydhn@gmail.com>
2026-01-22 05:19:19 +00:00
elvischenv
808d6fd7b9 Bump Flashinfer to v0.6.1 (#30993)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2026-01-21 08:49:50 -08:00
Daniel Mescheder
cdd03d25d3 [CI/Build] Fix dependency conflict between model-hosting-container-standards and starlette (#32560)
Signed-off-by: Daniel Mescheder <dmesch@amazon.com>
Co-authored-by: Daniel Mescheder <dmesch@amazon.com>
2026-01-19 03:27:08 -08:00
TJian
41c544f78a [ROCm] [CI] [Release] Rocm wheel pipeline with sccache (#32264)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-16 02:56:18 +08:00
Andreas Karatzas
ae1eba6a9a [ROCm][CI] Pin transformers 4.57.3 to fix jina test failures (#32350)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-15 15:19:34 +08:00
David
6b176095e3 [Build] Relax anthropic version pin from ==0.71.0 to >=0.71.0 (#32289)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-13 23:21:39 -08:00
HappyAmazonian
2f4a71daf2 [Misc] Add In-Container restart capability through supervisord for sagemaker entrypoint (#28502)
Signed-off-by: Shen Teng <sheteng@amazon.com>
Signed-off-by: HappyAmazonian <91216626+HappyAmazonian@users.noreply.github.com>
2026-01-13 13:06:10 -08:00
Isotr0py
cee7436a26 [Misc] Make scipy as optional audio/benchmark dependency (#32096)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-11 00:18:57 -08:00
TJian
7a05d2dc65 [CI] [ROCm] Fix tests/entrypoints/test_grpc_server.py on ROCm (#31970)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-09 12:54:20 +08:00
Chang Su
791b2fc30a [grpc] Support gRPC server entrypoint (#30190)
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Signed-off-by: njhill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: njhill <nickhill123@gmail.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2026-01-07 23:24:46 -08:00
Andreas Karatzas
364a8bc6dc [ROCm][CI] Fix plugin tests (2 GPUs) failures on ROCm and removing VLLM_FLOAT32_MATMUL_PRECISION from all ROCm tests (#31829)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-07 01:12:23 +00:00
Andreas Karatzas
e5d427e93a [ROCm][CI] Pinning timm lib version to fix ImportError in Multi-Modal Tests (Nemotron) (#31835)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-06 23:23:11 +00:00
Robert Shaw
81323ea221 [CI] Fix CPU MM PRocessor Test (#31764)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2026-01-06 04:22:18 +00:00
amitz-nv
ee21291825 [Model] Nemotron Parse 1.1 Support (#30864)
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-05 13:00:14 -08:00
wang.yuqi
76fd458aa7 [CI] Bump sentence-transformer from 3.2.1 to 5.2.0 (#31664)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-04 21:45:01 -08:00
qli88
0f35429a0c [CI]Test Group 'NixlConnector PD accuracy tests' is fixed (#31460)
Signed-off-by: qli88 <qiang.li2@amd.com>
2025-12-29 23:48:56 +00:00
Andreas Karatzas
573dd0e6f0 [ROCm] Migrate xgrammar to upstream release (#31327)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-28 00:08:29 -08:00
Andreas Karatzas
96142f2094 [ROCm][CI] Added perceptron lib in requirements for isaac multi-modal test (#31441)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-28 04:15:14 +00:00
Patrick von Platen
48e744976c [Mistral common] Ensure all functions are imported from the top & only use public methods (#31138)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-26 04:48:24 -08:00
oscardev256
b7165d53c6 Feature/isaac 0.1 (#28367)
Signed-off-by: oscardev256 <42308241+oscardev256@users.noreply.github.com>
Signed-off-by: Oscar Gonzalez <ogonzal6@alumni.jh.edu>
Signed-off-by: Yang <lymailforjob@gmail.com>
Co-authored-by: Yang <lymailforjob@gmail.com>
2025-12-25 18:49:11 -08:00
Cyrus Leung
d201807339 [Chore] Bump lm-eval version (#31264)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 05:39:13 -08:00
Li, Jiang
e3ab93c896 [CPU] Refactor CPU fused MOE (#30531)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-12-18 14:36:49 +08:00
Andrey Talman
e06d0bf0aa 2.9.1 PyTorch release update (#28495) 2025-12-17 12:20:22 -08:00
shanjiaz
009a773828 bump up compressed tensors version to 0.13.0 (#30799)
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
2025-12-16 21:01:04 -08:00
Michael Goin
811cdf5197 Update model-hosting-container-standards to 0.1.10 (#30815)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
2025-12-16 17:52:14 -08:00
Nick Hill
947dfda9c2 [LMCache] Relax lmcache version requirement (#30425)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-11 18:18:47 -09:00
Ye (Charlotte) Qi
e458270a95 [Misc] Add mcp to requirements (#30474)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
2025-12-11 20:06:09 +00:00
Nick Hill
6ccb7baeb1 [LMCache] Fix breakage due to new LMCache version (#30216)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-10 11:52:01 -08:00
Andreas Karatzas
ed7af3178a [ROCm][CI] Attempt to fix the failures under a subgroup of the e2e the test group (#29358)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
2025-12-10 05:33:13 +00:00
Johnny Yang
d1b5e7afbf [TPU] Bump tpu-inference to 0.12.0 (#30221)
Signed-off-by: Johnny Yang <johnnyyang@google.com>
2025-12-08 20:10:10 +00:00
Deboleina
02a4169193 [Tests] Tool call tests for openai/gpt-oss-20b (#26237)
Signed-off-by: Debolina Roy <debroy@redhat.com>
2025-12-05 19:03:29 -08:00