320 Commits

Author SHA1 Message Date
Daniel Mescheder
cdd03d25d3 [CI/Build] Fix dependency conflict between model-hosting-container-standards and starlette (#32560)
Signed-off-by: Daniel Mescheder <dmesch@amazon.com>
Co-authored-by: Daniel Mescheder <dmesch@amazon.com>
2026-01-19 03:27:08 -08:00
TJian
41c544f78a [ROCm] [CI] [Release] Rocm wheel pipeline with sccache (#32264)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-16 02:56:18 +08:00
Andreas Karatzas
ae1eba6a9a [ROCm][CI] Pin transformers 4.57.3 to fix jina test failures (#32350)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-15 15:19:34 +08:00
David
6b176095e3 [Build] Relax anthropic version pin from ==0.71.0 to >=0.71.0 (#32289)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-13 23:21:39 -08:00
HappyAmazonian
2f4a71daf2 [Misc] Add In-Container restart capability through supervisord for sagemaker entrypoint (#28502)
Signed-off-by: Shen Teng <sheteng@amazon.com>
Signed-off-by: HappyAmazonian <91216626+HappyAmazonian@users.noreply.github.com>
2026-01-13 13:06:10 -08:00
Isotr0py
cee7436a26 [Misc] Make scipy as optional audio/benchmark dependency (#32096)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-01-11 00:18:57 -08:00
TJian
7a05d2dc65 [CI] [ROCm] Fix tests/entrypoints/test_grpc_server.py on ROCm (#31970)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-09 12:54:20 +08:00
Chang Su
791b2fc30a [grpc] Support gRPC server entrypoint (#30190)
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Signed-off-by: njhill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: njhill <nickhill123@gmail.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2026-01-07 23:24:46 -08:00
Andreas Karatzas
364a8bc6dc [ROCm][CI] Fix plugin tests (2 GPUs) failures on ROCm and removing VLLM_FLOAT32_MATMUL_PRECISION from all ROCm tests (#31829)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-07 01:12:23 +00:00
Andreas Karatzas
e5d427e93a [ROCm][CI] Pinning timm lib version to fix ImportError in Multi-Modal Tests (Nemotron) (#31835)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2026-01-06 23:23:11 +00:00
Robert Shaw
81323ea221 [CI] Fix CPU MM PRocessor Test (#31764)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2026-01-06 04:22:18 +00:00
amitz-nv
ee21291825 [Model] Nemotron Parse 1.1 Support (#30864)
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-05 13:00:14 -08:00
wang.yuqi
76fd458aa7 [CI] Bump sentence-transformer from 3.2.1 to 5.2.0 (#31664)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-01-04 21:45:01 -08:00
qli88
0f35429a0c [CI]Test Group 'NixlConnector PD accuracy tests' is fixed (#31460)
Signed-off-by: qli88 <qiang.li2@amd.com>
2025-12-29 23:48:56 +00:00
Andreas Karatzas
573dd0e6f0 [ROCm] Migrate xgrammar to upstream release (#31327)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-28 00:08:29 -08:00
Andreas Karatzas
96142f2094 [ROCm][CI] Added perceptron lib in requirements for isaac multi-modal test (#31441)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-28 04:15:14 +00:00
Patrick von Platen
48e744976c [Mistral common] Ensure all functions are imported from the top & only use public methods (#31138)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-26 04:48:24 -08:00
oscardev256
b7165d53c6 Feature/isaac 0.1 (#28367)
Signed-off-by: oscardev256 <42308241+oscardev256@users.noreply.github.com>
Signed-off-by: Oscar Gonzalez <ogonzal6@alumni.jh.edu>
Signed-off-by: Yang <lymailforjob@gmail.com>
Co-authored-by: Yang <lymailforjob@gmail.com>
2025-12-25 18:49:11 -08:00
Cyrus Leung
d201807339 [Chore] Bump lm-eval version (#31264)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 05:39:13 -08:00
Li, Jiang
e3ab93c896 [CPU] Refactor CPU fused MOE (#30531)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-12-18 14:36:49 +08:00
Andrey Talman
e06d0bf0aa 2.9.1 PyTorch release update (#28495)
2025-12-17 12:20:22 -08:00
shanjiaz
009a773828 bump up compressed tensors version to 0.13.0 (#30799)
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
2025-12-16 21:01:04 -08:00
Michael Goin
811cdf5197 Update model-hosting-container-standards to 0.1.10 (#30815)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
2025-12-16 17:52:14 -08:00
Nick Hill
947dfda9c2 [LMCache] Relax lmcache version requirement (#30425)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-11 18:18:47 -08:00
Ye (Charlotte) Qi
e458270a95 [Misc] Add mcp to requirements (#30474)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
2025-12-11 20:06:09 +00:00
Nick Hill
6ccb7baeb1 [LMCache] Fix breakage due to new LMCache version (#30216)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-10 11:52:01 -08:00
Andreas Karatzas
ed7af3178a [ROCm][CI] Attempt to fix the failures under a subgroup of the e2e the test group (#29358)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
2025-12-10 05:33:13 +00:00
Johnny Yang
d1b5e7afbf [TPU] Bump tpu-inference to 0.12.0 (#30221)
Signed-off-by: Johnny Yang <johnnyyang@google.com>
2025-12-08 20:10:10 +00:00
Debolina
02a4169193 [Tests] Tool call tests for openai/gpt-oss-20b (#26237)
Signed-off-by: Debolina Roy <debroy@redhat.com>
2025-12-05 19:03:29 -08:00
Micah Williamson
06579f9a82 [AMD][CI] Add ray[default] Dependency On ROCm To Pass v1/metrics/test_engine_logger_apis.py (#30110)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-12-05 06:48:23 +00:00
Charlie Fu
7c9b2c8f81 [ROCm][CI] Add jiwer dependency for testing (#30081)
Signed-off-by: charlifu <charlifu@amd.com>
2025-12-05 03:34:51 +00:00
Noa Neria
6366c098d7 Validating Runai Model Streamer Integration with S3 Object Storage (#29320)
Signed-off-by: Noa Neria <noa@run.ai>
2025-12-04 18:04:43 +08:00
Jianwei Mao
80f8af4b2f Fix error while downloading dependencies for CPU backend (#29797)
Signed-off-by: Jianwei Mao <maojianwei2016@126.com>
2025-12-04 06:04:44 +00:00
avigny
dd5d1ef780 [Bugfix] Mistral tool parser streaming update (#19425)
Signed-off-by: avigny <47987522+avigny@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: Jeff Cook <jeff@jeffcook.io>
Co-authored-by: sfbemerk <benjaminmerkel@mail.de>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-03 17:45:31 +00:00
Andreas Karatzas
506ed87e87 [ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues (#29909)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-03 10:36:49 +08:00
Andreas Karatzas
ea3370b428 [ROCm][Bugfix] Patch for the Multi-Modal Processor Test group (#29702)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-11-29 01:31:44 +00:00
Li, Jiang
e2f56c309d [CPU] Update torch 2.9.1 for CPU backend (#29664)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-28 13:37:54 +00:00
HappyAmazonian
f8151b66fa Revert "Supress verbose logs from model_hosting_container_standards (… (#29335)
Signed-off-by: Shen Teng <sheteng@amazon.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-28 05:29:05 -08:00
Cyrus Leung
b34e8775a3 Revert "[CPU]Update CPU PyTorch to 2.9.0 (#29589)" (#29647)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 22:43:18 -08:00
scydas
35657bcd7a [CPU]Update CPU PyTorch to 2.9.0 (#29589)
Signed-off-by: scyda <scyda@outlook.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2025-11-28 09:34:33 +08:00
Andrii Skliar
a5345bf49d [BugFix] Fix plan API Mismatch when using latest FlashInfer (#29426)
Signed-off-by: Andrii Skliar <askliar@askliar-mlt.client.nvidia.com>
Co-authored-by: Andrii Skliar <askliar@askliar-mlt.client.nvidia.com>
2025-11-27 11:34:59 -08:00
Harry Mellor
e1f262337b Update Transformers pin in CI to 4.57.3 (#29418)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-27 08:42:14 -08:00
Johnny Yang
ba1fcd84a7 [TPU] add tpu_inference (#27277)
Signed-off-by: Johnny Yang <johnnyyang@google.com>
2025-11-26 14:46:36 -08:00
Ryan Rock
fe3a4f5b34 [CI/Build] Pin torchgeo dependency for AMD (#29353)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
2025-11-25 07:14:59 +00:00
Divakar Verma
22b42b5402 [CI][ROCm] Install arctic-inference on ROCm tests (#29344)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2025-11-25 02:15:39 +00:00
Kunshang Ji
b8328b49fb [XPU] upgrade torch & ipex 2.9 on XPU platform (#29307)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-11-25 09:34:47 +08:00
Nicolò Lucchesi
26a465584a [NIXL] Use config to enable telemetry + NIXL version bump (#29305)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-11-24 17:18:04 +00:00
Roger Wang
0ff70821c9 [Core] Deprecate xformers (#29262)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Micah Williamson
55c21c8836 [ROCm][CI] Fix "Cannot re-initialize CUDA in forked subprocess" in test_pynccl.py (#29119)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-11-23 13:05:00 +08:00
Ryan Rock
ed8e6843cc [CI/Build] Add terratorch for AMD (#29205)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
2025-11-21 17:31:22 -08:00