224 Commits

Author SHA1 Message Date
Douglas Lehr
8a798be929 [ROCm] Enable MXFP4 MoE weight pre-shuffling on gfx950 and update aiter (#34192)
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
Co-authored-by: tjtanaavllm <tunjian.tan@amd.com>
2026-02-12 05:06:33 -08:00
Kunshang Ji
cb9574eb85 [XPU][9/N] clean up existing ipex code/doc (#34111)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-02-11 00:27:15 -08:00
Hongxia Yang
4d39650961 [ROCm] update triton branch to support gpt-oss models for gfx11xx devices (#34032)
Signed-off-by: Hongxia Yang <hongxia.yang@amd.com>
2026-02-09 19:36:30 +00:00
zifeitong
52181baaea Update DeepGEMM version pin in Dockerfile to match #32479 (#33935)
Signed-off-by: Zifei Tong <zifeitong@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2026-02-07 05:30:22 -08:00
Dimitrios Bariamis
207c3a0c20 Fix RoutingMethodType logic (#33919)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2026-02-06 14:03:34 -08:00
sihao_li
6550815c3a [XPU]Replace pip in docker.xpu with uv pip (#31112)
Signed-off-by: sihao.li <sihao.li@intel.com>
2026-02-06 14:02:33 +08:00
kourosh hakhamaneshi
2f6d17cb2f [rocm][ray] Fix: Unify Ray device visibility handling across CUDA and ROCm (#33308)
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2026-02-04 10:09:14 -08:00
Kunshang Ji
e10604480b [XPU][1/N] Deprecate ipex and switch to vllm-xpu-kernels for xpu platform (#33379)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2026-02-02 22:46:10 -08:00
杨朱 · Kiki
a0a984ac2e [CI/Build] Remove hardcoded America/Los_Angeles timezone from Dockerfiles (#33553)
Signed-off-by: carlory <baofa.fan@daocloud.io>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 22:32:39 -08:00
R3hankhan
ab374786c7 [CPU][IBM Z][Dockerfile] Fix IBM Z builds (#33243)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2026-02-01 23:41:29 -08:00
Dimitrios Bariamis
f0bca83ee4 Add support for Mistral Large 3 inference with Flashinfer MoE (#33174)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-01-30 22:48:27 -08:00
Pengchao Wang
2515bbd027 [CI/Build][BugFix] fix cuda/compat loading order issue in docker build (#33116)
Signed-off-by: Pengchao Wang <wpc@fb.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
2026-01-29 00:19:05 -08:00
TJian
f9d03599ef [Release] [CI] Optim release pipeline (#33156)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-28 22:45:42 -08:00
Xinan Miao
604e3b87e8 [Feature]: Container image WORKDIR consistency (#33159)
Signed-off-by: SouthWest7 <am1ao@qq.com>
Co-authored-by: SouthWest7 <am1ao@qq.com>
2026-01-28 11:06:48 +08:00
Maryam Tahhan
203d0bc0c2 [CPU] Improve CPU Docker build (#30953)
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-01-24 17:08:24 +00:00
Orion Reblitz-Richardson
68b0a6c1ba [CI][torch nightlies] Use main Dockerfile with flags for nightly torch tests (#30443)
Signed-off-by: Orion Reblitz-Richardson <orionr@meta.com>
Signed-off-by: Orion Reblitz-Richardson <orionr@gmail.com>
Co-authored-by: Kevin H. Luu <khluu000@gmail.com>
2026-01-23 10:22:56 -08:00
Fadi Arafeh
10e94c84f6 [CPU][Feat] Update PyTorch to v2.10 for CPU Backend (#32869)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-01-23 21:13:06 +08:00
elvischenv
808d6fd7b9 Bump Flashinfer to v0.6.1 (#30993)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2026-01-21 08:49:50 -08:00
qli88
a0490be8f1 [CI][amd] Revert NIXL connector change to avoid crash (#32570)
Signed-off-by: Qiang Li <qiang.li2@amd.com>
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
2026-01-19 18:39:16 +00:00
Mritunjay Kumar Sharma
9e078d0582 [CI/Build][Docker] Add centralized version manifest for Docker builds (#31492)
Signed-off-by: Mritunjay Sharma <mritunjay.sharma@chainguard.dev>
2026-01-17 13:45:30 +00:00
TJian
41c544f78a [ROCm] [CI] [Release] Rocm wheel pipeline with sccache (#32264)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-16 02:56:18 +08:00
smit kadvani
74e4bb1c5a fixing podman build issue (#32131)
Signed-off-by: Smit Kadvani <smit.kadvani@gmail.com>
Co-authored-by: Smit Shaileshbhai Kadvani <kadvani@meta.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
2026-01-15 11:07:08 -06:00
Douglas Lehr
c5891b5430 [ROCM] Add ROCm image build to release pipeline (#31995)
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
2026-01-15 19:01:40 +08:00
qli88
3a612322eb [CI] Move rixl/ucx from Dockerfile.rocm_base to Dockerfile.rocm (#32295)
Signed-off-by: Qiang Li <qiang.li2@amd.com>
2026-01-14 16:53:36 +00:00
emricksini-h
2a60ac91d0 [Improvement] Persist CUDA compat libraries paths to prevent reset on apt-get (#30784)
Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai>
2026-01-13 14:35:05 -08:00
TJian
0346396e94 [ROCm] [Bugfix] Fix order of mori build in Dockerfile.rocm_base (#32179)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2026-01-12 15:33:21 +00:00
inkcherry
4505849b30 [ROCm][PD] add moriio kv connector. (#29304)
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
2026-01-09 14:01:57 +00:00
Nishidha Panpaliya
a563866b48 Fix ijson build for Power. (#31702)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
2026-01-08 17:12:33 +00:00
Shang Wang
33156f56e0 [docker] A follow-up patch to fix #30913: [docker] install cuda13 version of lmcache and nixl (#31775)
Signed-off-by: Shang Wang <shangw@nvidia.com>
2026-01-07 23:47:02 -08:00
Seiji Eicher
3c98c2d21b [CI/Build] Allow user to configure NVSHMEM version via ENV or command line (#30732)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2026-01-05 15:56:08 -08:00
Qidong Su
af1b07b0c5 [docker] install cuda13 version of lmcache and nixl (#30913)
Signed-off-by: Qidong Su <soodoshll@gmail.com>
2026-01-05 12:50:39 -08:00
Amr Mahdi
e1ee11b2a5 Add docker buildx bake configuration (#31477)
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
2025-12-31 01:08:54 +00:00
Roger Feng
3d973764ce [xpu] [bugfix] upgrade to latest oneccl in dockerfile (#31522)
Signed-off-by: roger feng <roger.feng@intel.com>
2025-12-30 14:52:28 +08:00
qli88
0f35429a0c [CI]Test Group 'NixlConnector PD accuracy tests' is fixed (#31460)
Signed-off-by: qli88 <qiang.li2@amd.com>
2025-12-29 23:48:56 +00:00
Andreas Karatzas
f70368867e [ROCm][CI] Add TorchCodec source build for transcription tests (#31323)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-28 16:06:05 +08:00
Nick Cao
d7e05ac743 [docker] Fix downloading sccache on aarch64 platform (#30070)
Signed-off-by: Nick Cao <nickcao@nichi.co>
2025-12-23 21:36:33 -08:00
Yan Ma
f1c2c20136 [XPU] decrease IGC_ForceOCLSIMDWidth for speculative decoding triton-xpu kernel compilation (#30538)
Signed-off-by: Yan Ma <yan.ma@intel.com>
2025-12-23 05:22:15 +00:00
Gregory Shtrasberg
ab3a85fd68 [ROCm][CI/Build] Fix triton version to one that has triton_kernels required for gpt-oss to run (#31159)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-12-22 17:19:27 +00:00
Gregory Shtrasberg
0be149524c [ROCm][CI/Build] Update ROCm dockerfiles (#30991)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-12-20 03:19:12 +00:00
Li, Jiang
e3ab93c896 [CPU] Refactor CPU fused MOE (#30531)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-12-18 14:36:49 +08:00
Li, Jiang
cfb7e55515 [Doc][CPU] Update CPU doc (#30765)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Signed-off-by: Li, Jiang <bigpyj64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-18 04:59:09 +00:00
Amr Mahdi
c0a88df7f7 [docker] Allow kv_connectors install to fail on arm64 (#30806)
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
2025-12-16 16:41:57 -08:00
Amr Mahdi
ff21a0fc85 [docker] Restructure Dockerfile for more efficient and cache-friendly builds (#30626)
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
2025-12-15 18:52:19 -08:00
Kunshang Ji
e3a1cd1c59 [XPU] fix Dockerfile.xpu, avoid wheel conflicts (#30662)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-12-15 13:32:06 +08:00
Noa Neria
6366c098d7 Validating Runai Model Streamer Integration with S3 Object Storage (#29320)
Signed-off-by: Noa Neria <noa@run.ai>
2025-12-04 18:04:43 +08:00
Shengqi Chen
1109f98288 [CI] fix docker image build by specifying merge-base commit id when downloading pre-compiled wheels (#29930)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
2025-12-03 14:08:19 -08:00
Amr Mahdi
f5d3d93c40 [docker] Build CUDA kernels in separate Docker stage for faster rebuilds (#29452)
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
2025-12-03 11:41:53 +00:00
Andreas Karatzas
506ed87e87 [ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues (#29909)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-03 10:36:49 +08:00
Benjamin Bartels
2d613de9ae [CI/Build] Fixes missing runtime dependencies (#29822)
Signed-off-by: bbartels <benjamin@bartels.dev>
2025-12-02 10:21:49 -08:00
Andreas Karatzas
ea3370b428 [ROCm][Bugfix] Patch for the Multi-Modal Processor Test group (#29702)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-11-29 01:31:44 +00:00