diff --git a/vllm/Dockerfile b/vllm/Dockerfile
index cef70f0..1cd43ec 100644
--- a/vllm/Dockerfile
+++ b/vllm/Dockerfile
@@ -1,3 +1,24 @@
+# ==============================================================================
+# ⚠️⚠️⚠️ WORKING BUILD - DO NOT TOUCH ⚠️⚠️⚠️
+# ==============================================================================
+# Build #43 succeeded on 2026-04-03 with these exact versions:
+# - vLLM: v0.18.2rc0
+# - flashinfer: v0.6.7
+# - flash-attention: hopper branch
+# - lmcache: dev branch
+# - infinistore: main
+# - triton: 3.6.0 (PyPI wheel)
+# - Base: nvcr.io/nvidia/pytorch:26.03-py3 (PyTorch 2.11.0a0, CUDA 13.2.0)
+#
+# HARD RULES:
+# 1. NO DOWNGRADES - CUDA 13+, PyTorch 2.9+, vLLM 0.18.1+
+# 2. NO SKIPPING COMPILATION - Build from source
+# 3. CLEAR ALL CHANGES WITH MIKE BEFORE MAKING THEM
+# 4. ONE BUILD AT A TIME - Mike reports failure → I assess → I report
+#
+# If you need to modify this file, ask Mike first.
+# ==============================================================================
+
 # ---------- Builder Base ----------
 # Using NVIDIA NGC PyTorch container (26.03) with:
 # - PyTorch 2.11.0a0 (bleeding edge)