WORKING BUILD #43 - GH200 vLLM container builds successfully

Versions locked:
- vLLM: v0.18.2rc0
- flashinfer: v0.6.7
- flash-attention: hopper branch
- lmcache: dev branch
- infinistore: main
- triton: 3.6.0 (PyPI wheel)
- Base: nvcr.io/nvidia/pytorch:26.03-py3 (PyTorch 2.11.0a0, CUDA 13.2.0)
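
The PyPI-pinned versions above can be sanity-checked inside the finished container. Below is a minimal sketch of such a check, assuming standard distribution names (`vllm`, `flashinfer`, `triton`); it is not part of Build #43 itself, and components built from branches (flash-attention, lmcache, infinistore) are skipped because their installed version strings need not match a pin.

```python
# Hypothetical pin checker for the Build #43 container (not part of the
# build itself). Distribution names below are assumptions.
from importlib import metadata

PINS = {
    "vllm": "0.18.2rc0",
    "flashinfer": "0.6.7",
    "triton": "3.6.0",
}

def check_pins(pins, get_version=metadata.version):
    """Return {name: (expected, found_or_None)} for every mismatch.

    `get_version` is injectable so the logic can be tested without the
    packages installed; by default it reads installed dist metadata.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package missing entirely
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches

if __name__ == "__main__":
    bad = check_pins(PINS)
    for name, (want, got) in bad.items():
        print(f"PIN MISMATCH: {name}: expected {want}, found {got}")
    raise SystemExit(1 if bad else 0)
```

Running this as a smoke test after the image builds would catch a silent dependency bump before anyone burns a build slot on it.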

DO NOT MODIFY WITHOUT MIKE'S APPROVAL
Date:   2026-04-03 11:08:29 +00:00
Parent: 2442906d95
Commit: 659c79638c


@@ -1,3 +1,24 @@
# ==============================================================================
# ⚠️⚠️⚠️ WORKING BUILD - DO NOT TOUCH ⚠️⚠️⚠️
# ==============================================================================
# Build #43 succeeded on 2026-04-03 with these exact versions:
# - vLLM: v0.18.2rc0
# - flashinfer: v0.6.7
# - flash-attention: hopper branch
# - lmcache: dev branch
# - infinistore: main
# - triton: 3.6.0 (PyPI wheel)
# - Base: nvcr.io/nvidia/pytorch:26.03-py3 (PyTorch 2.11.0a0, CUDA 13.2.0)
#
# HARD RULES:
# 1. NO DOWNGRADES - CUDA 13+, PyTorch 2.9+, vLLM 0.18.1+
# 2. NO SKIPPING COMPILATION - Build from source
# 3. CLEAR ALL CHANGES WITH MIKE BEFORE MAKING THEM
# 4. ONE BUILD AT A TIME - Mike reports failure → I assess → I report
#
# If you need to modify this file, ask Mike first.
# ==============================================================================
# ---------- Builder Base ----------
# Using NVIDIA NGC PyTorch container (26.03) with:
# - PyTorch 2.11.0a0 (bleeding edge)