v0.19.0 pins a compatible flash-attn commit (2921022). The sed that forced GIT_TAG to main pulled in newer code that references FLASHATTENTION_FP8_TWO_LEVEL_INTERVAL which isn't defined in v0.19.0's build config. Use the pinned commit instead.
VLLM images for GH200
Hosted here
docker login
# Alternative
# docker buildx build --platform linux/arm64 --memory=600g -t rajesh550/gh200-vllm:0.9.0.1 .
docker build --memory=450g --platform linux/arm64 -t rajesh550/gh200-vllm:0.11.1rc2 . 2>&1 | tee build.log
docker push rajesh550/gh200-vllm:0.11.1rc2