Instead of fragile inline Dockerfile patching, just copy a modified utils.py (with _post_quant_fix call) into the image, same pattern as deepseek_v4.py and deepseek_v4_attention.py patches.
Docker's parser chokes on multi-line Python in RUN. Moved to scripts/patch_utils.py and COPY + RUN it.