The 5-minute gap after safetensors load is GPU weight upload — no output, k8s marks the pod unhealthy. Now prints a heartbeat every 256 weight loads during the expert loading phase. Also adds checkpoint-ready and model-ready prints around finalize: Checkpoint loaded. Transferring weights to GPU & preparing NVFP4... (JIT compile)NVFP4 MoE layers: 50%|██████████░░░░░░░░░░| 31/61 NVFP4 model ready ✓