Files
nvfp4-megamoe-kernel/vllm
biondizzle a569612df5 feat: add load progress heartbeats to prevent k8s health check kills
The 5-minute gap after safetensors load is GPU weight upload — no
output, k8s marks the pod unhealthy. Now prints a heartbeat every
256 weight loads during the expert loading phase.

Also adds checkpoint-ready and model-ready prints around finalize:
  Checkpoint loaded. Transferring weights to GPU & preparing NVFP4...
  (JIT compile)NVFP4 MoE layers: 50%|██████████░░░░░░░░░░| 31/61
  NVFP4 model ready ✓
2026-05-16 05:51:35 +00:00
..