test_abort_metrics_reset is flaky due to hardware-dependent fixed sleeps: replace fixed sleeps with polling. test_metrics_exist_run_batch passes even when the engine crashes on startup (false positive): add subprocess lifecycle guards. Signed-off-by: Andreas Karatzas <akaratza@amd.com>