fix: prevent nixos-rebuild from killing in-flight update; improve stale status recovery

Part A (modules/core/sovran-hub.nix):
- Add restartIfChanged=false and stopIfChanged=false to sovran-hub-update service
- Add restartIfChanged=false and stopIfChanged=false to sovran-hub-rebuild service
These prevent nixos-rebuild switch from terminating an in-flight update mid-execution.

Part B (app/sovran_systemsos_web/server.py):
- Replace _recover_stale_status() with improved version
- Use MainPID + os.kill() to guard against is-active transiently misreporting the unit state during daemon-reload
- Use ExecMainStatus (actual exit code) instead of Result (may be stale from prior run)
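
The Part B recovery logic can be sketched roughly as below. This is an illustrative sketch, not the actual _recover_stale_status() from server.py: the helper names (parse_show_output, pid_alive, classify) and the status strings are hypothetical, and the input is assumed to be KEY=VALUE text as printed by `systemctl show -p MainPID -p ExecMainStatus <unit>`.

```python
import os


def parse_show_output(text: str) -> dict:
    """Parse KEY=VALUE lines as printed by `systemctl show -p ...`."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (signal 0 sends nothing)."""
    if pid <= 0:
        return False  # systemd reports MainPID=0 when no main process exists
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def classify(props: dict) -> str:
    """Hypothetical status labels; the real server.py may use different ones."""
    try:
        main_pid = int(props.get("MainPID") or 0)
    except ValueError:
        main_pid = 0
    if pid_alive(main_pid):
        # A live main process outranks whatever is-active claims mid-reload.
        return "RUNNING"
    # ExecMainStatus is the exit code of the most recent main process.
    # Assumption: a missing or non-zero value is treated as a failure.
    return "SUCCESS" if props.get("ExecMainStatus") == "0" else "FAILED"
```

For example, classify(parse_show_output("MainPID=0\nExecMainStatus=0")) yields "SUCCESS", while the same call with a MainPID that is still alive yields "RUNNING" regardless of the recorded exit code.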

Agent-Logs-Url: https://github.com/naturallaw777/staging_alpha/sessions/63bf2cd5-9c02-4542-8926-44aa9ed63bf0

Co-authored-by: naturallaw777 <99053422+naturallaw777@users.noreply.github.com>
copilot-swe-agent[bot]
2026-04-12 13:47:49 +00:00
committed by GitHub
parent 8310028546
commit 008a003fa1
2 changed files with 64 additions and 35 deletions

@@ -350,6 +350,8 @@ in
systemd.services.sovran-hub-update = {
description = "Sovran_SystemsOS System Update";
restartIfChanged = false; # Don't let nixos-rebuild kill an in-flight update
stopIfChanged = false; # Don't stop it during activation either
serviceConfig = {
Type = "oneshot";
ExecStart = "${update-script}";
@@ -358,6 +360,8 @@ in
systemd.services.sovran-hub-rebuild = {
description = "Sovran_SystemsOS System Rebuild";
restartIfChanged = false; # Don't let nixos-rebuild kill an in-flight rebuild
stopIfChanged = false; # Don't stop it during activation either
serviceConfig = {
Type = "oneshot";
ExecStart = "${rebuild-script}";