fix(agents): record auto-merge exit code via sidecar (reap correctness) #35

Merged
navigator merged 1 commit from feature/hotfix-approver-reap into main 2026-05-24 15:33:26 -03:00
Owner

PR Approver could not reap auto-merge exit codes because wait $pid does not work across loop iterations. Every completed auto-merge — including successes — was logged as unknown-exit + circuit failure. Observed on first autonomous PR #34.

Fix: dispatch_auto_merge wraps the invocation in a sub-shell that writes the exit code to a <pid>.exit sidecar; reap reads the sidecar and falls back to log-grep for legacy launches.

fix(agents): record auto-merge exit code via sidecar (reap correctness)
All checks were successful
build / scalafmt-check (push) Successful in 3s
build / sbt-compile (push) Successful in 3s
build / shell-lint (push) Successful in 11s
build / scalafmt-check (pull_request) Successful in 3s
build / sbt-compile (pull_request) Successful in 4s
build / shell-lint (pull_request) Successful in 10s
e2459f094d
PR Approver's dispatch_auto_merge backgrounded auto-merge.sh and stored
the PID. reap_in_flight on a later tick tried 'wait $pid' to recover
the exit code — but wait only works on children of the CURRENT shell,
and the auto-merge process is a grandchild of an earlier tick. The
wait returned 127 (not a child), so the case statement fell through to
*) and emitted a high-priority Telegram 'unknown exit' plus a circuit
failure for every completed auto-merge, including successful ones.
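
The failure mode described above can be reproduced in isolation. This is a minimal sketch, not code from this repository: it assumes each tick runs in its own shell process, and `sleep 0` is a hypothetical stand-in for auto-merge.sh.

```shell
#!/bin/sh
# An "earlier tick": a separate shell backgrounds a job and reports its PID.
# `sleep 0` is a hypothetical stand-in for auto-merge.sh.
pid=$(sh -c 'sleep 0 >/dev/null 2>&1 & echo $!')

# A "later tick": this shell never forked $pid, so wait cannot reap it.
# POSIX specifies that wait treats an unknown PID as having exited with 127.
wait "$pid" 2>/dev/null
rc=$?
echo "wait returned $rc"
```

Because 127 is indistinguishable from a real exit status of 127, the caller has no way to recover the job's actual result from wait alone.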

Observed during the first autonomous PR (#34): the merge actually
succeeded upstream, but the local Approver could not detect it, the
in-flight pid file lingered, and the subsequent tick saw stale state.

Fix:
- dispatch_auto_merge now wraps the invocation in a sub-shell that
  writes the exit code to <pid_file>.exit upon completion. The pid file
  records the wrapper's pid; the .exit file records auto-merge.sh's
  actual exit code.
- reap_in_flight reads .exit instead of calling wait. If .exit is
  missing (legacy launch or sidecar lost), it falls back to scanning
  the auto-merge log file for known signatures ('merged via squash',
  'CI failure', 'Timeout waiting for CI', etc.).
- Both pid and exit sidecar files are removed after reap.
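
The three points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this commit: the function signatures, the REAPED_RC variable, the liveness check via kill -0, and the temp-file-then-rename write are all hypothetical. Only the 'merged via squash' signature is mapped to a result, since the other signature-to-code mappings are not stated here.

```shell
#!/usr/bin/env bash
# Sketch only: signatures, REAPED_RC, and the kill -0 liveness check are
# assumptions made for illustration, not the commit's actual code.

dispatch_auto_merge() {
  local pid_file=$1 log_file=$2
  shift 2
  (
    "$@"                                  # run the command (auto-merge.sh in the PR)
    echo "$?" > "${pid_file}.exit.tmp"    # persist its real exit code...
    mv "${pid_file}.exit.tmp" "${pid_file}.exit"  # ...and publish it via rename
  ) >> "$log_file" 2>&1 &
  echo "$!" > "$pid_file"                 # pid of the wrapper, not of the command
}

reap_in_flight() {
  local pid_file=$1 log_file=$2 pid rc
  REAPED_RC=""                            # empty means nothing was reaped this tick
  [ -f "$pid_file" ] || return 0
  pid=$(cat "$pid_file")
  # No sidecar yet and the wrapper is still alive: try again next tick.
  if [ ! -f "${pid_file}.exit" ] && kill -0 "$pid" 2>/dev/null; then
    return 0
  fi
  if [ -f "${pid_file}.exit" ]; then
    rc=$(cat "${pid_file}.exit")          # normal path: read the sidecar
  elif grep -q 'merged via squash' "$log_file" 2>/dev/null; then
    rc=0                                  # fallback: infer success from the log
  else
    rc=unknown
  fi
  rm -f "$pid_file" "${pid_file}.exit"    # clear in-flight state after reap
  REAPED_RC=$rc
}
```

A caller would run dispatch_auto_merge on one tick and reap_in_flight on each later tick, acting on REAPED_RC only when it is non-empty. Since nothing here depends on wait, the reap works from any shell process that can read the files.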

Also tightened the exit-code-5 message text ('needs rebase (behind
base)') to match the actual Forgejo error wording for easier triage.
fluidpop-bot left a comment
Collaborator

CI green (head e2459f094dad139c328316242506db64ac65e3e5), auto-approving
Reference
Fluid/fluidpop-v1!35