# ECP-0110: `ecp-forge` Hetzner Robot recovery wrapper

Status: Draft

## Problem / context

`git.every.channel` is a single dedicated Hetzner host. When SSH and HTTPS are both unreachable, the blockchain and Forgejo validation path stalls before repo-owned deployment tools can connect. Robot can recover the host, but browser-only recovery is hard to repeat and easy to lose across agent handoffs.

## Decision

Add a repo-local Robot wrapper for `ecp-forge` recovery:

- default to server `2800441` / `95.216.114.54`,
- read Robot Webservice credentials from environment variables or the existing 1Password item at runtime,
- avoid storing Robot passwords in git or shell profiles,
- expose explicit status, rescue, reset, recover, and reachability-probe commands, and
- mask Robot-generated rescue passwords unless the operator explicitly opts into printing them.

The wrapper treats rescue activation and reset as operational recovery steps, not deployment. Once the host is reachable again, `scripts/deploy-ecp-forge.sh` remains the source of truth for the NixOS system state.

## Consequences

- Future agents can recover the Forge after a local 1Password CLI sign-in without asking for pasted Robot secrets.
- The host identity and Robot server number are documented in the repo instead of being rediscovered from the browser UI.
- Recovery actions remain explicit commands; ordinary probes never mutate Robot state.

## Alternatives considered

- Continue browser-only Robot recovery. Rejected because it is too stateful for repeated agent handoffs and does not leave a repo-owned runbook.
- Store Robot credentials in a repo-local file. Rejected because Robot credentials are operational secrets and should stay in 1Password or the caller's environment.
- Move recovery into the deploy script. Rejected because Robot rescue/reset is a host-recovery action, while `deploy-ecp-forge.sh` should remain the NixOS deployment entrypoint.

## Rollout / teardown

1. Add `scripts/hetzner-robot-forge.sh`.
2. Document the emergency path in `docs/DEPLOY_ECP_FORGE.md`.
3. Use `probe` first, then `status`, then `recover` only when the Forge is unreachable.

Teardown is removing the wrapper and returning to browser-only Robot operations.
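
The `probe`, `status`, `recover` ordering and the credential and masking rules from the Decision section could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the contents of `scripts/hetzner-robot-forge.sh`. The variable names (`ROBOT_USER`, `ROBOT_PASSWORD`, `ROBOT_OP_ITEM`, `SHOW_SECRETS`), the function names, the use of `nc` and `jq`, the `os=linux` and `type=hw` parameters, and the reading of "unreachable" as "neither port 22 nor port 443 answers" are all illustrative choices. The sketch covers only three of the five commands; standalone `rescue` and `reset` are folded into `recover` here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a Robot recovery wrapper; every name below is an assumption.
set -euo pipefail

SERVER_NUMBER="${ROBOT_SERVER_NUMBER:-2800441}"  # default server from this ECP
SERVER_IP="${ROBOT_SERVER_IP:-95.216.114.54}"    # default address from this ECP
ROBOT_API="https://robot-ws.your-server.de"      # Hetzner Robot Webservice base URL

# Credentials come from the environment first, then from 1Password at runtime.
# ROBOT_OP_ITEM is an assumed variable holding an op://vault/item reference.
load_credentials() {
  if [ -z "${ROBOT_USER:-}" ] || [ -z "${ROBOT_PASSWORD:-}" ]; then
    : "${ROBOT_OP_ITEM:?set ROBOT_USER and ROBOT_PASSWORD, or ROBOT_OP_ITEM}"
    ROBOT_USER="$(op read "${ROBOT_OP_ITEM}/username")"
    ROBOT_PASSWORD="$(op read "${ROBOT_OP_ITEM}/password")"
  fi
}

# Authenticated Robot call; extra arguments are passed through to curl.
# Note: --user exposes the password to local process listings while curl runs.
robot_call() {
  local method="$1" path="$2"
  shift 2
  load_credentials
  curl --silent --show-error --fail --request "$method" \
    --user "${ROBOT_USER}:${ROBOT_PASSWORD}" "$@" "${ROBOT_API}${path}"
}

# Print a secret only when the operator opts in with SHOW_SECRETS=1.
mask_secret() {
  if [ "${SHOW_SECRETS:-0}" = "1" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' '********'
  fi
}

# TCP reachability check; assumes an nc that supports -z and -w.
port_open() { nc -z -w 5 "$1" "$2" >/dev/null 2>&1; }

# Read-only probe: never touches Robot. Succeeds if SSH or HTTPS answers.
probe() {
  local port reachable=1
  for port in 22 443; do
    if port_open "$SERVER_IP" "$port"; then
      echo "port ${port}: reachable"
      reachable=0
    else
      echo "port ${port}: unreachable"
    fi
  done
  return "$reachable"
}

# Mutating path: activate rescue, then request a hardware reset.
# Refuses to act while the probe still sees the host.
recover() {
  if probe; then
    echo "Forge is reachable; refusing to change Robot state." >&2
    return 1
  fi
  local password
  password="$(robot_call POST "/boot/${SERVER_NUMBER}/rescue" --data "os=linux" \
    | jq -r '.rescue.password')"
  echo "rescue password: $(mask_secret "$password")"
  robot_call POST "/reset/${SERVER_NUMBER}" --data "type=hw" >/dev/null
  echo "rescue activated and hardware reset requested"
}

main() {
  case "${1:-}" in
    probe)   probe ;;
    status)  robot_call GET "/server/${SERVER_NUMBER}" ;;
    recover) recover ;;
    *) echo "usage: $0 {probe|status|recover}" >&2; return 2 ;;
  esac
}

# Dispatch only when a subcommand is given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then main "$@"; fi
```

Saved under a hypothetical name such as `robot-sketch.sh`, an operator would run `bash robot-sketch.sh probe`, then `status`, and `recover` only if the probe failed. Keeping `probe` free of any `robot_call` is what lets ordinary probes stay non-mutating, and the guard at the top of `recover` makes the mutating path refuse to run against a host that still answers.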