# Sovereign Deploy: `ecp-forge`

This repository owns deployment of `git.every.channel` (Hetzner 300TB host).

## Requirements

- SSH access to `root@git.every.channel`.
- A local key that matches the host's `authorized_keys` (default: `~/.ssh/id_ed25519`).
- `nix` with flakes enabled.
- For emergency Hetzner recovery, Robot Webservice credentials in the 1Password item `Hetzner Robot`
  or in `EVERY_CHANNEL_ROBOT_USER` / `EVERY_CHANNEL_ROBOT_PASSWORD`.

## Deploy

```sh
./scripts/deploy-ecp-forge.sh
```

For the OP Stack operator path and observation-rail validation, see:

```sh
cat docs/OP_STACK_ECP_FORGE.md
```

Equivalent to running the deploy script with its defaults:

```sh
NIX_SSHOPTS="-o BatchMode=yes -o IdentityAgent=none -o IdentitiesOnly=yes -i ~/.ssh/id_ed25519" \
  nix run nixpkgs#nixos-rebuild -- \
    --flake .#ecp-forge \
    --target-host root@git.every.channel \
    --build-host root@git.every.channel \
    --use-remote-sudo \
    switch
```

## Overrides

- `EVERY_CHANNEL_FORGE_TARGET_HOST` (default `root@git.every.channel`)
- `EVERY_CHANNEL_FORGE_BUILD_HOST` (default same as target)
- `EVERY_CHANNEL_FORGE_SSH_IDENTITY` (default `~/.ssh/id_ed25519`)

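
The defaults listed above can be pictured as ordinary shell parameter expansion. This is a minimal sketch of how such resolution might look, not a copy of `scripts/deploy-ecp-forge.sh`, whose actual logic may differ. The function name `resolve_forge_overrides` and the variables `TARGET_HOST`, `BUILD_HOST`, and `SSH_IDENTITY` are hypothetical.

```sh
# Sketch only: illustrates the documented defaults, not the real script.
resolve_forge_overrides() {
  # Fall back to the documented default when the override is unset or empty.
  TARGET_HOST="${EVERY_CHANNEL_FORGE_TARGET_HOST:-root@git.every.channel}"
  # The build host defaults to whatever the target host resolved to.
  BUILD_HOST="${EVERY_CHANNEL_FORGE_BUILD_HOST:-$TARGET_HOST}"
  SSH_IDENTITY="${EVERY_CHANNEL_FORGE_SSH_IDENTITY:-$HOME/.ssh/id_ed25519}"
}

resolve_forge_overrides
printf 'target=%s build=%s identity=%s\n' "$TARGET_HOST" "$BUILD_HOST" "$SSH_IDENTITY"
```

Under that assumption, a one-off deploy with a different key (the key path here is only an example) would be `EVERY_CHANNEL_FORGE_SSH_IDENTITY=~/.ssh/deploy_key ./scripts/deploy-ecp-forge.sh`.
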
## Emergency Robot recovery

Use this only when both Forge HTTPS and SSH are unreachable. The dedicated host is server
`2800441` at `95.216.114.54`.

```sh
./scripts/hetzner-robot-forge.sh probe
```

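
The "both unreachable" precondition can also be checked by hand. The following is a minimal sketch and not the wrapper's own probe logic: `port_open` and `forge_unreachable` are hypothetical helper names, ports 22 and 443 are assumed to be the standard SSH and HTTPS ports, and it relies on `bash` and coreutils `timeout` being installed locally.

```sh
# Sketch only: a manual reachability check, independent of hetzner-robot-forge.sh.
FORGE_IP="${FORGE_IP:-95.216.114.54}"  # address stated in this document

# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
port_open() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Return 0 only when both SSH (22) and HTTPS (443) are unreachable.
forge_unreachable() {
  ! port_open "$FORGE_IP" 22 && ! port_open "$FORGE_IP" 443
}
```

A caller could gate recovery on it, for example `forge_unreachable && ./scripts/hetzner-robot-forge.sh probe`, so that Robot is only consulted when neither port answers.
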
If the probe confirms an outage, sign in to the 1Password CLI so the wrapper can read the existing Robot
Webservice item at runtime:

```sh
op signin
./scripts/hetzner-robot-forge.sh status
```

To boot the host into Hetzner Rescue and issue a hardware reset:

```sh
./scripts/hetzner-robot-forge.sh recover
./scripts/hetzner-robot-forge.sh wait-ssh
```

The wrapper masks Robot-generated rescue passwords by default and tries to attach the local SSH key
fingerprint when activating rescue. Set `EVERY_CHANNEL_ROBOT_AUTHORIZED_KEY_FINGERPRINT` if Robot
uses a different uploaded key fingerprint. Set `EVERY_CHANNEL_ROBOT_PRINT_SENSITIVE=1` only when
password-based rescue login is required.

If production boots but public SSH and HTTPS still time out, inspect the previous boot from Rescue.
The known thing to check is host-wide VPN state: `mullvad-daemon.service` must not be active on
`ecp-forge`, because its firewall policy can block public Forge ingress even when Robot and the
NixOS firewall allow the ports. If a not-yet-redeployed generation still starts Mullvad and the
mutable cached target state is rewritten to `secured`, back up `/boot/grub/grub.cfg`, append
`systemd.mask=mullvad-daemon.service systemd.mask=mullvad-early-boot-blocking.service` to the
default Linux line, and reboot production. After public SSH returns, deploy this repo's NixOS config
so the bootloader is regenerated without the emergency mask.

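
The backup-and-append step can be sketched as a small helper to run from Rescue. This is an illustration under stated assumptions, not a tested recovery tool: the function name `mask_mullvad_in_grub` is hypothetical, the path to `grub.cfg` depends on where the production `/boot` is mounted in Rescue, and the sketch assumes the default entry is the first `linux` line in the file. Check the file by eye before rebooting.

```sh
# Sketch only: back up a grub.cfg and append the emergency mask arguments
# to its first `linux` line (assumed here to be the default entry).
MASK_ARGS='systemd.mask=mullvad-daemon.service systemd.mask=mullvad-early-boot-blocking.service'

mask_mullvad_in_grub() {
  cfg="$1"
  # Keep an untouched copy next to the original before editing.
  cp -p "$cfg" "$cfg.bak" || return 1
  # Rewrite from the backup, changing only the first matching line.
  awk -v args="$MASK_ARGS" '
    !seen && /^[ \t]*linux[ \t]/ { print $0 " " args; seen = 1; next }
    { print }
  ' "$cfg.bak" > "$cfg"
}
```

With the production boot partition mounted at a hypothetical `/mnt/boot`, the call would be `mask_mullvad_in_grub /mnt/boot/grub/grub.cfg`. Restoring `grub.cfg.bak` undoes the change.
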
## Verify

```sh
ssh -o BatchMode=yes -o IdentityAgent=none -i ~/.ssh/id_ed25519 root@git.every.channel \
  'hostnamectl --static; systemctl is-active forgejo caddy every-channel-netboot-stage every-channel-netboot'
```
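
When several units are passed, `systemctl is-active` exits with 0 if at least one of them is active, so the exit status alone does not prove that all four services are up. A minimal sketch of a stricter check follows. The helper name `forge_services_healthy` is hypothetical, and it assumes the output shape of the command above: one hostname line, then one state line per unit.

```sh
# Sketch only: read the verify output on stdin and succeed only if every
# line after the hostname line is exactly "active".
forge_services_healthy() {
  awk 'NR > 1 && $0 != "active" { bad = 1 } END { exit (bad || NR < 2) }'
}

# Usage (pipe the verify command into the helper):
#   ssh -o BatchMode=yes -o IdentityAgent=none -i ~/.ssh/id_ed25519 root@git.every.channel \
#     'hostnamectl --static; systemctl is-active forgejo caddy every-channel-netboot-stage every-channel-netboot' \
#     | forge_services_healthy && echo "all services active"
```
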