# ECP-0065: NixOS Runner Images + Netboot Artifacts

Status: Implemented

## Decision

Publish a first-party, reproducible NixOS "runner" system definition from this repo, and expose build outputs suitable for:

- local-disk installs (pave/reinstall),
- netboot (iPXE/PXE) bootstrap, and
- byte-identical runner OS images produced in CI.

The runner system:

- is defined in-repo as a `nixosConfiguration` in `flake.nix`,
- exports the repo source tree inside the OS at a stable path (read-only) so the node can self-build and verify from the same flake,
- uses `ec-node` as the primary long-running publisher binary, with orchestration via NixOS + systemd, and
- defaults to a read-only root filesystem with a tmpfs-backed overlayfs upperdir (appliance semantics), while image/bootstrap variants (netboot/ISO/sdimage) may disable this where it conflicts with their initrd/root setup.

The initial implementation targets `aarch64-linux` first, with local builds via OrbStack. `x86_64-linux` is defined in the flake but may not be built until an x86 builder is available.
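
The runner properties listed above can be pictured as a small NixOS module. This is a minimal sketch under stated assumptions, not the module shipped in the repo: the option namespace `ec.runner`, the in-OS path `/etc/ec/src`, and the `ecNode` package argument are hypothetical names chosen for illustration, and the overlay mount setup is left out because this ECP does not specify it.

```nix
# Hypothetical runner module sketch. `self` (the flake) and `ecNode` (a package
# assumed to provide bin/ec-node) are expected to be passed in by the caller,
# for example through `specialArgs`.
{ config, lib, self, ecNode, ... }:
let
  cfg = config.ec.runner;
in
{
  # Assumed toggle so image/bootstrap variants can opt out of appliance mode.
  options.ec.runner.readOnlyRoot = lib.mkOption {
    type = lib.types.bool;
    default = true;
    description = "Read-only root with a tmpfs-backed overlayfs upperdir.";
  };

  config = lib.mkMerge [
    {
      # Expose the flake source at a stable path. The source lives in the
      # read-only Nix store; /etc/ec/src is an assumed location.
      environment.etc."ec/src".source = self.outPath;

      # ec-node as the long-running publisher, supervised by systemd.
      systemd.services.ec-node = {
        description = "ec-node publisher";
        wantedBy = [ "multi-user.target" ];
        wants = [ "network-online.target" ];
        after = [ "network-online.target" ];
        serviceConfig = {
          ExecStart = "${ecNode}/bin/ec-node";
          Restart = "always";
        };
      };
    }
    (lib.mkIf cfg.readOnlyRoot {
      # The overlay mount definitions would go here (not specified in this ECP).
    })
  ];
}
```

With a toggle like this, a netboot or ISO variant could opt out by setting `ec.runner.readOnlyRoot = false;` in its own module list.
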

## Motivation

- "Bootstrap path == update path": the same flake definition and CI-built artifacts should be usable to (re)install and to update.
- Fleet operability: remove per-node hand configuration; treat nodes as cattle.
- Verifiability: runners can rebuild and compare their OS closure against the CI artifacts using the embedded flake source.

## Scope

In scope:

- `nixosConfigurations.ec-runner-{aarch64,x86_64}` in `flake.nix`.
- `nixosConfigurations.ec-runner-*-netboot` and `nixosConfigurations.ec-runner-*-iso` for image artifacts.
- Minimal runner NixOS module for baseline host settings and stable in-OS flake source path.
- Docs/scripts for building netboot outputs locally in OrbStack.
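
The in-scope outputs above might be wired into `flake.nix` roughly as follows. This is a sketch, not the repo's actual flake: the `mkRunner` helper, the module path `./nix/runner.nix`, and the pinned channel are assumptions for illustration. The netboot and ISO profiles are the stock ones shipped in nixpkgs, and the `system.build` attributes named in the trailing comments are the ones those profiles define.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable"; # assumed channel

  outputs = { self, nixpkgs, ... }:
    let
      # Hypothetical helper: one runner system per architecture and variant.
      # The runner module is assumed to supply baseline host settings.
      mkRunner = system: extraModules:
        nixpkgs.lib.nixosSystem {
          inherit system;
          specialArgs = { inherit self; };
          modules = [ ./nix/runner.nix ] ++ extraModules;
        };

      # Stock nixpkgs image profiles, resolved relative to the module tree.
      netboot = { modulesPath, ... }: {
        imports = [ "${modulesPath}/installer/netboot/netboot-minimal.nix" ];
      };
      iso = { modulesPath, ... }: {
        imports = [ "${modulesPath}/installer/cd-dvd/installation-cd-minimal.nix" ];
      };
    in
    {
      nixosConfigurations = {
        ec-runner-aarch64 = mkRunner "aarch64-linux" [ ];
        ec-runner-x86_64 = mkRunner "x86_64-linux" [ ];
        ec-runner-aarch64-netboot = mkRunner "aarch64-linux" [ netboot ];
        ec-runner-aarch64-iso = mkRunner "aarch64-linux" [ iso ];
      };
      # Build outputs then sit under each configuration, for example:
      #   ec-runner-aarch64-netboot.config.system.build.kernel
      #   ec-runner-aarch64-netboot.config.system.build.netbootRamdisk
      #   ec-runner-aarch64-netboot.config.system.build.netbootIpxeScript
      #   ec-runner-aarch64-iso.config.system.build.isoImage
    };
}
```
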

Out of scope (deferred):

- CI publishing pipeline (binary cache, attestation, release upload).
- Remote runtime provisioning (fetching per-node channel lists).
- Hardware-accelerated transcode changes (keep current CPU x264 baseline).

## Alternatives considered

- Keep runner images out-of-repo and publish ad hoc artifacts. Rejected because it weakens reproducibility and provenance.
- Support a single install path only (disk install). Rejected because netboot/bootstrap is required for fleet recovery.

## Rollout / Reversibility

- Rollout begins with local builds and a single test machine.
- Reversible by removing the `nixosConfigurations` and runner module; existing nodes can continue to run via manually started `tmux` sessions or ad hoc installs.