ECP-0083: Declarative Netboot Service Module

Status: Implemented

Context

ECP-0082 added script-driven netboot staging and serving for UniFi/ProxyDHCP fleets. That path works, but it still depends on a live operator session (tmux, manually set environment variables, a manual restart order), which is fragile for sustained fleet bring-up.

The constitution favors explicit, reviewable infrastructure definitions. Netboot delivery should be operated as a normal NixOS service with stable options, systemd lifecycle, and auditable host config.

Decision

  1. Add a reusable NixOS module at nix/modules/ec-netboot.nix exported as nixosModules.ec-netboot.
  2. Define a first-class services.every-channel.netboot option tree for:
    • UniFi-only mode (default, no ProxyDHCP),
    • optional ProxyDHCP mode,
    • release source pinning (host/repo/tag/local tarball/token file),
    • iPXE strategy (embedded build, local file, or explicit remote download),
    • security controls (chain token file, HTTP CIDR allowlist).
  3. Run persistent systemd units:
    • every-channel-netboot-ipxe (oneshot, optional embedded EFI build),
    • every-channel-netboot-stage (oneshot artifact staging),
    • every-channel-netboot (long-running HTTP+TFTP service).
  4. Add tmpfiles and firewall wiring in-module so host configs remain concise and reversible.
  5. Keep existing scripts as execution primitives to avoid duplicate logic and preserve local/manual fallback operations.
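
The unit layout in item 3 can be sketched as a small NixOS module. This is an illustrative sketch only, not the contents of nix/modules/ec-netboot.nix: the `proxyDhcp.enable` option name, the unit descriptions, the `requires`/`after` ordering, and the placeholder `ExecStart` commands are all assumptions. Only `services.every-channel.netboot.enable` and the unit names come from this proposal. The optional every-channel-netboot-ipxe unit is omitted for brevity.

```nix
# Hypothetical sketch of the option tree and unit ordering.
{ config, lib, pkgs, ... }:
let
  cfg = config.services.every-channel.netboot;
in
{
  options.services.every-channel.netboot = {
    enable = lib.mkEnableOption "every.channel netboot service";

    # Assumed option name: UniFi-only is the default, so ProxyDHCP is opt-in.
    proxyDhcp.enable = lib.mkOption {
      type = lib.types.bool;
      default = false;
      description = "Enable ProxyDHCP mode.";
    };
  };

  config = lib.mkIf cfg.enable {
    # Oneshot staging unit; RemainAfterExit keeps it "active" once staging is done.
    systemd.services.every-channel-netboot-stage = {
      description = "Stage netboot artifacts";
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        Type = "oneshot";
        RemainAfterExit = true;
        # Placeholder command; the real module wraps the existing scripts.
        ExecStart = "${pkgs.coreutils}/bin/true";
      };
    };

    # Long-running server; it starts only after staging has succeeded.
    systemd.services.every-channel-netboot = {
      description = "Serve netboot artifacts over HTTP and TFTP";
      wantedBy = [ "multi-user.target" ];
      requires = [ "every-channel-netboot-stage.service" ];
      after = [ "every-channel-netboot-stage.service" ];
      serviceConfig = {
        Restart = "on-failure";
        # Placeholder command standing in for the HTTP+TFTP server.
        ExecStart = "${pkgs.coreutils}/bin/sleep infinity";
      };
    };
  };
}
```

Expressing the order with `requires` and `after` is one way to replace the manual restart order described in the Context section with a dependency that systemd enforces.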

Alternatives considered

  • Keep scripts only. Rejected because startup order, secret injection, and restart behavior remain ad hoc.
  • Implement host-specific module logic only in key.store. Rejected because this behavior is core to every.channel netboot operations and should be reusable across hosts.
  • Replace scripts with a brand-new daemon immediately. Rejected to keep the rollout incremental and avoid unnecessary regressions.

Rollout / teardown plan

  • Rollout:
    • import every-channel.nixosModules.ec-netboot on the boot host,
    • set services.every-channel.netboot.* options,
    • activate the configuration, then verify every-channel-netboot-stage followed by every-channel-netboot.
  • Teardown:
    • set services.every-channel.netboot.enable to false,
    • remove the host option stanza,
    • fall back to manual script operation from docs/NUC_UNIFI_NETBOOT.md if needed.
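
As a rough illustration of the rollout steps, a boot host's flake might import the module as follows. This is a minimal sketch under stated assumptions: the input URL is a placeholder (the real location of the every-channel flake is not given here), and the host name `boot-host`, the `x86_64-linux` system, and the nixpkgs branch are hypothetical. Only `nixosModules.ec-netboot` and `services.every-channel.netboot.enable` are taken from this proposal.

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    # Placeholder URL: substitute the actual every-channel flake reference.
    every-channel.url = "git+https://example.invalid/every-channel";
  };

  outputs = { self, nixpkgs, every-channel, ... }: {
    # "boot-host" is a hypothetical host name.
    nixosConfigurations.boot-host = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        # Rollout step 1: import the reusable module.
        every-channel.nixosModules.ec-netboot
        {
          # Rollout step 2: set options. Teardown flips this to false
          # before the stanza is removed.
          services.every-channel.netboot.enable = true;
        }
      ];
    };
  };
}
```

After activation, the units can be inspected with `systemctl status every-channel-netboot-stage every-channel-netboot`.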