ops: add CI boot-image releases and Unifi PXE rollout

every.channel 2026-02-28 22:53:59 -08:00
parent 043b1730dc
commit be26313225
9 changed files with 720 additions and 0 deletions

@ -0,0 +1,36 @@
# ECP-0082: Unifi PXE Rollout Path for Runner Images
Status: Implemented
## Context
Runner netboot artifacts now publish from CI, but there is no repository-native operating path for fleet provisioning on common prosumer networks (for example Unifi VLANs).
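Unifi DHCP can expose next-server/bootfile settings, but iPXE chainloading often requires conditional bootfile behavior to avoid loops: the firmware PXE client must receive `ipxe.efi` as the first stage, while iPXE itself must receive the boot script as the second stage. If both stages are handed `ipxe.efi`, iPXE reloads itself indefinitely. Not all controller setups expose that condition cleanly.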
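The conditional behavior described above can be sketched as a small decision function. This is an illustration only, not code from this commit: `select_bootfile`, the default script URL, and the set of accepted architecture codes are assumptions. It relies on two facts: iPXE identifies itself with the DHCP user class `iPXE` (option 77), and x86_64 UEFI firmware reports client architecture 7 or 9 (option 93).

```python
from typing import Optional

IPXE_USER_CLASS = "iPXE"   # DHCP option 77 value sent by iPXE
UEFI_X64_ARCHES = {7, 9}   # DHCP option 93 values seen from x86_64 UEFI clients


def select_bootfile(
    user_class: Optional[str],
    client_arch: int,
    script_url: str = "http://boot.example/boot.ipxe",  # hypothetical URL
) -> Optional[str]:
    """Pick the bootfile for one DHCP request, or None if unsupported."""
    if user_class == IPXE_USER_CLASS:
        # Second stage: iPXE is already running, so hand it the script.
        # Answering with ipxe.efi here is what causes the boot loop.
        return script_url
    if client_arch in UEFI_X64_ARCHES:
        # First stage: firmware PXE client, chainload into iPXE over TFTP.
        return "ipxe.efi"
    return None  # e.g. legacy BIOS clients, which this sketch does not cover
```

A DHCP server that cannot branch on the user class can only return one of these answers to every request, which is the limitation the Decision below works around.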
## Decision
1. Add first-party scripts for local netboot staging and serving:
   - stage x86_64 netboot artifacts from Forgejo Releases (or local tarball),
   - stage iPXE UEFI binary for TFTP,
   - run HTTP + TFTP + ProxyDHCP via `dnsmasq` for deterministic chainloading.
2. Keep Unifi DHCP as the IP authority; use ProxyDHCP only to supply bootfile logic.
3. Document a concrete NUC rollout sequence for same-VLAN provisioning.
4. Keep dependencies minimal (`curl`, `tar`, `python3`, `dnsmasq`) and avoid requiring image flashing workflows.
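As a rough illustration of decisions 1 and 2, the sketch below renders a `dnsmasq` configuration that runs in ProxyDHCP mode, so it never hands out addresses and only supplies boot information. It is not the configuration shipped by `netboot-serve`: the function name, parameters, and menu labels are assumptions, and the directives follow commonly published `dnsmasq` proxy setups, so they should be checked against the `dnsmasq` man page for the installed version.

```python
def render_proxy_dhcp_conf(
    subnet: str, tftp_root: str, script_url: str, interface: str
) -> str:
    """Render a ProxyDHCP-only dnsmasq config (hypothetical helper)."""
    lines = [
        "port=0",                        # disable the DNS server
        f"interface={interface}",
        f"dhcp-range={subnet},proxy",    # proxy mode: no leases, boot info only
        "enable-tftp",
        f"tftp-root={tftp_root}",        # directory that holds ipxe.efi
        "dhcp-userclass=set:ipxe,iPXE",  # tag requests that come from iPXE
    ]
    # x86_64 UEFI clients may report either architecture name, so cover both.
    for arch in ("BC_EFI", "x86-64_EFI"):
        # Firmware PXE (not tagged ipxe): first stage over TFTP.
        lines.append(f'pxe-service=tag:!ipxe,{arch},"Chainload iPXE",ipxe.efi')
        # iPXE itself (tagged ipxe): second stage, the boot script over HTTP.
        lines.append(f'pxe-service=tag:ipxe,{arch},"Boot script",{script_url}')
    lines.append("log-dhcp")             # verbose DHCP logging for debugging
    return "\n".join(lines) + "\n"
```

The rendered text could be written to a file and passed to `dnsmasq --keep-in-foreground --conf-file=<path>`. Because the range is declared as `proxy`, the Unifi DHCP server remains the only source of IP leases on the VLAN.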
## Alternatives considered
- Require Unifi DHCP conditional iPXE rules. Rejected because controller capabilities vary and misconfiguration risks boot loops.
- Keep manual USB-only provisioning. Rejected because it increases labor for multi-node rollout.
- Add a heavy provisioning stack (MAAS/Foreman/Kickstart integration). Rejected as too much operational overhead for current scale.
## Rollout / teardown plan
- Rollout:
  - merge scripts/docs,
  - run `netboot-stage` on the boot server,
  - run `netboot-serve` on the NUC VLAN and boot hosts via PXE.
- Teardown:
  - stop `netboot-serve`,
  - remove staged artifacts under `tmp/netboot`,
  - continue with ISO+USB fallback path.
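The staging and cleanup steps in the plan above might look roughly like the following. This is a minimal sketch under stated assumptions, not the actual `netboot-stage` script: the function names are hypothetical, the tarball layout is assumed to be flat, and the real script may use `curl` and `tar` directly rather than Python.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path


def stage_netboot(source: str, dest: Path) -> list[str]:
    """Unpack a netboot tarball (release URL or local path) into dest."""
    dest.mkdir(parents=True, exist_ok=True)
    if source.startswith(("http://", "https://")):
        # urlretrieve downloads to a temporary file and returns its path.
        archive = Path(urllib.request.urlretrieve(source)[0])
    else:
        archive = Path(source)
    root = dest.resolve()
    with tarfile.open(archive) as tar:
        # Refuse members that would be written outside the staging directory.
        for member in tar.getmembers():
            if not (root / member.name).resolve().is_relative_to(root):
                raise ValueError(f"unsafe path in archive: {member.name}")
        # Use the stricter "data" extraction filter where Python provides it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(root, **extra)
    return sorted(p.name for p in root.iterdir())


def teardown_netboot(dest: Path) -> None:
    """Remove the staged artifacts, e.g. the tmp/netboot directory."""
    shutil.rmtree(dest, ignore_errors=True)
```

For example, `stage_netboot("netboot.tar.gz", Path("tmp/netboot"))` would unpack a local tarball, and `teardown_netboot(Path("tmp/netboot"))` would remove it again once `netboot-serve` has been stopped.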