Advance forge rollout, Ethereum rails, and NBC sources

every.channel 2026-04-01 15:58:49 -07:00
parent be26313225
commit 7d84510eac
88 changed files with 11230 additions and 302 deletions


@ -1,35 +1,88 @@
# NUC Fleet Netboot (Unifi + ProxyDHCP)
# NUC Fleet Netboot (Unifi)
This runbook provisions x86_64 NUCs from runner netboot artifacts without USB image flashing.
It uses:
Supported modes:
- Unifi DHCP for IP leases.
- Local `dnsmasq` ProxyDHCP for PXE/iPXE bootfile logic.
- Local HTTP + TFTP service for boot artifacts.
## Why ProxyDHCP
iPXE commonly needs two boot stages:
1. firmware PXE -> `ipxe.efi`
2. iPXE -> `netboot.ipxe`
If DHCP always returns `ipxe.efi`, clients can loop forever. ProxyDHCP handles stage-specific boot responses cleanly while leaving Unifi as the DHCP lease server.
- ProxyDHCP mode: recommended when you want automatic stage-1/2 iPXE handling.
- UniFi-only mode: DHCP options 66/67 in UniFi, no ProxyDHCP.
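The stage-specific handling that ProxyDHCP mode provides can be sketched as a `dnsmasq` configuration that tags requests coming from iPXE and answers the two boot stages differently. This is a minimal sketch, not the configuration that `netboot-serve.sh` generates: the helper name `print_proxydhcp_conf`, the TFTP root, and the stage-2 URL layout are assumptions for illustration.

```sh
#!/bin/sh
# Hypothetical helper: prints a dnsmasq ProxyDHCP config to stdout.
# Arguments (all assumed values): subnet address, TFTP root, stage-2 script URL.
print_proxydhcp_conf() {
    subnet="$1"
    tftp_root="$2"
    script_url="$3"
    cat <<EOF
# Disable dnsmasq's DNS function; the Unifi DHCP server keeps issuing leases.
port=0
dhcp-range=${subnet},proxy
# Requests sent by iPXE carry the user class "iPXE".
dhcp-userclass=set:ipxe,iPXE
# Stage 1: firmware PXE (no ipxe tag) is told to fetch ipxe.efi over TFTP.
pxe-service=tag:!ipxe,X86-64_EFI,"Chainload iPXE",ipxe.efi
# Stage 2: iPXE itself is pointed at the boot script instead.
pxe-service=tag:ipxe,X86-64_EFI,"Boot script",${script_url}
enable-tftp
tftp-root=${tftp_root}
EOF
}

print_proxydhcp_conf 10.20.30.0 /srv/netboot/tftp \
    http://boot.every.channel:8080/netboot.ipxe
```

Because the firmware and iPXE receive different answers, the client does not fetch `ipxe.efi` a second time, which is what breaks the loop.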
## Prerequisites
- A Linux boot server on the same VLAN/L2 domain as the NUCs.
- Unifi network with normal DHCP enabled.
- Linux boot server on the same VLAN/L2 domain as the NUCs.
- Unifi network with DHCP enabled.
- Local DNS record on that VLAN: `boot.every.channel -> <boot-server-ip>`.
- `curl`, `tar`, `python3`, `dnsmasq` installed on the boot server.
- Runner netboot artifact already published to Forgejo Releases (or available as a local tarball).
- For UniFi-only mode with reliable chainloading: `git` and `make` to build embedded iPXE.
- `openssl` (or equivalent) if you want generated chain tokens.
- Runner netboot artifact published to Forgejo Releases (or available as a local tarball).
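A quick way to confirm the tools listed above are present before starting is sketched below. The helper name `missing_tools` is hypothetical and is not part of the repository scripts.

```sh
#!/bin/sh
# Prints each command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# git, make and openssl only matter for UniFi-only mode and generated tokens.
missing="$(missing_tools curl tar python3 dnsmasq git make openssl)"
if [ -n "$missing" ]; then
    printf 'missing tools:\n%s\n' "$missing" >&2
fi
```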
## 1) Stage artifacts
## Persistent NixOS service (recommended)
From repository root on the boot server:
Instead of running scripts manually, use the exported NixOS module and keep netboot
staging/serving declarative:
```nix
{
  imports = [ every-channel.nixosModules.ec-netboot ];
  services.every-channel.netboot = {
    enable = true;
    listenIP = "10.20.30.2";
    interface = "enp195s0";
    hostname = "boot.every.channel";
    tftpBootFilename = "ec-ipxe.efi";
    httpAllowedCIDRs = [ "10.20.30.0/24" ];
    chainTokenFile = "/run/agenix/every-channel-netboot-chain-token";
    # UniFi-only mode by default (no ProxyDHCP):
    proxyDhcp.enable = false;
    release.host = "https://git.every.channel";
    release.repo = "every-channel/every.channel";
    # release.tag = "boot-v2026.03.02"; # optional pin
    # release.tokenFile = "/run/agenix/forgejo-api-token"; # optional private repo token
  };
}
```
Operational commands:
```sh
sudo systemctl start every-channel-netboot-stage.service
sudo systemctl restart every-channel-netboot.service
sudo systemctl status every-channel-netboot.service
```
If you prefer ProxyDHCP mode:
```nix
services.every-channel.netboot.proxyDhcp.enable = true;
services.every-channel.netboot.proxyDhcp.subnet = "10.20.30.0/24";
```
## 1) Build embedded iPXE (UniFi-only mode)
This removes iPXE boot loops without requiring ProxyDHCP.
```sh
EVERY_CHANNEL_NETBOOT_HOSTNAME=boot.every.channel \
EVERY_CHANNEL_NETBOOT_HTTP_PORT=8080 \
EVERY_CHANNEL_NETBOOT_CHAIN_TOKEN="$(openssl rand -hex 16)" \
./scripts/netboot-build-ipxe.sh
```
Output:
- `tmp/netboot/tftp/ec-ipxe.efi` (use this as DHCP option 67 filename)
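The build, stage, and serve steps all expect the same chain token, so it can help to generate it once and reuse it. The sketch below assumes a hypothetical helper `ensure_chain_token` and an arbitrary file location. Neither is defined by the repository scripts.

```sh
#!/bin/sh
# Hypothetical helper: create a 32-hex-character token once, then reuse it.
ensure_chain_token() {
    token_file="$1"
    if [ ! -s "$token_file" ]; then
        mkdir -p "$(dirname "$token_file")"
        # Restrict permissions before the secret is written.
        (
            umask 077
            if command -v openssl >/dev/null 2>&1; then
                openssl rand -hex 16
            else
                # Fallback when openssl is absent: 16 random bytes as hex.
                od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
                echo
            fi > "$token_file"
        )
    fi
    cat "$token_file"
}

# Usage (tmp/netboot/chain-token is an assumed path):
# EVERY_CHANNEL_NETBOOT_CHAIN_TOKEN="$(ensure_chain_token tmp/netboot/chain-token)"
# export EVERY_CHANNEL_NETBOOT_CHAIN_TOKEN
```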
## 2) Stage runner netboot artifacts
```sh
EVERY_CHANNEL_NETBOOT_HOSTNAME=boot.every.channel \
EVERY_CHANNEL_NETBOOT_CHAIN_TOKEN="<same-token-as-step-1>" \
EVERY_CHANNEL_NETBOOT_IPXE_EFI_PATH=tmp/netboot/tftp/ec-ipxe.efi \
EVERY_CHANNEL_NETBOOT_IPXE_EFI_FILENAME=ec-ipxe.efi \
./scripts/netboot-stage.sh
```
@ -38,16 +91,31 @@ Optional inputs:
- `EVERY_CHANNEL_NETBOOT_RELEASE_TAG=boot-v2026.02.28`
- `EVERY_CHANNEL_NETBOOT_TARBALL=/path/to/ec-runner-x86_64-netboot-....tar.gz`
- `EVERY_CHANNEL_FORGE_TOKEN=<token>` for private releases
- `EVERY_CHANNEL_NETBOOT_HOSTNAME=boot.every.channel`
- `EVERY_CHANNEL_NETBOOT_ALLOW_REMOTE_IPXE=true` only if you intentionally want to download iPXE from URL
- `EVERY_CHANNEL_IPXE_EFI_SHA256=<sha256>` to pin iPXE binary integrity
This stages:
- `tmp/netboot/http/{kernel,initrd,netboot.ipxe}`
- `tmp/netboot/tftp/ipxe.efi`
- `tmp/netboot/tftp/ec-ipxe.efi`
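To obtain or double-check a value for `EVERY_CHANNEL_IPXE_EFI_SHA256`, the staged binary can be hashed by hand. The sketch below is not how `netboot-stage.sh` performs its check; the helper name `verify_sha256` is hypothetical, and it assumes GNU `sha256sum` and a lowercase hex digest.

```sh
#!/bin/sh
# Returns 0 when the file's SHA-256 digest equals the expected lowercase hex string.
verify_sha256() {
    file="$1"
    expected="$2"
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Usage:
# verify_sha256 tmp/netboot/tftp/ec-ipxe.efi "$EVERY_CHANNEL_IPXE_EFI_SHA256" \
#     || echo "iPXE binary does not match the pinned digest" >&2
```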
## 2) Serve HTTP + TFTP + ProxyDHCP
## 3) Serve HTTP + TFTP
Example (replace values for your VLAN):
UniFi-only mode (no ProxyDHCP):
```sh
sudo \
EVERY_CHANNEL_NETBOOT_LISTEN_IP=10.20.30.2 \
EVERY_CHANNEL_NETBOOT_INTERFACE=eth0 \
EVERY_CHANNEL_NETBOOT_HOSTNAME=boot.every.channel \
EVERY_CHANNEL_NETBOOT_CHAIN_TOKEN="<same-token-as-step-1>" \
EVERY_CHANNEL_NETBOOT_HTTP_ALLOWED_CIDRS=10.20.30.0/24 \
EVERY_CHANNEL_NETBOOT_PROXY_DHCP=false \
EVERY_CHANNEL_NETBOOT_TFTP_BOOT_FILENAME=ec-ipxe.efi \
./scripts/netboot-serve.sh
```
ProxyDHCP mode:
```sh
sudo \
@ -55,48 +123,40 @@ sudo \
EVERY_CHANNEL_NETBOOT_INTERFACE=eth0 \
EVERY_CHANNEL_NETBOOT_PROXY_SUBNET=10.20.30.0/24 \
EVERY_CHANNEL_NETBOOT_HOSTNAME=boot.every.channel \
EVERY_CHANNEL_NETBOOT_CHAIN_TOKEN="<same-token-as-step-1>" \
EVERY_CHANNEL_NETBOOT_HTTP_ALLOWED_CIDRS=10.20.30.0/24 \
EVERY_CHANNEL_NETBOOT_PROXY_DHCP=true \
EVERY_CHANNEL_NETBOOT_TFTP_BOOT_FILENAME=ec-ipxe.efi \
./scripts/netboot-serve.sh
```
Notes:
## 4) UniFi settings (you do this)
- Keep this process running during provisioning.
- Do not set Unifi DHCP bootfile options while this proxy mode is active.
- Ensure `boot.every.channel` resolves to the boot server IP from NUC clients.
UniFi-only mode:
## 3) Unifi / NUC settings
- `Network Boot`: enabled
- `Server`: `boot.every.channel` (or boot server IP)
- `Filename`: `ec-ipxe.efi`
- `TFTP Server`: `boot.every.channel`
Unifi:
ProxyDHCP mode:
- Keep DHCP enabled for the provisioning VLAN.
- Leave DHCP boot/TFTP overrides unset when using `netboot-serve.sh`.
- Create/verify local DNS host override: `boot.every.channel -> <boot-server-ip>`.
- Leave UniFi boot/TFTP options unset.
NUC BIOS:
- Enable UEFI network boot (IPv4 PXE).
- Disable Legacy/CSM if possible.
- Put network boot before disk for first install cycle.
- Enable UEFI PXE boot.
- Disable Legacy/CSM where possible.
- Put network boot first for initial install.
## 4) Provision the fleet
## Security hardening
1. Boot each NUC on the provisioning VLAN.
2. PXE will chainload into iPXE and then runner `netboot.ipxe`.
3. Complete install/bootstrap flow on each node.
4. After successful install, switch boot order back to local disk.
## Troubleshooting
- Symptom: iPXE loop (`ipxe.efi` repeatedly)
- Cause: static DHCP bootfile without iPXE-aware logic.
- Fix: use ProxyDHCP flow (`netboot-serve.sh`) or set conditional DHCP rules.
- Symptom: NUC gets IP but never downloads boot artifacts
- Verify firewall allows UDP 67/68, UDP 69, and TCP 8080 between NUCs and boot server.
- Symptom: no `dnsmasq` offers seen
- Verify `EVERY_CHANNEL_NETBOOT_INTERFACE` and `EVERY_CHANNEL_NETBOOT_PROXY_SUBNET`.
## Security / networking
- Tailscale is not required for provisioning.
- Keep the provisioning VLAN isolated from regular clients.
- Stop `netboot-serve.sh` when rollout is complete.
- Keep provisioning on an isolated VLAN.
- Allow only the required ports from the NUC VLAN to the boot server: UDP 69 and TCP 8080 (plus DHCP in ProxyDHCP mode).
- Keep provisioning services up only during rollout, then stop them.
- Use `EVERY_CHANNEL_NETBOOT_HTTP_ALLOWED_CIDRS` to limit HTTP artifact access to NUC subnet(s).
- Use `EVERY_CHANNEL_NETBOOT_CHAIN_TOKEN` so only tokened iPXE chain requests receive `netboot.ipxe`.
- Use checksum verification in `netboot-stage.sh` (enabled by default when release has `SHA256SUMS.txt`).
- `netboot-stage.sh` now defaults to local iPXE binaries; downloading iPXE from a URL requires explicit opt-in via `EVERY_CHANNEL_NETBOOT_ALLOW_REMOTE_IPXE=true`.
- Prefer embedded `ec-ipxe.efi` with fixed chain target over generic unsigned internet binaries.
- If Secure Boot is required, use a signed boot chain and keys for your environment (outside the scope of this basic runbook).
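To illustrate what a CIDR restriction such as `EVERY_CHANNEL_NETBOOT_HTTP_ALLOWED_CIDRS=10.20.30.0/24` admits, the sketch below tests whether an IPv4 address falls inside a block. It only shows the matching rule and makes no claim about how `netboot-serve.sh` enforces it. The helper names `ip_to_int` and `ip_in_cidr` are hypothetical.

```sh
#!/bin/sh
# Convert a dotted-quad IPv4 address (no leading zeros) to a 32-bit integer.
ip_to_int() {
    old_ifs="$IFS"
    IFS=.
    set -- $1
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds when address $1 lies inside CIDR block $2, e.g. 10.20.30.0/24.
ip_in_cidr() {
    addr="$(ip_to_int "$1")"
    net="$(ip_to_int "${2%/*}")"
    bits="${2#*/}"
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Usage:
# ip_in_cidr 10.20.30.17 10.20.30.0/24 && echo "allowed"
```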