governance: normalize ECP 0063-0078 and add ECP-0079

every.channel 2026-02-27 23:34:35 -08:00
parent 5a28a24294
commit fe03ec8f1a
17 changed files with 185 additions and 8 deletions

@@ -1,6 +1,6 @@
 # ECP-0063: Cloudflare MoQ Relay + WebTransport-Only Web Watch
-Status: Draft
+Status: Implemented
 ## Decision
@@ -77,6 +77,11 @@ Implementation choice:
 Web share link:
 - `https://every.channel/watch?url=<relay-url>&name=<broadcast-name>`
+## Alternatives considered
+- Keep the legacy WebRTC/WS path as primary. Rejected because it does not align with relay-native MoQ fanout goals.
+- Wait for full draft parity across all relays before shipping. Rejected because live interop was already sufficient on the chosen relay path.
 ## Rollout / Reversibility
 - Keep existing `/api/*` bootstrap endpoints during migration.
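
The web share link format shown in this hunk can be sketched as a small helper. This is an illustration only, not code from the commit: `build_watch_link`, `WATCH_BASE`, and the relay URL in the usage note are hypothetical, and percent-encoding the query values is an assumption about how the link should be assembled.

```python
from urllib.parse import urlencode

# Base path taken from the share link format in ECP-0063.
WATCH_BASE = "https://every.channel/watch"


def build_watch_link(relay_url: str, broadcast_name: str) -> str:
    """Build a watch link carrying the relay URL and broadcast name as query parameters.

    Assumption: both values are percent-encoded so that a relay URL containing
    its own '?', '&' or '/' characters survives the round trip.
    """
    query = urlencode({"url": relay_url, "name": broadcast_name})
    return f"{WATCH_BASE}?{query}"
```

For example, `build_watch_link("https://relay.example.com/anon", "demo channel")` yields a link whose `url` and `name` parameters decode back to the original values with any standard query-string parser.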

@@ -1,6 +1,6 @@
 # ECP-0064: NixOS Module For `ec-node` WebTransport Publisher (Tower)
-Status: Draft
+Status: Implemented
 ## Decision
@@ -41,8 +41,12 @@ Out of scope (defer):
 - Automatic lineup-based channel selection by callsign.
 - Secrets management (publisher doesn't require secrets for Cloudflare relay preview).
+## Alternatives considered
+- Continue running publishers manually via shells/tmux. Rejected because it is not reproducible or restart-safe.
+- Build a separate external deployment repo first. Rejected because this delays in-repo infrastructure ownership.
 ## Rollout / Reversibility
 - Enabling the module is per-host.
 - Reversible by removing the module import and disabling the service(s); roll back with the existing deployment tooling.

@@ -1,6 +1,6 @@
 # ECP-0065: NixOS Runner Images + Netboot Artifacts
-Status: Draft
+Status: Implemented
 ## Decision
@@ -40,6 +40,11 @@ Out of scope (defer):
 - Remote runtime provisioning (fetching per-node channel lists).
 - Hardware-accelerated transcode changes (keep current CPU x264 baseline).
+## Alternatives considered
+- Keep runner images out-of-repo and publish ad hoc artifacts. Rejected because it weakens reproducibility and provenance.
+- Restrict to one install path only (disk install only). Rejected because netboot/bootstrap is required for fleet recovery.
 ## Rollout / Reversibility
 - Rollout begins with local builds and a single test machine.

@@ -1,6 +1,6 @@
 # ECP-0066: iroh-Gossip Control Protocol For Hybrid MoQ Discovery
-Status: Draft
+Status: Implemented
 ## Decision
@@ -39,6 +39,11 @@ Out of scope:
 - Security policy beyond existing iroh/gossip trust boundaries.
 - Replacing existing catalog gossip immediately (coexist first).
+## Alternatives considered
+- Keep relay and direct discovery completely separate. Rejected because it forces duplicated consumer logic.
+- Replace existing catalog gossip in one cutover. Rejected because additive coexistence is safer for rollout.
 ## Rollout / Reversibility
 - Additive and reversible: removing control commands and topic does not affect existing media paths.

@@ -1,6 +1,6 @@
 # ECP-0067: Control Transport Resolution And NixOS Control Wiring
-Status: Draft
+Status: Implemented
 ## Decision
@@ -32,6 +32,11 @@ Out of scope:
 - End-to-end automatic failover execution (resolve + launch subscribe) in one command.
 - Cryptographic policy hardening beyond current control-topic trust model.
+## Alternatives considered
+- Keep transport selection in ad hoc shell logic. Rejected because policy behavior becomes inconsistent across operators.
+- Wire control flags per host manually. Rejected because it is error-prone and not declarative.
 ## Rollout / Reversibility
 - Additive only: existing relay and direct publish/subscribe paths remain unchanged.

@@ -1,6 +1,6 @@
 # ECP-0068: Iroh Control To Web Directory Bridge
-Status: Draft
+Status: Implemented
 ## Decision
@@ -34,6 +34,11 @@ Out of scope:
 - Signed/authenticated control announcements.
 - Replacing relay playback with direct iroh in browsers.
+## Alternatives considered
+- Keep manual stream naming/link entry on the website. Rejected because it blocks one-click discovery.
+- Bridge directly from browser clients instead of a node command. Rejected because browser trust/availability constraints are higher.
 ## Rollout / Reversibility
 - Additive change; existing `/api/directory` and watch-by-link behavior remain intact.

@@ -1,6 +1,6 @@
 # ECP-0069: NixOS Control Bridge Auto-Bootstrap
-Status: Draft
+Status: Implemented
 ## Decision
@@ -31,6 +31,11 @@ Out of scope:
 - Signed control announcements.
 - Browser-native iroh direct transport playback.
+## Alternatives considered
+- Continue manual gossip peer bootstrapping for the bridge. Rejected because restarts/reboots cause repeated operational toil.
+- Use static peer lists only. Rejected because local publisher sets are dynamic and should be discovered from runtime endpoint files.
 ## Rollout / Reversibility
 - Additive: existing publisher behavior is unchanged when `control.bridgeWeb.enable = false`.

@@ -1,5 +1,7 @@
 # ECP-0070: Relay-Native CAS Archival + NixOS Auto-Archive Service
+Status: Implemented
 ## Summary
 Add a first-party archival path for MoQ relay streams:
@@ -48,6 +50,11 @@ Tradeoffs:
 - Discovery source is the web public stream list (not full control-topic gossip ingestion).
 - Per-broadcast workers are process-based and best-effort supervised.
+## Alternatives considered
+- Rely on browser-side replay caches only. Rejected because it does not provide durable archival storage.
+- Archive only manifests without CAS payloads. Rejected because replay/integrity requires retained object bytes.
 ## Rollout
 1. Ship `wt-archive` command in `ec-node`.

@@ -1,5 +1,7 @@
 # ECP-0071: Archive Replay DVR Endpoints
+Status: Implemented
 ## Context
 ECP-0070 added relay archival (`wt-archive`) into CAS objects plus JSONL indexes, but there is no read path for viewers to scrub historical content.
@@ -26,6 +28,16 @@ Add an archive replay path with these pieces:
 - Preserves CAS as source of truth; playlists are derived views.
 - Uses standard HLS+DVR semantics so browser playback + scrubbing works without custom protocol work in the short term.
+## Alternatives considered
+- Build a custom replay protocol/UI instead of HLS. Rejected because browser DVR support is stronger with standard HLS tooling.
+- Serve archive from a separate domain only. Rejected because same-domain replay keeps watch links and CORS simpler.
+## Rollout / teardown
+- Enable archive serve mode on archive hosts and deploy worker proxy routing to `/api/archive/*`.
+- Teardown by disabling `archive.serve.enable` and removing proxy routing.
 ## Reversibility
 - Disable `archive.serve.enable` and remove worker proxy route to revert to archive-only mode.

@@ -1,5 +1,7 @@
 # ECP-0072: CMAF Seedbox Invariant For Relay Archive
+Status: Implemented
 ## Context
 Archive replay currently stores and serves relay groups exactly as received, but many existing broadcasts were published in `legacy` container mode. Those bytes are not browser-HLS compatible, so archive playback fails despite a valid timeline and object store.
@@ -20,6 +22,16 @@ Update the NixOS module default `services.every-channel.ec-node.passthrough = tr
 - Exact-byte retention avoids drift between live and replay.
 - Browsers can play CMAF fragments via standard HLS tooling; no custom legacy converter is required for new streams.
+## Alternatives considered
+- Keep `passthrough=false` as default for all publishers. Rejected because archive replay needs byte-compatible CMAF fragments.
+- Re-encode archived payloads during replay. Rejected because it adds complexity and breaks exact-byte history semantics.
+## Rollout / teardown
+- Flip default `passthrough` to true in CLI and Nix module, then verify new archives play via HLS.
+- Teardown by explicitly setting `passthrough=false` on hosts needing legacy framing.
 ## Reversibility
 - Operators can explicitly set `passthrough = false` per host to revert to legacy framing.

@@ -1,5 +1,7 @@
 # ECP-0073: Archive Relay Affinity Override
+Status: Implemented
 ## Context
 `wt-archive` workers discover streams from `/api/public-streams` and subscribe to the listed `relay_url`. In practice, `cdn.moq.dev` resolves to region-local relay IPs, and broadcasts published from one region are not consistently visible from another region endpoint.
@@ -22,6 +24,11 @@ This allows operators to pin archive ingestion to the same relay endpoint used b
 - Keeps deployment-level control in Nix (no app-level migration needed).
 - Reversible with a single config change.
+## Alternatives considered
+- Keep subscribing to directory-provided `relay_url` only. Rejected because cross-region visibility is inconsistent in practice.
+- Rewrite directory entries per-region. Rejected because this mixes deployment affinity into public directory payloads.
 ## Rollout
 1. Set `archive.relayUrlOverride` on archive hosts that need relay affinity.

@@ -1,5 +1,7 @@
 # ECP-0074: Archive HLS Engine Selection For Chromium
+Status: Implemented
 ## Context
 Archive mode currently chooses native HLS whenever `video.canPlayType("application/vnd.apple.mpegurl")` is non-empty.
@@ -16,6 +18,16 @@ Use native HLS only on Safari/iOS user agents. For all other browsers (including
 - Keeps Safari native path where it is reliable.
 - Preserves a single URL and UI flow (`/api/archive/.../master.m3u8`).
+## Alternatives considered
+- Keep `canPlayType` as the only gate. Rejected because Chromium reports support but fails event-style playback.
+- Force `hls.js` for all browsers including Safari. Rejected because Safari native playback is already reliable and simpler.
+## Rollout / teardown
+- Deploy UA-gated engine selection in web app and validate archive playback on Chromium and Safari.
+- Teardown by reverting to the previous generic `canPlayType` gate.
 ## Reversibility
 Revert the UA gate and return to the previous `canPlayType`-only check.
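
The UA-gated engine choice described in ECP-0074 can be expressed as a small decision function. The real check would run in browser JavaScript; Python is used here only to spell out the decision table. The function names, the user-agent patterns, and the `"native"` / `"hls.js"` return labels are assumptions for illustration, not the web app's actual code.

```python
import re

# Assumed patterns: tokens that mark Chromium-family or other non-Safari browsers
# whose user agents still contain the word "Safari".
_NON_SAFARI_TOKENS = re.compile(r"Chrome|Chromium|CriOS|Edg|OPR|Android")
_IOS_DEVICES = re.compile(r"iPad|iPhone|iPod")


def is_safari_or_ios(user_agent: str) -> bool:
    """Return True for iOS devices (all WebKit-based) and desktop Safari."""
    if _IOS_DEVICES.search(user_agent):
        return True
    return "Safari" in user_agent and not _NON_SAFARI_TOKENS.search(user_agent)


def choose_archive_engine(user_agent: str, can_play_type: str) -> str:
    """Pick the archive playback engine.

    `can_play_type` stands in for the string returned by
    video.canPlayType("application/vnd.apple.mpegurl"); a non-empty value
    alone is no longer enough to select native HLS.
    """
    if is_safari_or_ios(user_agent) and can_play_type:
        return "native"
    return "hls.js"
```

Under these assumptions a Chromium user agent is routed to `hls.js` even when `canPlayType` reports `"maybe"`, which is the failure case the ECP describes.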

@@ -1,5 +1,7 @@
 # ECP-0075: Bump Web Watcher To `@moq/watch@0.2.0`
+Status: Implemented
 ## Context
 Production web watchers currently load `@moq/watch@0.1.1`. Under live OTA relay streams, Chromium sessions frequently emit runtime failures (`VideoFrame clone` errors and repeated stream resets), leaving playback stalled even after successful subscribe.
@@ -15,6 +17,16 @@ Set both `name` and `path` attributes on `<moq-watch>` so minor-version attribut
 - Pulls in upstream runtime fixes without introducing new local playback logic.
 - Preserves multi-CDN fallback behavior already used for dependency resilience.
+## Alternatives considered
+- Keep pin at `0.1.1` and add larger local workarounds. Rejected because upstream fixes already address core runtime failures.
+- Switch to a different browser player stack immediately. Rejected because this is higher risk than a targeted minor-version bump.
+## Rollout / teardown
+- Roll out `@moq/watch@0.2.0` on all CDN import fallbacks and verify live subscribe/playback.
+- Teardown by repinning imports to `0.1.1`.
 ## Reversibility
 - Roll back by pinning imports back to `0.1.1` if regressions appear.

@@ -1,5 +1,7 @@
 # ECP-0076: WebTransport-Only Browser Watcher Path
+Status: Implemented
 ## Context
 The browser watcher (`@moq/watch`) races WebTransport against WebSocket fallback by default. In production relay sessions this fallback path correlates with degraded playback behavior (frequent stream resets and unreliable audio despite active subscription).
@@ -18,6 +20,16 @@ Also set default watcher volume to full (`volume="1"`). Keep canvas live renderi
 - Removes fallback-induced variability from live playback behavior.
 - Keeps implementation local to web app wiring without forking upstream packages.
+## Alternatives considered
+- Leave WebSocket fallback enabled. Rejected because fallback races correlated with unstable live playback.
+- Fork upstream watcher package for a custom transport stack. Rejected because app-level wiring changes were sufficient.
+## Rollout / teardown
+- Deploy connection override to disable websocket fallback and validate live session stability.
+- Teardown by removing the override and restoring default transport behavior.
 ## Reversibility
 - Remove the connection override to restore default fallback behavior.

@@ -1,5 +1,7 @@
 # ECP-0077: Explicit AAC-LC Live Audio Profile In `wt-publish`
+Status: Implemented
 ## Context
 Live OTA inputs expose multiple AC-3 audio tracks (5.1 + stereo language variants). Browser watcher behavior is more stable when the published relay stream has a single explicit AAC-LC stereo track shape.
@@ -22,6 +24,16 @@ In `ec-node wt-publish` transcode mode, force explicit stream mapping and AAC pr
 - Keeps audio encoding browser-friendly and deterministic.
 - Preserves optional audio behavior (`0:a:0?`) for edge cases where input temporarily lacks audio.
+## Alternatives considered
+- Keep ffmpeg auto stream selection/profile defaults. Rejected because multi-track OTA inputs produced unstable browser outcomes.
+- Preserve AC-3 passthrough for all sources. Rejected because browser compatibility is weaker than explicit AAC-LC stereo.
+## Rollout / teardown
+- Enable explicit audio mapping/profile in `wt-publish` transcode mode and verify browser playback across OTA sources.
+- Teardown by removing explicit `-map` and AAC profile options.
 ## Reversibility
 - Revert to ffmpeg auto mapping/profile by removing explicit `-map` and `-profile:a` flags.
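
The explicit mapping and profile described in ECP-0077 might translate into an ffmpeg argument list along these lines. Only `-map`, `-profile:a`, and the `0:a:0?` specifier appear in the ECP text; the video map `0:v:0`, the `aac` encoder name, the `aac_low` profile value, and `-ac 2` are assumptions about how a single AAC-LC stereo track would typically be requested, and `explicit_audio_args` is a hypothetical helper, not the `ec-node` implementation.

```python
def explicit_audio_args() -> list[str]:
    """Return ffmpeg arguments that pin stream selection and the audio profile.

    The trailing '?' on the audio map makes the stream optional, so the
    command does not fail when the input temporarily has no audio.
    """
    return [
        "-map", "0:v:0",          # assumed: first video stream only
        "-map", "0:a:0?",         # first audio stream, optional (from the ECP)
        "-c:a", "aac",            # assumed: ffmpeg's native AAC encoder
        "-profile:a", "aac_low",  # assumed value selecting AAC-LC
        "-ac", "2",               # assumed: downmix to stereo
    ]
```

Keeping the arguments in one list makes the teardown path equally small: dropping the `-map` and `-profile:a` pairs returns ffmpeg to its automatic stream and profile selection.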

@@ -1,5 +1,7 @@
 # ECP-0078: Live `<video>`-First Rendering With Gesture Audio Unlock
+Status: Implemented
 ## Context
 Live browser playback currently prioritizes canvas rendering. Audio can fail on first load due to autoplay policy (`AudioContext was not allowed to start`) and we still need a robust `<video>` rendering path for native controls.
@@ -19,6 +21,16 @@ In the web watcher mount path:
 - Preserves the `<video>` UX target while handling browser autoplay constraints explicitly.
 - Keeps changes local to app wiring without forking upstream MoQ player internals.
+## Alternatives considered
+- Keep canvas-first rendering only. Rejected because native `<video>` controls/audio handling are still required.
+- Attempt autoplay with unmuted audio by default. Rejected because browser policy blocks reliable first-play behavior.
+## Rollout / teardown
+- Deploy muted-start plus gesture unlock wiring and validate first-load playback and unmute behavior.
+- Teardown by removing unlock wiring or reverting to prior renderer mode.
 ## Reversibility
 - Remove the unlock wiring (or return to canvas renderer) to restore prior behavior.

@@ -0,0 +1,45 @@
+# ECP-0079: Governance Hygiene, CI Quality Gates, and Main-Branch Protection
+Status: Implemented
+## Context
+Recent delivery velocity improved product behavior, but governance and quality signals drifted:
+- active ECPs were not consistently marked with explicit status and alternatives;
+- pull requests lacked a single, explicit CI gate for core tests plus web build;
+- deploy could proceed without an explicit prerequisite check job;
+- branch protection settings were not codified as an operator runbook artifact.
+This conflicts with the constitutional requirement that non-trivial changes remain reviewable and merge through pull requests.
+## Decision
+1. Normalize governance records for the active proposal window (`ECP-0063` through `ECP-0078`):
+   - mark implemented decisions as `Status: Implemented`,
+   - add explicit `Alternatives considered` sections,
+   - ensure rollout/teardown intent is present.
+2. Add `scripts/ecp-lint.sh` and run it in CI to enforce required ECP sections for active proposals.
+3. Add a `ci-gates` workflow for pull requests that runs:
+   - ECP lint,
+   - core Rust test subset,
+   - `apps/web` production build.
+4. Update deploy workflow to include a dedicated `checks` job and make deploy depend on that job.
+5. Correct Cloudflare deploy docs so manual commands and secret prerequisites match current implementation.
+6. Add a branch-protection enforcement script and runbook so `main` can be locked to PR merges with required checks.
+## Alternatives considered
+- Keep governance cleanup manual and ad hoc. Rejected because drift reappears quickly under fast iteration.
+- Gate only deploy, not pull requests. Rejected because review-time feedback is required before merge.
+- Rely on UI-only branch protection configuration with no repo script/runbook. Rejected because settings become opaque and harder to audit.
+## Rollout / teardown plan
+- Rollout:
+  - land ECP updates + lint script + CI workflows + docs + branch-protection tooling together;
+  - apply branch protection using the new script;
+  - set required check context to `ci-gates / checks`.
+- Teardown:
+  - remove `ci-gates` workflow and lint script if governance process is superseded;
+  - relax branch protection via API/script and adjust constitutional process in a superseding ECP.
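
Decision item 2 introduces `scripts/ecp-lint.sh`, whose contents are not shown in this commit view. The kind of check it enforces can be sketched as follows. This is a hypothetical re-expression in Python, not the shell script itself: the `lint_ecp` name, the exact set of required sections, and the heading patterns are assumptions inferred from the normalization rules listed in the Decision.

```python
import re

# Assumed requirements, derived from the Decision list: a Status line,
# an "Alternatives considered" heading, and a rollout or reversibility heading.
REQUIRED_PATTERNS = {
    "status": re.compile(r"^Status:\s*\S+", re.MULTILINE),
    "alternatives": re.compile(
        r"^##\s+Alternatives considered\s*$", re.MULTILINE | re.IGNORECASE
    ),
    "rollout": re.compile(
        r"^##\s+(Rollout|Reversibility)\b.*$", re.MULTILINE | re.IGNORECASE
    ),
}


def lint_ecp(text: str) -> list[str]:
    """Return the names of required sections missing from one ECP document."""
    return [
        name for name, pattern in REQUIRED_PATTERNS.items()
        if not pattern.search(text)
    ]
```

A CI step built on this idea would run the check over every active ECP file and fail the job when any document returns a non-empty list, which is what lets the `ci-gates` workflow block a pull request before merge.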