governance: normalize ECP 0063-0078 and add ECP-0079
parent 5a28a24294
commit fe03ec8f1a
17 changed files with 185 additions and 8 deletions

@@ -1,6 +1,6 @@
 # ECP-0063: Cloudflare MoQ Relay + WebTransport-Only Web Watch

-Status: Draft
+Status: Implemented

 ## Decision

@@ -77,6 +77,11 @@ Implementation choice:
 Web share link:
 - `https://every.channel/watch?url=<relay-url>&name=<broadcast-name>`

+## Alternatives considered
+
+- Keep the legacy WebRTC/WS path as primary. Rejected because it does not align with relay-native MoQ fanout goals.
+- Wait for full draft parity across all relays before shipping. Rejected because live interop was already sufficient on the chosen relay path.
+
 ## Rollout / Reversibility

 - Keep existing `/api/*` bootstrap endpoints during migration.
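
The web share link format in ECP-0063 takes a relay URL and a broadcast name as query parameters, and both need percent-encoding because the relay URL itself contains `:` and `/`. The following is a minimal sketch of building such a link, not the project's implementation; the relay URL and broadcast name used here are hypothetical placeholders.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base path taken from the share-link format quoted in ECP-0063.
WATCH_BASE = "https://every.channel/watch"


def build_watch_link(relay_url: str, broadcast_name: str) -> str:
    """Return a share link with both query parameters percent-encoded."""
    query = urlencode({"url": relay_url, "name": broadcast_name})
    return f"{WATCH_BASE}?{query}"


# Hypothetical relay URL and broadcast name, for illustration only.
link = build_watch_link("https://relay.example.com/anon", "demo channel")

# Round-trip check: decoding the query recovers the original values.
decoded = parse_qs(urlsplit(link).query)
print(link)
print(decoded["url"][0], decoded["name"][0])
```

Encoding through `urlencode` keeps a relay URL that carries its own query string from being split into extra top-level parameters.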

@@ -1,6 +1,6 @@
 # ECP-0064: NixOS Module For `ec-node` WebTransport Publisher (Tower)

-Status: Draft
+Status: Implemented

 ## Decision

@@ -41,8 +41,12 @@ Out of scope (defer):
 - Automatic lineup-based channel selection by callsign.
 - Secrets management (publisher doesn't require secrets for Cloudflare relay preview).

+## Alternatives considered
+
+- Continue running publishers manually via shells/tmux. Rejected because it is not reproducible or restart-safe.
+- Build a separate external deployment repo first. Rejected because this delays in-repo infrastructure ownership.

 ## Rollout / Reversibility

 - Enabling the module is per-host.
 - Reversible by removing the module import and disabling the service(s); roll back with the existing deployment tooling.

@@ -1,6 +1,6 @@
 # ECP-0065: NixOS Runner Images + Netboot Artifacts

-Status: Draft
+Status: Implemented

 ## Decision

@@ -40,6 +40,11 @@ Out of scope (defer):
 - Remote runtime provisioning (fetching per-node channel lists).
 - Hardware-accelerated transcode changes (keep current CPU x264 baseline).

+## Alternatives considered
+
+- Keep runner images out-of-repo and publish ad hoc artifacts. Rejected because it weakens reproducibility and provenance.
+- Restrict to one install path only (disk install only). Rejected because netboot/bootstrap is required for fleet recovery.
+
 ## Rollout / Reversibility

 - Rollout begins with local builds and a single test machine.

@@ -1,6 +1,6 @@
 # ECP-0066: iroh-Gossip Control Protocol For Hybrid MoQ Discovery

-Status: Draft
+Status: Implemented

 ## Decision

@@ -39,6 +39,11 @@ Out of scope:
 - Security policy beyond existing iroh/gossip trust boundaries.
 - Replacing existing catalog gossip immediately (coexist first).

+## Alternatives considered
+
+- Keep relay and direct discovery completely separate. Rejected because it forces duplicated consumer logic.
+- Replace existing catalog gossip in one cutover. Rejected because additive coexistence is safer for rollout.
+
 ## Rollout / Reversibility

 - Additive and reversible: removing control commands and topic does not affect existing media paths.

@@ -1,6 +1,6 @@
 # ECP-0067: Control Transport Resolution And NixOS Control Wiring

-Status: Draft
+Status: Implemented

 ## Decision

@@ -32,6 +32,11 @@ Out of scope:
 - End-to-end automatic failover execution (resolve + launch subscribe) in one command.
 - Cryptographic policy hardening beyond current control-topic trust model.

+## Alternatives considered
+
+- Keep transport selection in ad hoc shell logic. Rejected because policy behavior becomes inconsistent across operators.
+- Wire control flags per host manually. Rejected because it is error-prone and not declarative.
+
 ## Rollout / Reversibility

 - Additive only: existing relay and direct publish/subscribe paths remain unchanged.

@@ -1,6 +1,6 @@
 # ECP-0068: Iroh Control To Web Directory Bridge

-Status: Draft
+Status: Implemented

 ## Decision

@@ -34,6 +34,11 @@ Out of scope:
 - Signed/authenticated control announcements.
 - Replacing relay playback with direct iroh in browsers.

+## Alternatives considered
+
+- Keep manual stream naming/link entry on the website. Rejected because it blocks one-click discovery.
+- Bridge directly from browser clients instead of a node command. Rejected because browser trust/availability constraints are higher.
+
 ## Rollout / Reversibility

 - Additive change; existing `/api/directory` and watch-by-link behavior remain intact.

@@ -1,6 +1,6 @@
 # ECP-0069: NixOS Control Bridge Auto-Bootstrap

-Status: Draft
+Status: Implemented

 ## Decision

@@ -31,6 +31,11 @@ Out of scope:
 - Signed control announcements.
 - Browser-native iroh direct transport playback.

+## Alternatives considered
+
+- Continue manual gossip peer bootstrapping for the bridge. Rejected because restarts/reboots cause repeated operational toil.
+- Use static peer lists only. Rejected because local publisher sets are dynamic and should be discovered from runtime endpoint files.
+
 ## Rollout / Reversibility

 - Additive: existing publisher behavior is unchanged when `control.bridgeWeb.enable = false`.

@@ -1,5 +1,7 @@
 # ECP-0070: Relay-Native CAS Archival + NixOS Auto-Archive Service

+Status: Implemented
+
 ## Summary

 Add a first-party archival path for MoQ relay streams:

@@ -48,6 +50,11 @@ Tradeoffs:
 - Discovery source is the web public stream list (not full control-topic gossip ingestion).
 - Per-broadcast workers are process-based and best-effort supervised.

+## Alternatives considered
+
+- Rely on browser-side replay caches only. Rejected because it does not provide durable archival storage.
+- Archive only manifests without CAS payloads. Rejected because replay/integrity requires retained object bytes.
+
 ## Rollout

 1. Ship `wt-archive` command in `ec-node`.

@@ -1,5 +1,7 @@
 # ECP-0071: Archive Replay DVR Endpoints

+Status: Implemented
+
 ## Context

 ECP-0070 added relay archival (`wt-archive`) into CAS objects plus JSONL indexes, but there is no read path for viewers to scrub historical content.

@@ -26,6 +28,16 @@ Add an archive replay path with these pieces:
 - Preserves CAS as source of truth; playlists are derived views.
 - Uses standard HLS+DVR semantics so browser playback + scrubbing works without custom protocol work in the short term.

+## Alternatives considered
+
+- Build a custom replay protocol/UI instead of HLS. Rejected because browser DVR support is stronger with standard HLS tooling.
+- Serve archive from a separate domain only. Rejected because same-domain replay keeps watch links and CORS simpler.
+
+## Rollout / teardown
+
+- Enable archive serve mode on archive hosts and deploy worker proxy routing to `/api/archive/*`.
+- Teardown by disabling `archive.serve.enable` and removing proxy routing.
+
 ## Reversibility

 - Disable `archive.serve.enable` and remove worker proxy route to revert to archive-only mode.
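
ECP-0071 describes playlists as derived views over the CAS objects and their JSONL indexes. The sketch below shows one way an event-style HLS media playlist could be derived from index lines. It is an illustration under assumptions, not the project's replay code: the index field names `uri` and `duration` and the `init.mp4` initialization segment name are hypothetical, since the real index schema is not shown here.

```python
import json
import math


def build_event_playlist(index_lines, init_uri="init.mp4"):
    """Derive an HLS EVENT playlist from JSONL index lines.

    Each line is assumed (hypothetically) to look like:
    {"uri": "seg-000.m4s", "duration": 2.0}
    """
    entries = [json.loads(line) for line in index_lines if line.strip()]
    longest = max((e["duration"] for e in entries), default=1)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:7",
        # Target duration must be an integer covering the longest segment.
        f"#EXT-X-TARGETDURATION:{max(1, math.ceil(longest))}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        # EVENT playlists only grow, so players can seek back to the start.
        "#EXT-X-PLAYLIST-TYPE:EVENT",
        f'#EXT-X-MAP:URI="{init_uri}"',
    ]
    for entry in entries:
        lines.append(f"#EXTINF:{entry['duration']:.3f},")
        lines.append(entry["uri"])
    return "\n".join(lines) + "\n"


sample_index = [
    '{"uri": "seg-000.m4s", "duration": 2.0}',
    '{"uri": "seg-001.m4s", "duration": 1.5}',
]
playlist = build_event_playlist(sample_index)
print(playlist)
```

Because the playlist is rebuilt from the index on each request, the stored objects stay untouched and remain the source of truth.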

@@ -1,5 +1,7 @@
 # ECP-0072: CMAF Seedbox Invariant For Relay Archive

+Status: Implemented
+
 ## Context

 Archive replay currently stores and serves relay groups exactly as received, but many existing broadcasts were published in `legacy` container mode. Those bytes are not browser-HLS compatible, so archive playback fails despite a valid timeline and object store.

@@ -20,6 +22,16 @@ Update the NixOS module default `services.every-channel.ec-node.passthrough = tr
 - Exact-byte retention avoids drift between live and replay.
 - Browsers can play CMAF fragments via standard HLS tooling; no custom legacy converter is required for new streams.

+## Alternatives considered
+
+- Keep `passthrough=false` as default for all publishers. Rejected because archive replay needs byte-compatible CMAF fragments.
+- Re-encode archived payloads during replay. Rejected because it adds complexity and breaks exact-byte history semantics.
+
+## Rollout / teardown
+
+- Flip default `passthrough` to true in CLI and Nix module, then verify new archives play via HLS.
+- Teardown by explicitly setting `passthrough=false` on hosts needing legacy framing.
+
 ## Reversibility

 - Operators can explicitly set `passthrough = false` per host to revert to legacy framing.

@@ -1,5 +1,7 @@
 # ECP-0073: Archive Relay Affinity Override

+Status: Implemented
+
 ## Context

 `wt-archive` workers discover streams from `/api/public-streams` and subscribe to the listed `relay_url`. In practice, `cdn.moq.dev` resolves to region-local relay IPs, and broadcasts published from one region are not consistently visible from another region endpoint.

@@ -22,6 +24,11 @@ This allows operators to pin archive ingestion to the same relay endpoint used b
 - Keeps deployment-level control in Nix (no app-level migration needed).
 - Reversible with a single config change.

+## Alternatives considered
+
+- Keep subscribing to directory-provided `relay_url` only. Rejected because cross-region visibility is inconsistent in practice.
+- Rewrite directory entries per-region. Rejected because this mixes deployment affinity into public directory payloads.
+
 ## Rollout

 1. Set `archive.relayUrlOverride` on archive hosts that need relay affinity.
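
The override in ECP-0073 amounts to a precedence rule: if the operator has set `archive.relayUrlOverride`, archive workers subscribe there, otherwise they use the `relay_url` from the directory. A minimal sketch of that rule follows; the function name and the example URLs are hypothetical, and the actual `wt-archive` logic may differ.

```python
from typing import Optional


def resolve_archive_relay(directory_relay_url: str,
                          relay_url_override: Optional[str] = None) -> str:
    """Prefer the operator override; fall back to the directory relay_url."""
    override = (relay_url_override or "").strip()
    return override if override else directory_relay_url


# Hypothetical URLs, for illustration only.
listed = "https://cdn.example.net/anon"
pinned = "https://relay-region-a.example.net/anon"

print(resolve_archive_relay(listed))          # no override: directory value
print(resolve_archive_relay(listed, pinned))  # override wins
```

Treating an empty or whitespace-only override as unset means removing the config value restores the directory-driven behavior, which matches the single-change reversibility noted in the ECP.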

@@ -1,5 +1,7 @@
 # ECP-0074: Archive HLS Engine Selection For Chromium

+Status: Implemented
+
 ## Context

 Archive mode currently chooses native HLS whenever `video.canPlayType("application/vnd.apple.mpegurl")` is non-empty.

@@ -16,6 +18,16 @@ Use native HLS only on Safari/iOS user agents. For all other browsers (including
 - Keeps Safari native path where it is reliable.
 - Preserves a single URL and UI flow (`/api/archive/.../master.m3u8`).

+## Alternatives considered
+
+- Keep `canPlayType` as the only gate. Rejected because Chromium reports support but fails event-style playback.
+- Force `hls.js` for all browsers including Safari. Rejected because Safari native playback is already reliable and simpler.
+
+## Rollout / teardown
+
+- Deploy UA-gated engine selection in web app and validate archive playback on Chromium and Safari.
+- Teardown by reverting to the previous generic `canPlayType` gate.
+
 ## Reversibility

 Revert the UA gate and return to the previous `canPlayType`-only check.
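
The UA gate in ECP-0074 can be read as a two-input decision: the `canPlayType` result and the user agent string. The sketch below expresses that decision as a pure function. It is written in Python only to show the logic; the real check lives in the web app, and the regular expressions, function name, and the assumption that non-Safari browsers fall through to `hls.js` are illustrative readings of the ECP text rather than the shipped code.

```python
import re

# Tokens that mark a non-Safari browser even though its UA contains "Safari".
NON_SAFARI = re.compile(r"Chrome|Chromium|CriOS|Edg|OPR|Android")
IOS_DEVICE = re.compile(r"iPhone|iPad|iPod")


def pick_hls_engine(user_agent: str, can_play_native: bool) -> str:
    """Return "native" only for Safari/iOS user agents, else "hls.js"."""
    is_ios = IOS_DEVICE.search(user_agent) is not None
    is_safari = "Safari" in user_agent and NON_SAFARI.search(user_agent) is None
    if can_play_native and (is_ios or is_safari):
        return "native"
    return "hls.js"


CHROME_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
SAFARI_UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
             "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 "
             "Safari/605.1.15")

print(pick_hls_engine(CHROME_UA, True))
print(pick_hls_engine(SAFARI_UA, True))
```

Chromium user agents include the token `Safari`, which is why the sketch excludes Chrome-family tokens before treating a browser as Safari.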

@@ -1,5 +1,7 @@
 # ECP-0075: Bump Web Watcher To `@moq/watch@0.2.0`

+Status: Implemented
+
 ## Context

 Production web watchers currently load `@moq/watch@0.1.1`. Under live OTA relay streams, Chromium sessions frequently emit runtime failures (`VideoFrame clone` errors and repeated stream resets), leaving playback stalled even after successful subscribe.

@@ -15,6 +17,16 @@ Set both `name` and `path` attributes on `<moq-watch>` so minor-version attribut
 - Pulls in upstream runtime fixes without introducing new local playback logic.
 - Preserves multi-CDN fallback behavior already used for dependency resilience.

+## Alternatives considered
+
+- Keep pin at `0.1.1` and add larger local workarounds. Rejected because upstream fixes already address core runtime failures.
+- Switch to a different browser player stack immediately. Rejected because this is higher risk than a targeted minor-version bump.
+
+## Rollout / teardown
+
+- Roll out `@moq/watch@0.2.0` on all CDN import fallbacks and verify live subscribe/playback.
+- Teardown by repinning imports to `0.1.1`.
+
 ## Reversibility

 - Roll back by pinning imports back to `0.1.1` if regressions appear.

@@ -1,5 +1,7 @@
 # ECP-0076: WebTransport-Only Browser Watcher Path

+Status: Implemented
+
 ## Context

 The browser watcher (`@moq/watch`) races WebTransport against WebSocket fallback by default. In production relay sessions this fallback path correlates with degraded playback behavior (frequent stream resets and unreliable audio despite active subscription).

@@ -18,6 +20,16 @@ Also set default watcher volume to full (`volume="1"`). Keep canvas live renderi
 - Removes fallback-induced variability from live playback behavior.
 - Keeps implementation local to web app wiring without forking upstream packages.

+## Alternatives considered
+
+- Leave WebSocket fallback enabled. Rejected because fallback races correlated with unstable live playback.
+- Fork upstream watcher package for a custom transport stack. Rejected because app-level wiring changes were sufficient.
+
+## Rollout / teardown
+
+- Deploy connection override to disable websocket fallback and validate live session stability.
+- Teardown by removing the override and restoring default transport behavior.
+
 ## Reversibility

 - Remove the connection override to restore default fallback behavior.

@@ -1,5 +1,7 @@
 # ECP-0077: Explicit AAC-LC Live Audio Profile In `wt-publish`

+Status: Implemented
+
 ## Context

 Live OTA inputs expose multiple AC-3 audio tracks (5.1 + stereo language variants). Browser watcher behavior is more stable when the published relay stream has a single explicit AAC-LC stereo track shape.

@@ -22,6 +24,16 @@ In `ec-node wt-publish` transcode mode, force explicit stream mapping and AAC pr
 - Keeps audio encoding browser-friendly and deterministic.
 - Preserves optional audio behavior (`0:a:0?`) for edge cases where input temporarily lacks audio.

+## Alternatives considered
+
+- Keep ffmpeg auto stream selection/profile defaults. Rejected because multi-track OTA inputs produced unstable browser outcomes.
+- Preserve AC-3 passthrough for all sources. Rejected because browser compatibility is weaker than explicit AAC-LC stereo.
+
+## Rollout / teardown
+
+- Enable explicit audio mapping/profile in `wt-publish` transcode mode and verify browser playback across OTA sources.
+- Teardown by removing explicit `-map` and AAC profile options.
+
 ## Reversibility

 - Revert to ffmpeg auto mapping/profile by removing explicit `-map` and `-profile:a` flags.
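
ECP-0077 names three pieces of the ffmpeg invocation: explicit `-map` options, the optional audio selector `0:a:0?`, and a `-profile:a` setting. The sketch below assembles an argument list with that shape. Only those three come from the ECP text; the video map `0:v:0`, the `libx264` and `aac` codec choices, the `-ac 2` stereo downmix, and the input and output names are assumptions for illustration, and the real `wt-publish` command line is likely to carry more options.

```python
import shlex
from typing import List


def build_transcode_args(input_url: str, output: str) -> List[str]:
    """Build ffmpeg arguments for one video track plus one AAC-LC stereo track."""
    return [
        "ffmpeg",
        "-i", input_url,
        "-map", "0:v:0",          # first video stream (assumed)
        "-map", "0:a:0?",         # first audio stream; "?" makes it optional
        "-c:v", "libx264",        # assumed video codec
        "-c:a", "aac",            # ffmpeg's native AAC encoder
        "-profile:a", "aac_low",  # AAC-LC profile
        "-ac", "2",               # downmix to stereo (assumed)
        output,
    ]


# Hypothetical input and output, for illustration only.
args = build_transcode_args("udp://239.0.0.1:1234", "out.mp4")
print(shlex.join(args))
```

Mapping a single audio stream explicitly is what keeps the extra AC-3 language tracks of an OTA input out of the published stream.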

@@ -1,5 +1,7 @@
 # ECP-0078: Live `<video>`-First Rendering With Gesture Audio Unlock

+Status: Implemented
+
 ## Context

 Live browser playback currently prioritizes canvas rendering. Audio can fail on first load due to autoplay policy (`AudioContext was not allowed to start`) and we still need a robust `<video>` rendering path for native controls.

@@ -19,6 +21,16 @@ In the web watcher mount path:
 - Preserves the `<video>` UX target while handling browser autoplay constraints explicitly.
 - Keeps changes local to app wiring without forking upstream MoQ player internals.

+## Alternatives considered
+
+- Keep canvas-first rendering only. Rejected because native `<video>` controls/audio handling are still required.
+- Attempt autoplay with unmuted audio by default. Rejected because browser policy blocks reliable first-play behavior.
+
+## Rollout / teardown
+
+- Deploy muted-start plus gesture unlock wiring and validate first-load playback and unmute behavior.
+- Teardown by removing unlock wiring or reverting to prior renderer mode.
+
 ## Reversibility

 - Remove the unlock wiring (or return to canvas renderer) to restore prior behavior.

@@ -0,0 +1,45 @@
+# ECP-0079: Governance Hygiene, CI Quality Gates, and Main-Branch Protection
+
+Status: Implemented
+
+## Context
+
+Recent delivery velocity improved product behavior, but governance and quality signals drifted:
+
+- active ECPs were not consistently marked with explicit status and alternatives;
+- pull requests lacked a single, explicit CI gate for core tests plus web build;
+- deploy could proceed without an explicit prerequisite check job;
+- branch protection settings were not codified as an operator runbook artifact.
+
+This conflicts with the constitutional requirement that non-trivial changes remain reviewable and merge through pull requests.
+
+## Decision
+
+1. Normalize governance records for the active proposal window (`ECP-0063` through `ECP-0078`):
+   - mark implemented decisions as `Status: Implemented`,
+   - add explicit `Alternatives considered` sections,
+   - ensure rollout/teardown intent is present.
+2. Add `scripts/ecp-lint.sh` and run it in CI to enforce required ECP sections for active proposals.
+3. Add a `ci-gates` workflow for pull requests that runs:
+   - ECP lint,
+   - core Rust test subset,
+   - `apps/web` production build.
+4. Update deploy workflow to include a dedicated `checks` job and make deploy depend on that job.
+5. Correct Cloudflare deploy docs so manual commands and secret prerequisites match current implementation.
+6. Add a branch-protection enforcement script and runbook so `main` can be locked to PR merges with required checks.
+
+## Alternatives considered
+
+- Keep governance cleanup manual and ad hoc. Rejected because drift reappears quickly under fast iteration.
+- Gate only deploy, not pull requests. Rejected because review-time feedback is required before merge.
+- Rely on UI-only branch protection configuration with no repo script/runbook. Rejected because settings become opaque and harder to audit.
+
+## Rollout / teardown plan
+
+- Rollout:
+  - land ECP updates + lint script + CI workflows + docs + branch-protection tooling together;
+  - apply branch protection using the new script;
+  - set required check context to `ci-gates / checks`.
+- Teardown:
+  - remove `ci-gates` workflow and lint script if governance process is superseded;
+  - relax branch protection via API/script and adjust constitutional process in a superseding ECP.
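
Decision item 2 of ECP-0079 introduces `scripts/ecp-lint.sh` to enforce required ECP sections, but the script itself is not part of this excerpt. The sketch below illustrates the kind of check such a lint could perform, based on the three requirements in Decision item 1 (a status line, an `Alternatives considered` section, and rollout intent). It is a Python illustration with hypothetical names and patterns, not a port of the actual shell script.

```python
import re

# One pattern per required element; names are illustrative.
REQUIRED = {
    "status": re.compile(r"^Status:\s*\S+", re.MULTILINE),
    "alternatives": re.compile(r"^##\s+Alternatives considered\b",
                               re.MULTILINE | re.IGNORECASE),
    "rollout": re.compile(r"^##\s+Rollout\b", re.MULTILINE | re.IGNORECASE),
}


def lint_ecp(text: str) -> list:
    """Return the names of required elements missing from an ECP document."""
    return [name for name, pattern in REQUIRED.items()
            if not pattern.search(text)]


complete = """# ECP-0000: Example

Status: Implemented

## Alternatives considered

- None.

## Rollout / teardown

- Ship it.
"""
incomplete = "# ECP-0000: Example\n\nStatus: Draft\n\n## Rollout\n\n1. Ship.\n"

print(lint_ecp(complete))
print(lint_ecp(incomplete))
```

The rollout pattern matches any heading that begins with `## Rollout`, which covers the variants used across ECP-0063 through ECP-0079 (`Rollout`, `Rollout / Reversibility`, `Rollout / teardown`, `Rollout / teardown plan`).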