ECP-0156: Duplicate Publisher Deterministic Data Layer
Status: Draft
Context
Two publisher nodes may broadcast the same logical channel at the same time. The archive and relay layers need this for resilience, but duplicate publishers currently risk looking like conflicting streams instead of convergent copies of the same media.
Decision
Duplicate publishers are valid for a published channel. The data layer dedupes and verifies media by content identity, not by publisher envelope identity:
- CMAF init and media segment bytes for the same input, ladder profile, and chunk cadence must be byte-for-byte identical.
- BLAKE3 media hashes and per-rung Merkle roots are the shared data identity.
- Publisher manifests may carry different stream_id, epoch_id, created_unix_ms, signatures, locators, and manifest ids.
- The archive must treat matching media hashes from different publishers as corroborating sources.
- Archive records must carry source identity. Two copied buffers with the same source_node are not duplicate-publisher proof, even when their BLAKE3 hashes match.
- Divergent hashes for the same logical channel, rendition, and media time are misses that must be measured before the data is promoted as redundant.
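The rules above can be sketched as a small classification over two records for the same logical channel, rendition, and media time. This is a minimal illustration, not the shipped code: ArchiveRecord, Evidence, and classify are hypothetical names, and the hashes are plain hex strings rather than values computed with BLAKE3.

```rust
/// Hypothetical, simplified archive record: only the fields the rule needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ArchiveRecord {
    pub source_node: String,
    /// Envelope identity. Deliberately ignored by the dedupe rule.
    pub manifest_id: String,
    /// Hex BLAKE3 hash of the segment bytes (assumed to be precomputed).
    pub media_hash: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Evidence {
    /// Same bytes reported by two distinct sources.
    Corroborating,
    /// Same bytes from the same source: a copied buffer, not proof.
    SameSourceCopy,
    /// Different bytes for the same media time: a miss to be measured.
    Divergent,
}

/// Classify two records that already share channel, rendition, and media time.
pub fn classify(a: &ArchiveRecord, b: &ArchiveRecord) -> Evidence {
    if a.media_hash != b.media_hash {
        Evidence::Divergent
    } else if a.source_node == b.source_node {
        Evidence::SameSourceCopy
    } else {
        Evidence::Corroborating
    }
}
```

Note that manifest_id never participates in the comparison, which is the point of deduping by content identity.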
Verification
The proof path has two stages:
- Single-node duplicate-publisher tests produce the same ladder twice with different publisher identities and assert byte-for-byte BLAKE3 equality for every generated init and media segment. The duplicate_publishers_same_input_produce_identical_cmaf_ladder_bytes test is part of the default Rust test path when ffmpeg is present; it is not an ignored E2E.
- Production verification runs the same channel on two real publishers long enough to measure duplicate media convergence, hash divergence, missing objects, and backfill behavior in Grafana.
The goal is not just "two publishers are online." Success requires elapsed production time behind the numbers and dashboards that show duplicate hits, misses, and archive repair.
Consequences
Manifest ids cannot be used as the archive dedupe key for duplicate publishers. Operators get a clear signal when two publishers produce identical bytes versus merely announcing the same channel. If encoder determinism changes, the single-node test fails before production redundancy silently degrades.
Alternatives considered
- Dedupe by manifest id. This preserves envelope identity but misses the resilience property because duplicate publishers necessarily produce different envelopes.
- Dedupe by logical channel and time only. This can hide encoder divergence and promote bad redundancy before byte-level media equality is proven.
- Disable duplicate publishers until the scheduler is perfect. This avoids conflict handling but weakens live resilience and leaves the archive data layer untested.
Rollout/teardown
Roll forward by landing the local deterministic test, adding miss/duplicate metrics to the archive scrape surface, then running two publishers for one logical channel in production. Roll back by disabling duplicate scheduling for that channel; existing content-addressed archive objects remain valid.
Implementation notes
The node-agent archive scrape now exposes duplicate-source and miss gauges without placing hashes in labels. Per node, role, broadcast, rendition, and track it reports duplicate matching hash sources, duplicate hash sequences, divergent hash sequences, and missing hash records. Grafana shows those next to archive ladder coverage so the production duplicate-publisher run has an operator-visible convergence and miss signal.
ec-node archive-convergence is the primary proof surface for duplicate media identity. It compares
named archive manifest roots directly inside the Rust node binary, groups records by logical stream,
rendition, track, and sequence, and only returns ok when every expected sequence has matching
duplicate source hashes with no missing or divergent sequence. It also requires archive records to
carry at least two distinct source_node values, so mirrored global-origin manifests cannot pass as
independent publishers. This keeps the media-data invariant in the already-shipped Rust artifact
instead of extending the Python node-agent. Rollout gates should use
ec-node archive-convergence --require-ok; the command emits the JSON report either way, but
--require-ok exits non-zero unless duplicate convergence is actually proven.
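A rough sketch of that grouping and gate follows. It is an assumption-laden illustration rather than the ec-node implementation: GroupKey, Record, Report, and converge are hypothetical names, hashes are opaque strings, and the expected sequences are simply the union of what any source reported.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical grouping key: logical stream, rendition, track, sequence.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct GroupKey {
    pub stream: String,
    pub rendition: String,
    pub track: String,
    pub sequence: u64,
}

#[derive(Debug, Clone)]
pub struct Record {
    pub key: GroupKey,
    pub source_node: String,
    pub hash: String,
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct Report {
    pub distinct_sources: usize,
    pub matching_sequences: usize,
    pub divergent_sequences: usize,
    pub missing_sequences: usize,
    pub ok: bool,
}

pub fn converge(records: &[Record]) -> Report {
    let sources: BTreeSet<&str> = records.iter().map(|r| r.source_node.as_str()).collect();

    // key -> source_node -> hashes that source reported for the key
    let mut groups: BTreeMap<&GroupKey, BTreeMap<&str, BTreeSet<&str>>> = BTreeMap::new();
    for r in records {
        groups
            .entry(&r.key)
            .or_default()
            .entry(r.source_node.as_str())
            .or_default()
            .insert(r.hash.as_str());
    }

    let mut report = Report { distinct_sources: sources.len(), ..Report::default() };
    for by_source in groups.values() {
        let hashes: BTreeSet<&str> = by_source.values().flatten().copied().collect();
        if hashes.len() > 1 {
            report.divergent_sequences += 1; // more than one hash for the key
        } else if by_source.len() < sources.len() {
            report.missing_sequences += 1; // at least one source has no record
        } else {
            report.matching_sequences += 1;
        }
    }

    // Mirrored manifests with a single source_node can never pass.
    report.ok = report.distinct_sources >= 2
        && report.matching_sequences > 0
        && report.divergent_sequences == 0
        && report.missing_sequences == 0;
    report
}
```

A caller modelling --require-ok would map a false ok onto a non-zero exit status while still printing the report.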
ec-node archive-convergence --prometheus renders the same Rust convergence report as scrapeable
every_channel_archive_* gauges for duplicate source records, duplicate sequences, divergent
sequences, source-local divergence, missing hashes, missing source identity, media timing conflicts,
record source count, and pass/fail state. This gives Grafana a Rust-owned proof metric path while
the older node-agent ladder metrics remain available during migration.
ec-node archive-convergence-serve keeps that proof path live for Prometheus: it serves /health
and /metrics, recomputes convergence on each scrape, and emits scrape_ok=0 metrics instead of
disappearing when manifests are missing or not ready. Production Grafana can therefore distinguish a
healthy metrics target from an unproven duplicate-publisher run.
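The "stay visible when unproven" behavior can be sketched as a renderer that always emits a scrape_ok series. Only the every_channel_archive_ prefix comes from this document; the full metric names, the Convergence struct, and render_metrics are hypothetical.

```rust
/// Hypothetical summary of one convergence computation.
pub struct Convergence {
    pub duplicate_sequences: u64,
    pub divergent_sequences: u64,
    pub ok: bool,
}

/// Render Prometheus text exposition for one scrape.
/// Metric names below are illustrative, not the shipped series names.
pub fn render_metrics(result: &Result<Convergence, String>) -> String {
    let mut out = String::new();
    match result {
        Ok(c) => {
            out.push_str("every_channel_archive_scrape_ok 1\n");
            out.push_str(&format!(
                "every_channel_archive_duplicate_sequences {}\n",
                c.duplicate_sequences
            ));
            out.push_str(&format!(
                "every_channel_archive_divergent_sequences {}\n",
                c.divergent_sequences
            ));
            out.push_str(&format!(
                "every_channel_archive_convergence_ok {}\n",
                u8::from(c.ok)
            ));
        }
        Err(_) => {
            // Manifests missing or not ready: the target stays up, the proof stays red.
            out.push_str("every_channel_archive_scrape_ok 0\n");
            out.push_str("every_channel_archive_convergence_ok 0\n");
        }
    }
    out
}
```

Because the error branch still returns a body, an HTTP handler can serve it with a success status and let dashboards alert on the gauge rather than on target absence.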
The Nix services.every-channel.ec-node.archive.convergence.proofs option turns those Rust proof
servers into named systemd units. Each proof must name at least two NAME=PATH sources and gets a
dedicated listen address, so operators can add one Prometheus scrape target per duplicate channel
without resurrecting the Python node-agent as the proof oracle.
Forge enables an initial la-kcop-publisher-origin proof target on 127.0.0.1:7812 and Prometheus
scrapes it alongside the other local every.channel targets. Until two real publisher manifest roots
are mounted or fetched into Forge, the target intentionally uses the Forge manifest root as a
placeholder peer and must report unproven convergence rather than green duplicate-publisher proof.
Forge also exposes a static two-NUC la-kcet-remote-publisher-origin proof target once that channel
is the live converged duplicate sample. Dynamic Headscale file-SD remains useful for discovery, but
it can include relays and stale nodes; duplicate-publisher proof should use an explicit publisher
pair or future scheduler group labels so unrelated agents do not turn a passing channel red.
This static proof exports its own Rust convergence gauges rather than gating on broad legacy
Prometheus aggregates, because older node-agent archive metrics do not yet carry enough proof-role
labels to avoid summing stale divergence from unrelated scrape targets.
ec-node archive-convergence-measure is the primary production proof harness. It fetches named
node-agent /v1/archive-manifest samples or direct manifest JSONL URLs, writes bounded temporary
manifest roots, reuses the Rust archive-convergence report, and optionally queries Prometheus for
the Grafana-facing duplicate/miss series. A production run only counts as complete when the report
has elapsed samples, matching duplicate media hashes, zero divergent hash sequences, and live
Prometheus series for the duplicate/miss gauges. The measurement groups records by archive record
source identity, not by the URL used to fetch a manifest, and reports source identity failures when
the sample is too weak to prove independent publisher data. The older
scripts/measure-duplicate-publishers.py stays compatibility-only until live operators and Forge
jobs are switched to the Rust command.
The convergence report carries bounded divergent-sequence samples with per-source hash, byte size,
receive time, source node/session, CAS path, and media timing when present, so a red proof is
immediately actionable without fetching full manifests by hand.
It also reports a non-blocking media-timing-missing count and Prometheus gauge; hash equality can
still prove duplicate bytes, but missing timing means a divergent proof cannot yet classify whether
the mismatch is a phase/windowing problem or an encoder byte problem.
Publisher service builders must pass proof cadence explicitly. Both the node-agent publisher
supervisor and Nix systemd publisher module set --publisher-archive-segment-duration-ms and
--publisher-start-boundary-ms by default, so netbooted NUCs do not depend on stale hotpatch CLI
defaults when aligning duplicate publisher proof windows.
ec-node archive-convergence-measure-serve turns that production proof harness into a live
Prometheus target. Each /metrics scrape fetches one fresh sample from node-agent or direct JSONL
manifest URLs, keeps a bounded in-memory sample window, and only reports measurement ok after the
configured elapsed window has passed. This avoids blocking Prometheus scrapes for the measurement
duration while still preventing two immediate samples from looking like a real production run.
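One way to picture the bounded window is a ring of timestamped samples that only reports ok once enough wall-clock time separates the oldest and newest entries. The sketch below is illustrative: SampleWindow and its fields are hypothetical, and requiring every retained sample to have converged is an assumption, not a documented rule.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded sample window, one entry per /metrics scrape.
pub struct SampleWindow {
    samples: VecDeque<(u64, bool)>, // (unix seconds, sample converged?)
    capacity: usize,
    required_elapsed_secs: u64,
}

impl SampleWindow {
    pub fn new(capacity: usize, required_elapsed_secs: u64) -> Self {
        Self {
            samples: VecDeque::new(),
            capacity: capacity.max(2), // at least two samples are needed
            required_elapsed_secs,
        }
    }

    /// Record one sample, evicting the oldest when the window is full.
    pub fn push(&mut self, unix_secs: u64, converged: bool) {
        if self.samples.len() == self.capacity {
            self.samples.pop_front();
        }
        self.samples.push_back((unix_secs, converged));
    }

    /// Seconds between the oldest and newest retained samples.
    pub fn elapsed_secs(&self) -> u64 {
        match (self.samples.front(), self.samples.back()) {
            (Some(first), Some(last)) => last.0.saturating_sub(first.0),
            _ => 0,
        }
    }

    /// Two immediate samples never pass: the elapsed window must be covered.
    pub fn measurement_ok(&self) -> bool {
        self.samples.len() >= 2
            && self.elapsed_secs() >= self.required_elapsed_secs
            && self.samples.iter().all(|s| s.1)
    }
}
```

The capacity has to be large enough to span the required window at the scrape interval, otherwise eviction keeps elapsed_secs below the threshold forever.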
The service emits measurement-level gauges for fetch success, source record counts, invalid records,
elapsed seconds, Prometheus series presence, and reasons, then appends the same
every_channel_archive_* convergence gauges from the latest sample. The service can also read
Prometheus file-SD JSON from Forge's Headscale node-agent discovery and turn each discovered target
into a sampled node-agent manifest source. The Nix
services.every-channel.ec-node.archive.convergence.remoteProofs option creates these remote proof
services as systemd units from either static NAME=URL endpoints or dynamic file-SD inputs. Forge
now exposes la-kcop-remote-publisher-origin on 127.0.0.1:7813 using the live
/var/lib/prometheus/every-channel-node-agents.json inventory. It must stay red until that
inventory contains at least two independent publisher node-agents whose publisher.m4s records
converge.
When archive-serve ports are not reachable from the proof runner, the node-agent exposes a bounded,
tailnet-authenticated /v1/archive-manifest sample endpoint. The harness can use that endpoint for
each named publisher, compare local manifest records directly, and still require at least two elapsed
samples before declaring success.
Production duplicate proof also requires archive-buffer freshness on each participating publisher.
During mixed-generation rollouts, the current node-agent may supervise an older installed
archive-hot-sync helper. The agent must probe helper flag support and omit optional arguments such
as --link-mode when an older helper lacks them, because a silently failing archive-buffer sync can
leave one publisher with healthy live streams but stale manifests.
The publisher buffer refresh is freshness-first: the node-managed sync must mirror full manifests without origin object fetch before running the slower cache fill/prune pass. This lets convergence checks, Grafana scrape surfaces, and demand fetch see current BLAKE3 indexes even when proactive CAS object backfill is still catching up.
wt-archive stamps each archive index record with source_node and source_session. The Nix
archive launcher passes the runtime hostname as --source-node; explicit CLI users can override it.
Older records without this identity continue to parse, but proof commands and production measurement
mark them incomplete instead of accepting them as independent publisher evidence.
Publisher-origin proof must be captured before relay/archive mirroring can collapse source identity.
When node-agent archive buffering is enabled, supervised wt-publish processes pass
--publisher-archive-output-dir, --publisher-archive-manifest-dir, and
--publisher-archive-source-node. wt-publish now supervises the Rust
publisher-proof-archive-source worker for that archive track. The worker splits the MPEG-TS source
by source-clock windows, fresh-encodes each bounded window with the deterministic proof profile,
stores the resulting media fragments under publisher.m4s in the same CAS/index format, and stamps
them with node-agent source identity. The relay playback encoder remains continuous for watchability,
but it is no longer the BLAKE3 data identity for duplicate-publisher proof. The source identity is
explicit override first, then hostname plus a short hash of machine-id, with boot-id only as a
fallback; hostname alone is not enough because publisher images can share names like ec-node.
Production duplicate verification can therefore compare publisher.m4s from two publisher buffers
without treating copied relay-origin manifests as independent sources.
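The source identity precedence described above might look like the following sketch. The function name, the FNV-1a hash, and the 12-hex-digit suffix are assumptions for illustration; the real short hash is not specified here.

```rust
/// FNV-1a 64-bit. A stand-in for whatever short hash the real code uses.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Treat empty or whitespace-only values as absent.
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|s| !s.is_empty())
}

/// Precedence: explicit override, then hostname plus a short hash of
/// machine-id, with boot-id used only when machine-id is unavailable.
pub fn source_identity(
    explicit: Option<&str>,
    hostname: &str,
    machine_id: Option<&str>,
    boot_id: Option<&str>,
) -> Option<String> {
    if let Some(name) = non_empty(explicit) {
        return Some(name.to_string());
    }
    let seed = non_empty(machine_id).or_else(|| non_empty(boot_id))?;
    // Keep 48 bits so the suffix is 12 hex digits (an illustrative choice).
    let short = fnv1a64(seed.as_bytes()) & 0xffff_ffff_ffff;
    Some(format!("{hostname}-{short:012x}"))
}
```

Returning None when neither id is available leaves it to the caller to mark the record as lacking source identity rather than silently falling back to the bare hostname.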
Proof tooling defaults to publisher.m4s. The relay video track 0.m4s is useful playback data,
but it is not duplicate-publisher proof: a publisher buffer may hold relay/cache records on 0.m4s
that have no publisher source identity. Production convergence checks that sample 0.m4s should be
treated as playback/archive-cache diagnostics, not byte-for-byte duplicate publisher evidence.
The first live publisher-origin measurements on 2026-06-08 showed correct distinct source labels but
zero matching duplicate sequences for la-nbc4, la-pbs-socal, and la-kcet. The failure is
useful: independent wt-publish processes currently start their fragment sequence and encoder chunk
phase at local process start, so sequence 0 from two publishers is not necessarily the same
broadcast moment. Duplicate-publisher proof therefore requires a shared chunk clock or
scheduler-controlled aligned encoder phase before byte-for-byte archive convergence can pass in
production.
Publisher-origin publisher.m4s records now require timed fMP4 fragments for global proof and map
those fragments onto observed wall-clock epoch buckets instead of local process counters. The Rust
writer learns track timescales from the init moov box, reads fragment
moof/traf/tfhd+tfdt decode timestamps to reject untimed proof when possible, then assigns
group_sequence = observed_epoch_bucket * bucket_stride + fragment_slot. Fragments that lack usable
timing still fall back to the previous local counter so publishing does not fail hard on malformed
metadata, but duplicate-publisher proof should use timed fragments. The wt-publish ffmpeg path
also preserves source timestamps and uses closed-GOP, single-threaded x264 settings with forced
keyframe cadence so independent publishers have a real chance of producing identical bytes for the
same media time window.
A later live run on 2026-06-08 found a stricter local invariant before cross-publisher byte equality:
each publisher must produce at most one hash for a given source_node and group_sequence.
Production publisher.m4s samples for la-kcop and la-ktla showed multiple hashes from the same
source in the same sequence bucket because real fMP4 fragments can arrive faster than the configured
proof segment duration, and the writer rounded decode time into repeated buckets. The writer now
uses a fixed per-epoch bucket stride and increments an in-bucket fragment slot when multiple timed
fragments arrive inside the same proof duration. This keeps source-local manifests unique while
allowing independently restarted publishers to align on the same observed wall-clock bucket.
ec-node archive-convergence reports this separately as source_local_divergent_sequences so
operator tooling can distinguish a self-contradicting publisher from two publishers that simply
disagree about the same sequence.
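The stride-and-slot scheme can be sketched as a small allocator applying group_sequence = observed_epoch_bucket * bucket_stride + fragment_slot. SlotAllocator is a hypothetical name, and the sketch assumes buckets arrive in non-decreasing order; a real writer would need a policy for late fragments.

```rust
/// Hypothetical allocator for publisher-origin proof sequences.
pub struct SlotAllocator {
    bucket_stride: u64,
    current_bucket: Option<u64>,
    next_slot: u64,
}

impl SlotAllocator {
    pub fn new(bucket_stride: u64) -> Self {
        assert!(bucket_stride > 0, "bucket_stride must be non-zero");
        Self { bucket_stride, current_bucket: None, next_slot: 0 }
    }

    /// Assign the next group_sequence for a timed fragment observed in
    /// `observed_epoch_bucket`. Returns None if the bucket is full (the slot
    /// would spill into the next bucket's range) or the value overflows.
    pub fn assign(&mut self, observed_epoch_bucket: u64) -> Option<u64> {
        if self.current_bucket != Some(observed_epoch_bucket) {
            // New wall-clock bucket: slots restart at zero.
            self.current_bucket = Some(observed_epoch_bucket);
            self.next_slot = 0;
        }
        if self.next_slot >= self.bucket_stride {
            return None;
        }
        let slot = self.next_slot;
        self.next_slot += 1;
        observed_epoch_bucket
            .checked_mul(self.bucket_stride)?
            .checked_add(slot)
    }
}
```

With a stride of 4, two fragments in bucket 10 receive 40 and 41, and the first fragment in bucket 11 receives 44, so one source never writes two hashes under the same sequence.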
Because bucket-strided proof sequences intentionally leave numeric gaps, archive convergence uses
the observed sparse sequence union for publisher-origin manifests. Dense contiguous sequence ranges
remain available in the simulation layer when a model explicitly expects every integer sequence.
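As a sketch of the sparse rule, the expected set is the union of sequences any source observed, and a source is only missing what another source actually produced. The function name and the map shape are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// For each source, list the sequences that some other source observed but
/// this source did not. Numeric gaps that no source filled are not misses.
pub fn missing_by_source(
    observed: &BTreeMap<String, BTreeSet<u64>>,
) -> BTreeMap<String, Vec<u64>> {
    // Sparse expected set: the union of everything any source reported.
    let expected: BTreeSet<u64> = observed.values().flatten().copied().collect();
    observed
        .iter()
        .map(|(source, seen)| {
            let missing: Vec<u64> = expected.difference(seen).copied().collect();
            (source.clone(), missing)
        })
        .collect()
}
```

A dense min-to-max range over the same input would instead report every unused slot between buckets as missing, which is the false signal the sparse union avoids.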
The 2026-06-08 live la-kcet/publisher.m4s sample from Forge confirmed that both publishers now
emit distinct source identities (ec-node-c3546fa5abc3 and ec-node-72cf1c3aa196) with no missing
source identity records on the sampled publisher-origin manifests. It also confirmed the remaining
bug: 156 shared publisher-origin sequences had zero byte-for-byte BLAKE3 matches and 156 divergent
hashes. The next production fix must align the publisher chunk clock and encoded fMP4 byte stream,
not merely improve scrape or Grafana plumbing.
After the wall-clock bucket hotpatch, the same live proof no longer has fake sparse-range missing
IDs: la-kcet/publisher.m4s reported 376 observed proof sequences, zero missing source identities,
zero source-local divergent sequences, and 234 divergent shared sequences. A byte-level sample for
sequence 7287381184512 had different sizes, different BLAKE3 hashes, different tfdt
base-media-decode-times (210210 versus 0), and different mdat payload prefixes. Across that
sampled window there were zero common fragment hashes even when sequence IDs were ignored, proving
that the remaining failure was independent-encoder media phase and fMP4 payload determinism, not an
archive manifest identity bug.
A later la-kcop/publisher.m4s sample exposed a stricter live-source bug: source-window proof
records were using unsynced MPEG-TS PCR chunk indexes as group_sequence when the OTA UTC clock was
unavailable, causing restart-dependent jumps such as 93M, 135M, 341M, and 390M. The source-proof
writer now uses the chunk UTC start only when the chopper reports synced timing, otherwise it falls
back to the local wall-clock window start, and rewrites fMP4 tfdt onto that shared window before
hashing. The live HTTP proof worker also retries transient source opens/reader failures in unbounded
live mode, so a tuner 503 or malformed TS burst is skipped/retried instead of killing the
publisher proof process.
The synced source-window clock must use the chopper's exact global chunk index, not integer UTC
seconds. A 1001 ms proof cadence makes whole-second UTC start metadata lossy: adjacent source
windows can share the same utc_start_unix, which caused one publisher to write several different
hashes under the same source-local group_sequence. Synced chunks therefore use
ChunkTiming.chunk_index directly; only unsynced chunks fall back to local wall-clock receipt.
The live source-window proof writer also keeps subfragment slot allocation as stream state instead
of per-chunk state. Real source windows can be emitted in more than one proof chunk for the same
media timing sequence; resetting the slot counter for every chunk reused the same
group_sequence and made one healthy publisher look self-divergent. The counter is bounded so the
long-running live worker does not grow state without bound.
wt-publish now has an explicit Unix-epoch start boundary, defaulting to the publisher-origin proof
cadence. After relay setup and immediately before spawning ffmpeg it waits until the next boundary,
so a newly restarted duplicate publisher starts its forced-keyframe clock on the same global cadence
as already-running publishers.
This does not by itself prove byte equality; it removes the local-process-start phase error from the
live publisher path and gives rollout measurement a deterministic knob (--publisher-start-boundary-ms 0 disables it). The live ffmpeg argument plan is factored into a Rust unit-testable helper so
future timestamp/keyframe changes are pinned in ec-node instead of being inferred from node-agent
process strings or production samples.
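The boundary wait reduces to modular arithmetic on Unix-epoch milliseconds. The sketch below is an illustration with hypothetical function names; treating an exact boundary hit as a zero wait is an assumption.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Milliseconds to wait from `now_unix_ms` until the next multiple of
/// `boundary_ms`. A boundary of 0 disables the wait.
pub fn delay_to_next_boundary_ms(now_unix_ms: u64, boundary_ms: u64) -> u64 {
    if boundary_ms == 0 {
        return 0;
    }
    let rem = now_unix_ms % boundary_ms;
    if rem == 0 { 0 } else { boundary_ms - rem }
}

/// Block the current thread until the next boundary on the shared cadence.
pub fn sleep_until_boundary(boundary_ms: u64) {
    let now_ms = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0); // a clock before the epoch is treated as zero
    let delay = delay_to_next_boundary_ms(now_ms, boundary_ms);
    std::thread::sleep(Duration::from_millis(delay));
}
```

Because every publisher computes the delay from the same epoch, a restarted process joins the cadence of its peers as long as their system clocks agree.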
The first post-start-clock live sample still failed duplicate byte identity: both publishers landed
in the same wall-clock proof bucket, but one fragment carried tfdt=390390 while the other carried
tfdt=30030, matching the staggered restart gap. Their mdat prefixes differed too, which means a
continuous x264 encoder keeps enough local history that a later restart cannot prove byte equality
merely by joining the same wall-clock cadence. The live profile therefore enables x264
stitchable=1 alongside closed GOP, no scenecut, no B-frames, no lookahead, and one thread. If that
still does not converge in production, the next fix is a deliberately stateless per-fragment encode
or a Rust-owned media clock/segmenter that resets encoder history at each proof boundary.
The follow-up production hotpatch moved the start-boundary wait to immediately before ffmpeg spawn,
enabled stitchable=1, and restarted both publisher nodes in the same batch. The latest la-kcet
sample still reported zero matching duplicate hashes with no missing source identity and no
source-local divergence. A final sampled shared sequence differed by hundreds of milliseconds of
receive time and by media size (439737 versus 270283 bytes for the video fragment), so the
remaining mismatch is not just MP4 timestamp metadata. Production duplicate proof now needs a
stateless fragment boundary: either encode each proof segment from the same bounded source window
with fresh encoder state, or make the Rust media pipeline own exact frame-window capture before
calling ffmpeg/x264.
Archive manifests now carry optional fMP4 media timing for publisher-origin fragments. The
archive-convergence gate treats equal archive group sequence IDs with different media sequence or
decode-time metadata as media_sequence_conflict, even if the byte hash happens to match. This keeps
production proof aligned with the Rust simulation model: a duplicate publisher only proves the same
broadcast moment when the archive sequence and media window agree.
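A minimal sketch of that precedence, assuming two records that already share an archive group sequence. MediaTiming, SourceRecord, Verdict, and classify_sequence are hypothetical names, and checking timing before the hash is an assumption about ordering.

```rust
/// Hypothetical optional fMP4 timing carried on an archive record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MediaTiming {
    pub media_sequence: u64,
    pub decode_time: u64,
}

pub struct SourceRecord {
    pub hash: String,
    pub timing: Option<MediaTiming>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Match,
    HashDivergent,
    MediaSequenceConflict,
}

/// Compare two records filed under the same archive group sequence.
pub fn classify_sequence(a: &SourceRecord, b: &SourceRecord) -> Verdict {
    // Conflicting timing wins even when the bytes happen to be equal.
    if let (Some(ta), Some(tb)) = (&a.timing, &b.timing) {
        if ta != tb {
            return Verdict::MediaSequenceConflict;
        }
    }
    // Missing timing is non-blocking: hash equality can still prove bytes.
    if a.hash != b.hash {
        Verdict::HashDivergent
    } else {
        Verdict::Match
    }
}
```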
The first stateless proof primitives are now in ec-node. publisher-proof-segment takes one
bounded MPEG-TS source-clock window, runs a fresh deterministic x264/AAC fMP4 encode, splits the
result into init bytes and media fragments, and emits BLAKE3 hashes for each. publisher-proof-windows
uses the Rust MPEG-TS source-clock splitter first, then fresh-encodes each bounded window and reports
per-window source TS, init, and media hashes. Proof windows carry explicit MPEG-TS decoder context
with --preroll-packets, defaulting to the repo-owned WT_PUBLISH_PROOF_PREROLL_PACKETS budget, so
mid-GOP windows do not silently depend on best-effort decoder recovery. Focused Rust tests
fresh-encode the same bounded input and the same finite source-window campaign twice and assert
byte-for-byte identical proof hashes.
publisher-proof-duplicates is the single-node duplicate-publisher gate for the stateless path. It
runs publisher-proof-windows independently under at least two publisher identity labels, defaults
to publisher-a and publisher-b, and compares source TS, init, and media fragment BLAKE3 hashes
for every source-clock window. --require-ok exits non-zero unless every compared window matches,
and duplicate publisher labels are rejected so the proof cannot accidentally collapse to one source
identity. publisher-proof-compare is the cross-machine stateless proof gate: each publisher can run
publisher-proof-windows against the same bounded source TS file locally, copy the JSON report back
to the operator host, and compare the reports by named publisher. It rejects mismatched chunk cadence,
missing windows, source TS hash mismatches, init hash mismatches, media fragment hash mismatches, and
empty media windows.
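The comparison rules listed for publisher-proof-compare can be sketched over two in-memory reports. The types and field names below are hypothetical and do not reflect the actual JSON report schema.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical per-window proof: hex hashes only.
pub struct WindowProof {
    pub source_ts_hash: String,
    pub init_hash: String,
    pub media_hashes: Vec<String>,
}

/// Hypothetical report from one named publisher.
pub struct ProofReport {
    pub publisher: String,
    pub chunk_cadence_ms: u64,
    pub windows: BTreeMap<u64, WindowProof>, // window index -> proof
}

#[derive(Debug, PartialEq, Eq)]
pub enum Mismatch {
    DuplicateLabel,
    ChunkCadence,
    MissingWindow(u64),
    EmptyMediaWindow(u64),
    SourceTsHash(u64),
    InitHash(u64),
    MediaHash(u64),
}

/// An empty result means every compared window matched.
pub fn compare(a: &ProofReport, b: &ProofReport) -> Vec<Mismatch> {
    let mut out = Vec::new();
    if a.publisher == b.publisher {
        out.push(Mismatch::DuplicateLabel);
    }
    if a.chunk_cadence_ms != b.chunk_cadence_ms {
        out.push(Mismatch::ChunkCadence);
    }
    let indexes: BTreeSet<u64> = a.windows.keys().chain(b.windows.keys()).copied().collect();
    for i in indexes {
        match (a.windows.get(&i), b.windows.get(&i)) {
            (Some(wa), Some(wb)) => {
                if wa.media_hashes.is_empty() || wb.media_hashes.is_empty() {
                    out.push(Mismatch::EmptyMediaWindow(i));
                }
                if wa.source_ts_hash != wb.source_ts_hash {
                    out.push(Mismatch::SourceTsHash(i));
                }
                if wa.init_hash != wb.init_hash {
                    out.push(Mismatch::InitHash(i));
                }
                if wa.media_hashes != wb.media_hashes {
                    out.push(Mismatch::MediaHash(i));
                }
            }
            _ => out.push(Mismatch::MissingWindow(i)),
        }
    }
    out
}
```

Reporting every mismatch instead of stopping at the first keeps a failed comparison actionable, since a source TS mismatch and an encoder mismatch point at different fixes.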
publisher-proof-remote-compare is the production operator harness for that cross-machine gate. It
copies one bounded .ts proof input to each named SSH target, runs ec-node publisher-proof-windows
on the target, stores each returned JSON report under the local output directory, writes a
compare.json, and returns the existing compare report with upload/proof timing. Remote labels use
the same single-component validation as publisher identities, remote proof roots are constrained to
/tmp/every-channel-*, and cleanup is opt-in so the generated proof files remain inspectable unless
the operator explicitly requests removal. This keeps the live proof path in Rust without making the
Python node-agent a new oracle. It proves the machine/runtime/compiler boundary without requiring
the two NUCs to share a live tuner at the exact same instant.
publisher-proof-archive-source is the live archive implementation of the same proof model. It can
read local source files directly, read plain HTTP MPEG-TS bodies directly for HDHomeRun-style
sources, or fall back to an ffmpeg MPEG-TS copy reader for other inputs. Each emitted source-clock
window is encoded with fresh proof state, archived as CAS-backed publisher.m4s records, and mapped
to source-clock group sequences with explicit media timing metadata. A focused Rust regression now
archives the same bounded TS input as two source nodes, then runs archive-convergence against the
two manifest roots and requires full duplicate convergence with zero divergent or source-local
divergent sequences.
Forge ci-gates now runs the publisher_proof and archive_convergence Rust filters before the
distributed simulator campaign, so single-node byte-for-byte determinism, source-window archive
proof semantics, and duplicate archive convergence are checked before production rollout evidence is
considered. The next production step is to deploy the updated node binary and let fresh
publisher.m4s source-window records age into the Grafana scrape window so live duplicate metrics
can replace the older continuous-encoder divergence.