# ECP-0022: swarm availability, signed manifests, anti-junk controls

Status: Draft

## Problem / context

We need to scale distribution without turning every node into a dump pipe. The primary threat is denial via junk data: a peer can join and send invalid chunks that waste bandwidth and compute. We also need a model for multi-party signing and divergent encodes without inventing a full blockchain.

## Decision

Adopt a pull-based swarm model centered on signed manifests and content-addressed chunks:

- Chunks are immutable and referenced by hash.
- A manifest commits to a chunk set via a Merkle root and is signed by one or more identities.
- Peers only serve chunks that match a manifest they recognize and that a requester explicitly asks for.
- Junk is rejected by hash before decode, and peers are rate-limited and penalized for bad data.

Multi-root signing is handled through **attestations**: multiple signers may publish distinct manifest roots for the same stream/time slice, and clients decide which roots are acceptable based on policy.

## Details

### Objects

- `ChunkId`:
  - `stream_id`
  - `epoch_id` (time window / slice)
  - `chunk_index`
  - `chunk_hash` (blake3 of raw chunk bytes)
- `ManifestBody`:
  - `stream_id`
  - `epoch_id`
  - `chunk_duration_ms`
  - `total_chunks`
  - `chunk_start_index`
  - `encoder_profile_id`
  - `merkle_root` (over ordered chunk_hash list)
  - `created_unix_ms`
  - `metadata` (optional, e.g. channel title, guide hints)
  - `chunk_hashes` (ordered list for now; proofs can come later)
- `Manifest`:
  - `body`
  - `manifest_id` = `blake3(body)`
  - `signatures`: list of `(signer_id, sig_over_manifest_id)`

### Pull-only data flow

- A peer first obtains a manifest (via gossip, direct share, or local catalog).
- The peer requests specific chunks by `ChunkId`.
- Providers return bytes **only** if they can supply exact hash bytes and have quota available.

### Anti-junk measures

- Validate `chunk_hash` before decode; invalid bytes are dropped.
- Require the requester to supply `manifest_id` + `chunk_index` + expected `chunk_hash`.
- Token-bucket quotas per peer:
  - max bytes/sec
  - max concurrent chunk responses
  - max outstanding requests
- Penalize peers:
  - invalid hash → immediate disconnect + temporary ban
  - repeated invalid requests → throttle or blacklist

### Multi-root signing (not blockchain)

Multiple manifests may exist for the same `stream_id + epoch_id`. We treat them as **parallel attestations**, not consensus. Clients choose a policy, e.g.:

- accept a manifest if signed by `>=1` trusted key
- or `>=N` signatures from a trust set
- or “self + any one other”

If two manifests conflict (different `merkle_root`), clients may:

- select the one with the most trusted signatures
- fall back to local preference
- display “split availability” without forcing global consensus

### Relay behavior

Relays cache hot chunks by `ChunkId` and serve on request only. They do not push data to peers. Relays enforce the same quota and validation rules as peers.

### MoQ transport notes (initial)

- Track `chunks` carries object payloads as in ECP-0012.
- Track `manifests` carries JSON-encoded `Manifest` entries, one per group.
- For live streams we start with **per-chunk manifests** (epoch size = 1) so every chunk can be validated immediately. Multi-chunk epochs and Merkle proofs can replace this later.

### Chunk membership proofs (incremental)

To avoid sending full `chunk_hashes` forever, object metadata may include a Merkle branch proving that `chunk_hash` is included in the manifest's `merkle_root` for the epoch.

- `ObjectMeta.chunk_proof`: list of sibling hashes from leaf to root.
- Verification uses the chunk offset within the epoch: `offset = chunk_index - chunk_start_index`.

## Alternatives considered

- Push-only streams: rejected (amplifies junk and DDoS risk).
- Global consensus / blockchain for manifests: rejected as overkill for now.
- Accepting arbitrary chunks without hashes: rejected (no integrity).

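
For comparison with the rejected "no hashes" alternative, the chunk membership check described under Details can be sketched in a few lines of Python. This is a minimal sketch, not the ECP's defined algorithm: the proposal does not specify how the tree is built, so the pairwise construction with last-node duplication, the `left || right` concatenation order, and all function names here are assumptions. Python's standard library has no blake3, so `hashlib.blake2b` stands in for it.

```python
import hashlib
from typing import List, Sequence


def chunk_hash(data: bytes) -> bytes:
    # ASSUMPTION: blake2b is a stand-in; the ECP specifies blake3.
    return hashlib.blake2b(data, digest_size=32).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    # ASSUMPTION: parent = hash(left || right); an odd level repeats its last node.
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [chunk_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Root over the ordered chunk_hash list of one epoch."""
    if not leaves:
        raise ValueError("an epoch must contain at least one chunk")
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves: Sequence[bytes], offset: int) -> List[bytes]:
    """Sibling hashes from leaf to root for the chunk at `offset`."""
    level, idx, proof = list(leaves), offset, []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        proof.append(level[idx ^ 1])  # sibling shares the same parent
        level = _next_level(level)
        idx //= 2
    return proof


def verify_chunk(chunk_bytes: bytes, chunk_index: int, chunk_start_index: int,
                 total_chunks: int, chunk_proof: Sequence[bytes],
                 root: bytes) -> bool:
    """Check raw bytes against the manifest's merkle_root before any decode."""
    offset = chunk_index - chunk_start_index
    if not 0 <= offset < total_chunks:
        return False
    node = chunk_hash(chunk_bytes)
    for sibling in chunk_proof:
        # The low bit of the offset says whether this node is a left or right child.
        node = chunk_hash(node + sibling) if offset % 2 == 0 else chunk_hash(sibling + node)
        offset //= 2
    return offset == 0 and node == root
```

With an epoch size of 1 the branch is empty and the check reduces to comparing the chunk's hash with the root, which matches the per-chunk manifests proposed for live streams. A production design would likely also want domain separation between leaf and interior hashes, which this sketch omits.
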
## Rollout / teardown plan

1. Define `ManifestBody`, `Manifest`, and `ChunkId` in `ec-core`.
2. Implement signing + verification in `ec-crypto` (age/ssh).
3. Add manifest publication and request protocol in `ec-moq`.
4. Enforce per-peer quotas and ban lists in `ec-iroh` transport layer.
5. Update catalog gossip to advertise manifests, not raw chunks.

Teardown: revert to direct MoQ stream subscriptions (current behavior) if manifest flow blocks progress.

## Open questions

- What default trust policy should the UI use for manifests?
- Should relays require proof-of-work for public manifest announcements?
- Do we embed encoder profile parameters directly in `encoder_profile_id` or reference a registry?
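
To help frame the trust-policy question, the client policies listed under multi-root signing can be sketched as plain predicates. This is an illustrative sketch only: `ManifestView`, the function names, and the reading of “self + any one other” as "my own key plus at least one other key from the trust set" are assumptions, and signature verification is assumed to have happened before these checks run.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class ManifestView:
    # Hypothetical view of a Manifest after its signatures were verified.
    manifest_id: str
    merkle_root: str
    valid_signers: FrozenSet[str]  # signer_ids with a valid sig_over_manifest_id


def trusted_count(m: ManifestView, trust_set: FrozenSet[str]) -> int:
    return len(m.valid_signers & trust_set)


def accepts_threshold(m: ManifestView, trust_set: FrozenSet[str], n: int = 1) -> bool:
    """'>=1 trusted key' is the n=1 case of '>=N signatures from a trust set'."""
    return trusted_count(m, trust_set) >= n


def accepts_self_plus_one(m: ManifestView, self_id: str,
                          trust_set: FrozenSet[str]) -> bool:
    # ASSUMPTION: the "one other" signer must come from the trust set.
    others = (m.valid_signers & trust_set) - {self_id}
    return self_id in m.valid_signers and len(others) >= 1


def rank_manifests(candidates: Sequence[ManifestView], trust_set: FrozenSet[str],
                   n: int = 1) -> List[ManifestView]:
    """Acceptable manifests for one stream_id + epoch_id, most trusted first."""
    ok = [m for m in candidates if accepts_threshold(m, trust_set, n)]
    return sorted(ok, key=lambda m: trusted_count(m, trust_set), reverse=True)


def is_split(acceptable: Sequence[ManifestView]) -> bool:
    """True when acceptable manifests disagree on merkle_root ('split availability')."""
    return len({m.merkle_root for m in acceptable}) > 1
```

A UI could take the first entry of `rank_manifests` as its default and surface `is_split` as a warning rather than an error, which keeps the "parallel attestations, not consensus" stance of the proposal. Ties between equally trusted manifests are left to local preference.
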