# ECP-0022: swarm availability, signed manifests, anti-junk controls

Status: Draft
## Problem / context

We need to scale distribution without turning every node into a dump pipe. The primary threat is denial of service via junk data: a peer can join and send invalid chunks that waste bandwidth and compute. We also need a model for multi-party signing and divergent encodes without inventing a full blockchain.
## Decision

Adopt a pull-based swarm model centered on signed manifests and content-addressed chunks:

- Chunks are immutable and referenced by hash.
- A manifest commits to a chunk set via a Merkle root and is signed by one or more identities.
- Peers only serve chunks that match a manifest they recognize and that a requester explicitly asks for.
- Junk is rejected by hash before decode, and peers are rate-limited and penalized for bad data.

Multi-root signing is handled through **attestations**: multiple signers may publish distinct manifest roots for the same stream/time slice, and clients decide which roots are acceptable based on policy.
## Details

### Objects

- `ChunkId`:
  - `stream_id`
  - `epoch_id` (time window / slice)
  - `chunk_index`
  - `chunk_hash` (blake3 of raw chunk bytes)
- `ManifestBody`:
  - `stream_id`
  - `epoch_id`
  - `chunk_duration_ms`
  - `total_chunks`
  - `chunk_start_index`
  - `encoder_profile_id`
  - `merkle_root` (over ordered chunk_hash list)
  - `created_unix_ms`
  - `metadata` (optional, e.g. channel title, guide hints)
  - `chunk_hashes` (ordered list for now; proofs can come later)
- `Manifest`:
  - `body`
  - `manifest_id` = `blake3(body)`
  - `signatures`: list of `(signer_id, sig_over_manifest_id)`
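The object layout above can be sketched as plain data classes. This is a minimal illustration, not the `ec-core` definition: the field types, the canonical JSON encoding of `body`, and the helper names `h` and `manifest_id` are all assumptions, and BLAKE2b from the standard library stands in for BLAKE3, which Python does not ship.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple


def h(data: bytes) -> bytes:
    # Stand-in hash: BLAKE2b with a 32-byte digest, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).digest()


@dataclass(frozen=True)
class ChunkId:
    stream_id: str        # type is an assumption
    epoch_id: int         # time window / slice
    chunk_index: int
    chunk_hash: bytes     # hash of the raw chunk bytes


@dataclass
class ManifestBody:
    stream_id: str
    epoch_id: int
    chunk_duration_ms: int
    total_chunks: int
    chunk_start_index: int
    encoder_profile_id: str
    merkle_root: str                  # hex string (assumed encoding)
    created_unix_ms: int
    chunk_hashes: List[str]           # ordered, hex-encoded
    metadata: Dict[str, str] = field(default_factory=dict)

    def canonical_bytes(self) -> bytes:
        # Assumption: the hashed form of the body is compact JSON with sorted keys.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":")).encode()


def manifest_id(body: ManifestBody) -> str:
    # manifest_id = hash(body), computed over the canonical encoding.
    return h(body.canonical_bytes()).hex()


@dataclass
class Manifest:
    body: ManifestBody
    manifest_id: str
    # (signer_id, signature over manifest_id); signing itself is out of scope here.
    signatures: List[Tuple[str, bytes]] = field(default_factory=list)
```

Whatever encoding is chosen, it has to be deterministic: two peers that serialize the same `ManifestBody` differently would derive different `manifest_id` values and could not verify each other's signatures.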
### Pull-only data flow

- A peer first obtains a manifest (via gossip, direct share, or local catalog).
- The peer requests specific chunks by `ChunkId`.
- Providers return bytes **only** if they hold bytes that match the requested `chunk_hash` exactly and have quota available.
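A rough sketch of both ends of this flow follows. The function names `serve_chunk` and `accept_chunk`, the dictionary-backed store, and the `has_quota` callback are hypothetical, and BLAKE2b again stands in for BLAKE3.

```python
import hashlib
from typing import Callable, Dict, Optional


def h(data: bytes) -> bytes:
    # Stand-in hash: BLAKE2b with a 32-byte digest, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).digest()


def serve_chunk(
    store: Dict[bytes, bytes],
    requested_hash: bytes,
    has_quota: Callable[[int], bool],
) -> Optional[bytes]:
    """Provider side: answer only explicit requests for bytes we actually hold."""
    data = store.get(requested_hash)
    if data is None:
        return None                      # unknown chunk: nothing is sent
    if h(data) != requested_hash:
        return None                      # local copy is corrupt: never forward it
    if not has_quota(len(data)):
        return None                      # requester is over quota
    return data


def accept_chunk(data: bytes, expected_hash: bytes) -> Optional[bytes]:
    """Requester side: check the hash before the bytes reach a decoder."""
    return data if h(data) == expected_hash else None
```

Nothing in this sketch ever sends unsolicited bytes: the provider has no code path that emits data without a matching request.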
### Anti-junk measures

- Validate `chunk_hash` before decode; invalid bytes are dropped.
- Require the requester to supply `manifest_id` + `chunk_index` + expected `chunk_hash`.
- Token-bucket quotas per peer:
  - max bytes/sec
  - max concurrent chunk responses
  - max outstanding requests
- Penalize peers:
  - invalid hash → immediate disconnect + temporary ban
  - repeated invalid requests → throttle or blacklist
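The quota and penalty rules above could look roughly like the following. The class names, the 60-second ban, and the threshold of three invalid requests are illustrative assumptions rather than values from this proposal; the sketch covers only the bytes/sec bucket and treats repeated invalid requests as a blacklist rather than a throttle.

```python
import time
from typing import Callable, Dict, Set


class TokenBucket:
    """Byte-rate limiter: refills at `rate_per_sec`, holds at most `burst` tokens."""

    def __init__(self, rate_per_sec: float, burst: float,
                 now: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst
        self._now = now
        self._last = now()

    def try_consume(self, amount: float) -> bool:
        t = self._now()
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (t - self._last) * self.rate)
        self._last = t
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False


class PeerPenalties:
    """Tracks temporary bans (invalid hash) and blacklisting (repeated bad requests)."""

    def __init__(self, ban_seconds: float = 60.0, max_invalid_requests: int = 3,
                 now: Callable[[], float] = time.monotonic):
        self.ban_seconds = ban_seconds                  # hypothetical default
        self.max_invalid_requests = max_invalid_requests  # hypothetical default
        self._now = now
        self._banned_until: Dict[str, float] = {}
        self._invalid_requests: Dict[str, int] = {}
        self._blacklist: Set[str] = set()

    def on_invalid_hash(self, peer_id: str) -> None:
        # The caller is expected to disconnect the peer as well.
        self._banned_until[peer_id] = self._now() + self.ban_seconds

    def on_invalid_request(self, peer_id: str) -> None:
        count = self._invalid_requests.get(peer_id, 0) + 1
        self._invalid_requests[peer_id] = count
        if count >= self.max_invalid_requests:
            self._blacklist.add(peer_id)

    def is_blocked(self, peer_id: str) -> bool:
        if peer_id in self._blacklist:
            return True
        return self._now() < self._banned_until.get(peer_id, float("-inf"))
```

The clock is injected so that the limits can be exercised in tests without sleeping. Limits on concurrent responses and outstanding requests would need separate counters per peer.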
### Multi-root signing (not blockchain)

Multiple manifests may exist for the same `stream_id + epoch_id`. We treat them as **parallel attestations**, not consensus. Clients choose a policy, e.g.:

- accept a manifest if signed by `>=1` trusted key
- or `>=N` signatures from a trust set
- or “self + any one other”

If two manifests conflict (different `merkle_root`), clients may:

- select the one with the most trusted signatures
- fall back to local preference
- display “split availability” without forcing global consensus
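One way a client might evaluate these policies is sketched below. All names are hypothetical, signatures are assumed to have been verified already, "any one other" is read here as one other signer from the trust set, and "local preference" is modeled as the caller's ordering of the candidate list.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class SignedManifest:
    manifest_id: str
    merkle_root: str
    signers: FrozenSet[str]   # signer_ids whose signatures verified


def trusted_count(m: SignedManifest, trust_set: FrozenSet[str]) -> int:
    return len(m.signers & trust_set)


def accept_min_trusted(m: SignedManifest, trust_set: FrozenSet[str], n: int = 1) -> bool:
    # Covers both ">=1 trusted key" (n=1) and ">=N signatures from a trust set".
    return trusted_count(m, trust_set) >= n


def accept_self_plus_one(m: SignedManifest, self_id: str, trust_set: FrozenSet[str]) -> bool:
    # "self + any one other": our own signature plus one other trusted signer.
    others = (m.signers & trust_set) - {self_id}
    return self_id in m.signers and len(others) >= 1


def pick_manifest(candidates: List[SignedManifest], trust_set: FrozenSet[str],
                  n: int = 1) -> Optional[SignedManifest]:
    # Most trusted signatures wins; ties keep the earliest candidate,
    # so the caller's ordering acts as the local preference.
    acceptable = [m for m in candidates if accept_min_trusted(m, trust_set, n)]
    if not acceptable:
        return None
    return max(acceptable, key=lambda m: trusted_count(m, trust_set))


def split_roots(candidates: List[SignedManifest], trust_set: FrozenSet[str],
                n: int = 1) -> List[str]:
    # More than one root here means the UI can show "split availability".
    return sorted({m.merkle_root for m in candidates
                   if accept_min_trusted(m, trust_set, n)})
```

No function here consults other peers to settle a conflict, which matches the intent that attestations stay parallel and each client decides locally.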
### Relay behavior

Relays cache hot chunks by `ChunkId` and serve on request only. They do not push data to peers. Relays enforce the same quota and validation rules as peers.
### MoQ transport notes (initial)

- Track `chunks` carries object payloads as in ECP-0012.
- Track `manifests` carries JSON-encoded `Manifest` entries, one per group.
- For live streams we start with **per-chunk manifests** (epoch size = 1) so every chunk can be validated immediately. Multi-chunk epochs and Merkle proofs can replace this later.
### Chunk membership proofs (incremental)

To avoid sending full `chunk_hashes` forever, object metadata may include a Merkle branch proving that `chunk_hash` is included in the manifest's `merkle_root` for the epoch.

- `ObjectMeta.chunk_proof`: list of sibling hashes from leaf to root.
- Verification uses the chunk offset within the epoch: `offset = chunk_index - chunk_start_index`.
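The verification step can be sketched as follows. This proposal does not fix the tree construction, so the sketch assumes a binary tree whose leaves are the `chunk_hash` values, whose parents are `H(left || right)`, and whose odd levels duplicate their last node. BLAKE2b stands in for BLAKE3, and the function names are hypothetical.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    # Stand-in hash: BLAKE2b with a 32-byte digest, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).digest()


def _pad(level: List[bytes]) -> List[bytes]:
    # Assumed rule: an odd-sized level duplicates its last node.
    return level + [level[-1]] if len(level) % 2 == 1 else level


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        raise ValueError("at least one chunk hash is required")
    level = list(leaves)
    while len(level) > 1:
        level = _pad(level)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves: List[bytes], offset: int) -> List[bytes]:
    """Sibling hashes from leaf to root for the leaf at `offset`."""
    level, idx, proof = list(leaves), offset, []
    while len(level) > 1:
        level = _pad(level)
        proof.append(level[idx ^ 1])     # sibling shares the same parent
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        idx //= 2
    return proof


def verify_chunk_proof(chunk_hash: bytes, chunk_index: int, chunk_start_index: int,
                       chunk_proof: List[bytes], root: bytes) -> bool:
    offset = chunk_index - chunk_start_index
    if offset < 0:
        return False
    node = chunk_hash
    for sibling in chunk_proof:
        # Even offset: we are the left child; odd offset: the right child.
        node = h(node + sibling) if offset % 2 == 0 else h(sibling + node)
        offset //= 2
    # offset must be exhausted, otherwise the index does not fit the proof depth.
    return offset == 0 and node == root
```

Under these assumptions an epoch of size 1 has a root equal to its single `chunk_hash` and an empty proof. A production scheme would likely also add domain separation between leaf and interior hashes, which this sketch omits.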
## Alternatives considered

- Push-only streams: rejected (amplifies junk and DDoS risk).
- Global consensus / blockchain for manifests: rejected as overkill for now.
- Accepting arbitrary chunks without hashes: rejected (no integrity).
## Rollout / teardown plan

1. Define `ManifestBody`, `Manifest`, and `ChunkId` in `ec-core`.
2. Implement signing + verification in `ec-crypto` (age/ssh).
3. Add manifest publication and request protocol in `ec-moq`.
4. Enforce per-peer quotas and ban lists in `ec-iroh` transport layer.
5. Update catalog gossip to advertise manifests, not raw chunks.

Teardown: revert to direct MoQ stream subscriptions (current behavior) if manifest flow blocks progress.
## Open questions

- What default trust policy should the UI use for manifests?
- Should relays require proof-of-work for public manifest announcements?
- Do we embed encoder profile parameters directly in `encoder_profile_id` or reference a registry?