Security posture checklist

audience: operators

Each item below is a pre-production checklist entry. Print it, initial it, file it with the deploy record. Work through this checklist per instance — an honest posture on acme.mainnet does not protect preview.alpha if the two share a fault domain or a secret store.

Instance identity and scope

  • ZIPNET_INSTANCE is set to a namespaced string (e.g. acme.mainnet) and documented in the release notes your publishers consume. No operator within the same universe uses the same string.
  • ZIPNET_UNIVERSE, if set, points at a universe you control. The default (zipnet::UNIVERSE) is the shared world and is correct for most deployments.
  • For TDX-gated instances, the instance’s MR_TD is published alongside the instance name in a signed channel. Publishers verify against that hash.
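The namespaced-name convention in the first item is an operator convention, not something the protocol enforces, so it is worth checking at start-up. A minimal preflight sketch — the dot-separated `operator.instance` pattern is an assumption drawn from the `acme.mainnet` example above, and `preflight` is a hypothetical helper:

```python
import os
import re

def check_instance_name(name: str) -> bool:
    """Heuristic check that an instance name is namespaced like 'acme.mainnet'.

    The dot-separated operator.instance convention is taken from the examples
    in this checklist; it is not enforced by the protocol itself.
    """
    return bool(re.fullmatch(r"[a-z0-9-]+(\.[a-z0-9-]+)+", name))

def preflight() -> None:
    """Refuse to start a node whose ZIPNET_INSTANCE is not namespaced."""
    instance = os.environ.get("ZIPNET_INSTANCE", "")
    if not check_instance_name(instance):
        raise SystemExit(f"refusing to start: ZIPNET_INSTANCE {instance!r} is not namespaced")
```

Running `preflight()` in the node's launch wrapper catches the classic copy-paste mistake of deploying a second instance with a bare, un-namespaced name.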

Committee secret handling

  • ZIPNET_COMMITTEE_SECRET is stored only in a secret manager (AWS Secrets Manager, HashiCorp Vault, a Kubernetes Secret resource). Never in a git repo, never in a plain environment file.
  • The secret is unique per instance. Do not reuse one committee secret across acme.mainnet and acme.preview even though the operator is the same.
  • Rotation procedure is documented and rehearsed (see Rotations and upgrades).
  • Access to read the secret is audited. A quarterly review of access logs is on the calendar.
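The uniqueness requirement in the second item is easy to audit mechanically. A sketch, assuming a hypothetical inventory mapping instance name to a digest of its committee secret (audit digests, never the raw secrets):

```python
from collections import defaultdict

def find_reused_secrets(secrets_by_instance: dict[str, str]) -> list[list[str]]:
    """Return groups of instances that share the same committee secret digest.

    `secrets_by_instance` is a hypothetical inventory mapping instance name ->
    digest of its committee secret. An empty result means every instance has a
    unique secret, as the checklist requires.
    """
    by_secret: dict[str, list[str]] = defaultdict(list)
    for instance, digest in secrets_by_instance.items():
        by_secret[digest].append(instance)
    return [sorted(group) for group in by_secret.values() if len(group) > 1]

# Flags acme.mainnet and acme.preview sharing one secret:
reused = find_reused_secrets({
    "acme.mainnet": "digest-aaa",
    "acme.preview": "digest-aaa",
    "acme.dev": "digest-bbb",
})
# reused == [["acme.mainnet", "acme.preview"]]
```

Run this as part of the quarterly access-log review so reuse is caught on a schedule rather than during an incident.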

Committee server node hygiene

  • Each committee server runs in a separate fault domain (different cloud account, different region, different operator organization if possible). The whole point of any-trust is diversity.
  • In production, every committee server runs inside a TDX guest built by the mosaik image builder. The committee’s require_mrtd(...) validator is set to the build’s measured MR_TD. See Rebuilding a TDX image for the rebuild cadence.
  • ZIPNET_SECRET is unique per node and stored in the node’s own secret scope (not shared with any other node).
  • Committee servers listen only on the iroh port (default UDP ephemeral + relay) and the Prometheus metrics port. No other inbound exposure.
  • Decommissioned committee servers have their disks wiped. DH secrets recovered from a retired box can be replayed against recorded historical traffic.
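The MR_TD pinning in the second item is done by the committee's `require_mrtd(...)` validator; the sketch below only illustrates the comparison it implies, in Python for consistency with the other examples here. The 48-byte SHA-384 measurement size is standard for TDX MR_TD; the helper name and placeholder value are hypothetical:

```python
import hmac

# Placeholder: in practice this is the published, signed MR_TD of the build.
# MR_TD is a SHA-384 measurement, i.e. 48 bytes.
PUBLISHED_MRTD = bytes(48)

def mrtd_matches(measured: bytes, published: bytes = PUBLISHED_MRTD) -> bool:
    """Constant-time comparison of a quote's measured MR_TD against the pinned value."""
    return hmac.compare_digest(measured, published)
```

Constant-time comparison is cheap insurance here; a plain `==` on attestation material is the kind of habit that eventually leaks timing somewhere that matters.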

Aggregator node hygiene

  • The aggregator is not in the committee’s secret-possession circle. It does not have access to ZIPNET_COMMITTEE_SECRET.
  • Aggregator memory is not a secret store — aggregates are XOR-sums whose plaintext only the committee can recover. Still, hardening the aggregator is good practice: read-only filesystem, dropped capabilities, etc.
  • If you operate one aggregator per instance, each is configured with its own ZIPNET_INSTANCE and its own ZIPNET_SECRET.
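The "aggregator memory is not a secret store" claim above can be made concrete with a toy model of XOR-sum aggregation. This is illustrative only — slot size, pad derivation, and envelope framing here are assumptions, not the protocol's actual wire format:

```python
import secrets
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

SLOT = 16  # toy slot size, not the protocol's

# A client blinds its zero-padded slot with one pad per committee server.
message = b"hello".ljust(SLOT, b"\x00")
pads = [secrets.token_bytes(SLOT) for _ in range(3)]
envelope = reduce(xor, pads, message)

# The aggregator's view: the envelope alone looks uniformly random.
# Only when every committee pad is folded back in does the message reappear.
recovered = reduce(xor, pads, envelope)
assert recovered == message
```

This is why compromising the aggregator yields XOR-sums rather than plaintext, and also why hardening it is still worthwhile: it sees participation metadata even if it cannot decrypt.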

Client image hygiene (TDX-gated instances)

  • The client image you ship to publishers is built reproducibly. The mosaik TDX builder is deterministic — commit your toolchain and feature-flag set alongside the release.
  • The committee’s Tdx validator lists the published client MR_TD in require_mrtd(...). Publishers running any other image are rejected at bond time.
  • TDX quote expiration is monitored; see Monitoring.
  • Image rebuild cadence is documented. At minimum, rebuild whenever the upstream kernel or initramfs toolchain ships a security fix — a new MR_TD is cheap compared with unpatched firmware.

Client image hygiene (TDX disabled, dev/test only)

  • Understood: without TDX, the client trusts the client host for DH key protection. Anyone with access to the client process can deanonymize that client’s own messages (not others’).
  • Clients handling non-public messages wait until the ClientRegistry includes their own entry and at least ZIPNET_MIN_PARTICIPANTS − 1 other registered clients before relying on anonymity properties.
  • This posture is explicitly not used for production in TDX-gated instances.
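The registry-gating rule in the second item reduces to a small predicate. A sketch mirroring the ZIPNET_MIN_PARTICIPANTS − 1 rule; the function name and the registry-as-a-set shape are hypothetical:

```python
def anonymity_ready(registry: set[str], self_id: str, min_participants: int) -> bool:
    """True once the registry includes our own entry and at least
    min_participants - 1 other registered clients.

    Mirrors the ZIPNET_MIN_PARTICIPANTS - 1 rule from the checklist;
    the registry representation here is a simplification.
    """
    return self_id in registry and len(registry - {self_id}) >= min_participants - 1

# Alone in the registry: not ready. With two others at min_participants=3: ready.
assert not anonymity_ready({"c1"}, "c1", min_participants=3)
assert anonymity_ready({"c1", "c2", "c3"}, "c1", min_participants=3)
```

Gate message submission on this predicate rather than on wall-clock delays; a timer says nothing about how many peers actually registered.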

Network hygiene

  • Firewalls permit outbound UDP to iroh relays. If you run your own relay, ensure clients can reach it.
  • NTP is configured on every node. Raft tolerates small skew; large skew causes election storms. TDX quote validation is also clock-sensitive.
  • Prometheus metrics endpoints are NOT publicly exposed.

Archival / audit

  • A job pulls the Broadcasts collection to durable storage at the chosen cadence, keyed by instance name (see Accounting and audit).
  • PeerId → legal entity registry is version-controlled, signed, and scoped per instance.

Emergency contacts

  • On-call rotation documented for each node, per instance.
  • Break-glass procedure for committee-secret rotation documented, per instance.
  • “Who can revoke a compromised bundle ticket” is specified — note that in v1 a ticket lives in gossip until the node is removed from the universe, so the answer is “the node’s operator, by stopping the node”.

Known-not-yet-protected footguns

  • Metadata from iroh. The iroh layer leaks some metadata (relay preferences, coarse geography via relay choice). A global passive adversary observing traffic patterns across relays can narrow anonymity sets.
  • Cross-instance traffic correlation. Instances share a universe. A passive observer of gossip can often tell “this peer is a member of instance X” from catalog membership, even without seeing any Broadcasts content. Anonymity within a round is unaffected; anonymity of membership in an instance is not a property the protocol provides.
  • Client message length. The protocol encrypts the message but does not pad it to a uniform length. Unusually long messages are recognizable in the broadcast. Pad your payloads to the nearest slot boundary at the application layer if this matters for you.
  • Participant set disclosure. BroadcastRecord::participants lists every ClientId whose envelope was folded into the round. Knowing “client X was in this round” is not the same as knowing “client X wrote this message”, but it is visible and it leaks connection timing.
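The message-length footgun above is fixable at the application layer. A sketch of pad-to-slot-boundary framing — the slot size and the 4-byte length prefix are choices made here for illustration, not protocol constants:

```python
import struct

SLOT = 256  # hypothetical slot size in bytes; pick one for your application

def pad_to_slot(payload: bytes, slot: int = SLOT) -> bytes:
    """Length-prefix the payload, then zero-pad to the next slot boundary."""
    framed = struct.pack(">I", len(payload)) + payload
    padded_len = -(-len(framed) // slot) * slot  # ceiling to next multiple of slot
    return framed.ljust(padded_len, b"\x00")

def unpad(padded: bytes) -> bytes:
    """Recover the original payload from a padded frame."""
    (n,) = struct.unpack(">I", padded[:4])
    return padded[4 : 4 + n]

# A 10-byte and a 200-byte payload are indistinguishable by length:
assert len(pad_to_slot(b"x" * 10)) == len(pad_to_slot(b"x" * 200))
assert unpad(pad_to_slot(b"hello")) == b"hello"
```

Note that padding only hides length within a slot: a payload that spills into the next slot is still distinguishable from one that does not, so choose the slot size to cover your common case.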

These are tracked in Roadmap to v2.

See also