# Security posture checklist

Audience: operators.
Each item below is a pre-production checklist entry. Print it, initial it, and file it with the deploy record. Work through this checklist per instance — an honest posture on `acme.mainnet` does not protect `preview.alpha` if the two share a fault domain or a secret store.
## Instance identity and scope

- `ZIPNET_INSTANCE` is set to a namespaced string (e.g. `acme.mainnet`) and documented in the release notes your publishers consume. No operator within the same universe uses the same string.
- `ZIPNET_UNIVERSE`, if set, points at a universe you control. The default (`zipnet::UNIVERSE`) is the shared world and is correct for most deployments.
- The instance's MR_TD (TDX-gated instances) is published alongside the instance name in a signed channel. Publishers verify against that hash.
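The namespacing convention above can be checked mechanically at deploy time. A minimal sketch — the helper name and the exact `org.env` rule are assumptions for illustration, not zipnet API:

```rust
/// Hypothetical deploy-time check: accept only namespaced instance
/// names of the shape `org.env` (e.g. `acme.mainnet`), with each
/// segment limited to lowercase ASCII alphanumerics and '-'.
fn is_namespaced_instance(name: &str) -> bool {
    let mut parts = name.split('.');
    // Require at least two segments: an org and an environment.
    let (org, env) = match (parts.next(), parts.next()) {
        (Some(o), Some(e)) => (o, e),
        _ => return false,
    };
    std::iter::once(org)
        .chain(std::iter::once(env))
        .chain(parts) // any further segments must follow the same rule
        .all(|seg| {
            !seg.is_empty()
                && seg
                    .chars()
                    .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
        })
}
```

A check like this belongs in the deploy pipeline, so a typo'd instance name fails before it ever reaches the shared universe.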
## Committee secret handling

- `ZIPNET_COMMITTEE_SECRET` is stored only in a secret manager (AWS Secrets Manager, HashiCorp Vault, a Kubernetes `Secret` resource). Never in a git repo, never in a plain environment file.
- The secret is unique per instance. Do not reuse one committee secret across `acme.mainnet` and `acme.preview`, even though the operator is the same.
- Rotation procedure is documented and rehearsed (see Rotations and upgrades).
- Access to read the secret is audited. A quarterly review of access logs is on the calendar.
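Uniqueness across instances can be verified in CI without the checking tool ever handling the secrets themselves, by comparing fingerprints (e.g. SHA-256 digests) exported from the secret manager. A hypothetical sketch, not a zipnet tool:

```rust
use std::collections::HashMap;

/// Hypothetical CI check: given (instance name, secret fingerprint)
/// pairs, report every pair of instances that share a fingerprint —
/// i.e. that are reusing one committee secret.
fn shared_secret_instances<'a>(
    fingerprints: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    // fingerprint -> first instance seen with it
    let mut seen: HashMap<&str, &str> = HashMap::new();
    let mut clashes = Vec::new();
    for &(instance, fp) in fingerprints {
        match seen.get(fp) {
            Some(&first) => clashes.push((first, instance)),
            None => {
                seen.insert(fp, instance);
            }
        }
    }
    clashes
}
```

Feeding this digests rather than raw secrets keeps the audit job itself out of the secret-possession circle.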
## Committee server node hygiene

- Each committee server runs in a separate fault domain (different cloud account, different region, different operator organization if possible). The whole point of any-trust is diversity.
- In production, every committee server runs inside a TDX guest built by the mosaik image builder. The committee's `require_mrtd(...)` validator is set to the build's measured MR_TD. See Rebuilding a TDX image for the rebuild cadence.
- `ZIPNET_SECRET` is unique per node and stored in the node's own secret scope (not shared with any other node).
- Committee servers listen only on the iroh port (by default an ephemeral UDP port, plus the relay) and the Prometheus metrics port. No other inbound exposure.
- Decommissioned committee servers have their disks wiped. DH secrets recovered from a decommissioned box can be replayed against recorded historical traffic.
## Aggregator node hygiene

- The aggregator is not in the committee's secret-possession circle. It does not have access to `ZIPNET_COMMITTEE_SECRET`.
- Aggregator memory is not a secret store — aggregates are XOR-sums whose plaintext only the committee can recover. Still, hardening the aggregator is good practice: read-only filesystem, dropped capabilities, etc.
- If you operate one aggregator per instance, each is configured with its own `ZIPNET_INSTANCE` and its own `ZIPNET_SECRET`.
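The XOR-sum claim above can be illustrated with a toy round. This is an illustration of the principle only, not the zipnet wire format: the aggregator folds masked envelopes, and only a party holding the pads can unmask the result.

```rust
/// XOR two equal-length byte slices.
fn xor(a: &[u8], b: &[u8]) -> Vec<u8> {
    a.iter().zip(b).map(|(x, y)| x ^ y).collect()
}

/// Toy round: one writer, one cover client, two committee-held pads.
/// Returns (the aggregate as the aggregator sees it, the committee's
/// recovery after stripping both pads).
fn toy_round(msg: &[u8], pad_a: &[u8], pad_b: &[u8]) -> (Vec<u8>, Vec<u8>) {
    let cover = vec![0u8; msg.len()]; // non-writing client's empty slot
    let env_a = xor(msg, pad_a);      // writer's masked envelope
    let env_b = xor(&cover, pad_b);   // cover client's masked envelope
    let aggregate = xor(&env_a, &env_b); // all the aggregator ever holds
    let recovered = xor(&xor(&aggregate, pad_a), pad_b); // committee unmasks
    (aggregate, recovered)
}
```

Compromising the aggregator in this model yields only the masked XOR-sum, which is why the hardening items above are good practice rather than a secrecy requirement.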
## Client image hygiene (TDX-gated instances)

- The client image you ship to publishers is built reproducibly. The mosaik TDX builder is deterministic — commit your toolchain and feature-flag set alongside the release.
- The committee's `Tdx` validator lists the published client MR_TD in `require_mrtd(...)`. Publishers running any other image are rejected at bond time.
- TDX quote expiration is monitored; see Monitoring.
- Image rebuild cadence is documented. At minimum, rebuild whenever the upstream kernel or initramfs toolchain ships a security fix — a new MR_TD is cheap compared with unpatched firmware.
## Client image hygiene (TDX disabled, dev/test only)

- Understood: without TDX, the client trusts the client host for DH key protection. Anyone with access to the client process can deanonymize that client's own messages (not others').
- Clients handling non-public messages wait for the `ClientRegistry` to include their own entry, and for at least `ZIPNET_MIN_PARTICIPANTS − 1` other clients to also be registered, before relying on anonymity properties.
- This posture is explicitly not used for production in TDX-gated instances.
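The registration gate above reduces to a simple predicate. A hypothetical sketch — the real `ClientRegistry` type and its API may differ, so a plain set of client IDs stands in for it here:

```rust
use std::collections::HashSet;

/// Hypothetical gate: rely on anonymity only once our own entry is in
/// the registry AND the registry holds at least `min_participants`
/// clients in total (ourselves plus `min_participants - 1` others).
fn anonymity_ready(
    registry: &HashSet<String>,
    own_id: &str,
    min_participants: usize,
) -> bool {
    registry.contains(own_id) && registry.len() >= min_participants
}
```

A client would poll this until it returns true, then start sending non-public messages.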
## Network hygiene
- Firewalls permit outbound UDP to iroh relays. If you run your own relay, ensure clients can reach it.
- NTP is configured on every node. Raft tolerates small skew; large skew causes election storms. TDX quote validation is also clock-sensitive.
- Prometheus metrics endpoints are NOT publicly exposed.
## Archival / audit

- A job pulls the `Broadcasts` collection to durable storage at the chosen cadence, keyed by instance name (see Accounting and audit).
- The `PeerId → legal entity` registry is version-controlled, signed, and scoped per instance.
## Emergency contacts
- On-call rotation documented for each node, per instance.
- Break-glass procedure for committee-secret rotation documented, per instance.
- “Who can revoke a compromised bundle ticket” is specified — note that in v1 a ticket lives in gossip until the node is removed from the universe, so the answer is “the node’s operator, by stopping the node”.
## Known-not-yet-protected footguns
- Metadata from iroh. The iroh layer leaks some metadata (relay preferences, coarse geography via relay choice). A global passive adversary observing traffic patterns across relays can narrow anonymity sets.
- Cross-instance traffic correlation. Instances share a universe. A passive observer of gossip can often tell "this peer is a member of instance X" from catalog membership, even without seeing any `Broadcasts` content. Anonymity within a round is unaffected; anonymity of membership in an instance is not a property the protocol provides.
- Client message length. The protocol encrypts the message but does not pad it to a uniform length. Unusually long messages are recognizable in the broadcast. Pad your payloads to the nearest slot boundary at the application layer if this matters for you.
- Participant set disclosure. `BroadcastRecord::participants` lists every `ClientId` whose envelope was folded into the round. Knowing "client X was in this round" is not the same as knowing "client X wrote this message", but it is visible and it leaks connection timing.
These are tracked in Roadmap to v2.
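The application-layer padding suggested under "Client message length" can be as simple as a length prefix plus zero fill to the next slot boundary. A sketch — the frame layout (4-byte big-endian length, zero padding) is an assumption for illustration, not a protocol requirement:

```rust
/// Pad `payload` to the next multiple of `slot` bytes: prefix a
/// 4-byte big-endian length, then zero-fill to the slot boundary.
fn pad_to_slot(payload: &[u8], slot: usize) -> Vec<u8> {
    let mut framed = (payload.len() as u32).to_be_bytes().to_vec();
    framed.extend_from_slice(payload);
    // Round the framed length up to the next multiple of `slot`.
    let padded_len = (framed.len() + slot - 1) / slot * slot;
    framed.resize(padded_len, 0);
    framed
}

/// Inverse: read the length prefix and strip the padding.
/// Returns None on a truncated or malformed frame.
fn unpad(framed: &[u8]) -> Option<Vec<u8>> {
    let len = u32::from_be_bytes(framed.get(..4)?.try_into().ok()?) as usize;
    framed.get(4..4 + len).map(|p| p.to_vec())
}
```

With a fixed slot size, every broadcast payload in a given application occupies a whole number of identical-looking slots, so length alone no longer distinguishes messages within a slot bucket.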