# Running a committee server

*Audience: operators*
A committee server joins the Raft group that orchestrates the
instance’s rounds, holds one of the X25519 keys used to unblind the
broadcast vector, and publishes its public bundle into the replicated
ServerRegistry. In production it runs inside a TDX guest built from
the mosaik image builder; see the
Quickstart TDX section.
## One-shot command
```shell
ZIPNET_INSTANCE="acme.mainnet" \
ZIPNET_COMMITTEE_SECRET="your-committee-secret" \
ZIPNET_SECRET="stable-node-seed" \
ZIPNET_MIN_PARTICIPANTS=2 \
ZIPNET_ROUND_PERIOD=3s \
ZIPNET_ROUND_DEADLINE=15s \
./zipnet-server --bootstrap <peer_id_of_another_server>
```
On a fresh universe with no existing seed peers, start the first
server without `--bootstrap`, grab the `peer=…` value printed at
startup, and pass it as `--bootstrap` to the remaining servers. Every
subsequent server, aggregator, or client can be bootstrapped off any
one of them. After the universe has settled, the mosaik discovery
layer finds peers on its own and the bootstrap hint is only needed
for cold starts.
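Grabbing the `peer=…` value by hand gets old quickly; it can be extracted from the first server's startup log mechanically. A minimal sketch — the sample log line below stands in for the real output and mirrors the format shown under "What a healthy startup looks like"; adjust the pattern if your build logs differently:

```shell
# Sample startup line standing in for the first server's real log output.
line='INFO zipnet_node::roles::common: zipnet up: network=u instance=acme.mainnet peer=f5e28a69e6 role=3b37e5d575'

# Pull out the peer id; hand it to the other servers as --bootstrap.
peer="$(printf '%s\n' "$line" | sed -n 's/.*peer=\([^ ]*\).*/\1/p')"
echo "$peer"
```

Each remaining server is then started with `--bootstrap "$peer"`.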
## Environment variables
The full list lives in Environment variables. The ones you will actually set in production:
| Variable | Meaning | Notes |
|---|---|---|
| `ZIPNET_INSTANCE` | Instance name this server serves | Required. Short, stable, namespaced (e.g. `acme.mainnet`). Must match across the whole deployment. |
| `ZIPNET_UNIVERSE` | Universe override | Optional. Leave unset to use `zipnet::UNIVERSE` (the shared mosaik universe). Set only for isolated federations. |
| `ZIPNET_COMMITTEE_SECRET` | Shared committee admission secret | Treat as a root credential. Identical on every committee member of this instance. |
| `ZIPNET_SECRET` (or `--secret`) | Seed for this node’s stable PeerId | Unique per node. Anything not 64-hex is blake3-hashed. |
| `ZIPNET_BOOTSTRAP` | Peer IDs to dial on startup | Helpful on cold universes; unnecessary once discovery has converged. |
| `ZIPNET_MIN_PARTICIPANTS` | Minimum clients before the leader opens a round | Default 1. Set to at least 2 for meaningful anonymity. |
| `ZIPNET_ROUND_PERIOD` | How often the leader attempts to open a round | e.g. `2s`, `500ms`. |
| `ZIPNET_ROUND_DEADLINE` | Max time a round may stay open | e.g. `15s`. The leader will force-advance a stuck round. |
| `ZIPNET_METRICS` | Bind address for the Prometheus exporter | e.g. `0.0.0.0:9100`. |
| `RUST_LOG` | Log filter | Sane default: `info,zipnet_node=info,mosaik=warn`. |
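As the `ZIPNET_SECRET` row notes, anything that is not 64 hex characters is blake3-hashed into the seed. A small shell check — assuming lowercase hex, which is what the log output above uses — tells you which path a given secret takes:

```shell
# Succeeds iff the secret is exactly 64 lowercase hex chars,
# i.e. it will be used verbatim rather than blake3-hashed first.
is_64_hex() { printf '%s' "$1" | grep -Eq '^[0-9a-f]{64}$'; }

if is_64_hex "$ZIPNET_SECRET"; then
  echo "seed used verbatim"
else
  echo "seed will be blake3-hashed"
fi
```

Either path yields a stable PeerId; the check just makes the behavior explicit in your provisioning scripts.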
## Naming the instance
Instance names share a flat namespace per universe. Two operators
picking the same name collide in the same committee group and
neither deployment works — mosaik has no way to prevent or detect
this. Namespace aggressively: `<org>.<purpose>.<env>`, for example
`acme.mixer.mainnet`. If unsure, add a random suffix once and forget
about it (`acme.mixer.mainnet.8f3c1a`).
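Generating that one-time suffix is a single command; the sketch below assumes `openssl` is on the host, but any source of a few random bytes works:

```shell
# Six hex characters, matching the acme.mixer.mainnet.8f3c1a pattern above.
suffix="$(openssl rand -hex 3)"
echo "acme.mixer.mainnet.${suffix}"
```

Record the result once; the suffix must never change for the lifetime of the deployment.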
## What a healthy startup looks like
```text
INFO zipnet_server: spawning zipnet server server=a2095bed48
INFO zipnet_node::roles::common: zipnet up: network=<universe> instance=acme.mainnet peer=f5e28a69e6... role=3b37e5d575...
INFO zipnet_node::roles::server: server booting; waiting for collections + group
INFO zipnet_node::committee: committee: opening round at index I_1
INFO zipnet_node::roles::server: submitted partial unblind at I_2
INFO zipnet_node::committee: committee: round finalized round=r1 participants=N
```
A server that has been up for more than a minute and has not printed
`round finalized` yet is almost always waiting on one of:

- Client count below `ZIPNET_MIN_PARTICIPANTS`. Check the aggregator’s `zipnet_client_registry_size` metric.
- Committee group has not elected a leader. Check `mosaik_groups_leader_is_local` on each server; exactly one should be 1.
- Bundle tickets not replicated. See Incident response — stuck rounds.
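Both gauges in that checklist are plain Prometheus text, so a scrape can be inspected by hand. A sketch, with a sample dump standing in for `curl -s localhost:9100/metrics` (the port matches the `ZIPNET_METRICS` example above; your exporter's exact exposition may carry extra labels):

```shell
# Sample scrape output; in practice pipe in `curl -s localhost:9100/metrics`.
metrics='zipnet_client_registry_size 5
mosaik_groups_leader_is_local 1'

# Exactly one server in the group should report leader_is_local == 1.
printf '%s\n' "$metrics" \
  | awk '$1 == "mosaik_groups_leader_is_local" { if ($2 == 1) print "leader"; else print "follower" }'
```

Running this against every committee member should print `leader` exactly once.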
## Resource profile
A single-slot round at the default `RoundParams` (64 slots × 256
bytes = 16 KiB broadcast vector) with 100 clients uses roughly:
- CPU: a burst of ~5 ms per round per client (pad derivation dominates).
- RAM: O(N) client bundles × 64 bytes + a ring buffer of recent aggregates.
- Network: inbound one aggregate envelope per round (+ Raft heartbeat traffic between servers), outbound one partial per round + Raft replication to followers.
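The headline figures above are easy to re-derive. A quick arithmetic check of the broadcast-vector size and the per-client bundle memory, using only the numbers quoted in the bullets:

```shell
slots=64; slot_bytes=256; clients=100

# 64 slots × 256 bytes per slot = the 16 KiB broadcast vector.
echo "$(( slots * slot_bytes )) bytes broadcast vector"          # 16384 bytes
echo "$(( slots * slot_bytes / 1024 )) KiB"                      # 16 KiB

# O(N) client bundles × 64 bytes each, at N = 100 clients.
echo "$(( clients * 64 )) bytes of client-bundle state"          # 6400 bytes
```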
## Graceful shutdown
Send `SIGTERM`. The server emits a departure announcement over
gossip so peers learn within the next announce cycle (default 15 s)
that it is gone. Raft proceeds with the remaining quorum provided a
majority is still up.
## Availability warning
In v1, any committee server going offline halts round progression because the state machine waits for one partial per server listed in the round’s roster. This is by design — the paper’s any-trust model prioritizes correctness over liveness. A v2 improvement is sketched in Roadmap to v2.
## See also
- Running the aggregator — the other always-on node.
- Rotations and upgrades — rolling restarts, key rotation, adding/removing members.
- Monitoring and alerts — what to put on your dashboard.
- Incident response — when things go wrong.