Introduction
audience: all
Zipnet is an anonymous broadcast channel for bounded sets of authenticated participants. A group of clients publish messages onto a shared log; nobody — not even the operators of the infrastructure, acting individually — can tell which client authored which message.
This book documents a working prototype of ZIPNet built as a mosaik-native application. The protocol follows Rosenberg, Shih, Zhao, Wang, Miers, and Zhang (2024) with a small, grep-able set of v1 simplifications tracked in Roadmap to v2.
What zipnet is for
The canonical motivating case is an encrypted mempool: TEE-attested wallets seal transactions and publish them through zipnet; builders read an ordered log of sealed transactions; no party — not even a compromised builder — can link a transaction back to its author until on-chain execution reveals whatever the transaction itself reveals. The encryption layer (threshold decryption, TEE unsealing, plaintext-if-you-want) sits on top; zipnet supplies the anonymous, ordered, sybil-resistant publish channel underneath.
Other deployments in the same shape:
- Permissioned order-flow auctions. Whitelisted searchers publish intents; builders bid without knowing which searcher sent what.
- Anonymous governance signalling. Token-holder wallets cast signals a delegate can tally without learning which wallet sent any given one.
- Private sealed-bid auctions. Bidders publish; outcomes are public; bid-to-bidder linkage is cryptographic.
What zipnet uniquely provides across these:
- Sender anonymity within an attested set. A compromised reader cannot tie a message back to its author unless every committee operator colludes (any-trust).
- Shared ordered view. Every subscriber sees the same log in the same order.
- Sybil resistance. Only TEE-attested clients can publish.
- Censorship resistance at the publish layer. Readers cannot drop messages from specific authors because authorship is unlinkable.
The deployment model in one paragraph
Zipnet runs as one service among many on a shared mosaik universe
— a single NetworkId (zipnet::UNIVERSE) that hosts zipnet
alongside other mosaik services (signers, storage, oracles). An
operator stands up an instance under a short, namespaced string
(e.g. acme.mainnet); multiple instances coexist on the same
universe, each with its own committee, ACL, and round parameters.
Consumers bind to an instance by name with one line of Rust:
Zipnet::bind(&network, "acme.mainnet"). There is no on-network
registry; the operator publishes the instance name (and, if TDX-gated,
the committee MR_TD) via release notes or docs, and consumers compile
it in.
The full rationale is in Designing coexisting systems on mosaik.
Three audiences, three entry points
This book is written for three distinct readers. Every page declares its audience on the first line and respects that audience’s tone. Pick the one that matches you:
- Users — Rust developers building agents that publish into, or read from, a zipnet instance somebody else operates. Start at Quickstart — publish and read.
- Operators — devops staff deploying and maintaining instances. Not expected to read Rust. Start at Deployment overview then Quickstart — stand up an instance.
- Contributors — senior Rust engineers with distsys and crypto background, extending the protocol or the code. Start at Designing coexisting systems on mosaik then Architecture.
See Who this book is for for the tone conventions each audience is held to.
What this prototype is
- A permissioned, any-trust broadcast system: anonymity is preserved as long as at least one committee server is honest; liveness requires every committee server to be honest (in v1).
- Real cryptography — X25519 Diffie–Hellman, HKDF-SHA256, AES-128-CTR pad generation, blake3 falsification tags, ed25519 peer signatures (via iroh).
- Real consensus — the committee runs a modified Raft through mosaik’s Group<CommitteeMachine>.
- Real networking — the aggregator and the committee communicate through mosaik typed streams; discovery is gossip + pkarr + mDNS; transport is iroh / QUIC.
What this prototype is not
- A production anonymous broadcast system. Ratcheting, footprint scheduling, cover traffic, multi-tier aggregators, and TDX-only builds are all deferred; they are tracked in Roadmap to v2.
- Byzantine fault tolerant. Mosaik is explicit about this; zipnet inherits the assumption. See Threat model for the precise statement.
Layout of the source tree
crates/
  zipnet             SDK facade (Zipnet::bind, UNIVERSE, instance_id!)
  zipnet-proto       wire types, crypto, XOR
  zipnet-core        Algorithms 1/2/3 as pure functions
  zipnet-node        mosaik integration
  zipnet-client      TEE client binary
  zipnet-aggregator  aggregator binary
  zipnet-server      committee server binary
book/                this book
See Crate map for the dependency graph and purity boundaries.
Who this book is for
audience: all
The zipnet book has three audiences. Every chapter declares its
audience on the first line (audience: users | operators | contributors | both | all) and respects that audience’s conventions.
This page is the authoritative description of each audience and the
tone we hold ourselves to. New pages must pick one.
Mixing audiences wastes readers’ time and erodes trust. When content
genuinely serves more than one group, use both (users + operators,
users + contributors, …) or all, and structure the page so each
audience gets the answer it came for in the first paragraph.
Users
Who they are. External Rust developers building their own mosaik agents that publish into — or read from — a running zipnet instance. They do not run committee servers or the aggregator; that is the operator’s job. They are integrators, not protocol implementers.
What they can assume.
- Comfortable with async Rust and the mosaik book.
- Already have a mosaik application in mind; zipnet is a dependency, not the centre of their work.
- They bring their own Arc<Network> and own its lifecycle.
What they do not need.
- Protocol theory. A user who wants it can follow the link to the contributor pages.
- An explanation of mosaik primitives. Link the mosaik book instead.
- A committee operator’s view of keys, rotations, or monitoring.
What they care about.
- “What do I import?”
- “How do I bind to the operator’s instance?”
- “What does the operator owe me out of band — universe, instance name, MR_TD?”
- “What does an error actually mean when it fires?”
Tone. Code-forward and cookbook-style. Snippets are
rust,ignore, self-contained, and meant to be lifted into the
reader’s workspace. Public API surfaces are listed as tables. Common
pitfalls are called out inline so the reader does not have to infer
them from silence. Second person (“you”) throughout.
Canonical user page. Quickstart — publish and read.
Operators
Who they are. Devops staff deploying and maintaining zipnet instances. They run the committee, the aggregator, and the TDX images. They are the ones the users rely on.
What they can assume.
- Familiar with Linux ops, systemd units, cloud networking, TLS, Prometheus.
- Comfortable reading logs and dashboards.
- Not expected to read Rust source. A Rust or protocol detail that is load-bearing for an operational decision belongs in a clearly marked “dev note” aside that can be skipped.
What they do not need.
- The paper. Link it when a term is inherited; do not re-derive.
- Internal crate layering. The operator cares what a binary does, not which crate it lives in.
- Client-side ergonomics. That is the users’ book.
What they care about.
- “What do I run, on what hardware, with what env vars?”
- “How do I know it is healthy?”
- “How do I rotate secrets / retire an instance / upgrade an image?”
- “What page covers the alert that just fired?”
Tone. Calm, runbook-style. Numbered procedures, parameter tables, one-line shell snippets. Pre-empt the obvious “what if…” questions inline. Avoid “simply” and “just”. Every command should either be safe to run verbatim or clearly marked as needing adaptation.
Canonical operator page. Quickstart — stand up an instance.
Contributors
Who they are. Senior Rust engineers with distributed-systems and cryptography background, extending the protocol or the code, or standing up a new service on mosaik that reuses zipnet’s deployment pattern.
What they can assume.
- Have read the ZIPNet paper (eprint 2024/1227).
- Have read the mosaik book and are comfortable with Stream, Group, Collection, TicketValidator, the when() DSL, and the declare! macros.
- Comfortable with async Rust, Raft, and DC-nets.
What they do not need.
- Re-exposition of the paper. Cite section numbers (e.g. “§3.2”) and move on.
- Primitives covered in the mosaik book. Link it.
- User-level ergonomics unless they drive a design choice.
What they care about.
- “Why is it this shape and not Shape A / B / C / D?”
- “What invariants must hold? Where are they enforced?”
- “What breaks when I bump StateMachine::signature()?”
- “Where do I extend this — which module, which trait, which test?”
Tone. Dense, precise, design-review style. ASCII diagrams,
pseudocode, rationale. rust,ignore snippets and structural
comparisons without apology.
Canonical contributor page. Designing coexisting systems on mosaik.
Shared writing rules
- No emojis anywhere in the book or the code.
- No exclamation marks outside explicit security warnings.
- Link the paper by section number when inheriting its terminology (e.g. “§3.2 scheduling”), not by paraphrase.
- Link the mosaik book rather than re-explaining mosaik primitives. Our readers can follow a link.
- Security-relevant facts are tagged with a visible admonition, not hidden inline.
- Keep the three quickstarts synchronised. When the public SDK shape, the deployment model, or the naming convention changes, update the users, operators, and contributors quickstarts together, not “this one first, the others later”.
What you need from the operator
audience: users
Before you can write a line of code against a running zipnet deployment, collect two (or three, if it is TDX-gated) items from whoever runs it. That is the whole handshake — zipnet does not gossip an instance registry, so everything you need to reach the deployment has to arrive out of band.
The handshake
| # | Item | What it is | Where it goes in your code |
|---|---|---|---|
| 1 | Instance name | Short namespaced string that names the deployment. Examples: acme.mainnet, preview.alpha, dev.ci-42. | Zipnet::bind(&network, "acme.mainnet") |
| 2 | Bootstrap PeerId | At least one reachable peer on the shared universe — typically the operator’s aggregator or a committee server. Without one, cold-start discovery falls back to the Mainline DHT and takes minutes instead of seconds. | discovery::Config::builder().with_bootstrap(peer_id) on the Network builder. |
| 3 | Committee MR_TD (TDX-gated deployments only) | 48-byte measurement of the operator’s committee image, hex-encoded. Pin this if your agent verifies inbound committee attestation, or match it if you are building a client image. | See TEE-gated deployments for which applies to your setup. |
The instance name is the one thing that differs between deployments.
It fully determines every on-wire ID the SDK uses — committee
GroupId, submit StreamId, broadcasts StoreId, ticket class — via
a single blake3("zipnet." + instance_name) derivation. If your
string disagrees with the operator’s by one character, your code
derives IDs nobody is serving, and Zipnet::bind returns
Error::ConnectTimeout after the bond window elapses.
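The shape of that failure mode is easy to picture with a toy derivation. This sketch uses std’s DefaultHasher as a stand-in for the SDK’s blake3 derivation — derive_id and its 64-bit output are illustrative only, not the real wire format:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative stand-in for the SDK's salt derivation: every
/// on-wire ID flows from one string, "zipnet." + instance name.
fn derive_id(instance_name: &str) -> u64 {
    let mut h = DefaultHasher::new();
    format!("zipnet.{instance_name}").hash(&mut h);
    h.finish()
}

fn main() {
    // One character of drift yields an unrelated ID: your client
    // derives stream/group IDs nobody is serving and times out.
    assert_ne!(derive_id("acme.mainnet"), derive_id("acme.mainet"));
    // The derivation is deterministic, so both sides agree on the
    // concrete IDs from the name alone, with no registry lookup.
    assert_eq!(derive_id("acme.mainnet"), derive_id("acme.mainnet"));
}
```

The same property is why operators never hand out raw IDs: the name is the whole configuration surface.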
The bootstrap peer is universe-level, not zipnet-specific. Any reachable peer on the shared universe is a valid starting point; once you are bonded, mosaik’s discovery finds the specific instance’s committee and aggregator through the shared peer catalog.
The MR_TD is relevant only if the operator has turned on TDX gating. Most development deployments do not; production often does.
What you do not need to ask for
- The universe NetworkId. It is zipnet::UNIVERSE — a shared constant baked into the SDK. Every operator and every user on zipnet uses the same value. You only need an operator-supplied override in the rare case they run an isolated federation on a different universe; assume they will tell you explicitly if so.
- Per-instance StreamId / StoreId / GroupId values. The SDK derives all of them from the instance name. Operators never hand these out, and the facade does not accept them.
- Committee server secrets or any committee member’s X25519 secret. You are a consumer, not a committee member.
- A seat on the committee’s Raft group. The SDK reads the broadcast log through a replicated collection; it does not vote.
How the handshake travels
Out of band. Release notes, a README in the operator’s repo, a Slack message, a secret-manager entry. Zipnet deliberately does not carry an on-network registry — the shared-universe model assumes consumers reference the instance name they trust at compile time, rather than discovering “what instances exist” at runtime. See Designing coexisting systems on mosaik for the rationale.
Pinning the instance name at compile time
A typo in the instance name silently produces a different UniqueId
and surfaces as ConnectTimeout. For production code, bake the name
in with the instance_id! macro so typos become build errors:
use zipnet::{Zipnet, UniqueId, UNIVERSE};
const ACME_MAINNET: UniqueId = zipnet::instance_id!("acme.mainnet");
let zipnet = Zipnet::bind_by_id(&network, ACME_MAINNET).await?;
instance_id!("acme.mainnet") and zipnet::instance_id("acme.mainnet")
produce identical bytes, so an operator’s ZIPNET_INSTANCE=acme.mainnet
env var and your compile-time constant land on the same UniqueId.
What you bring yourself
- Your mosaik SecretKey if you want a stable PeerId across restarts. Leave it unset to get a random identity per run, which is the usual choice for anonymous-use-case clients. See Identity.
- Your message payloads. The SDK does not care what bytes you put in — any impl Into<Vec<u8>>.
Minimal smoke test before writing anything substantial
Once you have the two items (three if TDX-gated), this program publishes to the deployment and prints a receipt within a few round periods:
use std::sync::Arc;
use mosaik::{Network, discovery};
use zipnet::{Zipnet, UNIVERSE};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let bootstrap = "<paste-the-operator's-peer-id>".parse()?;
let network = Arc::new(
Network::builder(UNIVERSE)
.with_discovery(discovery::Config::builder().with_bootstrap(bootstrap))
.build()
.await?,
);
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
let receipt = zipnet.publish(b"hello from my laptop").await?;
println!("landed in round {} slot {}", receipt.round, receipt.slot);
Ok(())
}
If bind returns ConnectTimeout, the instance name or the bootstrap
peer is the first suspect — see Troubleshooting.
Trust
The operator is trusted for liveness — they can stall or kill rounds at will. They are not trusted for anonymity, provided the any-trust assumption holds across their committee. See Threat model if you are auditing before integrating.
Quickstart — publish and read
audience: users
You bring a mosaik::Network; the SDK layers ZIPNet on top of it as
one service among many on a shared mosaik universe. Every deployment
is identified by an instance name. You bind to the one you want
with Zipnet::bind(&network, instance_name).
Why you might want this
You’re building something where a bounded, authenticated set of participants needs to publish messages without revealing which participant sent which. The canonical case is an encrypted mempool: TDX-attested wallets seal transactions and publish them through zipnet; builders read an ordered broadcast log of sealed transactions; nobody — not even a compromised builder — can link a transaction to its sender until on-chain execution reveals whatever the transaction itself reveals. The encryption layer (threshold decryption, TEE unsealing, or none) sits on top; zipnet supplies the anonymous, ordered, sybil-resistant publish channel underneath.
Other deployments in the same shape:
- Permissioned order-flow auctions. Whitelisted searchers publish intents; builders bid without knowing which searcher sent what.
- Anonymous governance signalling. Token-holder wallets cast signals a delegate can tally without learning which wallet sent any given one.
- Private sealed-bid auctions. Bidders publish; outcome is public; bid-to-bidder linkage is cryptographic.
What zipnet uniquely provides across these:
- Sender anonymity within an attested set. A compromised reader cannot tie a message back to its author unless every committee operator colludes (any-trust).
- Shared ordered view. Every subscriber sees the same log in the same order. No relay-race asymmetry between readers.
- Sybil resistance. Only TDX-attested clients can publish.
- Censorship resistance at the publish layer. Readers can’t drop messages from specific authors because authorship is unlinkable.
If you’re the operator standing up the deployment rather than using one, read the operator quickstart instead.
The one-paragraph mental model
A mosaik universe is a single shared NetworkId. Many services — zipnet,
multisig signers, secure storage, oracles — live on it simultaneously.
An operator can run any number of instances of zipnet (“mainnet”,
“preview.alpha”, “acme-corp”) concurrently on the same universe; each
instance has its own committee, its own ACL, its own round parameters,
and its own ticket class. You pick the one you want by name — the
operator tells you which name to use, and your code bakes it in. No
registry lookup, no runtime discovery of “what instances exist”. The
same Arc<Network> handle can also bind to other services without
needing a second network.
Cargo.toml
[dependencies]
zipnet = "0.1"
mosaik = "=0.3.17"
tokio = { version = "1", features = ["full"] }
futures = "0.3"
anyhow = "1"
zipnet re-exports mosaik::{Tag, unique_id!} so you rarely reach
for mosaik directly in small agents, but you’ll usually keep mosaik
as a direct dep since you’re the one owning the Network.
Publisher
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let network = Arc::new(Network::new(UNIVERSE).await?);
    let zipnet = Zipnet::bind(&network, "mainnet").await?;
    let receipt = zipnet.publish(b"hello from my agent").await?;
    println!("landed in round {} slot {}", receipt.round, receipt.slot);
    Ok(())
}
Three lines inside main:
- Create a mosaik network on the shared universe NetworkId.
- Bind to the mainnet zipnet instance. The SDK resolves the instance salt to concrete stream, collection, and group IDs, installs the client identity, attaches the bundle ticket, and waits until you are in a live round’s roster.
- publish resolves after the broadcast finalizes.
UNIVERSE is the shared NetworkId that hosts the deployment. Zipnet
exports this constant today; when mosaik ships a canonical universe
constant, this value will be re-exported verbatim. See
Designing coexisting systems on mosaik
for the full rationale.
Subscriber
use std::sync::Arc;
use futures::StreamExt;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let network = Arc::new(Network::new(UNIVERSE).await?);
    let zipnet = Zipnet::bind(&network, "mainnet").await?;
    let mut rounds = zipnet.subscribe().await?;
    while let Some(round) = rounds.next().await {
        for msg in round.messages() {
            println!("round {}: {:?}", round.id(), msg.bytes());
        }
    }
    Ok(())
}
round.messages() yields only payloads that decoded cleanly —
falsification-tag verification and collision filtering happen inside
the SDK. Reach for round.raw() if you need the BroadcastRecord.
Binding to a testnet, devnet, or tenant instance
Instance names are free-form strings; well-known names are
conventions, not types. An operator running a testnet gives you its
instance name (e.g. preview.alpha) along with the universe-level
bootstrap peers and any required TDX measurement.
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let network = Arc::new(Network::new(UNIVERSE).await?);
    let zipnet = Zipnet::bind(&network, "preview.alpha").await?;
    let _ = zipnet.publish(b"hi from testnet").await?;
    Ok(())
}
| Instance name | What operators commonly use it for |
|---|---|
| mainnet | Production deployment, long-lived committee |
| preview.<tag> | Long-lived testnet on a per-tag TDX image |
| dev.<tag> | Per-developer or per-CI-job ephemeral instance |
| anything else | Whatever the operator tells you |
The SDK itself does not dispatch on the name — TDX attestation is
controlled by the tee-tdx Cargo feature on the zipnet crate, not
by the instance name you pick. The table above is naming convention,
not policy.
The instance name is the only piece of zipnet-specific identity the
SDK needs. It fully determines the committee GroupId, the submit
StreamId, the broadcasts StoreId, and the ticket class — all
derived from one salt (see
Designing coexisting systems on mosaik).
A typo in the instance name is silent — your code derives different
IDs than the operator, no one picks up, and bind returns
ConnectTimeout after the bond window elapses. For production,
consider pinning the instance as a compile-time UniqueId constant
using the instance_id! macro, so a typo
is caught at build time:
use zipnet::{Zipnet, UniqueId, UNIVERSE};
const ACME_MAINNET: UniqueId = zipnet::instance_id!("acme.mainnet");
let zipnet = Zipnet::bind_by_id(&network, ACME_MAINNET).await?;
The instance_id! macro and the runtime instance_id function
produce identical bytes for the same name, so the operator’s
ZIPNET_INSTANCE=acme.mainnet env var and your compile-time constant
land on the same UniqueId.
Sharing one Network across services and instances
Because Zipnet::bind only takes &Arc<Network>, one network handle
can simultaneously serve many services and many instances of the same
service:
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let network = Arc::new(Network::new(UNIVERSE).await?);

    // two zipnet instances side by side
    let prod = Zipnet::bind(&network, "mainnet").await?;
    let testnet = Zipnet::bind(&network, "preview.alpha").await?;

    // …and unrelated services on the same network
    // let multisig = Multisig::bind(&network, "treasury").await?;
    // let storage = Storage::bind(&network, "archive").await?;

    let _ = prod.publish(b"production message").await?;
    let _ = testnet.publish(b"dry-run message").await?;
    Ok(())
}
Every instance and every service derives its own IDs from its own salt, so they coexist on the shared catalog without collision. You pay for one mosaik endpoint, one DHT record, one gossip loop — not one per service.
Bring-your-own-config
You keep full control of the mosaik builder; the SDK never constructs
the Network for you:
use std::{net::SocketAddr, sync::Arc};
use mosaik::{Network, discovery};
use zipnet::{Zipnet, UNIVERSE};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let network = Arc::new(
        Network::builder(UNIVERSE)
            .with_mdns_discovery(true)
            .with_discovery(discovery::Config::builder().with_bootstrap(universe_bootstrap_peers()))
            .with_prometheus_addr("127.0.0.1:9100".parse::<SocketAddr>()?)
            .build()
            .await?,
    );
    let zipnet = Zipnet::bind(&network, "mainnet").await?;
    let _ = zipnet.publish(b"hi").await?;
    Ok(())
}

fn universe_bootstrap_peers() -> Vec<mosaik::PeerId> { vec![] }
Bootstrap peers are universe-level, not zipnet-specific. Any
reachable peer on the shared network — a mosaik registry node, a
friendly operator’s aggregator, your own relay — works as a starting
point. Once you’re bonded, Zipnet::bind locates the specific
instance’s committee and aggregator via the shared peer catalog.
What you get back
pub struct Receipt {
    pub round: zipnet::RoundId,
    pub slot: usize,
    pub outcome: zipnet::Outcome,
}

pub enum Outcome { Landed, Collided, Dropped }

pub struct Round { /* opaque */ }

impl Round {
    pub fn id(&self) -> zipnet::RoundId;
    pub fn messages(&self) -> impl Iterator<Item = zipnet::Message>;
    pub fn raw(&self) -> &zipnet::BroadcastRecord;
}

pub struct Message { /* opaque */ }

impl Message {
    pub fn bytes(&self) -> &[u8];
    pub fn slot(&self) -> usize;
}
Almost every application uses Receipt::outcome and
Message::bytes() and ignores the rest.
Error model
pub enum Error {
    WrongUniverse { expected: mosaik::NetworkId, actual: mosaik::NetworkId },
    ConnectTimeout,
    Attestation(String),
    Shutdown,
    Protocol(String),
}
ConnectTimeout is the one you’ll hit in development — usually a
typo in the instance name (you’re deriving a GroupId nobody is
serving), an unreachable bootstrap peer, or an operator whose
committee isn’t up yet. WrongUniverse shows up if your Network
was built against a different universe NetworkId than the SDK
expects.
Cover traffic is on by default
An idle Zipnet handle sends a cover envelope each round to widen
the anonymity set. See
Publishing messages for
how to tune or disable it.
Shutdown
drop(zipnet); // fine — the driver task exits cleanly
zipnet.shutdown().await?; // if you want to flush pending publishes first
Dropping one Zipnet handle only shuts that binding down; the
Network stays up as long as other handles (or you) hold it. This is
the intended pattern when one process talks to several services or
several instances.
Next reading
- What you need from the operator — the fact sheet the operator gives you before writing code.
- Publishing messages — fire-and-forget, cover traffic, retry policy.
- Reading the broadcast log — replay, gap detection, filtering.
- Client identity and registration — stable vs ephemeral ClientId.
- TEE-gated deployments — TDX builds, measurement rollouts.
- Designing coexisting systems on mosaik — the shared-universe / instance-name model in full.
- API reference — full type list.
Client identity
audience: users
A zipnet client has two distinct identities that work together. The SDK
manages one of them for you; the other you control through the mosaik
Network you hand to Zipnet::bind.
Two identities
| Identity | Type | Where it comes from | Purpose |
|---|---|---|---|
| PeerId | ed25519 public key | mosaik / iroh SecretKey on the Network | Authenticates you on the wire. Signs your PeerEntry. |
| Client-side DH identity | X25519 keypair | Generated inside Zipnet::bind per binding | Names your slot in the anonymous-broadcast rounds. Binds your pads. |
The DH identity is internal to zipnet and not exposed across the SDK
surface — you never see a ClientId or DhSecret type in user code.
Every call to Zipnet::bind generates a fresh DH keypair, installs
the matching bundle ticket through mosaik’s discovery layer, and waits
until the committee admits the binding into a live round. When you
drop the Zipnet handle, that keypair and its ticket go with it.
Your PeerId is the only identity you materially choose.
Choose your PeerId lifetime
Fully ephemeral (default)
Build the Network without calling with_secret_key. Mosaik picks a
random iroh identity per run. Combined with the per-bind DH
identity, this means every process run is an unlinkable
(PeerId, client-DH) pair.
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};
let network = Arc::new(Network::new(UNIVERSE).await?);
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
This is the right default for anonymous use cases. An observer
correlating PeerIds across rounds learns only “this peer was online
during this interval” — which is what mosaik’s transport layer exposes
anyway, independent of zipnet.
Stable PeerId, ephemeral client DH identity
Useful when you want a predictable bootstrap target (your agent’s
PeerId stays the same across restarts) but you don’t want to be
correlatable inside zipnet rounds. Each bind gets a fresh DH keypair
regardless of the PeerId.
use std::sync::Arc;
use mosaik::{Network, SecretKey};
use zipnet::{Zipnet, UNIVERSE};

let sk = SecretKey::from_bytes(&my_seed_bytes);
let network = Arc::new(
    Network::builder(UNIVERSE)
        .with_secret_key(sk)
        .build()
        .await?,
);
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
The rebind produces a new client DH identity even with the same
PeerId, so rounds stay unlinkable at the zipnet layer. If you hold
one Zipnet handle for a long time and publish many messages, those
messages share one client DH identity and are linkable to each
other. To rotate, drop the handle and call bind again.
Stable everything (rare)
The current SDK does not expose a way to persist the per-binding DH identity across restarts. If you need stable client identity for a reputation or allowlist use case, talk to the operator about attested-client TDX features — see TEE-gated deployments. Stable anonymous-publish identity at the application layer is an anti-pattern: it trivially breaks unlinkability across rounds.
Multiple bindings per process
Zipnet::bind only borrows the Arc<Network>, so one network can
host many bindings — the same instance many times, different instances
side by side, or zipnet alongside other mosaik services:
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};
let network = Arc::new(Network::new(UNIVERSE).await?);
let prod = Zipnet::bind(&network, "acme.mainnet").await?;
let testnet = Zipnet::bind(&network, "preview.alpha").await?;
let prod_bis = Zipnet::bind(&network, "acme.mainnet").await?;
prod and prod_bis have the same PeerId but independent
client DH identities; the committee treats them as two distinct
publishers. This is occasionally useful for widening your own
anonymity set in test deployments, but it does not buy you extra
anonymity in production against a global observer watching your
network interface.
Rotating
Drop the Zipnet handle and call bind again:
drop(prod);
let prod = Zipnet::bind(&network, "acme.mainnet").await?;
drop tears down the driver task, removes the bundle ticket from
discovery, and lets the committee’s roster forget the old DH
identity at the next gossip cycle. The next bind starts clean.
If you want to flush pending publishes before dropping, prefer
zipnet.shutdown().await? — see Publishing.
What about the peer catalog?
The mosaik peer catalog — network.discovery().catalog() — lists
every peer zipnet and anything else on the shared universe sees. It
is not zipnet-specific, and the SDK does not ask you to interact with
it. If you need to inspect it for debugging, see the
mosaik book on discovery.
Publishing messages
audience: users
Everything about getting a payload into a finalized broadcast round.
The whole surface
impl Zipnet {
    pub async fn publish(&self, payload: impl Into<Vec<u8>>) -> Result<Receipt>;
}

pub struct Receipt {
    pub round: zipnet::RoundId,
    pub slot: usize,
    pub outcome: zipnet::Outcome,
}

pub enum Outcome { Landed, Collided, Dropped }
One call per message. publish resolves after the round carrying the payload finalizes — not when the aggregator accepts the envelope. The Receipt tells you what actually happened.
Fire-and-forget
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let network = Arc::new(Network::new(UNIVERSE).await?);
    let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
    let _ = zipnet.publish(b"hello").await?;
    Ok(())
}
If you don’t care whether the message landed, collided, or was
dropped, discard the Receipt. Many applications do exactly this —
the encryption or ordering layer built on top replays lost messages
at the application level.
Inspecting the outcome
use zipnet::Outcome;

let receipt = zipnet.publish(payload).await?;
match receipt.outcome {
    Outcome::Landed => tracing::info!(round = %receipt.round, "published"),
    Outcome::Collided => {
        // Another client hashed to the same slot this round. Both
        // payloads are XOR-corrupted. Retry on the next round.
        tracing::warn!(round = %receipt.round, "collision, retrying");
        // … call publish again with the same payload …
    }
    Outcome::Dropped => {
        // The aggregator never forwarded the envelope into a
        // committed aggregate. Usually transient — the aggregator
        // was offline or our registration hadn't propagated yet.
        tracing::warn!(round = %receipt.round, "dropped, retrying");
    }
}
Landed is the happy path. Under default parameters and a small
active set, most rounds produce Landed for everyone.
Retry policy
Zipnet does not retry for you. If you need at-least-once delivery
at the application layer, wrap publish in your own loop:
use zipnet::{Outcome, Zipnet};
async fn publish_with_retry(z: &Zipnet, payload: Vec<u8>, attempts: u32)
    -> zipnet::Result<zipnet::Receipt>
{
    // Protocol-level errors surface immediately via `?`. If every
    // attempt fails, return the last receipt (probably
    // Collided/Dropped) so the caller can inspect it.
    let mut last = z.publish(payload.clone()).await?;
    for _ in 1..attempts {
        if matches!(last.outcome, Outcome::Landed) {
            return Ok(last);
        }
        last = z.publish(payload.clone()).await?;
    }
    Ok(last)
}
Retry latency is bounded by the round cadence of the deployment — at
the default ~2 s round period, three attempts cost up to ~6 s. Tune
attempts to your SLA.
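As a sketch of that tuning, assuming the ~2 s default round period quoted above (the function name is ours, not part of the SDK):

```rust
// Worst-case retry latency: each publish resolves at a round
// boundary, so `attempts` tries cost up to attempts × round_period.
// Invert that to pick an attempt count that fits a latency SLA.
fn max_attempts_within(sla_secs: f64, round_period_secs: f64) -> u32 {
    (sla_secs / round_period_secs).floor() as u32
}

fn main() {
    // A 6 s SLA at the ~2 s default leaves room for 3 attempts.
    assert_eq!(max_attempts_within(6.0, 2.0), 3);
}
```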
Payload budget
The SDK accepts impl Into<Vec<u8>>. Internally, a payload that
exceeds the round’s per-slot budget is rejected; payloads that fit
are zero-padded into their slot. Current default round parameters
give you 240 bytes of application payload per publish. For
larger messages, split at the application layer — the protocol does
not frame for you.
If your deployment uses non-default parameters, the operator will
tell you the budget. The SDK surfaces oversized payloads as
Error::Protocol("payload too large: …"); see
Troubleshooting.
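To illustrate the application-layer splitting mentioned above, here is a minimal framing sketch. The 3-byte header (message id, frame index, frame count) is our own hypothetical convention, not part of the zipnet wire format; 240 bytes is the default budget quoted above.

```rust
// Hypothetical application-layer framing: split a message into
// frames that each fit the default 240-byte per-slot budget. The
// 3-byte header (msg id, frame index, frame count) is an assumed
// convention for this sketch, not a zipnet protocol feature.
const SLOT_BUDGET: usize = 240;
const HEADER_LEN: usize = 3;

fn split_into_frames(msg_id: u8, payload: &[u8]) -> Vec<Vec<u8>> {
    let body = SLOT_BUDGET - HEADER_LEN; // 237 payload bytes per frame
    let count = payload.chunks(body).count().max(1) as u8;
    payload
        .chunks(body)
        .enumerate()
        .map(|(i, chunk)| {
            let mut frame = Vec::with_capacity(HEADER_LEN + chunk.len());
            frame.extend_from_slice(&[msg_id, i as u8, count]);
            frame.extend_from_slice(chunk);
            frame
        })
        .collect()
}

fn main() {
    let frames = split_into_frames(7, &[0u8; 500]);
    assert_eq!(frames.len(), 3); // ceil(500 / 237)
    assert!(frames.iter().all(|f| f.len() <= SLOT_BUDGET));
}
```

Each frame would then go through its own publish call; the receiver reassembles by (msg id, frame index) once all `count` frames have landed.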
Cover traffic is on by default
An idle Zipnet handle sends a cover envelope each round to widen
the anonymity set. Cover envelopes do not show up as publish calls
on your side — they are generated automatically by the binding’s
driver task. Observers cannot distinguish a cover round from a
real-payload round for any given participant.
There is no SDK knob to tune cover-traffic rate today. If you hold a
Zipnet handle, you participate; if you drop it, you don’t. For
applications that want to only appear for certain rounds, bind
immediately before you need to publish and drop immediately after —
see Identity.
Parallel publishes on one handle
Zipnet is Clone and internally Arc-wrapped. Concurrent
publish calls on one handle are fine; the driver serializes them
per-round and emits at most one payload per round per binding. If
you call publish twice during the same round window, the second
call waits for the next round rather than sharing the slot.
let z = Zipnet::bind(&network, "acme.mainnet").await?;
let a = z.clone();
let b = z.clone();
let (ra, rb) = tokio::join!(
a.publish(b"message A"),
b.publish(b"message B"),
);
// ra and rb come from different rounds.
If you need higher throughput per wall-clock second, the right lever
is operator-side round cadence or num_slots. From the SDK, one
binding is one slot per round.
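The throughput ceiling that implies can be put in numbers — a back-of-envelope using the defaults quoted in this chapter (240-byte budget, ~2 s rounds); the function is illustrative, not SDK API:

```rust
// One binding emits at most one slot per round, so per-binding
// throughput is capped at slot_budget / round_period.
fn max_bytes_per_sec(slot_budget_bytes: f64, round_period_secs: f64) -> f64 {
    slot_budget_bytes / round_period_secs
}

fn main() {
    // 240-byte budget at ~2 s rounds: ~120 B/s per binding.
    assert_eq!(max_bytes_per_sec(240.0, 2.0), 120.0);
}
```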
Shutdown
drop(zipnet); // fine — the driver exits cleanly, in-flight publishes may be lost
zipnet.shutdown().await?; // waits for in-flight receipts, then tears down
shutdown returns Error::Shutdown if the binding was already
closing. Otherwise the call resolves once pending publishes have
either landed or been marked Dropped. Use it in application-level
shutdown paths where losing a trailing publish would be surprising.
Dropping one Zipnet handle does not tear down the Network.
Other services or other zipnet instances sharing the same
Arc<Network> keep running.
Reading the broadcast log
audience: users
Zipnet::subscribe returns a stream of finalized rounds. Every
subscriber sees the same log in the same order.
The whole surface
impl Zipnet {
pub async fn subscribe(&self) -> Result<BroadcastStream>;
}
// BroadcastStream implements futures::Stream<Item = Round>.
pub struct Round { /* opaque */ }
impl Round {
pub fn id(&self) -> zipnet::RoundId;
pub fn messages(&self) -> impl Iterator<Item = Message>;
pub fn raw(&self) -> &zipnet::BroadcastRecord;
}
pub struct Message { /* opaque */ }
impl Message {
pub fn bytes(&self) -> &[u8];
pub fn slot(&self) -> usize;
}
One call per subscriber. Every call to subscribe returns a fresh
receiver; handles are cheap.
Tail the log as it grows
use std::sync::Arc;
use futures::StreamExt;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let network = Arc::new(Network::new(UNIVERSE).await?);
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
let mut rounds = zipnet.subscribe().await?;
while let Some(round) = rounds.next().await {
for msg in round.messages() {
println!("round {}: {:?}", round.id(), msg.bytes());
}
}
Ok(())
}
round.messages() yields only payloads that decoded cleanly —
falsification-tag verification and collision filtering happen inside
the SDK. You see the application bytes the publisher actually sealed,
not the raw slot bytes.
round.id() is monotonically increasing. Consecutive items from the
stream have strictly increasing ids under normal operation.
Wait for a specific round
use futures::StreamExt;
use zipnet::{Zipnet, RoundId};
async fn wait_for_round(
zipnet: &Zipnet,
target: RoundId,
) -> anyhow::Result<zipnet::Round> {
let mut rounds = zipnet.subscribe().await?;
while let Some(round) = rounds.next().await {
if round.id() == target {
return Ok(round);
}
if round.id() > target {
anyhow::bail!("round {target} is already in the past");
}
}
anyhow::bail!("stream closed before round {target}")
}
If you subscribe after the round you care about has already
finalized, the stream will skip past it — it only yields rounds
that finalize after subscribe returns. Keep your subscription
open if you care about a specific future round.
Gap detection and catch-up
A fresh subscription begins from whatever the committee finalizes next. Earlier rounds are not replayed. If you need the full history, open the subscription before you publish anything and buffer yourself.
If the subscriber falls behind — usually because your round handler
is slower than the round cadence — the SDK’s internal broadcast
channel lags. You see this as a round id gap: one call to
rounds.next().await returns round N, the next returns round
N + k for some k > 1. The lost rounds are gone; the SDK does
not backfill them. The fix is to make the handler non-blocking —
offload heavy work to a separate task:
use futures::StreamExt;
use tokio::sync::mpsc;
let mut rounds = zipnet.subscribe().await?;
let (tx, mut rx) = mpsc::channel(1024);
// Producer: drain the SDK stream as fast as it delivers.
tokio::spawn(async move {
while let Some(round) = rounds.next().await {
if tx.send(round).await.is_err() {
break;
}
}
});
// Consumer: heavy per-round work that can tolerate small bursts.
while let Some(round) = rx.recv().await {
handle(round).await;
}
With this shape, the SDK’s internal buffer drains continuously; the bounded channel between tasks is the one that can fill up, and you control its size.
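If you want to count what you lost, the gap arithmetic is simple. A minimal sketch, modeling RoundId as a plain u64 for illustration (the real type is zipnet::RoundId):

```rust
// Gap accounting for the pattern described above: given the last
// round id seen and the next one delivered, how many rounds did
// the SDK skip?
fn rounds_lost(prev: Option<u64>, next: u64) -> u64 {
    match prev {
        Some(p) if next > p + 1 => next - p - 1, // these are gone
        _ => 0,
    }
}

fn main() {
    assert_eq!(rounds_lost(None, 10), 0);     // first round seen
    assert_eq!(rounds_lost(Some(10), 11), 0); // consecutive: no gap
    assert_eq!(rounds_lost(Some(10), 14), 3); // rounds 11..=13 dropped
}
```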
Raw access
Round::messages() hides everything zipnet-specific — which
client occupied which slot, how the broadcast vector was laid out,
server roster for the round. When you need the underlying
BroadcastRecord, reach for raw():
use zipnet::BroadcastRecord;
while let Some(round) = rounds.next().await {
let rec: &BroadcastRecord = round.raw();
tracing::debug!(
round = %rec.round,
n_participants = rec.participants.len(),
n_servers = rec.servers.len(),
broadcast_bytes = rec.broadcast.len(),
);
}
BroadcastRecord is a public type from zipnet-proto. Most
applications never need it — the hidden-behind-messages() decode
pipeline is what you want.
Multiple subscribers
One Zipnet handle can produce many subscribers:
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
let mut rounds_a = zipnet.subscribe().await?;
let mut rounds_b = zipnet.subscribe().await?;
Both receive the same rounds in the same order. Independent lag: slowing down subscriber A does not affect subscriber B.
Shutdown
Dropping the stream is enough. The SDK’s driver keeps running as long
as the Zipnet handle lives; the next subscribe call gives you a
fresh stream from the then-current point in the log.
Connecting to the universe
audience: users
The nuts and bolts of building the Arc<Network> that Zipnet::bind
attaches to. The zipnet SDK never constructs the network for you —
this is intentional. One network can host zipnet alongside other
mosaik services on the shared universe, and you own its lifetime.
The minimum
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let network = Arc::new(Network::new(UNIVERSE).await?);
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
let _ = zipnet.publish(b"hello").await?;
Ok(())
}
Network::new(UNIVERSE) produces a network with default mosaik
settings — random SecretKey, mDNS off, no bootstrap peers, no
prometheus endpoint. Enough for local integration tests; rarely
enough for a real deployment.
Bring your own builder
For anything beyond a local experiment, use Network::builder:
use std::{net::SocketAddr, sync::Arc};
use mosaik::{Network, discovery};
use zipnet::{Zipnet, UNIVERSE};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let network = Arc::new(
Network::builder(UNIVERSE)
.with_mdns_discovery(true)
.with_discovery(
discovery::Config::builder()
.with_bootstrap(universe_bootstrap_peers()),
)
.with_prometheus_addr("127.0.0.1:9100".parse::<SocketAddr>()?)
.build()
.await?,
);
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
let _ = zipnet.publish(b"hi").await?;
Ok(())
}
fn universe_bootstrap_peers() -> Vec<mosaik::PeerId> { vec![] }
Every argument above is a mosaik concern, not a zipnet one. The full builder reference lives in the mosaik book. The rest of this page covers the fields that matter most for a zipnet user.
UNIVERSE
zipnet::UNIVERSE is the shared NetworkId every zipnet deployment
lives on. Today it is mosaik::unique_id!("mosaik.universe"). When
mosaik ships its own canonical universe constant, this value will be
re-exported verbatim.
If your Network is on a different NetworkId, Zipnet::bind
rejects it with Error::WrongUniverse { expected, actual } before
any I/O happens. There is no way to tunnel zipnet over a
non-universe network; the SDK hard-checks this.
Bootstrap peers
Universe-level, not zipnet-specific. Any reachable peer on the shared universe works as a bootstrap — a mosaik registry node, a friendly operator’s aggregator, your own persistent relay. The operator does not typically hand out zipnet-instance-specific bootstrap peers; they publish one set of universe bootstraps that their zipnet instance (and any other services they host) joins through.
Once your network is bonded to the universe, Zipnet::bind finds
the specific instance’s committee through the shared peer catalog —
you do not need to know anything zipnet-specific at network-builder
time.
use mosaik::discovery;
use zipnet::UNIVERSE;
let network = mosaik::Network::builder(UNIVERSE)
.with_discovery(
discovery::Config::builder()
.with_bootstrap(vec![
// universe-level bootstrap peer IDs, operator-supplied
]),
)
.build()
.await?;
On first connect with no bootstrap peers you fall back to the DHT. That works, but it is slow (tens of seconds on a cold start). At least one bootstrap peer is a practical requirement for anything beyond local tests.
mDNS
.with_mdns_discovery(true) collapses discovery latency from minutes
to seconds on a shared LAN and is harmless elsewhere. Turn it off
only if your security posture forbids advertising peers over mDNS.
Secret key
Omit .with_secret_key(...) for a fresh iroh identity per run. Set a
stable SecretKey if you want a predictable PeerId across
restarts. See Client identity for when each is
appropriate.
Reaching the universe from behind NAT
iroh handles NAT traversal through its relay infrastructure. Most residential and office setups need no extra configuration. Things that help when they don’t:
- Outbound UDP must be allowed. iroh uses QUIC over UDP.
- Full-cone NAT or better traverses directly. Symmetric NAT falls back to the relay — it still works, with extra latency.
- UDP-terminating proxies break iroh. Run the agent from a host with raw outbound UDP.
At startup the network logs its relay choice:
relay-actor: home is now relay https://euc1-1.relay.n0.iroh-canary.iroh.link./
Repeated “Failed to connect to relay server” warnings mean your outbound path is broken; discovery mostly still works via DHT, just slow.
Observability for your own agent
use std::{net::SocketAddr, sync::Arc};
use mosaik::Network;
use zipnet::UNIVERSE;
let network = Arc::new(
Network::builder(UNIVERSE)
.with_prometheus_addr("127.0.0.1:9100".parse::<SocketAddr>()?)
.build()
.await?,
);
Then scrape http://127.0.0.1:9100/metrics — you’ll get mosaik’s
metrics plus whatever you emit with the metrics crate. The zipnet
SDK does not expose its own top-level metrics endpoint; observability
is the network’s job.
One network, many services and instances
Because Zipnet::bind only borrows &Arc<Network>, you pay for one
mosaik endpoint across every service and instance you bind:
use std::sync::Arc;
use mosaik::Network;
use zipnet::{Zipnet, UNIVERSE};
let network = Arc::new(Network::new(UNIVERSE).await?);
let prod = Zipnet::bind(&network, "acme.mainnet").await?;
let testnet = Zipnet::bind(&network, "preview.alpha").await?;
// let multisig = Multisig::bind(&network, "treasury").await?; // hypothetical
// let storage = Storage::bind(&network, "archive").await?; // hypothetical
Each binding derives its own IDs from its own salt, so they coexist on the shared peer catalog without collision. One UDP socket, one DHT record, one gossip loop.
Graceful shutdown
drop(network);
drop cancels everything — open streams, collection readers, bonds.
Mosaik emits a gossip departure so the operator’s logs show you
leaving cleanly. If you want to flush pending zipnet publishes first,
call zipnet.shutdown().await? on each binding before dropping the
network. See Publishing — Shutdown.
Cold-start checklist
If your agent starts but Zipnet::bind returns ConnectTimeout:
- The Arc<Network> is on UNIVERSE. If you see WrongUniverse instead, the network was built against a different NetworkId. Switch back to UNIVERSE.
- The instance name matches the operator’s exactly. Typos surface as ConnectTimeout, not InstanceNotFound. Consider pinning via zipnet::instance_id!("name") so the name is checked at build time.
- Bootstrap PeerIds are reachable. nc -zv <their_host> or whatever the operator tells you to test.
- Outbound UDP is allowed. iperf over UDP to a public host.
- Your mosaik version matches (=0.3.17). Any minor-version drift changes wire formats.
If none of these resolves it, see Troubleshooting.
TEE-gated deployments
audience: users
Some zipnet deployments require every participant — committee members and publishing clients — to run inside a TDX enclave whose measurement matches the operator’s expected MR_TD. This chapter covers the user side of that setup.
Is the deployment TEE-gated?
Ask the operator. Specifically:
- Does the committee stack a Tdx validator on its admission tickets?
- If so, what MR_TD must your client image report?
If the answer to the first question is no, skip this chapter — the rest of the user guide applies unchanged.
How the SDK decides whether to attest
TDX is a Cargo feature on the zipnet crate, not a function of the
instance name:
- tee-tdx disabled (default). The SDK runs a mocked attestation path. Your PeerEntry does not carry a TDX quote. A TDX-gated operator’s committee rejects you at bond time — you see Error::ConnectTimeout (the rejection is silent at the discovery layer) or Error::Attestation if the operator has enabled a stricter surfacing mode.
- tee-tdx enabled. Zipnet::bind uses mosaik’s real TDX path to generate a quote bound to your current PeerId and attach it to your discovery entry. The committee validates the quote before admitting you.
# Cargo.toml for a user-side agent that must attest.
[dependencies]
zipnet = { version = "0.1", features = ["tee-tdx"] }
With the feature on, your binary only runs correctly inside a real
TDX guest. The TDX hardware refuses to quote from a non-TDX machine,
so bind surfaces that as Error::Attestation("…").
Build-time: produce a TDX image
Add mosaik’s TDX builder to your crate:
[build-dependencies]
mosaik = { version = "=0.3.17", features = ["tdx-builder-alpine"] }
# or: features = ["tdx-builder-ubuntu"]
build.rs:
fn main() {
mosaik::tee::tdx::build::alpine().build();
}
This produces a bootable TDX guest image at
target/<profile>/tdx-artifacts/<crate>/alpine/ plus a precomputed
<crate>-mrtd.hex. The operator either uses your MR_TD as their
expected value, or — if they pin a specific image — hands you theirs
and you rebuild to match.
The mosaik TDX reference covers Alpine vs Ubuntu trade-offs, SSH and kernel customization, and environment-variable overrides.
The operator → user handshake for TDX
A TDX-gated deployment adds one item to the three-item handshake in What you need from the operator:
| Item | What it is |
|---|---|
| Committee MR_TD | The 48-byte hex measurement the operator’s committee images use. |
The operator hands this out via their release notes, not via the
wire. The zipnet SDK does not bake per-instance MR_TD mappings in —
there is no table of “acme.mainnet requires MR_TD abc…” inside
the crate. Keeping that mapping client-side is the operator’s
responsibility, published out of band.
When the operator rotates the image, your old quote stops validating; the fix is to rebuild with the new MR_TD and redeploy. There is no auto-discovery of acceptable measurements on the wire.
Multi-variant deployments
During a rollout, an operator may accept multiple client MR_TDs
simultaneously — usually the old and the new during a staged migration.
You only need to match one of them. The precomputed hex files in
target/<profile>/tdx-artifacts/<crate>/.../ tell you what your image
reports; compare against the list the operator publishes.
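That comparison is just a case-insensitive membership check. A sketch, assuming the operator publishes their accepted measurements as plain hex strings (the function and list format are hypothetical, not SDK API):

```rust
// Check our image's precomputed MR_TD (from the build artifacts
// directory described above) against the operator's published list.
fn mrtd_accepted(ours: &str, published: &[&str]) -> bool {
    let ours = ours.trim();
    published.iter().any(|m| m.trim().eq_ignore_ascii_case(ours))
}

fn main() {
    let published = ["aa11bb22", "cc33dd44"]; // hypothetical accepted MR_TDs
    assert!(mrtd_accepted("AA11BB22\n", &published)); // hex case and whitespace ignored
    assert!(!mrtd_accepted("ee55ff66", &published));
}
```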
Sealing secrets inside the enclave
Zipnet’s current SDK does not expose a sealed-storage helper — each
Zipnet::bind generates a fresh per-binding DH identity in process
memory. That is fine for the default anonymous-use-case model, where
identity is meant to rotate.
If you need stable identity across enclave reboots for a reputation use case, you will need to persist state to TDX sealed storage yourself today. That is out of scope for the SDK and likely to land as a mosaik primitive rather than a zipnet feature; watch the mosaik release notes.
Falling back to non-TDX for development
If you’re writing integration tests and don’t want a TDX VM in the
loop, build without the tee-tdx feature and use a deployment whose
operator has disabled TDX gating. Typical arrangement:
- Production and staging: tee-tdx on both sides.
- Local dev / CI: tee-tdx off on both sides.
The operator runs the dev instance without the Tdx validator on
committee admissions; you build your client without the tee-tdx
feature. Both sides’ mocks line up.
Failure modes
The error the SDK surfaces when TDX is involved is
Error::Attestation(String).
Common causes:
- You built with tee-tdx but aren’t running inside a TDX guest (hardware refuses to quote).
- Your MR_TD differs from the operator’s. Rebuild with their image.
- The operator rotated MR_TD and you haven’t. Rebuild.
ConnectTimeout can also stem from TDX mismatches on deployments
that surface attestation failures silently at the bond layer; see
Troubleshooting.
Troubleshooting from the user side
audience: users
Failure modes you can observe from your own agent, mapped to the SDK’s error enum and the fastest check for each.
The error enum
pub enum Error {
WrongUniverse { expected: mosaik::NetworkId, actual: mosaik::NetworkId },
ConnectTimeout,
Attestation(String),
Shutdown,
Protocol(String),
}
Five variants. The two you will hit most in development are
ConnectTimeout and WrongUniverse. Everything else is either a
real runtime condition or lower-level plumbing surfaced through
Protocol.
Symptom: bind returns ConnectTimeout
This is the single most common dev-time error. It means the SDK could not bond to a peer serving your instance within the connect deadline. In descending order of likelihood:
1. Typo in the instance name
Your code derives UniqueIds from "zipnet." + instance_name via
blake3. A one-character change produces a completely different id,
and nobody is serving it.
Fix: double-check the name against the operator’s handoff. Prefer pinning it as a compile-time constant so typos become build errors:
use zipnet::{Zipnet, UniqueId, UNIVERSE};
const ACME_MAINNET: UniqueId = zipnet::instance_id!("acme.mainnet");
let zipnet = Zipnet::bind_by_id(&network, ACME_MAINNET).await?;
2. Operator’s committee isn’t up
The name is right, but nobody is currently serving it. The SDK
cannot distinguish “nobody serves this” from “operator isn’t up yet”
without an on-network registry — both surface as ConnectTimeout.
Fix: ask the operator whether the deployment is live.
3. Bootstrap peers unreachable
Even if the instance name is right and the committee is up, your network never bonded to the universe — so it never found the committee. Usually shows up alongside no peer-catalog growth.
Fix: check the bootstrap peer list. See Connecting — Cold-start checklist.
4. TDX posture mismatch
Silent rejection at the bond layer from a TDX-gated deployment
often looks like ConnectTimeout rather than a clear Attestation
error. Common when your client is built without the tee-tdx
feature against a TDX-gated operator.
Fix: see TEE-gated deployments.
Symptom: bind returns WrongUniverse
Your Arc<Network> was built against a different NetworkId than
zipnet::UNIVERSE. The error payload tells you both values:
match zipnet::Zipnet::bind(&network, "acme.mainnet").await {
Err(zipnet::Error::WrongUniverse { expected, actual }) => {
tracing::error!(%expected, %actual, "network on wrong universe");
}
…
}
Fix: build the network with Network::new(UNIVERSE) or
Network::builder(UNIVERSE). There is no way to tunnel zipnet over
a non-universe network.
Symptom: bind returns Attestation
TDX attestation failed. The string payload names the specific failure from the mosaik TDX stack.
Common causes:
- You built with tee-tdx but aren’t running inside a TDX guest.
- Your MR_TD differs from the operator’s expected value (fresh image you haven’t rebuilt, or operator rotated).
- Your quote has expired.
Symptom: publish returns Outcome::Collided
Another client hashed to the same slot this round. Both payloads get XOR-corrupted; no observable message lands for either of you.
Fix: retry on the next round. See Publishing — Retry policy.
Persistent collisions are a signal that the deployment is oversubscribed
for its num_slots — an operator-side tuning problem, not a user one.
Collision probability per pair per round is 1 / num_slots; for N
clients the expected number of collisions per round is
C(N, 2) / num_slots.
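That estimate is easy to run for your own deployment size — the function below is just the formula above as code, not an SDK helper:

```rust
// Expected colliding pairs per round = C(N, 2) / num_slots,
// exactly as stated in the text.
fn expected_collisions(n_clients: u64, num_slots: u64) -> f64 {
    let pairs = (n_clients * (n_clients - 1) / 2) as f64;
    pairs / num_slots as f64
}

fn main() {
    // 10 active clients on 128 slots: ~0.35 colliding pairs per round.
    assert_eq!(expected_collisions(10, 128), 0.3515625);
    // 64 clients on 128 slots: 15.75 — badly oversubscribed.
    assert_eq!(expected_collisions(64, 128), 15.75);
}
```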
Symptom: publish returns Outcome::Dropped
The aggregator never forwarded your envelope into a committed aggregate. Usually transient:
- Aggregator was offline that round.
- Your registration hadn’t propagated yet (first few seconds after
bind).
Fix: retry. Repeated Dropped across many rounds means the
aggregator is unreachable from you — check the peer catalog and
bootstrap peers, then contact the operator.
Symptom: subscription sees no new rounds for a long time
Two possibilities:
1. The committee is stuck
The cluster is not finalizing rounds. Contact the operator.
2. Your binding hasn’t caught up yet
bind waits for the first live round roster before returning, so
once you have a Zipnet handle, round delivery should start at the
next round boundary. If it does not, you are not reaching the
broadcast collection’s group — same checks as for ConnectTimeout
(bootstrap, UDP egress, TDX).
Symptom: publish or subscribe returns Shutdown
The binding is closing. Either you called shutdown(), dropped every
clone of the handle, or the underlying Network went down.
Fix: shutdown is idempotent-ish — further calls keep returning
Shutdown. If this is unexpected, check that the Arc<Network> is
still alive and that no other part of your code called shutdown on
the handle.
Symptom: Error::Protocol(…) with an opaque string
The SDK bubbled up a lower-level mosaik or zipnet-protocol failure. The string content is for humans — do not pattern-match on it.
Fix: enable verbose logging and inspect the mosaik-layer event stream:
RUST_LOG=info,zipnet=debug,mosaik=info cargo run
If the root cause is in mosaik, the mosaik book has better diagnostics than this page can. Open a zipnet issue with the log excerpt if the failure looks zipnet-specific.
Symptom: subscriber lags and misses rounds
Your round handler is slower than the deployment’s round cadence.
Internal broadcast channels drop rounds rather than stall the SDK,
so you see gaps in round.id().
Fix: offload heavy per-round work to a separate task. See Reading — Gap detection and catch-up.
Symptom: my client compiled against one version, the operator upgraded
Mosaik pinned to =0.3.17 on both sides; zipnet and zipnet-proto
baselines must also match the deployment. If WIRE_VERSION or
round-parameter defaults change, your client derives different
internal IDs and bind returns ConnectTimeout.
Fix: keep your zipnet dep version aligned with the operator’s release notes. Mosaik stays pinned.
When to escalate to the operator
- bind consistently fails with ConnectTimeout after the name, bootstrap, and universe have all been verified.
- publish keeps returning Outcome::Dropped across many rounds.
- Your subscription opens but sees no rounds finalize over several round periods.
When you escalate, include:
- Your mosaik version (=0.3.17) and zipnet SDK version.
- The instance name you are binding to.
- Whether you built with tee-tdx and, if so, your client’s MR_TD.
- A 60-second log excerpt at RUST_LOG=info,zipnet=debug,mosaik=info.
API reference
audience: users
A compact reference of the surface the zipnet facade crate exposes.
Link-in-book pages cover the “how”; this page is the “what”.
The whole import story
Almost every user-side agent pulls from exactly one module:
use zipnet::{
// The universe constant.
UNIVERSE,
// The handle and its stream type.
Zipnet, BroadcastStream,
// Identifiers and macros.
UniqueId, NetworkId, Tag, unique_id, instance_id,
// Value types returned by publish / subscribe.
Receipt, Outcome, Round, Message,
// Protocol types re-exported from zipnet-proto.
BroadcastRecord, RoundId,
// Error model.
Error, Result,
};
The instance_id! macro is re-exported at the crate root via
#[macro_export], so zipnet::instance_id!("name") works alongside
the runtime zipnet::instance_id(name) function.
Constants
| Item | Type | Role |
|---|---|---|
| zipnet::UNIVERSE | NetworkId | The shared mosaik universe every zipnet deployment lives on. Build your Network against it. |
Handle
#[derive(Clone)]
pub struct Zipnet { /* opaque */ }
Cloneable; all clones share one driver task. Drop every clone or call
shutdown to tear down a binding.
| Method | Returns | Purpose |
|---|---|---|
| Zipnet::bind(&Arc<Network>, &str) | Result<Self> | Bind by instance name. |
| Zipnet::bind_by_id(&Arc<Network>, UniqueId) | Result<Self> | Bind by pre-derived id (use with instance_id!). |
| .publish(impl Into<Vec<u8>>) | Result<Receipt> | Publish a payload; resolves after the carrying round finalizes. |
| .subscribe() | Result<BroadcastStream> | Stream of finalized rounds. |
| .shutdown() | Result<()> | Flush in-flight publishes and tear down. |
See Publishing and Reading for usage patterns.
Identifier helpers
pub fn zipnet::instance_id(name: &str) -> UniqueId;
// macro:
pub macro zipnet::instance_id($name:literal) { /* compile-time */ }
Both produce identical bytes — blake3("zipnet." + name). Prefer the
macro when the name is a literal so typos fail at build time.
Value types
pub struct Receipt {
pub round: RoundId,
pub slot: usize,
pub outcome: Outcome,
}
pub enum Outcome {
Landed, // happy path
Collided, // slot collision; retry next round
Dropped, // aggregator never forwarded; retry
}
pub struct Round { /* opaque */ }
impl Round {
pub fn id(&self) -> RoundId;
pub fn messages(&self) -> impl Iterator<Item = Message> + '_;
pub fn raw(&self) -> &BroadcastRecord;
}
pub struct Message { /* opaque */ }
impl Message {
pub fn bytes(&self) -> &[u8];
pub fn slot(&self) -> usize;
}
pub struct BroadcastStream;
impl futures::Stream for BroadcastStream {
type Item = Round;
}
Round::messages() yields only slots that decoded cleanly — malformed
or colliding slots are filtered out inside the SDK. Round::raw()
escapes to the underlying BroadcastRecord for the rare case you need
it.
Errors
pub type Result<T, E = Error> = core::result::Result<T, E>;
#[derive(Debug, thiserror::Error)]
pub enum Error {
WrongUniverse { expected: NetworkId, actual: NetworkId },
ConnectTimeout,
Attestation(String),
Shutdown,
Protocol(String),
}
See Troubleshooting for a per-variant diagnostic checklist.
Re-exports from mosaik
| Item | From | Use |
|---|---|---|
| UniqueId | mosaik::UniqueId | Alias for 32-byte intent-addressed identifiers. |
| NetworkId | mosaik::NetworkId | Type of UNIVERSE and WrongUniverse fields. |
| Tag | mosaik::Tag | Peer-catalog tag type. Rarely needed directly. |
| unique_id! | mosaik::unique_id! | Compile-time UniqueId construction. |
Re-exports from zipnet-proto
| Item | Role |
|---|---|
| BroadcastRecord | The finalized round record inside a Round. |
| RoundId | Monotonic round counter; RoundId::next() to advance. |
What you do NOT import
- zipnet_node::* — committee and role internals. Users do not construct CommitteeMachines or run committee Raft groups.
- mosaik::groups::GroupKey — you do not have committee secrets.
- Any raw StreamId / StoreId / GroupId — the SDK derives them from the instance name. Do not try to pin them yourself.
If you find yourself reaching for these, you are probably writing an operator or contributor concern. Revisit What you need from the operator.
Version compatibility
| Dependency | Version | Note |
|---|---|---|
| mosaik | =0.3.17 | Pin exactly; minor versions change wire formats. |
| zipnet | follow the deployment’s release notes | Keep in lockstep with the operator’s version. |
| tokio | 1.x | Any compatible minor. |
| futures | 0.3 | For StreamExt::next on BroadcastStream. |
When the operator announces a deployment upgrade, they should publish the zipnet version to use. Users rebuild and redeploy in lockstep.
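Put together, the table maps onto a dependency section like this — the zipnet version is illustrative; follow the operator’s release notes:

```toml
[dependencies]
mosaik = "=0.3.17"   # exact pin; minor versions change wire formats
zipnet = "0.1"       # keep in lockstep with the deployment
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
futures = "0.3"      # StreamExt::next on BroadcastStream
```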
Deployment overview
audience: operators
A zipnet deployment runs as one service among many on a shared
mosaik universe — a single NetworkId that hosts zipnet alongside
other mosaik services. What you stand up is an instance of zipnet
under a short, namespaced name you pick (e.g. acme.mainnet).
Multiple instances coexist on the same universe concurrently, each
with its own committee, ACL, round parameters, and committee MR_TD.
If you haven’t yet, read the Quickstart — it walks you end-to-end from a fresh checkout to a live instance. This page gives the architectural background the runbooks later in this section refer back to.
The shared universe model
- The universe constant is zipnet::UNIVERSE = unique_id!("mosaik.universe"). Override only for an isolated federation via ZIPNET_UNIVERSE; in the common case, leave it alone.
- All your nodes — committee servers, aggregator, clients — join that same universe. Mosaik’s standard peer discovery (/mosaik/announce gossip plus the Mainline DHT bootstrap) handles reachability. You don’t configure streams, groups, or IDs by hand.
- The instance is identified by ZIPNET_INSTANCE (e.g. acme.mainnet). Every sub-ID — committee GroupId, submit StreamId, broadcasts StoreId — is derived from that name, so typos surface as ConnectTimeout rather than a config error.
Publishers bond to your instance knowing only three things: the
universe NetworkId, the instance name, and (for TDX-gated
deployments) your committee MR_TD. You hand those out in release
notes or docs; there is no on-network registry to publish to and
nothing to advertise.
Three node roles
A zipnet deployment has three kinds of nodes. You — the operator — will run at least the first two. The third is optional (most publishers are external users running their own clients).
| Role | Count | Trust status | Resource profile |
|---|---|---|---|
| Committee server | 3 or more (odd) | any-trust: at least one must be honest for anonymity; all must be up for liveness in v1 | low CPU, modest RAM, stable identity, low churn |
| Aggregator | 1 (v1) | untrusted for anonymity, trusted for liveness | higher CPU + bandwidth, can churn |
| Publishing client | many | TDX-attested in production; untrusted for liveness | ephemeral; any churn is tolerated |
What every node needs
- Outbound UDP to the internet (iroh / QUIC transport) and to mosaik relays.
- A few MB of RAM; committee servers need more during large-round replay.
- A clock within a few seconds of the rest of the universe (Raft tolerates skew but not arbitrary drift).
- `ZIPNET_INSTANCE=<name>` set to the same instance name on every node in that deployment.
What only committee servers need
- A stable `PeerId` across restarts. Set `ZIPNET_SECRET` to any string — it is hashed with blake3 to derive the node’s long-term iroh identity. Rotating it invalidates every bond.
- Access to the shared committee secret, passed as `ZIPNET_COMMITTEE_SECRET`. This gates admission to the Raft group. Distribute it out of band (vault, secrets manager, k8s secret). Anyone holding it can join the committee — treat it like a root credential.
- In production, a TDX host. Mosaik ships the TDX image builder; you call `mosaik::tee::tdx::build::ubuntu()` from your `build.rs` and get a launch script, initramfs, OVMF, and a precomputed MR_TD at build time. See the Quickstart’s TDX section.
- Durable storage is not required in v1 (state is in memory). A restarted server rejoins and catches up by snapshot.
What only aggregators need
- More network bandwidth than committee servers. The aggregator receives every client envelope and emits a single aggregate per round.
- A stable `PeerId` is strongly recommended — clients often use the aggregator as a discovery bootstrap.
- The aggregator does not need the committee secret. It is untrusted for anonymity.
What only clients need
- The universe `NetworkId`, instance name, and (for TDX-gated instances) your committee MR_TD. That is the whole handshake.
- A TDX host if the instance is TDX-gated. See Security posture checklist.
How the three talk
clients ── ClientEnvelope stream ─────► aggregator
│
AggregateEnvelope stream
│
▼
committee servers
│
Raft-replicated apply
│
▼
Broadcasts collection (readable by anyone)
Clients and the aggregator are not members of the committee’s Raft group; they observe the final broadcasts through a replicated collection.
Minimum viable deployment
Three committee servers + one aggregator + a handful of clients is the smallest deployment where anonymity holds meaningfully. Two committee servers will technically run but any one of them can deanonymize the set — stick to three or more.
TDX host A TDX host B TDX host C
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ zipnet- │ │ zipnet- │ │ zipnet- │
│ server #1 │ │ server #2 │ │ server #3 │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
└────────────────────┼────────────────────┘
│ Raft / mosaik group
▼
┌───────────────────┐
│ zipnet-aggregator │ (non-TDX host, well-connected)
└─────────┬─────────┘
│
▼
external publishers
(TDX where gated, else
operator-trusted hosts)
Each box runs ZIPNET_INSTANCE=acme.mainnet and joins
zipnet::UNIVERSE over iroh; mosaik discovery wires the rest.
Running many instances side by side
Operators routinely run several instances — production, a public testnet, internal dev — on the same universe. Each has its own instance name, its own committee, its own MR_TD pin, its own ACL. A host can run one instance or many; run a separate unit per instance:
systemctl start zipnet-server@acme-mainnet
systemctl start zipnet-server@preview.alpha
systemctl start zipnet-server@dev.ops
Each unit sets a different ZIPNET_INSTANCE; they share the universe
and the discovery layer, and appear to publishers as three distinct
Zipnet::bind targets.
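As a sketch, the template unit behind those commands could look like the following — the unit name matches the commands above, but the binary path and env-file layout are illustrative choices, not something zipnet ships:

```ini
# /etc/systemd/system/zipnet-server@.service (illustrative)
[Unit]
Description=zipnet committee server (instance %i)
After=network-online.target
Wants=network-online.target

[Service]
# One env file per instance, e.g. /etc/zipnet/acme-mainnet.env, holding
# ZIPNET_INSTANCE, ZIPNET_COMMITTEE_SECRET, ZIPNET_SECRET, round params.
EnvironmentFile=/etc/zipnet/%i.env
ExecStart=/usr/local/bin/zipnet-server
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target
```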
See also
- Running a committee server
- Running the aggregator
- Running a client
- Rotations and upgrades
- Designing coexisting systems on mosaik — full rationale for the shared-universe model, for operators who want to understand why the instance is the unit of identity.
Quickstart — stand up an instance
audience: operators
This page walks you from a fresh checkout to a live zipnet instance that external publishers can reach with one line of code. Read Deployment overview first for the architectural background; this page assumes it.
Who runs a zipnet instance
Typical deployments:
- A rollup or app offering an encrypted mempool. The team runs the committee; user wallets publish sealed transactions; the sequencer or builder reads them ordered and opaque-to-sender, and decrypts at block-build time via whatever mechanism they prefer (threshold decryption, TEE unsealing).
- An MEV auction team hosting a permissioned order-flow channel. The team runs the committee; whitelisted searchers publish intents; every connected builder reads the same ordered log.
- A governance coalition running anonymous signalling. The coalition runs the committee; delegated wallets signal anonymously; anyone can tally.
What’s common: you want a bounded participant set — which you authenticate via TEE attestation and a ticket class — to publish messages without any single party (yourself included) being able to link message to sender. You run the committee and the aggregator. Participants bring their own TEE-attested client software, typically from a TDX image you also publish.
One-paragraph mental model
Zipnet runs as one service among many on a shared mosaik universe
— a single NetworkId that hosts zipnet alongside other mosaik
services (signers, storage, oracles). Your job as an operator is to
stand up an instance of zipnet under a name you pick (e.g.
acme.mainnet) and keep it running. External agents bind to your
instance with Zipnet::bind(&network, "acme.mainnet") — they compile
the name in from their side, so there is no registry to publish to
and nothing to advertise. Your servers simply need to be reachable.
What you’re running
A minimum instance is:
| Role | Count | Hosted where |
|---|---|---|
| Committee server | 3 or more (odd) | TDX-enabled hosts you operate |
| Aggregator | 1 (v1) | Any host with outbound UDP |
| (optional) Your own publishing clients | any | TDX-enabled if the instance is gated |
All of these join the same shared mosaik universe. The committee and aggregator advertise on the shared peer catalog; external publishers reach them through mosaik’s discovery without any further config from you.
What defines your instance
Your instance is fully identified by three pieces of configuration:
| # | Field | Notes |
|---|---|---|
| 1 | instance name | Short, stable, namespaced string (e.g. acme.mainnet). Folds into the committee GroupId, submit StreamId, and broadcasts StoreId. |
| 2 | universe NetworkId | Almost always zipnet::UNIVERSE. Override only if you run an isolated federation. |
| 3 | ticket class | What publishers must present: TDX MR_TD, JWT issuer, or both. Also folds into GroupId. |
Round parameters (num_slots, slot_bytes, round_period,
round_deadline) are configured per-instance via env vars and
published at runtime in the LiveRoundCell collection that
publishers read. They are immutable for the instance’s lifetime —
bumping any of them requires a new instance name.
Items 1 and 3 fold into the instance’s derived IDs. Change either and the instance’s identity changes, meaning publishers compiled against the old values can no longer bond. See Designing coexisting systems on mosaik for the derivation.
Minimal smoke test
Before you touch hardware, confirm the pipeline works end-to-end on your laptop. The deterministic check is the integration test that exercises three committee servers + one aggregator + two clients over real mosaik transports in one tokio runtime:
cargo test -p zipnet-node --test e2e one_round_end_to_end
A green run in roughly 10 seconds tells you the crypto, consensus, round lifecycle, and mosaik transport are all healthy in your checkout. If it fails, nothing else on this page is going to work — investigate before touching hardware.
Exercising the binaries directly (optional)
If you want to watch the three role binaries run as separate processes — useful for shaking out systemd units, env vars, or firewall rules — bootstrap them by hand on one host. Localhost discovery over fresh iroh relays is slow, so give the first round up to a minute to land.
# terminal 1 — seed committee server; grab its peer= line from stdout
ZIPNET_INSTANCE="dev.local" \
ZIPNET_COMMITTEE_SECRET="dev-committee-secret" \
ZIPNET_SECRET="seed-1" \
./target/debug/zipnet-server
# terminals 2+3 — remaining committee servers, bootstrapped off #1
ZIPNET_INSTANCE="dev.local" \
ZIPNET_COMMITTEE_SECRET="dev-committee-secret" \
ZIPNET_SECRET="seed-2" \
ZIPNET_BOOTSTRAP=<peer_id_from_terminal_1> \
./target/debug/zipnet-server
# terminal 4 — aggregator
ZIPNET_INSTANCE="dev.local" \
ZIPNET_BOOTSTRAP=<peer_id_from_terminal_1> \
./target/debug/zipnet-aggregator
# terminal 5 — reference publisher
ZIPNET_INSTANCE="dev.local" \
ZIPNET_BOOTSTRAP=<peer_id_from_terminal_1> \
ZIPNET_MESSAGE="hello from the smoke test" \
./target/debug/zipnet-client
A healthy run prints round finalized on the committee servers
within a minute and the client’s payload echoes back on the
subscriber side. TDX is off in this mode — production instances
re-enable it (see below).
What every server process does for you
When `zipnet-server` starts, it:
- Joins the shared universe network (`zipnet::UNIVERSE`, or whatever you set `ZIPNET_UNIVERSE` to).
- Derives every instance-local id from `ZIPNET_INSTANCE` — committee `GroupId`, the submit stream, the broadcasts collection, the registries.
- Bonds with its peers using the committee secret and TDX measurement.
- Advertises itself on the shared peer catalog via mosaik’s standard `/mosaik/announce` gossip. Publishers that compile in the same instance name reach the same `GroupId` and bond automatically.
- Accepts rounds from the aggregator and replicates broadcasts through the committee Raft group.
You do not configure streams, collections, or group ids by hand, and you do not publish an announcement anywhere. The instance name is the only piece of identity you manage; everything else is either derived or taken care of by mosaik.
Building a TDX image (production path)
For production, every committee server and every publishing client runs inside a TDX guest. Mosaik ships the image builder — you do not compose QEMU, OVMF, kernels, and initramfs yourself, and you do not compute MR_TD by hand.
In the committee server crate’s build.rs:
// crates/zipnet-server/build.rs
fn main() {
mosaik::tee::tdx::build::ubuntu()
.with_default_memory_size("4G")
.build();
}
Add to Cargo.toml:
[dependencies]
mosaik = { version = "0.3", features = ["tdx"] }
[build-dependencies]
mosaik = { version = "0.3", features = ["tdx-builder-ubuntu"] }
After cargo build --release you get, in
target/release/tdx-artifacts/zipnet-server/ubuntu/:
| Artifact | What it’s for |
|---|---|
| `zipnet-server-run-qemu.sh` | Self-extracting launcher. This is what you invoke on a TDX host. |
| `zipnet-server-mrtd.hex` | The 48-byte measurement. Publishers pin against this. |
| `zipnet-server-vmlinuz` | Raw kernel, in case you repackage. |
| `zipnet-server-initramfs.cpio.gz` | Raw initramfs. |
| `zipnet-server-ovmf.fd` | Raw OVMF firmware. |
Mosaik computes MR_TD at build time by parsing the OVMF, the kernel and the initramfs according to the TDX spec — the same value the TDX hardware will report at runtime. You ship this hex string alongside your announcement; a client whose own image does not measure to the same MR_TD cannot join the instance. See users/handshake-with-operator for the matching client-side flow.
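The attestation check itself happens inside mosaik, but the pin comparison on the consuming side amounts to a hex check like this hypothetical helper (not a mosaik API):

```rust
/// Illustrative check of a pinned MR_TD against the value a guest reports.
/// Normalizing case and whitespace matters because the pin travels through
/// release notes and copy-paste; the length check enforces 48 bytes.
fn mrtd_matches(pinned_hex: &str, reported_hex: &str) -> bool {
    let norm = |s: &str| s.trim().to_ascii_lowercase();
    let (a, b) = (norm(pinned_hex), norm(reported_hex));
    // MR_TD is 48 bytes, i.e. 96 hex characters.
    a.len() == 96 && a.chars().all(|c| c.is_ascii_hexdigit()) && a == b
}

fn main() {
    let pin = "ab".repeat(48); // 96 hex chars = 48 bytes
    assert!(mrtd_matches(&pin, &pin.to_ascii_uppercase())); // case-insensitive
    assert!(!mrtd_matches(&pin, &"cd".repeat(48)));         // wrong measurement
    assert!(!mrtd_matches("deadbeef", "deadbeef"));         // wrong length
}
```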
Alpine variant (mosaik::tee::tdx::build::alpine(), feature
tdx-builder-alpine) produces a ~5 MB image versus Ubuntu’s ~25 MB,
at the cost of musl. Use Alpine for publishers where image size
matters; keep Ubuntu for committee servers unless you have a specific
reason otherwise.
Instance naming and your users’ handshake
Publishers bond to your instance by knowing three things: the
universe NetworkId, the instance name, and (if TDX-gated) the
MR_TD of your committee image. That is the complete handoff — no
registry, no dynamic lookup, no on-network advertisement.
Publish these via whatever channel suits your users: release notes,
a docs page, direct handoff in a setup email. Users bake the
instance name (or its derived UniqueId) into their code at compile
time.
Instance names share a flat namespace per universe. Two operators
picking the same name collide in the committee group and neither
works correctly — mosaik has no mechanism to prevent this and no way
to tell you it happened. Namespace aggressively: <org>.<purpose>.<env>,
for example acme.mixer.mainnet. If in doubt, include an
irrevocable random suffix once and forget about it
(acme.mixer.mainnet.8f3c1a).
Retiring an instance is just stopping every server under that name.
Publishers still trying to bond will see ConnectTimeout; they
update their code to the new name and rebuild.
Going live
Once the smoke test passes on staging hardware:
- Build your production TDX images (committee + client). Publish the two `mrtd.hex` values to whatever channel your users consume (docs site, release notes, signed announcement).
- Stand up three TDX committee servers on geographically separate hosts, with the production `ZIPNET_INSTANCE` and `ZIPNET_COMMITTEE_SECRET`.
- Stand up the aggregator on a non-TDX but well-connected host.
- Verify the committee has elected a leader and the aggregator is bonded to the submit stream. Your own aggregator metrics are the easiest check; on the committee side, exactly one server should report `mosaik_groups_leader_is_local = 1`.
- Hand publishers your instance name, one universe bootstrap `PeerId`, and (if TDX-gated) your committee MR_TD. That is the entirety of their onboarding.
Running many instances side by side
Operators routinely run several instances — production, a public testnet, internal dev — on the same universe. Each has its own instance name, its own committee, its own MR_TD pin, its own ACL. A host can run one instance or many; run a separate unit per instance:
systemctl start zipnet-server@acme-mainnet
systemctl start zipnet-server@preview.alpha
systemctl start zipnet-server@dev.ops
Each unit sets a different ZIPNET_INSTANCE; they share the universe
and the discovery layer, and appear to publishers as three distinct
Zipnet::bind targets.
Next reading
- Running a committee server — every environment variable and what it does.
- Running the aggregator — the untrusted-but-load-bearing node.
- Rotations and upgrades — retiring an instance, rebuilding TDX images, rotating committee secrets.
- Monitoring and alerts — the metrics that matter in production.
- Incident response — stuck rounds, split brain, expired MR_TDs.
- Security posture checklist — what committee operators must protect.
- Designing coexisting systems on mosaik — the shared-universe model in full, for operators who want to understand why the instance is the unit of identity.
audience: operators
End-to-end deploy example — one TDX host
A worked, copy-pasteable runbook that stands up a complete zipnet
instance on a single TDX-capable host reachable at
ubuntu@tdx-host. The topology is the minimum viable deployment:
three committee servers, one aggregator, one reference publisher,
all co-located as separate TDX guests (plus one non-TDX process for
the aggregator) on the same physical host.
Use this recipe for staging, integration, or a demo. For production, split the three committee servers onto three independently-operated TDX hosts — the steps per host are identical; only the bootstrap wiring changes.
What you are about to build
ubuntu@tdx-host (one physical TDX server)
┌──────────────────────────────────────────────────────────────┐
│ TDX guest #1 TDX guest #2 TDX guest #3 │
│ zipnet-server-1 zipnet-server-2 zipnet-server-3 │
│ │ │ │ │
│ └────── Raft / mosaik group (committee) ──┘ │
│ │ │
│ ┌─────────────▼──────────────┐ │
│ │ zipnet-aggregator (no TDX) │ │
│ └─────────────┬──────────────┘ │
│ │ │
│ ┌────────────▼────────────┐ │
│ │ TDX guest #4 │ │
│ │ zipnet-client (demo) │ │
│ └─────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
The instance name used throughout is demo.tdx. Swap it for your
own namespaced name before running anything in production
(<org>.<purpose>.<env>; see
Quickstart — naming the instance).
Prerequisites
On your workstation:
- A checkout of this repo.
- Rust 1.93 (`rustup show` confirms `rust-toolchain.toml`).
- SSH access to the host: `ssh ubuntu@tdx-host` returns a shell.
- `scp` and `rsync` available locally.
On ubuntu@tdx-host:
- Bare-metal or cloud host with Intel TDX enabled in BIOS and a TDX kernel installed. `ls /dev/tdx_guest` exists on the host and the kernel module `kvm_intel` is loaded with `tdx=Y`. If you are unsure, run `dmesg | grep -i tdx`.
- `qemu-system-x86_64` at a version the mosaik launcher supports (8.2+). The launcher script will tell you if the local QEMU is too old.
- A user that can access `/dev/kvm` and `/dev/tdx_guest` without root. On Ubuntu, add `ubuntu` to the `kvm` and `tdx` groups.
- `tmux` (used below to keep each role’s logs visible). Any process supervisor works — systemd user units, `screen`, `nohup`. The commands that follow use `tmux` because it is the lowest-ceremony option.
- Outbound UDP to the internet for iroh / QUIC and mosaik relays. No inbound ports need to be opened — mosaik’s hole-punching layer handles reachability.
Two small decisions fixed for this example:
| Knob | Value used here | Why |
|---|---|---|
| `ZIPNET_INSTANCE` | `demo.tdx` | Short, obvious, collision-unlikely. Rename freely. |
| `ZIPNET_COMMITTEE_SECRET` | `openssl rand -hex 32` once, pasted into the env for all three servers | Shared admission secret for the committee. Clients and the aggregator must not see this value. |
| `ZIPNET_MIN_PARTICIPANTS` | `1` | So the single demo client triggers rounds. Raise to >= 2 for real anonymity. |
| `ZIPNET_ROUND_PERIOD` | `3s` | Enough headroom on a shared host to see logs land in order. |
Step 1 — Build the TDX artifacts on your workstation
From the repo root, build everything release-mode. The build.rs
scripts in zipnet-server and zipnet-client invoke the mosaik
TDX builder and drop launchable artifacts under
target/release/tdx-artifacts/.
cargo build --release
When this finishes you have:
target/release/
zipnet-aggregator # plain binary; runs on any host
tdx-artifacts/
zipnet-server/ubuntu/
zipnet-server-run-qemu.sh # self-extracting launcher
zipnet-server-mrtd.hex # 48-byte committee measurement
zipnet-server-vmlinuz
zipnet-server-initramfs.cpio.gz
zipnet-server-ovmf.fd
zipnet-client/alpine/
zipnet-client-run-qemu.sh
zipnet-client-mrtd.hex # 48-byte client measurement
zipnet-client-vmlinuz
zipnet-client-initramfs.cpio.gz
zipnet-client-ovmf.fd
Record both mrtd.hex values — these are the MR_TDs you will
publish to readers alongside the instance name.
SERVER_MRTD=$(cat target/release/tdx-artifacts/zipnet-server/ubuntu/zipnet-server-mrtd.hex)
CLIENT_MRTD=$(cat target/release/tdx-artifacts/zipnet-client/alpine/zipnet-client-mrtd.hex)
echo "committee MR_TD: $SERVER_MRTD"
echo "client MR_TD: $CLIENT_MRTD"
Step 2 — Copy artifacts to the host
ssh ubuntu@tdx-host 'mkdir -p ~/zipnet/{server,client,aggregator,logs}'
rsync -avz --delete \
target/release/tdx-artifacts/zipnet-server/ubuntu/ \
ubuntu@tdx-host:~/zipnet/server/
rsync -avz --delete \
target/release/tdx-artifacts/zipnet-client/alpine/ \
ubuntu@tdx-host:~/zipnet/client/
scp target/release/zipnet-aggregator \
ubuntu@tdx-host:~/zipnet/aggregator/
The launcher scripts are self-extracting — they embed kernel,
initramfs, and OVMF. You do not need to copy the raw vmlinuz /
initramfs / ovmf.fd files unless you plan to repackage.
Step 3 — Pick a committee secret
On the TDX host, once, generate the shared committee secret and park it in a file you will source into each server’s environment. Anyone with this value can join the committee, so treat it as a root credential.
ssh ubuntu@tdx-host
# on the host
umask 077
openssl rand -hex 32 > ~/zipnet/committee-secret
chmod 600 ~/zipnet/committee-secret
Step 4 — Start the first committee server and capture its PeerId
The first server has no one to bootstrap against, so it starts
without ZIPNET_BOOTSTRAP. Its startup line prints
peer=<hex>… — capture that and reuse it as the bootstrap hint for
every following process.
Open a tmux session on the host and start server 1:
# on the host
tmux new-session -d -s zipnet-s1 -n server-1
tmux send-keys -t zipnet-s1:server-1 "
ZIPNET_INSTANCE=demo.tdx \
ZIPNET_COMMITTEE_SECRET=\$(cat ~/zipnet/committee-secret) \
ZIPNET_SECRET=server-1-seed \
ZIPNET_MIN_PARTICIPANTS=1 \
ZIPNET_ROUND_PERIOD=3s \
ZIPNET_ROUND_DEADLINE=15s \
RUST_LOG=info,zipnet_node=info \
~/zipnet/server/zipnet-server-run-qemu.sh 2>&1 | tee ~/zipnet/logs/server-1.log
" C-m
Wait five or ten seconds for the TDX guest to come up, then pull the PeerId out of the log:
# on the host
BOOTSTRAP=$(grep -oE 'peer=[0-9a-f]{10,}' ~/zipnet/logs/server-1.log | head -1 | cut -d= -f2)
echo "bootstrap peer: $BOOTSTRAP"
If $BOOTSTRAP is empty, the guest has not finished booting — the
first round of QEMU + TDX can take 30 s on a cold host. Re-run the
grep after a beat.
What if I don’t see the `peer=` line? The self-extracting launcher prints its own boot banner first. The zipnet line (`zipnet up: network=<universe> instance=demo.tdx peer=...`) only appears once the binary inside the guest has announced. If it is still missing after a minute, `less ~/zipnet/logs/server-1.log` and look for QEMU-level errors — typically TDX not enabled, or `/dev/kvm` permissions.
Step 5 — Start the remaining two committee servers
Each server gets a distinct ZIPNET_SECRET (so each derives a
unique PeerId) and bootstraps against server 1.
# on the host — still inside your SSH session
tmux new-session -d -s zipnet-s2 -n server-2
tmux send-keys -t zipnet-s2:server-2 "
ZIPNET_INSTANCE=demo.tdx \
ZIPNET_COMMITTEE_SECRET=\$(cat ~/zipnet/committee-secret) \
ZIPNET_SECRET=server-2-seed \
ZIPNET_BOOTSTRAP=$BOOTSTRAP \
ZIPNET_MIN_PARTICIPANTS=1 \
ZIPNET_ROUND_PERIOD=3s \
ZIPNET_ROUND_DEADLINE=15s \
RUST_LOG=info,zipnet_node=info \
~/zipnet/server/zipnet-server-run-qemu.sh 2>&1 | tee ~/zipnet/logs/server-2.log
" C-m
tmux new-session -d -s zipnet-s3 -n server-3
tmux send-keys -t zipnet-s3:server-3 "
ZIPNET_INSTANCE=demo.tdx \
ZIPNET_COMMITTEE_SECRET=\$(cat ~/zipnet/committee-secret) \
ZIPNET_SECRET=server-3-seed \
ZIPNET_BOOTSTRAP=$BOOTSTRAP \
ZIPNET_MIN_PARTICIPANTS=1 \
ZIPNET_ROUND_PERIOD=3s \
ZIPNET_ROUND_DEADLINE=15s \
RUST_LOG=info,zipnet_node=info \
~/zipnet/server/zipnet-server-run-qemu.sh 2>&1 | tee ~/zipnet/logs/server-3.log
" C-m
Within 15–30 s, one of the three servers should log
committee: opening round at index I_1. That one is the current
Raft leader; the other two are followers. Which server wins the
election is not deterministic — do not special-case the first
server as “always the leader”.
Confirm the committee is healthy:
# on the host
grep -E 'zipnet up|leader|round' ~/zipnet/logs/server-*.log | tail -20
Step 6 — Start the aggregator
The aggregator is the only non-TDX process. It bootstraps against any committee server and must not be given the committee secret.
# on the host
tmux new-session -d -s zipnet-agg -n aggregator
tmux send-keys -t zipnet-agg:aggregator "
ZIPNET_INSTANCE=demo.tdx \
ZIPNET_SECRET=aggregator-seed \
ZIPNET_BOOTSTRAP=$BOOTSTRAP \
ZIPNET_FOLD_DEADLINE=2s \
RUST_LOG=info,zipnet_node=info \
~/zipnet/aggregator/zipnet-aggregator 2>&1 | tee ~/zipnet/logs/aggregator.log
" C-m
A healthy aggregator settles quickly and logs
aggregator booting; waiting for collections to come online
within a few seconds.
Step 7 — Start the reference client
# on the host
tmux new-session -d -s zipnet-c1 -n client-1
tmux send-keys -t zipnet-c1:client-1 "
ZIPNET_INSTANCE=demo.tdx \
ZIPNET_BOOTSTRAP=$BOOTSTRAP \
ZIPNET_MESSAGE='hello from ubuntu@tdx-host' \
ZIPNET_CADENCE=1 \
RUST_LOG=info,zipnet_node=info \
~/zipnet/client/zipnet-client-run-qemu.sh 2>&1 | tee ~/zipnet/logs/client-1.log
" C-m
Within one ZIPNET_ROUND_PERIOD (3s here) after the aggregator
bonds, the Raft leader should print:
INFO zipnet_node::committee: committee: opening round at index I_1
INFO zipnet_node::roles::server: submitted partial unblind at I_2
INFO zipnet_node::committee: committee: round finalized round=r1 participants=1
Step 8 — Verify end-to-end
From the host, tail all four log streams at once:
# on the host
tail -F ~/zipnet/logs/server-*.log ~/zipnet/logs/aggregator.log ~/zipnet/logs/client-1.log
You are looking for:
| Signal | Where | Meaning |
|---|---|---|
| `zipnet up: network=<universe> instance=demo.tdx` | every role | Universe join and instance binding succeeded. |
| `mosaik_groups_leader_is_local = 1` on exactly one server (Prometheus or log line) | server logs | Committee has a single Raft leader. |
| `aggregator: forwarded aggregate to committee round=rN participants=1` | aggregator | Client envelopes reached the aggregator and were folded. |
| `committee: round finalized round=rN participants=1` | whichever server is leader | End-to-end round closed; broadcast published into the Broadcasts collection. |
Once you see round finalized with a non-zero participants
count, the topology is working.
Cleanup
# on the host
for s in zipnet-s1 zipnet-s2 zipnet-s3 zipnet-agg zipnet-c1; do
tmux kill-session -t $s 2>/dev/null || true
done
Each TDX guest emits a departure announcement over gossip on SIGTERM, and Raft tolerates the loss as long as a majority remains; `tmux kill-session` sends SIGTERM to the foreground QEMU process, which in turn signals the guest.
If a guest is wedged, pkill -f zipnet-server-run-qemu.sh is safe
— all in-memory state is disposable in v1.
What to change for a real deployment
This example collapses a three-node committee onto one host to keep the runbook short. To roll the same shape into production:
- Replace `ubuntu@tdx-host` with three separate TDX hosts `ubuntu@tdx-1`, `ubuntu@tdx-2`, `ubuntu@tdx-3` run by three independent operators (or at minimum, with three independent blast radii). Geographic separation is the point.
- Run the aggregator on a fourth, non-TDX but well-connected host. Clients will often use it as a bootstrap; pick something with a stable address.
- Swap `tmux` for systemd unit files — one per role — so crash recovery is automatic. See Running a committee server for the full production env matrix.
- Bump `ZIPNET_MIN_PARTICIPANTS` to at least `2`. A single client produces no anonymity.
- Publish the instance name, universe `NetworkId`, and the two MR_TDs (`$SERVER_MRTD`, `$CLIENT_MRTD`) to your users through release notes or a signed announcement. That is the entire onboarding handoff; see What you need from the operator for the matching reader side.
See also
- Quickstart — stand up an instance — the conceptual walk-through this page makes concrete.
- Running a committee server — every env var and metric for the server role.
- Running the aggregator — capacity planning and the single-aggregator caveat.
- Running a client — the reference client you ship to publishers.
- Rotations and upgrades — rolling a new MR_TD, rotating the committee secret, retiring an instance.
- Monitoring and alerts — what to watch once the topology above is in production.
Running a committee server
audience: operators
A committee server joins the Raft group that orchestrates the
instance’s rounds, holds one of the X25519 keys used to unblind the
broadcast vector, and publishes its public bundle into the replicated
ServerRegistry. In production it runs inside a TDX guest built from
the mosaik image builder; see the
Quickstart TDX section.
One-shot command
ZIPNET_INSTANCE="acme.mainnet" \
ZIPNET_COMMITTEE_SECRET="your-committee-secret" \
ZIPNET_SECRET="stable-node-seed" \
ZIPNET_MIN_PARTICIPANTS=2 \
ZIPNET_ROUND_PERIOD=3s \
ZIPNET_ROUND_DEADLINE=15s \
./zipnet-server --bootstrap <peer_id_of_another_server>
On a fresh universe with no existing seed peers, start the first
server without --bootstrap, grab the peer=… value printed at
startup, and pass it as --bootstrap to the remaining servers. Every
subsequent server, aggregator, or client can be bootstrapped off any
one of them. After the universe has settled, the mosaik discovery
layer finds peers on its own and the bootstrap hint is only needed
for cold starts.
Environment variables
The full list lives in Environment variables. The ones you will actually set in production:
| Variable | Meaning | Notes |
|---|---|---|
| `ZIPNET_INSTANCE` | Instance name this server serves | Required. Short, stable, namespaced (e.g. `acme.mainnet`). Must match across the whole deployment. |
| `ZIPNET_UNIVERSE` | Universe override | Optional. Leave unset to use `zipnet::UNIVERSE` (the shared mosaik universe). Set only for isolated federations. |
| `ZIPNET_COMMITTEE_SECRET` | Shared committee admission secret | Treat as a root credential. Identical on every committee member of this instance. |
| `ZIPNET_SECRET` (or `--secret`) | Seed for this node’s stable PeerId | Unique per node. Anything not 64-hex is blake3-hashed. |
| `ZIPNET_BOOTSTRAP` | Peer IDs to dial on startup | Helpful on cold universes; unnecessary once discovery has converged. |
| `ZIPNET_MIN_PARTICIPANTS` | Minimum clients before the leader opens a round | Default 1. Set to at least 2 for meaningful anonymity. |
| `ZIPNET_ROUND_PERIOD` | How often the leader attempts to open a round | e.g. `2s`, `500ms`. |
| `ZIPNET_ROUND_DEADLINE` | Max time a round may stay open | e.g. `15s`. The leader will force-advance a stuck round. |
| `ZIPNET_METRICS` | Bind address for the Prometheus exporter | e.g. `0.0.0.0:9100`. |
| `RUST_LOG` | Log filter | Sane default: `info,zipnet_node=info,mosaik=warn`. |
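The `ZIPNET_SECRET` normalization rule in the table — a 64-hex value is used as the raw 32-byte seed, anything else is hashed down to 32 bytes first — can be sketched as follows. The FNV-style fold below is a stand-in only; the real implementation uses blake3.

```rust
/// Illustrative sketch of the documented ZIPNET_SECRET rule: 64 hex chars
/// decode directly to the 32-byte seed; anything else is hashed to 32 bytes
/// (blake3 in the real implementation; an FNV-1a fold stands in here).
fn seed_from_secret(secret: &str) -> [u8; 32] {
    let is_64_hex = secret.len() == 64 && secret.chars().all(|c| c.is_ascii_hexdigit());
    let mut out = [0u8; 32];
    if is_64_hex {
        for (i, chunk) in secret.as_bytes().chunks(2).enumerate() {
            let hi = (chunk[0] as char).to_digit(16).unwrap() as u8;
            let lo = (chunk[1] as char).to_digit(16).unwrap() as u8;
            out[i] = hi << 4 | lo;
        }
    } else {
        // Stand-in hash: NOT blake3, just enough to show the shape.
        let mut acc: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a offset basis
        for (i, b) in secret.bytes().cycle().take(32 * 8).enumerate() {
            acc = (acc ^ b as u64).wrapping_mul(0x100_0000_01b3); // FNV prime
            if i % 8 == 7 {
                out[i / 8] = (acc & 0xff) as u8;
            }
        }
    }
    out
}

fn main() {
    // A 64-hex secret decodes to itself, byte for byte.
    let hex = "00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff";
    assert_eq!(seed_from_secret(hex)[0], 0x00);
    assert_eq!(seed_from_secret(hex)[1], 0x11);
    // A human-readable secret is hashed: same input, same seed; distinct
    // inputs, distinct seeds (hence distinct PeerIds per node).
    assert_eq!(seed_from_secret("server-1-seed"), seed_from_secret("server-1-seed"));
    assert_ne!(seed_from_secret("server-1-seed"), seed_from_secret("server-2-seed"));
}
```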
Naming the instance
Instance names share a flat namespace per universe. Two operators
picking the same name collide in the same committee group and
neither deployment works — mosaik has no way to prevent or detect
this. Namespace aggressively: <org>.<purpose>.<env>, for example
acme.mixer.mainnet. If unsure, add a random suffix once and forget
about it (acme.mixer.mainnet.8f3c1a).
What a healthy startup looks like
INFO zipnet_server: spawning zipnet server server=a2095bed48
INFO zipnet_node::roles::common: zipnet up: network=<universe> instance=acme.mainnet peer=f5e28a69e6... role=3b37e5d575...
INFO zipnet_node::roles::server: server booting; waiting for collections + group
INFO zipnet_node::committee: committee: opening round at index I_1
INFO zipnet_node::roles::server: submitted partial unblind at I_2
INFO zipnet_node::committee: committee: round finalized round=r1 participants=N
A server that has been up for more than a minute and has not printed
round finalized yet is almost always waiting on one of:
- Client count below `ZIPNET_MIN_PARTICIPANTS`. Check the aggregator’s `zipnet_client_registry_size` metric.
- Committee group has not elected a leader. Check `mosaik_groups_leader_is_local` on each server; exactly one should be 1.
- Bundle tickets not replicated. See Incident response — stuck rounds.
Resource profile
A single-slot round at the default RoundParams (64 slots × 256
bytes = 16 KiB broadcast vector) with 100 clients uses roughly:
- CPU: a burst of ~5 ms per round per client (pad derivation dominates).
- RAM: O(N) client bundles × 64 bytes + a ring buffer of recent aggregates.
- Network: inbound one aggregate envelope per round (+ Raft heartbeat traffic between servers), outbound one partial per round + Raft replication to followers.
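The 16 KiB figure is just the product of the default round parameters:

```rust
fn main() {
    // Default RoundParams quoted above.
    let (num_slots, slot_bytes) = (64usize, 256usize);
    let vector_bytes = num_slots * slot_bytes;
    assert_eq!(vector_bytes, 16_384);      // 64 × 256 bytes
    assert_eq!(vector_bytes / 1024, 16);   // = 16 KiB broadcast vector
}
```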
Graceful shutdown
Send SIGTERM. The server emits a departure announcement over
gossip so peers learn within the next announce cycle (default 15 s)
that it is gone. Raft proceeds with the remaining quorum provided a
majority is still up.
Availability warning
In v1, any committee server going offline halts round progression because the state machine waits for one partial per server listed in the round’s roster. This is by design — the paper’s any-trust model prioritizes correctness over liveness. A v2 improvement is sketched in Roadmap to v2.
See also
- Running the aggregator — the other always-on node.
- Rotations and upgrades — rolling restarts, key rotation, adding/removing members.
- Monitoring and alerts — what to put on your dashboard.
- Incident response — when things go wrong.
Running the aggregator
audience: operators
The aggregator receives every client envelope for the live round,
XORs them into a single AggregateEnvelope, and forwards that to the
committee. It is untrusted for anonymity — compromising it only
affects liveness and round-membership accounting, never whether a
message can be linked to its sender. It is trusted for liveness:
if it stops, rounds stop.
In v1 there is exactly one aggregator per instance. It does not need to run inside a TDX guest (though you can if your ops story prefers uniformity).
One-shot command
```shell
ZIPNET_INSTANCE="acme.mainnet" \
ZIPNET_SECRET="stable-agg-seed" \
ZIPNET_FOLD_DEADLINE=2s \
./zipnet-aggregator --bootstrap <peer_id_of_a_committee_server>
```
Environment variables
| Variable | Meaning | Notes |
|---|---|---|
| `ZIPNET_INSTANCE` | Instance name this aggregator serves | Required. Must match the committee’s. Typos show up as `ConnectTimeout` at round-open time. |
| `ZIPNET_UNIVERSE` | Universe override | Optional; leave unset to use the shared universe. |
| `ZIPNET_SECRET` (or `--secret`) | Seed for this aggregator’s stable PeerId | Strongly recommended: clients often use the aggregator as a discovery bootstrap. |
| `ZIPNET_BOOTSTRAP` | Peer IDs to dial on startup | At least one committee server on a cold universe. |
| `ZIPNET_FOLD_DEADLINE` | Time window to collect envelopes after a round opens | Default 2s. Raising it admits slower clients at the cost of latency. |
| `ZIPNET_METRICS` | Prometheus bind address | Optional. |
The aggregator does not take ZIPNET_COMMITTEE_SECRET. It is
outside the committee’s trust boundary by design; do not give it
that secret even if your secret store makes it convenient.
What a healthy aggregator log looks like
```
INFO zipnet_node::roles::common: zipnet up: network=<universe> instance=acme.mainnet peer=4c210e8340... role=5ef6c4ada2...
INFO zipnet_node::roles::aggregator: aggregator booting; waiting for collections to come online
INFO zipnet_node::roles::aggregator: aggregator: forwarded aggregate to committee round=r1 participants=3
INFO zipnet_node::roles::aggregator: aggregator: forwarded aggregate to committee round=r2 participants=3
...
```
Capacity planning
Per round the aggregator:
- Receives N × B bytes from clients, where N is the number of active clients and B is the broadcast vector size (defaults to 16 KiB).
- Sends one aggregate of size B to every committee server.
If the committee is 5 servers and the instance has 1000 clients with default parameters:
- Inbound per round ≈ 1000 × 16 KiB = 16 MiB.
- Outbound per round ≈ 5 × 16 KiB = 80 KiB.
At a 2 s round cadence, inbound averages 64 Mbit/s. Provision accordingly.
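The arithmetic generalizes to other fleet sizes and cadences. A quick sanity-check sketch — the function is ours for illustration, not part of the SDK:

```rust
/// Average inbound bandwidth at the aggregator, in Mibit/s.
/// `clients` envelopes of `vector_bytes` each arrive once per round of `round_secs`.
fn inbound_mibit_per_s(clients: u64, vector_bytes: u64, round_secs: f64) -> f64 {
    (clients * vector_bytes * 8) as f64 / round_secs / (1024.0 * 1024.0)
}

fn main() {
    // 1000 clients × 16 KiB broadcast vector at a 2 s cadence
    let rate = inbound_mibit_per_s(1000, 16 * 1024, 2.0);
    println!("{rate:.1} Mibit/s"); // ≈ 62.5, in line with the ~64 Mbit/s figure above
}
```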
Graceful shutdown
Send `SIGTERM`. Envelopes that had not yet been folded into the
current round’s aggregate are dropped on the floor; the affected
clients retry automatically on the next round.
Because the aggregator is a single point of failure for liveness in
v1, plan restarts against your monitoring: a round stall of
3 × ROUND_PERIOD + ROUND_DEADLINE triggers the stuck-round alert
documented in Monitoring.
What if I want two aggregators?
Not supported in v1. Running two on the same instance name gets you two processes competing for the submit stream, not load-balancing. If you need redundancy today, fail over with a warm-standby host behind a process supervisor — not two live aggregators. A multi-tier aggregator tree is sketched in Roadmap to v2 — Multi-tier aggregators.
See also
- Running a committee server
- Incident response — aggregator crash-loop, OOM, partition handling.
- Monitoring and alerts — aggregator-relevant metrics.
Running a client
audience: operators
The typical zipnet publisher is an external user running their own
TDX-attested agent — you don’t operate those. This page is about the
reference zipnet-client binary you ship to publishers (or run
yourself for a bundled wallet, a cover-traffic filler, or a
smoke-test participant).
A client generates an X25519 keypair, publishes its public bundle via gossip, and seals one envelope per round. In production every client runs inside a TDX guest whose MR_TD matches the value your committee pinned; see the TDX section of the Quickstart.
One-shot command
```shell
ZIPNET_INSTANCE="acme.mainnet" \
ZIPNET_MESSAGE="payload-to-broadcast" \
./zipnet-client --bootstrap <peer_id_of_aggregator_or_server>
```
Omit ZIPNET_MESSAGE to run a cover-traffic client that participates
in every round with a zero payload. Cover traffic is the operator’s
tool for raising the effective anonymity set size when real
publishers are sparse.
Environment variables
| Variable | Meaning | Notes |
|---|---|---|
| `ZIPNET_INSTANCE` | Instance name to bind to | Required. Same string the committee uses; typos show up as `ConnectTimeout`. |
| `ZIPNET_UNIVERSE` | Universe override | Optional; leave unset to use the shared universe. |
| `ZIPNET_BOOTSTRAP` | Peer IDs to dial on startup | Aggregator’s PeerId or any committee server’s. Needed only on cold networks. |
| `ZIPNET_MESSAGE` | UTF-8 message to seal per round | Truncate it yourself to fit `slot_bytes − tag_len`. Default slot width is 240 bytes of user payload. |
| `ZIPNET_CADENCE` | Talk every Nth round | Default 1. Useful for dialing your own talk/cover ratio. |
| `ZIPNET_METRICS` | Prometheus bind address | Optional. |
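Truncation must respect UTF-8 code-point boundaries or the payload stops being valid UTF-8. A minimal sketch — the helper is ours, not part of `zipnet-client`, and 240 assumes the default slot width:

```rust
/// Truncate a message to at most `max` bytes without splitting a UTF-8
/// code point. 240 = the default slot_bytes − tag_len payload budget.
fn truncate_utf8(msg: &str, max: usize) -> &str {
    if msg.len() <= max {
        return msg;
    }
    let mut end = max;
    // Walk back until we land on a char boundary.
    while !msg.is_char_boundary(end) {
        end -= 1;
    }
    &msg[..end]
}

fn main() {
    let long = "é".repeat(200); // 400 bytes of two-byte code points
    let cut = truncate_utf8(&long, 240);
    assert!(cut.len() <= 240);
    assert_eq!(cut.chars().count(), 120); // cut cleanly on a boundary
    println!("{} bytes kept", cut.len());
}
```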
Building the TDX image you ship to publishers
Publishers to a TDX-gated instance need to run your client image (not their own ad-hoc build), because the committee will reject any client whose quote doesn’t match the pinned MR_TD. Build it the same way you build the server image — mosaik ships the builder:
```rust
// crates/zipnet-client/build.rs
fn main() {
    mosaik::tee::tdx::build::alpine()
        .with_default_memory_size("512M")
        .build();
}
```
```toml
# crates/zipnet-client/Cargo.toml
[dependencies]
mosaik = { version = "0.3", features = ["tdx"] }

[build-dependencies]
mosaik = { version = "0.3", features = ["tdx-builder-alpine"] }
```
Alpine is the usual choice for clients — ~5 MB versus Ubuntu’s
~25 MB — unless your agent has a specific glibc dependency. After
cargo build --release the artifacts land under
target/release/tdx-artifacts/zipnet-client/alpine/:
| Artifact | What it’s for |
|---|---|
| `zipnet-client-run-qemu.sh` | Self-extracting launcher publishers invoke on a TDX host. |
| `zipnet-client-mrtd.hex` | The 48-byte measurement. You pin this in the committee and publish it to readers. |
| `zipnet-client-vmlinuz` | Raw kernel, for repackaging. |
| `zipnet-client-initramfs.cpio.gz` | Raw initramfs. |
| `zipnet-client-ovmf.fd` | Raw OVMF firmware. |
Publish zipnet-client-mrtd.hex alongside your release notes. It
goes into the committee’s Tdx::require_mrtd(...) configuration and
into readers’ verification code. See
Rotations and upgrades
for rolling a new MR_TD without downtime.
What a healthy client log looks like
```
INFO zipnet_client: spawning zipnet client client=550fda1ffa
INFO zipnet_node::roles::common: zipnet up: network=<universe> instance=acme.mainnet peer=c2e9aeee0e... role=a8b7ed5911...
INFO zipnet_node::roles::client: client booting; waiting for rosters
```
After boot, every sealed envelope is a DEBUG event. Raise
RUST_LOG to debug,zipnet_node=debug to see them.
Why a client’s envelope might get dropped
- The client bundle hasn’t replicated yet. The first few rounds after a client connects may not include it in `ClientRegistry`. Wait for `zipnet_client_registered` to flip to 1 before relying on anonymity guarantees.
- Slot collision with another client. v1’s slot assignment is a deterministic hash — two clients occasionally pick the same slot and XOR their messages into garbage. Neither falsification tag verifies, the committee still publishes the broadcast, the messages are lost, and the clients retry next round. A 4×-oversized scheduling vector in v2 makes this rare.
- Message is longer than `slot_bytes − tag_len`. The client exits with `MessageTooLong`. Shorten the message, or raise `slot_bytes` at the instance level (which retires the instance — see Rotations and upgrades).
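The collision failure mode falls directly out of XOR aggregation. A toy illustration with an 8-byte slot and made-up payloads:

```rust
// Two clients writing into the same slot XOR their messages together;
// the published slot is neither message, and both are unrecoverable.
fn xor_into(slot: &mut [u8], msg: &[u8]) {
    for (s, m) in slot.iter_mut().zip(msg) {
        *s ^= m;
    }
}

fn main() {
    let mut slot = [0u8; 8];
    xor_into(&mut slot, b"AAAAAAAA"); // client 1's message
    xor_into(&mut slot, b"BBBBBBBB"); // client 2 picked the same slot
    assert_ne!(&slot, b"AAAAAAAA");
    assert_ne!(&slot, b"BBBBBBBB");
    assert_eq!(slot, [0x03; 8]); // 'A' ^ 'B' = 0x03: garbage for both
    println!("collided slot: {slot:?}");
}
```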
Identity lifetime
In the mock path (TDX disabled), each process run generates a fresh X25519 identity — run-to-run unlinkability is free. In the TDX path, the identity lives in sealed storage inside the enclave so a restart preserves it; useful for reputation systems, but means the same enclave is recognizable across runs. Design accordingly when you pick a cover-traffic cadence.
See also
- Running a committee server
- Rotations and upgrades — rebuilding the client image and rolling a new MR_TD.
- Security posture checklist — client-host hygiene, TDX expectations.
Rotations and upgrades
audience: operators
Every routine change in a running instance falls into one of these procedures. Follow them verbatim; the consensus and crypto are unforgiving about accidental divergence.
Rolling a committee server (restart, same identity)
Safe any time. Minority-restart is handled by Raft automatically.
1. Stop the target server with `SIGTERM`. Wait for graceful exit (under 5 s).
2. Replace the binary / restart the container / whatever triggered the rollout.
3. Start the server with the same `ZIPNET_INSTANCE`, `ZIPNET_SECRET`, and `ZIPNET_COMMITTEE_SECRET` as before.
4. Observe `mosaik_groups_leader_is_local` on the remaining servers — election should settle within a few seconds.
5. Once the restarted server’s log shows `round finalized`, move to the next one.
Do not restart a majority of the committee simultaneously — that drops quorum and halts round progression until a majority is back up.
Adding a committee server
1. Provision the new node. Assign it a fresh `ZIPNET_SECRET` seed.
2. Distribute the same `ZIPNET_INSTANCE` and `ZIPNET_COMMITTEE_SECRET` to it.
3. Start it with `--bootstrap <peer_id_of_any_existing_server>`.
4. Wait for the new server’s log to print `round finalized` — it has caught up.
5. Update your operational runbook, monitoring targets, and audit log to reflect the added node.
The ServerRegistry collection automatically reflects the new
member within one round. Clients start including the new server in
their pad derivation from the next OpenRound the leader issues.
Removing a committee server
1. Announce the removal at least one gossip cycle ahead (default 15 s) so catalog entries expire cleanly.
2. `SIGTERM` the target node.
3. Verify the remaining servers still form a majority and continue to finalize rounds (`round finalized` events in the logs).
Security warning
A removed server retains its DH secret. If that secret is not wiped, an adversary who later compromises the decommissioned machine can replay historic rounds and compute that server’s share of past pads. Combined with any other committee server’s DH secret compromise, this would break anonymity of past rounds. Wipe DH secrets on decommission.
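A decommission step can be scripted. This sketch uses GNU `shred` on a throwaway temp file; the real path to the DH secret depends entirely on your deployment layout, and on journaled or copy-on-write filesystems `shred`’s overwrite is best-effort:

```shell
# wipe_secret: overwrite the file, sync, then unlink it.
wipe_secret() {
  shred --iterations=3 --remove=wipesync "$1"
}

# Demonstrate on a temp file standing in for the server's DH secret.
tmp=$(mktemp)
echo "dh-secret-material" > "$tmp"
wipe_secret "$tmp"
test ! -e "$tmp" && echo "wiped"
```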
Rotating a committee server’s long-term key
v1 does not have first-class key rotation. The procedure is “decommission + re-add”:
1. Remove the old server (above).
2. Add a new server with a fresh `ZIPNET_SECRET` (above).
The committee’s GroupId does not change (it depends on the
instance name and shared ZIPNET_COMMITTEE_SECRET, not on
individual node identities), so the Raft group persists across the
swap. The ServerRegistry entry is updated automatically.
Rotating the committee secret
This is disruptive: changing ZIPNET_COMMITTEE_SECRET changes the
GroupId, so the old committee is abandoned. External publishers
compiled against the instance name still bond, but the committee
they find is new.
1. Announce a maintenance window.
2. Stop every client, aggregator, and committee server on this instance.
3. Distribute the new `ZIPNET_COMMITTEE_SECRET` to all committee members.
4. Start the committee first, then the aggregator, then the clients.
Rotating round parameters
RoundParams (num_slots, slot_bytes, tag_len) is folded into
the committee’s state-machine signature. Changing it is equivalent
to rotating the committee secret (above), and it is a breaking
change for any publisher that compiled the old parameters in —
meaning in practice you bump the instance.
See Retiring and replacing an instance below.
Dev note
Developers changing `RoundParams` in code must also bump the signature string in `CommitteeMachine::signature()` when appropriate — otherwise old and new nodes silently derive the same `GroupId` but disagree on apply semantics. See The committee state machine.
Rebuilding a TDX image
Rebuilding the committee or client image produces a new MR_TD. The committee’s ticket validator is pinned to a specific MR_TD, so a rebuild requires coordinated rollout:
1. Build the new image with `cargo build --release` (the mosaik TDX builder runs in `build.rs`, producing a fresh `mrtd.hex`).
2. Publish the new `mrtd.hex` to your release-notes channel.
3. Decide whether the change is ABI-compatible with the current committee’s expectations:
   - Patch-level image change (kernel patch, initramfs tweak, no wire-format or state-machine change): accept both MR_TDs transiently by updating the committee’s `require_mrtd` list to include the new hash, roll the committee hosts one at a time to the new image, then drop the old MR_TD from the allow-list.
   - Breaking change (new state-machine signature, new wire format, new `RoundParams`): treat it as retiring the instance (below).
4. Sign and publish the new MR_TD, along with the retirement window for the old one, so publishers can rebuild their own images in time.
Retiring and replacing an instance
Use this path whenever a cross-compatibility boundary moves
(RoundParams, CommitteeMachine::signature, wire format, breaking
MR_TD change). You have two idiomatic versioning stories:
- Version in the name. Stand up the new deployment under a new instance name (`acme.mainnet.v2`). Old and new run in parallel for the transition window; publishers re-pin and rebuild at their own pace; you tear down the old instance when traffic has drained. The cleanest story for external publishers; it forces them to cut a release.
- Lockstep release against a shared deployment crate. Keep the instance name stable, cut a new deployment-crate version pinning the new state-machine signature, and coordinate operator + publisher upgrades as a single release event. Avoids instance-ID churn at the cost of tighter release-cadence coupling.
Zipnet v1 does not mandate which you pick; see Designing coexisting systems on mosaik — Versioning under stable instance names for the full tradeoff.
Retirement itself is just stopping every server under the old
instance name. Publishers still trying to bond see ConnectTimeout;
they rebuild against the new name or the new deployment crate and
reconnect.
Upgrading the binary
Patch-level upgrades (no CommitteeMachine::signature change, no
RoundParams change, no wire format change, no MR_TD change if
TDX-gated) are safe to roll one node at a time following the restart
procedure.
Upgrades that change any of those four cross a compatibility boundary — treat them like retiring the instance.
Dev notes on where to look in source:
- `WIRE_VERSION` in `crates/zipnet-proto/src/lib.rs`
- `CommitteeMachine::signature` in `crates/zipnet-node/src/committee.rs`
- `RoundParams::default_v1` in `crates/zipnet-proto/src/params.rs`
Any change to those requires a coordinated restart of the whole instance.
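One way to catch accidental divergence in CI is to fold the compatibility inputs into a single string and diff it across releases. A hypothetical sketch — the function and values are ours; the names merely mirror the source locations listed above:

```rust
// Combine the compatibility-boundary inputs into one comparable string.
// If this differs between two builds, a rolling restart is NOT safe.
fn compat_fingerprint(wire_version: u32, machine_sig: &str, num_slots: u32, slot_bytes: u32) -> String {
    format!("wire={wire_version};sig={machine_sig};slots={num_slots}x{slot_bytes}")
}

fn main() {
    let current = compat_fingerprint(1, "zipnet-committee-v1", 64, 256);
    let next = compat_fingerprint(1, "zipnet-committee-v1", 64, 256);
    assert_eq!(current, next); // identical => patch-level, roll one node at a time
    println!("{current}");
}
```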
See also
- Running a committee server
- Incident response — what to do when a restart doesn’t bring the node back cleanly.
- Designing coexisting systems on mosaik — Versioning under stable instance names
Monitoring and alerts
audience: operators
Zipnet inherits mosaik’s Prometheus exporter. Enable it by setting
ZIPNET_METRICS=0.0.0.0:9100 (or a port of your choice) on every
node you want scraped. See Metrics reference
for the complete list; this page covers the metrics that actually
tell you whether an instance is healthy.
All zipnet-emitted metrics carry an instance="<name>" label set
from ZIPNET_INSTANCE. Scope your alert rules on that label so a
stuck preview.alpha doesn’t page the on-call for acme.mainnet.
The three questions you ask every shift
1. “Are rounds finalizing?”
The authoritative signal is new entries appearing in the
Broadcasts collection. Track the rate of round finalized log
events on committee servers (INFO level). A healthy instance
finalizes one round per ZIPNET_ROUND_PERIOD interval, plus or
minus ZIPNET_FOLD_DEADLINE.
Alert condition: no round finalized event on a leader server for
3 × ROUND_PERIOD + ROUND_DEADLINE.
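Expressed as a Prometheus rule, assuming the `zipnet_round_finalized_total` counter referenced in the dashboard section below and a ~2 s round period (tune the windows to your own `ROUND_PERIOD` and `ROUND_DEADLINE`):

```yaml
groups:
  - name: zipnet-acme-mainnet
    rules:
      - alert: ZipnetRoundStall
        # No round finalized in the last minute: well past 3 × ROUND_PERIOD + ROUND_DEADLINE.
        expr: increase(zipnet_round_finalized_total{instance="acme.mainnet"}[1m]) == 0
        for: 1m
        labels:
          severity: page
```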
2. “Is the committee healthy?”
- Exactly one committee server in this instance should report itself as leader at any one time. If zero or two-plus, investigate (see Incident response — split-brain). The relevant metric is `mosaik_groups_leader_is_local{instance="…"}`.
- Bond count per server should equal `N − 1`, where N is the committee size. A dropped bond suggests a universe-level partition or an expired ticket.
- Raft log position should advance in lockstep across servers. A persistent lag (> 5 indices) on one server indicates that node is falling behind.
3. “Are clients and their pubkeys reaching the committee?”
- `ClientRegistry` size ≈ number of clients you launched for this instance, give or take gossip cycles.
- Per-round `participants` count in `round finalized` events ≈ the number of non-idle clients.
Alert condition: participants = 0 for two consecutive rounds while
you expected > 0.
Useful log filters
On committee servers:
```shell
journalctl -u zipnet-server@acme-mainnet -f \
  --grep='round finalized|opening round|submitted partial|SubmitAggregate|rival group leader'
```
On the aggregator:
```shell
journalctl -u zipnet-aggregator@acme-mainnet -f \
  --grep='forwarded aggregate|registering client'
```
On clients:
```shell
journalctl -u zipnet-client@acme-mainnet -f \
  --grep='sealed envelope|registration'
```
(Adjust for your process supervisor.)
Baseline expectations at default parameters
| Condition | Committee server | Aggregator | Client |
|---|---|---|---|
| Steady-state CPU | < 5 % on a mid-range core | varies with client count | < 1 % |
| RAM | 50–200 MB | 100–500 MB | 20–50 MB |
| Bond count | committee_size − 1 | 0 (not a group member) | 0 |
| Gossip catalog size | total universe node count ± 2 | total universe node count ± 2 | total universe node count ± 2 |
| Inbound per round | N × B / committee_size (replication) | N × B | B / client |
| Outbound per round | B + heartbeats | committee_size × B | B |
N = clients, B = broadcast vector bytes (default 16 KiB).
Dev note
The gossip catalog includes peers from every service on the shared universe, not just zipnet. Your catalog size may be much larger than your committee size if the universe also hosts multisig signers, oracles, or other mosaik agents. Do not alert on absolute catalog size; alert on change in catalog size relative to a baseline.
Sensible alerts to configure
- Round stall. No new `Broadcasts` entry for `3 × ROUND_PERIOD + ROUND_DEADLINE`. Page on-call: committee is stuck, aggregator is down, or `min_participants` is unmet.
- Committee partition. `sum by (instance) (mosaik_groups_leader_is_local{instance="…"})` is 0 or ≥ 2 for more than 1 minute. Page on-call.
- TDX attestation approaching expiry. Less than 24 h to ticket `exp` on any bonded peer. Page TEE operator.
- Bond drop. `mosaik_groups_bonds{peer=<known>,instance="…"}` drops from 1 to 0 for more than 30 s and does not recover.
Multi-instance dashboards
Since multiple instances share the same universe and the same host
fleet, build dashboards with instance as a dimension from the
start:
- A top-level panel showing `rate(zipnet_round_finalized_total[1m])` broken out by `instance`.
- A committee-health grid: rows are instances, columns are the committee members, cells are `mosaik_groups_leader_is_local`.
- A per-instance heatmap of `participants` over time — sparse rounds are often the first hint of a sick publisher fleet.
A starter Grafana dashboard is not shipped in v1. The metrics list in Metrics reference is sufficient to build one. A community-maintained dashboard is tracked as a v2 follow-up.
See also
- Incident response — what to check when an alert fires.
- Metrics reference — the full label and metric list.
Incident response
audience: operators
This page is a runbook. It lists the failure modes we have actually observed in testing and the minimal steps that resolve each. Each section is scoped to a single instance — if multiple instances on the same universe are misbehaving at once, something is wrong at the universe level (relays, DHT, network) rather than in any one instance, and you should start with the “Discovery is slow” section.
Stuck rounds
Symptom: no round finalized log on any committee server in this
instance for more than 3 × ROUND_PERIOD + ROUND_DEADLINE.
Root-cause checklist, in order of likelihood:
1. Fewer active clients than `ZIPNET_MIN_PARTICIPANTS`. The leader won’t open a round until this threshold is met.
   - Check: `zipnet_client_registry_size{instance="…"}` on any committee server.
   - Fix: either start more clients (or a cover-traffic filler) or lower `ZIPNET_MIN_PARTICIPANTS` (rolling restart of the committee — this is in the state machine’s signature derivation, so everyone needs the same value).
2. Committee has no leader. Raft election has not settled (yet, or ever).
   - Check: `mosaik_groups_leader_is_local{instance="…"} == 0` on all members.
   - Fix: usually self-heals within `ELECTION_TIMEOUT + BOOTSTRAP_DELAY`. If persistent, suspect clock skew or a full network partition.
3. Client bundles have not replicated to the committee. Clients have connected but their bundles haven’t landed in `ClientRegistry` — the aggregator hasn’t yet mirrored them in.
   - Check: aggregator log for `registering client bundle`; this should fire for each new client.
   - Fix: ensure the aggregator is reachable from every client (correct `ZIPNET_BOOTSTRAP` or working universe discovery). Wait one gossip cycle (≈ 15 s).
4. One or more server bundles missing from `ServerRegistry`. A committee server failed to self-publish.
   - Check: query `ServerRegistry` size on each committee server; it should equal the committee size.
   - Fix: restart the offending server; it re-publishes on boot.
If a publisher reports Error::ConnectTimeout that traces back to
any of the root causes above, it is an operator-side issue
surfacing as a user-side error. The SDK cannot distinguish “my
instance name is wrong” from “the operator’s committee is stuck” —
that’s a deliberate tradeoff of the no-registry design.
Split-brain
Symptom: two or more committee servers in this instance report
mosaik_groups_leader_is_local == 1, or a server’s log shows
rival group leader detected.
v1 uses mosaik’s modified Raft which resolves rivals by term. The
system self-heals within one ELECTION_TIMEOUT. If it does not
self-heal:
- Check clock skew across committee members (`ntpdate -q` on each). More than a few seconds of skew breaks Raft timing.
- Check the network — split-brain persisting past self-heal is a partition.
- As a last resort, `SIGTERM` the minority faction. They’ll rejoin as followers.

Do not change `ZIPNET_COMMITTEE_SECRET` mid-incident. It would force a fresh committee group and hide evidence of the split, not resolve it.
Committee quorum loss
Symptom: fewer than a majority of committee servers are reachable. Rounds cannot commit.
- Restore the failed nodes. They rejoin on startup.
- If restoration is impossible (hardware loss, etc.), a v1 deployment has no graceful recovery — retire the instance and stand up a fresh one under a new name (or bump the deployment crate version). See Rotations and upgrades — Retiring and replacing an instance.
Aggregator crash-loop
Symptom: aggregator exits or OOMs shortly after boot.
Most common cause in v1: too many concurrent clients pushing
envelopes larger than the internal buffer (buffer_size = 1024 per
mosaik default).
Fix: either lower client concurrency by splitting the publisher
fleet across multiple instances (each with its own
`ZIPNET_INSTANCE`), or tune the aggregator’s stream buffer when
calling
`network.streams().consumer::<ClientEnvelope>().with_buffer_size(N)`
— this requires a code change in zipnet-node (dev task).
TDX attestation expiry
Symptom: committee rejects a previously-good peer with
unauthorized; the peer re-bonds in a loop with the same outcome.
On the peer side, logs mention an expired quote.
Causes, in order of likelihood:
1. Quote `exp` elapsed. Each TDX quote carries an expiration. The bonded peer needs a fresh quote.
   - Fix: restart the peer. On restart the TDX layer fetches a new quote from the hardware. If the peer still fails, check the TDX host’s attestation-service reachability.
2. Clock skew between the peer and the committee. The committee rejects a quote whose `exp` has already passed on its local clock.
   - Fix: NTP on both sides.
3. MR_TD mismatch. The peer is running a different image than the committee expects. Common after a committee rebuild the peer hasn’t yet picked up.
   - Fix: rebuild the peer image from the current release, or see Rotations and upgrades — Rebuilding a TDX image for the transition plan.
Discovery is slow (universe-level)
Symptom: nodes log Could not bootstrap the routing table and take
minutes to find each other. Typically affects all instances on
the same universe simultaneously.
Usual cause: iroh’s pkarr / Mainline DHT bootstrap is struggling (common on fresh residential networks or a fresh universe). Workarounds:
- Pass an explicit `ZIPNET_BOOTSTRAP=<peer_id>` on every non-bootstrap node.
- Enable mDNS discovery (already on by default in this prototype). For LAN deployments this is often enough.
- Run a mosaik bootstrap node (see mosaik’s `examples/bootstrap.rs`) with a well-known public address and seed it everywhere.
A dedicated bootstrap node is recommended for any production universe that hosts more than one zipnet instance.
When to escalate
- Unknown log messages containing `committed` or `reverted` outside the expected Raft lifecycle.
- `Broadcasts` collection contains entries where the number of servers in the record does not match your configured committee size for this instance.
- Any indication that two clients with the same `ClientId` coexist (would mean someone forged a bundle — investigate as a security incident).
- Publishers reporting `WrongUniverse` — indicates an operator misconfiguration of `ZIPNET_UNIVERSE`, or a publisher using the wrong `zipnet::UNIVERSE` constant.
See also
- Monitoring and alerts — the alerts that surface these conditions.
- Rotations and upgrades — controlled changes that avoid these incidents in the first place.
Accounting and audit
audience: operators
Anonymous broadcast looks, from the outside, uncomfortably like a
thing you cannot account for. Auditors will ask. This page tells
you what you can attest to, what you cannot, and how to produce
evidence for each. Everything here is scoped to a single zipnet
instance — multiple instances on the same universe are separately
audited against their own committee roster and Broadcasts
collection.
What the protocol is designed to guarantee
- Given at least one honest committee server, no party — not the operator, not the aggregator, not the remaining committee members, not an outside observer of the network — can determine which client authored which published broadcast.
- Given all parties operating the protocol honestly, every broadcast in the `Broadcasts` log is the XOR-sum of the messages of the clients listed in that round’s `participants` field, subject to slot collisions.
- Committed broadcasts are signed-in-transit by every bonded pair and logically signed by the Raft leader at commit time. Replays are detectable.
What the protocol is not designed to guarantee
- Who an individual `ClientId` refers to. A client’s `ClientId` is a hash of its X25519 public key, not a legal identity. You will need an out-of-band registration process if you want to tie a `ClientId` to a legal entity.
- That a broadcast is well-formed. A malicious client can put garbage in its slot. The falsification tag protects honest clients from other clients corrupting their slot, but not from a client corrupting its own slot.
- Censorship-resistance. A malicious aggregator or a majority of malicious committee servers can delay or drop rounds. Anonymity still holds; availability does not.
What you can attest to
“Did this instance publish this broadcast on this date?”
Every entry in the instance’s Broadcasts collection carries:
- `round: RoundId`
- `participants: Vec<ClientId>` — snapshot of the active clients at round-open time
- `servers: Vec<ServerId>` — committee members that contributed partials
- `broadcast: Vec<u8>` — the final XORed vector
Together with the Raft commit index, this is a point-in-time claim
signed (through the bond layer) by every committee server. Archive
the Broadcasts entries you care about, keyed by instance name —
there is no authoritative external registry.
“Who was running which node on this date?”
This is an organizational fact, not a cryptographic one. Maintain an external table per instance:
| Instance | PeerId | Legal entity | Role | Valid from | Valid to |
|---|---|---|---|---|---|
| `acme.mainnet` | `f5e28a…` | Acme Corp | committee-server-1 | 2026-03-01 | present |
| `acme.mainnet` | `4c210e…` | Acme Corp | aggregator | 2026-03-01 | present |
| `acme.preview` | `a91742…` | Acme Corp | committee-server-1 | 2026-04-02 | present |
Sign this table with your corporate root, version it, and include
it in your audit package. PeerId is stable when ZIPNET_SECRET
is stable; rotate only via a documented procedure (see
Rotations and upgrades).
“Was a specific server in the committee on this round?”
BroadcastRecord::servers lists every committee member whose
partial unblind was folded into the published broadcast. Combine
with your PeerId → legal entity table to produce a legal-readable
statement.
“Did this committee server operate honestly?”
You cannot prove this from the record alone — a malicious committee member can behave indistinguishably from an honest one, provided at least one other committee member is honest. (That’s the whole point of the any-trust model.) What you can attest to:
- The server was up and participating (its partial is folded in).
- The server’s key material was controlled by the claimed legal entity (via the `PeerEntry` signature).
- For TDX-gated instances, the server’s boot measurement matched the committee’s pinned MR_TD. Archive the quote alongside the instance deployment record (see below).
In regulatory settings where “operated honestly” must be proven positively, a TDX attestation is as close as the protocol gets — the quote cryptographically proves the code running inside the committee server matches a published image hash.
Archival recommendations
- Archive `Broadcasts` continuously, per instance. A committee server’s in-memory copy is the source of truth in v1; if the majority of the committee goes offline at once, the log is gone. Mirror the log into durable storage at your cadence of choice. A minimal script: open a `Zipnet::bind(&network, instance)` handle in read-only mode from a non-committee host, iterate entries newer than your checkpoint, append to a signed ledger, commit.
- Archive the `PeerId` table, keyed by instance. Version it; keep change history. A SHA-256 of this table goes into your audit manifest.
- Archive the instance configuration. For each instance:
  - Instance name.
  - `ZIPNET_COMMITTEE_SECRET`’s blake3 fingerprint (not the raw secret).
  - `RoundParams`.
  - `ConsensusConfig`.
  - Committee roster.
  - Committee MR_TD (if TDX-gated).
- Archive TDX attestation quotes. For TDX-gated instances, each committee server’s quote includes its MR_TD and RTMRs. Store them per instance, per deploy.
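The minimal archival script described above boils down to a checkpointed iteration. A self-contained simulation over an in-memory log — in production the entries would come from a read-only `Zipnet::bind(&network, instance)` handle and the ledger append would be signed; the types and function here are illustrative:

```rust
// Stand-in for an entry in the Broadcasts collection.
struct BroadcastRecord {
    round: u64,
    broadcast: Vec<u8>,
}

/// Append every record newer than `checkpoint` to the ledger and return
/// the new checkpoint to persist before the next run.
fn archive_since(log: &[BroadcastRecord], checkpoint: u64, ledger: &mut Vec<u64>) -> u64 {
    let mut latest = checkpoint;
    for rec in log.iter().filter(|r| r.round > checkpoint) {
        ledger.push(rec.round); // stand-in for "append to a signed ledger"
        latest = latest.max(rec.round);
    }
    latest
}

fn main() {
    let log: Vec<BroadcastRecord> = (1..=5)
        .map(|round| BroadcastRecord { round, broadcast: vec![0u8; 16] })
        .collect();
    let mut ledger = Vec::new();
    let cp = archive_since(&log, 2, &mut ledger); // resume from checkpoint 2
    assert_eq!(ledger, vec![3, 4, 5]);
    println!("archived up to round {cp}");
}
```

Running this on a cron cadence, with the checkpoint persisted alongside the ledger, satisfies the "mirror into durable storage" recommendation.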
Evidence package for external audit
A minimal per-quarter package, per instance:
- Instance name and its universe `NetworkId`.
- `Broadcasts` log excerpt for the quarter (signed by your corporate root).
- `PeerId → legal entity` table for that instance (signed, version-pinned).
- Instance configuration fingerprint: SHA-256 of `blake3(COMMITTEE_SECRET) || blake3(ROUND_PARAMS) || blake3(CONSENSUS_CONFIG) || instance_name`.
- Committee MR_TD (TDX-gated instances).
- List of committee membership changes, cross-referenced to git/CD deployment records.
- Incident log covering any stuck rounds, split-brain events, or membership changes in the period.
An auditor can re-derive the `ClientId`s referenced in `participants`
from the corresponding signed `PeerEntry` tickets archived from
gossip — useful if they want to ask “was client X part of round Y”.
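The configuration-fingerprint composition in the package above can be sketched with stand-in hashes (`DefaultHasher` replaces blake3 and SHA-256 here; only the concatenation order and determinism are the point):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in 64-bit hash; the real manifest uses blake3 for the inner
// fingerprints and SHA-256 for the outer digest.
fn h(bytes: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    bytes.hash(&mut s);
    s.finish()
}

/// Outer digest over the `||`-concatenation, in the fixed order:
/// committee secret, round params, consensus config, instance name.
fn config_fingerprint(secret: &[u8], round_params: &[u8], consensus: &[u8], instance: &str) -> u64 {
    let mut buf = Vec::new();
    buf.extend_from_slice(&h(secret).to_be_bytes());       // blake3(COMMITTEE_SECRET)
    buf.extend_from_slice(&h(round_params).to_be_bytes()); // blake3(ROUND_PARAMS)
    buf.extend_from_slice(&h(consensus).to_be_bytes());    // blake3(CONSENSUS_CONFIG)
    buf.extend_from_slice(instance.as_bytes());            // instance_name
    h(&buf)                                                // SHA-256 in the real manifest
}
```

Fixing the concatenation order is what lets two parties compute the fingerprint independently and compare.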
Multiple instances, shared universe
Because zipnet instances share a universe, an auditor who reads your raw gossip logs will see traffic that belongs to other instances — and possibly to other mosaik services entirely. Two consequences to call out in your audit narrative:
- Gossip-level traffic volume from your fleet is not a proxy for your instance’s traffic. A committee server on `acme.mainnet` routinely forwards discovery messages on behalf of other instances and services on the same universe.
- Peer-catalog size is likewise a universe-level quantity. Do not attempt to derive per-instance population from catalog counts.
For per-instance accounting, stick to the Broadcasts collection
and the ServerRegistry / ClientRegistry contents read through
Zipnet::bind(&network, instance).
Privacy and data retention
Published broadcasts are, by design, readable by anyone who can read
the Broadcasts collection. Treat them as public data. Archival
retention policy is a business decision; the protocol neither
enforces nor contradicts any specific retention period.
Signed PeerEntrys (carrying peers’ ClientBundles / ServerBundles)
are also public by design — they are gossiped to every universe
member. There is no way to revoke a signed entry retroactively.
Security warning
Do not publish `ZIPNET_COMMITTEE_SECRET` or any committee server’s X25519 secret, historic or current. Each leaked committee DH secret shrinks the any-trust margin, and disclosure of every committee server’s DH secret for a round retroactively breaks the anonymity of that round.
See also
- Security posture checklist — what must be protected, per role.
- Rotations and upgrades — change procedures that your audit log must cross-reference.
Security posture checklist
audience: operators
Each item below is a pre-production checklist entry. Print it,
initial it, file it with the deploy record. Work through this
checklist per instance — an honest posture on acme.mainnet
does not protect preview.alpha if the two share a fault domain or
a secret store.
Instance identity and scope
- `ZIPNET_INSTANCE` is set to a namespaced string (e.g. `acme.mainnet`) and documented in the release notes your publishers consume. No operator within the same universe uses the same string.
- `ZIPNET_UNIVERSE`, if set, points at a universe you control. The default (`zipnet::UNIVERSE`) is the shared world and is correct for most deployments.
- The instance’s MR_TD (TDX-gated instances) is published alongside the instance name in a signed channel. Publishers verify against that hash.
Committee secret handling
- `ZIPNET_COMMITTEE_SECRET` is stored only in a secret manager (vault, AWS Secrets Manager, HashiCorp Vault, k8s `Secret` resource). Never in a git repo, never in a plain environment file.
- The secret is unique per instance. Do not reuse one committee secret across `acme.mainnet` and `acme.preview` even though the operator is the same.
- Rotation procedure is documented and rehearsed (see Rotations and upgrades).
- Access to read the secret is audited. A quarterly review of access logs is on the calendar.
Committee server node hygiene
- Each committee server runs in a separate fault domain (different cloud account, different region, different operator organization if possible). The whole point of any-trust is diversity.
- In production, every committee server runs inside a TDX guest built by the mosaik image builder. The committee’s `require_mrtd(...)` validator is set to the build’s measured MR_TD. See Rebuilding a TDX image for the rebuild cadence.
- `ZIPNET_SECRET` is unique per node and stored in the node’s own secret scope (not shared with any other node).
- Committee servers listen only on the iroh port (default UDP ephemeral + relay) and the Prometheus metrics port. No other inbound exposure.
- Decommissioned committee servers have their disks wiped. DH secrets leaking from a decommissioned box are historically replayable.
Aggregator node hygiene
- The aggregator is not in the committee’s secret-possession circle. It does not have access to `ZIPNET_COMMITTEE_SECRET`.
- Aggregator memory is not a secret store — aggregates are XOR-sums whose plaintext only the committee can recover. Still, hardening the aggregator is good practice: read-only filesystem, dropped capabilities, etc.
- If you operate one aggregator per instance, each is configured with its own `ZIPNET_INSTANCE` and its own `ZIPNET_SECRET`.
Client image hygiene (TDX-gated instances)
- The client image you ship to publishers is built reproducibly. The mosaik TDX builder is deterministic — commit your toolchain and feature-flag set alongside the release.
- The committee’s `Tdx` validator lists the published client MR_TD in `require_mrtd(...)`. Publishers running any other image are rejected at bond time.
- TDX quote expiration is monitored; see Monitoring.
- Image rebuild cadence is documented. At minimum, rebuild whenever the upstream kernel or initramfs toolchain ships a security fix — a new MR_TD is cheap compared with unpatched firmware.
Client image hygiene (TDX disabled, dev/test only)
- Understood: without TDX, the client trusts the client host for DH key protection. Anyone with access to the client process can deanonymize that client’s own messages (not others’).
- Clients handling non-public messages wait for the `ClientRegistry` to include their own entry and for at least `ZIPNET_MIN_PARTICIPANTS − 1` other clients to also be registered before relying on anonymity properties.
- This posture is explicitly not used for production in TDX-gated instances.
Network hygiene
- Firewalls permit outbound UDP to iroh relays. If you run your own relay, ensure clients can reach it.
- NTP is configured on every node. Raft tolerates small skew; large skew causes election storms. TDX quote validation is also clock-sensitive.
- Prometheus metrics endpoints are NOT publicly exposed.
Archival / audit
- A job pulls the `Broadcasts` collection to durable storage at the chosen cadence, keyed by instance name (see Accounting and audit).
- The `PeerId → legal entity` registry is version-controlled, signed, and scoped per instance.
Emergency contacts
- On-call rotation documented for each node, per instance.
- Break-glass procedure for committee-secret rotation documented, per instance.
- “Who can revoke a compromised bundle ticket” is specified — note that in v1 a ticket lives in gossip until the node is removed from the universe, so the answer is “the node’s operator, by stopping the node”.
Known-not-yet-protected footguns
- Metadata from iroh. The iroh layer leaks some metadata (relay preferences, coarse geography via relay choice). A global passive adversary observing traffic patterns across relays can narrow anonymity sets.
- Cross-instance traffic correlation. Instances share a universe. A passive observer of gossip can often tell “this peer is a member of instance X” from catalog membership, even without seeing any `Broadcasts` content. Anonymity within a round is unaffected; anonymity of membership in an instance is not a property the protocol provides.
- Client message length. The protocol encrypts the message but does not pad it to a uniform length. Unusually long messages are recognizable in the broadcast. Pad your payloads to the nearest slot boundary at the application layer if this matters for you.
- Participant set disclosure. `BroadcastRecord::participants` lists every `ClientId` whose envelope was folded into the round. Knowing “client X was in this round” is not the same as knowing “client X wrote this message”, but it is visible and it leaks connection timing.
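The slot-boundary padding mitigation can be sketched at the application layer. This is a minimal sketch, not part of the protocol; `slot_len` is an assumed deployment parameter:

```rust
/// Zero-pad `msg` up to the next multiple of `slot_len` bytes, so that
/// payload lengths only reveal a slot count, not an exact byte length.
/// Application-layer sketch; `slot_len` is an assumed deployment choice,
/// not a protocol constant.
fn pad_to_slot(mut msg: Vec<u8>, slot_len: usize) -> Vec<u8> {
    assert!(slot_len > 0, "slot_len must be positive");
    let rem = msg.len() % slot_len;
    if rem != 0 {
        msg.resize(msg.len() + (slot_len - rem), 0);
    }
    msg
}
```

A real framing layer also needs an in-band length prefix so the receiver can strip the zero padding again.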
These are tracked in Roadmap to v2.
See also
Designing coexisting systems on mosaik
audience: contributors
Mosaik composes primitives — Stream, Group, Collection,
TicketValidator. It does not prescribe how a whole service — a
deployment with its own operator, its own ACL, its own lifecycle — is
shipped onto a network and made available to third-party agents. That
convention lives one layer above mosaik and has to be invented per
service family.
This page describes the convention zipnet uses, why it was picked, and what a contributor building the next service on mosaik (multisig signer, secure storage, attested oracle, …) should reuse. It is a mental model, not an API reference: the concrete instantiation is in Architecture.
The problem
A mosaik network is a universe where any number of services run concurrently. Each service:
- is operated by an identifiable organisation (or coalition) and has its own ACL
- ships as a bundle of internally-coupled primitives — usually a committee `Group`, one or more collections backed by that group, and one or more streams feeding it
- must be addressable and discoverable by external agents who do not operate it
- co-exists with many other instances of itself (testnet, staging, per-tenant deployments) and with unrelated services on the same wire
The canonical shape zipnet itself was built for is an encrypted mempool — a bounded set of TEE-attested wallets publishing sealed transactions for an unbounded set of builders to read, ordered and unlinkable to sender. Other services built on this pattern (signers, storage, oracles) have the same structural properties.
Nothing about these requirements is in mosaik itself. The library will
happily let you stand up ten Groups and thirty Streams on one
Network; it says nothing about which of them constitute “one zipnet”
versus “one multisig”.
Two axes of choice
Every design in this space picks a point on two axes.
- Network topology. Does a deployment live on its own `NetworkId`, or on a shared universe with peers of every other service?
- Discovery. How does an agent go from “I want zipnet-acme” to bonded-and-consuming without hardcoded bootstraps or out-of-band config?
Four shapes fall out:
| Shape | Topology | When to pick |
|---|---|---|
| A. Service-per-network | One NetworkId per deployment; agents multiplex many Network handles | Strong isolation, per-service attestation scope, no cross-service state |
| B. Shared meta-network | One universe NetworkId; deployments are overlays of Groups/Streams | Many services per agent, cheap composition, narrow public surface required to tame noise |
| C. Derived sub-networks | ROOT.derive(service).derive(instance) hybrids | Isolation with structured discovery, still multi-network per agent |
| D. Service manifest | Orthogonal: a rendezvous record naming all deployment IDs | Composable with A/B/C; required for discoverable-without-out-of-band-config |
Zipnet picks B for topology, with optional derived private networks for high-volume internal plumbing, and compile-time instance-salt derivation for discovery — no on-network registry required. The rest of this page unpacks why and how.
Narrow public surface
The single most important discipline in this model is that a deployment exposes a small, named, finite set of primitives to the shared network. The ideal is one or two — a stream plus a collection, two streams, a state machine plus a collection, and so on. Everything else is private to the bundle and wired up by the deployment author, who is free to hardcode internal dependencies as aggressively as they like.
Zipnet’s outward surface decomposes cleanly into two functional roles,
even though it carries several declare! types:
- write-side: `ClientRegistrationStream` and `ClientToAggregator` — ticket-gated, predicate-gated, used by external TEE clients to join a round and submit sealed envelopes.
- read-side: `LiveRoundCell`, `Broadcasts`, plus the two registries — read-only ambient round state that external agents need in order to seal envelopes and interpret finalized rounds.
An integrator’s mental model is “a way to write, a way to read”. They do not need to know the committee exists, how many aggregators there are, or how DH shuffles are scheduled. Internally the bundle looks like this:
shared network (public surface)
─────────────────────────────────────────────────────────────────
ClientRegistrationStream, ClientToAggregator ─┐
│
LiveRoundCell, Broadcasts, ClientRegistry, ◀─┤
ServerRegistry │
│
─────────────────────────────────────────────────
derived private network (optional) │ (private plumbing)
▼
Aggregator fan-in / DH-shuffle gossip Committee Group<CommitteeState>
Round-scheduler chatter AggregateToServers stream
BroadcastsStore (backs Broadcasts)
The committee Group stays on the shared network because the
public-read collections are backed by it and bridging collections
across networks is worse than the catalog noise. Only the
genuinely high-churn channels belong on a derived private network.
The three conventions
Three things make this pattern work. A contributor starting a new service should reproduce all three.
1. Instance-salt discipline
Every public ID in a deployment descends from one root:
INSTANCE = blake3("zipnet." + instance_name) // compile- or run-time
SUBMIT = INSTANCE.derive("submit") // StreamId
BROADCASTS = INSTANCE.derive("broadcasts") // StoreId
COMMITTEE = INSTANCE.derive("committee") // GroupKey material
...
The top-level instance salt is a flat-string hash: compile-time via
zipnet::instance_id!("acme.mainnet") (which expands to
mosaik::unique_id!("zipnet.acme.mainnet")) and run-time via
zipnet::instance_id("acme.mainnet") produce the same 32 bytes.
Sub-IDs within the instance chain off it with .derive() for
structural clarity.
An agent that knows instance_name can reconstruct every public ID
from a shared declare! module. The consumer-side API is:
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
let receipt = zipnet.publish(b"hello").await?;
let mut log = zipnet.subscribe().await?;
Zipnet::bind is a thin constructor that derives the instance-local
IDs and returns a handle wired to them. Raw
StreamId/StoreId/GroupId values are never exposed across the
crate boundary.
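The property this discipline relies on — operator and consumer independently recomputing identical IDs from the instance name alone — can be demonstrated with a stand-in 64-bit hash. The real IDs are 32-byte blake3 digests; `DefaultHasher` here is illustration only:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the blake3 instance-salt hash; only determinism matters.
fn root_id(service: &str, instance: &str) -> u64 {
    let mut h = DefaultHasher::new();
    format!("{service}.{instance}").hash(&mut h);
    h.finish()
}

// Stand-in for `UniqueId::derive`: a child ID chained off parent + salt.
fn derive(parent: u64, salt: &str) -> u64 {
    let mut h = DefaultHasher::new();
    parent.hash(&mut h);
    salt.hash(&mut h);
    h.finish()
}
```

An operator standing up `acme.mainnet` and a consumer binding to it land on the same `SUBMIT` value with no registry in between, while `acme.preview` derives a disjoint ID space.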
2. A Deployment-shaped convention
Authors should declare a deployment’s public surface once, in one
place, so consumers can bind without reassembling ID derivations by
hand. Whether this is a literal declare::deployment! macro or a
hand-written impl Deployment is ergonomics; the constraint is that
the public surface is a declared, named, finite set of primitives —
not “whatever the bundle happens to put on the network today”.
Every deployment crate should export:
- the public `declare::stream!` / `declare::collection!` types for its surface, colocated in a single protocol module
- a `bind(&Network, instance_name) -> TypedHandles` function
- the intended `TicketValidator` composition for each public primitive
A service that exposes eight unrelated collections has probably not thought hard enough about its interface.
3. A naming convention, not a registry
Derivation from (service, instance_name) is enough for a consumer
who knows the instance name to bond to the deployment: both sides
compute the same GroupId, StreamIds, and StoreIds, and mosaik’s
discovery layer does the rest. No on-network advertisement is
required — the service does not need to advertise its own existence.
A consumer typically pins the instance as a compile-time constant:
const ACME_ZIPNET: UniqueId = zipnet::instance_id!("acme.mainnet");
let zipnet = Zipnet::bind_by_id(&network, ACME_ZIPNET).await?;
…or by string when convenient:
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
The operator’s complete public contract is three items: the universe
NetworkId, the instance name, and (if the instance is TDX-gated)
the MR_TD of the committee image. These travel via release notes,
docs, or direct handoff. Nothing about the binding path touches a
registry.
A directory may exist — a shared collection listing known instances — but it is a devops convenience for humans enumerating deployments, not part of the consumer binding path. Build it if you need it; nothing about the pattern requires it.
What this buys you
- A third-party agent’s mental model collapses to: “one `Network`, many services, each bound by instance name.”
- Multiple instances of the same service coexist trivially — each derives disjoint IDs from its salt.
- ACL is per-instance, enforced at the edge via `require_ticket` on the public primitives; no second ACL layer is needed inside the bundle.
- Internal plumbing can move to a derived private network without changing the public surface.
- Private-side schema changes (`StateMachine::signature()` bumps) are absorbed behind the instance identity, as long as operators and consumers cut releases against the same version of the deployment crate.
Where the pattern strains
Three things are not free under this convention. Every new service author should be honest about them up front.
Cross-service atomicity is out of scope
There is no way to execute “mix a message AND rotate a multisig
signer” in one consensus transaction. They are different Groups
with different GroupIds, possibly with disjoint membership. If a
service genuinely needs that — rare, but real for some
coordination-heavy cases — the right answer is a fourth primitive
that is itself a deployment providing atomic composition across
services, not an ad-hoc cross-group protocol.
Versioning under stable instance names
If StateMachine::signature() changes, GroupId changes, and
consumers compiled against the old code silently split-brain. Under
multi-instance, the expectation is that “zipnet-acme” is an
operator-level identity that outlives schema changes. Two ways to
reconcile:
- Let the instance salt carry a version (`zipnet-acme-v2`), and treat version bumps as retiring the old instance. Clean, but forces consumers to re-pin and release a new build on every upgrade.
- Keep the instance name stable across versions and require operators and consumers to cut releases in lockstep against a shared deployment crate version. Avoids churn in instance IDs, at the cost of tighter coupling between operator and consumer release cadences.
Zipnet v1 does not need to resolve this. V2 must.
Noisy neighbours on the shared network
A shared NetworkId means every service’s peers appear in every
agent’s catalog. Discovery gossip, DHT slots, and bond maintenance
scale with the universe, not with the services an agent cares about.
The escape hatch is the derived private network for internal chatter;
the residual cost — peer-catalog size and /mosaik/announce volume —
is paid by everyone. If a service’s traffic would dominate the
shared network (high-frequency metric streams, bulk replication) it
belongs behind its own NetworkId, not on the shared one. Shape A
is the correct call when the narrow-interface argument no longer
outweighs the noise argument.
Checklist for a new service
When adding a service to a shared mosaik universe, use this list:
- Identify the one or two public primitives. If you cannot, the interface is not yet designed.
- Pick a service root: `unique_id!("your-service")`.
- Define instance-salt conventions: what `instance_name` means, who picks it, whether it carries a version.
- Write a `bind(&Network, instance) -> TypedHandles` that every consumer uses. Never export raw `StreamId` / `StoreId` / `GroupId` values across the crate boundary.
- Decide which internal channels, if any, move to a derived private `Network`. Default: only the high-churn ones.
- Specify `TicketValidator` composition on the public primitives. ACL lives here.
- Document your instance-name convention in release notes or docs. Consumers compile it in; you are on the hook for keeping the name stable and the code release version-matched.
- Call out your versioning story before shipping. If you cannot answer “what happens when `StateMachine::signature()` bumps?”, you will regret it.
Cross-references
- Architecture — the concrete instantiation of this pattern for zipnet v1.
- Mosaik integration notes — gotchas and idioms specific to the primitives referenced here.
- Roadmap to v2 — where versioning-under-stable-names and cross-service composition work live.
Architecture
audience: contributors
This chapter is the concrete instantiation of the pattern described in Designing coexisting systems on mosaik for zipnet v1. It maps the paper’s three-part architecture (§2) onto mosaik primitives and identifies which of those primitives form the public surface on the shared universe versus the private plumbing that may live on a derived sub-network.
The reader is assumed to have read the ZIPNet paper, the mosaik book, and design-intro.
Deployment model recap
Zipnet runs as one service among many on the shared mosaik universe
zipnet::UNIVERSE = unique_id!("mosaik.universe"). A deployment is
a single zipnet instance: one committee, one ACL, one set of round
parameters, one operator. Many instances coexist on the universe.
An instance is identified by a short operator-chosen name
(acme.mainnet). Every public id in the instance descends from the
instance salt:
INSTANCE = blake3("zipnet." + instance_name) // root UniqueId
COMMITTEE = INSTANCE.derive("committee") // Group<M> key material
SUBMIT = INSTANCE.derive("submit") // ClientToAggregator StreamId
REGISTER = INSTANCE.derive("register") // ClientRegistrationStream StreamId
BROADCASTS = INSTANCE.derive("broadcasts") // Vec<BroadcastRecord> StoreId
LIVE = INSTANCE.derive("live-round") // Cell<LiveRound> StoreId
CLIENT_REG = INSTANCE.derive("client-registry") // Map StoreId
SERVER_REG = INSTANCE.derive("server-registry") // Map StoreId
Consumers recompute the same derivations from the same name; no on-wire registry is involved. See design-intro — Instance-salt discipline.
Public surface (what lives on UNIVERSE)
The instance’s outward-facing primitives decompose into two functional roles:
- write-side — `ClientRegistrationStream` + `ClientToAggregator`. Ticket-gated, consumed by the aggregator. External TEE clients use these to join a round and submit sealed envelopes.
- read-side — `LiveRoundCell` + `Broadcasts` + `ClientRegistry` + `ServerRegistry`. Read-only ambient round state every external agent needs in order to seal envelopes and interpret finalized rounds.
Integrators bind via the facade:
let network = Arc::new(Network::new(zipnet::UNIVERSE).await?);
let zipnet = Zipnet::bind(&network, "acme.mainnet").await?;
let receipt = zipnet.publish(b"hello").await?;
let mut log = zipnet.subscribe().await?;
The facade hides StreamId / StoreId / GroupId entirely; they
never cross the zipnet crate boundary.
Internal plumbing (optional derived private network)
Everything that is not part of the advertised surface is
deployment-internal. In v1 it all runs on UNIVERSE alongside the
public surface; this is the simplest place to start. A future
deployment topology may move the high-churn channels onto a derived
private `Network` keyed off `INSTANCE.derive("private")`:
- `AggregateToServers` — aggregator → committee fan-out
- any footprint-scheduling gossip
- round-scheduler chatter
The committee Group<CommitteeMachine> itself stays on UNIVERSE
because LiveRoundCell / Broadcasts / the two registries are
backed by it; bridging collections across networks is worse than the
extra catalog noise. See
design-intro — Narrow public surface.
Data flow
shared universe (public surface)
+--------+ ClientToAggregator +-------------+ AggregateToServers +-------------+
| Client | (stream) | Aggregator | (stream) [*] | Committee |
| TEE | --------------------> | role | -------------------> | Group<M> |
+--------+ +-------------+ +-------------+
| | |
| ClientRegistrationStream | |
+----------------------------------->| |
| |
+-------------------+---------------------+--------------+
| |
ClientRegistry (Map<ClientId, ClientBundle>) ServerRegistry (Map<ServerId, ServerBundle>)
| |
+-------------------------+------------------------------+
|
LiveRoundCell (Cell<LiveRound>)
|
Broadcasts (Vec<BroadcastRecord>)
[*] may migrate to a derived private network in a future topology.
All four collections are `declare::collection!`-declared with
intent-addressed `StoreId`s. The three streams are
`declare::stream!`-declared the same way. In v1 every derived id salt
is a literal string; a forthcoming Deployment-shaped convention (see
design-intro §The three conventions) will replace the literal strings
with chained `.derive()` calls off `INSTANCE`.
Pipeline per round
t₀ t₁ t₂ t₃
| | | |
leader: ──── OpenRound ─── committed ─── LiveRoundCell mirrored ─── Broadcasts appended
│ (to followers) (on finalize)
▼
clients: read LiveRoundCell, seal envelope, send on ClientToAggregator
│
┌─────────────────────────────────────┘
▼
aggregator: fold envelopes until fold_deadline, send AggregateEnvelope
│
┌─────────────────────────────────────┘
▼
any committee server: receive, group.execute(SubmitAggregate)
│
▼
every committee server: see committed aggregate, compute its partial,
group.execute(SubmitPartial)
│
▼
state machine: all N_S partials gathered → finalize() → apply() pushes
BroadcastRecord
│
▼
apply-watcher on each server: mirror to LiveRoundCell / Broadcasts
Round latency is dominated by `fold_deadline` + one Raft commit round
trip per `SubmitAggregate` and one per `SubmitPartial`.
Participant roles
Clients
Implemented in zipnet_node::roles::client. Each client is an
Arc<Network> bonded to UNIVERSE, tagged zipnet.client, carrying a
zipnet.bundle.client ticket on its PeerEntry. Event loop:
loop {
    live.when().updated().await;
    let header = live.get();
    if header.round == last { continue; }
    if !header.clients.contains(&self.id) { /* retry registration */ continue; }
    let bundles = servers.get_all_in(header.servers);
    let sealed = zipnet_core::client::seal(
        self.id, &self.dh, msg, header.round, &bundles, params,
    )?;
    envelopes.send(sealed.envelope).await?;
}
Aggregator
Implemented in zipnet_node::roles::aggregator. ClientRegistry
writer. ClientToAggregator consumer. AggregateToServers producer.
Does not join the committee group.
loop {
    live.when().updated().await;
    let header = live.get();
    let mut fold = RoundFold::new(header.round, params);
    let close = tokio::time::sleep(fold_deadline);
    tokio::pin!(close); // Sleep is !Unpin; pin it before polling by &mut
    loop {
        tokio::select! {
            _ = &mut close => break,
            Some(env) = envelopes.next() => {
                if env.round != header.round
                    || !header.clients.contains(&env.client) {
                    continue;
                }
                fold.absorb(&env)?;
            }
        }
    }
    if let Ok(agg) = fold.finish() {
        aggregates.send(agg).await?;
    }
}
Committee servers
Implemented in zipnet_node::roles::server. Joins
Group<CommitteeMachine> as a Writer of ServerRegistry,
LiveRoundCell, and Broadcasts; reads ClientRegistry. Single
tokio::select! over three sources:
- `group.when().committed().advanced()` — drives the apply-watcher.
- `AggregateToServers::consumer` — feeds inbound aggregates via `execute(SubmitAggregate)`.
- A periodic tick — leader-only round driver that opens new rounds via `execute(OpenRound)`.
Why a dedicated Group<CommitteeMachine> and not just collections
The collections are each backed by their own internal Raft group. In
principle all round orchestration could be pushed into a bespoke
collection. We use a dedicated StateMachine because:
- Round orchestration needs domain transitions (Open → Aggregate → Partials → Finalize). These are hostile to Map / Vec / Cell CAS operations.
- Apply-time validation (e.g. rejecting aggregates that name non-roster clients) reads more clearly in `apply(Command)` than spread across collection CAS sequences.
- `signature()` is a clean place to pin wire / parameter version so incompatible nodes never form the same group.
The collections still pull their weight: they are the public-facing state external agents read without joining the committee group.
Identity universe
All IDs are 32-byte blake3 digests, via mosaik’s UniqueId. The
aliases used in v1:
| Alias | Derivation | Scope |
|---|---|---|
| `NetworkId` | `zipnet::UNIVERSE = unique_id!("mosaik.universe")` | shared universe |
| `INSTANCE` | `blake3("zipnet." + instance_name)` | one per deployment |
| `GroupId` | mosaik-derived from `GroupKey(INSTANCE.derive("committee"))` + `ConsensusConfig` + `signature()` + validators | one per deployment’s committee |
| `StreamId` / `StoreId` | `INSTANCE.derive("submit")`, `INSTANCE.derive("broadcasts")`, etc. in the target layout | one per public primitive |
| `ClientId` | `blake3_keyed("zipnet:client:id-v1", dh_pub)` | stable across runs iff `dh_pub` is persisted |
| `ServerId` | `blake3_keyed("zipnet:server:id-v1", dh_pub)` | same |
| `PeerId` | iroh’s ed25519 public key | one per running `Network` |
ClientId / ServerId are not iroh PeerIds. They’re stable
across restarts iff the X25519 secret is persisted. In v1 (mock TEE
default) every client run generates a fresh identity; in the TDX
path the secret is sealed and ClientId becomes a long-lived
pseudonym.
Current-state caveat: ZIPNET_SHARD
The v1 binaries (zipnet-server, zipnet-aggregator,
zipnet-client) still take a ZIPNET_SHARD flag and derive a fresh
NetworkId from unique_id!("zipnet.v1").derive(shard). This
predates the UNIVERSE + instance-salt design and will be retired as
the binaries migrate to Zipnet::bind on UNIVERSE. Treat it as a
pre-migration artifact; new code should not replicate the pattern.
The e2e integration test exercises this path today.
Boundary between zipnet-proto / zipnet-core / zipnet-node
- `zipnet-proto` — wire types, crypto primitives, XOR. No mosaik types, no async, no I/O. Anything that could be reused by an alternative transport lives here.
- `zipnet-core` — Algorithms 1/2/3 as pure functions. Depends on proto; no async, no I/O. The pure-DC-net round-trip test lives here.
- `zipnet-node` — mosaik integration. Owns `CommitteeMachine`, all `declare!` items, all role loops. Everything async, everything I/O.
- `zipnet` — SDK facade. Wraps `zipnet-node` behind `Zipnet::bind(&network, "instance_name")`; hides mosaik types from consumers.
See Crate map for the full workspace layout and design-intro — Narrow public surface for the rationale behind the facade boundary.
Cross-references
- Design intro — the generalised pattern this page instantiates.
- Committee state machine — commands, queries, `signature()` versioning.
- Mosaik integration notes — the specific 0.3.17 footguns this architecture bumps into.
- Threat model — anonymity and integrity claims anchored to the state-machine guarantees above.
Crate map
audience: contributors
Workspace at /Users/karim/dev/flashbots/zipnet/. Edition 2024, MSRV
1.93. Mosaik pinned to =0.3.17 (see CLAUDE.md
for rationale).
zipnet-proto (pure: no mosaik, no tokio, no I/O)
▲
│
zipnet-core (pure: no mosaik, no tokio, no I/O)
▲
│
zipnet-node ── mosaik 0.3.17 ── iroh 0.97 (QUIC)
▲ ▲
│ └──────────────────────────┐
│ │
zipnet (SDK facade) ├── zipnet-client
├── zipnet-aggregator
└── zipnet-server
The split between -proto, -core, and -node is load-bearing,
not cosmetic. Anything that touches tokio, mosaik, or I/O must
live in -node (or higher). Anything that could be reused by an
alternative transport lives in -proto / -core. If you find
yourself reaching for tokio::spawn or mosaik:: inside -proto or
-core, you are in the wrong crate.
zipnet-proto
Pure wire types and crypto primitives. No mosaik, no async.
| Module | Role |
|---|---|
| `wire` | `ClientEnvelope`, `AggregateEnvelope`, `PartialUnblind`, `BroadcastRecord`, `ClientId`, `ServerId`, `RoundId` |
| `crypto` | HKDF-SHA256 salt composition, AES-128-CTR pad generator, blake3 falsification tag |
| `keys` | `DhSecret` (X25519 `StaticSecret`), `ClientKeyPair`, `ServerKeyPair`, public `ClientBundle` / `ServerBundle` |
| `params` | `RoundParams` (broadcast shape) |
| `xor` | `xor_into`, `xor_many_into` over equal-length buffers |
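The `xor` module’s contract, and the cancellation algebra the round pipeline relies on, can be sketched as follows (assumed shape — the real `zipnet_proto::xor` API may differ in details):

```rust
/// XOR `src` into `dst` in place; panics if lengths differ.
/// Sketch of the equal-length-buffer contract described for
/// `zipnet_proto::xor`.
fn xor_into(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len(), "xor over equal-length buffers only");
    for (d, s) in dst.iter_mut().zip(src) {
        *d ^= *s;
    }
}
```

Folding a pad-blinded message together with the same pad recovers the message; the committee’s partial unblinds are the distributed form of that second `xor_into`.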
WIRE_VERSION is bumped any time a wire or params shape changes.
CommitteeMachine::signature() in zipnet-node mixes this in so
nodes with different wire versions will never form a group.
zipnet-core
Paper’s algorithms as pure functions over -proto types. No async.
| Module | Role |
|---|---|
| client::seal | Algorithm 1 — TEE-side sealing of one envelope |
| aggregator::RoundFold | Algorithm 2 — stateful XOR fold of envelopes for one round |
| server::partial_unblind | Algorithm 3 — per-server partial computation |
| server::finalize | Committee combine — aggregate + partials → broadcast |
| slot | Deterministic slot assignment + slot layout helpers |
The full round trip is exercised by
server::tests::e2e_two_servers_three_clients, which constructs a
3-server / 4-client setup (2 talkers + 2 cover) and asserts that the
final BroadcastRecord contains each talker’s plaintext at the
expected slot with a valid falsification tag. No transport is
involved — this is the pure-algebra proof.
zipnet-node
The only non-SDK crate that imports mosaik. Hosts the
declare! items, the committee state machine, and the role event
loops.
| Module | Role |
|---|---|
| protocol | declare::stream! + declare::collection! items, tag constants, ticket class constants |
| committee | CommitteeMachine: StateMachine, Command, Query, QueryResult, LiveRound, CommitteeConfig |
| tickets | BundleValidator<K>: TicketValidator for client / server bundle tickets |
| roles::common | NetworkBoot helper that wraps iroh secret, tags, tickets, and mDNS setup |
| roles::client | client event loop |
| roles::aggregator | aggregator event loop |
| roles::server | committee server event loop (single tokio::select! over three event sources) |
The role modules are reusable as a library — the three binaries are
thin CLI wrappers around them. Test code in
crates/zipnet-node/tests/e2e.rs reuses the same primitives but
inlines the server loop so it can inject a pre-built Arc<Network>
and cross-sync_with all peers before anything starts (same pattern
as mosaik’s examples/orderbook).
protocol.rs today vs target
protocol.rs currently declares its StreamId / StoreId literals
as flat strings ("zipnet.stream.client-to-aggregator", etc.). The
target per design-intro
is INSTANCE.derive("submit") / .derive("broadcasts") / … chained
off the per-deployment instance salt so multiple instances can
coexist on one mosaik universe without colliding. The migration
removes the ZIPNET_SALT.derive(shard) NetworkId scoping in
favour of the shared zipnet::UNIVERSE constant.
zipnet (SDK facade)
Public surface for consumers. Wraps zipnet-node and hides all
mosaik types (StreamId, StoreId, GroupId) from callers.
| Module | Role |
|---|---|
| environments | UNIVERSE constant, instance_id(&str) fn, instance_id! macro |
| client | Zipnet::bind, Zipnet::bind_by_id, publish, subscribe, shutdown |
| error | Error { WrongUniverse, ConnectTimeout, Attestation, Shutdown, Protocol } |
| types | Receipt, Round, Outcome, Message |
| driver | internal task that plumbs publishes onto ClientToAggregator and broadcasts back |
Re-exports from mosaik that the SDK intentionally surfaces:
UniqueId, NetworkId, Tag, unique_id!. Nothing else is
re-exported — callers that need raw mosaik types have fallen off the
supported path and should drop to zipnet-node directly.
zipnet::instance_id(name) and zipnet::instance_id!("name") must
produce byte-identical outputs; the macro lowers to
mosaik::unique_id!(concat!("zipnet.", $name)) and the runtime fn is
UniqueId::from("zipnet." + name). If you change one, change the
other.
Binaries
Thin CLI wrappers around zipnet-node::roles::*. In v1 they still
take a ZIPNET_SHARD flag and scope to ZIPNET_SALT.derive(shard);
this predates the UNIVERSE + instance design and will be retired as
the binaries migrate to Zipnet::bind on UNIVERSE.
| Crate | Flags of note |
|---|---|
| zipnet-client | ZIPNET_MESSAGE, ZIPNET_CADENCE |
| zipnet-aggregator | ZIPNET_FOLD_DEADLINE |
| zipnet-server | ZIPNET_COMMITTEE_SECRET, ZIPNET_MIN_PARTICIPANTS, ZIPNET_ROUND_PERIOD, ZIPNET_ROUND_DEADLINE |
Each binary also takes the common ZIPNET_SHARD, ZIPNET_SECRET,
ZIPNET_BOOTSTRAP, ZIPNET_METRICS — see Environment
variables.
Feature flags
- zipnet-node/tee-tdx (off by default) — folds mosaik::tickets::Tdx::new().require_own_mrtd()? into the committee's admission validators. Requires mosaik's tdx feature (on by default) and TDX hardware.
- zipnet-client/tee-tdx, zipnet-server/tee-tdx — re-export flips of the node crate's flag.
Mock TEE is the default path (// SIMPLIFICATION: in source); TDX is
opt-in for v1 and the critical-path enforcement lands in v2 (see
Roadmap).
Dependency choices worth knowing
- x25519-dalek 2.0 pins rand_core 0.6 (not workspace rand 0.9). We break workspace coherence in zipnet-proto/Cargo.toml by pulling rand_core = "0.6" explicitly for OsRng compatibility with StaticSecret::random_from_rng. The crate-proper rand dep is workspace-pinned.
- mosaik = "=0.3.17" — the API we developed against. Upgrades are expected to break compile; the declare::stream! / declare::collection! macros are stable-ish, but the ticket and group APIs have shifted across minor versions.
Cryptography
audience: contributors
All cryptographic primitives live in zipnet-proto. This chapter is a
rationale + proof-sketch document; correctness tests are in
zipnet-proto::crypto::tests and the end-to-end algebraic test is
zipnet_core::server::tests::e2e_two_servers_three_clients. Nothing on
this page is deployment-topology-specific — the KDF schedule and
falsification-tag construction are identical under any instance layout.
See design-intro for how the instance salt
(and hence schedule_hash, once footprint scheduling lands in v2)
attaches to a deployment.
Primitives
| Purpose | Primitive | Crate |
|---|---|---|
| Key agreement | X25519 | x25519-dalek 2.0 |
| Key derivation | HKDF-SHA256 | hkdf 0.12 |
| Pad generation | AES-128 in CTR mode | aes 0.8 + ctr 0.9 |
| Falsification tag | keyed-blake3 | blake3 1.8 |
| ID derivation | keyed-blake3 | blake3 1.8 |
| Peer-entry signatures | ed25519 | via iroh |
Notable negatives: no signatures from the prototype itself — clients do
not ed25519-sign their envelopes because iroh already signs the
PeerEntry that carries their bundle and the stream transport is
authenticated QUIC. We rely on mosaik’s session security, not on an
application-level signature scheme.
Per-round key schedule
For each (client, server, round) pair the protocol computes a one-time
pad P of length B = num_slots * slot_bytes:
shared = X25519(client_sk, server_pk) // 32 bytes
salt = params_prefix ‖ round ‖ schedule_hash // 56 bytes
prk = HKDF-Extract(salt, shared) // 32 bytes
key = HKDF-Expand(prk, "zipnet/pad/v1", 16) // 16 bytes
iv = round_le ‖ zeros // 16 bytes
P = AES-128-CTR(key, iv, zeros of length B)
where params_prefix is a little-endian encoding of (wire_version, num_slots, slot_bytes, tag_len) and schedule_hash is the 32-byte
NO_SCHEDULE constant in v1 (the footprint scheduling reservation vector
hash in v2).
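The salt composition can be sketched in std-only Rust. The four params fields are assumed here to be u32 little-endian — 16 + 8 + 32 = 56 bytes, matching the schedule's comment — but the canonical field widths live in zipnet-proto:

```rust
// Hypothetical sketch of the 56-byte salt layout described above.
// Field widths (four u32 params, u64 round) are assumptions, not the
// canonical zipnet-proto encoding; only the 56-byte total is from the text.
fn pad_salt(
    wire_version: u32,
    num_slots: u32,
    slot_bytes: u32,
    tag_len: u32,
    round: u64,
    schedule_hash: &[u8; 32],
) -> Vec<u8> {
    let mut salt = Vec::with_capacity(56);
    salt.extend_from_slice(&wire_version.to_le_bytes()); // params_prefix,
    salt.extend_from_slice(&num_slots.to_le_bytes());    // little-endian
    salt.extend_from_slice(&slot_bytes.to_le_bytes());
    salt.extend_from_slice(&tag_len.to_le_bytes());
    salt.extend_from_slice(&round.to_le_bytes());        // round, 8 bytes
    salt.extend_from_slice(schedule_hash);               // NO_SCHEDULE in v1
    salt
}

fn main() {
    let salt = pad_salt(1, 64, 256, 16, 7, &[0u8; 32]);
    assert_eq!(salt.len(), 56); // matches the "// 56 bytes" comment above
    println!("salt is {} bytes", salt.len());
}
```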
Why this structure
- Salt over (params, round, schedule_hash) binds the pad to every negotiated round parameter. A client or server computing with a different RoundParams derives a different pad; in the XOR algebra this reduces the colliding result to noise, not to a silent crypto vulnerability. The WIRE_VERSION in the salt prefix extends this to major-version boundaries.
- HKDF-Extract over the raw DH shared secret, not a hash of it. X25519 shared secrets are high-entropy but not uniformly distributed byte strings; HKDF's extract step is the standard way to concentrate that entropy into a uniform PRK.
- AES-128-CTR with a round-prefixed IV. A fresh IV = (round ‖ 0⁸) gives every round a non-overlapping counter space; the sequence of counters within a round is (round ‖ 0⁸) + 0, 1, 2, .... As long as two rounds never share round, the AES key–IV pair is never reused. The round: u64 ensures uniqueness across realistic deployments.
- HKDF-Expand labelled "zipnet/pad/v1". The label guards against accidental reuse of the same PRK across crypto contexts; bumping it to "zipnet/pad/v2" is free domain separation.
- AES-128-CTR rather than a dedicated stream cipher. It is AES-NI-accelerated, the keystream is pseudorandom, and CTR output combines by XOR — exactly the commutative algebra DC-nets require.
What this buys
Any honest client C and honest server S that agree on the full input tuple (shared_secret, wire_version, num_slots, slot_bytes, tag_len, round, schedule_hash) derive byte-identical pads. XOR is commutative and associative, so the order in which the aggregator and the committee fold in their contributions is irrelevant.
For any adversary who does not know shared_secret, the pad is
indistinguishable from uniformly random under the standard DDH assumption
on Curve25519 (for the X25519 step) and the PRF security of AES-128
(for the expansion step), given a secure HKDF.
What this does not buy
- Forward secrecy. A compromise of shared_secret compromises every past and future round for that (client, server) pair until the secret is rotated. v2 ratchets shared_secret ← HKDF-Extract(shared_secret, "ratchet") at each round boundary.
- Authentication of the envelope itself. The mosaik transport authenticates the sender PeerId (ed25519); the pad binds the envelope to round and client via the KDF inputs. But an adversary who can inject bytes at the transport layer as a specific peer can replay or mutate envelopes. We rely on iroh's QUIC/TLS.
Falsification tags
The paper’s §3 “falsification tag” is a keyed-blake3 XOF of the plaintext message:
pub fn falsification_tag(message: &[u8], tag_len: usize) -> Vec<u8> {
let key = blake3::derive_key("zipnet:falsification-tag:v1", &[]);
let mut h = blake3::Hasher::new_keyed(&key);
h.update(message);
let mut buf = vec![0u8; tag_len];
h.finalize_xof().fill(&mut buf);
buf
}
Why keyed-blake3, not HMAC
- Keyed-blake3 is a PRF under the standard security argument for keyed blake3, and it is enormously faster than HMAC-SHA256 at the sizes involved.
- The key is a domain-separating constant ("zipnet:falsification-tag:v1"), not a secret; the goal is not authentication against an adversary, it's cross-slot collision resistance.
What the tag protects against
- Malicious client corrupting another honest client’s slot. Slots are deterministically assigned (v1) or reservation-checked (v2). Collisions across clients overwrite both messages with their XOR. An honest client’s tag is computed on its original message; after the XOR with garbage, the tag at the published slot no longer matches the visible payload bytes → any observer rejects the slot as corrupted.
- Malicious client writing garbage in an unused slot. The unused-slot hypothesis fails the tag check; observers skip it.
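The observer-side check can be sketched as follows. stand_in_tag below is a toy checksum standing in for the keyed-blake3 falsification_tag shown above, purely so the example is self-contained; the real check recomputes the blake3 tag over the slot's visible payload bytes:

```rust
// Toy checksum standing in for falsification_tag (keyed blake3, above).
// NOT the real construction — illustration only.
fn stand_in_tag(message: &[u8], tag_len: usize) -> Vec<u8> {
    (0..tag_len)
        .map(|i| {
            message
                .iter()
                .fold(i as u8, |a, b| a.wrapping_mul(31).wrapping_add(*b))
        })
        .collect()
}

/// A slot is accepted only if the tag trailing the payload matches a
/// recomputation over the visible payload bytes.
fn slot_ok(slot: &[u8], tag_len: usize) -> bool {
    let (payload, tag) = slot.split_at(slot.len() - tag_len);
    stand_in_tag(payload, tag_len) == tag
}

fn main() {
    let payload = b"hello".to_vec();
    let mut slot = payload.clone();
    slot.extend(stand_in_tag(&payload, 4));
    assert!(slot_ok(&slot, 4));

    // XOR-corrupt one payload byte (a cross-client collision): the tag
    // no longer matches, so the observer rejects the slot.
    slot[0] ^= 0xFF;
    assert!(!slot_ok(&slot, 4));
    println!("corrupted slot rejected");
}
```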
What the tag does not protect against
- A malicious client corrupting its own slot by writing nonsense and computing a tag over that nonsense. In v1 this is a trivial DoS against the client itself; the protocol treats the published broadcast as authoritative.
- Cross-round correlation attacks based on message length or pattern.
Identity derivation
ClientId = blake3_keyed("zipnet:client:id-v1", dh_pub),
ServerId = blake3_keyed("zipnet:server:id-v1", dh_pub), both XOF’d
to 32 bytes.
Separate domain strings per role prevent an adversary who harvests a
client’s dh_pub from spoofing a server with the same identifier, which
would matter if we ever compared ClientIds and ServerIds inside the
state machine (we don’t, but the separation is free).
Constant-time concerns
- X25519 in x25519-dalek is constant-time by design.
- AES-128-CTR in aes + ctr uses hardware AES (AES-NI on x86_64, the crypto extensions on recent ARM) — that path is constant-time.
- HKDF (SHA-256) is constant-time over inputs of a fixed length.
- XOR buffers are word-wise and constant-time.
- The equality check for tag verification is Vec::eq — not constant-time. This is fine: tag comparison is against a public broadcast, not against a secret.
If a contributor adds a secret comparison path, they should reach for
subtle::ConstantTimeEq rather than ==.
Cryptographic agility
None. The prototype nails down curve (X25519), hash (blake3, SHA-256),
and cipher (AES-128) because each choice is folded into a string constant
in the KDF. To change any of them, bump WIRE_VERSION and the
corresponding label ("zipnet/pad/v1" → "zipnet/pad/v2").
Rotating the curve to, say, X448 would require a new DhSecret type and
a corresponding ClientBundle / ServerBundle layout change. There is
no on-wire negotiation of crypto parameters — nodes that disagree are
isolated into disjoint groups by construction.
The committee state machine
audience: contributors
Source: crates/zipnet-node/src/committee.rs.
Trait shape
impl StateMachine for CommitteeMachine {
type Command = Command;
type Query = Query;
type QueryResult = QueryResult;
type StateSync = LogReplaySync<Self>;
fn signature(&self) -> UniqueId { ... }
fn apply(&mut self, cmd: Command, ctx: &dyn ApplyContext) { ... }
fn query(&self, q: Query) -> QueryResult { ... }
fn state_sync(&self) -> LogReplaySync<Self> { LogReplaySync::default() }
}
LogReplaySync is the default; the committee state is small (< 1 KB per
round) so replaying the log on catch-up is cheap. When we add per-round
archival in v2 we’ll swap in a snapshot strategy.
Commands
pub enum Command {
OpenRound(LiveRound),
SubmitAggregate(AggregateEnvelope),
SubmitPartial(PartialUnblind),
}
Each command is idempotent:
- OpenRound: resets current to a fresh InFlight(header). If a previous round was not finalized, its state is silently dropped — the leader is the authority on when to move on.
- SubmitAggregate: first valid submission wins. Duplicates from follower forwarding are silently ignored. Validation checks:
  - round matches current.header.round,
  - payload length matches config.params.broadcast_bytes(),
  - participant set is non-empty,
  - every participant is in current.header.clients (no rogue clients).
- SubmitPartial: first partial per (round, server) wins. Validation:
  - round matches,
  - partial length matches,
  - server is in current.header.servers.
When a partial submission brings the total to N_S and an aggregate has
been submitted, apply() calls zipnet_core::server::finalize(...) and
pushes the resulting BroadcastRecord into self.broadcasts. Everything
after that is apply()-synchronous and deterministic.
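The first-wins admission rule can be sketched with a toy map; the RoundId / ServerId aliases and the Partials struct below are stand-ins, not the real zipnet definitions:

```rust
use std::collections::hash_map::{Entry, HashMap};

// Stand-in types; the real definitions live in zipnet-proto / zipnet-node.
type RoundId = u64;
type ServerId = [u8; 32];

#[derive(Default)]
struct Partials {
    seen: HashMap<(RoundId, ServerId), Vec<u8>>,
}

impl Partials {
    /// Returns true if the partial was admitted (first submission wins);
    /// duplicates are silently ignored, mirroring apply() idempotency.
    fn submit(&mut self, round: RoundId, server: ServerId, partial: Vec<u8>) -> bool {
        match self.seen.entry((round, server)) {
            Entry::Vacant(e) => {
                e.insert(partial);
                true
            }
            Entry::Occupied(_) => false,
        }
    }
}

fn main() {
    let mut p = Partials::default();
    let s = [7u8; 32];
    assert!(p.submit(1, s, vec![0xAA])); // first wins
    assert!(!p.submit(1, s, vec![0xBB])); // duplicate silently ignored
    assert_eq!(p.seen[&(1, s)], vec![0xAA]); // original submission retained
    println!("idempotent");
}
```

Replaying the same command sequence against a fresh Partials yields the same map, which is the property log replay on recovery depends on.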
Queries
pub enum Query {
LiveRound,
CurrentAggregate,
PartialsReceived,
RecentBroadcasts(u32),
}
Queries are read-only and do not replicate. The apply-watcher task on
each server uses weak-consistency queries to drive its side effects
(mirror LiveRound to LiveRoundCell, push broadcasts into the
Broadcasts vec collection, issue partial submissions when an aggregate
appears).
Signature versioning
fn signature(&self) -> UniqueId {
let tag = format!(
"zipnet.committee.v{WIRE_VERSION}.slots={}.bytes={}.min={}",
self.config.params.num_slots,
self.config.params.slot_bytes,
self.config.min_participants,
);
UniqueId::from(tag.as_str())
}
signature() is folded into the GroupId by mosaik, alongside the
GroupKey (derived from INSTANCE.derive("committee")) and the
consensus config. Therefore:
- Bumping WIRE_VERSION (wire or params breaking change) isolates old nodes from new.
- Changing num_slots, slot_bytes, or min_participants likewise forces a fresh group, so nodes can't silently fork on divergent config.
- Changing the instance name (and hence INSTANCE) disjoins the deployments; two acme.mainnet / acme.testnet deployments share no GroupId even under identical params. See design-intro — Instance-salt discipline.
If you add a field to CommitteeConfig or change apply semantics
without touching signature(), two nodes with incompatible code will
form the same group and diverge at the apply level. Always bump the
signature string when apply() or Command semantics change. That’s
the invariant.
What this machine guarantees vs. does not
The state machine guarantees round ordering, exactly-once partial admission, and deterministic finalization under Raft's normal crash-fault tolerance. It deliberately guarantees nothing about anonymity — anonymity is a property of the cryptographic protocol (any-honest-server DC-net algebra, see Threat model), not of consensus. Byzantine committee members cannot break anonymity via the state-machine path; they can only withhold or submit bogus partials, which is an availability problem.
Apply-context usage
ApplyContext exposes deterministic metadata. We use it only in a debug
log right now:
debug!(
round = %header.round,
"committee: opening round at index {:?}",
ctx.log_position(),
);
Anything derived from ctx is safe to use in state mutation because
mosaik guarantees it is identical on every replica. If v2 needs a
per-round random salt, pulling it from ctx.log_position() and
ctx.current_term() is the deterministic path.
The apply-watcher
The reason apply() doesn’t write directly to the public collections:
apply() is synchronous and must be free of I/O to keep the state
machine deterministic. Side effects on the outside world go through a
task that polls the group after every commit advance:
tokio::select! {
_ = group.when().committed().advanced() => {
let live = group.query(Query::LiveRound, Weak).await?.into();
let agg = group.query(Query::CurrentAggregate, Weak).await?.into();
let recent = group.query(Query::RecentBroadcasts(8), Weak).await?.into();
reconcile_into_collections(live, agg, recent).await;
maybe_submit_my_partial(agg).await;
}
// ...
}
This is the same pattern the mosaik book recommends for “state machine emits events, side-effect task consumes them”. Because queries are weak-consistency reads of the local replica, they are lock-free and fast; by the time we see the commit advance, the local apply has already run.
Idempotency and replays
- A follower that crashes mid-apply replays the log on recovery. Because apply() is deterministic, replaying yields the same state.
- A client that never sees its round finalized and retries on the next LiveRound is safe: the new round has a fresh RoundId, new pads, and a new envelope. No anti-replay logic is needed at the protocol layer.
- An aggregator retrying SubmitAggregate after a leader flip is safe: the state machine rejects duplicates.
- A server retrying SubmitPartial after its own restart is safe for the same reason.
Sizes of in-flight state
| Field | Size per round |
|---|---|
| LiveRound.clients | N * 32 bytes |
| LiveRound.servers | N_S * 32 bytes |
| aggregate.aggregate | B bytes (default 16 KiB) |
| partials | N_S * (32 + 8 + B) bytes |
Finalization pushes one BroadcastRecord (size: B + N*32 + N_S*32) into
self.broadcasts which is retained in RAM indefinitely in v1. For
long-running deployments you will want external archival; see
Operators — Accounting and audit.
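As a worked instance of the table's formulas under the stated default B = 16 KiB (N = 4 and N_S = 3 are illustrative round sizes, not defaults from the source):

```rust
fn main() {
    // Illustrative numbers: B = 16 KiB is the table's stated default;
    // N and N_S are made up for the example.
    let b: usize = 16 * 1024; // broadcast_bytes
    let n = 4usize;           // clients in the round
    let n_s = 3usize;         // committee servers

    let clients = n * 32;                 // LiveRound.clients
    let servers = n_s * 32;               // LiveRound.servers
    let partials = n_s * (32 + 8 + b);    // per-server id + round + payload
    let record = b + n * 32 + n_s * 32;   // one BroadcastRecord

    // prints: clients=128 servers=96 partials=49272 record=16608
    println!("clients={clients} servers={servers} partials={partials} record={record}");
}
```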
Mosaik integration notes
audience: contributors
Drop-in advice, footguns, and places where the prototype bumped into the mosaik 0.3.17 API. This is a grab-bag — sorted roughly by how likely a contributor is to trip over each item. For the higher-level deployment conventions that sit above mosaik, see design-intro.
Instance-salt derivation
Every public id in a zipnet deployment descends from the instance salt:
use mosaik::{UniqueId, unique_id};
// Compile-time: typos become build errors.
pub const ACME: UniqueId = zipnet::instance_id!("acme.mainnet");
// expands to unique_id!("zipnet.acme.mainnet")
// Runtime: same 32 bytes as the macro for the same name.
let id = zipnet::instance_id("acme.mainnet");
assert_eq!(id, ACME);
// Sub-ids chain with .derive().
let committee_key = ACME.derive("committee"); // GroupKey material
let submit_stream = ACME.derive("submit"); // StreamId
let broadcasts_store = ACME.derive("broadcasts"); // StoreId
The invariant: instance_id(name) and instance_id!("name") must
produce byte-identical outputs. The macro lowers to
unique_id!(concat!("zipnet.", $name)); the runtime fn is
UniqueId::from("zipnet." + name). Change one, change the other.
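The invariant can be illustrated with a stand-in hash — hash32 below is a placeholder, not UniqueId's real derivation. The point is input equality: both paths must feed the identical "zipnet.<name>" string into whatever UniqueId::from actually does:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Placeholder for UniqueId's real derivation; only input equality matters.
fn hash32(s: &str) -> u64 {
    let mut h = DefaultHasher::new();
    s.hash(&mut h);
    h.finish()
}

// What the macro path lowers to: concat!("zipnet.", <literal>).
const MACRO_INPUT: &str = concat!("zipnet.", "acme.mainnet");

// The runtime path: "zipnet." + name, composed at call time.
fn instance_id(name: &str) -> u64 {
    hash32(&format!("zipnet.{name}"))
}

fn main() {
    // Same input string → same output, whichever path built it.
    assert_eq!(instance_id("acme.mainnet"), hash32(MACRO_INPUT));
    println!("byte-identical inputs");
}
```

If one path ever prepends a different prefix, every derived StreamId / StoreId / GroupKey downstream silently diverges — which is exactly why the two definitions must change together.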
Never expose raw StreamId / StoreId / GroupId values across the
zipnet crate boundary — Zipnet::bind is the only supported path.
The declare::stream! predicate direction
Reading the macro source (mosaik-macros/src/stream.rs in the mosaik
repo) reveals the following:
“For require and require_ticket, the side prefix describes who must satisfy the requirement, not who performs the check. consumer require_ticket: V means consumers need a valid ticket, so the producer runs the validator — route to the opposite side.”
So in our ClientToAggregator stream:
declare::stream!(
pub ClientToAggregator = ClientEnvelope,
"zipnet.stream.client-to-aggregator",
producer require: |p| p.tags().contains(&CLIENT_TAG),
consumer require: |p| p.tags().contains(&AGGREGATOR_TAG),
producer online_when: |c| c.minimum_of(1).with_tags("zipnet.aggregator"),
);
- producer require: |p| p.tags().contains(&CLIENT_TAG) → “the producer must have the zipnet.client tag” → enforced on the consumer side (the aggregator subscribes only to peers tagged zipnet.client).
- consumer require: |p| p.tags().contains(&AGGREGATOR_TAG) → “the consumer must have the zipnet.aggregator tag” → enforced on the producer side (the client accepts subscribers only if they’re tagged zipnet.aggregator).
Getting this inverted produces symptoms like rejected consumer connection: unauthorized in the producer logs, with consumer PeerEntry
tag counts of 1 that don’t match the expected role. The clue is that the
producer is the one rejecting; consumer-requires apply on the producer.
Without both clauses, any peer on the network could subscribe to your
client’s envelope stream — defeating the point. The ticket-based analog
is require_ticket, which is what you want in the TDX-enabled path.
Group<M>, Map<K,V>, Network are not Clone
All three hold Arc internally but don’t derive or implement Clone.
When you need to share them across spawned tasks, wrap in a fresh Arc:
let network = Arc::new(builder.build().await?);
let group = Arc::new(network.groups()...join());
tokio::spawn({
let group = Arc::clone(&group);
async move { ... group.execute(...).await ... }
});
Group::execute, Group::query, Group::feed return futures that are
'static — they take ownership of the arguments they need at the moment
of call, so passing Arc<Group> + Arc::clone() into each task is the
straightforward pattern.
The server role deliberately keeps the Group inside a single
tokio::select! rather than spawning task-per-responsibility so we avoid
the Arc noise. The integration test in zipnet-node/tests/e2e.rs does the
same.
QueryResultAt<M> doesn’t pattern-match directly
group.query(...).await? returns Result<QueryResultAt<M>, QueryError<M>>
where QueryResultAt<M> is #[derive(Deref)] with Target = M::QueryResult.
You cannot pattern-match QueryResultAt against variants of your
QueryResult. The canonical destructure:
let qr = group.query(Query::LiveRound, Consistency::Weak).await?;
let QueryResult::LiveRound(live) = qr.into() else { return Ok(()) };
QueryResultAt::into is inherent (not From) and returns the
M::QueryResult by value.
Cell write / clear
let cell = LiveRoundCell::writer(&network);
cell.set(header).await?; // atomic replace
cell.clear().await?; // empty
There is no unset — the method is clear. Cell already has
Option-like emptiness semantics, so Cell<T> gives you the “sometimes
present” store you’d expect; no need for Cell<Option<T>>.
StateMachine::apply can’t be async
Apply is synchronous by contract. Side effects that need async (e.g. writing to a collection, sending a stream, issuing another command) must happen in a separate task that watches the commit cursor and reads the state machine via queries:
loop {
tokio::select! {
_ = group.when().committed().advanced() => reconcile().await?,
Some(msg) = stream.next() => forward(msg).await?,
_ = period.tick() => maybe_open_round().await?,
}
}
The apply-watcher in zipnet-node/src/roles/server.rs::reconcile_state is
the canonical implementation in our prototype.
InvalidTicket is a unit struct
mosaik::tickets::InvalidTicket doesn’t have ::new; it’s a bare
struct InvalidTicket;. Return it as:
return Err(InvalidTicket);
Context goes into the tracing log, not into the error, because the
error is opaque at the protocol level.
GroupKey::from(Digest)
GroupKey: From<Secret> where Secret = Digest. The ergonomic
constructor from a caller-provided string:
let key = GroupKey::from(mosaik::Digest::from("my-committee-secret"));
GroupKey::from_secret(impl Into<Secret>) is the same thing; either works.
GroupKey::random() is present but not what you want in production
because every committee member must converge on the same value.
Discovery on localhost
iroh’s pkarr/Mainline DHT bootstrap is unreliable for same-box tests.
For integration tests, cross-call sync_with between every pair of
networks (same pattern as mosaik’s examples/orderbook::discover_all):
async fn cross_sync(nets: &[&Arc<Network>]) -> anyhow::Result<()> {
for (i, a) in nets.iter().enumerate() {
for (j, b) in nets.iter().enumerate() {
if i != j {
a.discovery().sync_with(b.local().addr()).await?;
}
}
}
Ok(())
}
For out-of-process binaries, pass an explicit --bootstrap <peer_id>
pointing at a well-known node.
Tag = UniqueId, no tag! macro
Book examples show tag!("...") but 0.3.17 exports no such macro. Tag
is an alias for UniqueId, so use unique_id!("...") for compile-time
construction:
pub const CLIENT_TAG: Tag = unique_id!("zipnet.client");
Runtime construction is Tag::from("...") via the From<&str> impl on
UniqueId.
Declaring collections that don’t exist at use time
The declare::collection! macro refers to its value type by path, so you
can declare a collection over a type defined later in the same crate:
// src/protocol.rs
use crate::committee::LiveRound;
declare::collection!(
pub LiveRoundCell = mosaik::collections::Cell<LiveRound>,
"zipnet.collection.live-round",
);
LiveRound is defined in src/committee.rs; the macro’s expansion
resolves the path at compile time in the usual way.
Network::builder(...).with_mdns_discovery(true)
mDNS is off by default in 0.3.17. For single-box testing and for clusters on the same LAN, turning it on collapses discovery latency from minutes (DHT bootstrap) to sub-seconds. Costs nothing on WAN deployments where it silently no-ops.
Network::builder(network_id)
.with_mdns_discovery(true)
.with_discovery(discovery::Config::builder().with_tags(tags))
.build().await?;
We enable it unconditionally in NetworkBoot::boot.
TDX gating: install own ticket, require others’
Mosaik’s TDX support composes on both sides of the peer-entry dance. The idiomatic zipnet committee setup:
// On boot, if built with the tee-tdx feature:
network.tdx().install_own_ticket()?; // attach our quote to our PeerEntry
// When joining the committee or a public collection, require peers
// to present a matching TDX quote:
use mosaik::tickets::Tdx;
let tdx_validator = Tdx::new().require_mrtd(expected_mrtd);
// Stack with BundleValidator via multi-require_ticket:
group_builder
.require_ticket(BundleValidator::<ServerBundleKind>::new())
.require_ticket(tdx_validator);
expected_mrtd comes from the reproducible committee-image build and
is published alongside the instance name (see
design-intro — A naming convention, not a registry).
In v1, BundleValidator is the only admission check in the non-TDX
path; TDX critical-path enforcement lands in v2
(Roadmap).
Threat model
audience: contributors
This chapter restates the paper’s adversary model (§3.3) against the
concrete objects that exist in our prototype, and gives proof sketches
for the claims we make. The claims are scoped to one zipnet
instance — the committee Group<CommitteeMachine> identified by
INSTANCE.derive("committee") for a given operator-chosen name
(see design-intro — Instance-salt discipline).
Distinct instances on the same universe have disjoint GroupIds,
disjoint rosters, and disjoint anonymity sets; what holds for one
says nothing about another. Multi-instance composition is out of
scope here.
Goals and non-goals
Goal: unlinkability of (author, message) for messages published in
the Broadcasts collection, against any adversary that controls at most
N_S − 1 of N_S committee servers, the aggregator, the TEE host (of
an unbounded subset of clients), and the network. The adversary does
not control a strict majority of the honest clients. (The precise
(t, n)-anonymity formulation is in Appendix A of the paper.)
Non-goals:
- Byzantine fault tolerance of the consensus layer. Mosaik’s Raft variant is crash-fault tolerant, not Byzantine.
- Availability under any adversarial committee participation. In v1, a single crashed committee server halts round progression.
- Confidentiality of application payload. Once finalized, broadcast is world-readable by design.
- Resistance to message-length side channels (see security checklist).
Attacker powers
What the adversary can do:
- Read and modify any packet on the wire. iroh/QUIC authenticates peer identities, so the adversary cannot impersonate an honest node, but can block, delay, or corrupt packets (triggering Raft timeouts and stream reconnects).
- Control the operating system of any non-TEE node, including committee servers it is designated to operate.
- Issue arbitrary Commands to the committee via a corrupt server (which forwards its own commands into the Raft log) or via a corrupt client (which sends arbitrary ClientEnvelopes through the aggregator).
- Compromise the TEE of any number of clients (and read their DH secrets) in the v1 mock path.
What the adversary cannot do (by assumption or by protocol):
- Compromise the TEE of a client in the v2 TDX path without triggering attestation failure. (Formal: SGX/TDX bound by the hardware root of trust.)
- Compromise the DH secret of every committee server simultaneously — anonymity requires at least one honest server.
- Force a BroadcastRecord to contain a participants list that includes an unregistered ClientId: the state machine rejects such an aggregate at SubmitAggregate apply time (see committee state machine).
Anonymity sketch
Let C₁, ..., C_N be the clients participating in round r. Each client
C_i contributes msg_i ⊕ (XOR over servers of pad_ij) to the
aggregate. The aggregate is:
agg_r = XOR_i (msg_i ⊕ XOR_j pad_ij)
= (XOR_i msg_i) ⊕ (XOR_i XOR_j pad_ij)
The broadcast is agg_r ⊕ (XOR_j partial_j) where partial_j = XOR_i pad_ij. Substituting:
broadcast = (XOR_i msg_i) ⊕ (XOR_i XOR_j pad_ij) ⊕ (XOR_j XOR_i pad_ij)
= (XOR_i msg_i) // the inner pads cancel
So the broadcast is exactly the XOR of every client’s slotted message. Given the deterministic slot assignment, messages land in distinct slots (modulo collisions) and can be read back slot-by-slot.
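The cancellation can be checked mechanically with toy pads — any function both endpoints can compute from (client, server, round) satisfies the algebra; the stand-in below is not the real HKDF/AES pad:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for the real HKDF/AES pad: any deterministic function of
// (client, server, round) works for demonstrating the XOR algebra.
fn pad(client: u64, server: u64, round: u64, len: usize) -> Vec<u8> {
    (0..len)
        .map(|k| {
            let mut h = DefaultHasher::new();
            (client, server, round, k).hash(&mut h);
            h.finish() as u8
        })
        .collect()
}

fn xor(a: &mut [u8], b: &[u8]) {
    for (x, y) in a.iter_mut().zip(b) {
        *x ^= y;
    }
}

fn main() {
    let (n, n_s, round, len) = (3u64, 2u64, 7u64, 8usize);
    let msgs: Vec<Vec<u8>> = (0..n).map(|i| vec![i as u8 + 1; len]).collect();

    // agg = XOR_i (msg_i ⊕ XOR_j pad_ij)
    let mut agg = vec![0u8; len];
    for i in 0..n {
        let mut env = msgs[i as usize].clone();
        for j in 0..n_s {
            xor(&mut env, &pad(i, j, round, len));
        }
        xor(&mut agg, &env);
    }

    // broadcast = agg ⊕ XOR_j partial_j, where partial_j = XOR_i pad_ij
    for j in 0..n_s {
        let mut partial = vec![0u8; len];
        for i in 0..n {
            xor(&mut partial, &pad(i, j, round, len));
        }
        xor(&mut agg, &partial);
    }

    // Pads cancel: the broadcast is exactly XOR_i msg_i.
    let mut expect = vec![0u8; len];
    for m in &msgs {
        xor(&mut expect, m);
    }
    assert_eq!(agg, expect);
    println!("pads cancel");
}
```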
For unlinkability: given any one honest server j* whose pad secrets
are unknown to the adversary, every pad_{ij*} is PRF-indistinguishable
from uniform random (under the PRF security of the HKDF-AES
construction). Each honest client’s envelope_i = msg_i ⊕ XOR_j pad_ij
is therefore PRF-indistinguishable from uniform — the adversary cannot
distinguish which honest client authored which envelope. This is the
standard DC-net anonymity argument under the any-trust assumption.
The paper strengthens this to a (t, n) game (Appendix A). The
state-machine-level permutation check at SubmitAggregate apply ensures
the aggregate’s participants vector is a subset of the round’s client
roster: any participants shuffle by the adversary is a subset of
already-known IDs, so the permutation is within the honest anonymity set.
Integrity: what the state machine guarantees
- A committed BroadcastRecord is the result of exactly one SubmitAggregate followed by exactly one SubmitPartial per committee member in that round’s servers snapshot. No partial is double-counted; no aggregate is re-applied.
- Every published broadcast in the log is computable deterministically from the committed commands. A replay (e.g. after a committee server restart) produces the identical byte sequence.
Integrity: what the state machine does not guarantee
-
The honesty of the aggregator’s fold. A malicious aggregator can:
- omit an envelope (DoS a specific client),
- include a garbage envelope attributed to a real client’s
ClientId(see below), - lie about the
participantslist.
The state machine rejects a
SubmitAggregatewhoseparticipantsset is not a subset of theLiveRound.clientsroster, preventing the aggregator from naming rogue clients. It does not reject an aggregator that names honest clients whose envelopes were never received — but in that case the partial unblinds will remove the expected pads, and the slot of the missing client will show noise (sincemsg_i = 0was not what the honest client sent).A malicious aggregator cannot break anonymity; it can only degrade availability and introduce noise into specific slots.
- The honesty of a committee server’s partial. A malicious server can submit a garbage partial. The broadcast will be XORed with that garbage and published as garbage. The state machine has no way to detect this — DC-net unblinding does not carry a zero-knowledge proof. This is consistent with the paper: malicious servers break availability, not anonymity. A v2 mitigation (not in v1) is an anti-disrupter phase modeled on Riposte’s auditing or Blinder’s MPC format check.
Failure modes that break anonymity (not in the adversary model)
- All committee servers collude. By assumption the any-trust model is void; anonymity is lost. Operators must enforce the any-trust diversity axiom out of band.
- The same DH secret is used across roles. Re-using a
`DhSecret` between a committee server and a client (a pathological misconfiguration) would let the server correlate its own client envelopes with its own partial unblinds. The `ClientId` / `ServerId` type separation guards against this at the type level.
- Traffic analysis across rounds. ZIPNet per se does not defend against a global passive adversary who correlates client connection times across many rounds. This is a transport-level concern and is inherited from mosaik’s iroh transport.
- Universe-level co-location. Running on the shared mosaik
universe (Shape B in design-intro) does not
weaken the anonymity argument: admission to the committee group and
to the public write-side streams is gated per-instance by
`TicketValidator` composition (`BundleValidator<K>` today, plus `Tdx::new().require_mrtd(...)` in the TDX path). A peer on the universe who does not present the expected bundle — or MR_TD — is not admitted to the bond, and therefore cannot submit a `Command`, a partial, or a client envelope. The universe topology is a discovery-scope decision, not a trust-scope decision.
Denial-of-service surface
| Attacker | Attack | Effect |
|---|---|---|
| Compromised TEE | Flood envelopes | Aggregator backpressures, drops lagging stream senders (mosaik TooSlow code 10_413) |
| Compromised aggregator | Omit / delay aggregates | Rounds stall until the committee’s round_deadline fires |
| Compromised committee server | Omit partial | Round never finalizes; operator intervenes or the server is rotated out |
| Compromised committee server | Submit malformed partial | Broadcast is garbage for this round; next round is clean |
| Network | Drop / delay packets | Raft heartbeats time out, election thrashes, rounds delayed |
All of these are availability issues and none of them break anonymity of past or future rounds.
Roadmap to v2
audience: contributors
These are the simplifications baked into v1 and the planned path to address each. The order here is not the implementation order — it is the order in which each change affects the external behavior of the system.
Footprint scheduling
v1: deterministic slot per (client, round) via keyed-blake3 mod
`num_slots`. Per-client collision probability ≈ N / `num_slots`; at N = 8, `num_slots` = 64 that’s ~12%.
v2: the paper’s two-channel scheduling (§3.2). A side channel of
4 * N slots holds footprint reservations. Clients pick a random slot
and an f-bit random footprint each round, write the footprint into
the scheduling vector, and in round r+1 use the assigned message slot
only if their footprint round-tripped unchanged.
Implementation shape: add a second RoundParams::num_sched_slots and
a second broadcast vector, run the same HKDF-AES pad derivation against
a distinct label "zipnet/pad/sched/v1". The CommitteeMachine
consumes two aggregates per round (message + schedule) and splits the
final broadcast into two halves. WIRE_VERSION bump: 1 → 2.
Cover traffic
v1: non-talking clients omit their envelope entirely. This narrows the anonymity set to active talkers.
v2: clients with no message produce a pure-pad envelope (msg_i = 0,
all pads XORed in). The aggregator and committee process these
indistinguishably from talker envelopes. The only visible change at the
state-machine level: the `participants` list grows to include cover-traffic senders.
This is a tiny code change on the client (just remove the
“skip when `message == None`” early return in `client::seal`) plus a
policy decision on how often a client should send cover. Staying cheap
on the server was a first-class design goal of the paper; v2 makes it
concrete.
Ratcheting for forward secrecy
v1: every round reruns HKDF-Extract from the same shared_secret.
Compromise of the secret compromises all past pads.
v2: at the end of each round, both client and server ratchet:
shared_secret ← HKDF-Extract("zipnet/ratchet/v1", shared_secret);
Past shared secrets are unrecoverable from the new one under the PRF
assumption. Both sides must step the ratchet in lockstep; the round
number acts as the step counter. Committee members rederiving a missed
step for a late-joining client catch up by evaluating the KDF once per
elapsed round (`round` applications in total).
For the client, the ratchet state sits in the TEE’s sealed storage (v2 TDX path). For the mock client, it sits in RAM — so a restart re-derives an independent key tree, which is fine.
Multi-tier aggregators
v1: single aggregator.
v2: arbitrary rooted tree of aggregators. Each leaf-level aggregator
XOR-folds from its assigned clients, pushes up to its parent, parent
folds and pushes to root, root publishes to the committee. Filtering
uses `require(|p| p.tags().contains(&tag!("aggregator.tierN")))` and
`with_tags("aggregator.tierN+1")` on `online_when`.
Each aggregator-to-aggregator link uses a dedicated stream (we already
have the pattern in AggregateToServers). No state-machine change
required because the root aggregator still emits one AggregateEnvelope
per round.
Liveness resilience
v1: any committee server being offline halts round finalization —
the state machine waits for len(partials) == len(header.servers).
v2 options:
- Relaxed finalization. Finalize after `t`-of-`n` partials, where `t` is a configured threshold. A missing server’s pads are retroactively removed via a published “apology partial” submitted by any honest server that knows the remaining clients’ pads. (This requires publishing the missing server’s pad seeds under the committee’s shared secret, which defeats the point — so it needs MPC.)
- Aggregator-sponsored timeout. The leader signals a timeout, bumps the `RoundId`, and opens a fresh round without the stuck server’s pads. This is simpler but loses the anonymity contribution of the absent honest server.
The first option is research-complete but not engineering-complete; the second option is trivial and is the candidate for v2.
TDX attestation in the critical path
v1: tee-tdx feature exists but the committee accepts any peer
with a well-formed ClientBundle ticket (our BundleValidator only
checks id/dh_pub consistency).
v2: on each committee admission path add
`.require_ticket(Tdx::new().require_mrtd(expected_mrtd))` so only
enclave-verified peers can participate.
The expected MR_TD comes from the reproducible image build.
ClientRegistry writes only land if the bundle’s PeerEntry also
carries a valid TDX quote.
This is additive to the existing BundleValidator and stacks cleanly
thanks to mosaik’s multi-require_ticket support.
State archival and snapshot sync
v1: CommitteeMachine.broadcasts grows unbounded in RAM;
LogReplaySync is used for catch-up.
v2: implement a StateSync strategy that snapshots the last N
broadcasts + the current InFlight and emits a blob. Externalize the
archival of rotated broadcasts to a sink collection or a replicated
object store.
Rate-limiting tags
v1: absent. A malicious client can flood envelopes.
v2: per the paper’s §3.1 sketch, each envelope carries
PRF_k(ctr || epoch) where ctr is attested by the enclave. The
aggregator dedupes by tag per epoch. This requires the TEE path to have
landed first.
Scheduling vector equivocation protection
v1: a single leader publishes LiveRound into LiveRoundCell;
divergent schedules would be detectable via the schedule_hash input
to the KDF (if we included it — we pass NO_SCHEDULE in v1). Once
footprint scheduling lands, every client must derive schedule_hash
from the same broadcast schedule as the committee, or pads disagree and
the broadcast is noise (correct failure mode per paper §3.2).
Versioning under stable instance names
v1: every incompatible change (any WIRE_VERSION or
signature() bump) produces a new GroupId. Under the UNIVERSE +
instance-salt design described in
design-intro,
this effectively makes the old instance a ghost and forces consumers
to re-pin. If "acme.mainnet" is meant to be an operator-level
identity that outlives schema changes, v1 cannot deliver it.
v2 must pick one of two reconciliation strategies, documented in design-intro — Versioning under stable instance names:
- Version-in-name. `acme.mainnet-v2` retires `acme.mainnet`. Clean, but forces a consumer-side release per bump.
- Lockstep releases. The instance name stays stable across versions and operators + consumers cut matching releases against a shared deployment crate. Avoids id churn at the cost of tighter release-cadence coupling.
Neither is chosen yet. The call is forced the first time a v2 milestone above lands in a production deployment.
Cross-service composition
v1: zipnet is the only service we ship on zipnet::UNIVERSE.
v2: as sibling services (multisig signer, secure storage, attested oracles) land on the same universe, two concerns surface:
- Catalog noise. Every peer on the universe appears in every agent’s discovery catalog. `/mosaik/announce` volume scales with the universe, not with the services an agent cares about. The escape hatch is the per-service derived private network for high-churn internal chatter; the residual cost is paid by everyone. If a service’s traffic would dominate the shared network, it belongs behind its own `NetworkId` — Shape A in design-intro — Two axes of choice — not on the shared one.
- Cross-service atomicity. “Mix a zipnet message AND rotate a multisig signer” cannot be a single consensus transaction; they are different `Group`s, possibly with disjoint membership. If a coordination-heavy use case genuinely needs that, the answer is a fourth primitive that is itself a deployment providing atomic composition — not an ad-hoc cross-group protocol.
Optional directory collection (devops convenience)
Not a core feature. Zipnet’s consumer binding path is compile-time
name reference plus mosaik peer discovery; no on-network
registry is required, and the
CLAUDE.md commitment is explicit that one will
not be added. However, a shared Map<InstanceName, InstanceCard>
listing known deployments may ship as a devops convenience for
humans enumerating instances across operators. If built, it must:
- be documented as a convenience, not a binding path;
- be independently bindable — the SDK never consults it;
- not become load-bearing for ACL or attestation decisions.
Flag it in source as `// CONVENIENCE:` if it lands, to distinguish it
from the `// SIMPLIFICATION:` v2-deferred markers.
Migration across these milestones
Each milestone above changes WIRE_VERSION or at minimum
CommitteeMachine::signature(). Rolling between v1 and an arbitrary
v2 milestone is therefore a coordinated “stop all nodes, start with new
config” operation — same procedure as
rotating the committee secret.
We make no attempt at on-the-fly upgrade paths in this prototype.
Extending zipnet
audience: contributors
This chapter covers two kinds of extension:
- Extending zipnet itself — new commands, collections, streams, ticket classes, or round-parameter knobs within a zipnet deployment.
- Building an adjacent service on the shared universe — a new
mosaik-native service (multisig signer, secure storage, attested
oracle, …) that coexists with zipnet on
zipnet::UNIVERSEand reuses the instance-salt pattern.
The second is the generalisation of the first. The “checklist for a new service” at the end of design-intro is the canonical reference for the second kind; this chapter links to it and concentrates on the concrete how-tos.
Extending zipnet itself
Adding a new command to the committee state machine
- Add a variant to `Command` in `crates/zipnet-node/src/committee.rs`.
- Handle it in `apply()`. Deterministic only — no I/O, no randomness that isn’t derived from `ApplyContext` (see Committee state machine — Apply-context usage).
- Bump the version tag in `CommitteeMachine::signature()` (`v1` → `v2`). This re-scopes the `GroupId` so mismatched nodes cannot bond. This is a breaking change.
- Add a `Query` variant if the new state needs external read access.
- Decide who issues the command. If a non-server peer needs to trigger it, add a `declare::stream!` channel and a side-task in `roles::server` that feeds it into `group.execute`.
Adding a new collection
- Declare in `crates/zipnet-node/src/protocol.rs`:

  declare::collection!(
      pub MyMap = mosaik::collections::Map<K, V>,
      "zipnet.collection.my-map",
  );

- Decide writer and reader roles. Writers join the collection’s internal Raft group and bear the leadership election cost.
- For TDX-gated collections, compose `Tdx::new().require_mrtd(...)` onto the collection’s `require_ticket` alongside the existing `BundleValidator` — see Mosaik integration — TDX gating.
- If the new collection is part of the public surface, think twice. Zipnet’s declared public surface is small (write-side + read-side, see Architecture). A new public collection widens the consumer contract; prefer surfacing via `Zipnet::bind` instead of growing raw declarations.
- Once the target per-instance layout lands, the literal string will be replaced by `INSTANCE.derive("my-map")`; structure the name so the migration is a pure rename.
Adding a new typed stream
- Declare in `protocol.rs`. Prefix predicates with `producer` / `consumer` per the direction semantics (Mosaik integration — predicate direction).
- Use in a role module: `MyStream::producer(&network)` / `MyStream::consumer(&network)` return concrete typed handles.
- If this is a high-churn internal channel (aggregator fan-in, DH gossip), it’s a candidate to live on a derived private network rather than the shared universe — see Architecture — Internal plumbing.
Adding a new TicketValidator
- Implement `mosaik::tickets::TicketValidator` on a fresh type. `BundleValidator<K>` in `crates/zipnet-node/src/tickets.rs` is the reference shape.
- Pick a `TicketClass` constant. Keep it human-readable (`"zipnet.bundle.server"`, etc.) — ticket classes are intent-addressed and the string is the intent.
- Fold a version tag into `signature()` the same way `BundleValidator` does:

  fn signature(&self) -> UniqueId {
      K::CLASS.derive("zipnet.my-validator.v1")
  }

  Bumping `v1` → `v2` re-scopes the `GroupId` of every group that stacks this validator. Treat it as a breaking change.
- Compose with existing validators via mosaik’s multi-`require_ticket` — see Mosaik integration — TDX gating for the stacking pattern.
Changing RoundParams
- Edit `RoundParams::default_v1()` in `crates/zipnet-proto/src/params.rs`.
- Bump `WIRE_VERSION` if the change is semantically meaningful (any client/server disagreement on shape would garble pads otherwise).
- `CommitteeMachine::signature()` already mixes in params fields; every member rederives `GroupId` and old + new do not bond.
- Deploy-time coordination: same procedure as rotating the committee secret.
Adding a TDX attestation requirement
- Turn on the `tee-tdx` feature on `zipnet-node`, `zipnet-server`, `zipnet-client`.
- In the deployment-specific `main`, pre-compute (or hardcode) the expected MR_TD.
- Build a validator:

  use mosaik::tickets::Tdx;
  let validator = Tdx::new().require_mrtd(expected_mrtd);

- Plumb `validator` into the server’s `run` path by stacking it on the committee `GroupBuilder::require_ticket` and on each collection / stream whose producer you want to TDX-gate.
Swapping the slot assignment function
- The slot is picked by `zipnet_core::slot::slot_for(client, round, params)`. Change the body; the caller contract is `-> usize`.
- If you want the footprint scheduling variant, you’ll also want a per-round side channel — see Roadmap — Footprint scheduling.
- Keep it deterministic and agreed upon by all nodes. Bump the protocol version tags accordingly.
Running the integration test under heavier parameters
crates/zipnet-node/tests/e2e.rs uses RoundParams::default_v1()
and a hardcoded 3-server / 2-client topology. Modify directly; the
helpers (cross_sync, run_server, run_client, run_aggregator)
are scoped to the test so no cross-cutting refactor is needed.
RUST_LOG=info,zipnet_node=debug cargo test -p zipnet-node --test e2e -- --nocapture
A successful run ends with
zipnet e2e: round r1 finalized with 2/2 messages recovered
Where to put a new role
If you introduce a fourth participant type (say, an “auditor” that
archives Broadcasts to cold storage), the idiomatic placement is a
new module in crates/zipnet-node/src/roles/ and a sibling crate
under crates/zipnet-auditor/ that delegates to it. Follow the
zipnet-aggregator binary layout.
Measuring something
Mosaik’s Prometheus metrics are auto-wired; add your own via the
metrics crate:
use metrics::{counter, gauge};
counter!("zipnet_rounds_opened_total").increment(1);
gauge!("zipnet_client_registry_size").set(registry.len() as f64);
They will appear at the configured ZIPNET_METRICS endpoint without
any scraper-side changes.
Building an adjacent service on the shared universe
Zipnet’s deployment model is a reusable pattern — the full rationale
is in design-intro. Any service that wants to
coexist on zipnet::UNIVERSE alongside zipnet should reproduce the
three conventions:
- Instance-salt discipline. Every public id descends from `blake3("yourservice." + instance_name)`. Provide both a compile-time macro and a runtime fn that produce byte-identical output.
- A Deployment-shaped convention. Declare the public surface (one or two primitives, ideally) in a single protocol module; export a `bind(&Network, instance_name) -> TypedHandles` function.
- A naming convention, not a registry. Operator → consumer handshake is universe `NetworkId` + instance name + (if TDX-gated) MR_TD. No on-network advertisement required — mosaik’s standard discovery bonds the sides.
Walk the
checklist for a new service
end-to-end before writing any code. The most common mistake is not
answering “what happens when StateMachine::signature() bumps?”
before shipping.
When Shape B is the wrong call
A service whose traffic would dominate catalog gossip on the shared
universe (high-frequency metric streams, bulk replication) belongs
behind its own NetworkId — Shape A in
design-intro — Two axes of choice.
The narrow-public-surface discipline does not rescue a service
whose steady-state traffic is inherently loud; at that point the
noise cost dominates the composition benefit.
Optional directory collection
If your operator community wants a human-browsable list of known
deployments, ship a sibling Map<InstanceName, InstanceCard> as a
devops convenience, not as part of the consumer binding path. See
Roadmap — Optional directory collection
for the discipline.
Glossary
audience: all
Domain terms as they are used in this book and in the source.
Aggregator. The untrusted node that XOR-folds client envelopes for
a round into a single AggregateEnvelope and forwards it to the
committee. One aggregator in v1; a tree of aggregators in v2. Runs
inside zipnet-aggregator.
Any-trust. Security assumption where anonymity holds as long as at least one party in a designated set is honest. The zipnet committee is an any-trust set.
bind. The Zipnet::bind(&Arc<Network>, &str) constructor —
the single public path from a mosaik network handle to a typed zipnet
handle. Takes an instance name; derives every instance-local ID
internally; returns a Zipnet that exposes publish / subscribe /
shutdown. See Quickstart — publish and read.
bind_by_id. The Zipnet::bind_by_id(&Arc<Network>, UniqueId)
variant of bind, for consumers who have pre-derived the instance
UniqueId at compile time via zipnet::instance_id!("name"). The
macro and runtime instance_id fn produce identical bytes, so a
compile-time bind_by_id and a string bind with the same name land
on the same instance.
Bond. mosaik term for a persistent QUIC connection between two
members of the same Raft group, authenticated by the shared
GroupKey.
Broadcast vector. B = num_slots * slot_bytes bytes of output
per round. Default 16 KiB. Each finalized round commits one broadcast
vector to the Broadcasts collection.
Client. A node that authors messages and seals them into envelopes inside a TEE. In the mock path (v1 default), the TEE is replaced by a plain process; see Security checklist.
ClientBundle. Public pair (ClientId, dh_pub) gossiped via a
discovery ticket so servers can derive per-client pads.
ClientId. 32-byte blake3-keyed hash of the client’s X25519
public key. Stable as long as the client’s DH secret is stable.
Committee. The set of any-trust servers that collectively unblind
the round’s aggregate. In v1 this is a Raft group with a bespoke
CommitteeMachine state machine. One committee per instance.
Cover traffic. Client envelopes carrying a zero message, sent to widen the anonymity set at negligible extra cost. The SDK sends cover envelopes by default when an instance is bound but idle. See Publishing messages.
DC net. Dining Cryptographers network — the XOR-based anonymous broadcast construction zipnet descends from. See Chaum 1988.
DH secret. An X25519 static secret held by a client or a server. Compromise of one party’s DH secret only affects that party; compromise of every committee server’s DH secret breaks anonymity.
Encrypted mempool. The canonical motivating deployment shape: TEE-attested wallets seal transactions and publish them through zipnet; builders read the ordered log of sealed transactions; no single party can link a transaction back to its sender. Zipnet supplies the anonymous publish channel; the encryption of the payload itself (threshold, TEE-unsealing, etc.) sits on top.
Envelope. A client’s per-round contribution: a broadcast-vector-sized
buffer containing message ‖ tag at the client’s slot and zeros
elsewhere, XORed with the sum of the client’s per-server pads.
Falsification tag. A keyed-blake3 output of the plaintext message, written alongside the message in the same slot. Verifies that a slot’s payload is intact (§3, “ROMHash” in the paper).
Fold. The aggregator’s XOR combine of all envelopes for a round.
Footprint scheduling. The paper’s two-channel slot reservation scheme (§3.2). v2 feature.
GroupId. mosaik’s 32-byte identifier for a Raft group, derived
from the GroupKey, consensus config, state machine signature, and
any TicketValidator signatures. Fully determined by the instance
name plus the deployment crate version.
GroupKey. Shared committee secret. Admission gate for joining
the committee’s Raft group.
Instance. A single zipnet deployment — one committee, one ACL, one set of round parameters — sharing a universe with other zipnet instances and other mosaik services. Operators stand up and retire instances; users bind to them by name.
Instance name. A short, stable, namespaced string that
identifies an instance within a universe (e.g. acme.mainnet,
preview.alpha, dev.ops). Folds deterministically into every
instance-local ID. Flat namespace per universe — collisions are
silent, so namespace defensively (<org>.<purpose>.<env>).
instance_id. Runtime function and macro on the zipnet facade
that derive an instance’s root UniqueId from its name.
zipnet::instance_id("acme.mainnet") and
zipnet::instance_id!("acme.mainnet") produce identical bytes —
both expand to blake3("zipnet.acme.mainnet"). Sub-IDs chain off it
via .derive("submit" | "broadcasts" | "committee" | …).
LiveRound. The currently-open round’s header: round id, client
roster snapshot, server roster snapshot.
mosaik. The Flashbots library on which this prototype is built. Provides discovery, typed streams, consensus groups, and replicated collections. See docs.mosaik.world.
MR_TD. 48-byte Intel TDX guest measurement. Published by the
operator out of band; pinned by clients; enforced by the mosaik
Tdx bonding layer. See
TEE-gated deployments.
Pad. The output of the KDF for a given (client, server, round)
triple; length B. XOR of pads is the DC-net’s one-time key.
Partial unblind. One committee server’s XOR of its per-client pads over the round’s participant set. XORing all partials into the aggregate yields the broadcast.
PeerId. mosaik identifier for a node: its ed25519 public key
(via iroh). Different from ClientId / ServerId (which are
DH-key-based).
Raft. The consensus protocol used by the committee group. mosaik uses a modified Raft with abstention votes.
Ratchet. Stepping the shared secret forward one round;
shared_secret ← HKDF(shared_secret). Provides forward secrecy. v2
feature.
Round. One execution of the protocol:
OpenRound → SubmitAggregate → N_S × SubmitPartial → finalize.
RoundId. Monotonically increasing integer; r0, r1, ....
RoundParams. Static shape of a round: num_slots, slot_bytes,
tag_len, wire_version. Immutable for the lifetime of an instance.
ServerBundle. Public pair (ServerId, dh_pub) gossiped via a
discovery ticket so clients can derive per-server pads.
ServerId. 32-byte blake3-keyed hash of a committee server’s
X25519 public key.
Slot. One slot_bytes-byte region of the broadcast vector. One
active client per slot per round (modulo deterministic collisions).
State machine signature. UniqueId mixed into GroupId
derivation. Bumped whenever apply semantics or Command shape
changes.
TEE. Trusted Execution Environment. Intel TDX in the production path; mock in the v1 default path.
TDX. Intel Trust Domain Extensions — the TEE zipnet targets.
Guest measurement is MR_TD. See
TEE-gated deployments.
Ticket. Opaque bytes attached to a signed PeerEntry in mosaik
discovery. Zipnet uses tickets of classes zipnet.bundle.client and
zipnet.bundle.server to distribute DH pubkeys, and relies on
mosaik’s require_ticket for per-instance ACL on the public
primitives.
Universe. The shared mosaik NetworkId on which zipnet (and any
other mosaik service) runs. The zipnet facade exports the constant
zipnet::UNIVERSE = unique_id!("mosaik.universe"). Many instances,
and many unrelated services, coexist on one universe.
XOR. Exclusive-or over equal-length byte buffers. The DC-net’s fundamental operation.
Paper cross-reference
audience: contributors
Pointer table from the prototype’s source modules to the ZIPNet paper (eprint 2024/1227). Section / algorithm / figure numbers are from the camera-ready version. Crate paths are workspace-relative.
| Paper item | Prototype location |
|---|---|
| §2.1 “Chaum’s DC net” (background) | zipnet-proto::xor (crates/zipnet-proto/src/xor.rs) |
| §2.2 “ZIPNet overview” (Figure 1b) | crates/zipnet-node/src/lib.rs diagram + Architecture |
| §3 “Falsifiable TEE assumption” | zipnet-proto::crypto::falsification_tag (crates/zipnet-proto/src/crypto.rs) |
| §3 “Setup” (PKI, attestation, sealed key DB) | zipnet-proto::keys + zipnet-node::tickets::BundleValidator (crates/zipnet-node/src/tickets.rs) |
| §3 “Sealed data” | v2 sealed storage in TEE; not implemented in v1 |
| §3.1 “Rate limiting tags” | v2 item; not implemented |
| §3.2 “Scheduling” (footprint) | v2 item; not implemented (see roadmap) |
| §3.3 “Adversary and network model” | Threat model |
| §3.3 “Security argument” | Threat model — anonymity sketch |
| Algorithm 1 (client seal) | zipnet-core::client::seal (crates/zipnet-core/src/client.rs) |
| Algorithm 2 (aggregator fold) | zipnet-core::aggregator::RoundFold (crates/zipnet-core/src/aggregator.rs) |
| Algorithm 3 (server partial + finalize) | zipnet-core::server::partial_unblind + zipnet-core::server::finalize (crates/zipnet-core/src/server.rs) |
| Appendix A (anonymous broadcast definition) | inherited — the prototype does not reprove it |
Crate responsibilities
The workspace splits the paper’s constructions along a purity boundary (see Crate map):
| Crate | Paper content | I/O? |
|---|---|---|
zipnet-proto | Wire types, keys, XOR, falsification tag primitive | No |
zipnet-core | Algorithms 1 / 2 / 3 as pure functions over zipnet-proto types | No |
zipnet-node | The mosaik integration — CommitteeMachine, role event loops, TicketValidator | Yes |
zipnet-server / zipnet-aggregator / zipnet-client | Thin CLI wrappers around zipnet-node::roles::{server, aggregator, client} | Yes |
zipnet | SDK facade (Zipnet::bind, UNIVERSE, instance_id!); wraps zipnet-node for external consumers | Yes |
zipnet-proto and zipnet-core do not import mosaik or tokio;
if a paper construction reaches for either, it is in the wrong crate.
Notation
The paper uses capital N (total users), N_S (servers), |m|
(slot bytes), B (broadcast vector bytes). The prototype uses
lowercase n / num_slots / slot_bytes / broadcast_bytes in
code and generally follows the paper’s naming in comments.
Deliberate deviations from the paper
- No schedule hash in v1. The paper mixes `publishedSchedule` into the KDF salt. The prototype passes a constant `NO_SCHEDULE = [0u8; 32]` in v1 and will replace it with the real schedule hash when footprint scheduling lands. Binding the schedule into the KDF is already plumbed (`crypto::kdf_salt` takes it as an argument), so the upgrade is a caller-site change.
- Tag is keyed-blake3, not HMAC. The paper writes “ROMHash” informally; the prototype picks keyed-blake3 with a fixed domain-separating label for performance. Both are PRFs under standard assumptions; no security difference relative to the paper’s ROM-based argument.
- No traitor tracing protocol. The paper’s §3 suggests that any malformed message flips hash bits and is detected with overwhelming probability. v1 only checks tags on observation; an adversarial client writing to an unused slot is visible via tag mismatch but not attributed. This matches the paper’s “falsifiable trust assumption” but does not implement the §3.1 rate-limiting PRF tags.
- Anonymous broadcast channel for scheduling. The paper runs a second DC net for reservations (§3.2). v1 runs only the message channel.
- Instance namespacing replaces paper-implicit single-deployment
identity. The paper treats a ZIPNet committee as a single global
entity. The prototype runs many instances side by side on a
shared mosaik universe, each with its own salt (see
Designing coexisting systems on mosaik).
No paper construction is changed by this; every derivation folds
`instance_id` in where the paper has an implicit single “deployment” constant.
Environment variables
audience: both
Variables that every binary respects, plus role-specific ones. All
are optional unless marked Required. Values are passed either as
an env var or as the corresponding CLI flag; an explicitly passed flag
beats the env var when both are set (standard `clap(env = "...")` precedence).
Users do not read this page — the SDK takes no env vars. This is an operator reference. When it diverges from what a binary currently parses, the binary is lagging the documented deployment model; align the binary to this page, not the other way around.
Common to every binary
| Variable | CLI flag | Default | Description |
|---|---|---|---|
ZIPNET_INSTANCE | --instance | Required | Instance name for this deployment (e.g. acme.mainnet). Folds into committee GroupId, submit StreamId, broadcasts StoreId. All processes of one deployment must share this value. |
ZIPNET_UNIVERSE | --universe | zipnet::UNIVERSE (mosaik.universe) | Override the shared mosaik universe NetworkId. Set only for isolated federations; leave unset for normal deployments. |
ZIPNET_BOOTSTRAP | --bootstrap | (none) | Comma- or repeat-flag-separated PeerIds on the shared universe to dial on startup. Universe-level, not per-instance. |
ZIPNET_METRICS | --metrics | (none) | Prometheus exporter bind address, e.g. 0.0.0.0:9100. |
ZIPNET_SECRET | --secret | (random) | Seed for this node’s iroh secret. Anything not 64-hex is blake3-hashed. Recommended on committee servers and the aggregator for stable PeerId. |
RUST_LOG | — | info,zipnet_node=debug | Standard tracing_subscriber filter. |
zipnet-server
| Variable | CLI flag | Default | Description |
|---|---|---|---|
| `ZIPNET_COMMITTEE_SECRET` | `--committee-secret` | Required | Shared committee admission secret. Treated as a root credential — all committee servers of the same instance must share this value; clients and the aggregator must not have it. |
| `ZIPNET_MIN_PARTICIPANTS` | `--min-participants` | `1` | Minimum registered clients before the leader opens a round. |
| `ZIPNET_ROUND_PERIOD` | `--round-period` | `2s` | How often the leader attempts to open a new round. |
| `ZIPNET_ROUND_DEADLINE` | `--round-deadline` | `6s` | How long a round may stay open before the leader force-advances. |
zipnet-aggregator
| Variable | CLI flag | Default | Description |
|---|---|---|---|
| `ZIPNET_FOLD_DEADLINE` | `--fold-deadline` | `2s` | Time window after a round opens in which the aggregator accepts envelopes. |
zipnet-client
| Variable | CLI flag | Default | Description |
|---|---|---|---|
| `ZIPNET_MESSAGE` | `--message` | (none) | UTF-8 message to seal each round. Omit to run as cover traffic. |
| `ZIPNET_CADENCE` | `--cadence` | `1` | Talk every Nth round (`1` = every round). |
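The cadence semantics can be sketched as a tiny predicate. This is a hypothetical helper, not zipnet's actual code, and zero-based round numbering is an assumption:

```rust
// Hypothetical sketch of ZIPNET_CADENCE semantics: talk on every Nth round,
// run as cover traffic otherwise. Zero-based round numbering is an
// assumption here, not a confirmed zipnet detail.
fn talks_this_round(round: u64, cadence: u64) -> bool {
    cadence != 0 && round % cadence == 0
}

fn main() {
    // cadence = 1: talk every round.
    assert!((0..5).all(|r| talks_this_round(r, 1)));
    // cadence = 3: talk on rounds 0, 3, 6, ...
    let talking: Vec<u64> = (0..8).filter(|&r| talks_this_round(r, 3)).collect();
    assert_eq!(talking, vec![0, 3, 6]);
}
```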
Duration syntax
The duration parsers accept `Nms`, `Ns`, and `Nm` (e.g. `500ms`, `2s`,
`1m`). Hours and days are not supported; if you need them, file an
issue.
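A minimal re-implementation of that grammar, as a sketch only (the real parser lives inside the zipnet binaries; `parse_duration` here is hypothetical):

```rust
use std::time::Duration;

// Hypothetical re-implementation of the documented grammar: Nms, Ns, Nm.
fn parse_duration(s: &str) -> Result<Duration, String> {
    // Check the two-character unit first so "500ms" is not read as "500m" + "s".
    let (num, unit_ms) = if let Some(n) = s.strip_suffix("ms") {
        (n, 1u64)
    } else if let Some(n) = s.strip_suffix('s') {
        (n, 1_000)
    } else if let Some(n) = s.strip_suffix('m') {
        (n, 60_000)
    } else {
        return Err(format!("missing unit in {s:?} (use ms, s, or m)"));
    };
    let n: u64 = num.parse().map_err(|_| format!("bad number in {s:?}"))?;
    Ok(Duration::from_millis(n * unit_ms))
}

fn main() {
    assert_eq!(parse_duration("500ms").unwrap(), Duration::from_millis(500));
    assert_eq!(parse_duration("2s").unwrap(), Duration::from_secs(2));
    assert_eq!(parse_duration("1m").unwrap(), Duration::from_secs(60));
    assert!(parse_duration("1h").is_err()); // hours are rejected, as documented
}
```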
Secret syntax
All “secret”-style inputs (`ZIPNET_SECRET`, `ZIPNET_COMMITTEE_SECRET`)
follow the same rule:
- Exactly 64 hex characters → decoded as 32 raw bytes.
- Anything else → blake3-hashed into 32 bytes.

This matches mosaik’s own secret-key handling, so operators can reuse
whatever seed format they already have (e.g. `openssl rand -hex 32`).
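The rule can be sketched as follows. The 64-hex fast path is decoded for real with the standard library; the blake3 branch is only signalled, since blake3 is an external crate, and the `SecretSeed`/`classify_secret` names are hypothetical:

```rust
// Sketch of the documented secret rule: exactly 64 hex characters decode to
// 32 raw bytes; anything else would be blake3-hashed into 32 bytes.
enum SecretSeed {
    Raw([u8; 32]),     // input was exactly 64 hex characters
    NeedsHash(String), // anything else: blake3-hash this into 32 bytes
}

fn classify_secret(input: &str) -> SecretSeed {
    if input.len() == 64 && input.chars().all(|c| c.is_ascii_hexdigit()) {
        let mut out = [0u8; 32];
        for (i, byte) in out.iter_mut().enumerate() {
            *byte = u8::from_str_radix(&input[2 * i..2 * i + 2], 16).expect("checked hex");
        }
        SecretSeed::Raw(out)
    } else {
        SecretSeed::NeedsHash(input.to_owned())
    }
}

fn main() {
    // e.g. the output of `openssl rand -hex 32`
    match classify_secret(&"ab".repeat(32)) {
        SecretSeed::Raw(bytes) => assert_eq!(bytes, [0xab; 32]),
        SecretSeed::NeedsHash(_) => panic!("64 hex chars should decode raw"),
    }
    // A passphrase falls through to the hashing branch.
    assert!(matches!(classify_secret("hunter2"), SecretSeed::NeedsHash(_)));
}
```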
Instance derivation
Every instance-local ID is derived from `ZIPNET_INSTANCE`:

```text
INSTANCE   = blake3("zipnet." + ZIPNET_INSTANCE)  // UniqueId
SUBMIT     = INSTANCE.derive("submit")            // StreamId
BROADCASTS = INSTANCE.derive("broadcasts")        // StoreId
COMMITTEE  = INSTANCE.derive("committee")         // GroupKey material
...
```
The consumer-side `zipnet::instance_id!("name")` macro produces the
same bytes as the server-side `ZIPNET_INSTANCE=name` derivation, so
a typo on either side lands on a `GroupId` nobody serves. The
failure mode is `Error::ConnectTimeout` on the client, not a
distinct “not found” error — zipnet has no on-network registry.
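The derivation chain and the both-sides-agree property can be sketched with a stand-in hash. The real IDs are 32-byte blake3 outputs; `DefaultHasher` is used here only so the sketch runs without external crates, and the `UniqueId`/`instance_id` names below are illustrative, not zipnet's API:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the blake3-based UniqueId; only the *shape* of the
// derivation chain mirrors the documentation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct UniqueId(u64);

impl UniqueId {
    fn new(input: &str) -> Self {
        let mut h = DefaultHasher::new();
        input.hash(&mut h);
        UniqueId(h.finish())
    }
    fn derive(&self, label: &str) -> Self {
        let mut h = DefaultHasher::new();
        self.0.hash(&mut h);
        label.hash(&mut h);
        UniqueId(h.finish())
    }
}

// Both the server-side env derivation and the consumer-side macro reduce to
// the same pure function of the instance name.
fn instance_id(name: &str) -> UniqueId {
    UniqueId::new(&format!("zipnet.{name}"))
}

fn main() {
    let a = instance_id("acme.mainnet");
    assert_eq!(a, instance_id("acme.mainnet")); // same name → same IDs
    assert_ne!(a, instance_id("acme.mainet")); // a typo lands elsewhere
    assert_ne!(a.derive("submit"), a.derive("broadcasts")); // disjoint streams
}
```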
Two deployments with different `ZIPNET_INSTANCE` values on the same
universe are completely independent committees: disjoint
`GroupId`s, disjoint streams, no crosstalk. Useful for:
- running dev/staging/prod in one machine pool,
- running per-tenant deployments on shared hardware,
- running a public testnet (`preview.alpha`) alongside production (`mainnet`).

Instance names share a flat namespace per universe — two operators
picking the same name collide in the committee group, and neither
works correctly. Namespace defensively
(`<org>.<purpose>.<env>`, e.g. `acme.mixer.mainnet`).
Universe override (ZIPNET_UNIVERSE)
Default is the shared mosaik universe (zipnet::UNIVERSE = unique_id!("mosaik.universe")). Override only when running an
isolated federation that intentionally does not share peers with the
rest of the mosaik ecosystem. Every server, aggregator, and client
of one deployment must agree on this value; consumers of the SDK
build against zipnet::UNIVERSE unless their code explicitly passes
a different NetworkId to Network::new.
Metrics reference
audience: operators
Every zipnet binary exposes a Prometheus endpoint when
ZIPNET_METRICS is set. The table below lists the metrics worth
scraping in production. Metrics starting with mosaik_ are emitted
by the underlying mosaik library and documented in the
mosaik book — Metrics;
the ones that are load-bearing for zipnet operations are listed here.
Metrics that are instance-scoped carry an instance label whose value
is the operator’s ZIPNET_INSTANCE string (e.g. acme.mainnet).
When a host multiplexes several instances (see
Operator quickstart — running many instances),
every instance-scoped metric is emitted once per instance.
Per-role metrics
Committee server
| Metric | Kind | Meaning | Healthy value |
|---|---|---|---|
| `mosaik_groups_leader_is_local{instance=<name>}` | gauge (0/1) | Whether this node is the Raft leader for the instance | Exactly one `1` across the committee of each instance |
| `mosaik_groups_bonds{peer=<id>,instance=<name>}` | gauge (0/1) | Whether a bond to a specific peer is healthy | `1` for every other committee member of the same instance |
| `mosaik_groups_committed_index{instance=<name>}` | gauge | Highest committed Raft index | Monotonically increasing, step ≈ 2 per round |
| `zipnet_rounds_finalized_total{instance=<name>}` | counter | Rounds this node saw finalize | Increases at ~1 per `ZIPNET_ROUND_PERIOD` |
| `zipnet_partials_submitted_total{instance=<name>}` | counter | Partials this node contributed | Increases by 1 per round |
| `zipnet_client_registry_size{instance=<name>}` | gauge | Clients currently registered | Roughly equals the expected client count |
| `zipnet_server_registry_size{instance=<name>}` | gauge | Servers currently registered | Equals committee size |
The mosaik_groups_leader_is_local gauge is the one the operator
quickstart tells you to check when bringing a new instance up —
exactly one committee node should report 1 per instance.
Aggregator
| Metric | Kind | Meaning | Healthy value |
|---|---|---|---|
| `mosaik_streams_consumer_subscribed_producers{stream=<id>,instance=<name>}` | gauge | Number of producers this consumer is attached to | = client count for `ClientToAggregator` |
| `mosaik_streams_producer_subscribed_consumers{stream=<id>,instance=<name>}` | gauge | Number of consumers attached to this producer | = committee size for `AggregateToServers` |
| `zipnet_aggregates_forwarded_total{instance=<name>}` | counter | Aggregates sent to the committee | ≈ rounds finalized |
| `zipnet_fold_participants{round=<r>,instance=<name>}` | histogram | Clients per folded round | Depends on your client count |
| `zipnet_clients_registered_total{instance=<name>}` | counter | Client bundles mirrored into `ClientRegistry` | Grows to the client count, then plateaus |
Client
| Metric | Kind | Meaning | Healthy value |
|---|---|---|---|
| `zipnet_envelopes_sent_total{instance=<name>}` | counter | Envelopes sealed and pushed | Increases by 1 per talk round |
| `zipnet_envelope_send_errors_total{instance=<name>}` | counter | Send failures | Ideally 0 |
| `zipnet_client_registered{instance=<name>}` | gauge (0/1) | Whether our bundle is in `ClientRegistry` | `1` after the first few seconds |
Metrics that indicate trouble
| Metric | Fires when | First action |
|---|---|---|
| `mosaik_groups_leader_is_local` is `1` on zero or ≥ 2 nodes of one instance for > 1 min | Split-brain or no leader | Incident response — split-brain |
| `mosaik_streams_consumer_subscribed_producers` drops to 0 on the aggregator | Clients disconnected | Check client-side logs for bootstrap failures |
| `zipnet_aggregates_forwarded_total` flat for > 3 × `ZIPNET_ROUND_PERIOD` | Aggregator stuck or committee cannot open rounds | Incident response — stuck rounds |
| `zipnet_server_registry_size` < committee size for > 30 s | A committee server failed to publish | Check that server’s boot log |
| `mosaik_groups_committed_index` frozen | Raft stalled | Check clock skew, network partition |
Every trouble alert should be scoped by instance so multi-instance
hosts do not conflate a stuck testnet with a stuck production
committee.
Recording rules for Prometheus
Useful derived series (all scoped by instance):
```promql
# Round cadence per instance
rate(zipnet_rounds_finalized_total[5m])

# Average participants per round per instance
rate(zipnet_fold_participants_sum[5m])
  / rate(zipnet_fold_participants_count[5m])

# Aggregator fold saturation (clients dropped by the deadline)
(
  rate(zipnet_clients_registered_total[5m])
  -
  rate(zipnet_fold_participants_sum[5m]) / rate(zipnet_rounds_finalized_total[5m])
)
```
Logs that should never fire (without a concurrent alert)
- `rival group leader detected` on any committee server.
- `SubmitAggregate with bad length` / `SubmitPartial with bad length` in a committee log.
- `failed to mirror LiveRoundCell` persistently.
- `committee offline — aggregate dropped` — either the committee is down or bundle tickets never replicated.
If any of these fire without a concurrent incident, treat it as a protocol invariant break and escalate to the contributor on-call.