Designing coexisting systems on mosaik

audience: contributors

Mosaik composes primitives: Stream, Group, Collection, TicketValidator. It does not prescribe how a whole service — a deployment with its own operator, its own ACL, its own lifecycle — is shipped onto a network and made available to third-party agents. That convention lives one layer above mosaik and has to be invented per service family.

This page describes the convention zipnet uses, why it was picked, and what a contributor building the next service on mosaik (multisig signer, secure storage, attested oracle, …) should reuse. It is a mental model, not an API reference: the concrete instantiation is in Architecture.

The problem

A mosaik network is a universe where any number of services run concurrently. Each service:

  • is operated by an identifiable organisation (or coalition) and has its own ACL
  • ships as a bundle of internally-coupled primitives — usually a committee Group, one or more collections backed by that group, and one or more streams feeding it
  • must be addressable and discoverable by external agents who do not operate it
  • co-exists with many other instances of itself (testnet, staging, per-tenant deployments) and with unrelated services on the same wire

The canonical shape zipnet itself was built for is an encrypted mempool — a bounded set of TEE-attested wallets publishing sealed transactions for an unbounded set of builders to read, ordered and unlinkable to sender. Other services built on this pattern (signers, storage, oracles) have the same structural properties.

Nothing about these requirements is in mosaik itself. The library will happily let you stand up ten Groups and thirty Streams on one Network; it says nothing about which of them constitute “one zipnet” versus “one multisig”.

Two axes of choice

Every design in this space picks a point on two axes.

  1. Network topology. Does a deployment live on its own NetworkId, or on a shared universe with peers of every other service?
  2. Discovery. How does an agent go from “I want zipnet-acme” to bonded-and-consuming without hardcoded bootstraps or out-of-band config?

Four shapes fall out:

  • A. Service-per-network. Topology: one NetworkId per deployment; agents multiplex many Network handles. Pick for strong isolation, per-service attestation scope, no cross-service state.
  • B. Shared meta-network. Topology: one universe NetworkId; deployments are overlays of Groups/Streams. Pick for many services per agent and cheap composition; requires a narrow public surface to tame noise.
  • C. Derived sub-networks. Topology: ROOT.derive(service).derive(instance) hybrids. Pick for isolation with structured discovery, at the cost of still being multi-network per agent.
  • D. Service manifest. Orthogonal to the above: a rendezvous record naming all deployment IDs. Composable with A/B/C; required for discovery without out-of-band config.

Zipnet picks B for topology, with optional derived private networks for high-volume internal plumbing, and compile-time deployment-fingerprint derivation for discovery — no on-network registry required. The rest of this page unpacks why and how.

Narrow public surface

The single most important discipline in this model is that a deployment exposes a small, named, finite set of primitives to the shared network. The ideal is one or two — a stream plus a collection, two streams, a state machine plus a collection, and so on. Everything else is private to the bundle and wired up by the deployment author, who is free to hardcode internal dependencies as aggressively as they like.

Zipnet’s outward surface decomposes cleanly into two functional roles, even though it carries several declare! types:

  • write-side: ClientRegistrationStream and ClientToAggregator — ticket-gated, predicate-gated, used by external TEE clients to join a round and submit sealed envelopes.
  • read-side: LiveRoundCell, Broadcasts, plus the two registries — read-only ambient round state that external agents need in order to seal envelopes and interpret finalized rounds.

An integrator’s mental model is “a way to write, a way to read”. They do not need to know the committee exists, how many aggregators there are, or how DH shuffles are scheduled. Internally the bundle looks like this:

  shared network                                     (public surface)
  ─────────────────────────────────────────────────────────────────
  ClientRegistrationStream, ClientToAggregator  ─┐
                                                 │
  LiveRoundCell, Broadcasts, ClientRegistry,   ◀─┤
  ServerRegistry                                 │
                                                 │
  ─────────────────────────────────────────────────
  derived private network (optional)             │  (private plumbing)
                                                 ▼
      Aggregator fan-in / DH-shuffle gossip      Committee Group<CommitteeState>
      Round-scheduler chatter                    AggregateToServers stream
                                                 BroadcastsStore (backs Broadcasts)

The committee Group stays on the shared network because the public-read collections are backed by it and bridging collections across networks is worse than the catalog noise. Only the genuinely high-churn channels belong on a derived private network.

The underlying principle: content + intent addressing

The deployment-model decisions above are concrete applications of a single discipline that generalises to every mosaik-native organism.

Consensus-critical identifiers — GroupId, StreamId, StoreId, any on-wire ID that two parties must agree on to interact safely — MUST be derivable from a deterministic hash of three inputs:

  1. Intent: an operator-chosen label (the instance name).
  2. Content: every state-affecting parameter that determines what the deployment does — schema versions, wire-size constants, ConsensusConfig, round-window parameters, encoding format, every immutable initial-state input.
  3. ACL: the TicketValidator composition gating admission — JWT issuer keys, TDX MR_TD pins, expiry policies.

id = blake3(intent ‖ content ‖ acl).

This is content + intent addressing — Merkle-DAG storage’s content-as-id discipline merged with mosaik’s intent-addressed unique_id!. Both at once.

The argument for it is failure-mode based. Suppose only the name were folded into the identity (the naïve case). Two operators pick acme.mainnet with different ShuffleWindows. Same GroupId. Their committees attempt to bond, Raft elects a leader across incompatible state machines, commits land on one side that the other rejects, and the public broadcast log fills with garbage. The failure is silent and corrupting. With content + intent addressing, mismatched windows produce different IDs. The committees never see each other; consumers compiling against one configuration get ConnectTimeout against an operator running the other. Failure is loud and debuggable.

The same argument applies to ACL. Two operators sharing a name and parameters but using different TDX MR_TDs (e.g. one runs attested-prod, one runs mock-dev) should not bond. Folding the ticket-validator composition into the ID makes that automatic.

Concrete corollary: every signature-altering input lives in a deployment Config struct. The SDK derives identifiers from the struct, not from any subset of it. Operators publish the struct (or its serialized fingerprint) as the handshake. Consumers compile it in. There is no configuration that is both consensus-critical and not in the identity. If you find one, it’s a bug — either fold it in, or convince yourself it’s not actually consensus-critical and document why.
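A minimal sketch of this rule, assuming only the blake3 crate; the UniqueId struct, the length-prefixed field encoding, and the derive labelling are choices of this sketch, not mosaik's actual implementation:

// id = blake3(intent ‖ content ‖ acl), with each field length-prefixed so
// that no two distinct (intent, content, acl) triples can collide by
// sliding bytes across a field boundary.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct UniqueId([u8; 32]);

impl UniqueId {
    pub fn new(intent: &[u8], content: &[u8], acl: &[u8]) -> Self {
        let mut h = blake3::Hasher::new();
        for field in [intent, content, acl] {
            h.update(&(field.len() as u64).to_le_bytes());
            h.update(field);
        }
        Self(*h.finalize().as_bytes())
    }

    // Chain child identifiers off a root, as in DEPLOYMENT.derive("submit").
    pub fn derive(&self, label: &str) -> Self {
        let mut h = blake3::Hasher::new();
        h.update(&self.0);
        h.update(label.as_bytes());
        Self(*h.finalize().as_bytes())
    }
}

Two configs that differ in any consensus-critical byte produce unrelated roots, which is exactly the loud-failure property argued for above.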

The three conventions in the next section are the application of this principle to zipnet specifically: content + intent addressed fingerprint derivation, the typed-primitive Shuffle<D> API (content in the type system), Config-based identity (content + intent + ACL bundled). Other mosaik organisms — multisig signers, storage, oracles — apply the same discipline with their own state shape.

The three conventions

Three things make this pattern work. A contributor starting a new service should reproduce all three.

1. Identifier derivation from the deployment fingerprint

Every public ID in a deployment descends from one root that canonically encodes every signature-altering input — the operator’s chosen name, the datum type’s schema and wire size, the shuffle window, the per-deployment init salt, the ACL composition. The SDK hashes these into one UniqueId and chains everything else off it with .derive() for structural clarity:

  DEPLOYMENT   = blake3("zipnet|" || name || "|type=" || TYPE_TAG ||
                        "|size=" || WIRE_SIZE || "|window=" || window ||
                        "|init=" || init)
  SUBMIT       = DEPLOYMENT.derive("submit")           // StreamId
  BROADCASTS   = DEPLOYMENT.derive("broadcasts")       // StoreId
  COMMITTEE    = DEPLOYMENT.derive("committee")        // GroupKey material
  ...

The fingerprint inputs all live in a zipnet::Config struct that’s const-constructible, plus the datum’s ShuffleDatum impl. Operators publish the Config (or its serialised form) and the datum schema; consumers compile both in. The consumer-side API is three typed constructors:

const ACME_MAINNET: zipnet::Config = zipnet::Config::new("acme.mainnet")
    .with_window(zipnet::ShuffleWindow::interactive())
    .with_init([0u8; 32]); // operator-published random bytes

let mut s = zipnet::Zipnet::<Note>::submit(&network, &ACME_MAINNET).await?;
let mut r = zipnet::Zipnet::<Note>::receipts(&network, &ACME_MAINNET).await?;
let mut x = zipnet::Zipnet::<Note>::read(&network, &ACME_MAINNET).await?;

Zipnet::<D>::* constructors derive the deployment id internally; raw StreamId/StoreId/GroupId values are never exposed across the crate boundary. Zipnet::<D>::deployment_id(&Config) is exposed as a pure function for diagnostics — print it and the operator can verify both sides agree on the fingerprint without a wire round-trip.
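A usage sketch of that diagnostic; the Debug formatting is an assumption, only the pure-function call is part of the convention:

// Operator and consumer each run this and compare the printed value out
// of band; a mismatch means a Config or datum-schema disagreement, not a
// transport problem.
println!("fingerprint: {:?}", zipnet::Zipnet::<Note>::deployment_id(&ACME_MAINNET));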

2. A Deployment-shaped convention

Authors should declare a deployment’s public surface once, in one place, so consumers can bind without reassembling ID derivations by hand. Whether this is a literal declare::deployment! macro or a hand-written impl Deployment is ergonomics; the constraint is that the public surface is a declared, named, finite set of primitives — not “whatever the bundle happens to put on the network today”.

Every deployment crate should export:

  • the public declare::stream! / declare::collection! types for its surface, colocated in a single protocol module
  • a bind(&Network, instance_name) -> TypedHandles function (see the sketch below)
  • the intended TicketValidator composition for each public primitive

A service that exposes eight unrelated collections has probably not thought hard enough about its interface.
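A sketch of the shape convention 2 asks for; the handle types, the error type, and the body are placeholders, and only the single declared surface plus the one bind entry point are the convention itself:

// Illustrative typed handles wrapping the two public primitives.
pub struct SubmitStream;     // would wrap the ticket-gated write stream
pub struct RoundCollection;  // would wrap the read-only round state

pub struct TypedHandles {
    pub submit: SubmitStream,
    pub rounds: RoundCollection,
}

// The only place ID derivation happens. Consumers bind here and never
// reassemble StreamId/StoreId/GroupId values by hand.
pub async fn bind(network: &mosaik::Network, instance_name: &str) -> anyhow::Result<TypedHandles> {
    // Derive the fingerprint from instance_name plus the compiled-in config,
    // open the public primitives, and return typed wrappers.
    todo!()
}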

3. A fingerprint convention, not a registry

Two parties agree on a deployment iff they compute the same fingerprint from the same Config + datum schema. No on-network advertisement is required — the service does not need to advertise its own existence. Bonding is automatic via mosaik’s discovery once both sides hash to the same GroupId.

The operator’s complete public contract is two items, plus an optional third:

  1. The Config struct (or its serialised hex fingerprint) — a single blob that captures name + window + init.
  2. The datum schema — D::TYPE_TAG and D::WIRE_SIZE, typically shipped as a small Rust crate the consumer depends on.
  3. (If TDX-gated) the committee image’s MR_TD, which the consumer pins via the tee-tdx Cargo feature.

These travel via release notes, a deployment-spec crate on crates.io, a setup email — anything out of band. The shared universe NetworkId is zipnet::UNIVERSE by default; it is called out explicitly only when an isolated federation runs on a different one.
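For concreteness, a sketch of a deployment-spec crate shipping items 1 and 2; the Note type, the tag string, and the ShuffleDatum method signature are illustrative, and the Config mirrors the earlier example:

// Item 1: the Config, byte-for-byte what the operator runs.
pub const ACME_MAINNET: zipnet::Config = zipnet::Config::new("acme.mainnet")
    .with_window(zipnet::ShuffleWindow::interactive())
    .with_init([0u8; 32]); // stand-in for the operator-published bytes

// Item 2: the datum schema, pinning TYPE_TAG and WIRE_SIZE.
pub struct Note(pub [u8; 240]);

impl zipnet::ShuffleDatum for Note {
    const TYPE_TAG: &'static str = "note/v1";
    const WIRE_SIZE: usize = 240;
    fn encode(&self) -> Vec<u8> { self.0.to_vec() } // assumed method shape
}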

A directory may exist — a shared collection listing known deployments — but it is a devops convenience for humans enumerating deployments, not part of the consumer binding path. Build it if you need it; nothing about the pattern requires it.

Wire contracts within a typed primitive

The deployment model gets a service onto the shared universe and lets consumers bond to it; what flows over the wire after that is a second design space. For shuffler-class primitives — anonymous broadcast, mixers, threshold-signing fan-out, anything where a public observer’s correlation power is part of the threat model — two invariants are non-negotiable.

Constant payload size

Every shuffler is parameterised by a datum type D implementing a ShuffleDatum trait that carries const WIRE_SIZE: usize. Every value of D MUST canonically serialise to exactly WIRE_SIZE bytes; the SDK rejects encodings of any other length. WIRE_SIZE folds into the instance’s derived UniqueId: Shuffle<TxV1> (240 bytes) and Shuffle<TxV2> (1024 bytes) at the same instance name produce different GroupIds/StreamIds and do not interoperate. This collapses the “versioning under stable instance names” problem (see below) into the type system: a schema bump that changes WIRE_SIZE is a clean retire-and-replace, no on-wire negotiation, no silent split-brain.
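In trait terms, something like the following; TYPE_TAG and WIRE_SIZE are from the text above, while the encode signature and the rejection helper are assumptions of this sketch:

pub trait ShuffleDatum: Sized {
    const TYPE_TAG: &'static str;
    const WIRE_SIZE: usize;
    fn encode(&self) -> Vec<u8>; // canonical; MUST be exactly WIRE_SIZE bytes
}

// The boundary check the text describes: any other length is rejected.
pub fn to_wire<D: ShuffleDatum>(d: &D) -> Result<Vec<u8>, String> {
    let bytes = d.encode();
    if bytes.len() == D::WIRE_SIZE {
        Ok(bytes)
    } else {
        Err(format!("{}: encoded {} bytes, wire size is {}",
                    D::TYPE_TAG, bytes.len(), D::WIRE_SIZE))
    }
}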

Variable on-wire payload sizes leak sender identity through traffic analysis. The DC-net and anonymous-broadcast literature is unanimous: Chaum 1988 (Dining Cryptographers, foundational constant-bit-length transmission); Riposte (Corrigan-Gibbs, Boneh, Mazières — S&P 2015) on fixed-size cells with cuckoo-hashing for slot allocation; Vuvuzela (van den Hooff et al. — SOSP 2015) on fixed-size noise envelopes from every client every round; Atom (Kwon, Corrigan-Gibbs, Devadas, Ford — SOSP 2017) on verifiable shuffles over fixed cells; Loopix (Piotrowska et al. — USENIX Security 2017) on Sphinx packets with fixed-size SURBs; Stadium (Tyagi et al. — SOSP 2017) on fixed-cell partitioning; ZIPNet (eprint 2024/1227) inherits the discipline by construction.

Multi-slot fragmentation without cover is explicitly hostile to anonymity — cross-slot correlation reveals which slots came from one sender. Variable-size application data is padded (preferred) or chunked with cover (sometimes); both are the application’s job, not the primitive’s. The shuffler primitive enforces a single fixed size at its boundary and refuses anything else.
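A minimal application-side padding helper of the kind described above; the 2-byte length-prefix scheme is illustrative:

// Length-prefix then zero-pad to the wire size; refuse anything that
// cannot fit. Assumes wire_size < 64 KiB so the u16 prefix suffices.
fn pad_to_wire(msg: &[u8], wire_size: usize) -> Option<Vec<u8>> {
    if msg.len() + 2 > wire_size {
        return None;
    }
    let mut out = vec![0u8; wire_size];
    out[..2].copy_from_slice(&(msg.len() as u16).to_le_bytes());
    out[2..2 + msg.len()].copy_from_slice(msg);
    Some(out)
}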

Receipts encrypted to the requester

A submitter wants observability: did my envelope land, collide, get dropped? A naive receipt collection — public per-publisher outcomes broadcast alongside the shuffled output — would leak the sender→message linkage the protocol exists to hide.

The committed shape: a Receipts stream on the public surface, where every item is an opaque ciphertext encrypted via ECIES (ephemeral X25519 + AEAD) to the original submitter’s long-term X25519 pubkey — the same key already in ClientRegistry for DC-net pad derivation, so no new key infrastructure. Consumers trial-decrypt every receipt; AEAD auth tag failures discriminate. A public observer sees a stream of indistinguishable ciphertexts of identical size and cannot link any receipt to any submitter.
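A consumer-side trial-decrypt sketch of that shape, assuming the x25519-dalek, chacha20poly1305, and blake3 crates and an illustrative wire layout of ephemeral pubkey (32 bytes) ‖ nonce (12 bytes) ‖ AEAD ciphertext; the key-derivation context string is also an assumption:

use chacha20poly1305::{aead::Aead, ChaCha20Poly1305, Key, KeyInit, Nonce};
use x25519_dalek::{PublicKey, StaticSecret};

fn try_open_receipt(me: &StaticSecret, receipt: &[u8]) -> Option<Vec<u8>> {
    if receipt.len() < 32 + 12 + 16 {
        return None; // too short to carry a valid AEAD tag
    }
    let eph = PublicKey::from(<[u8; 32]>::try_from(&receipt[..32]).ok()?);
    let shared = me.diffie_hellman(&eph);
    // Domain-separated key derivation from the ECDH shared secret.
    let key = blake3::derive_key("zipnet/receipt/v1", shared.as_bytes());
    let aead = ChaCha20Poly1305::new(Key::from_slice(&key));
    // An AEAD tag failure is the discriminator: this receipt is not ours.
    aead.decrypt(Nonce::from_slice(&receipt[32..44]), &receipt[44..]).ok()
}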

Trial-decrypt cost: O(N_receipts) ECDH + AEAD per round per receipt-watcher, microseconds each. Bounded for the permissioned client sets these primitives target. For large-N deployments, a per-round HMAC tag scheme (recipient-derivable, round-rotated to avoid cross-round linkability) is the upgrade path. Not in v1.

The contract for downstream services building their own typed shuffler-class primitives on mosaik: keep both invariants. Constant payload size at the type boundary; opaque public-stream receipts. Anything else compromises a property the primitive exists to provide.

What this buys you

  • A third-party agent’s mental model collapses to: “one Network, many services, each bound by instance name.”
  • Multiple instances of the same service coexist trivially — each derives disjoint IDs from its salt.
  • ACL is per-instance, enforced at the edge via require_ticket on the public primitives; no second ACL layer is needed inside the bundle.
  • Internal plumbing can move to a derived private network without changing the public surface.
  • Private-side schema changes (StateMachine::signature() bumps) are absorbed behind the instance identity, as long as operators and consumers cut releases against the same version of the deployment crate.

Where the pattern strains

Three things are not free under this convention. Every new service author should be honest about them up front.

Cross-service atomicity is out of scope

There is no way to execute “mix a message AND rotate a multisig signer” in one consensus transaction. They are different Groups with different GroupIds, possibly with disjoint membership. If a service genuinely needs that — rare, but real for some coordination-heavy cases — the right answer is a fourth primitive that is itself a deployment providing atomic composition across services, not an ad-hoc cross-group protocol.

Versioning under stable instance names

If StateMachine::signature() changes, GroupId changes, and consumers compiled against the old code silently split-brain. Under multi-instance, the expectation is that “zipnet-acme” is an operator-level identity that outlives schema changes. Two ways to reconcile:

  • Let the instance salt carry a version (zipnet-acme-v2), and treat version bumps as retiring the old instance. Clean, but forces consumers to re-pin and release a new build on every upgrade.
  • Keep the instance name stable across versions and require operators and consumers to cut releases in lockstep against a shared deployment crate version. Avoids churn in instance IDs, at the cost of tighter coupling between operator and consumer release cadences.
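In code terms, the first option is just a salt change under the Config API shown earlier: the v2 name derives a disjoint deployment, and the old instance is retired.

// Hypothetical v2 cut: same operator identity, new instance salt.
const ACME_MAINNET_V2: zipnet::Config = zipnet::Config::new("acme.mainnet.v2")
    .with_window(zipnet::ShuffleWindow::interactive())
    .with_init([0u8; 32]); // fresh operator-published bytes for v2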

Zipnet v1 does not need to resolve this. V2 must.

Noisy neighbours on the shared network

A shared NetworkId means every service’s peers appear in every agent’s catalog. Discovery gossip, DHT slots, and bond maintenance scale with the universe, not with the services an agent cares about. The escape hatch is the derived private network for internal chatter; the residual cost — peer-catalog size and /mosaik/announce volume — is paid by everyone. If a service’s traffic would dominate the shared network (high-frequency metric streams, bulk replication) it belongs behind its own NetworkId, not on the shared one. Shape A is the correct call when the narrow-interface argument no longer outweighs the noise argument.

Checklist for a new service

When adding a service to a shared mosaik universe, use this list:

  1. Identify the one or two public primitives. If you cannot, the interface is not yet designed.
  2. Pick a service root: unique_id!("your-service").
  3. Define the fingerprint inputs: what instance_name means, who picks it, what window/config parameters fold in, whether the fingerprint encoding carries a version.
  4. Write typed constructors (e.g. Service::<D>::open(&network, &config)) that every consumer uses. Never export raw StreamId/StoreId/GroupId values across the crate boundary.
  5. Decide which internal channels, if any, move to a derived private Network. Default: only the high-churn ones.
  6. Specify TicketValidator composition on the public primitives. ACL lives here.
  7. Document your Config + datum-schema convention in release notes or docs. Consumers compile the fingerprint in; you are on the hook for keeping the fields stable and the code release version-matched.
  8. Call out your versioning story before shipping. If you cannot answer “what happens when StateMachine::signature() bumps?”, you will regret it.

Cross-references

  • Architecture — the concrete instantiation of this pattern for zipnet v1.
  • Mosaik integration notes — gotchas and idioms specific to the primitives referenced here.
  • Roadmap to v2 — where versioning-under-stable-names and cross-service composition work live.