Horizontal scaling
Pylon is a single Rust binary. Single-machine deploys are the happy path: one process serves HTTP, WebSocket, SSE, and job execution for the whole app. When traffic outgrows one machine, you scale horizontally by running pylon on multiple instances behind a load balancer.
The catch: WebSocket broadcasts are in-process by default. A
mutation handled by machine A fans out to clients connected to
machine A. Clients connected to machine B never see it — until their
next reconnect or visibility-change triggers the client’s reconcile()
backstop. Live UX (sub-second propagation) requires more.
ClusterBus: cross-machine fanout
Pylon ships a ClusterBus abstraction. Configure a transport and
every change event / presence relay / CRDT frame published locally
also publishes to the bus; subscriber threads on every peer machine
receive and re-broadcast to their own WebSocket / SSE clients.
The default transport is NoopBus — single-machine deploys pay
zero overhead.
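As a rough illustration of the idea, the sketch below models a bus as a trait with a no-op default. The trait shape, the method names (publish, name), the RecordingBus stand-in, and the publish_change helper are all assumptions made for this example; Pylon’s actual ClusterBus API may look different.

```rust
use std::sync::Mutex;

/// Hypothetical shape of a cluster bus transport (not Pylon's real trait).
pub trait ClusterBus: Send + Sync {
    /// Publish one serialized envelope to peer machines.
    fn publish(&self, payload: &[u8]) -> Result<(), String>;
    /// Short transport name, e.g. for a startup log line.
    fn name(&self) -> &'static str;
}

/// Default transport: accepts and discards everything, so a
/// single-machine deploy does no extra work per event.
pub struct NoopBus;

impl ClusterBus for NoopBus {
    fn publish(&self, _payload: &[u8]) -> Result<(), String> {
        Ok(())
    }
    fn name(&self) -> &'static str {
        "noop"
    }
}

/// In-memory stand-in for a real transport; it records what would be sent.
#[derive(Default)]
pub struct RecordingBus {
    pub sent: Mutex<Vec<Vec<u8>>>,
}

impl ClusterBus for RecordingBus {
    fn publish(&self, payload: &[u8]) -> Result<(), String> {
        self.sent
            .lock()
            .map_err(|_| "bus lock poisoned".to_string())?
            .push(payload.to_vec());
        Ok(())
    }
    fn name(&self) -> &'static str {
        "recording"
    }
}

/// Deliver to local clients first (modelled here as a Vec), then hand the
/// same payload to the bus so peers can re-broadcast it.
pub fn publish_change(
    bus: &dyn ClusterBus,
    local_clients: &mut Vec<Vec<u8>>,
    payload: &[u8],
) -> Result<(), String> {
    local_clients.push(payload.to_vec());
    bus.publish(payload)
}
```

With NoopBus the second step is a no-op, which is why the default costs nothing; swapping in another transport changes only which implementation is passed to publish_change.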
The production transport is Redis PUB/SUB.
PYLON_CLUSTER_NAMESPACE prefixes the Redis channel so multiple
unrelated pylon deploys can share one Redis instance without
cross-talk. Defaults to pylon if unset.
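The namespacing rule can be sketched as follows. Only the variable name PYLON_CLUSTER_NAMESPACE and the pylon default come from the text above; the ":" separator, the topic argument, and the choice to treat an empty value as unset are assumptions of this example, not Pylon’s documented channel format.

```rust
/// Resolve the namespace: the configured value, or "pylon" when unset.
/// Treating a blank value as unset is an assumption of this sketch.
pub fn resolve_namespace(configured: Option<&str>) -> String {
    match configured.map(str::trim) {
        Some(ns) if !ns.is_empty() => ns.to_string(),
        _ => "pylon".to_string(),
    }
}

/// Read the namespace from the process environment.
pub fn namespace_from_env() -> String {
    resolve_namespace(std::env::var("PYLON_CLUSTER_NAMESPACE").ok().as_deref())
}

/// Build a channel name by prefixing a topic with the namespace.
/// The ":" separator and the topic string are hypothetical.
pub fn channel_name(namespace: &str, topic: &str) -> String {
    format!("{}:{}", namespace, topic)
}
```

Two deploys with different namespaces end up on different channel names, so neither subscribes to the other’s traffic even when they share one Redis instance.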
Connection failures at startup are fatal. Pylon refuses to boot
if PYLON_CLUSTER_BUS is set but unreachable. The reasoning:
silently falling back to NoopBus on a multi-machine deploy means
every machine is deaf to peer mutations, which produces the same
“phantom row” UX failure the bus is supposed to fix — except much
harder to diagnose. Loud failure is the right default.
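A minimal sketch of that fail-fast policy is below. The BusMode enum, the select_bus function, and the probe callback are hypothetical names for this example, and the PYLON_CLUSTER_BUS value is treated as an opaque string because its format is not described here.

```rust
/// Which bus the process ended up with (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum BusMode {
    Noop,
    Remote(String),
}

/// Pick a bus at startup.
///
/// `configured` stands for the PYLON_CLUSTER_BUS value, if any.
/// `probe` tries to reach the transport and reports failure as Err.
///
/// There is deliberately no fallback branch: if a bus is configured but
/// the probe fails, the error propagates and the caller should exit.
pub fn select_bus<F>(configured: Option<&str>, probe: F) -> Result<BusMode, String>
where
    F: Fn(&str) -> Result<(), String>,
{
    match configured {
        // Nothing configured: single-machine mode, no bus traffic.
        None => Ok(BusMode::Noop),
        Some(target) => {
            probe(target).map_err(|e| format!("cluster bus unreachable: {}", e))?;
            Ok(BusMode::Remote(target.to_string()))
        }
    }
}
```

The point of the design is the missing branch: there is no code path that turns a failed probe into BusMode::Noop, so a misconfigured multi-machine deploy fails at boot rather than running silently partitioned.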
What’s fanned out
- Change events (ChangeEvent): every mutation, action write, and entity-CRUD broadcast.
- Presence relays: typing indicators, cursor positions, any message sent via WsHub::broadcast_presence.
- CRDT binary frames: Loro snapshots / updates for clients subscribed via useLoroDoc. Snapshot bytes are base64-encoded into the JSON envelope so a single pubsub channel handles every payload shape.
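To show why base64 lets binary CRDT frames share a JSON channel with the other payloads, here is a small self-contained sketch. The envelope field names (kind, instance_id, doc, data) are invented for illustration, the encoder is a plain standard-alphabet base64 written out by hand to avoid dependencies, and the ids are assumed to contain no characters that need JSON escaping.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with '=' padding.
pub fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        // Emit padding for the positions that had no input byte.
        out.push(if chunk.len() > 1 {
            ALPHABET[((n >> 6) & 63) as usize] as char
        } else {
            '='
        });
        out.push(if chunk.len() > 2 {
            ALPHABET[(n & 63) as usize] as char
        } else {
            '='
        });
    }
    out
}

/// Wrap a binary CRDT frame in a JSON envelope (hypothetical field names).
/// Assumes `instance_id` and `doc_id` need no JSON escaping.
pub fn crdt_envelope(instance_id: &str, doc_id: &str, frame: &[u8]) -> String {
    format!(
        r#"{{"kind":"crdt","instance_id":"{}","doc":"{}","data":"{}"}}"#,
        instance_id,
        doc_id,
        base64_encode(frame)
    )
}
```

Because the encoded frame is ordinary ASCII text, a subscriber can parse every message on the channel as JSON and branch on a discriminator field, rather than needing a separate binary channel for CRDT traffic.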
What’s NOT fanned out
- /api/sync/pull?since=N cursor catch-up. Each pylon process keeps its own in-memory ChangeLog; seqs are per-machine. Clients that pull from machine A see different seqs than from machine B. The client-side reconcile() pass (added in @pylonsync/sync v0.3.130) is the backstop: clients pull authoritative entity row sets from /api/entities/<entity>/cursor on reconnect / visibility-change and remove locals not in the server set.
- Per-tenant policy filtering. Each receiving machine re-runs its own per-client read policy before forwarding the inbound event to its connected WS/SSE clients. The bus carries raw events; authz is the local fanout’s job.
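Both behaviours reduce to simple set logic, sketched below. The real reconcile() lives in the client library, so this Rust version only illustrates the "remove locals not in the server set" step; the Client and Event types, their fields, and the can_read callback are hypothetical stand-ins for whatever policy machinery Pylon actually uses.

```rust
use std::collections::{HashMap, HashSet};

/// Drop every locally cached row whose id is absent from the authoritative
/// server set. Returns the removed ids, sorted for stable output.
pub fn reconcile(
    local: &mut HashMap<String, String>,
    server_ids: &HashSet<String>,
) -> Vec<String> {
    let mut removed: Vec<String> = local
        .keys()
        .filter(|id| !server_ids.contains(*id))
        .cloned()
        .collect();
    removed.sort();
    for id in &removed {
        local.remove(id);
    }
    removed
}

/// A connected WS/SSE client (hypothetical shape).
pub struct Client {
    pub id: u32,
    pub tenant: String,
}

/// An event received from the bus (hypothetical shape).
pub struct Event {
    pub tenant: String,
    pub row_id: String,
}

/// Re-run the read policy per client on the receiving machine. The bus
/// delivered the raw event; only clients that pass `can_read` get it.
pub fn recipients<'a>(
    clients: &'a [Client],
    event: &Event,
    can_read: impl Fn(&Client, &Event) -> bool,
) -> Vec<&'a Client> {
    clients.iter().filter(|c| can_read(*c, event)).collect()
}
```

Running the policy at the receiving end means the publishing machine never needs to know which clients are connected elsewhere, or what each of them is allowed to read.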
Self-event filtering
Pubsub backends deliver every published message to every subscriber, including the publisher itself. Without de-dup that produces a feedback loop: A publishes → A’s subscriber receives → A re-broadcasts → already shipped locally → double-delivery.
Every envelope carries the publisher’s instance_id (one per pylon
process, minted at startup). Each subscriber filters out events with
its own id before re-broadcasting. Operators don’t need to think
about this; it’s invisible.
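The filter itself is a single comparison, as this sketch shows. The Envelope and Subscriber types and the on_message method are hypothetical; only the instance_id field name and the rule of dropping messages that carry your own id come from the description above.

```rust
/// A message as it arrives from the bus (hypothetical shape).
pub struct Envelope {
    pub instance_id: String,
    pub payload: String,
}

/// The subscriber side of one pylon process (hypothetical shape).
pub struct Subscriber {
    own_id: String,
    /// Payloads forwarded to this machine's local WS/SSE clients.
    pub rebroadcast: Vec<String>,
}

impl Subscriber {
    pub fn new(own_id: &str) -> Self {
        Subscriber {
            own_id: own_id.to_string(),
            rebroadcast: Vec::new(),
        }
    }

    /// Handle one inbound envelope. Returns true if it was forwarded.
    pub fn on_message(&mut self, env: &Envelope) -> bool {
        // Our own publish was already delivered locally before it went to
        // the bus, so forwarding it again would double-deliver.
        if env.instance_id == self.own_id {
            return false;
        }
        self.rebroadcast.push(env.payload.clone());
        true
    }
}
```

For example, a subscriber created with id "a" drops an envelope stamped "a" and forwards one stamped "b", which is what keeps each event to a single delivery per client.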
Diagnostics
Pylon logs the bus mode at startup.
When to enable
- Anytime you run more than one pylon process serving the same app.
- Fly autoscale with min_machines_running > 1.
- K8s deployments with replicas > 1.
- Blue/green rollouts where two versions of the binary briefly serve traffic simultaneously.
- Local multi-process dev simulating production.
When NOT to enable
- Single-machine deploys. NoopBus is free; adding Redis adds failure surface for zero benefit.
- Per-developer local dev. The reconcile() backstop covers the rare cases where two tabs need to see each other’s writes without a real cluster bus.