Sharding overview
Cluster sharding is the framework’s answer to “I have a few million entities, each needing its own actor, scattered across N nodes.” Examples: per-user sessions, per-IoT-device controllers, per-order coordinators, per-game-room actors.
The user gives the framework a way to extract an entity ID from each message; the framework hashes that ID to a shard; the coordinator decides which node hosts that shard; the node’s local region spawns the entity actor on demand. When nodes come and go, the coordinator rebalances shards across the new topology.
```
             cluster of 3 nodes
                     │
   ┌─────────────────┼──────────────────┐
   │                 │                  │
 node-1            node-2             node-3
 region            region             region
   │                 │                  │
shards 1-33      shards 34-66      shards 67-100
   │                 │                  │
entities for     entities for      entities for
keys hashing     keys hashing      keys hashing
to those         to those          to those
shards           shards            shards
```

Each entity is one actor, on one node, at a time — just like a singleton, but scaled to N entities at once. Failover happens automatically: when a node leaves, its shards move elsewhere and the entities re-spawn there.
A minimal example
```ts
// ActorRef is assumed to be exported by 'actor-ts'; it's needed for the reply type below.
import { Actor, ActorRef, ActorSystem, Cluster, ClusterSharding, Props } from 'actor-ts';

// Reply shape sent back for 'view' requests.
type Cart = { items: string[] };

type CartCmd =
  | { entityId: string; kind: 'add'; sku: string }
  | { entityId: string; kind: 'view'; replyTo: ActorRef<Cart> };

class CartActor extends Actor<CartCmd> {
  private items: string[] = [];

  override onReceive(cmd: CartCmd): void {
    if (cmd.kind === 'add') this.items.push(cmd.sku);
    if (cmd.kind === 'view') cmd.replyTo.tell({ items: this.items });
  }
}

// Setup:
const system = ActorSystem.create('my-app');
const cluster = await Cluster.join(system, { host, port, seeds }); // this node's address and the seed nodes
const sharding = ClusterSharding.get(system, cluster);

const cartRegion = sharding.start<CartCmd>({
  typeName: 'cart',
  entityProps: Props.create(() => new CartActor()),
  extractEntityId: (msg) => msg.entityId,
});

// Usage — `tell` to the region, with entity ID inside the message:
cartRegion.tell({ entityId: 'user-42', kind: 'add', sku: 'book-1' });
cartRegion.tell({ entityId: 'user-42', kind: 'view', replyTo: ... });
//                           ^^^^^^^^^
// Same ID → same entity actor, every time, regardless of node.
```

The cartRegion is a single ActorRef from the caller's perspective. Behind the scenes:
- The region computes a shard from entityId (default: hash to one of 100 shards; see the sketch after this list).
- It asks the coordinator "who owns this shard?"
- If the owner is this node, it spawns the entity (if not already present) and forwards the message.
- If the owner is another node, it forwards over the cluster transport to that node’s region, which does the same.
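A minimal sketch of that first step; the hash function here is an illustration, not necessarily the one the framework uses:

```ts
// Map an entity ID to one of `numShards` shards with a stable hash.
// The same entityId always lands on the same shard, on every node.
function shardFor(entityId: string, numShards = 100): number {
  let hash = 0;
  for (let i = 0; i < entityId.length; i++) {
    hash = (hash * 31 + entityId.charCodeAt(i)) | 0; // simple 32-bit rolling hash
  }
  return Math.abs(hash) % numShards;
}

shardFor('user-42'); // every call, on every node, returns the same shard number
```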
The three actors at play
| Actor | Role |
|---|---|
| Region | One per node. Routes messages to the right shard’s owner, hosts local entities. |
| Coordinator | One per cluster (singleton, on the leader). Decides which node owns each shard. Handles rebalancing on membership changes. |
| Entity | One per entityId. Hosted on whichever node currently owns the shard the ID hashes to. |
Each level has its own page in the cluster section — see ShardRegion, the allocation strategy, and rebalance for the mechanics.
Configuration
`ShardingSettings<TMsg>` — the fields you'll most often touch:
```ts
interface ShardingSettings<TMsg> {
  typeName: string;
  entityProps: Props<TMsg>;
  extractEntityId: (message: TMsg) => string;
  extractEntityMessage?: (message: TMsg) => unknown;
  numShards?: number;           // default 100
  role?: string;                // restrict to nodes carrying this role
  proxy?: boolean;              // route-only, no local entities
  rememberEntities?: boolean;   // re-spawn entities on failover
  passivationIdleMs?: number;   // auto-stop idle entities
  maxEntities?: number;         // LRU cap per node
}
```

| Field | What it controls |
|---|---|
| `typeName` | A string identifying this sharded type. Different types can coexist in the same cluster (cart, session, order). |
| `extractEntityId(msg)` | Pulls the entity ID from a message. This is the key that gets hashed to a shard. |
| `extractEntityMessage(msg)` | (Optional) If the message envelope contains routing info plus a payload, this strips the envelope before the entity sees it. Defaults to passing the message through as-is. |
| `numShards` | How many shards the entity space is divided into. 100 is fine for most clusters; bump to 1000 for very large clusters (>50 nodes). |
| `role` | Only members with this role host shards of this type. Useful for placing compute-heavy entities on dedicated nodes. |
| `proxy` | This node forwards messages to the region but never hosts entities locally. Used for client nodes in an asymmetric cluster. |
| `rememberEntities` | Persist the set of active entity IDs. After a coordinator failover (or a full cluster restart), those IDs are spawned eagerly so messages don't have to recreate the entire fleet. |
| `passivationIdleMs` | Stop an entity after this much idle time. Frees memory; the next message for the same ID re-creates the entity. |
| `maxEntities` | Per-node cap. When exceeded, the least recently used entity is passivated. |
The defaults are sensible for small clusters. For production, you
usually want rememberEntities: true and a passivationIdleMs
matched to your traffic pattern.
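For instance, building on the cart example above, a production region tuned that way might look like this; the numbers are illustrative, not recommendations from the framework:

```ts
const cartRegion = sharding.start<CartCmd>({
  typeName: 'cart',
  entityProps: Props.create(() => new CartActor()),
  extractEntityId: (msg) => msg.entityId,
  rememberEntities: true,           // re-spawn known carts eagerly after failover
  passivationIdleMs: 30 * 60_000,   // stop carts that have been idle for 30 minutes
});
```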
Passivation
```ts
import { Passivate, Actor } from 'actor-ts';

class CartActor extends Actor<CartCmd | Passivate> {
  override onReceive(msg: CartCmd | Passivate): void {
    if (msg instanceof Passivate) {
      // Optionally do shutdown work, then send the passivate ack.
      this.context.parent.forEach(p => p.tell({ kind: 'passivate-ack' }));
      return;
    }
    // ...
  }
}
```

When passivationIdleMs is configured (or maxEntities is hit),
the region sends Passivate to the entity. The entity acks; the
region stops it and ensures buffered messages for the same ID are
drained to the next incarnation when it spawns.
If you want fully manual control, send the parent a
{ kind: 'passivate' } message yourself.
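For example, from inside the entity actor (`passivateNow` is a hypothetical helper, not part of the framework; the message shape is the one named above):

```ts
import { Actor, Passivate } from 'actor-ts';

class CartActor extends Actor<CartCmd | Passivate> {
  // ...

  // Hypothetical helper: the entity decides it is done and asks
  // the region (its parent) to passivate it right away.
  private passivateNow(): void {
    this.context.parent.forEach(p => p.tell({ kind: 'passivate' }));
  }
}
```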
Rebalancing
When the cluster topology changes (a node joins or leaves), the coordinator runs a rebalance pass:
- Compute the new shard-to-node assignment from the active allocation strategy (default: hash mod regions).
- For each shard that moved, tell the old region to hand off the shard.
- The old region stops its entities (which may persist their state), tells the coordinator “handoff complete,” and stops routing for that shard.
- The new region spawns entities for that shard on demand as messages arrive.
The handoff isn’t instant — buffered messages wait for the “handoff complete” signal before being forwarded to the new owner. This avoids messages racing past a half-moved entity.
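As a rough sketch of that handoff sequence; every name in it is illustrative rather than the framework's real API:

```ts
type ShardId = string;

interface RegionHandle {
  node: string;                              // which cluster node this region lives on
  handOff(shard: ShardId): Promise<void>;    // stop the shard's entities, resolve when done
}

async function rebalancePass(
  current: Map<ShardId, RegionHandle>,       // who owns each shard today
  desired: Map<ShardId, RegionHandle>,       // what the allocation strategy wants
  buffer: (shard: ShardId) => void,          // start buffering messages for a moving shard
  flush: (shard: ShardId, to: RegionHandle) => void, // drain the buffer to the new owner
): Promise<void> {
  for (const [shard, newOwner] of desired) {
    const oldOwner = current.get(shard);
    if (oldOwner?.node === newOwner.node) continue;  // shard did not move
    buffer(shard);                                   // hold messages during the handoff
    if (oldOwner) await oldOwner.handOff(shard);     // old region stops its entities, then acks
    current.set(shard, newOwner);                    // record the new owner
    flush(shard, newOwner);                          // forward buffered messages to it
  }
}
```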
See Rebalance for the full protocol.
State across failover
Sharded entities are subject to the same restart semantics as any other actor — when an entity moves to a new node, the new instance starts with a clean slate.
For state that should survive:
- PersistentActor — the entity persists events to a journal; on a fresh node, it replays the journal at startup. See PersistentActor.
- DurableStateActor — simpler: persist the current state snapshot; restore on restart. See DurableState.
- DistributedData — for state that should be readable from any node (not just the entity's current host), use a CRDT in the DD replicator instead.
Without one of these, sharding gives you placement and routing, not durability.
When to reach for sharding
Three good fits:
- Per-user / per-tenant state that’s too much for one node to hold but doesn’t need to be readable from everywhere simultaneously.
- Per-entity workflows — sagas, order processing, long-running coordinators — that benefit from per-key serialization.
- Hotspots that follow keys — a streaming pipeline where each user’s events should be processed in order on a single actor.
When NOT to use sharding
Where to next
- **ShardRegion** (stub) — the per-node region actor; configuration deep dive.
- Allocation strategy — default hash-mod-regions, custom strategies.
- Rebalance — the handoff protocol.
- Remember entities — persistent entity-registry for fast recovery.
- Singleton overview — for one-actor-cluster-wide.
- Sharded daemon process — fixed-count workers distributed by sharding.
The ClusterSharding
API reference covers the full surface.