
Sharding overview

Cluster sharding is the framework’s answer to “I have a few million entities, each needing its own actor, scattered across N nodes.” Examples: per-user sessions, per-IoT-device controllers, per-order coordinators, per-game-room actors.

The user gives the framework a way to extract an entity ID from each message; the framework hashes that ID to a shard; the coordinator decides which node hosts that shard; the node’s local region spawns the entity actor on demand. When nodes come and go, the coordinator rebalances shards across the new topology.
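The ID-to-shard step can be sketched in isolation (hypothetical names; the framework's actual hash may differ, but any stable string hash works as long as every node computes the same one):

```typescript
// Stable string hash (FNV-1a) so every node maps the same ID to the same shard.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Hypothetical sketch of the ID → shard mapping (default 100 shards).
function shardFor(entityId: string, numShards: number = 100): number {
  return fnv1a(entityId) % numShards;
}

// The same ID lands on the same shard, on every node, every time.
const shard = shardFor('user-42');
```

Because the hash is deterministic, no coordination is needed for this step; only the shard-to-node assignment requires the coordinator.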

                      cluster of 3 nodes
         ┌────────────────────┼────────────────────┐
         │                    │                    │
      node-1               node-2               node-3
      region               region               region
         │                    │                    │
   shards 1-33          shards 34-66         shards 67-100
         │                    │                    │
   entities for         entities for         entities for
   keys hashing         keys hashing         keys hashing
   to those shards      to those shards      to those shards

Each entity is one actor, on one node, at a time — just like a singleton, but scaled to N entities at once. Failover happens automatically: when a node leaves, its shards move elsewhere and the entities re-spawn there.

import { Actor, ActorRef, ActorSystem, Cluster, ClusterSharding, Props } from 'actor-ts';

type Cart = { items: string[] };

type CartCmd =
  | { entityId: string; kind: 'add'; sku: string }
  | { entityId: string; kind: 'view'; replyTo: ActorRef<Cart> };

class CartActor extends Actor<CartCmd> {
  private items: string[] = [];

  override onReceive(cmd: CartCmd): void {
    if (cmd.kind === 'add') this.items.push(cmd.sku);
    if (cmd.kind === 'view') cmd.replyTo.tell({ items: this.items });
  }
}

// Setup:
const system = ActorSystem.create('my-app');
const cluster = await Cluster.join(system, { host, port, seeds });
const sharding = ClusterSharding.get(system, cluster);

const cartRegion = sharding.start<CartCmd>({
  typeName: 'cart',
  entityProps: Props.create(() => new CartActor()),
  extractEntityId: (msg) => msg.entityId,
});

// Usage — `tell` to the region, with the entity ID inside the message:
cartRegion.tell({ entityId: 'user-42', kind: 'add', sku: 'book-1' });
cartRegion.tell({ entityId: 'user-42', kind: 'view', replyTo: ... });
//                            ^^^^^^^
// Same ID → same entity actor, every time, regardless of node.

The cartRegion is a single ActorRef from the caller’s perspective. Behind the scenes:

  1. The region computes a shard from entityId (default: hash to one of 100 shards).
  2. It asks the coordinator “who owns this shard?”
  3. If the owner is this node, it spawns the entity (if not already present) and forwards the message.
  4. If the owner is another node, it forwards over the cluster transport to that node’s region, which does the same.
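Those four steps can be modeled in miniature (purely illustrative; the real region is an actor and the coordinator lookup is asynchronous, where this sketch uses a pre-filled, synchronous owner table):

```typescript
type NodeId = string;

// Toy model of a region's routing decision. Names and structure are assumptions,
// not the framework's internals.
class RegionSketch {
  private localEntities = new Map<string, string[]>(); // entityId → local actor state

  constructor(
    private selfNode: NodeId,
    private shardOwners: Map<number, NodeId>, // the coordinator's answers, cached
    private numShards: number = 100,
  ) {}

  route(entityId: string): string {
    const shard = this.hash(entityId) % this.numShards;   // step 1: ID → shard
    const owner = this.shardOwners.get(shard);            // step 2: who owns it?
    if (owner === this.selfNode) {                        // step 3: local owner
      if (!this.localEntities.has(entityId)) this.localEntities.set(entityId, []);
      return `delivered locally to ${entityId}`;
    }
    return `forwarded to region on ${owner}`;             // step 4: remote owner
  }

  private hash(s: string): number {
    let h = 0;
    for (const c of s) h = (h * 31 + c.charCodeAt(0)) >>> 0;
    return h;
  }
}
```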
Actor         Role
Region        One per node. Routes messages to the right shard’s owner; hosts local entities.
Coordinator   One per cluster (a singleton on the leader). Decides which node owns each shard; handles rebalancing on membership changes.
Entity        One per entityId. Hosted on whichever node currently owns the shard the ID hashes to.

Each level has its own page in the cluster section — see ShardRegion, the allocation strategy, and rebalance for the mechanics.

ShardingSettings<TMsg> — the fields you’ll most often touch:

interface ShardingSettings<TMsg> {
  typeName: string;
  entityProps: Props<TMsg>;
  extractEntityId: (message: TMsg) => string;
  extractEntityMessage?: (message: TMsg) => unknown;
  numShards?: number;         // default 100
  role?: string;              // restrict to nodes carrying this role
  proxy?: boolean;            // route-only, no local entities
  rememberEntities?: boolean; // re-spawn entities on failover
  passivationIdleMs?: number; // auto-stop idle entities
  maxEntities?: number;       // LRU cap per node
}
Field                      What it controls
typeName                   A string identifying this sharded type. Different types can coexist in the same cluster (cart, session, order).
extractEntityId(msg)       Pulls the entity ID from a message. This is the key that gets hashed to a shard.
extractEntityMessage(msg)  (Optional) If the message envelope contains routing info plus a payload, this strips the envelope before the entity sees it. Defaults to the message as-is.
numShards                  How many shards the entity space is divided into. 100 is fine for most clusters; bump to 1000 for very large clusters (>50 nodes).
role                       Only members with this role host shards of this type. Useful for placing compute-heavy entities on dedicated nodes.
proxy                      This node forwards messages to the region but never hosts entities locally. Used for client nodes in an asymmetric cluster.
rememberEntities           Persist the set of active entity IDs. After a coordinator failover (or a full cluster restart), those IDs are spawned eagerly, so messages don’t have to recreate the entire fleet.
passivationIdleMs          Stop an entity after this much idle time. Frees memory; the next message for the same ID re-creates the entity.
maxEntities                Per-node cap. When exceeded, the least-recently-used entity is passivated.

The defaults are sensible for small clusters. For production, you usually want rememberEntities: true and a passivationIdleMs matched to your traffic pattern.
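Concretely, a production-leaning configuration might look like this (a sketch: SessionCmd and SessionActor are assumed to be defined analogously to the cart example above, and the numbers are illustrative, not recommendations):

```typescript
const sessionRegion = sharding.start<SessionCmd>({
  typeName: 'session',
  entityProps: Props.create(() => new SessionActor()),
  extractEntityId: (msg) => msg.entityId,
  rememberEntities: true,           // survive coordinator failover / full restart
  passivationIdleMs: 15 * 60_000,   // stop sessions idle for 15 minutes
  maxEntities: 100_000,             // LRU-passivate beyond 100k entities per node
});
```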

import { Passivate, Actor } from 'actor-ts';

class CartActor extends Actor<CartCmd | Passivate> {
  override onReceive(msg: CartCmd | Passivate): void {
    if (msg instanceof Passivate) {
      // Optionally do shutdown work, then send the passivate ack.
      this.context.parent.forEach(p => p.tell({ kind: 'passivate-ack' }));
      return;
    }
    // ...
  }
}

When passivationIdleMs is configured (or maxEntities is hit), the region sends Passivate to the entity. The entity acks; the region stops it and ensures buffered messages for the same ID are drained to the next incarnation when it spawns.

If you want fully manual control, send the parent a { kind: 'passivate' } message yourself.
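For instance, an entity might passivate itself once its work is done (a sketch extending the cart actor above; the 'checkout' command is hypothetical, not part of the earlier CartCmd type):

```typescript
// Hypothetical 'checkout' branch inside CartActor.onReceive:
if (cmd.kind === 'checkout') {
  this.items = []; // done with this cart
  // Ask the hosting region (our parent) to stop this entity.
  this.context.parent.forEach(region => region.tell({ kind: 'passivate' }));
}
```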

When the cluster topology changes (a node joins or leaves), the coordinator runs a rebalance pass:

  1. Compute the new shard-to-node assignment from the active allocation strategy (default: hash mod regions).
  2. For each shard that moved, tell the old region to hand off the shard.
  3. The old region stops its entities (which may persist their state), tells the coordinator “handoff complete,” and stops routing for that shard.
  4. The new region spawns entities for that shard on demand as messages arrive.
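Step 1 can be pictured with a toy version of the default "hash mod regions" assignment (illustrative only; assignShards and the deterministic node ordering are assumptions, not the framework's actual strategy):

```typescript
// Assign each shard to a node round-robin over a deterministic node ordering,
// so every member computes the same answer for the same membership.
function assignShards(numShards: number, nodes: string[]): Map<number, string> {
  const sorted = [...nodes].sort(); // assumed deterministic ordering across the cluster
  const assignment = new Map<number, string>();
  for (let s = 0; s < numShards; s++) {
    assignment.set(s, sorted[s % sorted.length]);
  }
  return assignment;
}
```

With 100 shards and 3 nodes, this splits the shard space roughly evenly (34/33/33); when a node joins or leaves, recomputing the map tells the coordinator which shards moved.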

The handoff isn’t instant — buffered messages wait for the “handoff complete” signal before being forwarded to the new owner. This avoids messages racing past a half-moved entity.
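The buffering behavior can be modeled as follows (a toy sketch with assumed names; the real region buffers per shard inside the actor):

```typescript
// Messages for a moving shard are queued until the old owner confirms
// "handoff complete", then flushed in arrival order to the new owner.
class HandoffBuffer<TMsg> {
  private buffers = new Map<number, TMsg[]>(); // shardId → pending messages

  enqueue(shard: number, msg: TMsg): void {
    const q = this.buffers.get(shard) ?? [];
    q.push(msg);
    this.buffers.set(shard, q);
  }

  // Called when the handoff-complete signal arrives; returns messages in order.
  complete(shard: number): TMsg[] {
    const q = this.buffers.get(shard) ?? [];
    this.buffers.delete(shard);
    return q;
  }
}
```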

See Rebalance for the full protocol.

Sharded entities are subject to the same restart semantics as any other actor — when an entity moves to a new node, the new instance starts with a clean slate.

For state that should survive:

  • PersistentActor — the entity persists events to a journal; on a fresh node, it replays the journal at startup. See PersistentActor.
  • DurableStateActor — simpler: persist the current state snapshot; restore on restart. See DurableState.
  • DistributedData — for state that should be readable from any node (not just the entity’s current host), use a CRDT in the DD replicator instead.

Without one of these, sharding gives you placement and routing, not durability.
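The PersistentActor idea can be illustrated without any journal API: the entity's state is a fold over its persisted events, so replaying the same events on a fresh node rebuilds the same state (a self-contained sketch; CartEvent and replay are illustrative names):

```typescript
// One event type for the cart example above.
type CartEvent = { kind: 'added'; sku: string };

// Rebuild the cart's item list by folding over the persisted event log.
// On a new node, the journal supplies the events and this fold restores state.
function replay(events: CartEvent[]): string[] {
  const items: string[] = [];
  for (const e of events) {
    if (e.kind === 'added') items.push(e.sku);
  }
  return items;
}
```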

Three good fits:

  1. Per-user / per-tenant state that’s too much for one node to hold but doesn’t need to be readable from everywhere simultaneously.
  2. Per-entity workflows — sagas, order processing, long-running coordinators — that benefit from per-key serialization.
  3. Hotspots that follow keys — a streaming pipeline where each user’s events should be processed in order on a single actor.

The ClusterSharding API reference covers the full surface.