
Coordination overview

Sometimes the cluster needs a single-holder guarantee that membership-only mechanisms can’t provide:

  • The ClusterSingleton should never spawn two instances during a network partition.
  • The sharding coordinator on the leader side of a partition shouldn’t issue allocations if another side might also have a leader.

The Lease API gives you that guarantee. A lease is a distributed lock with a TTL — at most one process holds it at a time, with a backend (Kubernetes, etcd, your own) providing the single-holder constraint.

interface Lease {
  acquire(): Promise<boolean>;
  release(): Promise<void>;
  checkAlive(): boolean;
  onLost(handler: (reason: string) => void): () => void;
}

Four methods:

  • acquire() — try to claim the lease. Resolves true on success, false on contention (someone else has it).
  • release() — voluntarily drop ownership. No-op if not held.
  • checkAlive() — cheap, local “do I still own this lease?” check.
  • onLost(cb) — register a callback fired if ownership is lost unexpectedly (TTL expired without renewal, another holder took over).
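
A minimal sketch of driving these four methods directly, assuming you already have a lease from one of the backends described below; doLeaderOnlyWork and stopLeaderOnlyWork are hypothetical placeholders for your own code:

async function runWithLease(
  lease: Lease,
  doLeaderOnlyWork: () => Promise<void>,  // hypothetical: the work only the holder may do
  stopLeaderOnlyWork: () => void,         // hypothetical: tear that work down
): Promise<void> {
  if (!(await lease.acquire())) {
    return; // contention: someone else holds it; back off and retry later
  }

  // React to unexpected loss (TTL expired, another holder took over).
  const unsubscribe = lease.onLost((reason) => {
    console.warn(`lease lost: ${reason}`);
    stopLeaderOnlyWork();
  });

  try {
    // Long-running work can call lease.checkAlive() for a cheap local ownership check.
    await doLeaderOnlyWork();
  } finally {
    unsubscribe();
    await lease.release(); // voluntary drop; no-op if the lease was already lost
  }
}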

The backend implements the contract. The framework’s built-in KubernetesLease uses K8s native Lease resources; InMemoryLease is the test / dev option.
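
In tests, a lease can be exercised without any infrastructure. A minimal sketch, assuming InMemoryLease accepts the LeaseSettings shape shown further down this page:

import { InMemoryLease } from 'actor-ts';

// Sketch only: the exact constructor options are an assumption (see LeaseSettings below).
const lease = new InMemoryLease({ name: 'test-lease', owner: 'test-owner', ttlMs: 5_000 });

const acquired = await lease.acquire(); // true: nobody else holds it in this process
// ... run the code under test ...
await lease.release();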

Two production scenarios:

Scenario               Why a lease helps
Cluster singleton      Prevents dual-leadership during partition. See Singleton with lease.
Sharding coordinator   Prevents two coordinators making conflicting allocations. See Sharding with lease.

For most clusters, a downing strategy is enough. Leases are the paranoid-safe option:

  • Downing strategy alone: cluster picks a winner during partition; most apps are fine.
  • Downing strategy + lease: an additional check that prevents edge-case dual-leadership even if downing misfires.

If you can afford the operational cost (K8s namespace permissions, etcd cluster, etc.), enable leases for any singleton or sharding setup where dual-execution would cause real damage — double-charging a customer, double-publishing a stream event.

not held ──────────► acquired ──────► renewing periodically
    ▲                                            │
    │                                            │
    │      (renew fails OR explicit release)     │
    │                                            │
    └────────────────────────────────────────────┘
              lease lost — onLost(cb) fires

The flow:

  1. Some actor (typically a singleton manager) calls lease.acquire().
  2. Backend tries to record this owner. Success → returns true.
  3. While holding, the backend renews every renewalIntervalMs (typically ttl / 3).
  4. If renewal fails (network blip, backend down, lease taken over), onLost fires with the reason.
  5. The actor immediately stops any work that depends on holding the lease (the singleton manager, for example, stops its singleton).

The TTL is critical — a holder that crashes without release loses the lease automatically when the TTL expires. Other contenders can then acquire.
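
A rough sketch of the renewal side of this flow (steps 3 and 4); renewOnBackend is a hypothetical stand-in for whatever the concrete backend actually does, e.g. patching the K8s Lease object:

// Illustration only: real backends run a loop like this internally.
function startRenewalLoop(
  renewOnBackend: () => Promise<boolean>, // resolves false if another holder took over
  renewalIntervalMs: number,              // typically ttlMs / 3
  onLost: (reason: string) => void,
): () => void {
  const timer = setInterval(async () => {
    try {
      if (!(await renewOnBackend())) {
        clearInterval(timer);
        onLost('lease taken over by another holder');
      }
    } catch (err) {
      // A production backend would tolerate transient failures as long as the TTL
      // is not yet at risk; shown here as an immediate loss for simplicity.
      clearInterval(timer);
      onLost(`renewal failed: ${String(err)}`);
    }
  }, renewalIntervalMs);

  return () => clearInterval(timer); // called on release()
}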

interface LeaseSettings {
  name: string;                  // unique lease identifier
  owner: string;                 // this holder's identifier (pod name / UUID)
  ttlMs: number;                 // auto-expire after this many ms of no-renewal
  renewalIntervalMs?: number;    // typically ttl / 3
  acquireRetries?: number;       // max attempts per acquire()
  acquireRetryDelayMs?: number;  // delay between attempts
}

Setting                Typical values
ttlMs                  15-30 s. Short enough to recover from a crashed holder, long enough to tolerate gossip / renewal delays.
renewalIntervalMs      ~ttl / 3. Renew often enough that a single failed renew doesn’t lose the lease.
acquireRetries         3-10 for production.
acquireRetryDelayMs    100-1000 ms.

Backend                Use
InMemoryLease          Tests and dev. In-process; not actually distributed.
KubernetesLease        Production on K8s. Uses K8s native Lease resources.

For non-K8s production deployments, you’d implement the Lease interface against your coordination service (etcd, Consul, Zookeeper). The interface is small — ~50 lines of glue.
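
As a sketch of what that glue could look like, the class below implements Lease on top of a hypothetical key-value client with compare-and-set and TTL refresh. The KVStore interface is an assumption standing in for your etcd / Consul / Zookeeper client; only Lease and LeaseSettings come from this page, and acquire retries are omitted for brevity:

// Hypothetical KV primitives; replace with your coordination service's client.
interface KVStore {
  putIfAbsent(key: string, value: string, ttlMs: number): Promise<boolean>;
  refresh(key: string, value: string, ttlMs: number): Promise<boolean>;
  remove(key: string, value: string): Promise<void>;
}

class KVLease implements Lease {
  private held = false;
  private timer?: ReturnType<typeof setInterval>;
  private lostHandlers: Array<(reason: string) => void> = [];

  constructor(private kv: KVStore, private settings: LeaseSettings) {}

  async acquire(): Promise<boolean> {
    const { name, owner, ttlMs, renewalIntervalMs = Math.floor(ttlMs / 3) } = this.settings;
    this.held = await this.kv.putIfAbsent(name, owner, ttlMs);
    if (this.held) {
      // Keep the TTL from expiring while we own the lease.
      this.timer = setInterval(async () => {
        const stillOwner = await this.kv.refresh(name, owner, ttlMs).catch(() => false);
        if (!stillOwner) this.lose('renewal failed or lease taken over');
      }, renewalIntervalMs);
    }
    return this.held;
  }

  async release(): Promise<void> {
    if (!this.held) return; // no-op if not held
    clearInterval(this.timer);
    this.held = false;
    await this.kv.remove(this.settings.name, this.settings.owner);
  }

  checkAlive(): boolean {
    return this.held; // cheap local check, no network round-trip
  }

  onLost(handler: (reason: string) => void): () => void {
    this.lostHandlers.push(handler);
    return () => {
      this.lostHandlers = this.lostHandlers.filter((h) => h !== handler);
    };
  }

  private lose(reason: string): void {
    clearInterval(this.timer);
    this.held = false;
    for (const h of this.lostHandlers) h(reason);
  }
}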

import { ClusterSingletonManager, KubernetesLease, Props } from 'actor-ts';

const lease = new KubernetesLease({
  name: 'my-app-singleton-lease',
  owner: process.env.POD_NAME!,
  ttlMs: 30_000,
  renewalIntervalMs: 10_000,
  namespace: process.env.K8S_NAMESPACE!,
});

system.actorOf(
  ClusterSingletonManager.props({
    cluster,
    typeName: 'job-scheduler',
    singletonProps: Props.create(() => new JobScheduler()),
    lease, // ← split-brain protection
  }),
  'singleton-manager-job-scheduler',
);

The singleton manager will:

  • Try to acquire the lease before spawning the singleton.
  • Renew it while alive.
  • Release it on graceful shutdown.
  • Respond to lease.onLost(...) by stopping its singleton.