Master key rotation

For at-rest encryption (object-storage encryption, durable-DD-encryption), the framework’s keys live in a MasterKeyRing — a versioned map of key-id → encryption key. Rotation is online: new writes go under a fresh key ID; existing data still reads back under whatever key it was encrypted with; a background sweep eventually re-encrypts older data.

import { MasterKeyRing } from 'actor-ts';

const keyRing = new MasterKeyRing({
  keys: {
    'k1': process.env.MASTER_KEY_V1!,
    'k2': process.env.MASTER_KEY_V2!, // new — current
  },
  currentKeyId: 'k2',
});

currentKeyId selects the key used for new writes. Reads consult the keys map, using the key-id stored with each item, to find the right key for whatever the data was encrypted under.
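To make the read path concrete, here is a minimal sketch of that lookup, assuming each stored record carries the ID of the master key it was encrypted under. The envelope fields, AES-256-GCM, and base64-encoded 32-byte keys are illustrative assumptions, not the framework's actual wire format.

import { createDecipheriv } from 'node:crypto';

// Hypothetical envelope; the real storage format may differ.
interface EncryptedRecord {
  keyId: string;      // which master key this record was encrypted under
  iv: Buffer;         // per-record nonce
  authTag: Buffer;    // GCM authentication tag
  ciphertext: Buffer;
}

const keys: Record<string, Buffer> = {
  k1: Buffer.from(process.env.MASTER_KEY_V1!, 'base64'),
  k2: Buffer.from(process.env.MASTER_KEY_V2!, 'base64'),
};

function decryptRecord(record: EncryptedRecord): Buffer {
  const key = keys[record.keyId]; // old records resolve to old keys
  if (!key) throw new Error(`unknown key id: ${record.keyId}`);
  const decipher = createDecipheriv('aes-256-gcm', key, record.iv);
  decipher.setAuthTag(record.authTag);
  return Buffer.concat([decipher.update(record.ciphertext), decipher.final()]);
}

Because the lookup goes by the stored key ID rather than by currentKeyId, changing the current key never breaks reads of older data.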

Three triggers:

  1. Scheduled rotation — a security policy mandates rotation on a fixed cadence (every 90 days, yearly).
  2. Suspected compromise — leaked key material; rotate immediately.
  3. Compliance — regulatory requirements mandating periodic rotation.

Even without a specific trigger, periodic rotation is good practice — limits blast radius of an undetected leak.

The procedure, step by step:

  1. Generate a fresh key. Add it to keyRing.keys; do NOT yet promote it to currentKeyId.
  2. Roll out the updated keyRing to all nodes. Verify reads still work — old data still decrypts under the old key, and new writes still go under the old (still-current) key.
  3. Update currentKeyId to the new key. Roll out. New writes use the new key; old data is still readable.
  4. Run the re-encryption sweep. Old data is read, decrypted, and re-encrypted under the new key.
  5. Once the sweep completes, remove the old key from the keyRing. Keep a rollback window of ~7 days where the old key is still available; after that, drop it.

The framework supports each of these steps without downtime.

Step 1: add the new key to the ring, but keep currentKeyId pointing at the old key.

const keyRing = new MasterKeyRing({
  keys: {
    'k1': process.env.MASTER_KEY_V1!,
    'k2': process.env.MASTER_KEY_V2!, // new
  },
  currentKeyId: 'k1', // still v1 — only added v2 to ring
});

Roll out this config to every node. Reads of v1-encrypted data still work; writes still use v1.

This step is safe and reversible — if v2 isn’t actually needed yet, revert by removing it from keys.
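For reference, the revert is just the original single-key ring:

// Reverting step 1: drop the unused v2 entry. Nothing has been written
// under k2 yet, so no data becomes unreadable.
const keyRing = new MasterKeyRing({
  keys: {
    'k1': process.env.MASTER_KEY_V1!,
  },
  currentKeyId: 'k1',
});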

Step 3: promote the new key to currentKeyId.

const keyRing = new MasterKeyRing({
  keys: {
    'k1': process.env.MASTER_KEY_V1!,
    'k2': process.env.MASTER_KEY_V2!,
  },
  currentKeyId: 'k2', // ← now v2
});

Roll out. New writes go under v2. Reads consult the keyRing and find the right key (v1 or v2) based on the stored data’s key-id header.

After this step, new data gradually accumulates under v2 as the workload writes. Old data stays under v1 until re-encrypted.

Step 4: run the re-encryption sweep.

import { ReEncryptionSweep } from 'actor-ts';

const sweep = new ReEncryptionSweep({
  store: snapshotStore,
  keyRing,
  targetKeyId: 'k2',
  batchSize: 100,
  rateLimit: 50, // items per second
});

await sweep.run();

The sweep walks the store, finds items not encrypted under targetKeyId, decrypts + re-encrypts.

Knobs:

  • batchSize — items processed per batch.
  • rateLimit — bound the I/O rate so the sweep doesn’t hammer the underlying storage.
  • filter — limit to specific persistenceIds (for partial rotation).

The sweep is idempotent + resumable — interrupting it and re-running picks up where it left off. Tracks progress in the store’s metadata.
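As a sketch of partial rotation, the same sweep can be scoped with the filter knob listed above; the exact shape of filter (here, an array of persistenceIds) and the IDs themselves are illustrative assumptions to check against your version.

const partialSweep = new ReEncryptionSweep({
  store: snapshotStore,
  keyRing,
  targetKeyId: 'k2',
  batchSize: 100,
  rateLimit: 50,
  filter: ['billing-actor-1', 'billing-actor-2'], // hypothetical persistenceIds
});

// Safe to re-run after an interruption: progress is tracked in the store's
// metadata, and items already encrypted under targetKeyId are skipped.
await partialSweep.run();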

After the sweep completes (every item encrypted under v2):

const keyRing = new MasterKeyRing({
  keys: {
    'k2': process.env.MASTER_KEY_V2!,
  },
  currentKeyId: 'k2',
});

Drop v1 entirely. Any data still encrypted under v1 (e.g., backups that haven’t been re-encrypted) is now unreadable.

Wait out a rollback window before dropping it: ~7 days lets you recover from “oh no, the sweep didn’t actually cover all the backups.” Once the migration is confirmed, drop v1.
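One way to confirm before dropping the key is to scan for anything still encrypted under v1. This is a hypothetical check: snapshotStore.scanKeyIds() is an illustrative accessor, not a documented API; substitute whatever your store exposes for inspecting stored key IDs.

const counts = new Map<string, number>();
for await (const { keyId } of snapshotStore.scanKeyIds()) {
  counts.set(keyId, (counts.get(keyId) ?? 0) + 1);
}

// Refuse to drop k1 while anything still depends on it.
const remaining = counts.get('k1') ?? 0;
if (remaining > 0) {
  throw new Error(`refusing to drop k1: ${remaining} items still encrypted under it`);
}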

The keyRing doesn’t ship a key-storage backend; sourcing the key material is up to you. Common patterns:

Source                                  Pattern
Env vars                                process.env.MASTER_KEY_V2 — simplest, fine for tests.
K8s secrets                             Mounted as files; read at startup.
HashiCorp Vault                         Pull dynamically at startup; refresh periodically.
AWS KMS / GCP KMS / Azure Key Vault     Cloud KMS APIs. Decrypt-on-load via the KMS encryption keys.

For production, KMS is the right answer — keys never leave the secure boundary in plaintext form.
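A sketch of the decrypt-on-load pattern with AWS KMS (@aws-sdk/client-kms): the master keys are kept as KMS-encrypted ciphertexts (here in env vars whose names are illustrative), decrypted at startup, and handed to the ring. That MasterKeyRing accepts the decrypted material as a base64 string is an assumption; adjust to whatever encoding your configuration uses.

import { KMSClient, DecryptCommand } from '@aws-sdk/client-kms';
import { MasterKeyRing } from 'actor-ts';

const kms = new KMSClient({});

// Decrypt one KMS-wrapped master key and return it as a base64 string.
async function decryptMasterKey(ciphertextB64: string): Promise<string> {
  const { Plaintext } = await kms.send(
    new DecryptCommand({ CiphertextBlob: Buffer.from(ciphertextB64, 'base64') }),
  );
  return Buffer.from(Plaintext!).toString('base64');
}

const keyRing = new MasterKeyRing({
  keys: {
    'k1': await decryptMasterKey(process.env.MASTER_KEY_V1_CIPHERTEXT!),
    'k2': await decryptMasterKey(process.env.MASTER_KEY_V2_CIPHERTEXT!),
  },
  currentKeyId: 'k2',
});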

If multiple clusters share the same encrypted store (e.g., a DR replica that reads the primary’s backups), all clusters need the same keyRing. Rotate them together; don’t let one cluster fall behind on key generations.