Stock metrics

When the metrics extension starts, the framework automatically records a baseline of metrics covering the actor system, mailboxes, cluster, sharding, persistence, and broker actors.

```typescript
import { ActorSystem, MetricsExtensionId } from 'actor-ts';

declare const system: ActorSystem; // your running actor system

const metrics = system.extension(MetricsExtensionId);
// Stock metrics start recording — no further setup
```

These are the metrics you’d write yourself anyway. Shipping them out of the box lets you wire a dashboard immediately.

Per-actor-class metrics — covers the actors hosting application logic:

| Metric | Type | Labels | Meaning |
| --- | --- | --- | --- |
| `actor_messages_received_total` | counter | `class`, `path` | Total messages routed to an actor’s mailbox. |
| `actor_messages_processed_total` | counter | `class`, `path` | Total messages handed to `onReceive`. |
| `actor_messages_failed_total` | counter | `class`, `path` | Messages whose `onReceive` threw. |
| `actor_message_duration_ms` | histogram | `class`, `path` | Per-message processing time. |
| `actor_restarts_total` | counter | `class`, `path` | Supervisor-driven restarts. |

The path label may be high-cardinality (one path per actor). For sharded entities (thousands of paths), the framework caps label series — only the first N paths get tracked individually, the rest aggregate into a single _other series.
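The capping behavior can be pictured with a small sketch. The cap value and helper name below are illustrative, not the framework's actual API:

```typescript
// Illustrative cap on tracked label series (the real limit is configurable
// in the framework; 3 is just for demonstration).
const MAX_TRACKED_PATHS = 3;

const tracked = new Set<string>();

// Returns the `path` label value a metric series would carry for an actor.
// The first MAX_TRACKED_PATHS distinct paths are tracked individually;
// everything after that aggregates into the single `_other` series.
function seriesLabel(path: string): string {
  if (tracked.has(path)) return path;
  if (tracked.size < MAX_TRACKED_PATHS) {
    tracked.add(path);
    return path;
  }
  return '_other';
}
```

The effect: dashboards stay readable and the metric backend isn’t flooded with one series per sharded entity.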

Mailbox metrics:

| Metric | Type | Labels | Meaning |
| --- | --- | --- | --- |
| `actor_mailbox_size` | gauge | `class`, `path` | Current depth. |
| `actor_mailbox_enqueued_total` | counter | `class`, `path` | Total enqueues. |
| `actor_mailbox_dequeued_total` | counter | `class`, `path` | Total dequeues. |
| `actor_mailbox_dropped_total` | counter | `class`, `path`, `reason` | Drops by overflow policy. |

mailbox_size is the point-in-time depth; combined with the enqueue / dequeue rates, you can compute backpressure.

A high mailbox_size with growing dropped_total is a slow-consumer signal — the actor can’t keep up with its arrival rate.
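As a rough sketch of that backpressure computation (the sample shape below is an assumption for illustration, not how the framework exposes counters):

```typescript
// Two successive scrapes of an actor's mailbox counters.
interface MailboxSample {
  enqueuedTotal: number;
  dequeuedTotal: number;
  intervalSeconds: number; // time between the two samples
}

// Positive result: messages arrive faster than the actor drains them,
// i.e. the mailbox is under backpressure and will grow.
function backpressure(prev: MailboxSample, curr: MailboxSample): number {
  const enqueueRate =
    (curr.enqueuedTotal - prev.enqueuedTotal) / curr.intervalSeconds;
  const dequeueRate =
    (curr.dequeuedTotal - prev.dequeuedTotal) / curr.intervalSeconds;
  return enqueueRate - dequeueRate;
}
```

In practice you would express the same subtraction of rates as a query in your metrics backend rather than in application code.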

Cluster metrics:

| Metric | Type | Labels | Meaning |
| --- | --- | --- | --- |
| `cluster_members_count` | gauge | `status` | Members in each state (Joining/Up/Unreachable/etc.) |
| `cluster_gossip_messages_total` | counter | `direction` | In/out gossip count. |
| `cluster_member_transitions_total` | counter | `from`, `to` | State transitions per type. |
| `cluster_unreachable_duration_ms` | histogram | — | How long unreachable members stay that way. |

For monitoring cluster health:

  • cluster_members_count{status="up"} should equal your configured replica count.
  • cluster_members_count{status="unreachable"} > 0 is an alert trigger.
  • cluster_unreachable_duration_ms p99 gives “how flappy is the network” — high values mean your failure-detector might need tuning.
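The first two rules above can be sketched as a check over a snapshot of `cluster_members_count` by status. The snapshot shape and function name are assumptions for this sketch, not the framework's API:

```typescript
// Snapshot of cluster_members_count, keyed by the `status` label.
type MemberCounts = Record<string, number>;

// Returns human-readable alert strings; empty array means healthy.
function clusterAlerts(counts: MemberCounts, expectedReplicas: number): string[] {
  const alerts: string[] = [];
  const up = counts['up'] ?? 0;
  const unreachable = counts['unreachable'] ?? 0;
  if (up !== expectedReplicas) {
    alerts.push(`up=${up}, expected ${expectedReplicas}`);
  }
  if (unreachable > 0) {
    alerts.push(`${unreachable} unreachable member(s)`);
  }
  return alerts;
}
```

In production these would normally live as alerting rules in the metrics backend rather than in application code.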

Sharding metrics:

| Metric | Type | Labels | Meaning |
| --- | --- | --- | --- |
| `sharding_shards_hosted` | gauge | `type`, `region` | Shards hosted per region per type. |
| `sharding_entities_count` | gauge | `type`, `region` | Active entities per region. |
| `sharding_rebalances_total` | counter | `type` | Rebalance events. |
| `sharding_handoffs_total` | counter | `type`, `outcome` | Handoff success / failure. |
| `sharding_passivations_total` | counter | `type`, `reason` | Idle-timeout / max-entities / manual. |

Useful dashboards:

  • Hot regions — sharding_shards_hosted per region.
  • Entity churn — sharding_passivations_total rate.
  • Rebalance frequency — high values indicate cluster instability.
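The "hot regions" panel above amounts to comparing each region's `sharding_shards_hosted` against the mean. A minimal sketch, assuming a snapshot keyed by region and an illustrative 1.5× threshold:

```typescript
// Flags regions hosting noticeably more shards than the cluster mean.
// `factor` is an assumed threshold, tune it for your topology.
function hotRegions(
  shardsByRegion: Record<string, number>,
  factor = 1.5,
): string[] {
  const values = Object.values(shardsByRegion);
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  return Object.entries(shardsByRegion)
    .filter(([, shards]) => shards > mean * factor)
    .map(([region]) => region);
}
```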

Persistence metrics:

| Metric | Type | Labels | Meaning |
| --- | --- | --- | --- |
| `persistence_events_written_total` | counter | `pid_prefix` | Events appended. |
| `persistence_events_replayed_total` | counter | `pid_prefix` | Events read during recovery. |
| `persistence_snapshot_saves_total` | counter | `pid_prefix` | Snapshots written. |
| `persistence_recovery_duration_ms` | histogram | `pid_prefix` | Time from preStart to recovery-complete. |

recovery_duration_ms is one of the most actionable metrics — if recovery starts taking seconds, snapshot more aggressively.

pid_prefix is a label group derived from your persistenceId — e.g., account-* aggregates all account events.
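One plausible derivation of that label group, shown only as a sketch — the split-on-first-dash convention is an assumption, not the framework's documented rule:

```typescript
// Derives a pid_prefix label from a persistenceId by keeping everything
// before the first dash, e.g. "account-42" -> "account-*".
function pidPrefix(persistenceId: string): string {
  const dash = persistenceId.indexOf('-');
  return dash === -1 ? persistenceId : `${persistenceId.slice(0, dash)}-*`;
}
```

The point of grouping is cardinality: millions of `account-<id>` streams collapse into one `account-*` series.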

For BrokerActor subclasses (Kafka, MQTT, etc.):

| Metric | Type | Labels | Meaning |
| --- | --- | --- | --- |
| `broker_state` | gauge | `actor`, `endpoint` | 0 = disconnected, 1 = connected, etc. |
| `broker_connect_attempts_total` | counter | `actor`, `endpoint`, `outcome` | Connect attempts, success/failure. |
| `broker_messages_in_total` | counter | `actor`, `topic` | Inbound messages from broker. |
| `broker_messages_out_total` | counter | `actor`, `topic` | Outbound messages to broker. |
| `broker_buffer_size` | gauge | `actor` | Outbound buffer depth. |
| `broker_buffer_overflow_total` | counter | `actor`, `policy` | Buffer overflows. |

broker_state is the fastest signal for connection issues — a gauge that drops below 1 means a broker is down.
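Flagging down brokers from `broker_state` is then a one-line filter. The sample shape below is an assumption for illustration, not the framework's scrape format:

```typescript
// One broker_state sample per (actor, endpoint) label pair.
interface BrokerSample {
  actor: string;
  endpoint: string;
  state: number; // 0 = disconnected, 1 = connected
}

// Any gauge below 1 means that broker connection is down.
function downBrokers(samples: BrokerSample[]): BrokerSample[] {
  return samples.filter((s) => s.state < 1);
}
```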

```typescript
system.extension(MetricsExtensionId, {
  enableStockMetrics: false, // off
});
```

If stock metrics’ overhead matters (CPU-tight loops, very-high actor churn), you can opt out. Most production deployments keep them on — the overhead is negligible.

```typescript
const metrics = system.extension(MetricsExtensionId);
metrics.registry.setStaticLabels({
  region: 'eu-west-1',
  env: 'production',
});
```

Static labels apply to every metric. Useful for the global context the metric backend should join on (region, env, pod name).