Core Metrics & Device Vitals

Memfault provides support for a set of core metrics that apply to a wide range of devices. These metrics are either automatically collected by the on-device SDKs, or the SDKs have built-in facilities for enabling collection. The core metrics are also exempt from Metric and Device Attribute quotas.

Example of Device Vitals

Once data is being collected, Memfault's servers automatically process these metrics to provide a set of key insights ("Device Vitals") across your fleet. Each Device Vitals insight can be visualized in a pre-configured card that you can add to your dashboards.

Required SDK Version

The appropriate SDK version needs to be integrated before these metrics can be used. See the guides linked from each platform's section for getting Memfault up and running on your devices.

Platform | Required SDK Version
Android  | >= 4.13
Linux    | >= 1.9
MCU      | >= 1.5.0

A summary of the Memfault Core Metrics provided by the device SDKs:

Category | Metric Keys | Automatically Collected on these SDKs
Battery Life | battery_soc_pct_drop, battery_discharge_duration_ms, battery_soc_pct | Android
Stable Hours | operational_hours, operational_crashfree_hours | Android, Linux, MCU
Stable Sessions | operational_crashes | Android, Linux
Periodic Connectivity (custom) | sync_successful, sync_failure | (none; recorded explicitly by the device)
Periodic Connectivity (Memfault) | sync_memfault_successful, sync_memfault_failure | Android, Linux
Always-On Connectivity | connectivity_connected_time_ms, connectivity_expected_time_ms | Android

Battery Life

For background and principles around these metrics, see the following docs:

The Battery Life core metrics are:

  • battery_soc_pct: The battery's State of Charge (SoC) at the end of the Heartbeat interval, reported as a percentage from 0-100.
  • battery_soc_pct_drop: The drop in the battery's State of Charge (SoC) during the Heartbeat interval, reported as a percentage drop from 0-100.
  • battery_discharge_duration_ms: The time spent discharging the battery during the Heartbeat interval, in milliseconds.
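
As a rough illustration of how these three values can combine into an expected battery life figure, the sketch below sums the SoC drop and the discharge time across heartbeats and extrapolates to a full 100% discharge. The aggregation formula, the Heartbeat class, and the sample values are assumptions made for this example; they are not taken from Memfault's server-side implementation.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Battery metrics from one heartbeat interval (hypothetical container)."""
    battery_soc_pct_drop: float         # percentage points lost, 0-100
    battery_discharge_duration_ms: int  # time spent discharging, in ms


def expected_battery_life_hours(heartbeats):
    """Estimate the hours needed to drain from 100% to 0% (assumed formula)."""
    total_drop = sum(h.battery_soc_pct_drop for h in heartbeats)
    total_ms = sum(h.battery_discharge_duration_ms for h in heartbeats)
    if total_drop <= 0:
        return None  # no discharge observed, so no estimate is possible
    hours_discharging = total_ms / 3_600_000
    # Hours per percentage point, scaled to a full 100-point discharge.
    return hours_discharging * 100 / total_drop


# Made-up sample data: two one-hour heartbeats that dropped 2 and 3 points.
sample = [
    Heartbeat(battery_soc_pct_drop=2.0, battery_discharge_duration_ms=3_600_000),
    Heartbeat(battery_soc_pct_drop=3.0, battery_discharge_duration_ms=3_600_000),
]
print(expected_battery_life_hours(sample))
```

Summing before dividing weights each heartbeat by how long the device was actually discharging, so short intervals with noisy readings do not dominate the estimate.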

SDK Collection

These metrics are standardized and supported across all Memfault SDKs, but collection varies by platform:

Battery metrics are collected automatically on devices running the Memfault Android SDK.

Insights

Add an Expected Battery Life card to your dashboard to see these insights.

Expected Battery Life chart

Version 1.1.0 has worse battery life than 1.0.0.

Stable Hours

For background and principles around these metrics, see the following docs:

The Stable Hours core metrics are:

  • operational_hours: The number of hours the device has been operational, since the last collection of this metric.
  • operational_crashfree_hours: The number of hours the device has been operational without a crash, since the last collection of this metric.

The goal of these metrics is to provide a straightforward aggregate metric, "% of operational hours without crashes", that can be used to track the reliability of a fleet of devices over time and compare reliability across firmware releases.
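
The "% of operational hours without crashes" figure can be sketched as a simple ratio of summed values, grouped by software version so that releases can be compared. The tuple layout, the function name, and the sample numbers below are assumptions for illustration only, not Memfault's actual processing pipeline.

```python
from collections import defaultdict


def crashfree_pct_by_version(reports):
    """Return {version: % of operational hours without crashes}."""
    totals = defaultdict(lambda: [0, 0])  # version -> [hours, crash-free hours]
    for version, hours, crashfree in reports:
        totals[version][0] += hours
        totals[version][1] += crashfree
    return {
        version: 100.0 * crashfree / hours
        for version, (hours, crashfree) in totals.items()
        if hours > 0  # skip versions with no reported operational hours
    }


# Made-up sample data:
# (software version, operational_hours, operational_crashfree_hours)
reports = [
    ("1.0.0", 24, 24),
    ("1.0.0", 24, 23),
    ("1.1.0", 24, 20),
]
print(crashfree_pct_by_version(reports))
```

In this sample, version 1.0.0 has 47 crash-free hours out of 48, while 1.1.0 has 20 out of 24, which is the kind of gap the aggregate is meant to surface.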

These metrics are collected automatically by all Memfault SDKs.

Note: On MCU, a "crash" is counted when the device restarts unexpectedly. On Android and Linux, a "crash" is counted when a process crashes and a trace is captured by the Memfault SDK.

These metric values increment only on continuous hours of operation, so the Stable Hours measurement works best when devices regularly exceed one hour of continuous operation. For the same reason, this chart can only be used with periodic heartbeat reports.

If your device operates on sub-hour intervals (for example, no uptime counter is maintained between sessions, or the device restarts often), Memfault recommends using "Stable Sessions" instead of "Stable Hours" to measure reliability.

Low Power Modes and Operational Hours

Devices in low power modes usually keep advancing the time value that the Memfault SDK relies on, so continuous operational hours are still captured while the device is in a low power mode. This behavior is device- and platform-specific:

On Android, the operational_hours and operational_crashfree_hours metrics will advance through device low-power modes, and will reset if the device restarts.

The crashfree hours algorithm currently considers ANRs, Tombstones, non-WTF Exceptions, and all Kernel Oops as crashes.

Insights

Add a Stable Hours card to your dashboard to see these insights.

Stable Hours chart

Version 1.1.0 crashes more often than other versions.

Stable Sessions

This Device Vital is similar to Stable Hours, except that it measures stability in terms of sessions instead of hours. Therefore, this chart can only be used with session report types.

The Stable Sessions core metric is:

  • operational_crashes: The number of crashes that have occurred in the current session.

This metric is automatically collected by the Linux and Android SDKs.
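
By analogy with Stable Hours, one plausible session-level aggregate is the share of sessions that reported zero crashes. The sketch below assumes that definition; the function name and the sample data are hypothetical and do not come from the Memfault SDKs.

```python
def stable_sessions_pct(session_crash_counts):
    """Percentage of sessions whose operational_crashes value is zero."""
    counts = list(session_crash_counts)
    if not counts:
        return None  # no sessions reported
    stable = sum(1 for crashes in counts if crashes == 0)
    return 100.0 * stable / len(counts)


# One operational_crashes value per session report (made-up sample data).
print(stable_sessions_pct([0, 0, 2, 0]))  # 75.0
```

Note that a session with two crashes counts the same as a session with one: under this assumed definition, only the presence of a crash matters.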

Insights

Add a Stable Sessions card to your dashboard to see these insights.

Stable Sessions chart

Version 1.1.0 crashes more often than 1.0.0.

Connectivity

Connectivity metrics are split into two categories:

  • Periodic Connectivity: Metrics that track the success or failure of individual data sync sessions
  • Always-On Connectivity: Metrics that track the uptime of the device's connectivity

Both variants of the connectivity metrics provide an aggregate analysis similar to Stable Hours: Memfault can show "% of successful data syncs" or "% of connectivity uptime" for groups of devices.

Periodic Connectivity

These metrics are used to report sync success or failure for a single data sync operation:

  • sync_successful: The number of successful syncs
  • sync_failure: The number of failed syncs
  • sync_memfault_successful: The number of successful syncs to Memfault
  • sync_memfault_failure: The number of failed syncs to Memfault

The sync_successful and sync_failure metrics need to be explicitly recorded when the device performs its data sync operation. It's up to the device to define what data sync should be used for these metrics. For example, syncing user configuration data for a BLE connected bird feeder might be considered a data sync. Memfault will use these metrics to compute the success rate of the fleet's data syncs.
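
A minimal sketch of the success-rate calculation, assuming it is the number of successful syncs divided by all sync attempts across the reports being aggregated. The dictionary layout and the sample counts are illustrative assumptions, not a Memfault data format.

```python
def sync_success_pct(reports):
    """Percentage of successful syncs across a list of metric reports."""
    ok = sum(r.get("sync_successful", 0) for r in reports)
    failed = sum(r.get("sync_failure", 0) for r in reports)
    attempts = ok + failed
    if attempts == 0:
        return None  # no sync attempts were recorded
    return 100.0 * ok / attempts


# Made-up sample data: one dict of counter values per report.
reports = [
    {"sync_successful": 9, "sync_failure": 1},
    {"sync_successful": 5, "sync_failure": 5},
]
print(sync_success_pct(reports))  # 70.0
```

The same calculation would apply to sync_memfault_successful and sync_memfault_failure by swapping the keys.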

SDK Collection

The SDK API for collecting these metrics varies by platform:

There are several options for recording successful and failed attempts for synchronizations with your backend (or any other criteria you'd like to use for connectivity).

The Kotlin reporting-lib has built-in support for syncs using the Reporting.report().sync() API on versions 1.4+.

If the Kotlin library is not available, you can instead record success and failure events with a Counter metric that uses the special metric names sync_successful and sync_failure and sets sumInReport = true.

Synchronization to Memfault

The sync_memfault_* metrics are automatically collected by the Android SDK.

Always-On Connectivity

Devices with continuous connectivity can use these metrics to track the relative uptime of their connectivity:

  • connectivity_connected_time_ms: The time spent connected, in milliseconds
  • connectivity_expected_time_ms: The time expected to be connected, in milliseconds

Examples of devices with continuous connectivity include:

  • Mains powered Wi-Fi connected devices
  • Cellular connected devices
  • Ethernet connected devices
  • Mains powered Zigbee/802.15.4/Thread devices

This connectivity metric is used by Memfault to compute relative uptime.
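
Relative uptime can be sketched as connected time divided by expected time, summed over the reports in a group. The pair layout, the function name, and the sample values are assumptions for this example, and the real computation may differ.

```python
def connectivity_uptime_pct(reports):
    """reports: list of (connected_time_ms, expected_time_ms) pairs."""
    connected = sum(connected_ms for connected_ms, _ in reports)
    expected = sum(expected_ms for _, expected_ms in reports)
    if expected == 0:
        return None  # the device was never expected to be connected
    return 100.0 * connected / expected


# Made-up sample data: two one-hour intervals, the first with 6 minutes offline.
reports = [
    (3_240_000, 3_600_000),
    (3_600_000, 3_600_000),
]
print(connectivity_uptime_pct(reports))  # 95.0
```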

SDK Collection

Collection of the Always-On Connectivity metrics varies by platform:

Automatically collected on Android.

For the purpose of computing connectivity_expected_time_ms, the Android SDK expects that the device should always be connected to a network.

Insights

Add Connectivity cards to your dashboard to see these insights.

Connectivity charts

Version 1.1.0 experiences more connection difficulties than other versions.