Metrics

There are many system health vitals worth tracking besides crashes and reboots. The options are numerous; here are a few examples:
  • RTOS related statistics
    • Amount of time spent in each RTOS task per unit time. This can help you understand if one task is starving the system
    • Heap high water marks
    • Stack high water marks
  • Time MCU was in different states
    • Stop, Sleep, Run Mode
    • Time each peripheral was active
  • Battery life drop per unit time
  • Transport-specific metrics (LTE, Wi-Fi, BLE, LoRa, etc.)
    • Amount of time transport was connected
    • Number of connection attempts
    • Number of bytes sent over transport per unit time

In the Memfault UI, you can configure Alerts based on these metrics, as well as explore metrics collected for any device.

Here is an example: the time Bluetooth was connected, the number of bytes sent, and the battery life were tracked. Memfault's UI visualizes the data collected from each device over time in customizable graphs.

The Memfault SDK includes a "metrics" component that makes it easy to collect this information on an embedded device. In the sections below, we will walk through how to get started with the component.

Prerequisite

This guide assumes you have already completed the minimal integration of the Memfault SDK. If you have not, please complete the appropriate Getting Started guide.

Rate Limiting

Ingestion of Metrics may be rate-limited. Avoid sending data more than once per hour per device.

Metric Categories

Metrics can generally be categorized into these buckets:

  • Counters: A metric that is incremented or decremented over time. For example, the number of bytes sent over a transport.
  • Gauges: A metric that is set to an instantaneous value. For example, the current battery state of charge.
  • Timers: A metric that tracks the amount of time spent in a particular state or performing a particular action. For example, the amount of time the device was in a low power state.

The Memfault Firmware SDK provides convenience APIs for each of these types of metrics.
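
To make the three categories concrete, here is a minimal sketch in plain C that does not depend on the SDK. The `ExampleMetrics` struct and the `example_*` functions are hypothetical names used only to illustrate the semantics; in real firmware you would use the SDK's convenience APIs instead.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-in storage; not part of the Memfault SDK.
typedef struct {
  uint32_t bytes_sent;          // counter: accumulates over time
  uint32_t battery_pct;         // gauge: the latest sample wins
  uint32_t low_power_ms;        // timer: total time spent in a state
  uint64_t low_power_start_ms;  // when the current low-power period began
  bool low_power_active;
} ExampleMetrics;

// Counter: add a delta each time the event occurs.
void example_counter_add(ExampleMetrics *m, uint32_t delta) {
  m->bytes_sent += delta;
}

// Gauge: overwrite with the instantaneous reading.
void example_gauge_set(ExampleMetrics *m, uint32_t value) {
  m->battery_pct = value;
}

// Timer: remember when the state was entered ...
void example_timer_start(ExampleMetrics *m, uint64_t now_ms) {
  if (!m->low_power_active) {
    m->low_power_start_ms = now_ms;
    m->low_power_active = true;
  }
}

// ... and accumulate the elapsed time when the state is left.
void example_timer_stop(ExampleMetrics *m, uint64_t now_ms) {
  if (m->low_power_active) {
    m->low_power_ms += (uint32_t)(now_ms - m->low_power_start_ms);
    m->low_power_active = false;
  }
}
```

Two sends of 20 and 12 bytes leave the counter at 32, whereas two gauge writes of 90 and 87 leave the gauge at 87: a counter remembers every update, a gauge only the most recent one.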

Defining Custom Metrics

All custom metrics are defined with the MEMFAULT_METRICS_KEY_DEFINE macro in the memfault_metrics_heartbeat_config.def file created as part of your port. In this guide, we will walk through a simple example that tracks the stack high water mark for a "Main Task" in our application and the number of bytes sent out over a Bluetooth connection. The example also defines a string metric, ManufDate, which is used later in this guide.

// File $PROJECT_ROOT/third_party/memfault/memfault_metrics_heartbeat_config.def
MEMFAULT_METRICS_KEY_DEFINE(MainTaskStackHwm, kMemfaultMetricType_Unsigned)
MEMFAULT_METRICS_KEY_DEFINE(BtBytesSent, kMemfaultMetricType_Unsigned)
MEMFAULT_METRICS_STRING_KEY_DEFINE(ManufDate, sizeof("2022-05-09"))

Dependency Function Overview

The metrics subsystem uses the "timer" implemented as part of your initial port to control when data is aggregated into a "heartbeat". When the heartbeat subsystem boots, it calls the dependency function memfault_platform_metrics_timer_boot to set up this timer. Most RTOSs provide a software timer that maps directly onto this API; a hardware timer can be used as well. The callback is expected to be invoked every period_sec seconds (by default, once per hour).
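
The following sketch shows one way such a periodic timer could be driven from a one-second tick. It is illustrative only: `example_metrics_timer_boot` stands in for memfault_platform_metrics_timer_boot (take the exact signature and callback type from your SDK headers), and `example_on_one_second_tick` is a hypothetical hook you would call from an RTOS software timer or a hardware timer interrupt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical callback type; the SDK declares its own equivalent.
typedef void (*ExampleTimerCallback)(void);

static ExampleTimerCallback s_callback = NULL;
static uint32_t s_period_sec = 0;
static uint32_t s_elapsed_sec = 0;

// Stand-in for the timer dependency function: store the period and callback.
bool example_metrics_timer_boot(uint32_t period_sec,
                                ExampleTimerCallback callback) {
  if (period_sec == 0 || callback == NULL) {
    return false;
  }
  s_period_sec = period_sec;
  s_callback = callback;
  s_elapsed_sec = 0;
  return true;
}

// Hypothetical hook: call this once per second from your timer source.
void example_on_one_second_tick(void) {
  if (s_callback == NULL) {
    return;
  }
  if (++s_elapsed_sec >= s_period_sec) {
    s_elapsed_sec = 0;
    s_callback();  // one heartbeat interval has elapsed
  }
}

// Demo callback that counts invocations; the SDK passes its own callback.
uint32_t g_example_heartbeat_count = 0;
void example_heartbeat_callback(void) {
  g_example_heartbeat_count++;
}
```

With a period of 3 seconds, seven ticks invoke the callback twice (after the third and sixth tick). Where the RTOS offers an auto-reloading software timer, using it directly is usually simpler than counting ticks by hand.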

The metrics subsystem supports a timer type (kMemfaultMetricType_Timer), which makes it easy to track durations (e.g., time spent in MCU stop mode) and overall system uptime. To support this, the memfault_platform_get_time_since_boot_ms() function implemented as part of the initial port is used. Typically, this information is derived from either a system's Real Time Clock (RTC) or the SysTick counter used by an RTOS.
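
When the time source is a 32-bit tick counter, it eventually wraps, so the port usually has to extend it before converting to milliseconds. Below is a minimal sketch of that idea. The names `example_time_since_boot_ms` and `EXAMPLE_TICK_RATE_HZ` are hypothetical, and the tick count is passed in as a parameter only to keep the sketch self-contained; a real port would read it from the RTOS or hardware inside memfault_platform_get_time_since_boot_ms().

```c
#include <stdint.h>

// Assumption: 1000 ticks per second; adjust to your RTOS tick rate.
#define EXAMPLE_TICK_RATE_HZ 1000u

static uint32_t s_last_ticks = 0;
static uint64_t s_total_ticks = 0;

// Extend a wrapping 32-bit tick counter to 64 bits and convert to
// milliseconds. Must be called at least once per wrap period of the
// 32-bit counter, and is not thread-safe as written.
uint64_t example_time_since_boot_ms(uint32_t now_ticks) {
  // Unsigned subtraction yields the correct delta even across a wrap.
  s_total_ticks += (uint32_t)(now_ticks - s_last_ticks);
  s_last_ticks = now_ticks;
  return (s_total_ticks * 1000u) / EXAMPLE_TICK_RATE_HZ;
}
```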

Setting Metric Values

The APIs in components/include/memfault/metrics/metrics.h update heartbeat metrics as events occur. The updates happen in RAM, so they add negligible overhead. Here's an example:

#include <stddef.h>

#include "memfault/metrics/metrics.h"
// [ ... ]
void bluetooth_driver_send_bytes(const void *data, size_t data_len) {
  // count the bytes handed to the transport in the current heartbeat
  MEMFAULT_METRIC_ADD(BtBytesSent, data_len);
  // [ ... code to send Bluetooth data ... ]
}

String metrics are stored in the same heartbeat snapshot. The process for setting a string metric might look like this, for example:

#include "memfault/metrics/metrics.h"
void set_manufacturing_date_metric(const char *manufacturing_date) {
  // set the manufacturing date string metric
  MEMFAULT_METRIC_SET_STRING(ManufDate, manufacturing_date);

  // optionally, trigger a heartbeat to immediately capture the metric record
  memfault_metrics_heartbeat_debug_trigger();

  // optionally, trigger an upload of Memfault chunk data
  // [ ... code to trigger memfault upload ... ]
}
Note

If a string metric is not reported in a heartbeat interval, the previously reported value will not be overwritten by Memfault's backend. This can be used for bandwidth optimization by only reporting values on bootup or when they change.

For SDK versions 0.42.0 and above, if an integer metric is not set in a heartbeat interval, a null value is sent and ignored by Memfault's backend. For SDK versions before 0.42.0, a value of 0 is sent and recorded.

Including Sampled Values in a Heartbeat

memfault_metrics_heartbeat_collect_data() is called at the very end of each heartbeat interval.

By default, this is a weak empty function, but you will want to implement it if there's data you want to sample and include in a heartbeat (e.g., recorded RSSI, battery level, stack high water marks, etc.).

The MainTaskStackHwm we are tracking in this guide is a good example of how to make use of this function.

#include <stdint.h>

#include "memfault/metrics/metrics.h"
#include "memfault/metrics/platform/overrides.h"
// [...]
void memfault_metrics_heartbeat_collect_data(void) {
  // NOTE: When using FreeRTOS we can just call
  // "uxTaskGetStackHighWaterMark(s_main_task_tcb)"
  // TODO: replace the placeholder 0 with code to get the high water mark
  const uint32_t stack_high_water_mark = 0;
  MEMFAULT_METRIC_SET_UNSIGNED(MainTaskStackHwm, stack_high_water_mark);
}
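
On systems without an RTOS helper for this, one common technique for obtaining a stack high water mark is "stack painting": fill the stack region with a known pattern before the task runs, then count how many bytes still hold the pattern. The sketch below assumes a stack that grows downward (toward lower addresses) and a fill byte of 0xA5; all names are hypothetical and none of this is a Memfault SDK API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Assumption: any byte value unlikely to dominate real stack contents.
#define EXAMPLE_STACK_FILL 0xA5u

// Paint the whole stack region once, before the task starts running.
void example_stack_paint(uint8_t *stack_low, size_t size) {
  memset(stack_low, EXAMPLE_STACK_FILL, size);
}

// Count untouched bytes starting from the lowest address. With a
// downward-growing stack, these are the bytes never reached so far.
uint32_t example_stack_unused_bytes(const uint8_t *stack_low, size_t size) {
  size_t unused = 0;
  while (unused < size && stack_low[unused] == EXAMPLE_STACK_FILL) {
    unused++;
  }
  return (uint32_t)unused;
}
```

The result of `example_stack_unused_bytes()` could then be passed to MEMFAULT_METRIC_SET_UNSIGNED(MainTaskStackHwm, ...) inside memfault_metrics_heartbeat_collect_data(). The scan can undercount usage if the task happens to write the fill value itself, so treat the figure as an estimate.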

Initial Setup & Debug APIs

While integrating the heartbeat metrics subsystem or adding new metrics, you can debug and test the new code in a few easy ways. Notably:

  • memfault_metrics_heartbeat_debug_trigger() can be called at any time to trigger a heartbeat serialization (so you don't have to wait for the entire interval to get data to flush)
  • memfault_metrics_heartbeat_debug_print() can be called to dump the current value of all the metrics being tracked
  • The heartbeat interval can be reduced from the default 3600 seconds for debugging by setting MEMFAULT_METRICS_HEARTBEAT_INTERVAL_SECS in your memfault_platform_config.h to a shorter period, such as 30 seconds.
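
For the last item, the override might look like the fragment below. The value 30 is only a debugging choice; given the rate limiting described earlier, restore the default before shipping.

```c
// memfault_platform_config.h
// Debug-only: shorten the heartbeat interval from the default 3600 seconds.
#define MEMFAULT_METRICS_HEARTBEAT_INTERVAL_SECS 30
```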

Metrics Storage

Metric events are stored in the in-memory ring buffer supplied to the memfault_metrics_boot() initialization function (snippet below is from the ports/templates example):

// initialize the event storage buffer
static uint8_t s_event_storage[1024];
const sMemfaultEventStorageImpl *evt_storage =
    memfault_events_storage_boot(s_event_storage, sizeof(s_event_storage));

// configure trace events to store into the buffer
memfault_trace_event_boot(evt_storage);

// record the current reboot reason
memfault_reboot_tracking_collect_reset_info(evt_storage);

// configure the metrics component to store into the buffer
sMemfaultMetricBootInfo boot_info = {
  .unexpected_reboot_count = memfault_reboot_tracking_get_crash_count(),
};
memfault_metrics_boot(evt_storage, &boot_info);

It may be necessary to adjust the size of the buffer to fit the application's needs; for example, if the device uploads data to Memfault infrequently, the buffer may need to be increased.
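
As a rough way to reason about the size, multiply the number of heartbeats that can accumulate between uploads by the serialized size of one heartbeat. The helper below is a back-of-the-envelope sketch: `example_event_storage_bytes` is a hypothetical name, and the bytes-per-heartbeat figure is a placeholder you would measure for your own set of metrics.

```c
#include <stddef.h>
#include <stdint.h>

// Estimate the bytes needed to hold all heartbeats produced between uploads.
size_t example_event_storage_bytes(uint32_t upload_interval_sec,
                                   uint32_t heartbeat_interval_sec,
                                   size_t bytes_per_heartbeat) {
  if (heartbeat_interval_sec == 0) {
    return 0;
  }
  // Round up so a partially elapsed interval still reserves room.
  const uint64_t heartbeats =
      ((uint64_t)upload_interval_sec + heartbeat_interval_sec - 1) /
      heartbeat_interval_sec;
  return (size_t)heartbeats * bytes_per_heartbeat;
}
```

For example, uploading once a day with hourly heartbeats of an assumed 60 bytes each gives 24 × 60 = 1440 bytes. Because the same buffer also holds trace events and reboot information, leave headroom beyond this estimate.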

Non-volatile Event Storage

The Memfault SDK provides a way to configure a non-volatile supplementary store for the event buffer.

To learn more about that component, see the following header files, which explain how it works:

Timestamping Metrics on Device

For devices that have an onboard source of time (RTC or GNSS receiver, etc), it can be useful to add a timestamp to metrics. This will set a "recorded time" value on the metric when it's decoded by Memfault's server, and the metric reports will show accordingly in the device's timeline.

A detailed description can be found in the Event Timestamps documentation page.

Metric Types

The Memfault SDK supports the following metric types:

Type                          Description
kMemfaultMetricType_Signed    Signed integer
kMemfaultMetricType_Unsigned  Unsigned integer
kMemfaultMetricType_Timer     Timer (duration)
kMemfaultMetricType_String    String

Signed and Unsigned Integer Metrics

Unsigned and signed metrics are stored as 32-bit integers. To define these metrics:

memfault_metrics_heartbeat_config.def
MEMFAULT_METRICS_KEY_DEFINE(MySignedMetric, kMemfaultMetricType_Signed)
MEMFAULT_METRICS_KEY_DEFINE(MyUnsignedMetric, kMemfaultMetricType_Unsigned)

To set the value of these metrics:

// Set the metric value
MEMFAULT_METRIC_SET_SIGNED(MySignedMetric, -1234);
MEMFAULT_METRIC_SET_UNSIGNED(MyUnsignedMetric, 1234);

// Increment the metric value
MEMFAULT_METRIC_ADD(MySignedMetric, 1);
MEMFAULT_METRIC_ADD(MyUnsignedMetric, 1);

Timer Metrics

Timer metrics track a duration, in milliseconds. They are stored as 32-bit integers and tallied at the end of each heartbeat interval. Because the value is stored per interval, a timer that runs continuously reports a value equal to the heartbeat interval in every heartbeat.
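
The per-interval behavior can be illustrated with a small simulation. This is a sketch of the semantics described above, not the SDK's internal implementation; the `ExampleIntervalTimer` type and the `example_interval_timer_*` functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
  bool running;
  uint64_t start_ms;        // start of the currently running stretch
  uint32_t accumulated_ms;  // time from stretches already stopped
} ExampleIntervalTimer;

void example_interval_timer_start(ExampleIntervalTimer *t, uint64_t now_ms) {
  if (!t->running) {
    t->start_ms = now_ms;
    t->running = true;
  }
}

void example_interval_timer_stop(ExampleIntervalTimer *t, uint64_t now_ms) {
  if (t->running) {
    t->accumulated_ms += (uint32_t)(now_ms - t->start_ms);
    t->running = false;
  }
}

// End of a heartbeat interval: report this interval's total and reset.
// A timer that is still running keeps running, restarted from now_ms.
uint32_t example_interval_timer_rollover(ExampleIntervalTimer *t,
                                         uint64_t now_ms) {
  uint32_t value = t->accumulated_ms;
  if (t->running) {
    value += (uint32_t)(now_ms - t->start_ms);
    t->start_ms = now_ms;
  }
  t->accumulated_ms = 0;
  return value;
}
```

A timer started at time 0 and never stopped reports 3,600,000 ms at each hourly rollover rather than an ever-growing total.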

To define a timer metric:

memfault_metrics_heartbeat_config.def
MEMFAULT_METRICS_KEY_DEFINE(MyTimerMetric, kMemfaultMetricType_Timer)

To start and stop the timer:

// Start the timer
MEMFAULT_METRIC_TIMER_START(MyTimerMetric);
// [ ... code to time ... ]
// Stop the timer
MEMFAULT_METRIC_TIMER_STOP(MyTimerMetric);

String Metrics

String metrics are stored as a fixed-length string. The maximum length of the string is defined when the metric is defined. To define a string metric:

memfault_metrics_heartbeat_config.def
MEMFAULT_METRICS_STRING_KEY_DEFINE(MyStringMetric, 32)

To set the value of the string metric:

MEMFAULT_METRIC_SET_STRING(MyStringMetric, "my string value");

String metrics are serialized in the next heartbeat after they are set.