MCU Reboot Reasons

There are many reasons a device may reboot in the field — whether it be due to a crash, a brownout, or a firmware update.

Within the Memfault UI, reboot events are displayed for each device and summarized in the main "Overview" dashboard.

In this guide we will walk through how to use the reboot tracking module from the memfault-firmware-sdk to collect this data.

Prerequisite

This guide assumes you have already completed the minimal integration of the Memfault SDK. If you have not, please complete the appropriate Getting Started guide.

Overview

The reboot tracking module utilizes a noinit region of RAM to track information about reboots across resets.

  • If an MCU fault takes place, the fault reason is tracked automatically.
  • If the device reboots and the reason is not known, the reset reason is derived from the MCU reset reason register, as read by your initial port.

Rate Limiting

Ingestion of Reboot Events may be rate-limited.

Platform-specific Reboots

Each platform has its own capabilities in terms of collecting information on a reboot. To capture the reboot reason, your platform should implement memfault_reboot_reason_get(). On boot, the MCU SDK uses this function to determine a reboot reason from your platform's hardware. Then, it calls memfault_reboot_tracking_boot() to write the reason to the noinit region of RAM.

The MCU SDK implements memfault_reboot_reason_get() for several platforms, including:

  • Atmel SAML1x
  • ESP32 w/ ESP-IDF
  • ESP8266 w/ ESP8266 SDK
  • nRF Connect SDK supported devices
  • Infineon PSoC 6
  • Dialog DA145xx/DA1468x/DA1469x
  • Mynewt supported devices
  • nRF5 SDK supported devices
  • STM32CubeSDK supported devices including F4/F7/H7/L4/WB
  • NXP S32 SDK supported devices
  • Particle Device OS supported devices
  • NXP RT 1021
  • Silicon Labs Gecko SDK supported devices

If your platform is not included in the above list, you will need to implement memfault_reboot_reason_get(). There are a few points to keep in mind when implementing this function for your platform:

  • The function should clear the reboot reason from the hardware before completing. This guarantees that the next boot will receive the next reboot reason.
  • The function should try to classify reboot reasons as specifically as possible. Any reason marked as unknown (kMfltRebootReason_Unknown) or one of the error reasons will cause the reboot to be classified as an "unexpected reboot".
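As a rough illustration of the two points above, the sketch below decodes a made-up reset-cause register into a specific reason and then clears it. Everything in it is hypothetical: the register variable, the bit names, and the `Demo` types stand in for your MCU's real register and for the SDK's `sResetBootupInfo` and reboot reason enum. The priority order used when several bits are set is also just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical reset-cause bits; take the real ones from your MCU's reference manual.
#define DEMO_RESET_CAUSE_POWER_ON  (1u << 0)
#define DEMO_RESET_CAUSE_PIN       (1u << 1)
#define DEMO_RESET_CAUSE_BROWNOUT  (1u << 2)
#define DEMO_RESET_CAUSE_WATCHDOG  (1u << 3)
#define DEMO_RESET_CAUSE_SOFTWARE  (1u << 4)

// Stand-in for the memory-mapped reset-cause register so the sketch builds on a host.
static volatile uint32_t g_demo_reset_cause_reg;

// Stand-ins for the SDK's reason enum and bootup-info struct (illustrative names).
typedef enum {
  kDemoReason_Unknown = 0,
  kDemoReason_PowerOnReset,
  kDemoReason_PinReset,
  kDemoReason_SoftwareReset,
  kDemoReason_BrownOutReset,
  kDemoReason_HardwareWatchdog,
} eDemoRebootReason;

typedef struct {
  uint32_t reset_reason_reg;       // raw register value, kept for debugging
  eDemoRebootReason reset_reason;  // decoded reason
} sDemoResetInfo;

void demo_reboot_reason_get(sDemoResetInfo *info) {
  assert(info != NULL);

  const uint32_t cause = g_demo_reset_cause_reg;

  // Several bits may be set at once, so check the causes we most want to
  // surface first. Fall back to "unknown" only when nothing matches.
  eDemoRebootReason reason = kDemoReason_Unknown;
  if (cause & DEMO_RESET_CAUSE_WATCHDOG) {
    reason = kDemoReason_HardwareWatchdog;
  } else if (cause & DEMO_RESET_CAUSE_BROWNOUT) {
    reason = kDemoReason_BrownOutReset;
  } else if (cause & DEMO_RESET_CAUSE_SOFTWARE) {
    reason = kDemoReason_SoftwareReset;
  } else if (cause & DEMO_RESET_CAUSE_PIN) {
    reason = kDemoReason_PinReset;
  } else if (cause & DEMO_RESET_CAUSE_POWER_ON) {
    reason = kDemoReason_PowerOnReset;
  }

  // Clear the register so the next boot does not see a stale cause.
  g_demo_reset_cause_reg = 0;

  info->reset_reason_reg = cause;
  info->reset_reason = reason;
}
```

On real hardware the clear step is usually a write to a dedicated "clear flags" bit rather than a plain assignment, so check how your part expects the register to be reset.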

Then, implement memfault_platform_reboot_tracking_boot() to call memfault_reboot_reason_get():

memfault_platform_core.c
#include "memfault/ports/reboot_reason.h"

// [...]

MEMFAULT_PUT_IN_SECTION(".noinit.mflt_reboot_info")
static uint8_t s_reboot_tracking[MEMFAULT_REBOOT_TRACKING_REGION_SIZE];

void memfault_platform_reboot_tracking_boot(void) {
  sResetBootupInfo reset_info = { 0 };
  memfault_reboot_reason_get(&reset_info);
  memfault_reboot_tracking_boot(s_reboot_tracking, &reset_info);
}

void memfault_platform_boot(void) {
  // ...
  memfault_platform_reboot_tracking_boot();
  // ...
}

Unexpected Reboots

Unexpected reboots occur when the device restarts unintentionally. This is usually associated with a crash or other malfunction. Memfault categorizes these reboots as "unexpected", and that property is used to track fleet stability.

In the Memfault Firmware SDK, any reboot reason code greater than or equal to kMfltRebootReason_UnknownError, or equal to kMfltRebootReason_Unknown, counts as an unexpected reboot.

The reason codes are defined in memfault/core/reboot_reason_types.h.

The other reboot reason codes are considered "expected" reboots, and do not count as a crash when Memfault generates the metrics for fleet stability.

Expected Reboots

Sometimes resets take place due to expected, software-initiated behavior (e.g., a firmware update or a button reset).

In the Memfault Firmware SDK, any reboot reason code greater than or equal to kMfltRebootReason_UserShutdown and less than kMfltRebootReason_UnknownError counts as an expected reboot.
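The two range rules can be written as a pair of predicates, which may help when deciding where a reason code falls. This is only a sketch: the constants are local stand-ins for the enum entries named in the text, and their numeric values are assumptions made for illustration. Use the real enum from reboot_reason_types.h in production code.

```c
#include <stdbool.h>
#include <stdint.h>

// Local stand-ins for the SDK enum entries named in the text. The numeric
// values are assumed for this sketch; check reboot_reason_types.h.
#define DEMO_REASON_UNKNOWN        0x0000u
#define DEMO_REASON_USER_SHUTDOWN  0x0001u
#define DEMO_REASON_UNKNOWN_ERROR  0x8000u

// Unexpected: the reason is unknown, or it lies in the error range.
bool demo_reboot_is_unexpected(uint16_t reason) {
  return (reason == DEMO_REASON_UNKNOWN) || (reason >= DEMO_REASON_UNKNOWN_ERROR);
}

// Expected: everything from the first expected code up to, but not
// including, the start of the error range.
bool demo_reboot_is_expected(uint16_t reason) {
  return (reason >= DEMO_REASON_USER_SHUTDOWN) && (reason < DEMO_REASON_UNKNOWN_ERROR);
}
```

Under these assumed values every code lands in exactly one of the two buckets, which is what lets the stability metrics treat them as complementary.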

Marking Imminent Reboots

For both expected and unexpected reboots that are initiated or detectable by software before the reboot, you can record when and where the reset takes place through the memfault_reboot_tracking_mark_reset_imminent() API, typically via the MEMFAULT_REBOOT_MARK_RESET_IMMINENT() macro. For example, suppose you want to track every expected reset caused by an over-the-air update:

#include "memfault/components.h"

// [...]

void ota_finalize_and_reboot(void) {
  MEMFAULT_REBOOT_MARK_RESET_IMMINENT(kMfltRebootReason_FirmwareUpdate);
  memfault_platform_reboot(); // resets device
}

Or, if you detect a dynamic allocation failure, you can mark an imminent, unexpected reset due to an out-of-memory error:

#include "memfault/components.h"

// [...]

void alloc_failed_callback(void) {
  MEMFAULT_REBOOT_MARK_RESET_IMMINENT(kMfltRebootReason_OutOfMemory);
}

Another useful utility is the MEMFAULT_ASSERT_WITH_REASON() API, which marks an imminent reboot reason and then triggers a reset via an assert. This API is useful when you want to reset to get the system out of an unknown state and note a specific reason for doing so. In the out-of-memory example above, you will usually want to reset immediately after the allocation failure is detected rather than wait for the code to fail later in a strange, unrelated place. To do that, mark the out-of-memory error and assert in one step:

#include "memfault/components.h"

// [...]

void alloc_failed_callback(void) {
  MEMFAULT_ASSERT_WITH_REASON(0, kMfltRebootReason_OutOfMemory);
}

Reboot Precedence

memfault_reboot_tracking_boot() is always called on boot. However, if a reboot reason was marked with memfault_reboot_tracking_mark_reset_imminent() before the reset, that reason takes precedence over the one read on boot by memfault_reboot_reason_get() and passed to memfault_reboot_tracking_boot().
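The precedence rule can be modeled in a few lines. The sketch below is a simplified, hypothetical model and not the SDK's implementation: `sDemoNoinitState` stands in for whatever the SDK keeps in noinit RAM, and consuming the mark on boot is an assumption made so that the model stays small.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical model of state that survives a reset in noinit RAM.
typedef struct {
  bool marked;             // true if software recorded a reason before resetting
  uint16_t marked_reason;  // the reason that software recorded
} sDemoNoinitState;

// Software path: remember why the device is about to reset.
void demo_mark_reset_imminent(sDemoNoinitState *state, uint16_t reason) {
  state->marked = true;
  state->marked_reason = reason;
}

// Boot path: a reason marked before the reset wins over the one decoded
// from hardware. The mark is consumed so that a later reboot without a
// mark falls back to the hardware-derived reason.
uint16_t demo_resolve_boot_reason(sDemoNoinitState *state, uint16_t hw_reason) {
  const uint16_t result = state->marked ? state->marked_reason : hw_reason;
  state->marked = false;
  return result;
}
```

In this model a firmware update that marks its reason and then triggers a software reset is reported as the marked reason, even though the hardware register would only say "software reset".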

Reboot Reason IDs

The full list of reboot reason values can be found in the source file reboot_reason_types.h or in this reference.

User Defined Reboot Reasons

Note: Requires SDK version >= 1.7.0.

Outside of the fixed set of reboot reasons you can also define custom reboot reasons tailored to your project.

To enable this feature, add the following define to your memfault_platform_config.h:

#define MEMFAULT_REBOOT_REASON_CUSTOM_ENABLE 1

Defining Custom Reboot Reasons

Info: There is a maximum limit of 255 definable reboot reasons.

To define custom reboot reasons, you will need to provide a config file. The name of this file defaults to memfault_reboot_reason_user_config.def, but you can override it via MEMFAULT_REBOOT_REASON_USER_DEFS_FILE in your project's config header. Below is an example of some custom reboot reasons:

// Defines an "expected" reboot with string `ExpectedReboot`
MEMFAULT_EXPECTED_REBOOT_REASON_DEFINE(ExpectedReboot)
// Defines an "unexpected" reboot with string `UnexpectedReboot`
MEMFAULT_UNEXPECTED_REBOOT_REASON_DEFINE(UnexpectedReboot)

Note the difference when defining expected vs. unexpected reboots outlined earlier. By properly sorting your custom reboot reasons into the unexpected and expected reboot buckets, you will enable accurate calculation of fleet stability metrics.

The above definitions create an entry in the eMemfaultRebootReason enum. To mark an imminent reboot reason without needing to call MEMFAULT_REBOOT_REASON_KEY() to get the full enum name, the following convenience macro can be used:

MEMFAULT_REBOOT_MARK_RESET_IMMINENT_CUSTOM(ExpectedReboot);
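For readers unfamiliar with how a definitions file can turn into enum entries, the sketch below shows the generic "X-macro" technique in plain C. It is not the SDK's implementation: every `DEMO_` and `kDemoReason_` name is hypothetical, and the list macro stands in for the contents of a .def file so that the example fits in a single translation unit.

```c
#include <stdbool.h>

// Hypothetical stand-in for a .def file: one line per custom reason, tagged
// with the bucket it belongs to.
#define DEMO_REASON_DEFS(EXPECTED, UNEXPECTED) \
  EXPECTED(ExpectedReboot)                      \
  UNEXPECTED(UnexpectedReboot)

// Builds the full enum name from the short name used in the list.
#define DEMO_KEY(name) kDemoReason_##name

// First expansion: every list entry becomes an enum entry.
#define DEMO_ENUM_ENTRY(name) DEMO_KEY(name),
typedef enum {
  DEMO_REASON_DEFS(DEMO_ENUM_ENTRY, DEMO_ENUM_ENTRY)
  kDemoReason_Count
} eDemoReason;

// Second expansion: the same list becomes a table of display strings.
#define DEMO_NAME_ENTRY(name) #name,
static const char *const s_demo_names[] = {
  DEMO_REASON_DEFS(DEMO_NAME_ENTRY, DEMO_NAME_ENTRY)
};

// Third expansion: the bucket tag becomes an "is unexpected" flag.
#define DEMO_EXPECTED_FLAG(name) false,
#define DEMO_UNEXPECTED_FLAG(name) true,
static const bool s_demo_is_unexpected[] = {
  DEMO_REASON_DEFS(DEMO_EXPECTED_FLAG, DEMO_UNEXPECTED_FLAG)
};

const char *demo_reason_name(eDemoReason reason) {
  return (reason < kDemoReason_Count) ? s_demo_names[reason] : "?";
}

bool demo_reason_is_unexpected(eDemoReason reason) {
  return (reason < kDemoReason_Count) && s_demo_is_unexpected[reason];
}
```

The benefit of this pattern is that the enum, the strings, and the bucket flags are all generated from one list, so they cannot drift out of sync when a reason is added.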

Deferring Reboot Reason Serialization

Most systems will record the Reboot event on system startup.

The exception is devices that can only provide valid values for memfault_get_device_info() some time after system initialization. On those devices, delay the call to memfault_reboot_tracking_collect_reset_info() so that invalid values are not inserted into the serialized reboot event.

For non-Zephyr applications, omit calling memfault_reboot_tracking_collect_reset_info() on boot, and instead call it once memfault_get_device_info() is valid.

Built-in Metrics

The Reboot Tracking component provides a built-in metric, MemfaultSdkMetric_UnexpectedRebootCount. At boot, this metric is set to the number of unexpected reboots since the last power-on reset (POR) or since the count was last reset. Use this metric to find devices that are boot looping.

This counter is not cleared at boot. To clear it, call memfault_reboot_tracking_reset_crash_count() at a point after boot when you have determined that the device is not in a boot loop.
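One simple way to decide that the device is "not in a boot loop" is to wait until it has stayed up for some minimum time. The sketch below assumes that policy. The 60-second threshold, the `demo_` function names, and the host-side counter are all hypothetical; on target, the stand-in would be replaced by a call to memfault_reboot_tracking_reset_crash_count().

```c
#include <stdbool.h>
#include <stdint.h>

// Assumed threshold: how long the device must stay up before it is treated
// as stable. Tune this for your product.
#define DEMO_STABLE_UPTIME_MS (60u * 1000u)

// Host-side stand-in so the sketch compiles on its own. On target, remove
// this and call memfault_reboot_tracking_reset_crash_count() instead.
static uint32_t s_demo_crash_count = 3;
void demo_reset_crash_count(void) {
  s_demo_crash_count = 0;
}

// Call periodically (for example from a timer or the main loop) with the
// time elapsed since boot. Returns true on the one call that clears the count.
bool demo_maybe_clear_crash_count(uint32_t uptime_ms) {
  static bool s_cleared = false;

  if (s_cleared || (uptime_ms < DEMO_STABLE_UPTIME_MS)) {
    return false;
  }

  demo_reset_crash_count();
  s_cleared = true;
  return true;
}
```

Other stability signals, such as a successful cloud connection or a completed self-test, may suit some products better than a plain uptime threshold.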