Reboot Reason Tracking

A device in the field may reboot for many reasons, such as a crash, a brownout, or a firmware update.

Within the Memfault UI, reboot events are displayed for each device and summarized in the main "Overview" dashboard.

In this guide we will walk through how to use the reboot tracking module from the memfault-firmware-sdk to collect this data.

1. Initialize reboot tracking module

First, we need to include the reboot_tracking header and allocate static storage for reboot tracking:

#include "memfault/core/reboot_tracking.h"
// [...]
// An opaque storage area used by the Memfault SDK to track reboot information.
static uint8_t s_reboot_tracking[MEMFAULT_REBOOT_TRACKING_REGION_SIZE];

Next we need to initialize the fields within the sResetBootupInfo parameter, where:

  • reset_reason_reg can be populated with the hardware reset reason or be set to 0 if no information is available. Nearly all MCUs have a register you can read on boot for this information. On the nRF52 it's RESETREAS, on STM32 parts it's RCC_RSR or RCC_CSR, on NXP parts it's AOREG1.
  • reset_reason contains additional information tracked by your firmware about why the reset took place. Valid values are the members of the eMemfaultRebootReason enum; use kMfltRebootReason_Unknown if there is no additional information to report.
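
To make the role of reset_reason_reg more concrete, here is a minimal sketch of how firmware might translate a raw nRF52 RESETREAS value into a coarse reset cause. The enum, macros, and function are hypothetical names invented for this illustration and are not part of the Memfault SDK; a real project would map onto eMemfaultRebootReason instead. The bit positions follow our reading of the nRF52 product specification and should be checked against the datasheet for your part.

```c
#include <stdint.h>

// Hypothetical stand-in for the SDK's eMemfaultRebootReason values.
// In a real project, use the enum shipped with the memfault-firmware-sdk.
typedef enum {
  kExampleReason_Unknown = 0,
  kExampleReason_PowerOnOrBrownOut,
  kExampleReason_PinReset,
  kExampleReason_Watchdog,
  kExampleReason_SoftwareReset,
  kExampleReason_Lockup,
} eExampleRebootReason;

// Assumed nRF52 RESETREAS bit positions (verify against your datasheet).
#define EXAMPLE_RESETREAS_RESETPIN (1u << 0)
#define EXAMPLE_RESETREAS_DOG      (1u << 1)
#define EXAMPLE_RESETREAS_SREQ     (1u << 2)
#define EXAMPLE_RESETREAS_LOCKUP   (1u << 3)

// Decodes a raw register value. Error-like causes are checked first so that
// they win if several sticky bits are set at the same time.
eExampleRebootReason example_decode_reset_reason(uint32_t reg) {
  if (reg == 0) {
    // On the nRF52, no flagged source means the on-chip reset generator
    // fired, i.e. a power-on reset or a brownout.
    return kExampleReason_PowerOnOrBrownOut;
  }
  if (reg & EXAMPLE_RESETREAS_DOG) {
    return kExampleReason_Watchdog;
  }
  if (reg & EXAMPLE_RESETREAS_LOCKUP) {
    return kExampleReason_Lockup;
  }
  if (reg & EXAMPLE_RESETREAS_SREQ) {
    return kExampleReason_SoftwareReset;
  }
  if (reg & EXAMPLE_RESETREAS_RESETPIN) {
    return kExampleReason_PinReset;
  }
  return kExampleReason_Unknown;  // a source this sketch does not decode
}
```

Checking the watchdog and lockup bits before the others is a design choice of this sketch: when several bits are set at once, the cause most likely to indicate a bug is the one reported.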

Putting it all together and calling memfault_reboot_tracking_boot, we have:

#include "memfault/core/reboot_tracking.h"
// [...]
// An opaque storage area used by the Memfault SDK to track reboot information.
static uint8_t s_reboot_tracking[MEMFAULT_REBOOT_TRACKING_REGION_SIZE];

int main(void) {
  // [...]
  const sResetBootupInfo reset_reason = {
    // example of reset_reason_reg for nRF52
    .reset_reason_reg = NRF_POWER->RESETREAS,
    // example where no additional info about the reboot is available
    .reset_reason = kMfltRebootReason_Unknown,
  };

  // Note: Often MCU reset reason register bits are "sticky" and need to be manually cleared
  NRF_POWER->RESETREAS |= NRF_POWER->RESETREAS;

  // Initialize reboot tracking module
  memfault_reboot_tracking_boot(s_reboot_tracking, &reset_reason);
}
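
The reason for clearing the "sticky" bits can be shown with a small host-side model. The sketch below simulates a write-1-to-clear register, which is how we understand RESETREAS to behave on the nRF52; the type, macros, and functions are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical host-side model of a write-1-to-clear reset reason register.
typedef struct {
  uint32_t bits;
} sExampleResetReg;

#define EXAMPLE_CAUSE_WATCHDOG   (1u << 1)
#define EXAMPLE_CAUSE_SOFT_RESET (1u << 2)

// Hardware side: a new reset latches its cause without clearing older ones.
void example_hw_latch(sExampleResetReg *reg, uint32_t cause) {
  reg->bits |= cause;
}

// Firmware side: every bit written as 1 is cleared, 0 bits are untouched.
void example_fw_write(sExampleResetReg *reg, uint32_t value) {
  reg->bits &= ~value;
}

// Simulates a watchdog reset followed by a software reset and returns the
// register value that firmware would read on the second boot.
uint32_t example_two_resets(bool clear_after_first_boot) {
  sExampleResetReg reg = { 0 };

  example_hw_latch(&reg, EXAMPLE_CAUSE_WATCHDOG);    // first reset
  if (clear_after_first_boot) {
    example_fw_write(&reg, reg.bits);                // write back what was read
  }

  example_hw_latch(&reg, EXAMPLE_CAUSE_SOFT_RESET);  // second reset
  return reg.bits;
}
```

In this model, example_two_resets(true) yields only the software reset bit, while example_two_resets(false) yields both bits, leaving the second boot unable to tell which cause is the recent one.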

2. Initialize event storage

⚠️ If you have already initialized event storage for another subsystem (e.g., error traces or metrics), skip this step and re-use the evt_storage reference you already created.

All events generated by the Memfault SDK are stored and transmitted in a compact binary format (CBOR). While they wait to be sent, they are all held in the same "event storage". A single reboot event takes ~50 bytes; the exact size needed can be determined with memfault_reboot_tracking_compute_worst_case_storage_size().

#include "memfault/core/event_storage.h"
// [...]
int main(void) {
  // [... other initialization code ...]
  static uint8_t s_event_storage[100];
  const sMemfaultEventStorageImpl *evt_storage =
      memfault_events_storage_boot(s_event_storage, sizeof(s_event_storage));
}

3. Save previous reset information in event storage

Once you've initialized the reboot tracking module and set up event storage, you need to serialize the information about the reboot into event storage so that it can be published to the Memfault cloud:

#include "memfault/core/event_storage.h"
#include "memfault/core/reboot_tracking.h"
// [...]
int main(void) {
  // [... other initialization code ...]
  memfault_reboot_tracking_collect_reset_info(evt_storage);
}

4. Publish reset information to the Memfault cloud

⚠️ All data generated by the Memfault SDK is sent in exactly the same manner. If you have already set up data transfer to the Memfault cloud, you can skip this step!

Extensive details about how data from the Memfault SDK makes it to the cloud can be found here. In short, all data is published via the same "chunk" REST endpoint.

#include "memfault/core/data_packetizer.h"
// [...]
bool try_send_memfault_data(void) {
  // buffer to copy chunk data into
  uint8_t buf[USER_CHUNK_SIZE];
  size_t buf_len = sizeof(buf);
  bool data_available = memfault_packetizer_get_chunk(buf, &buf_len);
  if (!data_available) {
    return false; // no more data to send
  }
  // send payload collected to chunks/ endpoint
  user_transport_send_chunk_data(buf, buf_len);
  return true;
}

void send_memfault_data(void) {
  // [... user specific logic deciding when & how much data to send]
  while (try_send_memfault_data()) { }
}
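
To exercise the drain loop without hardware, the snippet above can be mimicked on a host with a mock. The sketch below assumes the same in/out contract for the length argument as used above (buffer capacity on entry, bytes written on return). Every name in it is hypothetical, and the mock simply slices a fixed 20-byte payload, whereas real chunks produced by the SDK carry their own framing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for data queued inside the SDK: 20 pending bytes.
static uint8_t s_example_pending[20];
static size_t s_example_read_offset;

// Mock packetizer: *buf_len is the capacity of buf on entry and the number
// of bytes actually copied on return. Returns false when nothing is left.
bool example_get_chunk(void *buf, size_t *buf_len) {
  const size_t remaining = sizeof(s_example_pending) - s_example_read_offset;
  if (remaining == 0 || *buf_len == 0) {
    return false;
  }
  const size_t n = (remaining < *buf_len) ? remaining : *buf_len;
  memcpy(buf, &s_example_pending[s_example_read_offset], n);
  s_example_read_offset += n;
  *buf_len = n;
  return true;
}

// Hypothetical transport that only counts the bytes it was asked to send.
static size_t s_example_bytes_sent;

void example_transport_send(const void *buf, size_t len) {
  (void)buf;
  s_example_bytes_sent += len;
}

// Drains all pending data using chunks of at most chunk_size bytes (capped
// at 16 in this sketch) and returns the number of chunks sent.
size_t example_drain(size_t chunk_size) {
  uint8_t buf[16];
  size_t chunks_sent = 0;

  if (chunk_size > sizeof(buf)) {
    chunk_size = sizeof(buf);
  }
  s_example_read_offset = 0;
  s_example_bytes_sent = 0;

  for (;;) {
    size_t len = chunk_size;  // reset the capacity before every call
    if (!example_get_chunk(buf, &len)) {
      break;
    }
    example_transport_send(buf, len);
    chunks_sent++;
  }
  return chunks_sent;
}
```

With 20 pending bytes, example_drain(8) sends three chunks of 8, 8, and 4 bytes. The length variable is re-initialized on every iteration because the previous call overwrote it with the size of the last chunk.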

5. [Optional] Persist Information across reboots

It is often useful to save metadata about a reboot that is about to take place. For example, the Memfault panics component will automatically store metadata about the type of fault and the instruction which caused it when a fault handler is invoked.

There are a few common ways to allocate the reboot tracking storage region in RAM so that its contents persist across reboots; we explore them below. Once you have chosen an approach, you will need to replace s_reboot_tracking from Step 1 with the new storage location.

Note that as long as power is not lost and the RAM is not re-initialized by startup code, RAM state will remain valid across an MCU reset. It is also ideal (but not required) that the memory location is stable across firmware updates, so that reboots due to OTA updates can be tracked.

The reboot tracking module will automatically detect and re-initialize itself in the event there is a full power loss or the region gets corrupted due to a bug.
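
One generic way to implement this kind of self-check is a magic marker at the start of the region. The sketch below illustrates that pattern only; the source does not describe the SDK's internal layout or checks, so the struct, the marker value, and the function names are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Arbitrary marker value chosen for this illustration.
#define EXAMPLE_REGION_MAGIC 0x21544252u

// Hypothetical layout of a region kept in RAM that is not cleared on reset.
typedef struct {
  uint32_t magic;
  uint32_t last_reboot_reason;
} sExampleRebootRegion;

// Called on every boot. Returns true if the previous contents were valid,
// or false if the region had to be re-initialized.
bool example_region_boot(sExampleRebootRegion *region) {
  if (region->magic == EXAMPLE_REGION_MAGIC) {
    return true;
  }
  memset(region, 0, sizeof(*region));
  region->magic = EXAMPLE_REGION_MAGIC;
  return false;
}

// Simulates storing a reason, rebooting, and reading the reason back.
// power_lost models RAM contents being scrambled by a full power cycle.
uint32_t example_reason_after_reboot(uint32_t reason_before, bool power_lost) {
  sExampleRebootRegion region;

  memset(&region, 0xA5, sizeof(region));  // garbage at first power-up
  example_region_boot(&region);           // detects garbage, re-initializes
  region.last_reboot_reason = reason_before;

  if (power_lost) {
    memset(&region, 0x5A, sizeof(region));  // RAM contents are lost
  }
  example_region_boot(&region);             // next boot
  return region.last_reboot_reason;
}
```

In this model a stored reason survives a warm reset but reads back as 0 after a power loss. A production implementation would likely add a version field or checksum to catch partial corruption as well.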

Allocate Storage at Bottom of Stack

If your project is using a CMSIS-based linker script, the ISR stack is always allocated at the top of available RAM, and the __StackLimit linker script variable can be used to get the location of the bottom of the stack.

extern uint32_t __StackLimit;
static void *s_reboot_tracking = &__StackLimit;

Placing reboot tracking in this location has several advantages:

  • No linker script changes necessary.
  • The stack region is not scrubbed or initialized by default so values will persist across reboots.
  • This memory location is stable across firmware releases as well as between bootloaders and main firmware images because it's always allocated at the top of RAM. The location would only move if __STACK_SIZE was changed.
  • For ARM (and most other architectures) the stack grows down, so the stack pointer should never come close to the bottom of the stack. If it did, that would indicate a stack overflow and a serious memory corruption bug, since the system heap or .bss is generally located directly below the stack.

Allocate Storage By Adding "noinit" RAM region to linker script

With GCC, this can be achieved by placing the storage in a section that is not part of .bss or .data and mapping that section to a dedicated region in the linker script:

#include "memfault/core/reboot_tracking.h"
static uint8_t s_reboot_tracking[MEMFAULT_REBOOT_TRACKING_REGION_SIZE]
    __attribute__((section(".mflt_reboot_info")));

/* Your .ld file */
MEMORY
{
  /* [...] */
  NOINIT (rw) : ORIGIN = <RAM_REGION_START>, LENGTH = 64
}
SECTIONS
{
  /* [...] */
  .noinit (NOLOAD): { KEEP(*(*.mflt_reboot_info)) } > NOINIT
}

6. [Optional] Add custom reset tracing for your application

The Memfault panics component will automatically generate traces any time a fault handler is invoked or your system calls MEMFAULT_ASSERT_RECORD, MEMFAULT_ASSERT, or memfault_fault_handling_assert. If you'd like to add reset tracking in other places, you can do so with the memfault_reboot_tracking_mark_reset_imminent API. For example, suppose we want to track every reset that occurs due to an over-the-air update:

#include "memfault/core/compiler.h"
#include "memfault/core/reboot_tracking.h"
// [...]
void ota_finalize_and_reboot(void) {
  // The pc & lr which result in the reboot can always be *optionally* recorded
  void *pc;
  MEMFAULT_GET_PC(pc);
  void *lr;
  MEMFAULT_GET_LR(lr);
  sMfltRebootTrackingRegInfo reg_info = {
    .pc = (uint32_t)pc,
    .lr = (uint32_t)lr,
  };

  // Note: "reg_info" may be NULL if no register information collection is desired
  memfault_reboot_tracking_mark_reset_imminent(kMfltRebootReason_FirmwareUpdate,
                                               &reg_info);

  // [... logic to reboot the MCU ...]
}