Reboot Reason Tracking

A device in the field can reboot for many reasons, such as a crash, a brownout, or a firmware update. The reboot tracking module within the SDK records the reason for each reboot so it can be reported. Here are the steps to get started:

1. Allocate RAM storage for reboot tracking (64 bytes)

As long as power is not lost, RAM state will remain valid across an MCU reset. The RAM region used for reboot tracking must not be initialized as part of your system's power-on firmware sequence. It is also ideal (but not required) that the memory location is stable across firmware updates (so that reboots due to OTA updates can be tracked).

The reboot tracking module will automatically detect and re-initialize itself in the event there is a full power loss or the region gets corrupted due to a bug.
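The self-healing behavior described above can be pictured with a small sketch. This is not the SDK's implementation: the struct layout, the marker value, the XOR checksum, and every `sketch_`/`Sketch` name are assumptions made for illustration. It only shows the general technique of validating an uninitialized RAM region with a marker plus a checksum and resetting it when validation fails.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Arbitrary marker value chosen for this sketch (hypothetical).
#define SKETCH_MAGIC 0x21544252u

// Hypothetical layout; the real module's layout and checks may differ.
typedef struct {
  uint32_t magic;
  uint32_t last_reboot_reason;
  uint32_t crash_count;
  uint32_t checksum;  // XOR of the three fields above
} sSketchRebootInfo;

static uint32_t prv_checksum(const sSketchRebootInfo *info) {
  return info->magic ^ info->last_reboot_reason ^ info->crash_count;
}

// Returns true if the region held valid data from a previous boot.
// Returns false if the contents were garbage (full power loss or
// corruption), in which case the region is reset to a known state.
bool sketch_reboot_info_boot(void *region) {
  sSketchRebootInfo *info = region;
  if (info->magic == SKETCH_MAGIC && info->checksum == prv_checksum(info)) {
    return true;
  }
  memset(info, 0, sizeof(*info));
  info->magic = SKETCH_MAGIC;
  info->checksum = prv_checksum(info);
  return false;
}
```

After a cold boot the RAM contents are effectively random, so the marker check fails and the region is reset. After a warm reset the marker and checksum still match, so the previously recorded state is kept.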

Two common ways to allocate this region are outlined below.

Allocate Storage at Bottom of Stack

If your project uses a CMSIS-based linker script, the ISR stack is always allocated at the top of available RAM, and the __StackLimit linker script variable gives the location of the bottom of the stack.

extern uint32_t __StackLimit;
static void *s_reboot_tracking = &__StackLimit;

Placing reboot tracking in this location has several advantages:

  • No linker script changes necessary.
  • The stack region is not scrubbed or initialized by default so values will persist across reboots.
  • This memory location is stable across firmware releases as well as between bootloaders and main firmware images because it's always allocated at the top of RAM. The location would only move if __STACK_SIZE was changed.
  • On ARM (and most other architectures) the stack grows down, so the stack pointer should never come close to the bottom of the stack. If it did, that would indicate a stack overflow and a serious memory corruption bug, since the system heap or .bss generally sits directly below the stack.
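To make the last point concrete, here is a minimal sketch of a check for whether the stack pointer has descended into the bytes reserved at the bottom of the stack. The function name and the macro are hypothetical, and the 64-byte figure is taken from step 1. The sketch assumes a full-descending stack, where the stack pointer holds the lowest address currently in use.

```c
#include <stdbool.h>
#include <stdint.h>

// Bytes reserved for reboot tracking at the bottom of the stack (step 1).
#define SKETCH_REBOOT_TRACKING_RESERVED_BYTES 64u

// stack_limit: lowest address of the stack region (e.g. &__StackLimit).
// sp: current stack pointer value.
// Returns true if the stack has grown down into the reserved bytes.
bool sketch_stack_reaches_reserved_region(uintptr_t stack_limit, uintptr_t sp) {
  return sp < stack_limit + SKETCH_REBOOT_TRACKING_RESERVED_BYTES;
}
```

A check like this could be run from a periodic task or a debug command. If it ever returns true, the system is already within 64 bytes of overflowing the stack.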

Allocate Storage By Adding "noinit" RAM region to linker script

With GCC, place the buffer in a section that is not part of .bss or .data, then map that section to a dedicated NOLOAD region in the linker script:

#include "memfault/core/reboot_tracking.h"

static uint8_t s_reboot_tracking[MEMFAULT_REBOOT_TRACKING_REGION_SIZE]
    __attribute__((section(".mflt_reboot_info")));

/* Your .ld file */
MEMORY
{
  /* [...] */
  NOINIT (rw) : ORIGIN = <RAM_REGION_START>, LENGTH = 64
}

SECTIONS
{
  /* [...] */
  .noinit (NOLOAD): { KEEP(*(*.mflt_reboot_info)) } > NOINIT
}

2. Initialize reboot tracking module

Nearly all MCUs have a register you can read on boot to understand why the device reset. For example, on the nRF52 it's RESETREAS, on STM32 parts it's RCC_RSR or RCC_CSR, and on NXP parts it's AOREG1. When you initialize the reboot tracking subsystem, the value in this register can be recorded alongside the prior reset information. If there's no register information to collect, you can simply pass NULL as the argument.

#include "memfault/core/reboot_tracking.h"
// [...]

int main(void) {
  // [...]

  // Example of reset reason for nrf52
  const sResetBootupInfo reset_reason = {
    .reset_reason_reg = NRF_POWER->RESETREAS,
  };
  memfault_reboot_tracking_boot(s_reboot_tracking, &reset_reason);

  // Note: MCU reset reason register bits are usually "sticky" and need to be cleared
  NRF_POWER->RESETREAS |= NRF_POWER->RESETREAS;
}
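For local debugging it can help to turn the raw register value into something readable. The sketch below decodes a few nRF52 RESETREAS bits. The helper name and the priority order are assumptions made for illustration, and the bit positions reflect my reading of the nRF52 product specification, so verify them against the datasheet for your exact part. Because the bits are sticky, more than one can be set at once, which is why the sketch checks the most severe causes first.

```c
#include <stdint.h>

// Bit positions as understood from the nRF52 product specification;
// treat these as assumptions and confirm them for your part.
#define SKETCH_RESETREAS_RESETPIN (1u << 0)
#define SKETCH_RESETREAS_DOG      (1u << 1)
#define SKETCH_RESETREAS_SREQ     (1u << 2)
#define SKETCH_RESETREAS_LOCKUP   (1u << 3)
#define SKETCH_RESETREAS_OFF      (1u << 16)

// Hypothetical helper: returns a short description of the reset cause.
const char *sketch_describe_resetreas(uint32_t resetreas) {
  if (resetreas == 0) {
    // No flag set: the register reads zero after a power-on or brownout reset
    return "power-on or brownout";
  }
  if (resetreas & SKETCH_RESETREAS_DOG) {
    return "watchdog";
  }
  if (resetreas & SKETCH_RESETREAS_LOCKUP) {
    return "cpu lockup";
  }
  if (resetreas & SKETCH_RESETREAS_SREQ) {
    return "software reset";
  }
  if (resetreas & SKETCH_RESETREAS_RESETPIN) {
    return "reset pin";
  }
  if (resetreas & SKETCH_RESETREAS_OFF) {
    return "wake from system off";
  }
  return "other";
}
```

On target, the input would be the value read from NRF_POWER->RESETREAS before it is cleared.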

3. Initialize event storage

All events generated in the Memfault SDK are stored and transmitted in a compact binary format (CBOR). While they wait to be sent, they are held in the "event storage" core component. For reboot reasons, the storage needs to hold one serialized event (~50 bytes). The exact size needed can be determined with memfault_reboot_tracking_compute_worst_case_storage_size().

#include "memfault/core/event_storage.h"
// [...]

int main(void) {
  // [... other initialization code ...]

  static uint8_t s_event_storage[100];
  const sMemfaultEventStorageImpl *evt_storage =
      memfault_events_storage_boot(s_event_storage, sizeof(s_event_storage));
}
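A quick sizing check can catch an undersized buffer early. The sketch below is self-contained, so it uses a stand-in function that returns the ~50 byte figure quoted above. On target you would call memfault_reboot_tracking_compute_worst_case_storage_size() instead. The helper names are hypothetical, and the check ignores any per-event bookkeeping overhead the storage itself may add, so leave some margin.

```c
#include <stdbool.h>
#include <stddef.h>

// Stand-in for memfault_reboot_tracking_compute_worst_case_storage_size().
// The value is the approximate figure from the text, not a measured one.
static size_t prv_sketch_worst_case_reboot_event_size(void) {
  return 50;
}

// Returns true if storage_len bytes can hold num_events worst-case
// reboot events (storage overhead not accounted for).
bool sketch_event_storage_fits(size_t storage_len, size_t num_events) {
  return storage_len >= num_events * prv_sketch_worst_case_reboot_event_size();
}
```

With the 100-byte buffer used above, this check passes for one event with room to spare, which matches the sizing in the example.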

4. Save previous reset information in event storage

Once you've initialized the reboot tracking module and set up event storage, you can collect any prior reset information and prepare it for transmission:

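#include "memfault/core/event_storage.h"
#include "memfault/core/reboot_tracking.h"
// [...]

int main(void) {
  // [... other initialization code ...]

  // evt_storage is the handle returned by memfault_events_storage_boot() in step 3
  memfault_reboot_tracking_collect_reset_info(evt_storage);
}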

5. Publish reset information to the Memfault cloud

How data from the Memfault SDK makes it to the cloud is covered in detail elsewhere in the Memfault documentation. In short, all data is published via the same "chunk" REST endpoint.

#include "memfault/core/data_packetizer.h"
// [...]

bool try_send_memfault_data(void) {
  // buffer to copy chunk data into
  uint8_t buf[USER_CHUNK_SIZE];
  size_t buf_len = sizeof(buf);
  bool data_available = memfault_packetizer_get_chunk(buf, &buf_len);
  if (!data_available) {
    return false; // no more data to send
  }

  // send payload collected to chunks/ endpoint
  user_transport_send_chunk_data(buf, buf_len);
  return true;
}

void send_memfault_data(void) {
  // [... user specific logic deciding when & how much data to send ...]
  while (try_send_memfault_data()) { }
}
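The drain loop above can be exercised on a host machine without the SDK. The sketch below replaces the packetizer and the transport with hypothetical stand-ins (every `sketch_` name and the 16-byte chunk size are assumptions). It mirrors only the calling convention shown in the example, where the length argument carries the buffer capacity in and the number of bytes written out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CHUNK_SIZE 16

// Hypothetical stand-in for the SDK packetizer: a byte stream handed
// out in pieces no larger than the caller's buffer.
typedef struct {
  const uint8_t *data;
  size_t len;
  size_t offset;
} sSketchPacketizer;

// On entry *buf_len is the buffer capacity; on return it is the number
// of bytes copied. Returns false when there is nothing left to send.
bool sketch_get_chunk(sSketchPacketizer *p, void *buf, size_t *buf_len) {
  const size_t remaining = p->len - p->offset;
  if (remaining == 0) {
    return false;
  }
  const size_t n = (remaining < *buf_len) ? remaining : *buf_len;
  memcpy(buf, p->data + p->offset, n);
  p->offset += n;
  *buf_len = n;
  return true;
}

// Fake transport that only counts the bytes it was asked to send.
static size_t s_sketch_bytes_sent;

static void sketch_transport_send(const void *chunk, size_t len) {
  (void)chunk;
  s_sketch_bytes_sent += len;
}

size_t sketch_bytes_sent(void) {
  return s_sketch_bytes_sent;
}

// Sends chunks until the packetizer is empty; returns the chunk count.
size_t sketch_drain(sSketchPacketizer *p) {
  size_t chunks = 0;
  for (;;) {
    uint8_t buf[SKETCH_CHUNK_SIZE];
    size_t buf_len = sizeof(buf);
    if (!sketch_get_chunk(p, buf, &buf_len)) {
      break;
    }
    sketch_transport_send(buf, buf_len);
    chunks++;
  }
  return chunks;
}
```

Note that the final chunk is usually shorter than the buffer, so the transport must send buf_len bytes rather than the full buffer size.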

6. Add custom reset tracing for your application

The Memfault panics component automatically generates traces any time a fault handler is invoked or your system calls MEMFAULT_ASSERT_RECORD, MEMFAULT_ASSERT, or memfault_fault_handling_assert. To track resets from other places, use the memfault_reboot_tracking_mark_reset_imminent API. For example, suppose we want to track every reset that occurs due to an OTA update:

#include "memfault/core/compiler.h"
#include "memfault/core/reboot_tracking.h"
// [...]

void ota_finalize_and_reboot(void) {
  // The pc & lr which result in the reboot can always be *optionally* recorded
  void *pc;
  MEMFAULT_GET_PC(pc);
  void *lr;
  MEMFAULT_GET_LR(lr);
  sMfltRebootTrackingRegInfo reg_info = {
    .pc = (uint32_t)pc,
    .lr = (uint32_t)lr,
  };

  // Note: "reg_info" may be NULL if no register information collection is desired
  memfault_reboot_tracking_mark_reset_imminent(kMfltRebootReason_FirmwareUpdate,
                                               &reg_info);

  // [... logic to reboot the MCU ...]
}