Linux Reboot Reason Tracking
Introduction
A device may reboot in the field for many reasons, such as a kernel panic, a user reset, or a firmware update.
Within the Memfault UI, reboot events are displayed for each device as well as summarized in the main "Overview" dashboard:

Figure: Reboots chart in the "Overview" dashboard of the Memfault Web App.
In this guide we will walk through how to use the reboot reason tracking feature from the Memfault Linux SDK to collect this data.
Keep meta-memfault-example open as a reference
implementation. Your integration should look similar to it once you're done
following the steps in this tutorial.
Prerequisites
The memfaultd daemon
Follow the integration guide to learn how to set this up
for your device. A key function of memfaultd is to detect, classify and upload
reboot events to the Memfault platform. This is enabled through the reboot
feature.
Linux kernel pstore / ramoops configuration
The detection of kernel panics as a reboot reason depends on the so-called
ramoops subsystem and the pstore filesystem. From the Linux kernel admin
guide:
Ramoops is an oops/panic logger that writes its logs to RAM before the system crashes. It works by logging oopses and panics in a circular buffer. Ramoops needs a system with persistent RAM so that the content of that area can survive after a restart.
The pstore is a RAM-backed filesystem that persists across reboots.
The easiest way to enable the pstore in your kernel when using Yocto is to add
the pstore kernel feature to the KERNEL_FEATURES bitbake variable. To get
finer-grained control over how the pstore is configured, you can instead use
its Kconfig options directly.
We also recommend adding the debug-panic-oops kernel feature to enable a kernel panic when an "oops" is encountered:
KERNEL_FEATURES:append = " cgl/features/pstore/pstore.scc cfg/debug/misc/debug-panic-oops.scc"
The meta-memfault-example QEMU integration takes the KERNEL_FEATURES
approach; see these lines of code.
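Whichever approach is used, it can be worth confirming that the resulting kernel configuration actually enables the pstore and its RAM backend. The following is a minimal sketch and not part of the Memfault SDK: `check_pstore_kconfig` is a hypothetical helper that scans a kernel `.config` file for `CONFIG_PSTORE` and `CONFIG_PSTORE_RAM`. The location of the `.config` file depends on your build and is passed in as an argument.

```shell
# Hypothetical helper: verify that a kernel .config enables pstore and ramoops.
# Usage: check_pstore_kconfig /path/to/.config
check_pstore_kconfig() {
    config_file="$1"
    status=0
    for opt in CONFIG_PSTORE CONFIG_PSTORE_RAM; do
        # Accept both built-in (=y) and module (=m) settings.
        if grep -q "^${opt}=[ym]\$" "$config_file"; then
            echo "ok: $opt"
        else
            echo "missing: $opt"
            status=1
        fi
    done
    return "$status"
}
```

The function returns a non-zero status when either option is missing, so it can be used as a gate in a build or CI script.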
Next, you will need to specify which region of RAM to use. There are several
ways of doing this. The recommended way is to use a Device Tree binding. Please
consult this section on ramoops parameters in the Linux kernel admin
guide for more details.
Lastly, the kernel can be configured to always dump the kmsg logs using
printk.always_kmsg_dump, which also triggers dumps for events other than
oopses and panics. This option is expected to remain disabled, which is the
default.
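On a running device, the current value of this option can be read from sysfs. The snippet below is a small sketch, assuming the parameter is exposed at `/sys/module/printk/parameters/always_kmsg_dump` and reads as `Y` or `N`, as is usual for boolean kernel parameters. `always_kmsg_dump_enabled` is a hypothetical helper name, not an SDK command.

```shell
# Hypothetical helper: succeed (exit 0) only if always_kmsg_dump is enabled.
# An optional argument overrides the sysfs path, which is handy for testing.
always_kmsg_dump_enabled() {
    param_file="${1:-/sys/module/printk/parameters/always_kmsg_dump}"
    # A missing or unreadable file is treated as "not enabled".
    [ "$(cat "$param_file" 2>/dev/null)" = "Y" ]
}

# Example use:
#   if always_kmsg_dump_enabled; then
#       echo "warning: printk.always_kmsg_dump is enabled" >&2
#   fi
```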
Note that the meta-memfault-example QEMU integration deviates from the
recommendation of using a Device Tree binding and uses the ramoops.* kernel
command line arguments instead. The reason is that QEMU typically
auto-generates a Device Tree on the fly, and extending it is more complicated.
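When the command line approach is used, a quick way to see which ramoops settings the running kernel was booted with is to filter `/proc/cmdline`. This is only an illustrative sketch: `list_ramoops_args` is a hypothetical helper, and it reports nothing when ramoops is configured through a Device Tree binding instead.

```shell
# Hypothetical helper: print each ramoops.* kernel argument on its own line.
# Reads /proc/cmdline by default; pass another file to inspect a saved copy.
# Returns non-zero if no ramoops.* argument is present.
list_ramoops_args() {
    cmdline_file="${1:-/proc/cmdline}"
    # Kernel arguments are space separated, so split them into lines first.
    tr ' ' '\n' < "$cmdline_file" | grep '^ramoops\.'
}
```

If the output is empty on a device that relies on command line arguments, the bootloader is probably not passing them through to the kernel.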
Systemd configuration
The memfaultd daemon will take care of cleaning up /sys/fs/pstore after a
reboot of the system.
Often, systemd-pstore.service is configured to carry out this task. This would
conflict with memfaultd performing this task. Therefore,
systemd-pstore.service has to be disabled. This service is automatically
excluded when including the meta-memfault layer.
Note that memfaultd does not yet provide functionality to archive pstore
files (as systemd-pstore.service can). If you need this, the workaround is to
create a service that performs the archiving and runs before
memfaultd.service starts up.
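As a rough sketch of such an archiving step, the function below copies the pstore records into a timestamped directory and leaves the originals in place, so that memfaultd can still read and clean them up afterwards. The name `archive_pstore` and the default archive location `/var/lib/pstore-archive` are assumptions made for illustration. The script would be invoked from a systemd unit of your own that is ordered with `Before=memfaultd.service`.

```shell
# Hypothetical helper: copy pstore records to a per-boot archive directory.
# Usage: archive_pstore [source_dir] [archive_root]
archive_pstore() {
    src="${1:-/sys/fs/pstore}"
    dst_root="${2:-/var/lib/pstore-archive}"   # assumed location, adjust as needed

    # Nothing to do when there are no records from the previous boot.
    [ -n "$(ls "$src" 2>/dev/null)" ] || return 0

    # One directory per invocation, named after the current UTC time.
    dst="$dst_root/$(date -u +%Y%m%dT%H%M%SZ)"

    # Copy rather than move: the originals must remain for memfaultd.
    mkdir -p "$dst" && cp -p "$src"/* "$dst"/
}
```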
Detecting reboot due to Over The Air updates
The OTA agent you are using (for example swupdate) should inform the Memfault
SDK before rebooting after installing an update.
In our meta-memfault-example this is accomplished by configuring
swupdate to run the command
memfaultctl reboot --reason 3 after installing the update.
Reboot Reason Classification
Automatically detected reboot reasons
There are a few reboot reasons that memfaultd can determine with confidence out of the box on all embedded Linux devices:
| Reason | Description | Information Source |
|---|---|---|
| Kernel Panic | The Linux kernel crashed | pstore / ramoops |
| User Reset | "Graceful" shutdown of the system | systemd state |
Providing a reboot reason to Memfault
There are many more types of reboots for which the detection can only be implemented in device-specific ways.
For example, an SoC may have special hardware to detect power brownouts. There is no "standard" way to read brownout detection state in Linux.
Another example: a device's software may contain logic to decide that the device needs to shut down or reset, e.g. because the battery has dropped below a certain point, or because the user pressed a button.
To track reboot reasons that the Memfault SDK cannot detect on its own, the
SDK provides a way to extend the reboot reason classification, via
the last_reboot_reason_file and the memfaultctl reboot command.
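To illustrate the file-based route, the sketch below records a reason code before the device goes down. It rests on assumptions that should be checked against the SDK documentation: that the file configured as last_reboot_reason_file is expected to hold the numeric reason code as plain decimal text, and that your software knows the configured path. `write_reboot_reason` is a hypothetical helper, and the code `3` in the usage comment is simply the value used for the OTA example earlier in this guide.

```shell
# Hypothetical helper: persist a numeric reboot reason code to a file.
# Usage: write_reboot_reason <code> <path configured as last_reboot_reason_file>
write_reboot_reason() {
    code="$1"
    file="$2"

    # Reject anything that is not a plain decimal number.
    case "$code" in
        ''|*[!0-9]*)
            echo "invalid reason code: $code" >&2
            return 1
            ;;
    esac

    # Write to a temporary file and rename it, so that a power cut cannot
    # leave a half-written file behind.
    tmp="$file.tmp"
    printf '%s' "$code" > "$tmp" && sync && mv "$tmp" "$file"
}

# Example use (path is a placeholder for your configured file):
#   write_reboot_reason 3 /path/to/last_reboot_reason
```

When the reboot is initiated by software that can call the SDK directly, `memfaultctl reboot --reason <code>` covers the same need without a separate file write.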