We recommend capturing a full coredump trace in case the system encounters a fatal problem, like a hard fault or a failed assertion. However, in some cases it may not be desirable or possible to do so, for example when stopping and rebooting the system is not an option, or when the error is recoverable but you would still like to understand how often it happens.
The trace event module within the SDK makes it easy to track errors in a way that requires less storage than full coredump traces and also allows the system to keep running after capturing the event. Only the program counter, return address and a custom "reason" are saved. Once uploaded to Memfault, each trace event will be associated with an Issue just like a coredump.
Here's an example where Trace Events are captured for Bluetooth protocol CRC errors and invalid message IDs:
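As a minimal firmware-side sketch of such a capture, assume a made-up frame layout of [message ID][payload][checksum] and a simple XOR checksum standing in for a real CRC. The function ble_handle_frame, the constant BLE_MSG_ID_MAX and the reason names bluetooth_crc_error and bluetooth_invalid_msg_id are hypothetical, and the small stand-in for MEMFAULT_TRACE_EVENT exists only so the snippet compiles without the SDK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Stand-in so this sketch compiles without the SDK. In a real build, remove
// it and include the Memfault SDK headers instead.
#ifndef MEMFAULT_TRACE_EVENT
static int s_trace_event_count;  // hypothetical counter, illustration only
#define MEMFAULT_TRACE_EVENT(reason) ((void)++s_trace_event_count)
#endif

#define BLE_MSG_ID_MAX 0x10  // hypothetical: highest valid message ID

// Hypothetical XOR checksum; a real protocol would use its own CRC
static uint8_t prv_checksum(const uint8_t *data, size_t len) {
  uint8_t sum = 0;
  for (size_t i = 0; i < len; i++) {
    sum ^= data[i];
  }
  return sum;
}

// Hypothetical frame layout: [msg_id][payload...][checksum]
bool ble_handle_frame(const uint8_t *frame, size_t len) {
  if (len < 2 || prv_checksum(frame, len - 1) != frame[len - 1]) {
    MEMFAULT_TRACE_EVENT(bluetooth_crc_error);
    return false;  // drop the frame, but keep the system running
  }
  if (frame[0] > BLE_MSG_ID_MAX) {
    MEMFAULT_TRACE_EVENT(bluetooth_invalid_msg_id);
    return false;
  }
  // ... dispatch the message to its handler ...
  return true;
}
```

Both error paths return to the caller after recording the event, so the connection can continue while the occurrences are counted in Memfault.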
This guide assumes you have already completed the minimal integration of the Memfault SDK. If you have not, check out the appropriate guide in the table below.
|MCU Architecture|Getting Started Guide|
|---|---|
|ARM Cortex-M|ARM Cortex-M Integration Guide|
|nRF Connect SDK|nRF Connect SDK Integration Guide|
|ESP32 ESP-IDF|ESP32 ESP-IDF Integration Guide|
|ESP8266|ESP8266 RTOS Integration Guide|
Aside from the program counter and return address, a trace event also contains a
user-defined "error reason". The list of custom reasons is defined in a separate
configuration file named memfault_trace_reason_user_config.def, which you need
to create.
To start, we recommend adding a "test" trace error reason you can easily trigger (e.g. via a CLI command) and a couple for error paths in your codebase (such as peripheral bus read/write failures, transport errors and unexpected timeouts).
Here is what the
memfault_trace_reason_user_config.def file should look like:
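A minimal sketch of such a file, assuming the SDK's MEMFAULT_TRACE_REASON_DEFINE macro with one entry per reason. Every reason name other than test is illustrative; replace them with names that match the error paths in your own code.

```c
// memfault_trace_reason_user_config.def
// One line per custom reason. The names below are examples only.
MEMFAULT_TRACE_REASON_DEFINE(test)
MEMFAULT_TRACE_REASON_DEFINE(i2c_read_failure)
MEMFAULT_TRACE_REASON_DEFINE(bluetooth_crc_error)
MEMFAULT_TRACE_REASON_DEFINE(bluetooth_invalid_msg_id)
```

Each name listed here is the identifier you later pass as the first argument to the MEMFAULT_TRACE_EVENT_* macros.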
Next, we'll need to use the
MEMFAULT_TRACE_EVENT_* macros to capture trace
events when errors occur.
Note that it is perfectly fine to use the same reason in different places if that makes sense in the context of your code. Because the program counter and return address are captured in the trace event, you will be able to see the two topmost frames (function name, source file and line) in Memfault's Issue UI and tell the call sites apart.
For test purposes, you can add a CLI command that logs trace events using the different methods:
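A minimal sketch of such a command, assuming the three macro variants MEMFAULT_TRACE_EVENT, MEMFAULT_TRACE_EVENT_WITH_STATUS and MEMFAULT_TRACE_EVENT_WITH_LOG. The handler name cli_cmd_test_trace and its argc/argv signature are hypothetical and depend on your CLI framework. The stand-in macros only count calls so the snippet compiles without the SDK.

```c
#include <stdio.h>

// Stand-ins so this sketch compiles without the SDK. In a real build, remove
// them and include the Memfault SDK headers instead.
#ifndef MEMFAULT_TRACE_EVENT
static int s_events_captured;  // hypothetical counter, illustration only
#define MEMFAULT_TRACE_EVENT(reason) ((void)++s_events_captured)
#define MEMFAULT_TRACE_EVENT_WITH_STATUS(reason, status) \
  ((void)(status), (void)++s_events_captured)
#define MEMFAULT_TRACE_EVENT_WITH_LOG(reason, ...) \
  ((void)printf(__VA_ARGS__), (void)++s_events_captured)
#endif

// Hypothetical CLI handler; register it with whatever shell your firmware uses
int cli_cmd_test_trace(int argc, char *argv[]) {
  (void)argv;

  // 1. Reason only
  MEMFAULT_TRACE_EVENT(test);

  // 2. Reason plus an integer status code (value chosen arbitrarily)
  MEMFAULT_TRACE_EVENT_WITH_STATUS(test, -5);

  // 3. Reason plus a short printf-style message
  MEMFAULT_TRACE_EVENT_WITH_LOG(test, "test trace, argc=%d\n", argc);

  return 0;
}
```

Running the command once should produce three trace events with the same reason, which makes it easy to confirm that each variant reaches Memfault.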
You can also start to add trace events for error paths:
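For instance, a peripheral read failure could be recorded together with the driver's status code. This is a sketch under assumptions: prv_i2c_read, sensor_read_temperature, SENSOR_I2C_ADDR and the reason i2c_read_failure are hypothetical, and the stand-in for MEMFAULT_TRACE_EVENT_WITH_STATUS only records its arguments so the snippet compiles without the SDK.

```c
#include <stddef.h>
#include <stdint.h>

// Stand-in so this sketch compiles without the SDK; remove it in a real build.
#ifndef MEMFAULT_TRACE_EVENT_WITH_STATUS
static int s_event_count;  // hypothetical: number of events captured
static int s_last_status;  // hypothetical: status attached to the last event
#define MEMFAULT_TRACE_EVENT_WITH_STATUS(reason, status) \
  ((void)(s_last_status = (status)), (void)++s_event_count)
#endif

#define SENSOR_I2C_ADDR 0x48  // hypothetical device address

// Hypothetical bus driver: fails with -2 for any address but the sensor's
static int prv_i2c_read(uint8_t addr, uint8_t *buf, size_t len) {
  if (addr != SENSOR_I2C_ADDR) {
    return -2;
  }
  for (size_t i = 0; i < len; i++) {
    buf[i] = 0x19;  // canned reading
  }
  return 0;
}

int sensor_read_temperature(uint8_t addr, uint8_t *out) {
  const int rv = prv_i2c_read(addr, out, 1);
  if (rv != 0) {
    // Recoverable: record the failure with the driver status, then carry on
    MEMFAULT_TRACE_EVENT_WITH_STATUS(i2c_read_failure, rv);
  }
  return rv;
}
```

Passing the driver's return value as the status keeps the event small while still showing which kind of bus failure occurred.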
It is also safe to use the MEMFAULT_TRACE_EVENT() macro from an interrupt.
When called from an interrupt, the macro just copies the collected
info to a temporary storage region in RAM to minimize interrupt latency. The
collected data can then be serialized to event storage by explicitly calling
memfault_trace_event_try_flush_isr_event(). The data will also be flushed
the next time MEMFAULT_TRACE_EVENT() is called and the processor is
not in an interrupt.
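The deferral can be pictured with a toy model. The stand-ins below are not the SDK's implementation. They only mimic the behavior described above (park the event in RAM while in an interrupt, serialize it later) so the snippet runs on its own. The flag s_in_isr, the handler uart_rx_overrun_isr, the function main_loop_iteration and the reason uart_rx_overrun are hypothetical, and the int return type of the flush call is an assumption.

```c
#include <stdbool.h>

// --- Toy stand-ins, NOT the SDK implementation ---
static bool s_in_isr;             // hypothetical: set while an ISR runs
static bool s_isr_event_pending;  // models the temporary RAM region
static int s_events_in_storage;   // models events serialized to storage

// Same name as the SDK call; here it just moves a pending event to storage
int memfault_trace_event_try_flush_isr_event(void) {
  if (s_isr_event_pending) {
    s_isr_event_pending = false;
    s_events_in_storage++;
  }
  return 0;
}

static void prv_capture(void) {
  if (s_in_isr) {
    s_isr_event_pending = true;  // cheap copy only, no serialization
    return;
  }
  (void)memfault_trace_event_try_flush_isr_event();  // flush older ISR event
  s_events_in_storage++;
}
#define MEMFAULT_TRACE_EVENT(reason) prv_capture()
// --- End of stand-ins ---

// Hypothetical interrupt handler: capture and return quickly
void uart_rx_overrun_isr(void) {
  MEMFAULT_TRACE_EVENT(uart_rx_overrun);
}

// Hypothetical main loop body: flush explicitly from thread context
void main_loop_iteration(void) {
  (void)memfault_trace_event_try_flush_isr_event();
}
```

Calling the flush function from a periodic task or the main loop means an event captured in an interrupt does not have to wait for the next non-interrupt trace event before it is serialized.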