Error Tracking with Trace Events

We recommend capturing a full coredump trace when the system encounters a fatal problem, such as a hard fault or a failed assertion. However, in some cases it may not be desirable or possible to do so, for example if stopping and rebooting the system is not an option, or if the error is recoverable but you would still like to understand how often it happens.

The trace event module within the SDK makes it easy to track errors in a way that requires less storage than full coredump traces and also allows the system to keep running after capturing the event. Only the program counter, return address and a custom "reason" are saved. Once uploaded to Memfault, each trace event will be associated with an Issue just like a coredump.

As a running example, this guide captures Trace Events for Bluetooth protocol CRC errors and invalid message IDs.

Integration Steps


Prerequisite: This guide assumes you have already completed the minimal port of the Memfault SDK to collect coredumps. If you have not, check out the getting started guides that are available for the GCC, IAR, or ARM MDK Compiler.

1. Add Trace Event files to Build System

If you are using the makefiles/MemfaultWorker.mk or cmake/Memfault.cmake to automatically collect sources, there is no work to do!

If you are using another build system, you just need to add the following files to the build system:

$(MEMFAULT_SDK_ROOT)/components/core/src/arch_arm_cortex_m.c
$(MEMFAULT_SDK_ROOT)/components/panics/src/memfault_trace_event.c

2. Initialize the Event Storage and Trace Event modules

All events generated by the Memfault SDK are stored and transmitted in a compact binary format (CBOR). While they wait to be sent, they are held in the "event storage" core component. Each trace event requires ~60 bytes of storage. The exact worst-case size of a single event can be determined with memfault_trace_event_compute_worst_case_storage_size().

#include <stdint.h>

#include "memfault/core/event_storage.h"
#include "memfault/panics/trace_event.h"

// [...]

int main(void) {
  // [... other initialization code ...]

  // Budget storage for up to ~10 trace events (~60 bytes each):
  static uint8_t s_event_storage[600];
  const sMemfaultEventStorageImpl *evt_storage =
      memfault_events_storage_boot(s_event_storage, sizeof(s_event_storage));

  // Pass the storage to initialize the trace event module:
  memfault_trace_event_boot(evt_storage);
}
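The buffer size above comes from simple arithmetic, which the following minimal sketch makes explicit. The helper names (demo_event_storage_budget, demo_events_that_fit) are hypothetical and not part of the SDK, and the sketch ignores any bookkeeping overhead the event storage itself may add. In real firmware, the per-event size would come from memfault_trace_event_compute_worst_case_storage_size() rather than the ~60 byte estimate.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper (not an SDK function): bytes of event storage needed
// to hold `num_events` trace events of `bytes_per_event` bytes each.
size_t demo_event_storage_budget(size_t num_events, size_t bytes_per_event) {
  return num_events * bytes_per_event;
}

// Hypothetical helper (not an SDK function): how many whole events fit in a
// buffer of `storage_size` bytes.
size_t demo_events_that_fit(size_t storage_size, size_t bytes_per_event) {
  return (bytes_per_event == 0) ? 0 : storage_size / bytes_per_event;
}

void demo_budget_check(void) {
  // ~10 events at ~60 bytes each matches the 600-byte buffer used above.
  assert(demo_event_storage_budget(10, 60) == 600);
  // A 512-byte buffer would hold 8 whole events at that size.
  assert(demo_events_that_fit(512, 60) == 8);
}
```

If the buffer is too small for the error rate between uploads, events would have to be dropped, so it may be worth sizing for the longest expected gap between sends.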

3. Create trace reasons definition file

Aside from the program counter and return address, a trace event also contains a user-defined "error reason". We'll need to define the list of possible reasons in a separate file.

In this guide we'll name this file memfault_trace_reason_user_config.def and assume it's located at $YOUR_PROJECT_ROOT/config, but you are free to use any name and location you'd like. The file will get #include-ed, so make sure the directory in which you create the file is part of the header search paths, so that the compiler can find it.

As an example, we'll want to track 2 types of errors: Bluetooth protocol CRC errors and invalid message IDs. Let's define a reason for each of these errors in memfault_trace_reason_user_config.def:

// memfault_trace_reason_user_config.def
MEMFAULT_TRACE_REASON_DEFINE(bt_crc_mismatch)
MEMFAULT_TRACE_REASON_DEFINE(bt_invalid_msg_id)
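A definition file of this shape is typically consumed with the "X-macro" pattern, where the same list is expanded more than once with different macro definitions. The sketch below illustrates that general technique only. It is not the SDK's actual implementation, every DEMO_/kDemo name is hypothetical, and the list lives in a macro instead of a separate #include-ed file so that the example is self-contained.

```c
#include <assert.h>
#include <string.h>

// Hypothetical stand-in for the contents of a .def file: one entry per reason.
#define DEMO_TRACE_REASONS(X) \
  X(bt_crc_mismatch)          \
  X(bt_invalid_msg_id)

// First expansion: each entry becomes an enum value.
#define DEMO_AS_ENUM(reason) kDemoTraceReason_##reason,
typedef enum {
  DEMO_TRACE_REASONS(DEMO_AS_ENUM)
  kDemoTraceReason_Count
} eDemoTraceReason;

// Second expansion: each entry becomes a printable name.
#define DEMO_AS_STRING(reason) #reason,
static const char *const s_demo_reason_names[] = {
  DEMO_TRACE_REASONS(DEMO_AS_STRING)
};

// Maps a reason value back to its name, or "unknown" if out of range.
const char *demo_trace_reason_name(eDemoTraceReason reason) {
  return ((unsigned)reason < (unsigned)kDemoTraceReason_Count)
             ? s_demo_reason_names[reason]
             : "unknown";
}

void demo_reason_example(void) {
  // Entries are numbered in the order they are listed.
  assert(kDemoTraceReason_bt_crc_mismatch == 0);
  assert(kDemoTraceReason_bt_invalid_msg_id == 1);
  assert(strcmp(demo_trace_reason_name(kDemoTraceReason_bt_crc_mismatch),
                "bt_crc_mismatch") == 0);
}
```

The benefit of this style is that adding a new reason is a one-line change in a single place, and the enum and any derived tables stay in sync.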

4. Add Trace Reason Define to Build System

The user trace definition file is picked up in the memfault-firmware-sdk via the MEMFAULT_TRACE_REASON_USER_DEFS_FILE macro. Follow the instructions below for the compiler you are using to set this define:

GCC Compiler


With GCC you will need to add a new CFLAG and update your include path. Here's an example of how the addition would look if you are using Make as your build system:

CFLAGS += -DMEMFAULT_TRACE_REASON_USER_DEFS_FILE=\"memfault_trace_reason_user_config.def\"
YOUR_INC_PATHS += $(PROJECT_ROOT)/config

ARM IAR Compiler


With IAR, you can make use of the "Preinclude file" feature to easily add defines to your build. In the preinclude file (in this example it's located at "config/preincludes.h") you will need to add:

// preincludes.h
#define MEMFAULT_TRACE_REASON_USER_DEFS_FILE "memfault_trace_reason_user_config.def"

You can then right click on your project and select "Options". You will need to add the include directory where memfault_trace_reason_user_config.def is located and add a Preinclude file if you don't already have one.

ARM MDK Compiler


With the ARM MDK you can right click on the Project and select "Options". You will then need to navigate to the C/C++ tab and add the following to the "Preprocessor Symbols" Define field:

MEMFAULT_TRACE_REASON_USER_DEFS_FILE=\"memfault_trace_reason_user_config.def\"

and add config to the "Include Paths" list.

5. Instrument your code to capture trace events

Next, we'll need to use the MEMFAULT_TRACE_EVENT macro to instrument the code where these errors can happen.

Note that it is perfectly fine to use the same reason in different places if that makes sense in the context of your code. Because the program counter and return address are captured in the trace event, you will be able to see the 2 topmost call sites (function name, source file and line) in Memfault's Issue UI.

#include "memfault/panics/trace_event.h"

// [...]

void ble_le_process_ll_pkt(...) {
  // ...
  if (invalid_msg_id) {
    MEMFAULT_TRACE_EVENT(bt_invalid_msg_id);
    // ...
  }
  // ...
}

void ble_rf_check_crc(...) {
  if (crc_mismatch) {
    MEMFAULT_TRACE_EVENT(bt_crc_mismatch);
    // ...
  }
}

void some_other_code_path(...) {
  // ...
  if (bad_msg_id) {
    MEMFAULT_TRACE_EVENT(bt_invalid_msg_id);
    // ...
  }
}
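To see why the same reason can be told apart across call sites, the sketch below records the return address of a recording function using the GCC/Clang builtin __builtin_return_address(0). This illustrates the idea only and is not how the SDK captures its program counter and return address. The record layout and all demo_ names are hypothetical, and the builtin is compiler-specific.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical record; the real trace event layout is defined by the SDK.
typedef struct {
  const void *call_site;  // address the recording function will return to
  int reason;
} sDemoTraceRecord;

static sDemoTraceRecord s_demo_records[4];
static size_t s_demo_num_records;

// noinline so that the return address points into the instrumented caller.
__attribute__((noinline)) void demo_trace_record(int reason) {
  if (s_demo_num_records >= sizeof(s_demo_records) / sizeof(s_demo_records[0])) {
    return;  // storage full: drop the event
  }
  s_demo_records[s_demo_num_records].call_site = __builtin_return_address(0);
  s_demo_records[s_demo_num_records].reason = reason;
  s_demo_num_records++;
}

void demo_two_call_sites(void) {
  demo_trace_record(1);  // first call site
  demo_trace_record(1);  // same reason, second call site
  assert(s_demo_num_records == 2);
  // Same reason, but each record points at a different place in the code.
  assert(s_demo_records[0].call_site != NULL);
  assert(s_demo_records[0].call_site != s_demo_records[1].call_site);
}
```

An address like this can be mapped back to a function name, source file and line using the symbol information for the build, which is what makes reusing one reason in several places practical.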

6. Publish events to the Memfault cloud

The final step is to push the data to the Memfault cloud. If you are already pushing coredumps to the cloud, there's nothing more you need to do in order to send Trace Events!

Extensive details about how data from the Memfault SDK makes it to the cloud can be found here. In short, all data is published via the same "chunk" REST endpoint.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#include "memfault/core/data_packetizer.h"

// [...]

bool try_send_memfault_data(void) {
  // buffer to copy chunk data into
  uint8_t buf[USER_CHUNK_SIZE];
  size_t buf_len = sizeof(buf);
  bool data_available = memfault_packetizer_get_chunk(buf, &buf_len);
  if (!data_available) {
    return false; // no more data to send
  }
  // send payload collected to chunks/ endpoint
  user_transport_send_chunk_data(buf, buf_len);
  return true;
}

void send_memfault_data(void) {
  // [... user specific logic deciding when & how much data to send ...]
  while (try_send_memfault_data()) { }
}
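One possible form of the "how much data to send" logic is a per-call chunk budget, sketched below. To keep the example self-contained, demo_get_chunk() is a hypothetical stand-in that mimics the calling convention of memfault_packetizer_get_chunk() as used above, and demo_transport_send() merely counts chunks. None of the demo_ names belong to the SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CHUNK_SIZE 16

// Hypothetical stand-in for queued SDK data: 40 bytes waiting to be sent.
static uint8_t s_demo_pending[40];
static size_t s_demo_read_offset;
static size_t s_demo_chunks_sent;

// Hypothetical stand-in for memfault_packetizer_get_chunk(): copies up to
// *buf_len bytes into buf, updates *buf_len, and returns false once drained.
bool demo_get_chunk(void *buf, size_t *buf_len) {
  const size_t remaining = sizeof(s_demo_pending) - s_demo_read_offset;
  if (remaining == 0) {
    return false;
  }
  const size_t n = (remaining < *buf_len) ? remaining : *buf_len;
  memcpy(buf, &s_demo_pending[s_demo_read_offset], n);
  s_demo_read_offset += n;
  *buf_len = n;
  return true;
}

// Hypothetical transport: counts chunks instead of sending them.
void demo_transport_send(const void *buf, size_t len) {
  (void)buf;
  (void)len;
  s_demo_chunks_sent++;
}

// Sends at most max_chunks chunks. Returns false once the queue is known to
// be drained, true if the budget ran out and more data may remain.
bool demo_send_up_to(size_t max_chunks) {
  for (size_t i = 0; i < max_chunks; i++) {
    uint8_t buf[DEMO_CHUNK_SIZE];
    size_t buf_len = sizeof(buf);
    if (!demo_get_chunk(buf, &buf_len)) {
      return false;
    }
    demo_transport_send(buf, buf_len);
  }
  return true;
}

void demo_send_example(void) {
  assert(demo_send_up_to(2) == true);   // 16 + 16 bytes sent, 8 remain
  assert(s_demo_chunks_sent == 2);
  assert(demo_send_up_to(2) == false);  // last 8 bytes sent, then drained
  assert(s_demo_chunks_sent == 3);
}
```

A budget like this keeps a single call from monopolizing a slow link. The caller can reschedule itself while the function keeps returning true.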

7. (Optional) Collecting Trace Events from Interrupts

It is also safe to use the MEMFAULT_TRACE_EVENT() macro from an interrupt. When called from an interrupt, the macro only copies the collected info to a temporary storage region in RAM, which minimizes interrupt latency. The collected data can then be serialized to event storage by explicitly calling memfault_trace_event_try_flush_isr_event(). The data is also flushed automatically any time MEMFAULT_TRACE_EVENT() is called while the processor is not in an interrupt.
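The deferred-flush pattern described here can be sketched generically as follows. This is an illustration under stated assumptions, not the SDK's code: all demo_ names and the event layout are hypothetical, the sketch assumes a single temporary slot that drops a second interrupt event while one is still pending, and it omits the interrupt masking or atomics that real firmware would need around the shared slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical event layout; the real one is defined by the SDK.
typedef struct {
  uint32_t pc;
  uint32_t lr;
  int reason;
} sDemoEvent;

static volatile bool s_demo_isr_event_pending;
static sDemoEvent s_demo_isr_event;   // temporary slot written from the ISR
static sDemoEvent s_demo_storage[8];  // stand-in for event storage
static size_t s_demo_num_stored;

// Cheap path for interrupt context: just copy into the temporary slot.
bool demo_record_from_isr(uint32_t pc, uint32_t lr, int reason) {
  if (s_demo_isr_event_pending) {
    return false;  // slot still occupied: this sketch drops the new event
  }
  s_demo_isr_event = (sDemoEvent){ .pc = pc, .lr = lr, .reason = reason };
  s_demo_isr_event_pending = true;
  return true;
}

// Slower path for thread context: move the pending event into storage.
bool demo_try_flush_isr_event(void) {
  if (!s_demo_isr_event_pending) {
    return false;  // nothing to flush
  }
  if (s_demo_num_stored >= sizeof(s_demo_storage) / sizeof(s_demo_storage[0])) {
    return false;  // storage full: keep the event pending
  }
  s_demo_storage[s_demo_num_stored++] = s_demo_isr_event;
  s_demo_isr_event_pending = false;
  return true;
}

void demo_isr_example(void) {
  assert(demo_record_from_isr(0x1000, 0x2000, 1));   // captured in the slot
  assert(!demo_record_from_isr(0x3000, 0x4000, 2));  // dropped: slot is busy
  assert(demo_try_flush_isr_event());                // moved into storage
  assert(s_demo_num_stored == 1);
  assert(!demo_try_flush_isr_event());               // nothing left to flush
}
```

With a pattern like this, calling the flush function regularly from a main loop or low-priority task keeps the temporary slot free for the next interrupt-time event.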