Memfault now supports RISC-V on ESP32-C3 devices. By following the steps in the
ESP32 ESP-IDF integration guide, you will be
able to integrate the Memfault Firmware SDK into a system that is using the
ESP-IDF, including the ESP32-C3
chip. The integration guide assumes you already have a working project/toolchain
for the ESP32. If you do not, the
official getting started guide
is a great resource.
With the introduction of the
Linked Device Search
feature, users can list not only the devices returned by a search, but also the
devices linked to them. This is helpful when defining a population of devices
(both primary and linked) before performing follow-up tasks such as assigning
them to a specific Cohort, describing a new Device Set, or assigning Fleet
Sampling Resolutions. It also prevents discrepancies (e.g. a linked device
staying in the old Cohort while the primary device is moved to a new one).
Improvements to the Device Details page will also
allow users to easily inspect the configuration states and update the Fleet
Sampling Resolutions of both devices.
As your fleet grows, it becomes more costly to send, process, and store the data
from all of its devices. With the Fleet Sampling feature, Memfault
helps manage the costs for your device data bandwidth by only collecting
diagnostics and performance data from a smaller, yet statistically significant,
subset of a fleet that you can control at any time.
Thanks to Chart Normalization,
all the insights needed to understand issues occurring across your fleet remain
at your disposal even when only a smaller number of devices are reporting from
the field. For more information, please refer to the
Fleet Sampling documentation.
Memfault was built from the ground up to handle devices with limited bandwidth,
intermittent connectivity, and minimal processing power. Following
the guidelines and code examples in the recently published
Using Memfault with Low-Bandwidth Devices
article, IoT operators can save on network costs and lower the power consumption
of their devices.
Memfault's new Custom Data Recordings (CDRs)
allow devices to send any custom data for specific events or periods at
arbitrary times. Memfault's Device Timeline shows these recordings grouped by
their user-defined reason in relation to the existing debug information, such as
metrics, logs, traces, or reboots to provide even more detail about the state a
device was in at a given moment.
The underlying data of CDRs can be of any format and is accessible for download
for further analysis outside of Memfault. This allows devices to send
vendor-specific or proprietary debug data from their sub-components. Via the
Memfault CLI, one can also augment the device timeline with data captured
outside of the device (e.g. reports for hardware-in-the-loop tests).
Two of the most important IoT reliability metrics are the expected and the
actual battery life of devices. Understanding and predicting battery life
trends, and detecting regressions across thousands to millions of devices in the
field, is a challenging problem for which
Memfault recently published advice and product improvements.
The combination of metric charts, device timeline, and the recently added
documentation with
code samples
helps with understanding the battery life of MCU, Linux, or Android devices.
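
As a rough illustration of the arithmetic behind a battery life estimate, the sketch below derives a discharge rate from two state-of-charge readings and projects a full-charge runtime from it. The function names are hypothetical and not part of any Memfault SDK, and the model assumes a linear discharge with no charging during the interval, which real batteries only approximate.

```python
def discharge_rate_pct_per_hour(soc_start_pct: float, soc_end_pct: float,
                                elapsed_hours: float) -> float:
    """Average drop in state of charge (percentage points) per hour.

    Hypothetical helper for illustration; assumes the device was not
    charging during the interval.
    """
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return (soc_start_pct - soc_end_pct) / elapsed_hours


def projected_battery_life_hours(rate_pct_per_hour: float) -> float:
    """Hours from 100% to 0% at a constant discharge rate (linear model)."""
    if rate_pct_per_hour <= 0:
        raise ValueError("device is not discharging")
    return 100.0 / rate_pct_per_hour


# Example: the battery went from 80% to 77% over a 6-hour interval.
rate = discharge_rate_pct_per_hour(80.0, 77.0, 6.0)
life = projected_battery_life_hours(rate)
print(f"{rate} %/h, about {life} h on a full charge")
```

Comparing the projected value between firmware versions is one simple way to spot a battery life regression, provided the intervals being compared exclude charging periods.
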
Memfault's Linux SDK reached
version 1.0 with the introduction of support for Coredumps: in
the event of a crash of any process on the system, memfaultd produces a memory
dump that is uploaded to Memfault for further processing, allowing
detailed debugging across the fleet.
Together with the already existing support for OTA, metrics, and reboot reasons,
Memfault now offers all its essential features on Linux devices!
The documentation of Memfault's Linux SDK was
extended even further to explain the integration steps as well as the growing
number of configuration options.
Memfault's Linux support reaches another milestone: Devices can now report
metrics and diagnostic data to measure the success of software updates (OTA) and
to proactively diagnose anomalies before users even experience their effect. The
Memfault Linux SDK 0.3.0 ships
with a configurable set of
plugins for collectd to obtain
standard KPIs at the operating system level (e.g. available storage or RAM, CPU
utilization, or network status and traffic). You can also use the SDK to
collect product-specific custom metrics via statsd.
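
To give an idea of what reporting a custom metric via statsd can look like, here is a minimal sketch that emits a gauge using the plain StatsD line format (`<name>:<value>|<type>`) over UDP. The host, port (8125 is the conventional StatsD port), and metric name are assumptions for illustration; check your device's collectd/statsd configuration for the actual listener address.

```python
import socket

# StatsD metric types: counter, gauge, timer (milliseconds), set.
VALID_TYPES = {"c", "g", "ms", "s"}


def format_statsd(name: str, value: float, metric_type: str = "g") -> bytes:
    """Encode one metric in the StatsD line format: <name>:<value>|<type>."""
    if metric_type not in VALID_TYPES:
        raise ValueError(f"unsupported StatsD metric type: {metric_type!r}")
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(name: str, value: float, metric_type: str = "g",
                host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send a single metric as one UDP datagram (fire and forget).

    host and port are assumed defaults, not values confirmed by the SDK docs.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_statsd(name, value, metric_type), (host, port))


# Hypothetical product-specific metric name.
send_metric("mycompany.door_open_count", 3, "c")
```

Because UDP is connectionless, the send does not fail if nothing is listening, so a missing or misconfigured listener has to be detected by checking that the metric actually arrives.
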
When sent to the cloud, all telemetry data is processed and distilled into
fleet-wide time-series metrics (e.g. "was
there an uptick in avg. CPU usage since the last version?"),
device attributes (e.g. "which devices at
site B have already run for more than 6 months without a reboot?"), and detailed
per-device insights via the Timeline UI (e.g. "are there any anomalies on the
network traffic that correlate with crashes reported for this device?").
Memfault's Device Timeline provides a view for each device's metrics, reboots
and crashes to make debugging easier. With the recent performance improvements,
Device Timeline now renders considerably more time-series metrics
simultaneously, and expandable "Panels" group relevant metrics together for ease
of use (for example, Value History, Foreground, and Longwakes).
Memfault extends its features on embedded Linux
toward basic fleet operations. You can now measure basic fleet-wide health
metrics by tracking reboots and their cause at scale. Similar to Memfault's
MCU and Android SDKs,
there is now a dedicated
Memfault Linux SDK 0.2.0 with source code
including examples. The SDK repository comes with Docker images including QEMU
support to simplify the first steps.
As part of the SDK, a new on-device agent memfaultd orchestrates the
configuration of related components such as SWUpdate for OTA. It will act as a
minimal yet central component in future releases for features such as metrics
and crash reporting.
Memfault's charts can now be normalized
to convert absolute values such as "number of incidents", sums, and counts to
corresponding relative values "per 1,000 devices". This helps in understanding
real trends when you are looking at values over time or when comparing values
between populations of different sizes.
This feature also works with custom metric charts for any custom metric. It is
particularly helpful to measure the success of an ongoing OTA update by
comparing devices from a large production Cohort "Default" against those from a
smaller test Cohort "Beta". Chart normalization is also generally useful when
the population size changes over time (e.g. new devices being activated
continuously).
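
The normalization described above boils down to dividing an absolute value by the population size and scaling to 1,000 devices. A minimal sketch follows; the function name and the Cohort numbers are made up for illustration and do not come from Memfault.

```python
def per_1000_devices(count: float, population: int) -> float:
    """Convert an absolute count into a rate per 1,000 devices."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 1000 / population


# Hypothetical numbers: a large "Default" Cohort and a small "Beta" Cohort.
default_rate = per_1000_devices(count=120, population=50_000)
beta_rate = per_1000_devices(count=6, population=800)

print(f"Default: {default_rate} incidents per 1,000 devices")
print(f"Beta:    {beta_rate} incidents per 1,000 devices")
```

In absolute terms the "Default" Cohort shows 20 times more incidents, yet once normalized the "Beta" Cohort has the higher incident rate, which is exactly the kind of trend that raw counts hide.
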
Memfault improved its
notification system and the way notifications are sent
for Alerts. For each individual Alert, you can now
decide which team members, external systems, or groups thereof should receive an
email. For example, all members of @team-maintenance may want to receive
notifications about devices with an abnormal battery discharge rate, while a
spike in connectivity issues on the "Beta" Cohort may only be relevant in the
#beta-release Slack channel.
At
Settings → Notifications,
there are extensive options to customize the @userhandle for any team member
and to connect Memfault to dedicated Slack channels or any other external system
(e.g. PagerDuty, Opsgenie) by registering external email addresses. Any
combination of these
User Handles and External Targets can be added to a Notification Group
and used to control who is notified for each Alert.
Memfault's over-the-air update service is now
available on Embedded Linux with SWUpdate via the
hawkBit DDI. This makes all Memfault
OTA management and hosting features such as Cohorts, staged rollouts, full vs.
delta releases, and a scalable global CDN available to Linux devices that
utilize one of the most popular update agents. Memfault also added support for
forced (non-interactive) updates – invaluable for delivering security updates to
embedded IoT devices.