Tracking Battery Life with Memfault
Memfault believes that the expected and the actual battery life of devices are two of the most important reliability metrics for devices that are not connected directly to power sources. Unfortunately, understanding and predicting battery life trends and regressions with thousands to millions of devices in the field is hard.
In this guide, we will walk through the benefits of tracking battery life with Memfault, how to understand and predict battery life for a single device as well as across an entire fleet of devices, and finally talk through common pitfalls when monitoring battery life.
Tracking Battery Life with Memfault
By using Memfault's device SDKs and the heartbeat metrics, the difficulties of monitoring battery life will become a problem of the past. Memfault provides two main benefits when it comes to monitoring battery life: the ability to view a single device's battery status over time, and the ability to understand battery trends across an entire fleet of devices.
The above view shows a device-level view of battery life over time. This view is best for customer support teams responding to customer issues around battery life and for engineers trying to determine the root causes for poor battery life on a single device.
The above view shows the average battery life drop per hour of all devices in a fleet between software versions 1.0.0 and 1.0.1. Even though small, there is a larger battery life drop on average for software version 1.0.1, which could signal a regression in firmware!
The above examples give you a quick idea of the capabilities of Memfault. Let's begin on how to get started digging into battery life on your devices using Memfault.
Steps to Track Battery Life
Android
If you are using the Bort SDK, the relevant battery metrics are automatically reported and you can skip down to the later sections to learn how to use these metrics within Memfault.
MCU & Linux
The MCU SDK >= v0.42.0 handles unset metric values differently from prior versions. Unset metrics are now dropped automatically by Memfault. We've updated this page with instructions on setup with and without these changes.
Two metrics need to be recorded within the MCU or Linux firmware to ensure that you can reliably track battery life at a device and fleet level. These two metrics are the charge level, or state of charge, of a device at the end of a heartbeat interval, and the charge level drop over the course of a heartbeat interval, where the heartbeat interval is a consistent, fixed interval of time.
Current Battery Charge Level
This value represents the current charge level of the battery in the system. This value should be linearly normalized between 0 and 10,000.
>= SDK v0.42.0: The value should only be set if valid for the entire interval.
< SDK v0.42.0: A value of -1 should be sent when the battery charge level can not be obtained or is known to be invalid.
This value should be based on a linear scale. Millivolts should not be sent.
Setup Within Memfault
- It is recommended to use the name
Battery_ChargeLevel
for this metric name. - Range should be set to 0 - 10,000, both start and end inclusive.
- Ingress Filtering should be set to "Accept only values within range" (only required with SDK < v0.42.0).
- Attribute should be enabled, and the Write Policy should be set to Writeable.
- Time Series should be disabled.
Example Metric Recordings
- To report a battery charge level of 50%, record 5,000.
- To report a battery charge level of 50 mAh out of a 500 mAh capacity battery, record 1,000.
- If the battery charge level could not be acquired or it is known to be invalid, record -1 (SDK < v0.42.0).
- If the battery charge level could not be acquired or it is known to be invalid, leave the metric unset (SDK >= v0.42.0).
Battery Level Drop Over Interval
This value represents the amount the battery charge level dropped within the heartbeat interval, normalized between 0 to 10,000. This can also be interpreted as the first derivative of the current battery charge level.
This metric should report an invalid reading of -1 if a charger or other power source was connected at any point during the heartbeat interval.
A value of -1 should also be sent when the battery charge level drop can not be obtained, is known to be invalid, or happens to be negative.
There's a reason to send -1 in these events. We want to capture only normally occurring battery charge level drops during intervals. If anything makes it a non-normal event, it should be thrown out so as to not ruin the fleet-level aggregation.
For >= SDK 0.42.0, leave the metric value unset when invalid instead.
Setup Within Memfault
- It is recommended to use the name
Battery_ChargeLevelDrop
for this metric name. - Range should be set to 0 - 10,000.
- Ingress Filtering should be set to "Accept only values within range" (only required for <= SDK v0.42.0).
- Attribute should be disabled.
- Time Series should be enabled.
Example Metric Recordings
- To report a drop from 87% to 84% over the course of the heartbeat interval, report 300.
- To report a battery charge level drop of 10 mAh out of a 500 mAh capacity battery, record 200.
- If the battery charge level drop could not be acquired or it is known to be invalid or negative, record -1 (SDK < v0.42.0).
- If the battery charge level drop could not be acquired or it is known to be invalid or negative, leave value unset (SDK >= v0.42.0).
Implementing Battery Metrics in Code
Below is C pseudocode that can be used to track the current battery charge level and battery charge level drop over a given heartbeat interval.
memfault_metrics_heartbeat_config.def
MEMFAULT_METRICS_KEY_DEFINE_WITH_RANGE(Battery_ChargeLevel,
kMemfaultMetricType_Signed,
0, 10000)
MEMFAULT_METRICS_KEY_DEFINE_WITH_RANGE(Battery_ChargeLevelDrop,
kMemfaultMetricType_Signed,
0, 10000)
Note that the value of -1 can be sent even though the range is defined as 0 - 10000. By setting the "Ingress Filtering" configuration to "Accept only values within range", Memfault will automatically filter out the -1 values in the web application.
Memfault C Instrumentation
- >= SDK v0.42.0
- < SDK v0.42.0
// Stores the battery charge level from the start of the heartbeat interval
static int32_t s_battery_charge_level_previous;
// Tracks whether a charger was connected during the heartbeat interval
static bool s_battery_charger_event_during_interval;
// Called at the start of a heartbeat interval
static void prv_device_metrics_start_heartbeat() {
// Sets the battery percentage at the start of the interval
s_battery_charge_level_previous = battery_get_current_pct();
// Determine whether a charger is connected at this exact moment.
// If the charger is connected during the hour, this should be
// updated by the battery/charging module too.
s_battery_charger_event_during_interval = battery_get_charger_connected();
}
// Calculate and record the drop in battery charge over the heartbeat interval
static void prv_metrics_heartbeat_record_battery_info(void) {
// Charge level normalized between 0 and 10,000
const int32_t battery_charge_level = battery_charge_level_get();
// Subtract current charge level from previous charge level to get
// a drop in battery charge level normalized between 0 and 10,000
const int32_t battery_charge_drop = battery_charge_level - s_battery_charge_level_previous;
const bool report_battery_drop = (!s_battery_charger_event_during_interval && battery_charge_drop >= 0);
if (report_battery_drop) {
memfault_metrics_heartbeat_set_signed(
MEMFAULT_METRICS_KEY_DEFINE(Battery_ChargeLevelDrop),
battery_charge_drop
);
}
// Also record the current battery charge level
if (battery_charge_level > 0) {
memfault_metrics_heartbeat_set_signed(
MEMFAULT_METRICS_KEY_DEFINE(Battery_ChargeLevel),
battery_charge_level
);
}
}
// Stores the battery charge level from the start of the heartbeat interval
static int32_t s_battery_charge_level_previous;
// Tracks whether a charger was connected during the heartbeat interval
static bool s_battery_charger_event_during_interval;
// Called at the start of a heartbeat interval
static void prv_device_metrics_start_heartbeat() {
// Sets the battery percentage at the start of the interval
s_battery_charge_level_previous = battery_get_current_pct();
// Determine whether a charger is connected at this exact moment.
// If the charger is connected during the hour, this should be
// updated by the battery/charging module too.
s_battery_charger_event_during_interval = battery_get_charger_connected();
}
// Calculate and record the drop in battery charge over the heartbeat interval
static void prv_metrics_heartbeat_record_battery_info(void) {
// Charge level normalized between 0 and 10,000
const int32_t battery_charge_level = battery_charge_level_get();
// Subtract current charge level from previous charge level to get
// a drop in battery charge level normalized between 0 and 10,000
const int32_t battery_charge_drop = battery_charge_level - s_battery_charge_level_previous;
const bool report_battery_drop = (!s_battery_charger_event_during_interval && battery_charge_drop >= 0);
if (report_battery_drop) {
memfault_metrics_heartbeat_set_signed(
MEMFAULT_METRICS_KEY_DEFINE(Battery_ChargeLevelDrop),
battery_charge_drop
);
} else {
// Don't report this metric during hours with charging events
// Make sure to set up the range & ingress filtering to ignore -1 in the Memfault service
memfault_metrics_heartbeat_set_signed(
MEMFAULT_METRICS_KEY_DEFINE(Battery_ChargeLevelDrop),
-1
);
}
// Also record the current battery charge level
memfault_metrics_heartbeat_set_signed(
MEMFAULT_METRICS_KEY_DEFINE(Battery_ChargeLevel),
battery_charge_level
);
}
Analyzing Battery Life for a Single Device
Viewing an individual device's battery life over time is usually the first step during a customer support or engineering debugging session when trying to diagnose poor battery performance.
In Memfault, a device's battery life can be found under Devices then Timeline.
The Memfault application also shows other important metrics alongside battery life, which might point out correlations between battery charge drops and other events, such as the screen backlight being on, the Wi-Fi radio being stuck in a particular state, or the CPU not sleeping enough.
Android
For Android devices, you can navigate to the Device Timeline tab in the application and view the "Battery Level" metric to view the battery charge level over time.
MCU & Embedded Linux
For MCU & Embedded Linux devices, you can navigate to the Device Timeline
tab in the application and view the specific battery metrics that your devices
are configured to send. If this guide was followed, the metric would be called
Battery_ChargeLevel
.
Analyzing Battery Life for a Fleet of Devices
A requirement of battery-operated devices is that they meet a minimum expected battery life. If a smartwatch promised to last for a week under normal usage, it should ideally last a week or more in real-world environments.
Fleet-wide battery trends can also be easily analyzed within Memfault.
Android
For Android projects and devices, a metric
battery_discharge_rate_pct_per_hour_avg
is created automatically. It
represents the battery charge level drop per hour represented as a percentage.
Below, we create a chart for this metric.
In the next section, we'll talk about how to use this chart.
MCU & Embedded Linux
For MCU & Embedded Linux projects, we receive the metric representing the battery charge level drop over the course of the heartbeat interval for all devices. Below, we create a chart for this metric.
In the next section, we'll talk about how to use this chart.
Predicting Expected Battery Life
Once we have the average drop in the battery charge level over time, we can easily calculate the number of hours of battery life that our devices should expect on average!
For our calculations below, we'll assume that all heartbeat intervals are one hour.
For example, if the average battery charge drop per hour was 2.2.
Common Battery Life Pitfalls
Here are some common pitfalls with monitoring battery life that many of us at Memfault have struggled with in the past. We hope you don't make the same mistakes with this knowledge.
Ignoring Charging Status During Heartbeat Interval
A common issue with reporting battery life information from devices is that the power state is ignored. Battery life statistics that are used in aggregations should not be taken into account if a device is currently plugged in or charging.
For example, the chart below shows the battery life of a device over a span of time:
The battery is discharging around 10% per data point. However, there is one chunk of time where there is a large battery life drop, but the device is plugged into a charger quickly and by the time the next battery life report takes place, it's been charged and is reporting a normal-looking number again. It might look as if the battery regression never took place!
The same goes for a device that is reporting data for long periods while connected to a charger. It might report 100% for 3 days, and these calculations should not be taken into account when determining battery life calculations for a fleet of devices.
Sending Current Battery Charge Level Sporadically
Firmware could be configured to send up battery life information at a non-standard interval. It may be whenever it connects to Wi-Fi, when it's within Bluetooth range of a cell phone, or when it's asked by a remote server for information.
In this information report, a device would usually send the current battery life percentage according to a pre-determined curve or a millivolt reading.
This type of information is decent when looking at a single device's history of battery life, but that is about it. From these individual reports, and sporadic reports at that, a battery life drop might be missed entirely, such as in the chart below:
When only taking into account the four captured data points, the battery life looks like it drops at a normal rate. However, there was a large drop in the middle.
The other reason why sending battery charge levels at non-regular intervals is hard is because of the large-scale calculation that needs to be done to aggregate millions of devices battery life expectancies becomes incredibly complex, expensive, and potentially impossible.
Not Waiting For Battery Measure to Settle
Batteries are quirky devices, and we don't have perfect tools at our disposal to get accurate readings from them at all times. Often, the voltage from the battery is used to come up with a guess as to the charge level of the battery.
Reading the voltage works well if the battery draw has been stable and continues to be stable, but the battery may experience a voltage drop during a sudden high current draw, such as the backlight being turned on or a vibration motor suddenly activating.
In these situations, you need to be extra careful when using the battery charge level reading. To mitigate this, it is advised to take a handful of samples at both the start and end of heartbeat intervals, each sample being a few seconds apart. Of these samples, the outliers can be discarded or all samples can be averaged to come up with a more stable reading.
Preventing Battery Life Regressions
The only real way to prevent battery life regressions reliably and quickly is to measure the metrics mentioned above on devices in a production environment on production firmware. It is impossible to run devices and firmware through the entire matrix of environmental conditions to understand real-world power usage.
This means that if a company makes a wearable device and it's shipped all over the world, it should be tested internally with a variety of Android and iOS phones, versions, connectivity interference, and usage patterns, but it should be understood that this will only be a small subset of the environments the device will experience.
The only proper way to understand battery life is to push out an update to a small percentage of the fleet of devices, maybe 1%, and gather data. Below, we are comparing an existing firmware, 1.0.0, with a new firmware that has recently begun rolling out to devices, 1.1.0.
We can quickly see that the battery charge level drop on 1.1.0 is, on average, much worse than on 1.0.0. If this was unexpected, this is a regression!
This release should be pulled and more investigation needs to be done to understand the root cause for the new battery life drain behavior.