Device Vitals: Built-in metrics and out-of-the-box visualizations for stability, battery life and connectivity

With our newly released Device Vitals feature set we have added a streamlined and standardized way to start monitoring critical health and performance data from your devices. The three Device Vitals, Stability, Battery Life and Connectivity, are automatically computed by Memfault on the cloud side based on a set of metrics collected by the SDK. The metrics required for the computation are standardized across all platforms and are now built into all SDKs from Android 4.13, MCU 1.5.0, and Linux 1.9.0 onward.

We have also added a set of “out-of-the-box” dashboards for monitoring each vital and new charts and cards to add to your own custom dashboards. The combination of built-in metrics and out-of-the-box visualizations provides both existing and new Memfault customers an easy way to start monitoring some of the most important health and performance indicators across your fleet. Read about how to configure the built-in metrics and use the new visualizations in our technical documentation.

Jira Integration: Connect and automatically sync Memfault Issues with tickets in Jira

Our new Jira integration adds the ability to sync key information from Memfault Issues with Jira Issues. Users can now create Jira Issues from an Issue in Memfault and link Memfault Issues to existing Jira Issues from within Memfault. Once linked, key data is automatically and continuously synced with the Jira Issue, including:

  • Memfault Issue name and issue status
  • Trace count and impacted device count
  • First seen and last seen dates

The Issue status in Memfault (e.g. Open, Resolved, Reopened) will also be automatically updated based on the Jira Issue status (e.g. Done, Unresolved). This new integration should make it much easier to work issues identified by Memfault into your existing development processes in Jira. You can read more about the integration and how to get set up in our technical documentation.

Get all the data you need in one place: Add any chart to any dashboard

It is now possible to add any chart, previously available across any part of Memfault, into your custom dashboards. All of the charts and cards previously only available in either the Overview or Device Sets dashboards can now be added to any dashboard. The three pre-existing dashboards, Overview, Metrics and Device Sets have all been consolidated under the Dashboards section of the side nav and can now be renamed and customized.

This series of changes means that users can get maximum value from all data sources in Memfault in their custom dashboards.

The list of newly available charts and cards includes:

  • Active devices
  • Issue charts
  • Software Versions
  • Device sets
  • Device incident alerts

Combine all these data sources into a single dashboard to maximize the relevance and usefulness of each dashboard and eliminate the need to jump between dashboards to get the insights you need. You can read more about the changes in Memfault’s Dashboards documentation.

Improved Processing Log: More data, better discoverability, more tools

The Processing Log has been updated to cover more processing-related activity, improve discoverability and make it easier to take action. The update adds multiple new filtering options including hardware version, software version and log level (e.g. Errors or Info). It also adds multiple new ways to take action on the information, including shareable links for each log, a download of the log for further investigation, and a quick link to upload missing symbol files.

The introduction of the Processing Log will make the initial integration much easier, providing instant feedback for a developer encountering unexpected behavior during integration. For customers already using Memfault, this will provide much greater visibility of project-related data processing and of errors that might be impacting the completeness of their data. Users can read more about the Processing Log in Memfault’s documentation.

New charts for tracking Crashes and Reboots

Users can now add two new chart types to their custom dashboards: Reboots and “Crashes per 10k hours”. The Reboots chart gives visibility into the breakdown of different reboot reasons across each data set. The “Crashes per 10k hours” chart provides a calculation of the average number of crashes (unexpected reboots) across a minimum of 10,000 operating hours.
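
To make the “Crashes per 10k hours” idea concrete, here is a minimal sketch of how such a rate could be computed from a crash count and a total of operating hours. It is not Memfault's implementation: the function name and the choice to return no value below 10,000 observed hours are assumptions made for illustration.

```python
from typing import Optional

MIN_OPERATING_HOURS = 10_000  # minimum sample size mentioned in the chart description


def crashes_per_10k_hours(crash_count: int, operating_hours: float) -> Optional[float]:
    """Normalize a crash count to a rate per 10,000 operating hours.

    Assumption: with fewer than 10,000 observed hours there is not enough
    data, so this sketch returns None instead of a rate.
    """
    if operating_hours < MIN_OPERATING_HOURS:
        return None
    return crash_count * 10_000 / operating_hours


# 30 crashes observed over 20,000 device-hours
print(crashes_per_10k_hours(30, 20_000))  # prints 15.0
```

Normalizing by operating hours is what makes two populations of different sizes comparable: a fleet with twice the devices will naturally log more crashes, but its rate per 10,000 hours stays the same if stability is unchanged.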

These new charts give teams a way to track the stability of their devices and measure and compare software quality across distinct populations. Teams will now be able to definitively measure software quality improvements or regressions between versions and even compare software stability across product lines. The Reboots chart is available to all users, but the Crashes per 10k hours chart is currently only available to MCU customers. You can read more about these new charts in Memfault's documentation.

Build your own dashboards in Memfault

Users can now create custom dashboards and manage the layout of content within these new dashboards. This change makes dashboards much more flexible, able to cope with a wider variety of use cases such as dashboards for specific teams, software versions, cohorts, etc. Users can also re-arrange the content within the dashboard using drag, drop and resize functionality.

Create and manage dashboards in the All Dashboards tab within the “Dashboards” sub-menu and customize chart layouts using the “Layout mode” toggle available within each dashboard. Find out more about creating custom dashboards and using layout mode in Memfault’s documentation.

Improved issue management with tags

Users can now add tags to issues within Memfault and use the tags to filter issue searches and build issue charts. This change facilitates more sophisticated issue triaging, grouping and categorization within Memfault. For example, tags could be used to indicate issue priority and also to associate groups of related but separate issues together.

Tags are added in the issue view. Each issue can carry multiple tags, and tags can be added or removed at any time.

Better visibility of processing errors

Users can now view a detailed listing of processing errors in the Processing Log under the “Integration Hub” sub-menu. The Processing Log contains details on errors such as missing symbol files on coredump upload, MAR file processing errors and instances of device data not being accepted by the server. This change adds a huge amount of additional visibility into errors related to processing of device data in Memfault.

These processing errors are now also reported on the device timeline adding another layer of debugging information for devices. Users can read more about the Processing Log in Memfault’s documentation.

Export lists of Devices as a CSV

Users can now bulk export lists of Devices and associated information as a CSV file. Use filters as normal to define the desired list of Devices and then export that list as a CSV. Once the CSV is generated it will be emailed to the user. This makes it easy to share information on specific groups of Devices to teams in the organization outside of Memfault for the purpose of further processing in external tools or scripts for logistics, reporting, etc.

By default on export the CSV will include basic information about the Device such as serial number, cohort, software and hardware version etc. and a user can also choose to include custom attributes in the export if required.

See data in Metric charts as soon as it’s received

Data in Metric charts is now available in “real-time” once received. Previously, data in Metric charts only updated once the UTC day had completed, which delayed the insights the Metric charts were able to provide. Now the Metric data collected from devices is visible in Metric charts as soon as it is received by Memfault.

Memfault Metric charts now display data in "real-time"

This enhancement will allow users to react more quickly to undesirable changes in behavior and provide closer to real-time information during significant events like software version roll-outs. Currently, this change is only applicable to Metric charts and does not apply to other chart types (e.g. Issues charts, Device set charts, etc.).

Software versions displayed in Device Timeline

Device Timeline now includes a visualization of the active software version on the device alongside the metric and traces information. This allows a user to very quickly associate any metric behavior, crash or reboot event with the active software version at the time.

As demonstrated in the screenshot, this should also make it very easy to identify if a change of behavior coincides with a change in software version. The software version is now displayed by default across all Devices on all platforms.

Best Practice Guide: Using MQTT with Memfault

We have released a new best practices guide covering the use of the MQTT protocol with Memfault. Correct set-up of your MQTT implementation is critical, as errors in set-up can result in data loss or in data being decoded incorrectly, making it impossible for Memfault to deliver accurate insights.

The guide provides a basic introduction to MQTT and specific advice for users looking to optimize their MQTT stack to ensure reliable data delivery to Memfault. Topics covered include:

  • Publishing QoS settings
  • Topic architecture and recommended topic structure
  • Minimizing publishing overheads with topic aliases
  • Choosing an MQTT payload size
  • Device and Service examples

Read the full MQTT guide in our best practice documentation.

New Data Aggregation for Metric Charts: Percentiles

Metric Charts now have a new data aggregation available for both chart rollup options (by Cohort or Software version, and Over Time). This aggregation is called “Percentiles” and will display the data set broken into the 1st, 5th, 50th, 95th and 99th percentiles. Displaying the metric data as percentiles makes it easier to understand the prevalence of behaviors (tracked as metrics) across your fleet.

As an example, if you are seeing significant spikes in battery discharge rates for a specific software version you can use percentile aggregation to get a clearer picture of the scale of this problem. Are these undesirable metric readings I am receiving contained to a small set of samples or is this a wide scale problem? Conversely, this aggregation should also make it easier to understand what “normal” actually looks like across your fleet.
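
As a rough illustration of what a percentile breakdown tells you, the sketch below computes the same five percentiles from a list of metric samples using Python's standard library. The function name and the sample values are hypothetical, and the interpolation method (`inclusive`) is an assumption; Memfault's own calculation may differ.

```python
import statistics

PERCENTILES = (1, 5, 50, 95, 99)


def percentile_summary(samples):
    """Return the 1st, 5th, 50th, 95th and 99th percentiles of the samples.

    statistics.quantiles(n=100) returns 99 cut points; the cut point at
    index p - 1 is the p-th percentile.
    """
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in PERCENTILES}


# Hypothetical battery discharge rates (% per hour) reported by a fleet
discharge_rates = [1.1, 1.3, 1.2, 1.4, 1.2, 1.3, 9.8, 1.1, 1.2, 1.3]
for p, value in percentile_summary(discharge_rates).items():
    print(f"p{p}: {value:.2f}")
```

If the 50th percentile sits near the typical value while only the 99th is far above it, the spike is confined to a small share of samples; if the median itself has moved, the problem is widespread.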

This new aggregation is the default view when creating any new metric chart. You can read more about this new data aggregation in our Metric Charts documentation.

Best Practices Guide for Android Battery Debugging

Our Developer Experience team has released a new guide designed to help users get the most from Memfault when identifying and debugging battery issues on Android devices. This guide provides detail on:

  • What data the Android SDK can collect
  • Using metric charts for tracking battery health/performance
  • Setting fleet and device alerts
  • Identifying problem devices
  • Debugging individual devices using Device Timeline

The guide takes into account the recently released set of updates to our Android SDK with version 4.8.0 which included the addition of new battery usage metrics such as per app battery usage.

You can read the full guide in the Memfault Best Practices for Android Battery Debugging.

Bulk Issue Merging improvements

Users can now merge multiple issues together with a few clicks, eliminating the previously repetitive process of merging each issue independently. This addresses scenarios where Memfault’s de-duplication algorithm does not group related issues because of unaccounted-for variables within the issue signature, which make each one appear as an independent issue.

Users can now use all of the filtering capabilities within the issues page to narrow down to a specific issue set and either bulk select or individually select a set of issues they believe should be merged into one issue. You can read more about issues and bulk merging in the Issue Management documentation.

Android SDK 4.8.0: More OTA control, more metrics and more issue tracking

The most recent Android SDK 4.8.0 introduces a number of new features for Android customers. Android customers now get powerful additional controls for the OTA update process, more granular battery usage metrics and tracking of SELinux violations.

The OTA improvements allow users to control download and install behavior independently of each other, based on additional payload-specific metadata and/or the current device condition. For example, you can ensure that devices will prioritize updates a user has tagged as “critical” for download and will only install an update if certain battery conditions are met. You can read more about the specific configurations in Memfault’s documentation on configuring download and install for OTA updates.

The changes to battery metrics allow users to view the battery usage per app, distinguish usage in screen-on or screen-off scenarios, and see battery capacity, all on the device timeline view. This gives much more granular visibility into battery performance and further enhances users’ ability to root cause issues. You can read more about this in Memfault’s Android Battery Summary Metrics documentation.

Finally, we added support for tracking SELinux violations via Memfault. Tracking these issues will now be possible with all of the same functionality as the other pre-existing Android issue types.

Metric Charts: Comparison by Software Version or Cohort

Metric charts have been enhanced with the addition of a new chart type presenting a direct comparison, version to version or cohort to cohort, without time as a variable. This new chart type presents a comparison of min/mean/max data aggregated across the entire comparison populations based on your selection (e.g. min/mean/max ever reported on v1.0.0 vs min/mean/max ever reported on v1.0.1).
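
The aggregation behind this kind of comparison can be sketched in a few lines: group every reported value by software version (or cohort), ignore timestamps, and reduce each group to its min, mean and max. The function name and the input shape below are assumptions for illustration, not Memfault's data model.

```python
from collections import defaultdict
from statistics import mean


def compare_by_version(samples):
    """Aggregate (software_version, value) pairs into min/mean/max per version.

    Time is deliberately not part of the input: every value ever reported
    for a version contributes to that version's summary.
    """
    grouped = defaultdict(list)
    for version, value in samples:
        grouped[version].append(value)
    return {
        version: {"min": min(values), "mean": mean(values), "max": max(values)}
        for version, values in grouped.items()
    }


# Hypothetical metric readings tagged with the version that reported them
readings = [("1.0.0", 10), ("1.0.0", 20), ("1.0.1", 5), ("1.0.1", 7)]
for version, stats in compare_by_version(readings).items():
    print(version, stats)
```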

The pre-existing “over-time” chart view restricts comparison to an 8-week window, which is great for intensive monitoring but less suited to judging absolute performance. This new chart type allows users to make more effective performance comparisons in real-world scenarios and make decisions with more confidence.

You can read more about this new chart type in Memfault's documentation.

Increased Configurability for Alerting and Notifications

Alerting and notification functionality has received a set of significant enhancements. These include the introduction of configurable incident start and end delays, the ability to decide at which stages of an incident a user wants a notification, and increased control over the scheduling for incident summary notifications.

These changes are designed to give users greater control, on an alert-by-alert basis, over alerting behavior and incident-related notifications, reducing unnecessary “noise” without sacrificing visibility. Ultimately, these changes help ensure that alerts trigger notifications which are relevant and timely.

You can read more about these changes to alerting, incidents and notifications in Memfault's documentation.

Simplified Debugging with Device Timezone in Memfault

The Timezone selector allows users to adjust the timezone within which all information viewed in Memfault is contextualized for that session. Currently, the timezone selections available include “Browser” (browser default), “Universal” (UTC), “Device” (configurable) and “Custom” (region selection).

Timezone selection within Memfault is designed to make it easier for international teams working with globally deployed devices to contextualize, coordinate and collaborate. Whether you want to standardize working within the “Universal” timezone across your team, or you want to use the “Custom” or “Device” options to adjust your session to the specific local timezone of a device you are debugging, having this type of control over context can be powerful.

To provide a specific example, let’s say you receive a report, via your support team, of a customer having an issue with a device. The end user has reported a problem at 11am this morning. Rather than doing the time conversion manually, the developer can just switch to the correct “Custom” timezone, assuming they know where that device was located. This process is even quicker if the “Device” timezone is configured: it takes a single click with no additional information required.
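
For reference, this is the manual calculation the timezone selector saves you from, sketched with Python's standard library. The date and the UTC-8 offset are hypothetical, and a fixed offset is used only to keep the example dependency-free; a real region would also need daylight-saving rules.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_wall_time: datetime, utc_offset_hours: float) -> datetime:
    """Interpret a naive wall-clock time at the given UTC offset and convert it to UTC."""
    device_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_wall_time.replace(tzinfo=device_tz).astimezone(timezone.utc)


# "11am this morning" as reported by an end user whose device is at UTC-8
reported = datetime(2024, 3, 5, 11, 0)  # hypothetical date
print(to_utc(reported, -8).isoformat())  # prints 2024-03-05T19:00:00+00:00
```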

This month we made some improvements to the function of the “Device” timezone selection for Android devices. This selection is now available by default for Android devices using the persist.sys.timezone system property.

You can read more about timezones here and about the Android specific “Device” timezone metric collection here.