Fleet Sampling
As your fleet grows, it becomes more costly to send, process and store the data from all of your devices. Fleet Sampling helps keep costs low by collecting diagnostics and performance data only from a smaller, yet statistically significant, subset of your fleet. At large fleet sizes, this smaller "sampled" population of devices provides data that is representative enough to deliver all relevant insights and to understand issues as they occur across your fleet.
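To give a feel for why a small subset can still be representative, the sketch below estimates how many devices need to be sampled to measure a fleet-wide proportion (for example, the share of devices that crashed this week) within a chosen margin of error. It uses the standard Cochran sample-size formula with a finite population correction. The function name and default values are assumptions made for illustration; this is not a description of how Memfault sizes a sample.

```python
import math


def required_sample_size(
    fleet_size: int,
    margin_of_error: float = 0.02,
    z_score: float = 1.96,      # 95% confidence level
    proportion: float = 0.5,    # worst case: maximizes the required sample
) -> int:
    """Estimate how many devices to sample for a given margin of error."""
    # Cochran's formula for an effectively infinite population.
    n0 = (z_score ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Finite population correction: smaller fleets need fewer samples.
    n = n0 / (1 + (n0 - 1) / fleet_size)
    # Never ask for more devices than the fleet contains.
    return min(fleet_size, math.ceil(n))


if __name__ == "__main__":
    for size in (10_000, 100_000, 1_000_000):
        print(size, required_sample_size(size))
```

Under these assumptions, a fleet of one million devices needs a sample of roughly 2,400 devices, which is why the cost savings grow with fleet size.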
Memfault strongly recommends using Normalized Charts on Dashboard and Metric Charts with Fleet Sampling so that trends across your fleet are interpreted correctly. Normalized Charts are enabled by default for Fleet Sampling projects.
Fleet Sampling Aspects
Each device in the fleet can be configured by turning its individual Fleet Sampling Aspects on or off. The tables below show, per platform, what each aspect collects in each state:
MCU
Aspect | Resolution: Off | Resolution: On |
Monitoring | Only periodic device check-ins | Periodic collection of Metrics with Heartbeats. |
Debugging | No crash/debug data collection | Coredumps, Trace Events, reboot reasons and Custom Data Recordings are collected. |
Sessions | No session report collection. | Collection of Metrics with Session Reports. |
Linux
Aspect | Resolution: Off | Resolution: On |
Monitoring | Disable attributes and metrics | Attributes and metrics are sent. |
Debugging | Coredumps and reboot events will not be uploaded | |
Logging | No logs | All log files are uploaded. For details, see Log Collection. |
Sessions | No session report collection. | Collection of Metrics with Session Reports. |
Android
Aspect | Resolution: Off | Resolution: On |
Monitoring | Only periodic device check-ins | Batterystats metrics and Metrics are collected every 1h. |
Debugging | No crash/log collection | Crash reports (via Bug Reports or Caliper), reboot reasons, High-Resolution Telemetry (including high-resolution battery metrics), and Custom Events are collected. Log collection and upload in the case of an overlap with a Trace Event (see details here). |
Logging | No logs, except if they are part of a crash report | All log files are uploaded (see details here). |
Sessions | No session report collection. | Collection of Metrics with Session Reports. |
On Android and Linux, data is stored on the device when the relevant resolution is "Off" (up to configured storage limits). When a resolution is turned to "On" then any previously stored data for that resolution is uploaded - e.g. when the Logging resolution is enabled then all previously stored logs will be uploaded.
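The store-then-upload behavior described above can be modelled with a small bounded buffer. This is a minimal sketch and not SDK code: the class `AspectBuffer` and its methods are hypothetical, and dropping the oldest entry once the storage limit is reached is an assumption, since the text only says that data is kept up to the configured limits.

```python
from collections import deque
from typing import Deque, List


class AspectBuffer:
    """Hypothetical model of on-device storage for one aspect."""

    def __init__(self, max_items: int) -> None:
        # Assumption: once the limit is reached, the oldest entry is dropped.
        self._stored: Deque[str] = deque(maxlen=max_items)
        self.enabled = False
        self.uploaded: List[str] = []  # stand-in for data sent to the server

    def record(self, item: str) -> None:
        """Upload immediately when On, otherwise keep the item on the device."""
        if self.enabled:
            self.uploaded.append(item)
        else:
            self._stored.append(item)

    def set_resolution(self, on: bool) -> None:
        """Switch the resolution; turning it On flushes the stored backlog."""
        self.enabled = on
        if on:
            while self._stored:
                self.uploaded.append(self._stored.popleft())


if __name__ == "__main__":
    logs = AspectBuffer(max_items=2)
    for line in ("boot", "wifi up", "wifi down"):
        logs.record(line)
    logs.set_resolution(True)
    print(logs.uploaded)
```

With a limit of two items, only the two most recent entries survive until the resolution is turned On.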
Please also see the Rate Limits page, as collection of these items is subject to rate limiting.
Your devices won't be "in the dark" even if all aspect resolutions are set to Off: they will still contact Memfault to check for and download OTA updates, be considered active devices (which also contributes to the Memfault dashboard), and remain visible on the Devices page along with their software versions and last check-in times.
Managing aspect resolutions of your devices
It's helpful to have the best visibility on devices that tend to be problematic in the field (frequent customer complaints/tickets) so they can be investigated further, and on devices used during the development/testing phases so that crashes and performance issues can be detected ahead of time.
To set an aspect resolution of an individual device, navigate to the corresponding Device Details page, select Fleet Sampling Resolutions and click on the edit icon next to the resolution value.
Devices poll for these changes periodically (by default every 2h) and report their state back once the configuration is applied. The config state can have the following values:
Config state | Meaning |
Never reported | Device hasn't reported any config state yet and is most probably a "pending" device |
Outdated | Device hasn't contacted Memfault since the config was changed |
Synced | Device is using the configuration as seen on the Device Details page |
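The three config states can be read as a small decision rule. The sketch below is a simplified model: the names `DeviceConfigStatus` and `config_state` are hypothetical, and it assumes the state can be derived from just two timestamps, which may not match how Memfault tracks it internally.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class ConfigState(Enum):
    NEVER_REPORTED = "Never reported"
    OUTDATED = "Outdated"
    SYNCED = "Synced"


@dataclass
class DeviceConfigStatus:
    """Hypothetical record of what is known about one device's config."""
    config_changed_at: datetime
    # None means the device has not reported any config state yet.
    last_reported_at: Optional[datetime] = None


def config_state(status: DeviceConfigStatus) -> ConfigState:
    """Derive the config state from the two timestamps."""
    if status.last_reported_at is None:
        return ConfigState.NEVER_REPORTED
    if status.last_reported_at < status.config_changed_at:
        # The last report predates the change, so the device has not picked it up.
        return ConfigState.OUTDATED
    return ConfigState.SYNCED


if __name__ == "__main__":
    changed = datetime(2024, 1, 1, 12, 0)
    print(config_state(DeviceConfigStatus(changed, datetime(2024, 1, 1, 14, 0))))
```

With the default 2h polling interval, a device would typically stay Outdated for up to about two hours after a change before it reports back.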
When using Developer Mode, an immediate configuration update can be requested after changing the fleet sampling configuration, instead of waiting for the device's next scheduled configuration poll.
The Android SDK will report metrics showing the current fleet sampling configuration on each device. See Built-in Metrics for more information.
Quotas
Quota and usage information can be accessed under Settings → Quotas.
The number of devices for which Fleet Sampling can be turned on concurrently is limited. The exact limit varies by project and can even differ per Fleet Sampling aspect.
If the aspect resolution quota configured for the project is reached, an error message will be shown at the top of the page when changing the sampling resolution of a device to On.
To free up quota, devices with the relevant aspect resolution On can be filtered under Device Search and their resolution can be set to Off in bulk, as explained in the Setting aspect resolution(s) of multiple devices section.
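The quota rule in this section can be sketched as a per-aspect check performed before a device is switched to On. Everything here is a hypothetical illustration: the names `QuotaExceededError` and `turn_on`, and the idea of tracking enabled devices in an in-memory dictionary, are assumptions and not part of any Memfault API.

```python
from typing import Dict, Set


class QuotaExceededError(Exception):
    """Raised when no quota is left for the requested aspect."""


def turn_on(
    aspect: str,
    device_id: str,
    enabled: Dict[str, Set[str]],
    quotas: Dict[str, int],
) -> None:
    """Turn an aspect On for a device, enforcing the per-aspect quota."""
    devices = enabled.setdefault(aspect, set())
    if device_id in devices:
        return  # already On, so no extra quota is consumed
    if len(devices) >= quotas.get(aspect, 0):
        raise QuotaExceededError(f"quota reached for aspect {aspect!r}")
    devices.add(device_id)


def turn_off(aspect: str, device_id: str, enabled: Dict[str, Set[str]]) -> None:
    """Turn an aspect Off for a device, which frees one unit of quota."""
    enabled.get(aspect, set()).discard(device_id)


if __name__ == "__main__":
    enabled: Dict[str, Set[str]] = {}
    quotas = {"logging": 1}
    turn_on("logging", "dev-1", enabled, quotas)
    turn_off("logging", "dev-1", enabled)
    turn_on("logging", "dev-2", enabled, quotas)
    print(enabled)
```

Because the quota is tracked per aspect, running out of Logging quota in this model does not prevent turning Monitoring or Debugging On for the same device.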
Setting aspect resolution(s) of multiple devices
Memfault's Device Search allows you to precisely describe a population of devices before assigning their respective sampling resolutions. Using the search parameters, you can define the specific populations of the fleet where the most visibility is needed (e.g. devices that experienced fast battery discharges or connectivity issues in the past, or devices in a specific Cohort) and update their sampling resolutions all at once in bulk. Such visibility is also important before rolling out new software versions, so that potential negative effects of the rollout can be monitored proactively.
If quota limits are hit when assigning sampling resolutions in bulk, a warning message will be presented to the user (see the screenshot below). You should free up the required quota before continuing with the assignment. This can be done by:
- Updating your plan by emailing sales@memfault.com
- Turning the respective resolutions of some other devices to Off by using the same mechanism as a prior step.
Another option for predictably assigning sampling resolutions is to "limit" the number of devices that will be affected by the bulk operation: the assignment is applied only to "the first N devices" that match the search criteria, in the current sorting order. In the example below, the quota for logging is limited to 10 but 80 devices match the search query software_version = 1.0.0:
Using the limit option, the change will be applied to the 5 most recently seen devices with the software version 1.0.0.
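The "first N devices" rule amounts to filter, sort, then truncate. The following sketch mirrors the example with 80 matching devices and a limit of 5. The `Device` fields and the function name `select_for_bulk_update` are hypothetical and only stand in for the search query and sort order chosen in the UI.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    """Hypothetical, minimal view of a device in the search results."""
    serial: str
    software_version: str
    last_seen: int  # Unix timestamp of the last contact


def select_for_bulk_update(
    devices: List[Device], software_version: str, limit: int
) -> List[Device]:
    """Return the `limit` most recently seen devices matching the query."""
    matching = [d for d in devices if d.software_version == software_version]
    # Sort order assumed here: most recently seen first.
    matching.sort(key=lambda d: d.last_seen, reverse=True)
    return matching[:limit]


if __name__ == "__main__":
    fleet = [Device(f"A{i}", "1.0.0", 1000 + i) for i in range(80)]
    picked = select_for_bulk_update(fleet, "1.0.0", limit=5)
    print([d.serial for d in picked])
```

Since the result depends on `last_seen` at the moment the function runs, calling it again later can return different devices, which is the same effect the text describes for bulk operations.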
The list of devices to be updated with new sampling resolutions is only materialized once the request is received, after the "Start Bulk Operation" button is clicked. That means the assignment is performed against the search query in use (or against the whole fleet if there is no search query), and the numbers shown should be taken as an estimate.
In the context of the screenshot above, this means that the devices to be updated may differ from those displayed in the search results, since new devices may have contacted Memfault in the meantime and have more recent last seen information.
If devices to be updated with new sampling resolutions need to be precisely selected, please select them explicitly via the checkboxes in the search result before performing the action.
As this operation can take a long time, the result will be communicated via email to the user who initiated the action. The email contains a summary of the changes and how many devices are affected by the change.
Default Fleet Sampling configuration of new devices
Memfault automatically assigns resolutions to newly enrolled devices, as long as the quota limits permit. When a device contacts Memfault for the first time, the default configuration propagated to it is:
- Monitoring: On
- Sessions: On
- Debugging: On
- Logging: Off (as this is only offered for a small set of devices)
If quota limits are reached, any aspect with no remaining quota will be set to the Off resolution. See the Setting aspect resolution(s) of multiple devices section to free up quota before a mass roll-out.
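The default assignment with its quota fallback can be sketched as follows. The default values come from the list above. The function name `assign_defaults` and the `remaining_quota` bookkeeping are assumptions made for illustration only.

```python
from typing import Dict

# Defaults for a newly enrolled device, as listed above.
DEFAULT_RESOLUTIONS: Dict[str, bool] = {
    "monitoring": True,
    "sessions": True,
    "debugging": True,
    "logging": False,
}


def assign_defaults(remaining_quota: Dict[str, int]) -> Dict[str, bool]:
    """Assign default resolutions, falling back to Off where quota is exhausted.

    Note: `remaining_quota` is updated in place for every aspect turned On.
    """
    assigned: Dict[str, bool] = {}
    for aspect, default_on in DEFAULT_RESOLUTIONS.items():
        has_quota = remaining_quota.get(aspect, 0) > 0
        assigned[aspect] = default_on and has_quota
        if assigned[aspect]:
            remaining_quota[aspect] -= 1
    return assigned


if __name__ == "__main__":
    quota = {"monitoring": 1, "sessions": 0, "debugging": 3, "logging": 5}
    print(assign_defaults(quota))
```

In this example Sessions falls back to Off because its quota is exhausted, while Logging stays Off by default even though quota is available.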
Log Collection
The Log Collection feature is supported in the Android SDK, starting with version 4.2.0, and in the Linux SDK, starting with version 1.4.0.
The Log Collection feature allows you to retrieve past logs from devices when the Logging resolution is set to On, even when the devices have not been sending any crash reports or monitoring data. Upon receiving the change of the Logging resolution to On, the device will send all previously stored logs and, once caught up, it will continue sending new logs until the resolution is set to Off. (See Android Logging and Linux Logging for more details.)
Logs are kept on the device as long as the storage limit permits. On Linux, the limit is set in the configuration file. On Android, the limit is configured by Memfault. Please contact us for help updating it.
API
Using Memfault's REST API, the devices belonging to a project can be listed together with their aspect resolutions. However, there is currently no API to set the Fleet Sampling resolution programmatically. Instead, configure Device Attributes via API to mark devices you want to update and perform a bulk assignment searching for matching devices via the frontend as described above.
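As a rough illustration of the first step of this workaround, the sketch below picks the devices to mark from a device list that has already been fetched. The payload field names (`device_serial`, `cohort`, `resolutions`) and the value `"on"` are hypothetical placeholders, not the documented response schema, so check the REST API reference for the actual shape before adapting it. Fetching the list and writing the Device Attribute are deliberately left out.

```python
from typing import Any, Dict, List


def serials_to_mark(
    devices: List[Dict[str, Any]], aspect: str, cohort: str
) -> List[str]:
    """Serials in `cohort` whose resolution for `aspect` is not yet On.

    All field names used here are hypothetical placeholders.
    """
    return [
        device["device_serial"]
        for device in devices
        if device.get("cohort") == cohort
        and device.get("resolutions", {}).get(aspect) != "on"
    ]


if __name__ == "__main__":
    listed = [
        {"device_serial": "A1", "cohort": "beta", "resolutions": {"logging": "off"}},
        {"device_serial": "A2", "cohort": "beta", "resolutions": {"logging": "on"}},
        {"device_serial": "A3", "cohort": "default", "resolutions": {"logging": "off"}},
    ]
    print(serials_to_mark(listed, aspect="logging", cohort="beta"))
```

The returned serials are the devices that would receive a marker attribute via the API, after which a Device Search on that attribute drives the bulk assignment in the frontend.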