Linux Logging
Introduction
memfaultd integrates with [fluent-bit][fluent-bit-homepage] to collect logs from your systems and upload them to the Memfault platform. Memfault can also parse your logs directly on the system and generate metrics from them, even when the logs are not uploaded to Memfault.
Logs collected by memfaultd are shown on the device timeline.
Prerequisites
The memfaultd daemon, built with logging
Follow the integration guide to learn how to install memfaultd on your device.
The logging feature is enabled by default in the meta-memfault layer as of v1.4.0 of the Memfault SDK.
Read more on [how to configure which features memfaultd builds with][docs-linux-control-features].
fluent-bit
The fluent-bit source code includes a Yocto recipe to compile and install fluent-bit.
We include the same recipe (with some minor changes for compatibility with recent versions of Yocto) and a sample configuration file in our meta-memfault-example layer.
Flow of logs
Fluent-bit collects logs from various sources on the system, encodes them in fluent-bit's internal representation, and forwards them to memfaultd via a local TCP connection on port 5170. Although it's possible to configure fluent-bit to buffer to disk, we do not recommend enabling this option because logs would then be written to disk twice.
Memfaultd writes all logs to disk in the logs subdirectory of the global memfaultd temporary directory. Writes are buffered for performance reasons.
Memfaultd always maintains a single log file for all the log messages provided by fluent-bit. When the file reaches a specific size or age, it is moved to the Memfault upload staging area (also called MAR staging), where it will be uploaded during the next synchronization.
Memfault compresses logs on disk using the ZLib Deflate algorithm. Multiple protections are in place to ensure that logs cannot fill the disk, including rate limiting (in lines per minute) and size limits.
Configuring fluent-bit
Fluent-bit provides a rich set of input and filter plugins to control precisely what gets collected. Our default configuration for fluent-bit sets up collection of kernel and systemd log messages.
memfaultd does not generate the fluent-bit configuration file. It only expects a connection from fluent-bit on the default fluent-bit TCP port, 5170.
Fluent-bit messages are expected to be delivered in the fluent-bit native msgpack format.
This is the required output configuration:
[OUTPUT]
    Name tcp
    Host 127.0.0.1
    Port 5170
    Format msgpack
    Match *
    net.keepalive on
    net.keepalive_idle_timeout 10
    # Default retry limit is 1. We recommend setting to a higher value to
    # decrease the chance of losing logs in the event that memfaultd is
    # (re)starting while fluent-bit is attempting to flush logs:
    Retry_Limit 5
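When fluent-bit reports connection errors, it can help to confirm that something is accepting TCP connections on the configured address. The following is a minimal sketch, assuming the 127.0.0.1:5170 address from the output configuration above; the helper name `is_listening` is hypothetical and not part of memfaultd or fluent-bit.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 5170, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # The connection is closed immediately; no log data is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "accepting" if is_listening() else "not accepting"
    print(f"127.0.0.1:5170 is {state} connections")
```

A successful probe only shows that a listener exists on that port. It does not verify that the listener is memfaultd or that msgpack records are being parsed.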
Relevant /etc/memfaultd.conf settings
You can adjust the logging behavior of memfaultd using the following configuration options.
See a full configuration reference here.
fluent-bit.bind_address
Changes the address and port on which memfaultd listens for fluent-bit connections.
fluent-bit.max_buffered_lines
In most cases, memfaultd writes new log lines to disk immediately. Some buffering is required while rotating log files. This setting controls how many lines may be buffered before back pressure is applied to fluent-bit. The default is safe for most use cases.
fluent-bit.max_connections
This limits the number of open connections with fluent-bit. Fluent-bit will typically open one connection for each input plugin. Connection keep-alive is optional, but we recommend turning it on.
The default is set to 4. Increase this if you have more input plugins.
fluent-bit.extra_fluentd_attributes
To reduce the size of the log files, memfaultd will only save the keys "MESSAGE", "_PID", "_SYSTEMD_UNIT" and "PRIORITY" by default.
If your fluent-bit sources generate more keys that you need to save, add them to this list. They will be visible and searchable in the Memfault dashboard.
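The effect of this setting can be pictured as a key allow-list applied to each record. The sketch below is an illustration only, not memfaultd's implementation: the `filter_record` helper and the sample record (including the `_HOSTNAME` and `_TRANSPORT` keys) are assumptions made for the example.

```python
# Keys the text above says are saved by default.
DEFAULT_KEYS = {"MESSAGE", "_PID", "_SYSTEMD_UNIT", "PRIORITY"}


def filter_record(record: dict, extra_keys=()) -> dict:
    """Keep only the default keys plus any extra keys that were listed."""
    allowed = DEFAULT_KEYS | set(extra_keys)
    return {key: value for key, value in record.items() if key in allowed}


# Hypothetical record, as a journald-style source might produce it.
record = {
    "MESSAGE": "Started Session 3 of user root.",
    "_PID": "1",
    "_SYSTEMD_UNIT": "init.scope",
    "PRIORITY": "6",
    "_HOSTNAME": "device-01",
    "_TRANSPORT": "journal",
}

print(filter_record(record))                            # default keys only
print(filter_record(record, extra_keys=["_HOSTNAME"]))  # also keeps _HOSTNAME
```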
logs.compression_level
Log files are compressed using the Deflate algorithm before writing to disk. This setting controls which compression level to apply.
- 0: No compression
- 1: Fastest compression
- 9: Best compression
The default is 1. After in-house testing, Memfault believes this is the best trade-off between CPU cost and space savings for most use cases.
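To get a feel for what the levels mean on your own data, you can compare them with Python's standard `zlib` module, which implements Deflate inside a zlib wrapper. This is a rough sketch using synthetic, repetitive log lines invented for the example. Real logs will compress differently, and the sketch says nothing about the CPU cost on your target hardware.

```python
import zlib

# Synthetic log lines (hypothetical content, for illustration only).
raw = "".join(
    f"INFO - [{i % 8}]: Running activity Fibo{i % 40}\n" for i in range(5000)
).encode()

for level in (0, 1, 9):
    size = len(zlib.compress(raw, level))
    print(f"level {level}: {size} bytes ({size / len(raw):.1%} of original)")
```

Running the same loop over a sample of logs captured from your device gives a better estimate than synthetic data.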
logs.max_lines_per_minute
This setting controls how many lines per minute can be saved before new logs are discarded.
When logging resumes, memfaultd will print a message indicating how many lines were skipped:
Memfaultd rate limited 42 messages.
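The behavior described above can be sketched as a simple fixed-window limiter. This is only an illustration under assumptions: memfaultd's actual algorithm is not described here and may differ (for example, it might use a sliding window or a token bucket). The class name `LineRateLimiter` is hypothetical.

```python
class LineRateLimiter:
    """Fixed-window limiter: at most max_lines per 60-second window."""

    def __init__(self, max_lines_per_minute: int):
        self.max_lines = max_lines_per_minute
        self.window_start = None
        self.accepted = 0
        self.skipped = 0

    def offer(self, line: str, now: float) -> list:
        """Return the lines to write for this input (possibly none)."""
        out = []
        if self.window_start is None or now - self.window_start >= 60:
            # New window: report anything dropped during the previous one.
            if self.skipped:
                out.append(f"Memfaultd rate limited {self.skipped} messages.")
            self.window_start, self.accepted, self.skipped = now, 0, 0
        if self.accepted < self.max_lines:
            self.accepted += 1
            out.append(line)
        else:
            self.skipped += 1
        return out


limiter = LineRateLimiter(max_lines_per_minute=2)
written = []
# (timestamp in seconds, message)
for t, msg in [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (61, "e")]:
    written += limiter.offer(msg, now=t)
print(written)
```

In this run, "c" and "d" exceed the limit of two lines and are dropped. The summary line is emitted when the next window starts, just before "e" is written.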
logs.rotate_after_seconds
Regardless of size, log files are rotated when they reach a certain age.
logs.rotate_size_kib
Rotate the current log file when it reaches this size. After rotation, it will remain in the MAR staging area until the next upload (see general configuration).
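Taken together, the two rotation settings act as an either/or condition: a file is rotated when it is large enough or old enough. A minimal sketch of that decision follows, with a hypothetical helper name and example limits that are not memfaultd's defaults.

```python
def should_rotate(size_bytes: int, age_seconds: float,
                  rotate_size_kib: int, rotate_after_seconds: int) -> bool:
    """Rotate when either the size limit or the age limit is reached."""
    too_big = size_bytes >= rotate_size_kib * 1024
    too_old = age_seconds >= rotate_after_seconds
    return too_big or too_old


# Example limits chosen for illustration only.
LIMITS = {"rotate_size_kib": 10240, "rotate_after_seconds": 3600}

print(should_rotate(size_bytes=512, age_seconds=4000, **LIMITS))          # old but small
print(should_rotate(size_bytes=11 * 1024 * 1024, age_seconds=5, **LIMITS))  # young but large
print(should_rotate(size_bytes=512, age_seconds=5, **LIMITS))             # neither
```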
Recommended configuration
We recommend starting with our example configuration file. If some applications are too verbose, you can use one of the fluent-bit filter plugins to limit the amount of logs collected.
Filtering out specific messages
To filter out specific messages, you can use the fluent-bit grep plugin.
# Exclude all messages containing the string "Connection timeout. Will retry."
[FILTER]
    name grep
    match *
    exclude MESSAGE Connection timeout. Will retry.
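The grep plugin treats the value after the key as a regular expression, so each unescaped `.` in the rule above matches any character, not only a literal period. The sketch below uses Python's `re` module to illustrate the difference. Fluent-bit uses a different regex engine, so treat this as an approximation of the matching semantics. The sample messages are invented.

```python
import re

messages = [
    "Connection timeout. Will retry.",
    "wifi: Connection timeout. Will retry. (attempt 3)",
    "Connection timeout! Will retry?",
    "Connection established.",
]

loose = re.compile(r"Connection timeout. Will retry.")     # '.' matches any character
strict = re.compile(r"Connection timeout\. Will retry\.")  # literal periods only


def kept(pattern, lines):
    """Lines that survive an exclude rule: those the pattern does not match."""
    return [line for line in lines if not pattern.search(line)]


print(kept(loose, messages))
print(kept(strict, messages))
```

With the loose pattern, the third message is excluded as well. If that matters for your logs, escaping the periods in the filter rule is one way to tighten the match.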
Set enable_data_collection
By default, enable_data_collection is false. This makes it possible to ask end users for consent before collecting or transmitting any data to Memfault services.
Once the end user has given their consent, you can enable data collection like so:
$ memfaultctl enable-data-collection
To disable it:
$ memfaultctl disable-data-collection
The memfaultd service restarts automatically when either of those commands changes the current setting. If the requested value matches the current configuration, no restart occurs.
Take a look at the /etc/memfaultd.conf reference for more information.
Converting logs into metrics
Requires memfaultd v1.9.0 or later. Log-to-metrics is not enabled by default and must be turned on with the log-to-metrics feature flag.
In [memfaultd.bbappend][mf-example-enable-log-to-metrics]:
CARGO_FEATURES:append = " log-to-metrics"
This feature works even when collectd is not used. Fluent-bit is required as a source of logs.
You can use memfaultd to convert logs into metrics. Specific patterns in log messages will be captured directly on the device and efficiently transformed into metrics. Edge processing of logs is much more efficient than trying to upload all the logs and process them in the cloud.
This enables:
- Monitoring and alerting on Kernel and System logs,
- Monitoring and alerting on application logs when it's not convenient or easy to instrument the application with StatsD metrics directly,
- Monitoring for security events and reporting them.
This comes at a cost that should be considered in the context of your project:
- Turning this feature on adds about 1 MB of code (the regexp library) to memfaultd.
- Depending on how many rules you define and how many logs your system generates, there will also be a CPU impact that should be carefully evaluated.
We are here to help you understand and evaluate this feature.
To configure the conversion of logs into metrics, you define a list of rules.
All the rules are applied to each log line. The rules are always applied, even when the device is not uploading logs (due to its fleet sampling configuration) and when the device is not writing logs (due to rate limiting or size limits).
Each rule includes a set of filter key-values that must be matched for the rule to be applied. This can be used to filter on the log source (e.g. "_SYSTEMD_UNIT": "init.scope") or on the log level (e.g. "PRIORITY": "6"), which limits how many times the regular expression is evaluated.
Run memfaultd in verbose mode (memfaultd -V) to see the logs received in structured format, with all the key-values available for use in rules, as well as every rule that is applied and what it matched.
$ memfaultd -V
...
DEBUG - LogToMetrics: Processing log: {"MESSAGE": String("INFO - [3]: Running activity Fibo30"), "PRIORITY": String("6"), "_PID": String("4607"), "_SYSTEMD_UNIT": String("wefaultd.service")}
DEBUG - LogToMetrics Pattern 'Out of memory: Killed process \d+ \((.*)\)'=> MATCH=false Captures=None
Only one type of rule is currently supported, count_matching. This rule counts the number of times the regular expression matches the log line and reports the count as a metric. The name of the metric can be dynamic, built from text matched by the regular expression.
Refer to our configuration reference for the full syntax.
Example:
{
  // ...
  "logs": {
    // ...
    "log_to_metrics": {
      "rules": [
        {
          "type": "count_matching",
          "filter": {
            "_SYSTEMD_UNIT": "init.scope"
          },
          "pattern": "(.*): Scheduled restart job, restart counter is at",
          "metric_name": "systemd_restarts_$1"
        },
        {
          "type": "count_matching",
          "filter": {},
          "pattern": "Out of memory: Killed process \\d+ \\((.*)\\)",
          "metric_name": "oomkill_$1"
        }
      ]
    }
  }
}
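Before deploying rules, it can be useful to try the patterns against sample log lines off-device. The sketch below mimics the count_matching behavior described above using Python's `re` module. It is an approximation under assumptions: memfaultd uses its own regex library, whose syntax may differ from Python's in places, and any metric-name normalization it performs is not modeled. The `apply_rules` helper, the `myapp` names, and the sample records are hypothetical.

```python
import re
from collections import Counter

rules = [
    {
        "type": "count_matching",
        "filter": {"_SYSTEMD_UNIT": "init.scope"},
        "pattern": r"(.*): Scheduled restart job, restart counter is at",
        "metric_name": "systemd_restarts_$1",
    },
    {
        "type": "count_matching",
        "filter": {},
        "pattern": r"Out of memory: Killed process \d+ \((.*)\)",
        "metric_name": "oomkill_$1",
    },
]


def apply_rules(record: dict, rules: list, metrics: Counter) -> None:
    """Increment one counter per rule whose filter and pattern both match."""
    for rule in rules:
        # Skip the regex entirely unless every filter key-value matches.
        if any(record.get(k) != v for k, v in rule["filter"].items()):
            continue
        match = re.search(rule["pattern"], record.get("MESSAGE", ""))
        if match is None:
            continue
        # Replace $1, $2, ... with the corresponding capture group.
        name = re.sub(
            r"\$(\d+)",
            lambda m: match.group(int(m.group(1))) or "",
            rule["metric_name"],
        )
        metrics[name] += 1


metrics = Counter()
logs = [
    {"_SYSTEMD_UNIT": "init.scope",
     "MESSAGE": "myapp.service: Scheduled restart job, restart counter is at 3."},
    {"_SYSTEMD_UNIT": "init.scope",
     "MESSAGE": "myapp.service: Scheduled restart job, restart counter is at 4."},
    {"MESSAGE": "Out of memory: Killed process 4607 (myapp)"},
]
for record in logs:
    apply_rules(record, rules, metrics)
print(dict(metrics))
```

Note that the first rule is never evaluated against the third record because its filter does not match, which is the CPU-saving effect of filters mentioned earlier.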
Testing your integration
During development, you can use memfaultctl sync to force memfaultd to rotate the current log file and upload it.
The log will be visible in the Memfault dashboard as soon as it has been processed (usually a few seconds later).
Viewing Logs in the Web Application
To see detailed reports from a specific device, find the device in Fleet → Devices, and then open its Timeline tab.