Linux Logging
Introduction
memfaultd
captures logs from your device automatically using
journald.
When journald is not available, Fluent-bit can be used in its
place. Logs are automatically attached to coredumps captured by Memfault.
Continuous log capture can also be enabled on a per-device basis.
Logs are parsed directly on the device, so metrics can be generated from them even when the logs themselves are not uploaded to Memfault.
Logs collected by memfaultd
will be shown on the device
timeline.
Prerequisites
The memfaultd
daemon, built with logging
Follow the integration guide to learn how to install
memfaultd
on your device.
The logging feature is enabled by default in the meta-memfault
layer as of
v1.4.0
of the Memfault SDK.
Read more on how to configure which features memfaultd
builds
with.
Log Sources
The choice of which source memfaultd
should use for logs is configured via the
source
field in the logs
configuration.
journald
The systemd
feature must also be enabled to use the journald
option for
logs.source
.
The journald
log source config provides a way to capture logs from your device
and send them to Memfault via memfaultd
out-of-the-box.
When using the journald
logs source option, memfaultd
will read the logs
sent to systemd
's journal since the last boot (similar to the output of
journalctl --boot
). On shutdown or restart, memfaultd
will save the location
of its current cursor to disk and resume from there so as to avoid re-processing
logs.
If you are running your application or set of applications via systemd
,
nothing else will need to be done to see their logs along with logs from the
Linux system in Memfault.
fluent-bit
The fluent-bit
log source configuration can be used on systems without systemd,
or when log filtering is required before processing by memfaultd
.
The fluent-bit source code includes a Yocto recipe to
compile and install fluent-bit
.
We include the same recipe (with some minor changes for compatibility with
recent versions of Yocto) and a sample configuration file in our
meta-memfault-example
layer.
Configuring fluent-bit
Fluent-bit provides a rich set of input and filter plugins to control precisely what gets collected. Our default configuration for fluent-bit sets up collection of kernel and systemd log messages.
memfaultd
does not generate the fluent-bit
configuration file. It just
expects a connection from fluent-bit
on the default fluent-bit TCP port 5170.
Fluent-bit messages are expected to be delivered in the fluent-bit native
msgpack format.
This is the required output configuration:
[OUTPUT]
    Name tcp
    Host 127.0.0.1
    Port 5170
    Format msgpack
    Match *
    net.keepalive on
    net.keepalive_idle_timeout 10
    # Default retry limit is 1. We recommend setting to a higher value to
    # decrease the chance of losing logs in the event that memfaultd is
    # (re)starting while fluent-bit is attempting to flush logs:
    Retry_Limit 5
Fluent-bit /etc/memfaultd.conf
settings
You can adjust the behavior of memfaultd
when it comes to logging using the
following configuration options.
See a full configuration reference here.
fluent-bit.bind_address
Change the listening address and port of the fluent-bit connector.
fluent-bit.max_buffered_lines
In most cases, memfaultd
writes new log lines to disk immediately. Some
buffering is required while rotating log files. This controls how many lines may
be buffered before back pressure is applied to fluent-bit. The default will be
safe for most use-cases.
fluent-bit.max_connections
This limits the number of open connections with fluent-bit. Fluent-bit will typically open one connection for each input plugin. Connection keep-alive is optional but we recommend turning it on.
The default is set to 4. Increase this if you have more input plugins.
fluent-bit.extra_fluentd_attributes
To reduce the size of the log files, memfaultd
will only save the keys
"MESSAGE", "_PID", "_SYSTEMD_UNIT" and "PRIORITY" by default.
If your fluent-bit sources generate additional keys that you need to save, add them to this list. They will be visible and searchable in the Memfault dashboard.
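To illustrate the effect of this setting, the following Python sketch filters a record in the way the description above suggests. It is not memfaultd's implementation: the function name keep_attributes and the sample record are hypothetical, and _HOSTNAME and _TRANSPORT are used only as examples of additional journald-style keys.

```python
# Keys that the text above says are saved by default.
DEFAULT_KEYS = ("MESSAGE", "_PID", "_SYSTEMD_UNIT", "PRIORITY")


def keep_attributes(record: dict, extra_keys=()) -> dict:
    """Keep only the default keys plus any extra keys listed in the config."""
    allowed = set(DEFAULT_KEYS) | set(extra_keys)
    return {key: value for key, value in record.items() if key in allowed}


# Hypothetical record, shaped like the structured logs memfaultd receives.
record = {
    "MESSAGE": "Connection timeout. Will retry.",
    "_PID": "4607",
    "_SYSTEMD_UNIT": "init.scope",
    "PRIORITY": "6",
    "_HOSTNAME": "device-01",   # kept only because it is listed as an extra key
    "_TRANSPORT": "journal",    # not listed, so it is dropped
}

saved = keep_attributes(record, extra_keys=["_HOSTNAME"])
print(sorted(saved))
```

With extra_keys left empty, only the four default keys would remain.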
Recommended Fluent-bit configuration
We recommend starting with our example configuration file. If some applications are too verbose, you can use one of the fluent-bit filter plugins to limit the amount of logs collected.
Filtering out specific messages
To filter out specific messages, you can use the fluent-bit grep plugin.
# Exclude all messages containing the string "Connection timeout. Will retry."
[FILTER]
    name grep
    match *
    exclude MESSAGE Connection timeout. Will retry.
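Note that the grep plugin treats the text after the key as a regular expression, so each "." in the example above matches any character, not only a literal period. The Python sketch below mimics that behavior to show which messages would be dropped. Python's re module is only a stand-in for fluent-bit's regular expression engine (the two differ in some syntax), and passes_filter is a hypothetical helper.

```python
import re

# Pattern copied from the filter above. Each "." matches any single character.
EXCLUDE_PATTERN = re.compile(r"Connection timeout. Will retry.")


def passes_filter(record: dict) -> bool:
    """Return False for records whose MESSAGE matches the exclude pattern."""
    return EXCLUDE_PATTERN.search(record.get("MESSAGE", "")) is None


for message in (
    "wifi: Connection timeout. Will retry. (attempt 3)",  # dropped
    "Connection timeout! Will retry?",                    # also dropped: "." is a wildcard
    "Connection established",                             # kept
):
    print(passes_filter({"MESSAGE": message}), message)
```

To match literal periods only, escape them in the pattern (for example `Connection timeout\. Will retry\.`).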
Flow of logs
Fluent-bit collects logs from various sources on the system, encodes them in
fluent-bit's internal representation and forwards
them to memfaultd
via a local TCP connection on port 5170. Although it's
possible to configure fluent-bit to buffer to disk, we do not recommend enabling it.
Memfaultd writes all logs to disk in the logs
subdirectory in the global
memfaultd
temporary directory. Writes are buffered for performance reasons.
Memfaultd always maintains a single log file for all the log messages. When the
file reaches a specific size or
age, it is moved to the Memfault upload staging
area (also called MAR staging) where they will be uploaded during the next
synchronization.
You can suppress all writes to disks with the logs.storage
option. When set to
disable
, Memfaultd will not write any logs to disk but it will continue to
process them to generate metrics and maintain a small buffer of recent messages
to attach to coredumps.
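The "small buffer of recent messages" can be pictured as a fixed-size ring buffer. The Python sketch below shows the idea only; the class name RecentLogs and the capacity of 3 are hypothetical, and memfaultd's actual buffer size and data structure are not described here.

```python
from collections import deque


class RecentLogs:
    """Keep only the most recent `capacity` log lines in memory."""

    def __init__(self, capacity: int):
        # A deque with maxlen silently discards the oldest entry when full.
        self._lines = deque(maxlen=capacity)

    def push(self, line: str) -> None:
        self._lines.append(line)

    def snapshot(self) -> list:
        """Return the buffered lines, oldest first (e.g. to attach to a coredump)."""
        return list(self._lines)


recent = RecentLogs(capacity=3)
for i in range(5):
    recent.push(f"log line {i}")
print(recent.snapshot())
```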
Memfault compresses logs on disk using the ZLib Deflate algorithm. Multiple protections are in place to ensure that logs cannot fill the disk, including rate limiting (in lines per minute) and size limits.
Important Log Configs
logs.compression_level
Log files are compressed using the Deflate algorithm before writing to disk. This setting controls which compression level to apply.
- 0: No compression.
- 1: Fastest compression
- 9: Best compression
The default is 1. After in-house testing, Memfault believes this is the best compromise of CPU-cost to space saving for most use-cases.
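To get a feel for the trade-off on your own log data, you can compare Deflate levels offline. The Python sketch below uses the standard zlib module on a hypothetical, highly repetitive sample; real logs will compress differently, and the sketch measures size only, not CPU cost on the target device.

```python
import zlib

# Hypothetical sample: repetitive log lines, similar to what a device might emit.
sample = "".join(
    f"INFO - [{i % 8}]: Running activity Fibo30\n" for i in range(2000)
).encode("utf-8")


def deflate_sizes(data: bytes, levels=(0, 1, 9)) -> dict:
    """Return the compressed size in bytes for each zlib compression level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


sizes = deflate_sizes(sample)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes ({size / len(sample):.1%} of original)")
```

Level 0 stores the data uncompressed with a small framing overhead, so its output is slightly larger than the input.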
logs.max_lines_per_minute
This setting controls how many lines per minute can be saved before discarding new logs.
When logging resumes, memfaultd
will print a message indicating how many lines
were skipped:
Memfaultd rate limited 42 messages.
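The behavior described above can be sketched as a simple limiter that counts lines in one-minute windows. This is an assumed algorithm for illustration, not memfaultd's implementation; the class LineRateLimiter and its accept method are hypothetical, and only the format of the notice is taken from the message above.

```python
class LineRateLimiter:
    """Fixed one-minute window limiter (an assumed algorithm, for illustration)."""

    def __init__(self, max_lines_per_minute: int):
        self.max_lines = max_lines_per_minute
        self.window_start = None  # timestamp (seconds) at which the window began
        self.count = 0            # lines kept in the current window
        self.skipped = 0          # lines dropped since the last kept line

    def accept(self, now: float):
        """Return (keep, notice); notice is set on the first line kept after drops."""
        if self.window_start is None or now - self.window_start >= 60.0:
            self.window_start = now
            self.count = 0
        if self.count >= self.max_lines:
            self.skipped += 1
            return False, None
        self.count += 1
        notice = None
        if self.skipped:
            notice = f"Memfaultd rate limited {self.skipped} messages."
            self.skipped = 0
        return True, notice


limiter = LineRateLimiter(max_lines_per_minute=2)
results = [limiter.accept(t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
for keep, notice in results:
    print(keep, notice)
```

In this run the third and fourth lines are dropped, and the line arriving at 61 seconds is kept together with a notice reporting two skipped messages.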
logs.rotate_after_seconds
Regardless of size, log files are rotated when they reach a certain age.
logs.rotate_size_kib
Rotate the current log file when it reaches this size. After rotation, it will remain in the MAR staging area until the next upload (see general configuration).
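The two rotation settings combine as an either/or condition: the file is rotated as soon as it is too large or too old. The Python sketch below expresses that rule; the class RotationPolicy is hypothetical, and the values 10240 KiB and 3600 seconds are illustrative only, not documented defaults.

```python
from dataclasses import dataclass


@dataclass
class RotationPolicy:
    rotate_size_kib: int       # mirrors logs.rotate_size_kib
    rotate_after_seconds: int  # mirrors logs.rotate_after_seconds

    def should_rotate(self, size_bytes: int, age_seconds: float) -> bool:
        """Rotate when either the size limit or the age limit is reached."""
        too_big = size_bytes >= self.rotate_size_kib * 1024
        too_old = age_seconds >= self.rotate_after_seconds
        return too_big or too_old


# Illustrative values only.
policy = RotationPolicy(rotate_size_kib=10240, rotate_after_seconds=3600)
print(policy.should_rotate(size_bytes=1024, age_seconds=10))          # small and young
print(policy.should_rotate(size_bytes=10240 * 1024, age_seconds=10))  # size limit reached
print(policy.should_rotate(size_bytes=1024, age_seconds=3600))        # age limit reached
```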
logs.source
Whether memfaultd
should use journald
or fluent-bit
as a source of logs.
Enabling data collection
By default, enable_data_collection
is false
. This is to enable asking end
users for consent before collecting or transmitting any data to Memfault
services.
Once the end user has given their consent, you can enable data collection like so:
$ memfaultctl enable-data-collection
To disable it:
$ memfaultctl disable-data-collection
The memfaultd
service restarts automatically when you run either of
those commands with a value different from the current configuration.
Take a look at the /etc/memfaultd.conf
reference for
more information.
Converting logs into metrics
This feature is available in memfaultd
v1.9.0 and later.
Log-to-metrics is not enabled by default and must be turned on with the
log-to-metrics
feature flag.
In memfaultd.bbappend:
CARGO_FEATURES:append = " log-to-metrics"
This feature works even when collectd is not used. Fluent-bit is required as a source of logs.
You can use memfaultd
to convert logs into metrics. Specific patterns in log
messages will be captured directly on the device and efficiently transformed
into metrics. Edge processing of logs is much more efficient than trying to
upload all the logs and process them in the cloud.
This enables:
- Monitoring and alerting on Kernel and System logs,
- Monitoring and alerting on application logs when it's not convenient or easy to instrument the application with StatsD metrics directly,
- Monitoring for security events and reporting them.
On-the-edge conversion of logs to metrics is a powerful feature, but it comes at a cost that should be considered in the context of your project:
- Turning this feature on will add about 1MB of code (the regexp library) to memfaultd.
- Depending on how many rules you define and how many logs are generated by your system, there will also be a CPU impact that should be carefully evaluated.
We are here to help you understand and evaluate this feature.
To configure the conversion of logs into metrics, you define a list of rules.
All the rules will be applied against each log line. The rules will always be applied, even when the device is not uploading logs (due to its fleet sampling configuration) and when the device is not writing logs (due to rate limiting or size limits).
Each rule includes a set of filter
key-values that must be matched for the
rule to be applied. This can be used to filter on the log source (e.g.
"_SYSTEMD_UNIT": "init.scope"
) or to filter on the log level (e.g.
"PRIORITY": "6"
) and limit how many times the regular expression will be
evaluated.
You can run memfaultd
in verbose mode (memfaultd -V)
to see the logs received
in structured format with all the key-value pairs available to be used in the rules,
as well as all the rules that are applied and what they matched.
$ memfaultd -V
...
DEBUG - LogToMetrics: Processing log: {"MESSAGE": String("INFO - [3]: Running activity Fibo30"), "PRIORITY": String("6"), "_PID": String("4607"), "_SYSTEMD_UNIT": String("wefaultd.service")}
DEBUG - LogToMetrics Pattern 'Out of memory: Killed process \d+ \((.*)\)'=> MATCH=false Captures=None
Only one type of rule is currently supported, count_matching
. This rule will
count the number of times the regular expression matches the log line and will
report the count as a metric. The name of the metric can be dynamic: text captured
by the regular expression can be substituted into it (for example via $1).
Refer to our configuration reference for the full syntax.
Example:
{
  // ...
  "logs": {
    // ...
    "log_to_metrics": {
      "rules": [
        {
          "type": "count_matching",
          "filter": {
            "_SYSTEMD_UNIT": "init.scope"
          },
          "pattern": "(.*): Scheduled restart job, restart counter is at",
          "metric_name": "systemd_restarts_$1"
        },
        {
          "type": "count_matching",
          "filter": {},
          "pattern": "Out of memory: Killed process \\d+ \\((.*)\\)",
          "metric_name": "oomkill_$1"
        }
      ]
    }
  }
}
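Before deploying rules to a device, it can help to try patterns against sample log records offline. The Python sketch below applies the two rules from the example above to a few hypothetical records. It is an approximation under stated assumptions, not memfaultd's implementation: Python's re module stands in for memfaultd's regular expression library (syntax may differ), each rule is counted at most once per log line, and any metric name normalization memfaultd may perform is not reproduced. The helper names metric_name_for and count_metrics are hypothetical.

```python
import re
from collections import Counter

# The two rules from the example configuration, as Python dictionaries.
RULES = [
    {
        "type": "count_matching",
        "filter": {"_SYSTEMD_UNIT": "init.scope"},
        "pattern": r"(.*): Scheduled restart job, restart counter is at",
        "metric_name": "systemd_restarts_$1",
    },
    {
        "type": "count_matching",
        "filter": {},
        "pattern": r"Out of memory: Killed process \d+ \((.*)\)",
        "metric_name": "oomkill_$1",
    },
]


def metric_name_for(rule: dict, record: dict):
    """Return the metric name if the rule matches the record, else None."""
    # Every filter key must be present in the record with the same value.
    if any(record.get(key) != value for key, value in rule["filter"].items()):
        return None
    match = re.search(rule["pattern"], record.get("MESSAGE", ""))
    if match is None:
        return None
    # Replace $1, $2, ... with the corresponding capture groups.
    return re.sub(
        r"\$(\d+)",
        lambda ref: match.group(int(ref.group(1))) or "",
        rule["metric_name"],
    )


def count_metrics(records) -> Counter:
    """Apply every rule to every record and count matches per metric name."""
    counts = Counter()
    for record in records:
        for rule in RULES:
            name = metric_name_for(rule, record)
            if name is not None:
                counts[name] += 1
    return counts


# Hypothetical records, shaped like the structured logs shown in verbose mode.
records = [
    {"MESSAGE": "docker.service: Scheduled restart job, restart counter is at 1.",
     "_SYSTEMD_UNIT": "init.scope", "PRIORITY": "6"},
    {"MESSAGE": "docker.service: Scheduled restart job, restart counter is at 2.",
     "_SYSTEMD_UNIT": "init.scope", "PRIORITY": "6"},
    {"MESSAGE": "Out of memory: Killed process 4607 (myapp)", "PRIORITY": "3"},
    # Same text, but from another unit: excluded by the first rule's filter.
    {"MESSAGE": "docker.service: Scheduled restart job, restart counter is at 3.",
     "_SYSTEMD_UNIT": "myapp.service", "PRIORITY": "6"},
]

counts = count_metrics(records)
print(dict(counts))
```

The last record shows the role of filter: its message matches the first pattern, but because _SYSTEMD_UNIT differs, the regular expression is never evaluated for that rule.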
Testing your integration
During development, you can use memfaultctl sync
to force memfaultd
to
rotate the current logfile and upload it.
The log will be visible in the Memfault dashboard as soon as it has been processed (usually a few seconds later).
Viewing Logs in the Web Application
To see detailed reports from a specific device, find the device in Fleet → Devices, and then open its Timeline tab.