Logo

A powerful, easily deployable network traffic analysis tool suite

Quick Start

Documentation

Components

Supported Protocols

Configuring

Arkime

Dashboards

Hedgehog Linux

Contribution Guide

Forwarding Third-Party Logs to Malcolm

Malcolm uses OpenSearch and OpenSearch Dashboards for data storage and search and visualization, and Logstash for log processing. Because these tools are data agnostic, Malcolm can be configured to accept various host logs and other third-party logs sent from log forwaders such as Fluent Bit and Beats. Some examples of the types of logs these forwarders might send include:

The types of third-party logs and metrics discussed in this document are not the same as the network session metadata provided by Arkime, Zeek, and Suricata. Please refer to the Malcolm Contributor Guide for information on integrating a new network traffic analysis provider.

Configuring Malcolm

The environment variables in filebeat.env for configuring how Malcolm accepts external logs are prefixed with FILEBEAT_TCP_…. These values can be specified during Malcolm configuration (i.e., when running ./scripts/configure), as can be seen from the following excerpt from the Installation example:

…
Expose Logstash port to external hosts? (y/N): y
…
Expose Filebeat TCP port to external hosts? (y/N): y
1: json
2: raw
Select log format for messages sent to Filebeat TCP listener (json): 1

Source field to parse for messages sent to Filebeat TCP listener (message): message

Target field under which to store decoded JSON fields for messages sent to Filebeat TCP listener (miscbeat): miscbeat

Field to drop from events sent to Filebeat TCP listener (message): message

Tag to apply to messages sent to Filebeat TCP listener (_malcolm_beats): _malcolm_beats
…

The variables corresponding to these questions can be found in filebeat.env:

These variables’ values will depend on the forwarder and the format of the data it sends. Note that unless creating a custom Logstash pipeline, users probably want to choose the default _malcolm_beats for FILEBEAT_TCP_TAG in order for logs to be picked up and ingested through Malcolm’s beats pipeline.

Secure communication

In order to maintain the integrity and confidentiality of data, Malcolm’s default (set via the BEATS_SSL environment variable in beats-common.env) is to require connections from external forwarders to be encrypted using TLS. When ./scripts/auth_setup is run, self-signed certificates are generated which may be used by remote log forwarders. Located in the filebeat/certs/ directory, the certificate authority and client certificate and key files should be copied to the host on which the forwarder is running and used when defining its settings for connecting to Malcolm.

Fluent Bit

Fluent Bit is a fast and lightweight logging and metrics processor and forwarder that works well with Malcolm. It is well-documented, supports a number of platforms including Linux, Microsoft Windows, macOS (either built via source or installed with Homebrew) and more. It provides many data sources (inputs).

Convenience Script for Linux/macOS

fluent-bit-setup.sh is a Bash script to help install and configure Fluent Bit on Linux and macOS systems. After configuring Malcolm to accept and parse forwarded logs as described above, run fluent-bit-setup.sh as illustrated in the examples below:

Linux example:

$ ~/Malcolm/scripts/third-party-logs/fluent-bit-setup.sh 
0   ALL
1   InstallFluentBit
2   GetMalcolmConnInfo
3   GetFluentBitFormatInfo
4   CreateFluentbitService
Operation: 0
Install fluent-bit via GitHub/fluent install script [Y/n]? y
================================
 Fluent Bit Installation Script 
================================
This script requires superuser access to install packages.
You will be prompted for your password by sudo.
…
Installation completed. Happy Logging!

Choose input plugin and enter parameters. Leave parameters blank for defaults.
  see https://docs.fluentbit.io/manual/pipeline/inputs
1   collectd
2   cpu
3   disk
4   docker
5   docker_events
6   dummy
7   dummy_thread
8   exec
9   fluentbit_metrics
10  forward
11  head
12  health
13  http
14  kmsg
15  mem
16  mqtt
17  netif
18  nginx_metrics
19  node_exporter_metrics
20  opentelemetry
21  proc
22  prometheus_scrape
23  random
24  serial
25  statsd
26  stdin
27  syslog
28  systemd
29  tail
30  tcp
31  thermal
Input plugin: 2
cpu Interval_Sec:  10
cpu Interval_NSec:  
cpu PID:  
Enter Malcolm host or IP address (172.16.0.20): 172.16.0.20
Enter Malcolm Filebeat TCP port (5045): 5045
Enter agent hostname (hostname): hostname
Enter fluent-bit output format (json_lines): json_lines
Nest values under field: cpu
Add "module" value: cpu

/usr/local/bin/fluent-bit -R /etc/fluent-bit/parsers.conf -i cpu -p Interval_Sec=10 -o tcp://172.16.0.20:5045 -p tls=on -p tls.verify=off -p tls.ca_file=/home/user/Malcolm/filebeat/certs/ca.crt -p tls.crt_file=/home/user/Malcolm/filebeat/certs/client.crt -p tls.key_file=/home/user/Malcolm/filebeat/certs/client.key -p format=json_lines -F nest -p Operation=nest -p Nested_under=cpu -p WildCard='*' -m '*' -F record_modifier -p 'Record=module cpu' -m '*' -f 1

Configure service to run fluent-bit [y/N]? y
Enter .service file prefix: fluentbit_cpu
Configure systemd service as user "user" [Y/n]? y
[sudo] password for user: 
Created symlink /home/user/.config/systemd/user/default.target.wants/fluentbit_cpu.service → /home/user/.config/systemd/user/fluentbit_cpu.service.
● fluentbit_cpu.service
     Loaded: loaded (/home/user/.config/systemd/user/fluentbit_cpu.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-08-09 09:19:43 MDT; 5s ago
   Main PID: 105521 (fluent-bit)
      Tasks: 5 (limit: 76711)
     Memory: 24.7M
        CPU: 8ms
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/fluentbit_cpu.service
             └─105521 /usr/local/bin/fluent-bit -R /etc/fluent-bit/parsers.conf -i cpu -p Interval_Sec=10 -o tcp://172.16.0.20:5045 -p tls=on -p tls.verify=off -p tls.ca_fil…

Aug 09 09:19:43 localhost fluent-bit[105521]: Fluent Bit v1.9.6
…
Aug 09 09:19:43 localhost fluent-bit[105521]: [2022/08/09 09:19:43] [ info] [output:tcp:tcp.0] worker #0 started
Aug 09 09:19:43 localhost fluent-bit[105521]: [2022/08/09 09:19:43] [ info] [output:tcp:tcp.0] worker #1 started

macOS example:

$ bash fluent-bit-setup.sh 
0       ALL
1       InstallFluentBit
2       GetMalcolmConnInfo
3       GetFluentBitFormatInfo
4       CreateFluentbitService
Operation: 0
Install fluent-bit via Homebrew [Y/n]? y
==> Downloading https://ghcr.io/v2/homebrew/core/fluent-bit/manifests/1.9.6
…
Choose input plugin and enter parameters. Leave parameters blank for defaults.
  see https://docs.fluentbit.io/manual/pipeline/inputs
1       collectd
2       dummy
3       dummy_thread
4       exec
5       fluentbit_metrics
6       forward
7       head
8       health
9       http
10      mqtt
11      nginx_metrics
12      opentelemetry
13      prometheus_scrape
14      random
15      serial
16      statsd
17      stdin
18      syslog
19      tail
20      tcp
Input plugin: 14
random Samples:  10
random Interval_Sec:  30
random Internal_NSec:  
Enter Malcolm host or IP address (127.0.0.1): 172.16.0.20
Enter Malcolm Filebeat TCP port (5045): 5045
Enter agent hostname (hostname): hostname
Enter fluent-bit output format (json_lines): json_lines
Nest values under field: random
Add "module" value: random

/usr/local/bin/fluent-bit -R /usr/local/etc/fluent-bit/parsers.conf -i random -p Samples=10 -p Interval_Sec=30 -o tcp://172.16.0.20:5045 -p tls=on -p tls.verify=off -p tls.ca_file=/Users/user/forwarder/ca.crt -p tls.crt_file=/Users/user/forwarder/client.crt -p tls.key_file=/Users/user/forwarder/client.key -p format=json_lines -F nest -p Operation=nest -p Nested_under=random -p WildCard='*' -m '*' -F record_modifier -p 'Record=module random' -m '*' -f 1

Configure service to run fluent-bit [y/N]? n

Convenience Script for Windows

fluent-bit-setup.ps1 is a PowerShell script to help install and configure Fluent Bit on Microsoft Windows systems.

PS C:\work> .\fluent-bit-setup.ps1

Download fluent-bit
Would you like to download fluent-bit (zip) to C:\work?
[Y] Yes  [N] No  [?] Help (default is "Y"): y

Select input plugin (https://docs.fluentbit.io/manual/pipeline/inputs):
1. dummy
2. dummy_thread
3. fluentbit_metrics
4. forward
5. nginx_metrics
6. opentelemetry
7. prometheus_scrape
8. random
9. statsd
10. tail
11. tcp
12. windows_exporter_metrics
13. winevtlog
14. winlog
15. winstat
Make a selection: 13

Enter parameters for winevtlog. Leave parameters blank for defaults.
  see https://docs.fluentbit.io/manual/pipeline/inputs

winevtlog Channels: Application,Security,Setup,Windows PowerShell
winevtlog Interval_Sec:
winevtlog Interval_NSec:
winevtlog Read_Existing_Events:
winevtlog DB:
winevtlog String_Inserts:
winevtlog Render_Event_As_XML:
winevtlog Use_ANSI:
Enter Malcolm host or IP address: 172.16.0.20
Enter Malcolm Filebeat TCP port (5045): 5045
Enter agent hostname (hostname): hostname
Enter fluent-bit output format (json_lines): json_lines
Nest values under field (winevtlog): winevtlog
Add "module" value (winevtlog): winevtlog

C:\work\bin\fluent-bit.exe -c "C:\work\winevtlog_172.16.0.20_1660062217.cfg"

Install fluent-bit Service
Install Windows service for winevtlog to 172.16.0.20:5045?
[Y] Yes  [N] No  [?] Help (default is "N"): Y
Enter name for service: fluentbit_winevtlog
Enter account name to run service (DOMAIN\user): DOMAIN\user

Status   Name               DisplayName
------   ----               -----------
Stopped  fluentbit_winev... fluentbit_winevtlog

Start fluent-bit Service
Start Windows service for winevtlog to 172.16.0.20:5045?
[Y] Yes  [N] No  [?] Help (default is "Y"): y

Status   Name               DisplayName
------   ----               -----------
Running  fluentbit_winev... fluentbit_winevtlog

Beats

Elastic Beats can also be used to forward data to Malcolm’s Filebeat TCP listener. Follow the Get started with Beats documentation for configuring Beats on a host system.

In contrast to Fluent Bit, Beats forwarders write to Malcolm’s Logstash input over TCP port 5044 (rather than its Filebeat TCP input). Answer Y when prompted Expose Logstash port to external hosts? during Malcolm configuration (i.e., when running ./scripts/configure) to allow external remote Beats forwarders to send logs to Logstash.

The Beat’s configuration YML file file might look something like this sample filebeat.yml file:

filebeat.inputs:
- type: log
  paths:
    - /home/user/logs/*.log

processors:
  - add_tags:
      tags: [_malcolm_beats]

output.logstash:
  hosts: ["172.16.0.20:5044"]
  ssl.enabled: true
  ssl.certificate_authorities: ["/home/user/Malcolm/filebeat/certs/ca.crt"]
  ssl.certificate: "/home/user/Malcolm/filebeat/certs/client.crt"
  ssl.key: "/home/user/Malcolm/filebeat/certs/client.key"
  ssl.supported_protocols: "TLSv1.2"
  ssl.verification_mode: "none"

The important bits to note in this example are the settings under output.logstash (including the TLS-related files described above in Configuring Malcolm) and the _malcolm_beats value in tags: unless creating a custom Logstash pipeline, users probably want to use _malcolm_beats in order for logs to be picked up and ingested through Malcolm’s beats pipeline. This applies regardless of the specific Beats forwarder being used (e.g., Filebeat, Metricbeat, Winlogbeat, etc.).

Most Beats forwarders can use processors to filter, transform, and enhance data prior to sending it to Malcolm. Consult each forwarder’s documentation to learn more about what processors are available and how to configure them. Use the Console output for debugging and experimenting with how Beats forwarders format the logs they generate.

Data Format and Visualization

Because Malcolm could receive logs or metrics from virtually any provider, Malcolm most likely does not have prebuilt dashboards and visualizations for third-party logs. Luckily, OpenSearch Dashboards provides visualization tools that can be used with whatever data is stored in Malcolm’s OpenSearch document store. Here are some resources covering OpenSearch Dashboards and building custom visualizations:

Document Indices

Third-party logs ingested into Malcolm as outlined in this document will be indexed into the malcolm_beats_* index pattern (unless a user has created their own Logstash pipeline), which can be selected in the OpenSearch Dashboards’ Discover view or when specifying the log source for a new visualization.

Because these documents are indexed by OpenSearch dynamically as they are ingested by Logstash, their component fields will not show up as searchable in OpenSearch Dashboards visualizations until its copy of the field list is refreshed. Malcolm periodically refreshes this list, but if fields are missing from visualizations, users may wish to do it manually.

Once Malcolm has ingested a new log type it has not seen before, users can manually refresh OpenSearch Dashboards’s field list by clicking ManagementIndex Patterns, then selecting the index pattern (malcolm_beats_*) and clicking the reload 🗘 button near the upper-right of the window.