A powerful, easily deployable network traffic analysis tool suite
Let’s continue with the example of the cooltool service we added in the PCAP processors section above, assuming that cooltool generates some textual log files we want to parse and index into Malcolm.
You’d have configured cooltool in your cooltool.Dockerfile and its section in the docker-compose files to write logs into a subdirectory or subdirectories of a shared folder, bind mounted in such a way that both the cooltool and filebeat containers can access it. Referring to the zeek container as an example, this is how the ./zeek-logs folder is handled; both the filebeat and zeek services have ./zeek-logs in their volumes: section:
$ grep -P "^( - ./zeek-logs| [\w-]+:)" docker-compose.yml | grep -B1 "zeek-logs"
filebeat:
- ./zeek-logs:/data/zeek
--
zeek:
- ./zeek-logs/upload:/zeek/upload
…
You’ll need to provide access to your cooltool logs in a similar fashion.
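As a rough sketch of what that could look like, the excerpt below mirrors the zeek-logs arrangement for cooltool. The host folder ./cooltool-logs and the container paths /cooltool/logs and /data/cooltool are hypothetical names chosen for illustration; use whatever paths your cooltool.Dockerfile actually writes to.

```yaml
# docker-compose.yml (excerpt, illustrative only)
services:
  filebeat:
    volumes:
      - ./zeek-logs:/data/zeek            # existing mount, as shown above
      - ./cooltool-logs:/data/cooltool    # hypothetical: where filebeat reads cooltool logs
  cooltool:
    volumes:
      - ./cooltool-logs:/cooltool/logs    # hypothetical: where cooltool writes its logs
```

Because both services mount the same host folder, anything cooltool writes becomes visible to filebeat without copying files between containers.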
Next, tweak filebeat.yml by adding a new log input path pointing to the cooltool logs to send them along to the logstash container. This modified filebeat.yml will need to be reflected in the filebeat container via bind mount or by rebuilding it.
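A new input entry might look something like the following minimal sketch. It assumes the cooltool logs are visible inside the filebeat container under /data/cooltool (a hypothetical path) and uses a made-up tag; check the existing inputs in Malcolm’s filebeat.yml for the options and tagging conventions it actually relies on.

```yaml
# filebeat.yml (excerpt, illustrative only)
filebeat.inputs:
  # ... existing inputs stay unchanged ...
  - type: log
    paths:
      - /data/cooltool/*.log        # hypothetical path inside the filebeat container
    tags: ["_filebeat_cooltool"]    # hypothetical tag to tell these events apart later
```

A tag (or a custom field) set here gives the Logstash filters something to test so that only cooltool events are handled by the cooltool parse pipeline.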
Logstash can then be easily extended to add more logstash/pipelines. At the time of this writing (as of the v5.0.0 release), the Logstash pipelines basically look like this:
- input (from filebeat) sends logs to 1..n parse pipelines

So, in order to add a new parse pipeline for cooltool after tweaking filebeat.yml as described above, create a cooltool directory under logstash/pipelines which follows the same pattern as the zeek parse pipeline. This directory will have an input file (tiny), a filter file (possibly large), and an output file (tiny). In your filter file, be sure to set the field event.hash to a unique value to identify indexed documents in OpenSearch; the fingerprint filter may be useful for this.
Finally, in your docker-compose files, set a new LOGSTASH_PARSE_PIPELINE_ADDRESSES environment variable under logstash-variables to cooltool-parse,zeek-parse,suricata-parse,beats-parse (assuming you named the pipeline address from the previous step cooltool-parse) so that logs sent from filebeat to logstash are forwarded to all parse pipelines.
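For illustration, the variable could be set along these lines. The x-logstash-variables block name and its YAML anchor are assumptions about how your docker-compose files are laid out; only the variable name and its value come from the description above.

```yaml
# docker-compose.yml (excerpt, illustrative only)
x-logstash-variables: &logstash-variables   # assumed layout of the shared variables block
  # ... existing variables stay unchanged ...
  LOGSTASH_PARSE_PIPELINE_ADDRESSES: 'cooltool-parse,zeek-parse,suricata-parse,beats-parse'
```

Keeping the existing zeek-parse, suricata-parse and beats-parse entries in the list means the other parse pipelines continue to receive their logs alongside the new one.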
The following modifications must be made in order for Malcolm to be able to parse new Zeek log files:
- logstash/pipelines/zeek/11_zeek_parse.conf: for the id four-tuple, timestamp, etc., use the same convention used by existing Zeek logs in that file (e.g., ts, uid, orig_h, orig_p, resp_h, resp_p)
- logstash/pipelines/zeek/12_zeek_normalize.conf: for values like action (event.action), result (event.result), application protocol version (network.protocol_version), etc.
- logstash/pipelines/zeek/11_zeek_parse.conf
The script scripts/zeek_script_to_malcolm_boilerplate.py may help by autogenerating these filters for you.
Malcolm’s Logstash instance will do a lot of enrichments for you automatically: see the enrichment pipeline, which includes MAC address to vendor by OUI, GeoIP, ASN, and a few others. To take advantage of the enrichments that are already in place, normalize new fields to use the same standardized field names Malcolm uses for things like IP addresses, MAC addresses, etc. You can add your own enrichments by creating new .conf files containing Logstash filters in the enrichment pipeline directory and using either of the techniques in the Local modifications section to implement your changes in the logstash container.
The logstash.Dockerfile installs the Logstash plugins used by Malcolm (search for logstash-plugin install in that file). Additional Logstash plugins can be installed by modifying this Dockerfile and rebuilding the logstash Docker image.