A powerful, easily deployable network traffic analysis tool suite


Asset Interaction Analysis

Malcolm provides an instance of NetBox, an open-source “solution for modeling and documenting modern networks.” The NetBox web interface is available at https://localhost/netbox/ when connecting locally.

The design of a potentially deeper integration between Malcolm and NetBox is a work in progress.

Please see the NetBox page on GitHub, its documentation and its public demo for more information.

Enriching network traffic metadata via NetBox lookups

If the NETBOX_ENRICHMENT environment variable in ./config/netbox-common.env is set to true, the NetBox API is queried for information about the associated hosts as Zeek logs and Suricata alerts are parsed and enriched. If found, the information retrieved from NetBox is used to enrich these logs through the creation of the following new fields. See the NetBox API documentation and the NetBox documentation for more information.
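As a rough illustration of what such a lookup involves, the sketch below builds (but does not send) a request for an IP address record against the NetBox REST API. It is not Malcolm's enrichment code. The function name build_ip_lookup, the example token, and the assumption that the API root sits under https://localhost/netbox/api/ are all hypothetical and chosen for illustration only.

```python
from urllib.parse import urlencode, urljoin


def build_ip_lookup(base_url: str, token: str, ip: str) -> tuple[str, dict[str, str]]:
    """Return the URL and headers for an IP address lookup in NetBox.

    Assumption: the REST API is reachable under "<base_url>/api/".
    """
    # Normalize the base URL so urljoin appends rather than replaces the path.
    api_root = urljoin(base_url.rstrip("/") + "/", "api/ipam/ip-addresses/")
    url = f"{api_root}?{urlencode({'address': ip})}"
    headers = {
        # NetBox token authentication header.
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    }
    return url, headers


# Hypothetical values for illustration only.
url, headers = build_ip_lookup("https://localhost/netbox/", "EXAMPLE-TOKEN", "192.168.1.10")
print(url)
```

The returned URL and headers could then be passed to any HTTP client; the JSON response would carry the host details used for enrichment.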

For Malcolm’s purposes, both physical devices and virtualized hosts will be stored as described above: the device_type field can be used to distinguish between them.

NetBox has the concept of sites. Sites can have overlapping IP address ranges. The site to associate with network traffic can be specified when PCAP is uploaded, when configuring live analysis, and when configuring forwarding from Hedgehog Linux. If not otherwise specified, the value of the NETBOX_DEFAULT_SITE environment variable in netbox-common.env will be used for these enrichment lookups.

Compare and highlight discrepancies between NetBox inventory and observed network traffic

As Malcolm cross-checks network traffic with NetBox’s model (as described above), the resulting enrichment data (or lack thereof) can highlight devices and services observed in network traffic for which there is no corresponding entry in the list of inventoried assets.

These uninventoried devices and services are highlighted in two dashboards:

Zeek Known Summary

Asset Interaction Analysis

This feature was implemented as described in idaholab/Malcolm#133.
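Conceptually, the comparison described in this section is a set difference between what was observed and what is inventoried. The sketch below illustrates that idea only. The function name uninventoried_hosts and the sample addresses are hypothetical, and Malcolm itself derives this information from enrichment results rather than from a helper like this one.

```python
import ipaddress


def uninventoried_hosts(observed: list[str], inventoried: list[str]) -> list[str]:
    """Return observed addresses that have no matching inventory entry."""
    known = {ipaddress.ip_address(ip) for ip in inventoried}
    seen = {ipaddress.ip_address(ip) for ip in observed}
    missing = seen - known
    # Sort IPv4 before IPv6, then numerically, for stable output.
    return [str(ip) for ip in sorted(missing, key=lambda ip: (ip.version, int(ip)))]


# Hypothetical sample data.
observed = ["10.0.0.5", "10.0.0.7", "10.0.0.5", "fd00::9"]
inventoried = ["10.0.0.5"]
print(uninventoried_hosts(observed, inventoried))
```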

Populate NetBox inventory manually

While the initial effort of populating NetBox’s network segment and device inventory manually is high, it is the preferred method to ensure creation of an accurate model of the intended network design.

The Populating Data section of the NetBox documentation outlines mechanisms available to populate data in NetBox, including manual object creation, bulk import, scripting and the NetBox REST API.

The following elements of the NetBox data model are used by Malcolm for Asset Interaction Analysis.

Populate NetBox inventory via passively-gathered network traffic metadata

If the NETBOX_AUTO_POPULATE environment variable in ./config/netbox-common.env is set to true, uninventoried devices with private IP addresses (as defined in RFC 1918 and RFC 4193) observed in known network segments will be automatically created in the NetBox inventory based on the information available. This value is set to true by answering Y to “Should Malcolm automatically populate NetBox inventory based on observed network traffic?” during configuration.

However, this feature should be enabled only after careful consideration. The purpose of an asset management system is to document the intended state of a network; with Malcolm configured to populate NetBox from the live network state, a network misconfiguration could result in an incorrectly documented configuration.
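The private-address criterion mentioned above (RFC 1918 for IPv4 and RFC 4193 for IPv6) can be checked with Python's standard ipaddress module. This is a sketch of the address test only, under the assumption that the four ranges listed are the ones of interest; the function name is_private_candidate is hypothetical and this is not Malcolm's own code.

```python
import ipaddress

# RFC 1918 private IPv4 ranges and the RFC 4193 unique local IPv6 range.
PRIVATE_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")
]


def is_private_candidate(ip: str) -> bool:
    """Return True if the address falls in an RFC 1918 or RFC 4193 range."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr.version == net.version and addr in net for net in PRIVATE_NETWORKS
    )


for sample in ("192.168.1.5", "172.32.0.1", "fd12::1", "8.8.8.8"):
    print(sample, is_private_candidate(sample))
```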

Devices created using this autopopulate method will include a tags value of Autopopulated. It is recommended that users periodically review automatically-created devices for correctness and fill in known details that couldn’t be determined from network traffic. For example, the manufacturer field for automatically-created devices will be set based on the organizationally unique identifier (OUI) determined from the first three bytes of the observed MAC address, which may not be accurate if the device’s traffic was observed across a router. If possible, observed hostnames (extracted from logs that provide a mapping of IP address to host name, such as Zeek’s dns.log, ntlm.log, and dhcp.log) will be used in the naming of the automatically-created devices, falling back to the device manufacturer otherwise (e.g., MYHOSTNAME vs. Schweitzer Engineering @

Since device autocreation is based on IP address, information about network segments (IP prefixes) must be first manually specified in NetBox in order for devices to be automatically populated. Users should populate the description field in the NetBox IPAM Prefixes data model to specify a name to be used for NetBox network segment autopopulation and enrichment, otherwise the IP prefix itself will be used.

Although network devices can be automatically created using this method, services should be inventoried manually. The Uninventoried Observed Services visualization in the Zeek Known Summary dashboard can help users review network services to be created in NetBox.

See idaholab/Malcolm#135 for more information on this feature.

Matching device manufacturers to OUIs

Malcolm’s NetBox inventory is prepopulated with a collection of community-sourced device type definitions, which users can then augment manually or through preloading. During passive autopopulation, the device manufacturer is inferred from organizationally unique identifiers (OUIs), which make up the first three octets of a MAC address. The IEEE Standards Association maintains the registry of OUIs, but organizations are not consistent in how they specify the name associated with their OUI entries. In other words, there is no foolproof programmatic way for Malcolm to map MAC address OUI organization names to NetBox manufacturer names, short of creating and maintaining a manual mapping (which would be very large and difficult to keep up to date).
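To make the "first three octets" point concrete, the following sketch extracts the OUI portion of a MAC address written in any of the common notations. The function name oui_from_mac and the sample addresses are hypothetical. Looking up the organization name for an OUI would additionally require a copy of the IEEE registry, which is not shown here.

```python
import re


def oui_from_mac(mac: str) -> str:
    """Return the OUI (first three octets) of a MAC address as XX-XX-XX."""
    # Drop separators such as ":", "-" and "." and keep only hex digits.
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    digits = digits.upper()
    return "-".join(digits[i:i + 2] for i in range(0, 6, 2))


# Hypothetical sample addresses in two notations.
print(oui_from_mac("00:30:a7:12:34:56"))
print(oui_from_mac("0030.a712.3456"))
```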

Malcolm’s NetBox lookup code used in the log enrichment pipeline attempts to match OUI organization names against the list of NetBox’s manufacturers using “fuzzy string matching”, a technique in which two strings of characters are compared and assigned a similarity score between 0 (completely dissimilar) and 1 (identical). The NETBOX_DEFAULT_FUZZY_THRESHOLD environment variable in netbox-common.env can be used to tune the threshold for determining a match. A fairly high value is recommended (above 0.85; 0.95 is the default) to avoid autopopulating the NetBox inventory with devices with manufacturers that don’t actually exist in the network being monitored.

Users may select between two behaviors for when the match threshold is not met (i.e., no manufacturer is found in the NetBox database which closely matches the OUI organization name). This behavior is specified by the NETBOX_DEFAULT_AUTOCREATE_MANUFACTURER environment variable in netbox-common.env:

Populate NetBox inventory via active discovery

See idaholab/Malcolm#136.

Compare NetBox inventory with database of known vulnerabilities

See idaholab/Malcolm#134.

Preloading NetBox inventory

YML files in ./netbox/preload under the Malcolm installation directory will be preloaded upon startup using the third-party netbox-initializers plugin. Examples illustrating the format of these YML files can be found at its GitHub repository.


Backup and Restore

The NetBox database may be backed up and restored using ./scripts/netbox-backup and ./scripts/netbox-restore, respectively. While Malcolm is running, run the following command from within the Malcolm installation directory to back up the entire NetBox database:

$ ./scripts/netbox-backup
NetBox configuration database saved to ('malcolm_netbox_backup_20230110-133855.gz', 'malcolm_netbox_backup_20230110-133855.media.tar.gz')

To clear the existing NetBox database and restore a previous backup, run the following command (substituting the filename of the netbox_….gz to be restored) from within the Malcolm installation directory while Malcolm is running:

$ ./scripts/netbox-restore --netbox-restore ./malcolm_netbox_backup_20230110-125756.gz

Users who have a prior NetBox database backup (created with netbox-backup as described above) and want it restored automatically on startup may manually copy that .gz file to the ./netbox/preload directory. Upon startup that file will be extracted and used to populate the NetBox database, taking priority over the other preload files. This process does not remove the .gz file from the directory after restoring it, so the backup will be restored again on subsequent restarts unless the file is manually removed.

Note that network log enrichment will fail while a restore is in progress (indicated with HTTP/1.1 403 messages in the output of the netbox container in the Malcolm debug logs), but should resume once the restore process has completed.