Automatic file extraction and scanning

Malcolm can leverage Zeek’s knowledge of network protocols to automatically detect file transfers and extract those files from PCAPs as Zeek processes them. This behavior can be enabled globally by modifying the ZEEK_EXTRACTOR_MODE variable in zeek.env, or on a per-upload basis for PCAP files uploaded via the browser-based upload form when Analyze with Zeek is selected.

To specify which files should be extracted, the following values are acceptable in ZEEK_EXTRACTOR_MODE:

Depending on the volume of files extracted from network traffic, file scanning can be resource-intensive. When file scanning is enabled, selecting interesting or notcommtxt is recommended unless Malcolm is running on a high-performance system.

Extracted files are scanned by Strelka, an open-source “real-time, container-based file scanning system used for threat hunting, threat detection, and incident response.”

Individual Strelka scanners can be toggled or configured by editing strelka/config/backend/backend.yaml. To disable a scanner, comment it out by adding # to each line of its section under scanners:, including the scanner’s name:


scanners:
  
#  'ScanDocx':
#    - positive:
#        flavors:
#          - 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
#          - "docx_file"
#      priority: 5
#      options:
#        extract_text: False
  

To enable a scanner, uncomment its section:


scanners:
  
  'ScanDocx':
    - positive:
        flavors:
          - 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
          - "docx_file"
      priority: 5
      options:
        extract_text: False
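
The comment and uncomment steps above can also be scripted. The following is a minimal sketch, not part of Malcolm or Strelka: the helper name set_scanner_enabled is hypothetical, and it assumes the layout shown above, where each scanner's name sits on its own line and its settings are indented beneath it. It works on plain text only and does not parse YAML.

```python
def set_scanner_enabled(yaml_text: str, scanner: str, enabled: bool) -> str:
    """Comment out (enabled=False) or uncomment (enabled=True) one scanner section.

    Hypothetical helper: a line-based edit that relies on indentation to find
    where the scanner's section ends.
    """
    out = []
    header_indent = None  # indentation of the scanner's name line while inside its section
    for line in yaml_text.splitlines():
        bare = line[1:] if line.startswith("#") else line  # line without a leading '#'
        stripped = bare.strip()
        indent = len(bare) - len(bare.lstrip(" "))
        if header_indent is not None and (not stripped or indent <= header_indent):
            header_indent = None  # blank or less-indented line: the section has ended
        if header_indent is None:
            is_header = stripped.endswith(":") and stripped[:-1].strip("'\"") == scanner
            if is_header:
                header_indent = indent
        if header_indent is None:
            out.append(line)  # outside the section: keep the line untouched
        else:
            out.append(bare if enabled else "#" + bare)
    trailing = "\n" if yaml_text.endswith("\n") else ""
    return "\n".join(out) + trailing


example = """scanners:
  'ScanDocx':
    - positive:
        flavors:
          - "docx_file"
      priority: 5
"""
print(set_scanner_enabled(example, "ScanDocx", enabled=False))
```

Because the helper strips any existing leading # before deciding what to write, running it twice with the same arguments leaves the text unchanged. Keeping a backup of backend.yaml before rewriting it this way is a sensible precaution.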
  

It’s recommended to validate this configuration file after making changes to it. This could be done using an online YAML validator or locally depending on available tools:
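
One possible local check is sketched here in Python. It assumes the third-party PyYAML package is installed; the function names yaml_syntax_error and check_file are illustrative only and are not part of Malcolm or Strelka.

```python
from typing import Optional

import yaml  # PyYAML, a third-party package (assumed installed, e.g. `pip install pyyaml`)


def yaml_syntax_error(text: str) -> Optional[str]:
    """Return a description of the first YAML syntax error, or None if the text parses."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        return str(exc)
    return None


def check_file(path: str) -> bool:
    """Print the result of a syntax check for one file and return True when it parses."""
    with open(path, encoding="utf-8") as handle:
        error = yaml_syntax_error(handle.read())
    print(f"{path}: {'OK' if error is None else error}")
    return error is None
```

For example, check_file("strelka/config/backend/backend.yaml") could be called from the Malcolm installation directory. This only confirms that the file is well-formed YAML; it does not confirm that Strelka accepts the scanner names or options in it.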

Each scanner may have configurable options; see the scanner list for more details. Other Strelka-related configuration files can be found under strelka/config/. Consult the Strelka documentation for more details.

For the YARA scanner, Malcolm’s default YARA rule set and/or user-defined custom YARA rules are used for scanning.

The RULES_UPDATE_ENABLED environment variable in pipeline.env controls whether or not to regularly pull signature/rule definitions from the internet for file scanning engines, including ClamAV signatures and Malcolm’s default YARA rule set.

The FILESCAN_PRESERVATION environment variable in filescan.env determines the behavior for preservation of scanned files:

The FILESCAN_HTTP_SERVER_… environment variables in filescan.env and filescan-secret.env configure browsing and download access to the scanned files through a simple HTTPS directory server, accessible at https://localhost/extracted-files/ when connecting locally. Beware that these files may contain malware. For that reason, files can optionally be ZIP-archived on download (either without a password or password-protected according to the WinZip AES encryption specification) or encrypted (to be decrypted using openssl, e.g., openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe). In other words:

User interface

The files extracted by Zeek and the data about those files can be accessed through several of Malcolm’s user interfaces.

The File Scanning dashboard displays the results of file scans performed by Strelka on files extracted from network traffic

The files dashboard displays metrics about the files transferred over the network

Arkime's session details for files.log entries

The extracted files directory interface