A powerful, easily deployable network traffic analysis tool suite for network security monitoring
This document outlines the steps a Malcolm developer goes through to publish a release of Malcolm. This guide assumes the developer has been doing their work downstream in a fork of the main Malcolm repository upstream, forked at romeogdetlevjr/Malcolm by the fictitious Malcolm developer Romeo G Detlev Jr., concocted for this example.
Malcolm tracks issues (whether they be bugs, new features, enhancements, etc.) for release milestones using a GitHub project. Before building release candidate images, Romeo reviews the items for the upcoming release in the corresponding project milestone and ensures that all items assigned to it have their status set to Done, each item having been completed and tested locally by the developer to whom the issue was assigned.
Romeo also ensures that all work towards this release has been pulled into the branch on his fork from which the release will be cut. If pull requests have been submitted upstream which resolve the issues assigned to this release, those pull requests should be merged into the branch at romeogdetlevjr/Malcolm, whether they were submitted initially against that fork or pulled in manually by Romeo as part of this release process. Pull requests are not accepted directly into the main branch of the official upstream fork. In other words, the branch of Malcolm in Romeo’s development fork should contain everything that is going to comprise this release of Malcolm.
There are several places in the Malcolm source code where the release version itself (e.g., 24.10.1) needs to be present. Most of these places are in the documentation, consisting of markdown files, but others include docker-compose.yml, docker-compose-dev.yml, and the Kubernetes manifests. Most likely Romeo’s first commit into his branch as he worked on this release was to bump those version strings, but he should verify now that he did so.
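One way to do that verification is a recursive search for the version string. The following is a minimal sketch, not part of the Malcolm scripts: the function name find_version_refs is hypothetical, and it assumes GNU grep (for --exclude-dir).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Malcolm): list the files under a
# directory that contain a given version string.
find_version_refs() {
  local version="$1" root="${2:-.}"
  # -F treats the version as a fixed string, so the dots match literally;
  # -r recurses, -l prints only file names, and .git metadata is skipped
  # (--exclude-dir is a GNU grep option).
  grep -rlF --exclude-dir=.git -- "$version" "$root" | sort
}
```

Running find_version_refs 24.10.1 ~/Malcolm should list the compose files, Kubernetes manifests, and documentation that carry the new version. Running it with the previous release's version instead shows where that older string still appears, and each hit can then be judged on whether it is intentional.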
Images and artifacts for release should not be built on Romeo’s own development workstation. Instead, Romeo carefully reviews the documentation for using GitHub runners to build Malcolm images (including setting up his GitHub repository actions secrets and variables) and starts builds of the GitHub container images with a workflow or repository dispatch API trigger. He monitors the progress of the workflow actions and ensures that they complete successfully, including jobs for both docker (linux/amd64) and docker (linux/arm64) where applicable.
The workflow for building the Hedgehog Linux installer ISO can be run independently of the Malcolm container images; however, the workflow for building the Malcolm installer ISO needs to be run after all of the container image “build-and-push” actions have completed successfully, as those images are pulled and archived inside of the ISO itself. Once Romeo is sure that all of the actions for building the container images from the previous step have completed successfully, he initiates a run of the malcolm-iso-build-docker-wrap-push-ghcr action.
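For readers who prefer the API to the web interface, a run can be started through GitHub's REST endpoint for workflow dispatch events. This is a rough sketch under several assumptions: the function names are hypothetical, the workflow file name passed in must match whatever the fork actually uses, and GITHUB_TOKEN is assumed to hold a token permitted to run workflows in that fork. Malcolm's own documentation on GitHub runners remains the authority on which trigger to use.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Malcolm) wrapping GitHub's REST API.

# Print the REST endpoint that creates a workflow_dispatch event.
workflow_dispatch_url() {
  local owner="$1" repo="$2" workflow_file="$3"
  printf 'https://api.github.com/repos/%s/%s/actions/workflows/%s/dispatches\n' \
    "$owner" "$repo" "$workflow_file"
}

# Start a workflow run on the given branch or tag (ref).
# Aborts with a message if GITHUB_TOKEN is unset.
dispatch_workflow() {
  local owner="$1" repo="$2" workflow_file="$3" ref="$4"
  curl --fail --silent --show-error \
    -X POST \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer ${GITHUB_TOKEN:?set GITHUB_TOKEN first}" \
    -d "{\"ref\":\"${ref}\"}" \
    "$(workflow_dispatch_url "$owner" "$repo" "$workflow_file")"
}
```

A call might look like dispatch_workflow romeogdetlevjr Malcolm malcolm-iso-build-docker-wrap-push-ghcr.yml main, where the .yml file name is a guess based on the action's name rather than something confirmed here.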
Once all of the release candidate images have been built by their respective GitHub actions, Romeo can use the convenience helper script (found at ./scripts/github_image_helper.sh in the Malcolm source code), which has the following purposes:
- pulling the release candidate container images, and the installer .iso files, built in the development fork (e.g., ghcr.io/romeogdetlevjr/malcolm/zeek:main)
- tagging those images with the official namespace and release version (e.g., ghcr.io/idaholab/malcolm/zeek:24.10.1)
Romeo carefully reviews the documentation on this convenience helper script, then runs it. When it has completed, he verifies with docker images that he pulled the new container images (checking the images’ ages with the CREATED column) and that he has the .iso files he expects to have.
Now that he’s got the .iso files for Malcolm and Hedgehog Linux, Romeo fires up some virtualization software (VMware Workstation, VirtualBox, or, his personal favorite, virt-manager) and installs the ISOs into their respective VMs. He makes sure his VMs are configured to meet the recommended system requirements. He follows the end-to-end Malcolm and Hedgehog Linux ISO Installation example in the documentation to install and configure Malcolm and Hedgehog Linux, resulting in a configuration where the VMs are successfully communicating with each other.
Part of Romeo’s testing includes uploading PCAP files to test the parsers for Malcolm’s supported protocols, so he uses a set of PCAP files curated by another Malcolm developer for this purpose.
He also knows that verifying live traffic capture is an important part of testing both Hedgehog Linux and Malcolm. He has used a few open-source tools to generate “real” live Internet traffic in his VMs, including PartyLoud, alphasoc/flightsim, and 3CORESec/testmynids.org. He downloads these utilities into both VMs and configures both Malcolm and Hedgehog Linux to capture the live traffic generated.
Having uploaded a variety of PCAP files and configured live traffic analysis, Romeo validates that the resulting traffic metadata generated by Zeek, Suricata, and Arkime looks correct in both OpenSearch Dashboards and Arkime. He makes a special note to use Arkime’s sessions interface to retrieve a PCAP payload for an Arkime session captured on each VM.
Romeo knows that soon™ the Malcolm project will include a robust automated system testing framework, but until then he realizes it’s on him to do his best to ensure the quality of this Malcolm release. He carefully reviews and tests each issue assigned to this milestone on the GitHub project board.
Earlier, Romeo reminded himself that images and artifacts for release should not be built on his own development workstation. While this is a worthy goal, at the time of this writing GitHub does not provide standard hosted runners for arm64, so the workflow for building the Hedgehog Linux Raspberry Pi image would have to be emulated in QEMU. Romeo knows from personal experience that this build process would exceed GitHub’s time limit and be killed, so he has to resort to building the Raspberry Pi image locally. He has read that arm64 standard runners are coming soon and suspects that soon Malcolm will support building the Hedgehog Linux Raspberry Pi image natively using GitHub runners.
Now that he’s satisfied that everything looks ship-shape for the release, Romeo drafts and submits a pull request from his development fork to the Malcolm repository upstream, where it should be carefully reviewed, preferably by Romeo and another Malcolm developer together.
Once the PR has been carefully reviewed by the necessary parties to everyone’s satisfaction, it can be merged into the main branch upstream.
Earlier Romeo used the convenience helper script to pull and tag the container images that would become the official images for this release. He now pushes those images to ghcr.io, making them available to the public in the official upstream namespace with their final release tags. He uses some script-fu to do this, listing the container images, filtering for the newly-tagged idaholab images for this release, and using xargs to execute a docker push command for each:
$ docker images \
| grep -P "ghcr\.io/idaholab/malcolm/.+24\.10\.1" \
| awk '{print $1 ":" $2}' \
| xargs -r -l docker push
Getting image source signatures
Copying blob f944ed4242ed skipped: already exists
…
Copying config 2c88f94597 done |
Writing manifest to image destination
…
Writing manifest to image destination
Getting image source signatures
Copying blob 43c4264eed91 skipped: already exists
…
Copying config caff12e3c5 done |
Writing manifest to image destination
The push should actually go very quickly, because the container registry is smart enough to realize that the images already exist (with the romeogdetlevjr tags), so there will be a lot of “Copying blob … skipped: already exists” messages in the output.
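Because a push to the official namespace is hard to undo, it can help to preview exactly which image references the filter selects before pushing anything. The sketch below is an illustration only: release_image_refs is a hypothetical function, and it assumes the default docker images table layout, with the repository in the first column and the tag in the second.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Malcolm): read `docker images` output on
# stdin and print repository:tag for images in the given namespace whose tag
# is the release version, optionally followed by a "-suffix" such as "-arm64".
release_image_refs() {
  local namespace="$1" version="$2"
  awk -v ns="$namespace" -v ver="$version" '
    # index(...) == 1 means the field starts with the given prefix
    index($1, ns "/") == 1 && ($2 == ver || index($2, ver "-") == 1) {
      print $1 ":" $2
    }
  '
}

# Dry run: print the push commands instead of executing them.
#   docker images \
#     | release_image_refs ghcr.io/idaholab/malcolm 24.10.1 \
#     | xargs -r -n 1 echo docker push
```

Dropping the echo from the dry run turns the preview into the actual push. Matching the tag exactly, rather than anywhere on the line, keeps a tag such as 24.10.10 from being swept up with 24.10.1.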
Romeo’s primary development workstation is a Linux system running on the x86_64/amd64 architecture. He realizes that Malcolm has had arm64 support for some time. However, the convenience script he used to pull and tag the Malcolm images as described above is only doing so for the amd64 container images.
Romeo switches over to an arm64-based machine (in his case, his Apple M2 Max MacBook Pro) and repeats the steps from Pull the container images from ghcr.io and Push official images to ghcr.io above, only this time for the Malcolm images with the -arm64 suffixed tags.
Romeo appreciates it when open source projects include detailed release notes, so he carefully writes some to accompany this release of Malcolm. Using the pattern followed in previous Malcolm releases, he uses Markdown to draft release notes including:
… configure script, etc.)
There are two general categories of files that need to be generated to be included with the Malcolm release as assets, broken down thusly:
Romeo checks out and switches his GitHub repository’s working copy so that it’s tracking the upstream branch (e.g., git checkout main and git branch --set-upstream-to idaholab/main). Running git log -1 should show that the latest commit to this branch is the merge of the pull request performed earlier.
Romeo creates a local directory to contain the release artifacts and runs ./scripts/malcolm_appliance_packager.sh to package up the scripts and tarball for a standalone Docker installation (the output of that script is somewhat verbose, so it’s been summarized for display here):
$ mkdir releases
$ cd releases
$ ~/Malcolm/scripts/malcolm_appliance_packager.sh
…
mkdir: created directory …
Package Kubernetes manifests in addition to docker-compose.yml [y/N]? y
…
Packaged Malcolm to "/home/romeogdetlevjr/Malcolm/releases/malcolm_20241008_215936_deadbeef.tar.gz"
Do you need to package container images also [y/N]? n
To install Malcolm:
1. Run install.py
2. Follow the prompts
To start, stop, restart, etc. Malcolm:
Use the control scripts in the "scripts/" directory:
- start (start Malcolm)
- stop (stop Malcolm)
- restart (restart Malcolm)
- logs (monitor Malcolm logs)
- wipe (stop Malcolm and clear its database)
- auth_setup (change authentication-related settings)
Malcolm services can be accessed at https://<IP or hostname>/
$ ls -l
total 462,848
-rwxr-xr-x 1 romeogdetlevjr romeogdetlevjr 219,939 Oct 22 10:32 install.py
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 475 Oct 22 10:33 malcolm_20241008_215936_deadbeef.README.txt
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 115,865 Oct 22 10:32 malcolm_20241008_215936_deadbeef.tar.gz
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 41,372 Oct 22 10:32 malcolm_common.py
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 44,226 Oct 22 10:32 malcolm_kubernetes.py
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 24,865 Oct 22 10:32 malcolm_utils.py
The resultant .py, .tar.gz, and .txt files are ready to be included as assets in the Malcolm release on GitHub.
As described in the documentation for downloading Malcolm, due to limits on individual files in GitHub releases, the binary image files have been split into 2GB chunks. The same scripts (for Bash (release_cleaver.sh) and PowerShell (release_cleaver.ps1)) used to join the files can be used to split them up:
$ ls -l
total 8,502,263,808
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 1,209,240 Oct 22 09:50 hedgehog-24.10.1-build.log
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 2,664,972,288 Oct 22 09:50 hedgehog-24.10.1.iso
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 963,775 Oct 22 09:49 malcolm-24.10.1-build.log
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 5,835,110,400 Oct 22 09:49 malcolm-24.10.1.iso
$ for ISO in *.iso; do ~/Malcolm/scripts/release_cleaver.sh "$ISO"; done
Splitting...
bf6e71385046b39d265af3dfc5b77677a0ac5eeac86bdc5be48791d0900715df hedgehog-24.10.1.iso
Splitting...
b4957741420ec06988d975cdb7f71eaa201918245f6fcb7ee2641d7d0ad97c52 malcolm-24.10.1.iso
$ ls -l
total 17,002,364,928
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 1,209,240 Oct 22 09:50 hedgehog-24.10.1-build.log
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 2,664,972,288 Oct 22 09:50 hedgehog-24.10.1.iso
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 2,000,000,000 Oct 22 10:40 hedgehog-24.10.1.iso.01
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 664,972,288 Oct 22 10:40 hedgehog-24.10.1.iso.02
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 87 Oct 22 10:40 hedgehog-24.10.1.iso.sha
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 963,775 Oct 22 09:49 malcolm-24.10.1-build.log
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 5,835,110,400 Oct 22 09:49 malcolm-24.10.1.iso
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 2,000,000,000 Oct 22 10:41 malcolm-24.10.1.iso.01
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 2,000,000,000 Oct 22 10:41 malcolm-24.10.1.iso.02
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 1,835,110,400 Oct 22 10:41 malcolm-24.10.1.iso.03
-rw-r--r-- 1 romeogdetlevjr romeogdetlevjr 86 Oct 22 10:41 malcolm-24.10.1.iso.sha
The resultant files (with the .iso.## and .iso.sha extensions) are the files ready to be included as assets in the Malcolm release on GitHub.
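Before uploading, it may be worth confirming that the chunks reassemble into the original image. The sketch below is an assumption-laden illustration, not the behavior of release_cleaver.sh itself: verify_chunks is a hypothetical function, it assumes the .sha file uses the sha256sum format (a hex digest followed by the file name), and it relies on sha256sum from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Malcolm): check that the numbered chunks
# of an ISO, concatenated in order, hash to the digest recorded in its .sha
# file. Returns 0 on a match and non-zero otherwise.
verify_chunks() {
  local iso="$1"
  local expected actual
  # The first field of the .sha file is assumed to be the SHA-256 hex digest.
  expected="$(awk 'NR == 1 { print $1 }' "$iso.sha")" || return 1
  # Glob expansion is sorted, so .01, .02, ... are concatenated in order.
  actual="$(cat "$iso".[0-9][0-9] | sha256sum | awk '{ print $1 }')"
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

For example, verify_chunks malcolm-24.10.1.iso && echo OK streams the three chunks through the hash without writing a second copy of the ISO to disk.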
Romeo goes to the releases page of the upstream repository. He clicks Draft a new release. On the new release page, he enters the release tag under Choose a tag (e.g., v24.10.1) with main as the target. He puts Malcolm v24.10.1 as the release title, and pastes the content of the markdown release notes he wrote into the Write input where it prompts him to Describe this release.
Romeo attaches the asset files from the previous step where it says “↓ Attach binaries by dropping them here or selecting them.” He ensures that Set as the latest release is checked.
After reviewing the contents of this page, Romeo pushes the green Publish release button, making this the latest official Malcolm release.
Finally, Romeo navigates back to the GitHub project and changes the status of each issue under the now-released milestone from Done to Released. He then navigates to the milestones page on GitHub and clicks Close for that milestone.