In this article, we will conduct an in-depth exploration of an impactful vulnerability affecting various container runtimes.
A few days ago, Snyk’s partners and subscribers received an alarming email about newly discovered container breakout vulnerabilities.
This message stated that Snyk’s Security Labs team has recently identified and disclosed four vulnerabilities affecting core container ecosystem components, which can allow container breakouts.
The email links to a Snyk blog post that explains these vulnerabilities. To spread the word and make sure everyone fully understands what happened and which countermeasures are available, we want to dive deep into the main discovery and share our highlights.
In this article, our primary focus will be on one of the disclosed vulnerabilities: CVE-2024-21626.
On November 20, 2023, Rory McNamara, a security researcher on the Snyk Security Labs team, identified four vulnerabilities in core container infrastructure components that allow container escapes.
Once discovery and verification were complete, the Security Labs team began the responsible disclosure process by notifying Docker, which forwarded one of the vulnerabilities to the open source runc security group.
Why are these Docker and runc container breakout vulnerabilities so relevant?
Let’s take a step back to the basics of Cloud Native security.
As the page “Overview of Cloud Native Security” in the official Kubernetes documentation explains, the defense-in-depth approach lets us think about security in layers (see the image above).
Each layer builds upon the next outermost layer (Code sits inside Container, Container inside Cluster, and Cluster inside Cloud), so you cannot safeguard against poor security standards in the base layers by addressing security only at the Code level.
The layered model makes the impact clear: an attacker could leverage the discovered vulnerabilities to escape the container and gain unauthorized access to the underlying infrastructure. From there, they could potentially access any data on the host operating system, including sensitive data, and launch further attacks.
Deep dive into the vulnerability
The maintainers of runc officially announced the CVE-2024-21626 vulnerability on January 31, 2024.
CVE (Common Vulnerabilities and Exposures) is a catalog of publicly disclosed computer security vulnerabilities.
Each entry has a unique identifier, known as a CVE ID, and usually a severity score calculated with the CVSS framework.
Security advisories from vendors and researchers consistently cite CVE IDs, enabling IT professionals to coordinate their efforts in prioritizing and remedying these vulnerabilities, thereby enhancing the overall security of computer systems.
This particular CVE allows an order-of-operations container breakout centered on the WORKDIR instruction, which can lead attackers directly into the underlying host operating system.
This matters because runc is used predominantly as the low-level runtime beneath higher-level container software such as Docker.
The attack can be carried out by building a container image from a malicious Dockerfile or upstream image.
The vulnerability potentially affects both build and production runtime environments.
The cloud-native security team at SIGHUP promptly mobilized to investigate the vulnerability and develop a straightforward Proof of Concept for its reproduction.
The issue stems from careless handling of file descriptors in the runc runtime.
In Linux, a file descriptor is a non-negative integer that uniquely identifies an open file within a process.
It is essentially an abstract indicator or handle used to access files, sockets, or other input/output resources.
File descriptors are a fundamental concept in Unix-like operating systems, and they are used to represent input and output channels for processes.
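To make this concrete, here is a minimal Python sketch of our own (it is not taken from runc or from the advisory). It opens a temporary file and looks the descriptor up under /proc/self/fd, where Linux exposes every open descriptor of the current process as a symlink. The helper name describe_fd is hypothetical, and the /proc lookup assumes a Linux host.

```python
import os
import tempfile


def describe_fd():
    """Open a temporary file and report (fd, real path, path seen via /proc)."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = tmp.fileno()  # small non-negative integer assigned by the kernel
        real_path = os.path.realpath(tmp.name)
        link = f"/proc/self/fd/{fd}"
        # On Linux, each open descriptor appears as a symlink in /proc/self/fd.
        # Assumption: on systems without /proc we fall back to the real path.
        seen_via_proc = os.readlink(link) if os.path.islink(link) else real_path
        return fd, real_path, seen_via_proc


if __name__ == "__main__":
    fd, real_path, seen_via_proc = describe_fd()
    print(f"fd={fd} real={real_path} via_proc={seen_via_proc}")
```

The point to keep in mind for the rest of the article is that /proc/self/fd/N is a path like any other, so anything that accepts a path can be pointed at an open descriptor.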
From the official runc security advisory:
"In runc 1.1.11 and earlier, several file descriptors were inadvertently leaked internally within runc into runc init, including a handle to the host's /sys/fs/cgroup (this leak was added in v1.0.0-rc93). If the container was configured to have process.cwd set to /proc/self/fd/7/ (the actual fd can change depending on file opening order in runc), the resulting pid1 process will have a working directory in the host mount namespace and thus the spawned process can access the entire host filesystem. This alone is not an exploit against runc, however a malicious image could make any innocuous-looking non-/ path a symlink to /proc/self/fd/7/ and thus trick a user into starting a container whose binary has access to the host filesystem."
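The mechanism described in the advisory can be illustrated safely outside of any container. The following Python sketch is our own illustration, not runc code: it opens an ordinary temporary directory, then changes the working directory through /proc/self/fd/N and shows that the process ends up inside the directory the descriptor refers to. The function name chdir_through_fd is hypothetical, and a Linux-style /proc is assumed.

```python
import os
import tempfile


def chdir_through_fd():
    """Return (directory, cwd after chdir via /proc/self/fd/<n>), or None without /proc."""
    if not os.path.isdir("/proc/self/fd"):
        return None  # assumption: a Linux-style procfs is required
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as directory:
        fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            # The working directory is set *through* the descriptor, which is
            # what process.cwd = /proc/self/fd/<n> does in the advisory.
            os.chdir(f"/proc/self/fd/{fd}")
            resulting_cwd = os.getcwd()
        finally:
            os.chdir(original_cwd)
            os.close(fd)
        return os.path.realpath(directory), resulting_cwd


if __name__ == "__main__":
    print(chdir_through_fd())
```

In the runc case the leaked descriptor refers to the host's /sys/fs/cgroup rather than a harmless temporary directory, which is why the container's first process ends up with a working directory in the host mount namespace.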
Observing the official patch to the runc code provides us with additional insights.
The following two code segments are among the most significant patches:
In order to replicate the vulnerability, we provisioned a virtual machine on the fly using Vagrant.
The VM adhered to the following specifications:
- OS: Ubuntu 23.04 x86_64
- Docker: 24.0.7, build afdd53b
Subsequently, we crafted the simplest possible Dockerfile capable of exploiting the vulnerability:
Alternatively, you can run the equivalent command directly, without writing a Dockerfile:
docker run -w /proc/self/fd/8 --name leaky-vessels --rm -it alpine:latest
In both cases, our container will launch in interactive mode, providing us with a shell within its environment.
Once we are inside the container, we can access the host filesystem like this:
The output above is the shadow file entry of our user on the host VM.
We have successfully replicated the container breakout!
Remember to read up on all four vulnerabilities that were disclosed:
- CVE-2024-21626: runc process.cwd & leaked fds container breakout
- CVE-2024-23651: Buildkit Mount Cache Race
- CVE-2024-23653: Buildkit GRPC SecurityMode Privilege Check
- CVE-2024-23652: Buildkit Build-time Container Teardown Arbitrary Delete
What can be done to mitigate
Snyk promptly released two tools that can be used to identify the vulnerability: one performs static analysis on container images, while the other dynamically checks the Linux kernel at runtime.
Leaky Vessels Static Detector inspects the image tarball layers against a set of simple rules and regular expressions, for example the presence of a WORKDIR /proc/self/fd/[ID] instruction.
Leaky Vessels Dynamic Detector uses eBPF to hook into Linux syscalls (e.g., mount) and function invocations of the Docker daemon, and associates them with Docker builds and container processes to identify exploitation of these vulnerabilities.
Both tools can be integrated into CI/CD pipelines to identify potential vulnerabilities and malicious content in container images before deployment, or deployed into clusters and build systems to proactively detect and mitigate runtime exploits.
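To give a feel for the kind of rule the static detector applies, here is a small Python sketch that flags WORKDIR instructions pointing into /proc/self/fd. It is a simplified illustration under our own assumptions: the pattern and the function name find_suspicious_workdirs are hypothetical and are not the detector's actual rules, and the sketch scans Dockerfile text only rather than image tarball layers.

```python
import re

# Assumption: a simplified pattern for illustration, not the detector's real rule.
SUSPICIOUS_WORKDIR = re.compile(r"^\s*WORKDIR\s+/proc/self/fd/\d+", re.IGNORECASE)


def find_suspicious_workdirs(dockerfile_text):
    """Return the Dockerfile lines whose WORKDIR points into /proc/self/fd."""
    return [
        line.strip()
        for line in dockerfile_text.splitlines()
        if SUSPICIOUS_WORKDIR.match(line)
    ]


if __name__ == "__main__":
    sample = "FROM alpine:latest\nWORKDIR /proc/self/fd/8\n"
    print(find_suspicious_workdirs(sample))
```

A check like this is easy to bypass on its own (for instance through a symlink inside the image, as the advisory notes), so it complements rather than replaces upgrading the runtime.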
Falco: Sysdig published an insightful blog post that explains how Falco can help identify these vulnerabilities. In the article, you will find all the rules you need to set up to identify and mitigate the four reported vulnerabilities.
What can be done to remediate
The maintainers of the affected container runtimes have already released updates that address this vulnerability; applying them is essential to keep your environment safe.
Below are the versions of some of the affected software that fix the vulnerability:
- runc: 1.1.12
- containerd: 1.6.28
- docker: buildkit 0.12.5 and moby 25.0.2
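As a rough aid for auditing hosts, the following Python sketch compares an installed version string against the fixed releases listed above. It is only a sketch under stated assumptions: the names FIXED_RELEASES, parse_version, and is_older_than_fix are hypothetical, versions are treated as a single linear sequence, pre-release suffixes are ignored, and distribution backports are not considered. Projects that maintain several release branches may ship the fix in other versions too, so always confirm against the vendor advisory.

```python
import re

# Fixed releases as listed in this article.
FIXED_RELEASES = {
    "runc": (1, 1, 12),
    "containerd": (1, 6, 28),
    "buildkit": (0, 12, 5),
    "moby": (25, 0, 2),
}


def parse_version(text):
    """Parse 'v1.2.3' or '1.2.3-suffix' into (1, 2, 3); the suffix is ignored."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_older_than_fix(component, installed_version):
    """True if the installed version sorts before the fixed release listed above."""
    return parse_version(installed_version) < FIXED_RELEASES[component]


if __name__ == "__main__":
    for component, version in [("runc", "1.1.11"), ("runc", "1.1.12")]:
        print(component, version, is_older_than_fix(component, version))
```

The installed version strings would come from commands such as runc --version or docker version; parsing their full output is left out here to keep the sketch short.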
To go deeper, we strongly recommend reading the official runc advisory.