The xz/liblzma Compromise and Software Supply Chain Security
At the end of March 2024, a major software supply chain attack was identified: upstream releases of the popular xz/liblzma library, which on some Linux distributions is loaded by the massively popular OpenSSH server, were compromised. A rogue contributor appears to have worked to gain the trust of the library's maintainers, adding seemingly innocuous changes that ultimately culminated in the addition of a backdoor. This event is important for many reasons - not only does it represent the unfortunate compromise (in part, at least) of a major software project that a huge portion of the internet relies upon, but it also highlights one of the biggest emerging threats facing consumers of open source: the software packages being used are not only a potential source of vulnerabilities, they also provide attackers a path to direct compromise.
So how does this sort of attack actually work? In the case of xz/liblzma, it started with some social engineering: a rogue contributor began building a relationship with the team maintaining an upstream library, ultimately working to get some suggested code changes merged in. The unfortunate truth of the matter is that many open source libraries, once developed, often don't require much additional care and feeding aside from occasional bugfixes. The vast majority of them aren't full of exciting, dynamic features, and once they hit a steady state, they often take a back seat to new projects. This means that the original owners may eventually want to transition the project to a new maintainer, or they may simply have less bandwidth to monitor and screen new code additions down the line, particularly if the project doesn't have financial backing and relies on the free time and availability of the project maintainer. In this case, multiple conversation threads between the malicious contributor and the project maintainers led to a set of changes in which the actual payload was added: an obfuscated script that would ultimately establish the backdoor.
Unfortunately, while this particular attack has achieved a high level of visibility due to the popularity and criticality of the software it targeted, it is far from an outlier. Thousands of attacks just like it occur every quarter, ranging from account compromises, to campaigns targeting the user bases of prominent libraries, to targeted attacks attempting to compromise specific organizations. Perpetrated by threat actors ranging in skill from hobbyist to nation state, these attacks represent a clear threat to organizations of all sizes and levels of sophistication.
The Challenges of Software Supply Chain Security & Open Source
Fundamentally, the most important thing we can do as an industry to address these problems starts with an emphasis on the supply chain part of software supply chain security. Many products that operate exclusively in the vulnerability scanning portion of the security space talk a lot about "software supply chain security," but entirely miss the mark - a Software Bill of Materials (SBOM) or Software Composition Analysis tool does not imply a "supply chain," it simply provides an inventory of packages (and perhaps, licenses and CVEs), and nothing more.
The "supply chain" of software is much bigger than a simple list of packages: it is the value chain and set of "suppliers" that lead to the development of software - this includes internal tools and processes, and each "manufacturer" (software development team) that produces the libraries described by an SBOM. What the XZ compromise showcases is that each of these pieces is a source of risk. Insider threats may exist within the third-party development teams of open source projects. The tools and services that those teams use and rely upon could also be compromised, leading to unfortunate consequences. And perhaps most importantly, there is often very limited scrutiny applied to the continued development of these components, a compromise of which will impact all downstream consumers.
From that perspective, understanding the risks around the people involved - the project authors and contributors - is critical: the actor who implemented the XZ backdoor also contributed to other prominent projects, in many cases following a similar pattern of behavior (minus the final implementation of a malicious payload), adding small, innocuous changes to multiple forks of a popular library. This is also not a new issue, as purveyors of malicious code will often retain their accounts, leaving other libraries online after the malicious software is taken down.
Further, looking at the components of the attack itself, this is absolutely in line with thousands of other attacks that have occurred over the last few years. Obfuscated payloads are a hallmark of attacks in this domain. One attack in the NuGet ecosystem delivered the SeroXen remote access tool (RAT). Another attack in the Python ecosystem stole and exfiltrated cloud developer credentials. Another attack in npm repurposed credential-stealing Python malware. And there are many other examples of attackers using obfuscation to hide their true intentions.
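To make the obfuscation point concrete, here is a minimal sketch of one simple heuristic for spotting encoded or packed payloads in source text: flagging long tokens with high Shannon entropy. This is an illustration only, not a description of how the XZ backdoor was found or of any particular product's detection logic. The names `ENTROPY_THRESHOLD`, `MIN_LENGTH`, and `suspicious_tokens` are hypothetical, and the threshold values are assumptions that would need tuning against real code.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical cutoffs chosen for illustration; real tooling would tune these.
ENTROPY_THRESHOLD = 4.5  # bits per character
MIN_LENGTH = 40          # ignore short tokens, which are too noisy to judge


def suspicious_tokens(source: str) -> list[str]:
    """Return whitespace-delimited tokens that are both long and high-entropy."""
    return [
        token
        for token in source.split()
        if len(token) >= MIN_LENGTH and shannon_entropy(token) > ENTROPY_THRESHOLD
    ]
```

A heuristic like this produces false positives (legitimate hashes, keys, and minified assets also score high) and can be evaded, so it is best treated as one signal among many rather than a verdict.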
How to Address These Concerns
While dealing with these challenges may seem daunting, there are emerging solutions to help manage the risks associated with this class of attack. These include modern maturity frameworks, such as the OpenSSF's S2C2F, which describes in depth how to begin building a program with adequate controls to manage the risks of malicious software in open source dependencies. These are also the fundamental concerns that led to the founding of Phylum - real value chain analysis and control automation. Without solutions in place, however, given the speed at which these sorts of attacks are continuing to gain traction, it is only a matter of time until a breach is experienced.
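As a small illustration of what an automated control can look like, the sketch below checks a downloaded artifact against a pinned SHA-256 digest before it is allowed into a build. It is a generic example rather than something taken from S2C2F or from Phylum's tooling, and the function names and the `pinned` mapping are hypothetical. Note its limits: pinning detects an artifact that changes after it was reviewed, but it would not by itself catch a backdoor that was already present in the release that was pinned.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(name: str, data: bytes, pinned: dict[str, str]) -> bool:
    """Return True only if the artifact is pinned and its digest matches.

    `pinned` is a hypothetical mapping of artifact name to expected SHA-256
    hex digest, e.g. loaded from a lockfile kept under version control.
    """
    expected = pinned.get(name)
    if expected is None:
        # Unknown artifacts are rejected rather than trusted by default.
        return False
    return sha256_hex(data) == expected.lower()
```

Rejecting unpinned artifacts by default is a deliberate choice here: a new or renamed dependency then requires an explicit review step instead of flowing silently into the build.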