Build System and Version Control Compromises - the New Normal

Photo by Joshua Sortino / Unsplash

While SolarWinds made headlines a few months ago for the sheer scope of its impact, a sharp uptick in build and version control system compromises has followed, targeting third-party tools and open source applications. This is a massive problem that will only get worse as attackers continue to focus on this new attack vector.

Scope and Impact

Third-party libraries and tools provide a very attractive attack surface in modern environments. As automation across the software development lifecycle has continued to evolve and enable faster development and deployment cycles, fewer eyes are consistently on each stage in the process. Overall, this has been a good thing, allowing development to progress much more quickly than in the past and greatly reducing the burden of product delivery. However, it also means that there are now parts of the development pipeline that remain largely undefended.

Over the past few years, development has largely migrated to cloud-hosted services, and software supply chain complexity has increased greatly, leading to an explosion in publicly accessible attack surface. While this sort of attack has only really started to garner public attention in the last year or so with the compromise of Twilio's SDK, high-profile incidents within this domain have continued to pop up, with alleged nation-state actors and a broad spectrum of other threats and vectors of compromise appearing at a rapidly increasing rate.

Automation-Driven Threats

So what is the common thread that unifies these supply chain-borne threats? A combination of build automation and poorly defined security models. Even 3-5 years ago, the software supply chain landscape looked very different. Open source package ecosystems were a small fraction of their current size. For reference, NPM alone was a scant 12,500 packages in size in 2015, the time when most package analysis products were getting off the ground. Today that number is over 1,500,000. Further, in 2015, far fewer organizations had fully automated their build and deployment processes than have today.

If that wasn't bad enough, many organizations now leverage cloud-hosted services during both their build and deployment processes, where more assets are publicly accessible and permissions become much more complex, creating even more exposure. To make matters worse, components such as Continuous Integration/Continuous Deployment (CI/CD) infrastructure and containers also essentially crowdsource more parts of the process.

What This Means

In essence, this now creates a complex system for building and deploying sensitive assets that are core to organizational function, where most of the pieces are effectively untrusted code. This starts with managing access controls on potentially public-facing infrastructure, such as S3 buckets, and continues through the entire supply chain, including open-source packages and infrastructure-as-code (IaC) components.

Each of these components, minus the public-facing infrastructure, has the ability to directly execute code written by untrusted authors. This provides the opportunity for a malicious author to compromise not only the final deliverable (that is, the software that will end up in production), but also every piece of infrastructure along the way, from developer workstations, to the CI/CD components themselves, to production systems where core software components reside. Each of these components provides attackers with the tools necessary to execute a SolarWinds-style attack, compromising build servers and injecting malicious code into any portion of a build they choose.
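One concrete way that third-party code runs without anyone asking for it is through install-time hooks. As a minimal sketch, assuming the npm ecosystem and its `preinstall`, `install`, and `postinstall` lifecycle scripts, the following Python lists the commands a package would run automatically during installation. The function name and the example manifest are hypothetical and invented for illustration; this is a starting point for review, not a complete detection method.

```python
import json

# Lifecycle hooks that npm runs automatically while installing a package.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(manifest_text: str) -> dict:
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(manifest_text)
    scripts = manifest.get("scripts", {})
    if not isinstance(scripts, dict):
        return {}
    # Keep only the hooks that execute without the user asking for them.
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest, invented for illustration only.
EXAMPLE_MANIFEST = json.dumps({
    "name": "example-package",
    "version": "1.0.0",
    "scripts": {
        "test": "node test.js",
        "postinstall": "node collect.js",
    },
})

if __name__ == "__main__":
    for hook, command in find_install_hooks(EXAMPLE_MANIFEST).items():
        print(f"{hook}: {command}")
```

A hook showing up here is not evidence of malice, since many legitimate packages compile native code at install time. It only marks where untrusted code gets to execute on a workstation or build server.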

Working Toward a Solution

In short, modern development employs complex systems, backed by huge amounts of third-party infrastructure and software. As this landscape has shifted and evolved in recent years, security technologies have struggled to keep up. New solutions that go beyond a primitive analysis of known vulnerabilities and issues are required to keep pace with attackers in this much larger, more dynamic environment.

Getting a handle on the risk these new attack surfaces pose begins with good security hygiene. Minimizing dependencies, strict versioning, auditing (where possible), and employing principles of least privilege are absolutely critical to success. Unfortunately, however, these practices are very difficult to maintain in the real world. Dependency management and build processes in modern software ecosystems are hopelessly complex, and even if you manage to strictly version the software your projects rely upon, you are still reliant on the software libraries upstream to behave properly. To make matters worse, many packages are connected directly to version control services like GitHub or GitLab, which have no protections against any modifications to published software.
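To make "strict versioning" concrete, here is a minimal sketch, assuming npm-style version specifiers, that flags dependencies not pinned to one exact version. The regular expression is a deliberate simplification of semantic versioning, and the function names and example dependency map are hypothetical. As noted above, pinning narrows what can change underneath a build but does nothing about how the pinned code itself behaves.

```python
import re

# Simplified exact-version pattern: MAJOR.MINOR.PATCH with optional
# pre-release and build metadata. Ranges, tags, and URLs do not match.
EXACT_VERSION = re.compile(
    r"\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?"
)


def is_exact_pin(spec: str) -> bool:
    """Return True when the specifier names exactly one version."""
    return EXACT_VERSION.fullmatch(spec.strip()) is not None


def unpinned_dependencies(dependencies: dict) -> list:
    """Return the sorted names of dependencies that allow version drift."""
    return sorted(
        name for name, spec in dependencies.items() if not is_exact_pin(spec)
    )


# Hypothetical dependency map, invented for illustration only.
EXAMPLE_DEPENDENCIES = {
    "alpha": "1.4.2",    # exact pin
    "beta": "^2.0.0",    # caret range: accepts newer minor and patch releases
    "gamma": "~0.9.1",   # tilde range: accepts newer patch releases
    "delta": "latest",   # dist-tag: resolves to whatever is tagged at install
}

if __name__ == "__main__":
    print(unpinned_dependencies(EXAMPLE_DEPENDENCIES))
```

A check like this covers only direct dependencies declared in a manifest. Transitive dependencies would need a lockfile or an equivalent mechanism, which is part of why the hygiene described above is hard to sustain.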

To that end, we feel that a new approach is necessary. The problems facing organizations with modern development processes in 2021 are markedly different from those of even 5 years ago; the sheer scope and scale of libraries, packages, containers, and plugins have grown so large that simply relying on the community to self-police no longer works. Identifying issues at the speed of manual audits and security research may work when the open source ecosystem has thousands (and perhaps even tens of thousands) of published libraries, but utterly fails when there are tens of millions of components to reason about.

This modern landscape requires at-scale automation, and just as importantly, the ability to effectively triage issues as they emerge. Anything less will leave organizations vulnerable to emerging threats within their software supply chain, and overwhelm already overburdened security professionals with meaningless alerts.

Phylum Research Team


Hackers, Data Scientists, and Engineers responsible for the identification and takedown of software supply chain attackers.