Apr 24, 2024

Series: How Malicious Python Code Gains Execution

The primary vector for malicious code running in software developer environments (e.g., local systems, CI/CD runners, production servers) is software dependencies. This is third-party code, which often means open-source software, also known as running code from strangers on the internet.

The prized goal for attackers is arbitrary code execution. It’s the stuff high CVE scores are made of, and it is often what turns a vulnerability into an exploit. It’s the foothold needed to run cryptominers, steal secrets, or encrypt data for ransom. It’s no wonder threat actors want it, but how do they get it? Sutton’s Law (go where the money is) makes it obvious why they go after open-source software: because executing arbitrary code is easy there.
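To make "easy" concrete, here is a minimal and harmless sketch of the underlying mechanism: importing a Python module runs all of its top-level code. The module name `innocent_looking_pkg` and its contents are hypothetical, written to a temporary directory only for this demonstration. The "payload" is a benign string assignment standing in for whatever an attacker might place there.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
PACKAGE_NAME = "innocent_looking_pkg"

# Benign stand-in for a payload: top-level code that runs at import time,
# before the caller has used anything from the module.
MODULE_SOURCE = '''
import os
SIDE_EFFECT = f"ran at import time in process {os.getpid()}"

def add(a, b):
    return a + b
'''


def import_from_temp_dir():
    """Write the module to a temporary directory, import it, and return it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{PACKAGE_NAME}.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return importlib.import_module(PACKAGE_NAME)
        finally:
            sys.path.remove(tmp)


module = import_from_temp_dir()
print(module.SIDE_EFFECT)  # set by top-level code, not by any call we made
print(module.add(2, 3))    # the advertised functionality still works
```

The caller only asked for `add`, yet the module's top-level statements had already executed by the time the import returned. That is the property the rest of this series builds on.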

This series examines the methods by which malicious Python code gains execution. Some of the methods are obvious, and some are potentially undiscovered or at least not yet found in the wild. What most of them have in common is reliance on a software dependency in the form of a Python package, which is where we begin.

Python Package Spoofing

Threat modeling is a useful defensive exercise to predict and prevent future attacks. By thinking like a malicious actor, we can identify the attack surface, enumerate possible compromise vectors, and neutralize them with considered countermeasures. As security researchers, we’ll don a hat of a darker color to put ourselves in the right mindset. Certainly not a black hat, but maybe more of a gray thinking cap. The remainder of the series documents these findings.

Python Trojan Functions and Imports
Python Package Installation Attacks
Devious Python Build Requirements
Modern Python Build Hooks
Adding Spurious Wheels to PyPI
Python Executable Hooks
Compiled Python Files
(More links will be added here as new posts in the series are published)

Putting our white hat back on, there are countermeasures to protect developers from these attacks. First, use a lockfile every time an environment is created to ensure reproducibility. Then, guard against any changes to that lockfile by automatically monitoring the health of the lockfile and the dependencies contained therein. Finally, don’t allow arbitrary code to run anywhere in your development process.
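As a rough illustration of the first two countermeasures, the sketch below checks a pip-style requirements lockfile for entries that are not pinned to an exact version or that lack a `--hash` option, and computes a fingerprint of the lockfile so that unexpected changes can be noticed. The function names, the regular expression, and the example lockfile (with hypothetical package names and placeholder hash values) are assumptions made for illustration. This is not Phylum's method, and it does not replace pip's own hash-checking mode (`--require-hashes`).

```python
import hashlib
import re

# An exactly pinned requirement looks like "name==1.2.3" or "name[extra]==1.2.3".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^\s;\\]+")


def logical_lines(text):
    """Join backslash-continued lines, then drop blank lines and comments."""
    joined = re.sub(r"\\\r?\n", " ", text)
    for line in joined.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            yield line


def unpinned_or_unhashed(text):
    """Return requirement lines lacking an exact pin or a --hash option."""
    problems = []
    for line in logical_lines(text):
        if line.startswith("-"):  # global option lines such as --index-url
            continue
        if not PINNED.match(line) or "--hash=" not in line:
            problems.append(line)
    return problems


def fingerprint(text):
    """SHA-256 of the lockfile contents, for detecting unexpected changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical lockfile; package names and hash values are placeholders.
EXAMPLE_LOCKFILE = """\
# generated lockfile (illustrative only)
alpha==1.0.0 \\
    --hash=sha256:aaaa
beta>=2.0
gamma==3.1
"""

print(unpinned_or_unhashed(EXAMPLE_LOCKFILE))
print(fingerprint(EXAMPLE_LOCKFILE))
```

Here `beta>=2.0` is flagged because it is a range and not an exact pin, and `gamma==3.1` is flagged because it has no hash. A CI job could store the fingerprint from the last reviewed lockfile and fail when it changes without review.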

Phylum can detect, report, and block malicious packages. Solutions that look only for known vulnerabilities will miss this entire risk domain. Use Phylum to analyze dependencies. Integrations exist to guard PRs with a free GitHub app or a GitHub action. There is also a CLI and pre-commit hook for local development, as well as a phylum Python package that can be installed with pip or pipx. Additional supported CI platforms include GitLab CI, Azure Pipelines, and Bitbucket Pipelines, with more coming.

At the time of this writing, Phylum offers Python lockfile and manifest support for pip, pipenv, and poetry. A free community edition is available for everyone to automate software supply chain security to block new risks, prioritize existing issues, and only use trusted open-source code.


Charles Coggins

Senior Software Engineer, responsible for integrations and author of the "phylum" Python package. Documentation and quality champion, runner, baseball and scout dad, pod-faster, and lover of outdoors.

