Python Trojan Functions and Imports

Python Trojan Functions and Imports | Phylum
This is part of a series of posts examining the methods by which malicious Python code gains execution. If you haven't already, you'll likely want to start with the core concept of package spoofing.

Calling a trojan function

This method is perhaps the most obvious one: add code to existing functions. What easier way to gain code execution in Python than to write a function and let users call it? What better way to ensure users call that function than to modify an existing function they already call? It really is that simple.

Expanding on the spoofed certifi example, that package offers two functions, either of which is likely to be called. We'll modify the contents() function since it is shorter. It is defined three times, once for each range of supported Python versions, so the malicious code is added to all three definitions. It can run before or after the intended functionality:

❯ git diff -U1 certifi/core.py
diff --git a/certifi/core.py b/certifi/core.py
index 91f538b..b130060 100644
--- a/certifi/core.py
+++ b/certifi/core.py
@@ -46,2 +46,3 @@ if sys.version_info >= (3, 11):
     def contents() -> str:
+        print("[!] Malicious code goes here")
         return files("certifi").joinpath("cacert.pem").read_text(encoding="ascii")
@@ -82,2 +83,3 @@ elif sys.version_info >= (3, 7):
     def contents() -> str:
+        print("[!] Malicious code goes here")
         return read_text("certifi", "cacert.pem", encoding="ascii")
@@ -113,2 +115,3 @@ else:
     def contents() -> str:
+        print("[!] Malicious code goes here")
         return read_text("certifi", "cacert.pem", encoding="ascii")

These print statements are not malicious but serve to show when the malicious code runs during execution of this newly modified package:

❯ python
Python 3.12.2 (main, Feb 14 2024, 10:56:22) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import certifi
>>> certifi.contents()
[!] Malicious code goes here
'\n# Issuer: CN=GlobalSign Root CA O=GlobalSign nv-sa OU=Root CA\n# Subject: CN=GlobalSign Root CA O=GlobalSign nv-sa OU=Root CA\n# Label: "GlobalSign Root CA"\n# Serial: 4835703278459707669005204\n# MD5 Fingerprint: 3e:45:52:15:09:51:92:e1:b7:5d:37:9f:b1:87:29:8a\n# SHA1 Fingerprint: b1:bc:96:8b:d4:f4:9d:62:2a:a8:9a:81:f2:15:01:52:a4:1d:82:9c\n# SHA256 Fingerprint: eb:d4:10:40:e4:bb:3e:c7:42:c9:e3:81:d3:1e:f2:a4:1a:48:b6:68:5c:96:e7:ce:f3:c1:df:6c:d4:33:1c:99\n-----BEGIN CERTIFICATE-----\nMII
---TRIMMED-FOR-BREVITY---
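To see why this is hard to notice from the caller's side, here is a minimal sketch that stands apart from certifi. The names `CA_BUNDLE_TEXT`, `contents_original`, `contents_trojaned`, and `payload_log` are all hypothetical stand-ins, and the "payload" only records that it ran:

```python
# Hypothetical stand-in for the text a real contents() function would return.
CA_BUNDLE_TEXT = "-----BEGIN CERTIFICATE-----\n(placeholder)\n-----END CERTIFICATE-----\n"

# Stand-in for attacker behavior: it only records that it ran.
payload_log: list[str] = []


def contents_original() -> str:
    """The untouched function: returns the bundle text and nothing else."""
    return CA_BUNDLE_TEXT


def contents_trojaned() -> str:
    """Same function with one extra line added ahead of the real work."""
    payload_log.append("payload ran")  # the added line
    return CA_BUNDLE_TEXT


# The caller sees identical return values from both versions.
print(contents_original() == contents_trojaned())  # True
print(payload_log)  # ['payload ran']
```

The return value is the same either way, so a test suite that only checks results would likely pass against the modified function. The side effect is the only difference.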

An advantage of this method is that it works equally well for source distributions (aka tarballs) and built distributions (aka wheels). The malicious code only runs when the affected function runs, so it doesn't matter how the package is installed, only that it is installed. What if we want an easier infection vector? One that doesn't rely on a specific function being called?

Importing a trojan package or module

Instead of adding malicious code to an existing function, it is possible to lift that code up to the package or module level so that it runs on import:

  • import package: the __init__.py from package runs
  • from package import subpackage: the __init__.py from package runs first, followed by the __init__.py from subpackage

In both cases, the __init__.py file runs so we can put our malicious code there:

❯ git diff -U5 certifi/__init__.py
diff --git a/certifi/__init__.py b/certifi/__init__.py
index 1c91f3e..3894b45 100644
--- a/certifi/__init__.py
+++ b/certifi/__init__.py
@@ -1,4 +1,6 @@
 from .core import contents, where

 __all__ = ["contents", "where"]
 __version__ = "2024.02.02"
+
+print("[!] Malicious code goes here")

And see it run upon import:

❯ python
Python 3.12.2 (main, Feb 14 2024, 10:56:22) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import certifi
[!] Malicious code goes here
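The two import forms listed earlier can be checked without touching a real package. The following is a sketch under stated assumptions: `demo_pkg` and its `sub` subpackage are hypothetical names, created in a temporary directory, and their `__init__.py` files contain nothing but a print statement:

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a throwaway package named demo_pkg (hypothetical) with one subpackage."""
    pkg = root / "demo_pkg"
    sub = pkg / "sub"
    sub.mkdir(parents=True)
    (pkg / "__init__.py").write_text('print("demo_pkg/__init__.py ran")\n')
    (sub / "__init__.py").write_text('print("demo_pkg/sub/__init__.py ran")\n')


def import_and_capture(statement: str) -> list[str]:
    """Run an import statement in a fresh namespace and return the lines it printed."""
    # Forget earlier imports of the demo package so module-level code runs again.
    for name in [m for m in sys.modules if m == "demo_pkg" or m.startswith("demo_pkg.")]:
        del sys.modules[name]
    importlib.invalidate_caches()
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(statement, {})
    return buffer.getvalue().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    try:
        plain_import = import_and_capture("import demo_pkg")
        from_import = import_and_capture("from demo_pkg import sub")
    finally:
        sys.path.remove(tmp)

print(plain_import)  # ['demo_pkg/__init__.py ran']
print(from_import)   # ['demo_pkg/__init__.py ran', 'demo_pkg/sub/__init__.py ran']
```

Note that importing the subpackage also runs the parent package's `__init__.py` first, so code placed at the top level of a package runs no matter which part of the package is imported.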

This technique has the same advantage of working equally well for source and built distributions. It also casts a wider net, covering any use of the affected package instead of a single function. However, it still requires someone to deliberately run code that imports the package, and that might only happen on a hardened production server where there are fewer goodies to nab.
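Because import-time code has to sit at the top level of a module, it can sometimes be spotted without importing anything. Below is a rough sketch, not a real scanner, that uses the standard library's `ast` module to list top-level statements that are not imports, assignments, or definitions. The function name `suspicious_top_level` and the `BENIGN_TOP_LEVEL` allowlist are assumptions made up for this illustration:

```python
import ast

# Node types that commonly appear at the top of a benign __init__.py.
# This allowlist is an assumption for the sketch, not an established standard.
BENIGN_TOP_LEVEL = (
    ast.Import, ast.ImportFrom, ast.Assign, ast.AnnAssign,
    ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef,
)


def suspicious_top_level(source: str) -> list[tuple[int, str]]:
    """Return (line number, source text) for module-level statements worth a look."""
    findings = []
    for node in ast.parse(source).body:
        # A bare string expression is a docstring, not executable behavior.
        is_docstring = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        if isinstance(node, BENIGN_TOP_LEVEL) or is_docstring:
            continue
        findings.append((node.lineno, ast.get_source_segment(source, node)))
    return findings


# The modified certifi/__init__.py from the diff above, as a string.
TROJANED_INIT = '''from .core import contents, where

__all__ = ["contents", "where"]
__version__ = "2024.02.02"

print("[!] Malicious code goes here")
'''

for lineno, text in suspicious_top_level(TROJANED_INIT):
    print(f"line {lineno}: {text}")
# line 6: print("[!] Malicious code goes here")
```

This is a coarse first pass at best. An assignment's right-hand side, a class body, or an imported sibling module can all run arbitrary code on import, and none of those are flagged here.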

Do look a trojan horse in the mouth

Developers are the new high-value targets for attackers infecting the software supply chain. Threat actors know developers control the keys to the kingdom: SSH, GPG, and signing keys; API tokens, infrastructure secrets, and proprietary source code. The malicious code inserted in otherwise benign functions and packages could steal these keys and exfiltrate them with webhooks. We've seen this behavior before with takeovers of expired author or maintainer domains and with compromised accounts.

The techniques attackers use to trick developers into using malicious packages with these trojan features include typosquatting, starjacking, dependency confusion, and lockfile injection.


Charles Coggins


Senior Software Engineer, responsible for integrations and author of the "phylum" Python package. Documentation and quality champion, runner, baseball and scout dad, pod-faster, and lover of outdoors.