This is part of a series of posts examining the methods by which malicious Python code gains execution.
This technique is primarily about obfuscating malicious Python code, but it also demonstrates a non-standard way for that code to gain execution. Compiled Python modules (*.pyc files) can be imported just like plain-text modules, but they are harder to analyze because they hold bytecode that must be disassembled or decompiled to discern its true intent.
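To make the analysis burden concrete, here is a minimal sketch of how a reviewer might look inside a compiled module using only the standard library. It assumes the CPython 3.7+ file layout (a 16-byte header followed by a marshalled code object) and that the file was compiled by the same Python version running the script. The module name example.py and the helper load_code_from_pyc are hypothetical.

```python
import dis
import marshal
import py_compile
import tempfile
from pathlib import Path


def load_code_from_pyc(pyc_path):
    """Return the code object stored in a .pyc file (CPython 3.7+ layout)."""
    data = Path(pyc_path).read_bytes()
    # The header is 16 bytes: magic number, flags, then either the source
    # mtime and size or a source hash (PEP 552). The rest is marshalled code.
    return marshal.loads(data[16:])


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.py"  # hypothetical module used for the demo
    src.write_text("import os\nvalue = os.getcwd()\n")
    pyc = py_compile.compile(
        str(src), cfile=str(src.with_suffix(".pyc")), doraise=True
    )
    code = load_code_from_pyc(pyc)

# Names referenced by the module-level bytecode give a quick first impression.
referenced_names = code.co_names

# A full disassembly is what an analyst has to read instead of source code.
dis.dis(code)
```

Disassembly recovers the instructions but not the original source, which is why compiled-only modules slow down review.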
PEP 3147 describes how compiled Python modules are created, stored, and used. It provides this hint as to how they can be leveraged for malicious purposes:
"For backward compatibility, Python will still support pyc-only distributions, however it will only do so when the pyc file lives in the directory where the py file would have been, i.e. not in the __pycache__ directory. [A] pyc file outside of __pycache__ will only be imported if the py source file is missing."
This legacy support path, highlighted below in the flowchart provided by PEP 3147, can be used to create packages with obfuscated intent.
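The behavior PEP 3147 describes can be checked with a short experiment. The sketch below is only an illustration, using throwaway module names (cached_only and legacy_spot, both hypothetical) in a temporary directory. It compiles two modules, deletes their sources, and tries to import each one.

```python
import importlib
import py_compile
import sys
import tempfile
from pathlib import Path


def try_import(name):
    """Return True if `name` can be imported from the current sys.path."""
    importlib.invalidate_caches()  # make the finder re-read the directory
    sys.modules.pop(name, None)
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sys.path.insert(0, tmp)
    try:
        # Case 1: compiled file lives only in __pycache__, source removed.
        src_a = root / "cached_only.py"
        src_a.write_text("LOADED = True\n")
        py_compile.compile(str(src_a), doraise=True)  # default cache location
        src_a.unlink()
        cached_only_imported = try_import("cached_only")

        # Case 2: compiled file sits where the source would have been.
        src_b = root / "legacy_spot.py"
        src_b.write_text("LOADED = True\n")
        py_compile.compile(
            str(src_b), cfile=str(root / "legacy_spot.pyc"), doraise=True
        )
        src_b.unlink()
        legacy_imported = try_import("legacy_spot")
    finally:
        sys.path.remove(tmp)

print("pyc only in __pycache__ imported:", cached_only_imported)
print("pyc in legacy location imported:", legacy_imported)
```

Only the second case succeeds, which is the legacy path the rest of this post relies on.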
The basic process can be illustrated by starting with the same spoofed certify package from an earlier entry in this blog series. Start by creating the malware in a plaintext Python module:
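The original listing is not reproduced here, so the following is a deliberately harmless stand-in. It assumes the module is named sneaky.py (matching the sneaky.pyc file mentioned later), and the run function and RESULT variable are hypothetical. The only point it illustrates is that module-level code runs as a side effect of being imported.

```python
"""sneaky.py: harmless stand-in for the module an attacker would hide."""
import sys


def run():
    # A real attack would do its work here. This placeholder only reports
    # that import-time code was executed.
    message = "sneaky module executed at import time"
    print(message, file=sys.stderr)
    return message


# Module-level call: this line runs whenever the module is imported.
RESULT = run()
```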
Then, ensure the module is executed. In this case, we add a standard import statement to the core module, which runs for every use of this package.
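A rough sketch of why one import line is enough: the script below builds a throwaway package named demo_pkg (a hypothetical name, chosen to avoid shadowing a real certifi install) whose core.py imports a sibling sneaky module. The file contents are simplified assumptions, not the real package's code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
    pkg.mkdir()
    # The package's public entry point re-exports from core, so importing
    # the package always executes core.py.
    (pkg / "__init__.py").write_text("from .core import where\n")
    (pkg / "core.py").write_text(
        "from . import sneaky  # the single added line\n"
        "\n"
        "def where():\n"
        "    return 'cacert.pem'\n"
    )
    (pkg / "sneaky.py").write_text("EXECUTED = True\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo_pkg = importlib.import_module("demo_pkg")
        # The user never asked for sneaky, yet it was loaded and executed.
        sneaky_was_loaded = "demo_pkg.sneaky" in sys.modules
    finally:
        sys.path.remove(tmp)

print("sneaky loaded as a side effect:", sneaky_was_loaded)
```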
Next, compile the source module and remove the original:
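One way to perform this step from Python is sketched below, using compileall.compile_file with legacy=True (the counterpart of the -b flag of python -m compileall). The helper name compile_and_strip is hypothetical, and the demo runs against a placeholder module in a temporary directory.

```python
import compileall
import tempfile
from pathlib import Path


def compile_and_strip(source_path):
    """Byte-compile one module to the legacy location and delete the source."""
    source_path = Path(source_path)
    # legacy=True writes sneaky.pyc beside sneaky.py instead of placing a
    # version-tagged file under __pycache__.
    ok = compileall.compile_file(str(source_path), legacy=True, quiet=1)
    if not ok:
        raise RuntimeError(f"failed to compile {source_path}")
    source_path.unlink()
    return source_path.with_suffix(".pyc")


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sneaky.py"
    src.write_text("EXECUTED = True\n")  # placeholder module body
    pyc = compile_and_strip(src)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(remaining)
```

After this runs, the directory holds only the compiled file, in the one place the importer will accept it without a source file.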
Finally, update the package_data entry in setup.py to ensure *.pyc files are included in the distribution:
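The exact setup.py is not shown here, so the excerpt below is an assumed shape rather than the real file. It supposes the import package is named certifi and that the existing patterns are *.pem and py.typed. The fnmatch loop only approximates how glob patterns in package_data select files.

```python
from fnmatch import fnmatch

# Assumed excerpt of the keyword arguments passed to setuptools.setup().
setup_kwargs = {
    "name": "certify",
    "version": "2024.2.2",
    "packages": ["certifi"],  # assumption: import package name
    "package_data": {
        # "*.pyc" is the one added pattern; the others are assumed existing.
        "certifi": ["*.pem", "py.typed", "*.pyc"],
    },
}

# Approximate the effect of those patterns on a hypothetical package folder.
files_in_package = ["cacert.pem", "py.typed", "sneaky.pyc", "notes.txt"]
patterns = setup_kwargs["package_data"]["certifi"]
included = [
    name
    for name in files_in_package
    if any(fnmatch(name, pattern) for pattern in patterns)
]
print(included)
```

Without the added pattern, the compiled module would be treated like any other non-source file and left out of the distribution.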
The source distribution will now contain sneaky.pyc, whose obfuscated malicious contents are executed when the certifi package is used:
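From a reviewer's side, the presence of compiled modules in a source distribution is easy to check. The sketch below is illustrative only: it builds a small in-memory archive with an assumed layout (certify-2024.2.2/certifi/...) and lists any members ending in .pyc. The helper pyc_members is hypothetical.

```python
import io
import tarfile


def pyc_members(sdist):
    """Return the names of compiled modules inside an opened tar archive."""
    return [
        member.name
        for member in sdist.getmembers()
        if member.isfile() and member.name.endswith(".pyc")
    ]


# Build a tiny in-memory archive that mimics an sdist (layout is assumed).
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    for name in (
        "certify-2024.2.2/certifi/core.py",
        "certify-2024.2.2/certifi/sneaky.pyc",
    ):
        payload = b"placeholder"
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tar:
    found = pyc_members(tar)

print(found)
```

A .pyc file with no matching .py file in a source distribution is unusual enough to deserve a closer look.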
This technique is limited in that *.pyc files are specific to the Python version that compiled them: each file begins with a magic number that the importer checks before loading it. For instance, the source distribution created in the example above was built with CPython 3.12 and will fail to import in an environment running a different version:
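The version check comes down to the first four bytes of the file. As a minimal sketch, the helper below (pyc_matches_interpreter, a hypothetical name) compares them with importlib.util.MAGIC_NUMBER for the running interpreter. Because a second interpreter is not available inside one script, the demo simulates a foreign file by corrupting the magic number.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def pyc_matches_interpreter(pyc_path):
    """Return True if the .pyc magic number matches the running Python."""
    with open(pyc_path, "rb") as handle:
        return handle.read(4) == importlib.util.MAGIC_NUMBER


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sneaky.py"
    src.write_text("EXECUTED = True\n")  # placeholder module body
    pyc = Path(
        py_compile.compile(
            str(src), cfile=str(src.with_suffix(".pyc")), doraise=True
        )
    )
    native_ok = pyc_matches_interpreter(pyc)

    # Simulate a file produced by another Python version by altering
    # the first byte of the magic number.
    data = bytearray(pyc.read_bytes())
    data[0] ^= 0xFF
    pyc.write_bytes(bytes(data))
    foreign_ok = pyc_matches_interpreter(pyc)

print("freshly compiled file matches:", native_ok)
print("altered file matches:", foreign_ok)
```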
The solution is to create multiple built distributions, one for each of the targeted Python versions, with matching compiled modules. Instead of a single certify-2024.2.2.tar.gz source distribution, certify-2024.2.2-cp312-none-any.whl and certify-2024.2.2-cp311-none-any.whl wheels would be created for CPython 3.12 and CPython 3.11, respectively. The code for doing so is left as an exercise for the reader.
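The build itself remains an exercise, but the wheel filenames already show which interpreter each file targets. The sketch below is a simplified reading of the wheel naming convention (name, version, optional build tag, then Python, ABI, and platform tags). It ignores compressed tag sets such as py2.py3 and is not how installers actually resolve compatibility. Both helper names are hypothetical.

```python
import sys


def wheel_python_tag(filename):
    """Extract the Python tag from a wheel filename (simplified parsing)."""
    stem = filename[: -len(".whl")]
    parts = stem.split("-")
    # Layout: name-version[-build]-python-abi-platform, so the Python tag
    # is always third from the end.
    return parts[-3]


def current_cpython_tag():
    """Tag for the running interpreter, assuming it is CPython."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


wheels = [
    "certify-2024.2.2-cp312-none-any.whl",
    "certify-2024.2.2-cp311-none-any.whl",
]
tags = {name: wheel_python_tag(name) for name in wheels}

# Simplification: treat a wheel as a match only on an exact tag comparison.
matching = [name for name in wheels if tags[name] == current_cpython_tag()]
print(tags)
print("matches this interpreter:", matching)
```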
This technique can be made even stealthier by using importlib from the standard library to load the compiled module dynamically instead of using a standard import statement. Previous reporting documents one such method, used by the malicious package fshec2.
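As an illustration of dynamic loading in general, and not necessarily the approach fshec2 took, the sketch below loads a compiled file through importlib.machinery.SourcelessFileLoader. The helper load_pyc and the module name hidden_module are hypothetical, and the payload is a harmless placeholder.

```python
import importlib.machinery
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def load_pyc(module_name, pyc_path):
    """Load a compiled module from an explicit path, with no import statement."""
    loader = importlib.machinery.SourcelessFileLoader(module_name, str(pyc_path))
    spec = importlib.util.spec_from_file_location(
        module_name, str(pyc_path), loader=loader
    )
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)  # runs the module-level bytecode
    return module


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sneaky.py"
    src.write_text("EXECUTED = True\n")  # placeholder module body
    pyc = src.with_suffix(".pyc")
    py_compile.compile(str(src), cfile=str(pyc), doraise=True)
    src.unlink()

    # The name given here need not match the file name on disk.
    module = load_pyc("hidden_module", pyc)

print(module.__name__, module.EXECUTED)
```

Because no import statement names the module, a search of the package source for suspicious imports would not surface it. Searching for importlib usage and stray .pyc files is the more useful check.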
Senior Software Engineer, responsible for integrations and author of the "phylum" Python package. Documentation and quality champion, runner, baseball and scout dad, pod-faster, and lover of outdoors.