Malicious Open-Source Package Authors are Bad, and Should Feel Bad


With the numerous publications around malware findings in open source, it is no secret that malware is pervasive. What may come as a surprise (but really shouldn’t) is that most of this reported malware is terrible. Not terrible in the same way you might refer to a piece of malware that installs ransomware on hospital infrastructure as terrible; terrible as in, it is incredibly amateurish and unlikely to even function at all.

Nonetheless, it is prudent to take all findings seriously until analysis proves otherwise.

On September 11, 2022, we identified a new malware publication immediately after it was released [1].

color-vividpy

The package name seems benign enough. However, cracking open the only executable file, __init__.py, we are met with a wall of obfuscated Python.


copyright='color-vividpy'
class Pyobfuscate_com():
 def __init__(self:object,_system:float=(7.5 == 8.9) or (9.2 != 9.2),_bits:float=0,*_rasputin:str,**_decode:bool)->exec:self._delete,_system,self._bytes,self._exec,self._bit,_decode[_bits]=lambda _eval:"".join(chr(int(_encode)-len(_eval.split('~')))if _encode!='*'else'ζ'for _encode in str(_eval).split('~')),lambda _system:exit()if self._bit[15]+self._bit[17]+self._bit[8]+self._bit[13]+self._bit[19] in open(__file__, errors=self._bit[8]+self._bit[6]+self._bit[13]+self._bit[14]+self._bit[17]+self._bit[4]).read() or self._bit[8]+self._bit[13]+self._bit[15]+self._bit[20]+self._bit[19] in open(__file__, errors=self._bit[8]+self._bit[6]+self._bit[13]+self._bit[14]+self._bit[17]+self._bit[4]).read()else"".join(_system if _system not in self._bit else self._bit[self._bit.index(_system)+1 if self._bit.index(_system)+1<len(self._bit)else 0]for _system in "".join(chr(ord(t)-651063)if t!="ζ" else"\n"for t in self._delete(_system))),lambda _boom:_system(_boom),lambda _system:str(_decode[_bits](f"{self._bit[4]+self._bit[-13]+self._bit[4]+self._bit[2]}(''.join(%s),{self._bit[6]+self._bit[11]+self._bit[14]+self._bit[1]+self._bit[0]+self._bit[11]+self._bit[18]}())"%list(_system))).encode(self._bit[20]+self._bit[19]+self._bit[5]+self._bit[34])if _decode[_bits]==eval else exit(),exit()if else'abcdefghijklmnopqrstuvwxyz0123456789',eval;return self.__pyobfuscate__(_decode[(self._bit[-1]+'_')[-1]+self._bit[18]+self._bit[15]+self._bit[0]+self._bit[17]+self._bit[10]+self._bit[11]+self._bit[4]])
 def __pyobfuscate__(self,_execute: str)->exec:return(type(None)(),self._exec(self._bytes(_execute)))[0]
_=__import__((b'z\x00l\x00i\x00b\x00').decode('\x75\x74\x66\x2d\x31\x36\x2d\x6c\x65...

# ... Clipped for brevity ...

The copyright (more on this later - it’s important!) seems to indicate an attempt at protecting some intellectual property. Still, being the professionals we are, we aren’t easily deceived, so we press on.

🟢 For those of you wishing to follow along at home, malware samples are available via our free service.

Pyobfuscate.com

This particular sample was obfuscated using https://pyobfuscate.com/py, which, amusingly, provides you with an option to “encrypt” your Python source code (did they mean obfuscate?). As an unrelated aside, it also seems like this service was just shamelessly ripped off.

Poking at the periphery of this obfuscation, we can get a decent sense as to what it’s doing:

Attempting to hide imports (this is really just zlib)


_=__import__((b'z\x00l\x00i\x00b\x00').decode('\x75\x74\x66\x2d\x31\x36\x2d\x6c\x65...

Performing some rudimentary checks to ensure we haven’t deobfuscated the source


lambda _system:exit()if self._bit[15]+self._bit[17]+self._bit[8]+self._bit[13]+self._bit[19] in open(__file__, errors=self._bit[8]+self._bit[6]+self._bit[13]+self._bit[14]+self._bit[17]+self._bit[4]).read()...

Eventually executing an exec operation to unroll the obfuscation and run the underlying Python code


def __pyobfuscate__(self,_execute: str)->exec:return(type(None)(),self._exec(self._bytes(_execute)))[0]
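
The string hiding in these snippets can be undone with a few lines of ordinary Python. The sketch below assumes the 36-character alphabet that the sample assigns to self._bit, and only the visible part of the escaped codec name (the sample line is clipped after it). The helper name from_bits is ours and does not appear in the malware.

```python
# Alphabet the sample assigns to self._bit
BIT = "abcdefghijklmnopqrstuvwxyz0123456789"


def from_bits(*indices: int) -> str:
    """Rebuild a string hidden as self._bit[i]+self._bit[j]+... (hypothetical helper)."""
    return "".join(BIT[i] for i in indices)


# Hidden import: the module name is UTF-16-LE bytes, and the codec name
# is itself written as hex escapes.
codec = "\x75\x74\x66\x2d\x31\x36\x2d\x6c\x65"
module_name = b"z\x00l\x00i\x00b\x00".decode(codec)
print(codec, module_name)  # utf-16-le zlib

# Strings the self-protection lambda searches for in its own source file
print(from_bits(15, 17, 8, 13, 19))  # print
print(from_bits(8, 13, 15, 20, 19))  # input

# The errors= argument passed to open()
print(from_bits(8, 6, 13, 14, 17, 4))  # ignore

# Names spliced into the string that is handed to eval
print(from_bits(4, -13, 4, 2))  # exec
print(from_bits(6, 11, 14, 1, 0, 11, 18))  # globals
```

Read this way, the "rudimentary checks" exit whenever the file's own source contains the text print or input.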

If we took the time to unroll this by hand, we’d eventually arrive at something resembling the following before running through the exec operation, but this sounds like a lot of work.


*~675830~675834~675837~675836~675839~675841~675758~675836~675840~675758~675770~675758~675840
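
Unrolling by hand amounts to inverting the first two lambdas in __init__. What follows is our reading of that one-liner restated as readable Python, a sketch rather than the obfuscator's actual source. The constant 651063 and the alphabet come from the sample; the names decode and encode are ours, and encode is a hypothetical inverse written only to exercise the decoder. Because the number of '~'-separated tokens is itself subtracted from every value, a clipped payload like the one above cannot be decoded on its own.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"  # value of self._bit in the sample
OFFSET = 651063  # constant subtracted in the sample's second lambda


def decode(payload: str) -> str:
    """Readable restatement of the sample's decoding steps (our interpretation)."""
    tokens = payload.split("~")
    count = len(tokens)  # the token count is part of the arithmetic
    out = []
    for token in tokens:
        if token == "*":  # '*' stands in for a newline
            out.append("\n")
            continue
        ch = chr(int(token) - count - OFFSET)
        if ch in ALPHABET:  # rotate a-z0-9 forward by one, wrapping around
            ch = ALPHABET[(ALPHABET.index(ch) + 1) % len(ALPHABET)]
        out.append(ch)
    return "".join(out)


def encode(text: str) -> str:
    """Hypothetical inverse of decode, used only for the round trip below."""
    count = len(text)  # one token per character
    tokens = []
    for ch in text:
        if ch == "\n":
            tokens.append("*")
            continue
        if ch in ALPHABET:  # rotate backward by one (index -1 wraps to '9')
            ch = ALPHABET[ALPHABET.index(ch) - 1]
        tokens.append(str(ord(ch) + OFFSET + count))
    return "~".join(tokens)


sample = "import os , sys\nprint(1)"
assert decode(encode(sample)) == sample
print(encode("hi"))  # 651168~651169
```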

Deobfuscating

Here’s the thing about interpreted languages: you must eventually give the interpreter something it understands. In other words, the code must run. Replacing the exec operation with print should give us the plain, unobfuscated source code. So… let’s just do that.

Replacing the __pyobfuscate__ definition with the following should do the trick:


def __pyobfuscate__(self,_execute: str)->exec:print(self._bytes(_execute))

Running the new code [2] will result in… absolutely nothing. We’ve triggered one of the self-protection mechanisms that make a call to exit() prematurely and end the process.

If you look closely at these protections, they read the source code of __file__ and exit if it contains certain strings (the self._bit lookups spell out print and input), to ensure we haven’t mucked with it. Couldn’t we just swap out __file__ for the path to the unadulterated malware? Yes, yes, we can.


...+self._bit[19] in open("/path/to/originalMalwareSample.py", errors=self._bit[8]+...
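
Both changes are plain text substitutions, so they can be scripted. Here is a minimal sketch, assuming the sample's source has already been read into a string. The names patch_source, EXEC_TAIL, and PRINT_TAIL are ours, and the patched result should only ever be run inside an isolated VM.

```python
# Tail of the original __pyobfuscate__ definition, as it appears in the sample
EXEC_TAIL = "exec:return(type(None)(),self._exec(self._bytes(_execute)))[0]"
# Replacement that prints the decoded source instead of executing it
PRINT_TAIL = "exec:print(self._bytes(_execute))"


def patch_source(source: str, original_path: str) -> str:
    """Apply both edits to the obfuscated source text (hypothetical helper).

    original_path should point at an untouched copy of the sample, so the
    self-protection check reads a file that still passes its own test.
    """
    patched = source.replace("open(__file__", f"open({original_path!r}")
    return patched.replace(EXEC_TAIL, PRINT_TAIL)


# Tiny stand-in for the real sample, just to show the effect
demo = (
    "x in open(__file__, errors=e).read()\n"
    " def __pyobfuscate__(self,_execute: str)->" + EXEC_TAIL
)
print(patch_source(demo, "/path/to/originalMalwareSample.py"))
```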

Re-running our updated code correctly spits out the deobfuscated Python [3]:


import os , sys

fr = os.path.basename(__file__)

string1 = 'Obfuscated by https://pyobfuscate.com'

file1 = open(fr, "r")
readfile = file1.read()

if string1 in readfile: 
    pass
else: 
    print("Don't Edit/Remove Copyright")
    input("Press Enter to Exit")
    sys.exit("")
file1.close()

import os
import time
from datetime import datetime

# ... Clipped for brevity ...

Spooky and Ineffectual

The source is no doubt an attempt at maliciousness, boasting such things as [4]:

  • AES encryption!
  • Checking for processes like Wireshark, Fiddler, etc.
  • Password stealers
  • Screenshot exfiltration
  • Looking for PayPal, Coinbase, and Binance authentication

But here’s the thing: none of this works. Even if execution made it this far (it won’t, for reasons we’ll explain shortly), the third-party dependencies it needs are missing, so the imports at the top of the file will fail with an exception before we ever get to the spookiness.

Even worse, we won’t get to the point of actually attempting to import anything, due to this block of code in our deobfuscation:


string1 = 'Obfuscated by https://pyobfuscate.com'

file1 = open(fr, "r")
readfile = file1.read()

if string1 in readfile: 
    pass
else: 
    print("Don't Edit/Remove Copyright")
    input("Press Enter to Exit")
    sys.exit("")
file1.close()

Remember the copyright we noted earlier?


copyright='color-vividpy'

Because the malware author changed the copyright string, the file no longer contains 'Obfuscated by https://pyobfuscate.com', the check fails, and the rest of the malware never executes. Thwarted by their own attempts at obfuscation. Embarrassing.
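
To make the failure concrete, here is a small sketch of the same membership test applied to the header the package actually shipped with. The function name copyright_check_passes is ours, and the second input is a hypothetical file that still carries the obfuscator's marker, included only for contrast.

```python
MARKER = "Obfuscated by https://pyobfuscate.com"  # string1 in the deobfuscated code


def copyright_check_passes(file_text: str) -> bool:
    """Mirror of the deobfuscated check: the marker must appear in the file text."""
    return MARKER in file_text


# First lines of the sample as published
shipped = "copyright='color-vividpy'\nclass Pyobfuscate_com():\n"
print(copyright_check_passes(shipped))  # False

# Hypothetical file that kept the marker (an assumption, for contrast only)
with_marker = "copyright='" + MARKER + "'\nclass Pyobfuscate_com():\n"
print(copyright_check_passes(with_marker))  # True
```

Note also that the deobfuscated code opens os.path.basename(__file__), a relative name, so the real check additionally depends on the current working directory.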

It Doesn’t Have to Be Good, If No One Is Looking

The fact of the matter is, attackers don't have to try very hard to have some level of success. Why should they burn a 0-day when the stupid thing suffices? [5]

Consider the coa and rc attacks from last year. The attackers managed to release malicious versions of legitimate npm packages used by millions of developers. The attack probably would have been wildly successful, except that it broke build pipelines.

The reality is that it is less work to distribute a handful of malicious packages than it is to perform a security review of each published package. Without a priori knowledge of which packages are malicious, the balance of effort tilts heavily in favor of the malicious actors. And so, we must remain vigilant against all packages if we do not want to fall prey to even the most amateur malware authors.

Footnotes

[1] We worked with PyPI and this package has been removed from the ecosystem.

[2] Run the new code in a VM.

[3] Again, for those of you playing the home game, you could also just deobfuscate it all with a few lines of Bash:


sed "s~open(__file__~open(\"$FILE\"~g" "$FILE" > "$WORKING_FILE"
sed -i "s/exec:return(type(None)(),self._exec(self._bytes(_execute)))\[0\]/exec:print(self._bytes(_execute))/g" "$WORKING_FILE"

[4] This appears to be partially ripped off from https://github.com/venaxyt/Token-Grabber-Advanced/blob/main/Token Grabber.py

[5] If it's stupid and it works, is it really stupid? We should, as a community, make it harder for the stupid thing to even be viable.

Phylum Research Team


Hackers, Data Scientists, and Engineers responsible for the identification and takedown of software supply chain attackers.