Disass, script reverse engineering for dummies
In our daily job, we deal with malicious code all the time. In this field, we have historically used two approaches: dynamic analysis in our own sandbox, or manual static analysis using reverse engineering skills. Because static analysis can be tedious for known samples, we developed a framework to analyze malware automatically. We released Disass some time ago and gave a short presentation of the tool at Botconf 2013 in Nantes, France. We received many comments and questions, so we thought a blog post could help explain how Disass works.
Until last year, to automate static analysis, we wrote scripts (often in Python, because Python is cool) that highlight and extract relevant information from malicious binaries. But such scripts are seldom robust, and their behaviour is only guaranteed on the sample on which the manual analysis was done.
That's precisely why we wrote Disass. Basically, Disass is a binary analysis framework written in Python to ease the automation of malware reverse engineering. Its purpose is to automatically retrieve relevant information from malware, such as the C&C, the user agent, cipher keys, etc. Disass also lets you express a static analysis as human-readable code.
There are two types of disassembler algorithms: linear and flow-oriented. Disass is based on a linear disassembly module named "diStorm", which is a lightweight, easy-to-use and fast decomposer library.
A linear disassembler uses the size of the instruction it has just decoded to determine which byte to disassemble next, regardless of flow-control instructions. The advantage of linear disassembly is that it is well suited to working iteratively through a block of code. The drawback is that it cannot distinguish code from data. This limitation can be partially worked around with a tool such as pefile.
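To make the code/data problem concrete, here is a minimal sketch of a linear sweep. It is a toy: the opcode table covers only a handful of real x86 encodings that we picked by hand, and it has nothing to do with how diStorm decodes instructions. The names OPCODES and linear_sweep are ours, purely for illustration.

```python
# Toy linear-sweep disassembler over a tiny, hand-picked subset of x86.
# The opcode table is an illustrative assumption, not diStorm's decoder.

# opcode byte -> (mnemonic, total instruction length in bytes)
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0xEB: ("jmp short", 2),       # EB rel8
    0x6A: ("push imm8", 2),       # 6A ib
    0x68: ("push imm32", 5),      # 68 id
    0xB8: ("mov eax, imm32", 5),  # B8 id
    0xE8: ("call rel32", 5),      # E8 cd
}


def linear_sweep(code):
    """Decode `code` front to back, using only each instruction's length."""
    listing = []
    offset = 0
    while offset < len(code):
        # Unknown opcodes are emitted as a single raw byte ("db").
        mnemonic, size = OPCODES.get(code[offset], ("db", 1))
        listing.append((offset, mnemonic, size))
        # The next instruction starts right after this one;
        # jump and call targets are never followed.
        offset += size
    return listing


# A short jump over five bytes of data ("hello"), followed by nop and ret.
blob = bytes([0xEB, 0x05]) + b"hello" + bytes([0x90, 0xC3])

for offset, mnemonic, size in linear_sweep(blob):
    print("%04x  %-16s %s" % (offset, mnemonic, blob[offset:offset + size].hex()))
```

The jump skips the string at run time, but the sweep ignores that: the byte for "h" is 0x68, so the five data bytes are reported as one "push imm32". A flow-oriented disassembler would have followed the jump instead.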
Let's go deeper. To show how to use the framework, the example below runs a Disass script on a real piece of malware called "Trojan.Letsgo". This malware was made famous by Mandiant's APT1 report. Further information on this malware can be found at http://www.cyberengineeringservices....
We will therefore work on a sample of Trojan.Letsgo whose MD5 hash is: bcd2a7361d0a91a51123102a876c7af8
FIRST STAGE: manual analysis
When you reverse engineer the sample, you can quickly identify the function that connects to the C&C and see that the trojan tries to retrieve a web page. The image below shows this more clearly:
SECOND STAGE: Disass module creation
From the manual analysis, you know which functions are interesting. At the call to "InternetConnectA", you can see that the second argument is a variable containing the C&C address. The following lines retrieve this value:
    # Search call to function InternetConnectA
    disass.go_to_next_call('InternetConnectA')
    # InternetConnectA( hInternet, lpszServerName,
    #                   nServerPort, lpszUsername,
    #                   lpszPassword, dwService,
    #                   dwFlags, dwContext );
    addr_cc = disass.get_arguments(2)
    # extract string from address
    print " CC\t: %s" % disass.get_string(addr_cc)
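For readers wondering what "get the second argument of a call" means at the assembly level, here is a toy model of the idea. It is not Disass code: find_call and get_argument are hypothetical helpers, the instruction list is hand-written, and the address 0x00403010 is made up. It only relies on the stdcall convention, where arguments are pushed right to left before the call.

```python
# Toy model of recovering stdcall arguments from the pushes before a call.
# Instructions are (mnemonic, operand) pairs; all values are made up.

def find_call(instructions, name, start=0):
    """Return the index of the next call to `name` at or after `start`, or -1."""
    for index in range(start, len(instructions)):
        mnemonic, operand = instructions[index]
        if mnemonic == "call" and operand == name:
            return index
    return -1


def get_argument(instructions, call_index, position):
    """Return the operand of the `position`-th argument (1-based) of a call.

    stdcall pushes arguments right to left, so the last push before the
    call is argument 1, the one before it argument 2, and so on.
    """
    seen = 0
    for index in range(call_index - 1, -1, -1):
        mnemonic, operand = instructions[index]
        if mnemonic == "call":
            break  # stop at an earlier call; pushes above it are not ours
        if mnemonic == "push":
            seen += 1
            if seen == position:
                return operand
    return None


listing = [
    ("push", 0),           # dwContext
    ("push", 0),           # dwFlags
    ("push", 3),           # dwService (INTERNET_SERVICE_HTTP)
    ("push", 0),           # lpszPassword
    ("push", 0),           # lpszUsername
    ("push", 80),          # nServerPort
    ("push", 0x00403010),  # lpszServerName (hypothetical address)
    ("push", "esi"),       # hInternet
    ("call", "InternetConnectA"),
]

call_index = find_call(listing, "InternetConnectA")
print(hex(get_argument(listing, call_index, 2)))
```

Real binaries are messier (arguments set up with mov, values held in registers, interleaved instructions), which is exactly the kind of detail a framework is supposed to hide behind a single call.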
The same calls can be repeated to find out which file is retrieved from this server:
    disass.go_to_next_call('HttpOpenRequestA')
    lpszVerb = disass.get_string(disass.get_arguments(2))
    lpszObjectName = disass.get_string(disass.get_arguments(3))
    lpszVersion = disass.get_string(disass.get_arguments(4))
    print " Request\t: %s %s %s" % (lpszVerb,lpszObjectName,lpszVersion)
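An argument recovered this way is only a virtual address. Turning it into a string conceptually takes two steps: map the address to an offset in the file, then read bytes up to the terminating NUL. The sketch below illustrates this with hypothetical helpers (va_to_offset, read_c_string) and a made-up image base, section layout and addresses; we are not reproducing Disass internals here.

```python
# Toy model of turning a virtual address into a string.
# Image base, section layout and addresses are made-up illustration values.

def va_to_offset(va, image_base, sections):
    """Map a virtual address to a file offset, or None if it is unmapped.

    `sections` is a list of (virtual_address, raw_offset, raw_size) tuples,
    with virtual_address relative to `image_base`, as in a PE section table.
    """
    rva = va - image_base
    for virtual_address, raw_offset, raw_size in sections:
        if virtual_address <= rva < virtual_address + raw_size:
            return raw_offset + (rva - virtual_address)
    return None


def read_c_string(data, offset, max_length=260):
    """Read a NUL-terminated ASCII string starting at `offset`."""
    end = data.find(b"\x00", offset, offset + max_length)
    if end == -1:
        # No terminator in range: cut at max_length or end of data.
        end = min(len(data), offset + max_length)
    return data[offset:end].decode("ascii", errors="replace")


image_base = 0x00400000
sections = [(0x1000, 0x400, 0x200), (0x3000, 0x600, 0x200)]  # code, data

# Build a fake file with two strings placed in the data section.
image = bytearray(0x800)
for offset, value in ((0x610, b"GET\x00"), (0x620, b"/new/iistart.html\x00")):
    image[offset:offset + len(value)] = value
data = bytes(image)

for va in (0x00403010, 0x00403020):
    print(read_c_string(data, va_to_offset(va, image_base, sections)))
```

In a real PE file the section table comes from the headers, which is one more reason to pair the disassembler with a PE parser.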
THIRD STAGE: Disass power in use
Now, we can easily run this module:
    $ python sample/apt1_letusgo_parser.py sample/letusgo/malware.exe
    CC : 184.108.40.206
    Request : GET /new/iistart.html HTTP/1.1
Now imagine that you have collected multiple samples of another RAT called "TABMSGSQL" and you want to analyze them in a batch. Once your Disass script is written, it's pretty straightforward:
    $ python sample/apt1_tabmsgsql_parser.py sample/TABMSGSQL_samples/*
    CC : http://cas.m-e.org.ru/main/1.asp
    CC : http://admin.datastorage01.org/images/1.asp
    CC : http://220.127.116.11/safe/1.asp
    CC : http://www.dsds.co.kr/bbs/db/1.asp
    CC : http://cas.ibooks.tk/bbs/db/1.asp
    CC : http://media.finanstalk.ru/images/db/1.asp
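A batch run like this one boils down to a loop over the sample paths. Here is a minimal sketch of such a driver, assuming a per-sample extraction routine is passed in as a callable. The names run_batch, report and fake_extract are ours, and fake_extract is a stand-in that does not touch Disass or any real file.

```python
# Sketch of a batch driver. `extract` stands in for whatever Disass-based
# routine pulls a value out of one sample; it is a hypothetical callable.
import sys


def run_batch(paths, extract):
    """Apply `extract(path)` to every path and collect (path, value, error)."""
    results = []
    for path in paths:
        try:
            results.append((path, extract(path), None))
        except Exception as error:  # one broken sample should not stop the batch
            results.append((path, None, str(error)))
    return results


def report(results, out=sys.stdout):
    """Print one line per sample: the extracted value or the failure reason."""
    for path, value, error in results:
        if error is None:
            out.write(" CC\t: %s\n" % value)
        else:
            out.write(" %s\t: failed (%s)\n" % (path, error))


def fake_extract(path):
    """Stand-in extractor for the demo; returns a placeholder URL."""
    if path.endswith(".txt"):
        raise ValueError("not a PE file")
    return "http://example.invalid/1.asp"


results = run_batch(["sample_a.exe", "notes.txt"], fake_extract)
report(results)
```

Catching errors per sample matters in practice: a script tuned on one variant will sooner or later meet a packed or truncated file, and you still want the results for the rest of the folder.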
On other malware, further interesting information can be recovered, such as encryption keys or values hard-coded in the binary (mutex, pipe name, etc.). This kind of information is very valuable for enriching your TTPs (Tactics, Techniques, and Procedures).
To help people get their hands on Disass, it comes with some ready-to-use examples that work on several malware samples we discovered during forensic investigations of APT attacks.
Don't forget, this tool is an advanced PoC. At the moment it only supports PE32 binaries.
Some next steps are already identified:
- PE64 support
- Identify C++ stubs
- Identify cryptographic algorithms
If you want to contribute to the Disass framework, it's available on our Bitbucket page. If you have any ideas, feel free to contact us or submit them to the project tracker.