Malware analysis is the process of identifying what a piece of malware does, what it targets, and what its main goals are. It is a complex process: forensics, reverse engineering, disassembly, and debugging all take a lot of time. The goal of malware analysis is to understand how a piece of malware works, so that we can protect our organization by preventing malware attacks.
There are two methodologies commonly used by malware analysts: static analysis (or code analysis) and dynamic analysis (or behavior analysis). Together, these two techniques allow analysts to understand, quickly and in detail, the risks and intentions of a given malware sample.
To perform static analysis, you need a strong understanding of programming and x86 assembly language concepts. During static analysis, you don't have to execute the malware. Generally, the source code of malware samples is not readily available, so you have to disassemble or decompile the sample first; only then can you analyze the low-level assembly code. Most malware analysts perform static analysis at an early stage of the malware analysis process because it is safer than dynamic analysis. The challenge in static analysis is the complexity of modern malware, some of which implements anti-debugging mechanisms to prevent analysts from examining the code.
Dynamic analysis (behavior analysis) executes the malware and observes its activity, along with the changes that occur on the system while it runs. Infecting a system with malware from the wild can be very dangerous: it can cause damage such as file deletion, registry changes, file modification, and theft of confidential data. When performing dynamic analysis, you need a safe environment whose network is not connected to production networks. With dynamic analysis, you can monitor the changes made to the filesystem, registry, and processes, as well as the malware's network communication. The advantage of dynamic analysis is that you see how the malware actually behaves when it runs.
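To make the idea of monitoring filesystem changes concrete, here is a minimal sketch in Python 3. It is not how Cuckoo itself works; it only illustrates the before-and-after comparison that a behavior analysis relies on. The function names snapshot_tree and diff_snapshots are hypothetical, chosen for this illustration.

```python
import hashlib
import os


def snapshot_tree(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(full, root)] = digest
    return state


def diff_snapshots(before, after):
    """Return sorted lists of created, deleted, and modified relative paths."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(
        path for path in set(before) & set(after) if before[path] != after[path]
    )
    return created, deleted, modified
```

You would take one snapshot of a directory before running a sample and one afterwards, then pass both to diff_snapshots. Reading every file into memory is acceptable for a small test directory; a real monitor would hook the operating system instead of rescanning.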
To handle the large number of malware samples, automated malware analysis techniques have been developed. Automating some aspects of malware analysis is critical for organizations that process large numbers of malicious programs. Automation allows analysts to focus on the tasks that most need human attention.
Using Cuckoo as an automated malware analysis tool is expected to reduce the time spent analyzing malware compared with the conventional approach. Some steps in dynamic malware analysis take a lot of time; one example is setting up a virtualized environment for the malware to run in. The process may seem easy, but if we have several samples to analyze, it becomes quite time-consuming.
As malware became more sophisticated, we needed technology that would allow us to analyze it easily without compromising our own systems. One such technology is sandboxing. The term is used in many different ways among IT people; for reference, see the explanation on Wikipedia at http://en.wikipedia.org/wiki/Sandbox_(computer_security). In computer security, sandboxing is a technique for isolating a program (in this case, malware) in a confined execution environment, so that untrusted programs can run separately from the main environment. To picture it, imagine a sandbox or sandpit in a children's playground. A sandpit is a container filled with sand for children to play in. The "pit" or "box" is simply a container that keeps the sand from spreading across lawns or other surrounding surfaces. The children can do anything they like as long as they stay inside the sandbox. In the same way, a sandbox lets us execute malicious applications and observe what the malware does.
We can also analyze the malware safely and securely without worrying about the changes that occur during the process. There are several malware sandboxes you can use to build your own automated malware analysis lab, for example, Buster Sandbox Analyzer, Zero Wine, Malheur, and Cuckoo Sandbox. Cuckoo is the right tool for analyzing sandboxed malware because it has a comprehensive feature set, is fully open source, and has good support from its community.
What is a malware analysis lab, and why should we build one? A malware lab is a safe environment for analyzing malware. Basically, it is an isolated environment containing useful tools that help malware analysts examine malicious software. We should build a malware lab to be more proactive against new and modern threats that can suddenly attack our organization. It also serves as a form of advanced detection, before antivirus vendors have identified a new malware specimen. The scope of the lab can be determined by examining the processes involved in malware analysis.
Static analysis involves disassembling and reverse engineering the malware's code, which is analyzed without being executed. No complex lab configuration is required, because you won't actually execute the malware; the lab is there as a safeguard in case you accidentally execute the malware binary while performing code analysis. For dynamic analysis, you need to set up a more complex lab, because you need to execute the malware. Malware behaves differently depending on the operating system environment in which it is executed.
Pay close attention to where the malware analysis hosts sit on your network. Trojans, worms, and other types of malware can be self-replicating, so simply running an executable on a production network may well lead to other machines on the same network being infected.
Setting up a malware analysis lab is actually quite simple and requires a minimal amount of hardware. Isolating your malware analysis lab from other computers on the network is not enough, however. If you are not sure, isolate your lab from the Internet as well. Weigh this option carefully, though, because some malware needs to communicate with its author's servers, for example, botnet command and control servers.
There are two options for building a malware analysis lab: a physical environment and a virtualized environment. Both have advantages and disadvantages. Building a physical lab requires a lot of money and time, whereas building a malware lab with virtualization saves both. Virtualization software allows you to save the state of a virtual machine as it runs so that you can revert to it when necessary. This saved state is usually called a snapshot. Using snapshots, you can prepare a virtual machine containing an operating system with a full arsenal of dynamic and static analysis tools, perform a dynamic analysis of the malware, and then save the session as a snapshot so that you can reload the initial infected state at will. After finishing your analysis, you can choose to keep or discard that snapshot and revert to a clean image. With the snapshot feature, you do not have to worry about malware infecting your Guest OS, because you can easily restore the previous state.
By now, you can see that automated malware analysis built on operating system virtualization helps shorten the time needed to analyze malware samples. Virtualization technologies have become a key component of automated malware analysis because of their cost effectiveness in hardware and CPU resource utilization. By taking a popular operating system and intentionally infecting it with a captured malware sample, you can monitor the malware's activities and identify the suspicious ones. The drawback of automated malware analysis is that the analysis environment can be detected by malware, which frequently uses evasion techniques such as anti-debugging, packers, encryption, and code obfuscation. You can, however, try to hide as many virtualization traces as possible. There is a lot of information on the Internet about virtualization detection techniques and the corresponding countermeasures.
As described on its official website (http://www.cuckoosandbox.org/), Cuckoo is a malware sandboxing utility that puts the dynamic analysis approach into practice. Instead of being analyzed statically, the binary file is executed and monitored in real time. Put simply, Cuckoo is an open source automated malware analysis system that allows you to perform analysis on sandboxed malware. Cuckoo Sandbox started as a Google Summer of Code project in 2010 within the Honeynet Project. After the initial work during the summer of 2010, the first beta release was published on February 5th, 2011, when Cuckoo was publicly announced and distributed for the first time.
Cuckoo was originally designed and developed by Claudio "nex" Guarnieri, who is still the main developer and coordinates the efforts of the other developers and contributors. In March 2012, Cuckoo Sandbox won the first round of the Magnificent7 program organized by Rapid7, which chose Cuckoo for sponsorship because of the developers' innovative approach to traditional and mobile-based malware analysis. Cuckoo is used to automatically run and analyze files and to collect comprehensive analysis results that outline what the malware does while running inside an isolated Windows operating system. Cuckoo is designed for use in analyzing the following kinds of files:
Cuckoo can also produce the following types of results:
Traces of win32 API calls performed by all processes spawned by the malware
Memory dumps of the malware processes
Network traffic trace in PCAP format
Screenshots of the Windows desktop taken during the execution of the malware
Full memory dumps of the machines
Each analysis is launched in a fresh and isolated virtual machine. Cuckoo's infrastructure is composed of a host machine (running the management software) and a number of guest machines (virtual machines for analysis).
The host runs the core component of the sandbox that manages the whole analysis process, whereas the guests are the isolated environments where the malware is actually executed and analyzed safely. The following diagram shows Cuckoo's architecture:
There are no specific hardware requirements. The minimum is 2 GB of RAM (for virtualization) and about 40 GB of free hard disk space. In this book, I will use the following hardware specifications for the Host OS:
Quad Core CPU
4 GB RAM
320 GB HDD
$ sudo apt-get install python
Cuckoo needs SQLAlchemy as its database toolkit for Python, so you need to install SQLAlchemy with the following command line:
$ sudo apt-get install python-sqlalchemy
$ sudo pip install sqlalchemy
There are other optional dependencies that are mostly used by modules and utilities. The following libraries are not strictly required, but you should have the libraries to guarantee Cuckoo Sandbox runs smoothly in your environment:
$ sudo apt-get install python-dpkt python-jinja2 python-magic python-pymongo python-libvirt python-bottle python-pefile ssdeep
Or you can install most of these packages using the pip package manager:
$ sudo pip install dpkt jinja2 pymongo bottle pefile
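After installing the packages, it can be useful to confirm that Python can actually find them. The following is a minimal sketch, assuming Python 3 and assuming that the import names in OPTIONAL_MODULES correspond to the packages named in the commands above; both the list and the function name missing_modules are illustrative, not part of Cuckoo.

```python
import importlib.util

# Import names assumed to match the packages installed above.
OPTIONAL_MODULES = [
    "sqlalchemy", "dpkt", "jinja2", "magic",
    "pymongo", "libvirt", "bottle", "pefile",
]


def missing_modules(names):
    """Return the top-level module names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    absent = missing_modules(OPTIONAL_MODULES)
    if absent:
        print("Not importable:", ", ".join(absent))
    else:
        print("All listed modules were found.")
```

The check only looks the modules up; it does not import them, so it will not catch a package that is present but broken.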
You have to install Pydeep to compute ssdeep fuzzy hashes of samples; but before installing Pydeep, we need to install some dependencies with the following command line:
$ sudo apt-get install build-essential git libpcre3 libpcre3-dev libpcre++-dev
Next, you have to clone pydeep from its git source (the commands below put pydeep under /opt):
$ cd /opt
$ git clone https://github.com/kbandla/pydeep.git pydeep
$ cd /opt/pydeep/
$ python setup.py build
$ sudo python setup.py install
You will also need to install yara to categorize malware samples (the commands below put yara under /opt):
$ sudo apt-get install automake -y
$ cd /opt
$ svn checkout http://yara-project.googlecode.com/svn/trunk/yara
$ cd /opt/yara
$ sudo ln -s /usr/bin/aclocal-1.11 /usr/bin/aclocal-1.12
$ ./configure
$ make
$ sudo make install
$ cd yara-python
$ python setup.py build
$ sudo python setup.py install
You need to install tcpdump in order to dump the network traffic that occurs during analysis:
$ sudo apt-get install tcpdump
Running tcpdump requires root privileges; but since you don't want Cuckoo to run as root, you'll have to set specific Linux capabilities on the binary, as shown in the following command line:
$ sudo setcap cap_net_raw,cap_net_admin=eip /usr/sbin/tcpdump
You can verify the results of the last command with:
$ getcap /usr/sbin/tcpdump
/usr/sbin/tcpdump = cap_net_admin,cap_net_raw+eip
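If you want to check the getcap result from a script rather than by eye, a small parser is enough. This is a sketch in Python 3 that assumes the output format shown in this section (path, an equals sign, a comma-separated capability list, a plus sign, then the flags); other versions of the tool may print a different layout. The function names are hypothetical.

```python
def parse_getcap(line):
    """Split one line of getcap output into (path, capability names, flags)."""
    path, _, spec = line.partition("=")
    caps, _, flags = spec.strip().partition("+")
    names = {cap.strip() for cap in caps.split(",") if cap.strip()}
    return path.strip(), names, flags.strip()


def has_capture_caps(line):
    """True if both capture capabilities are present with the e, i, and p flags."""
    _path, names, flags = parse_getcap(line)
    wanted = {"cap_net_raw", "cap_net_admin"}
    return wanted <= names and set("eip") <= set(flags)
```

Passing the output line from this section to has_capture_caps should report that tcpdump has what it needs; a line that lacks either capability, or an empty line, is rejected.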
If you don't have setcap installed, install the package that provides it:
$ sudo apt-get install libcap2-bin
Otherwise (not recommended) run the following command line:
$ sudo chmod +s /usr/sbin/tcpdump
The chmod +s command sets the SUID and SGID bits, so the file (in this case, tcpdump) runs with the user ID and group ID of its owner. If you set the SUID bit on tcpdump, other users can run it and they will hold root privileges for as long as the tcpdump process is executing. That is why this step is not recommended.
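To see whether a binary already carries the SUID bit, you can inspect its mode bits. The following Python 3 sketch uses only the standard library; the function name has_suid is hypothetical, and the path in the usage note is the tcpdump location used in this section.

```python
import os
import stat


def has_suid(path):
    """Return True if the set-user-ID bit is set on the file at path."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


def has_sgid(path):
    """Return True if the set-group-ID bit is set on the file at path."""
    return bool(os.stat(path).st_mode & stat.S_ISGID)
```

For example, has_suid("/usr/sbin/tcpdump") tells you whether the not-recommended chmod +s step was applied on a given host, which is worth checking before you rely on the capabilities approach instead.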
First, download Cuckoo from its website at http://www.cuckoosandbox.org/download.html.
There are two ways to set Cuckoo up in your Host OS. You can either download the tarball file or clone the source. If you downloaded the tarball, extract it with:
$ tar -zxvf cuckoo-current.tar.gz
Before configuring Cuckoo in your Host OS, you need to set up the Guest OS, because the Guest OS name is written into Cuckoo's configuration files. In this book, we will use the 64-bit build of VirtualBox Version 4.2.12. You can download VirtualBox from https://www.virtualbox.org/wiki/Downloads.
We will use VirtualBox 4.2.12 for Linux hosts. If you can't find Version 4.2.12, you can use a newer version; if you want Version 4.2.12 specifically, go to https://www.virtualbox.org/wiki/Download_Old_Builds_4_2. VirtualBox is packaged for several Linux distributions. We will download the build for Ubuntu 12.04 LTS ("Precise Pangolin") AMD64, which is the 64-bit version; if you are running a 32-bit system, download the i386 build instead.
Before setting up your Guest OS in VirtualBox, you need to pay attention to the VirtualBox kernel driver, vboxdrv, which must be set up before you create your Guest OS. To set up vboxdrv, you need to install the kernel headers for your Linux kernel, because they are required to compile vboxdrv. To check your kernel version, use this command:
$ uname -a
You will see an output like this:
Linux digit-labs 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
It means you are using kernel Version 3.5.0-17, and you need to install the matching kernel headers using this command:
$ sudo apt-get install linux-headers-3.5.0-17-generic
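Because the headers package name must match the running kernel exactly, it is easy to mistype. As a minimal sketch in Python 3, the helper below derives the package name from the kernel release string (the same value that uname -r prints). The function name headers_package is hypothetical, and the "linux-headers-" prefix follows the package naming used in the command above.

```python
import platform


def headers_package(release=None):
    """Build the kernel headers package name for a kernel release string.

    When no release is given, the release of the running kernel is used.
    """
    release = release or platform.release()
    return "linux-headers-" + release


if __name__ == "__main__":
    print("sudo apt-get install " + headers_package())
```

Running the script on the Host OS prints the install command for that machine's kernel, which you can then review and run yourself.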
After you have finished installing the Linux headers, you can set up vboxdrv with the following command line:
$ sudo /etc/init.d/vboxdrv setup
 * Stopping VirtualBox kernel modules [OK]
 * Recompiling VirtualBox kernel modules [OK]
 * Starting VirtualBox kernel modules [OK]
1 GB of RAM
10 GB of hard disk space
VDI format for the virtual disk
Dynamically allocated storage
Windows XP SP3
When you create the Guest OS, choose its name carefully, because you will enter this name in Cuckoo Sandbox's VirtualBox configuration file.
In the first step, we will create the Guest OS. Enter your Guest OS name and operating system type. Since we are using Windows XP as the Guest OS, choose Windows XP as the OS type and version.
Before you start your Guest OS in VirtualBox, you need to configure the network and the shared folder, and install VirtualBox Guest Additions to improve the Guest OS's capabilities during malware analysis.
VirtualBox offers several types of network configuration for the Guest OS. Each type has different capabilities to suit different needs; you can learn more about them on the VirtualBox website:
Cuckoo is written in Python, so you will need to install Python and other libraries as dependencies. Here is a website from which you can download the malware samples used in this book:
You can download malware samples from the website, along with some useful tools. If you want additional information about this book, you can visit the same website and leave your comments there.
Based on the explanation on the website, we should use the Host-only networking type, because it isolates our Guest OS from the outside network. With this networking type, the Host OS and Guest OS can interact with each other, but the Guest OS cannot "see" the outside network or the Internet.
In the VirtualBox main window, click on the File button and select Preferences...:
Click on the last icon on the side pane that says Edit Host-only Network to view your network configuration. If the DHCP server is not enabled, you need to manually configure your Guest OS IP Address but I suggest you leave it as it is:
Go to the Adapter 1 tab and tick the option Enable Network Adapter. In the Attached to drop-down menu, you have to choose Host-only Adapter and in the Name drop-down menu choose vboxnet0 (network adapter name is based on what you have created).
After finishing your configuration for the Guest OS, you can start your Guest OS and begin the installation process.
I assume that you have finished installing your Guest OS and have logged in to it. If the DHCP server is not enabled in the host-only network configuration, you will need to configure the Guest OS network manually: give the Guest OS an IP address in the same network segment as the Host OS. If you leave the host-only configuration as it is, the Host OS and Guest OS IP addresses will be 192.168.56.1 and 192.168.56.101, respectively.
Ping each machine from the other to make sure that the Host OS and Guest OS are connected.
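If the ping fails, a common cause is that the two addresses are not in the same network segment. The following Python 3 sketch checks that with the standard ipaddress module. The function name same_segment is hypothetical, and the default prefix length of 24 is an assumption based on the 192.168.56.0/24 host-only network used in this chapter.

```python
import ipaddress


def same_segment(host_ip, guest_ip, prefix_len=24):
    """Return True if guest_ip lies in the network that host_ip belongs to."""
    network = ipaddress.ip_interface(f"{host_ip}/{prefix_len}").network
    return ipaddress.ip_address(guest_ip) in network


if __name__ == "__main__":
    # Default host-only addresses from this chapter.
    print(same_segment("192.168.56.1", "192.168.56.101"))
```

A False result means the Guest OS address needs to be corrected before Cuckoo on the host can reach the agent in the guest.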
Then click on the green icon at the top-right corner of your window that says Add Shared Folder (Ins):
Choose the folder (in your Host OS) that you want to share with your Guest OS in the Folder Path (for example, /home/username/Downloads, or we can make our own folder somewhere else).
Give the shared folder a name (by default your computer will give a shared folder name, you can change the folder name as you wish), and tick the sharing options according to your choice:
Select the drive you want from the drop-down menu.
In the Folder text field, fill in the path to the shared folder (shares is the shared folder name in the previous screenshot).
Go to Computer or Windows Explorer, and you will see the shared folder.
Now, to configure your Guest OS you have to:
Install Python for Windows. You can download the software at http://python.org/download/.
Install the PIL (Python Imaging Library) Python module to create desktop screenshots. This software is available at http://www.pythonware.com/products/pil/.
Turn off automatic Windows updates.
Turn off Windows firewall.
Install third-party applications (Microsoft Office 2003/2007, Acrobat Reader 9.5, Mozilla Firefox 3.6, and so on) at http://www.oldapps.com/. This step is optional.
Next, copy the Python agent to our Windows shared folder using this command line on the Host OS:
$ cp /home/digit/cuckoo/agent/agent.py /home/digit/cuckoo/shares/
From your Windows Guest OS, copy the agent.py file out of the shared folder and rename it to agent.pyw. PYW files run the script without opening a console window, which is useful if your program is GUI based. If you double-click the agent.py file, a command prompt window appears on your desktop. If you rename the file to a .pyw file, no window pops up. It is similar to a background process in Linux.
To always run the agent.pyw file at startup, put it in the Startup folder at one of the following paths:
For Windows XP go to C:\Documents and Settings\username\Start Menu\Programs\Startup.
For Windows 7 go to
As you can see in the screenshot below:
You also need to configure IP forwarding and filtering rules on the Host OS using iptables:
$ iptables -A FORWARD -o eth0 -i vboxnet0 -s 192.168.56.0/24 -m conntrack --ctstate NEW -j ACCEPT
$ iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
$ iptables -A POSTROUTING -t nat -j MASQUERADE
$ sysctl -w net.ipv4.ip_forward=1
The next step is the configuration of Cuckoo Sandbox.
You can either run Cuckoo as your own user or create a new user dedicated to your sandbox setup. We recommend creating a specific user for your Cuckoo Sandbox environment. Make sure that the user that runs Cuckoo is the same user that creates and runs the virtual machines; otherwise Cuckoo will not be able to identify and launch them. To create the user, run the following command line in a terminal:
$ sudo adduser cuckoo
If you're using VirtualBox, make sure the new user belongs to the vboxusers group (or the group you used to run VirtualBox). The -a flag appends the group instead of replacing the user's existing supplementary groups:
$ sudo usermod -a -G vboxusers cuckoo
If you're using KVM or any other libvirt-based module, make sure the new user belongs to the libvirtd group (or the group your Linux distribution uses to run libvirt):
$ sudo usermod -a -G libvirtd cuckoo
Now it's time for the best part, let's install and configure Cuckoo Sandbox.
Extract or checkout your copy of Cuckoo to a path of your choice and you're ready to go. For example, we can put it in the
cuckoo.conf: This configuration file contains information about the general behavior and analysis options in Cuckoo Sandbox.
<machinemanager>.conf: This file holds the information about your virtual machine configuration. (Depends on the name of virtualization that we used.)
processing.conf: This file is used for enabling and configuring the processing modules.
reporting.conf: This file contains information about reporting methodologies.
.conf files are described in detail in the following sections.
This file contains the basic, general configuration of Cuckoo. For example, you can ask Cuckoo to check for the newest version when it starts. If you use this feature, Cuckoo will download the newest version, and you can keep or delete the old one. This behavior is defined by version_check in the cuckoo.conf file. You also declare your virtualization method in cuckoo.conf. For example, if you are using VirtualBox, write machine_manager = virtualbox; if you are using VMware, change this value accordingly.
You can also set the Host OS IP address and port number that will be used by Cuckoo Sandbox. By default, the IP address is 192.168.56.1 (because we are using the host-only networking method), and the default port is 2042. Don't forget to define your networking interface; for Cuckoo we have defined vboxnet0 (see the discussion of the VirtualBox configuration in the Configure the network section).
Machine managers are the modules that define how Cuckoo interacts with your virtualization software, which you name in cuckoo.conf. If you use VirtualBox, <machinemanager>.conf refers to virtualbox.conf. If you use VMware, <machinemanager>.conf refers to the corresponding VMware configuration file.
In this book we use VirtualBox, so you only need to pay attention to the virtualbox.conf file. You can edit this file to suit your needs. For example, if you want to run VirtualBox with its GUI, set the mode to gui. If you are comfortable using VirtualBox from the command line, write mode = headless in the same file.
Remember that during the Guest OS installation I mentioned that you need to pay attention when naming the Guest OS, because you will enter that name in this configuration. In the [cuckoo1] section, you specify the Guest OS name. If you named your Guest OS cuckoo1, set label = cuckoo1; the label must match the Guest OS name you created.
Since we are using Windows XP as the Guest OS, you have to define the platform as platform = windows.
Don't forget to enter the Guest OS IP address. Because we are using host-only networking, the first Guest OS is given the IP address 192.168.56.101 by default.
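The settings discussed above can be sanity-checked before starting Cuckoo. The following Python 3 sketch parses an illustrative configuration with the standard configparser module. The SAMPLE text is an assumption assembled from the options named in this section (mode, the [cuckoo1] section, label, platform); the [virtualbox] section name and the machines and ip key names are assumptions as well, so compare them with the virtualbox.conf shipped with your copy of Cuckoo.

```python
import configparser
import ipaddress

# Illustrative content only; section and key names are assumptions.
SAMPLE = """
[virtualbox]
mode = headless
machines = cuckoo1

[cuckoo1]
label = cuckoo1
platform = windows
ip = 192.168.56.101
"""


def load_machine(conf_text, section):
    """Return (mode, machine settings) after validating the guest IP address."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    machine = dict(parser[section])
    # Raises ValueError if the address is malformed.
    ipaddress.ip_address(machine["ip"])
    return parser["virtualbox"]["mode"], machine
```

Calling load_machine(SAMPLE, "cuckoo1") returns the mode together with the label, platform, and IP address, which makes a mistyped section name or address fail early instead of during an analysis.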
This configuration file allows you to enable, disable, and configure all the processing modules.
Basically, you do not need to make any changes to the default configuration in this file, but you can add your own VirusTotal API key to it. If you don't have a VirusTotal account yet and want one, create an account on VirusTotal's website at https://www.virustotal.com/en/, and put the key in this line:
# Add your VirusTotal API key here. The default API key, kindly
# provided by the VirusTotal team, should enable you with a
# sufficient throughput and while being shared with all our users,
# it should not affect your use.
key = a0283a2c3d55728300d064874239b5346fb991317e8449fe43c902879d758088
The conf/reporting.conf file controls automated report generation: it lists the kinds of reports that can be produced after an analysis completes, and you can enable or disable each reporting method.
After you finish configuring your Cuckoo Sandbox environment, you can test your first malware analysis process.
The virtual machine is now ready to test malware, but first you need to create a snapshot using this command:
$ vboxmanage snapshot "WIndows-cuckoo" take "WIndows-cuckooSnap01" --pause
The following commands are used to restore the snapshot:
$ vboxmanage controlvm "WIndows-cuckoo" poweroff
$ vboxmanage snapshot "WIndows-cuckoo" restorecurrent
$ vboxheadless --startvm "WIndows-cuckoo"
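Typing the virtual machine name several times invites mistakes such as a stray space inside the quotes. As a minimal sketch in Python 3, the helper below builds the same snapshot and restore commands from one name. The function names are hypothetical; the subcommands and options are the ones used in this section, and nothing is executed here.

```python
import shlex


def snapshot_commands(vm_name, snapshot_name):
    """Return (take, restore): argument lists for taking and restoring a snapshot."""
    take = ["vboxmanage", "snapshot", vm_name, "take", snapshot_name, "--pause"]
    restore = [
        ["vboxmanage", "controlvm", vm_name, "poweroff"],
        ["vboxmanage", "snapshot", vm_name, "restorecurrent"],
        ["vboxheadless", "--startvm", vm_name],
    ]
    return take, restore


def render(argv):
    """Join an argument list into a shell-safe command line for display."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    take, restore = snapshot_commands("WIndows-cuckoo", "WIndows-cuckooSnap01")
    for argv in [take] + restore:
        print("$ " + render(argv))
```

Keeping the commands as argument lists means they could later be handed to subprocess.run without any quoting problems, although that step is left out of this sketch.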
The snapshot of the Guest OS is the most important part for the process of analyzing malware using Cuckoo Sandbox. Make sure everything is set and ready to analyze malware and carry out the following steps to perform the analysis:
To start your Cuckoo Sandbox, you need to run:
The output from your terminal will be something like the following screenshot:
Cuckoo is now running and waiting for analysis tasks. You can submit malware samples or malicious URLs. Change the directory to /cuckoo/utils/ and then use the submit.py file to perform a malware analysis:
You have now prepared the Host OS and the Guest OS in VirtualBox, and installed Cuckoo Sandbox. It is important to make sure that all the dependencies needed on the Host OS, along with yara, are present. On the Guest OS, always turn off defensive features such as Windows firewall, and install software that malware often interacts with, for example, Adobe Reader 9.5, Internet Explorer 6, and Microsoft Office 2003.
Always make sure the settings in <machinemanager>.conf match the virtualization software you are using. For example, if you are using KVM, the machine manager configuration must be the one for KVM; since we are using VirtualBox, we set virtualbox in the configuration. Be careful when entering the VirtualBox Guest OS name in the configuration file. For example, if you create a Guest OS named cuckoo1, you have to write cuckoo1 as the label in the corresponding section of virtualbox.conf. Most important of all, don't forget to make a backup of the whole system and its configuration.
In the next chapter, we will continue learning about Cuckoo Sandbox's features, such as analyzing PDF files, URLs, and binary files, memory forensics using Cuckoo Sandbox (with the memory dump feature), and additional memory forensics using Volatility.