The number of cyber attacks is undoubtedly on the rise, targeting government, military, public and private sectors. These cyber attacks focus on targeting individuals or organizations with an effort to extract valuable information. Sometimes, these cyber attacks are allegedly linked to cybercrime or state-sponsored groups, but may also be carried out by individual groups to achieve their goals. Most of these cyber attacks use malicious software (also called malware) to infect their targets. Knowledge, skills, and tools required to analyze malicious software are essential to detect, investigate and defend against such attacks.
In this chapter, you will learn the following topics:
- What malware means and its role in the cyber-attacks
- Malware analysis and its significance in digital forensics
- Different types of malware analysis
- Setting up the lab environment
- Various sources to obtain malware samples
Malware is a code that performs malicious actions; it can take the form of an executable, script, code, or any other software. Attackers use malware to steal sensitive information, spy on the infected system, or take control of the system. It typically gets into your system without your consent and can be delivered via various communication channels such as email, web, or USB drives.
The following are some of the malicious actions performed by malware:
- Disrupting computer operations
- Stealing sensitive information, including personal, business, and financial data
- Unauthorized access to the victim's system
- Spying on the victims
- Sending spam emails
- Engaging in distributed denial-of-service (DDoS) attacks
- Locking up the files on the computer and holding them for ransom
Malware is a broad term that refers to different types of malicious programs such as trojans, viruses, worms, and rootkits. While performing malware analysis, you will often come across various types of malicious programs; some of these malicious programs are categorized based on their functionality and attack vectors as mentioned here:
- Virus or Worm: Malware that is capable of copying itself and spreading to other computers. A virus needs user intervention, whereas a worm can spread without user intervention.
- Trojan: Malware that disguises itself as a regular program to trick users into installing it on their systems. Once installed, it can perform malicious actions such as stealing sensitive data, uploading files to the attacker's server, or monitoring webcams.
- Backdoor / Remote Access Trojan (RAT): This is a type of Trojan that enables the attacker to gain access to and execute commands on the compromised system.
- Adware: Malware that presents unwanted advertisements (ads) to the user. They usually get delivered via free downloads and can forcibly install software on your system.
- Botnet: This is a group of computers infected with the same malware (called bots), waiting to receive instructions from the command-and-control server controlled by the attacker. The attacker can then issue a command to these bots, which can perform malicious activities such as DDoS attacks or sending spam emails.
- Information stealer: Malware designed to steal sensitive data such as banking credentials or typed keystrokes from the infected system. Some examples of these malicious programs include key loggers, spyware, sniffers, and form grabbers.
- Ransomware: Malware that holds the system for ransom by locking users out of their computer or by encrypting their files.
- Rootkit: Malware that provides the attacker with privileged access to the infected system and conceals its presence or the presence of other software.
- Downloader or dropper: Malware designed to download or install additional malware components.
A handy resource for understanding malware terminologies and definitions is available at https://blog.malwarebytes.com/glossary/.
Classifying malware based on their functionalities may not always be possible because a single malware can contain multiple functionalities, which may fall into a variety of categories mentioned previously. For example, malware can include a worm component that scans the network looking for vulnerable systems and can drop another malware component such as a backdoor or a ransomware upon successful exploitation.
Malware classification can also be undertaken based on the attacker's motive. For example, if the malware is used to steal personal, business, or proprietary information for profit, then the malware can be classified as crimeware or commodity malware. If the malware is used to target a particular organization or industry to steal information/gather intelligence for espionage, then it can be classified as targeted or espionage malware.
Malware analysis is the study of malware's behavior. The objective of malware analysis is to understand the working of malware and how to detect and eliminate it. It involves analyzing the suspect binary in a safe environment to identify its characteristics and functionalities so that better defenses can be built to protect an organization's network.
The primary motive behind performing malware analysis is to extract information from the malware sample, which can help in responding to a malware incident. The goal of malware analysis is to determine the capability of malware, detect it, and contain it. It also helps in determining identifiable patterns that can be used to cure and prevent future infections. The following are some of the reasons why you will perform malware analysis:
- To determine the nature and purpose of the malware. For example, it can help you determine whether malware is an information stealer, HTTP bot, spam bot, rootkit, keylogger, or RAT, and so on.
- To gain an understanding of how the system was compromised and its impact.
- To identify the network indicators associated with the malware, which can then be used to detect similar infections using network monitoring. For example, during your analysis, if you determine that a malware contacts a particular domain/IP address, then you can use this domain/IP address to create a signature and monitor the network traffic to identify all the hosts contacting that domain/IP address.
- To extract host-based indicators such as filenames and registry keys, which, in turn, can be used to detect similar infections using host-based monitoring. For instance, if you learn that a malware creates a registry key, you can use this registry key as an indicator to create a signature, or scan your network to identify the hosts that have the same registry key.
- To determine the attacker's intention and motive. For instance, during your analysis, if you find that the malware is stealing banking credentials, then you can deduce that the motive of the attacker is monetary gain.
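As a rough illustration of how the network and host-based indicators described above might be put to use, the following Python sketch checks a list of observations against a small set of indicators. The indicator values, the tuple layout, and the function name are hypothetical placeholders chosen for this example; they are not taken from any real sample or tool.

```python
# Hypothetical indicators, standing in for values extracted during analysis.
INDICATORS = {
    "domain": {"malicious.example.com"},
    "ip": {"203.0.113.10"},
    "registry_key": {r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\updater"},
}


def find_matches(observations, indicators=INDICATORS):
    """Return the observations whose (kind, value) pair matches an indicator.

    Each observation is a (host, kind, value) tuple, for example
    ("pc-01", "domain", "malicious.example.com").
    """
    matches = []
    for host, kind, value in observations:
        # Compare case-insensitively: domain names and registry paths
        # are not case-sensitive.
        known = {item.lower() for item in indicators.get(kind, set())}
        if value.lower() in known:
            matches.append((host, kind, value))
    return matches


observed = [
    ("pc-01", "domain", "malicious.example.com"),
    ("pc-02", "domain", "www.example.org"),
    ("pc-03", "ip", "203.0.113.10"),
]
for host, kind, value in find_matches(observed):
    print("%s: matched %s indicator %s" % (host, kind, value))
```

In practice the observations would come from network monitoring or host scans rather than a hard-coded list, and the matching would usually be done by a dedicated signature engine.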
To understand the working and the characteristics of malware and to assess its impact on the system, you will often use different analysis techniques. The following is the classification of these analysis techniques:
- Static analysis: This is the process of analyzing a binary without executing it. It is easiest to perform and allows you to extract the metadata associated with the suspect binary. Static analysis might not reveal all the required information, but it can sometimes provide interesting information that helps in determining where to focus your subsequent analysis efforts. Chapter 2, Static Analysis, covers the tools and techniques to extract useful information from the malware binary using static analysis.
- Dynamic analysis (Behavioral Analysis): This is the process of executing the suspect binary in an isolated environment and monitoring its behavior. This analysis technique is easy to perform and gives valuable insights into the activity of the binary during its execution. This analysis technique is useful but does not reveal all the functionalities of the hostile program. Chapter 3, Dynamic Analysis, covers the tools and techniques to determine the behavior of the malware using dynamic analysis.
- Code analysis: It is an advanced technique that focuses on analyzing the code to understand the inner workings of the binary. This technique reveals information that is not possible to determine just from static and dynamic analysis. Code analysis is further divided into Static code analysis and Dynamic code analysis. Static code analysis involves disassembling the suspect binary and looking at the code to understand the program's behavior, whereas Dynamic code analysis involves debugging the suspect binary in a controlled manner to understand its functionality. Code analysis requires an understanding of the programming language and operating system concepts. The upcoming chapters (Chapters 4 to 9) will cover the knowledge, tools, and techniques required to perform code analysis.
- Memory analysis (Memory forensics): This is the technique of analyzing the computer's RAM for forensic artifacts. It is typically a forensic technique, but integrating it into your malware analysis will assist in gaining an understanding of the malware's behavior after infection. Memory analysis is especially useful to determine the stealth and evasive capabilities of the malware. You will learn how to perform memory analysis in subsequent chapters (Chapters 10 and 11).
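To make the idea of static analysis a little more concrete, here is a minimal sketch, assuming Python 3, of two things that can be learned from a file without executing it: its cryptographic hashes and the printable strings embedded in it. The function names and the sample bytes are made up for illustration; this is not the tooling covered in Chapter 2.

```python
import hashlib
import re


def file_hashes(data):
    """Return MD5 and SHA-256 hex digests for the given bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def ascii_strings(data, min_length=4):
    """Extract runs of printable ASCII characters of at least min_length."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Made-up bytes standing in for the contents of a suspect file. With a real
# sample you would read the file in binary mode; reading never executes it.
sample = b"MZ\x90\x00\x03" + b"http://malicious.example.com/gate.php" + b"\x00\x01"
print(file_hashes(sample))
print(ascii_strings(sample))
```

The hashes can be used to look the sample up in public repositories, and the strings sometimes hint at where to focus subsequent analysis.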
Analysis of a hostile program requires a safe and secure lab environment, as you do not want to infect your system or the production system. A malware lab can be very simple or complex depending on the resources available to you (hardware, virtualization software, Windows license, and so on). This section will guide you to set up a simple personal lab on a single physical system consisting of virtual machines (VMs). If you wish to set up a similar lab environment, feel free to follow along or skip to the next section (Section 6: Malware Sources).
Before you begin setting up a lab, you need a few components: a physical system running a base operating system of Linux, Windows, or macOS X, and installed with virtualization software (such as VMware or VirtualBox). When analyzing the malware, you will be executing the malware on a Windows-based virtual machine (Windows VM). The advantage of using a virtual machine is that after you finish analyzing the malware, you can revert it to a clean state.
VMware Workstation for Windows and Linux is available for download from https://www.vmware.com/products/workstation/workstation-evaluation.html, and VMware Fusion for macOS X is available for download from https://www.vmware.com/products/fusion/fusion-evaluation.html. VirtualBox for different flavors of operating systems is available for download from https://www.virtualbox.org/wiki/Downloads.
To create a safe lab environment, you should take the necessary precautions to avoid malware from escaping the virtualized environment and infecting your physical (host) system. The following are a few points to remember when setting up the virtualized lab:
- Keep your virtualization software up to date. This is necessary because it might be possible for malware to exploit a vulnerability in the virtualization software, escape from the virtual environment, and infect your host system.
- Install a fresh copy of the operating system inside the virtual machine (VM), and do not keep any sensitive information in the virtual machine.
- While analyzing a malware, if you don't want the malware to reach out to the Internet, then you should consider using host-only network configuration mode or restrict your network traffic within your lab environment using simulated services.
- Do not connect any removable media that might later be used on the physical machines, such as USB drives.
- Since you will be analyzing Windows malware (typically Executable or DLL), it is recommended to choose a base operating system such as Linux or macOS X for your host machine instead of Windows. This is because, even if a Windows malware escapes from the virtual machine, it will still not be able to infect your host machine.
The lab architecture I will be using throughout the book consists of a physical machine (called host machine) running Ubuntu Linux with instances of Linux virtual machine (Ubuntu Linux VM) and Windows virtual machine (Windows VM). These virtual machines will be configured to be part of the same network and use Host-only network configuration mode so that the malware is not allowed to contact the Internet and network traffic is contained in the isolated lab environment.
The Windows VM is where the malware will be executed during analysis, and the Linux VM is used to monitor the network traffic and will be configured to simulate Internet services (DNS, HTTP, and so on) so that it provides an appropriate response when the malware requests these services. For example, the Linux VM will be configured such that when the malware requests a service such as DNS, the Linux VM will provide the proper DNS response. Chapter 3, Dynamic Analysis, covers this concept in detail.
The following figure shows an example of a simple lab architecture, which I will use in this book. In this setup, the Linux VM will be preconfigured to IP address 192.168.1.100, and the IP address of the Windows VM will be set to 192.168.1.x (where x is any host number other than 100). The default gateway and the DNS of the Windows VM will be set to the IP address of the Linux VM (that is, 192.168.1.100) so that all the Windows network traffic is routed through the Linux VM. The upcoming section will guide you to set up the Linux VM and Windows VM to match this setup.
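The addressing scheme above can be sanity-checked with a few lines of Python. This is only a sketch, assuming Python 3 and a /24 network (the netmask 255.255.255.0 used later in this chapter); the function name is hypothetical.

```python
import ipaddress

# Assumption: the lab uses 192.168.1.0 with netmask 255.255.255.0 (/24).
LAB_NETWORK = ipaddress.ip_network("192.168.1.0/24")
# The Linux VM address, which also serves as gateway and DNS for the Windows VM.
LINUX_VM_IP = ipaddress.ip_address("192.168.1.100")


def is_valid_windows_vm_ip(candidate):
    """Return True if candidate is a usable host address in the lab network
    that does not collide with the Linux VM."""
    try:
        address = ipaddress.ip_address(candidate)
    except ValueError:
        return False  # not an IP address at all
    reserved = (
        LAB_NETWORK.network_address,    # 192.168.1.0
        LAB_NETWORK.broadcast_address,  # 192.168.1.255
        LINUX_VM_IP,
    )
    return address in LAB_NETWORK and address not in reserved


for candidate in ("192.168.1.50", "192.168.1.100", "10.0.0.5"):
    print(candidate, is_valid_windows_vm_ip(candidate))
```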
You need not restrict yourself to the lab architecture shown in the preceding figure. Different lab configurations are possible, but it is not feasible to provide instructions for every possible configuration. In this book, I will show you how to set up and use the lab architecture shown in the preceding figure.
It is also possible to set up a lab consisting of multiple VMs running different versions of Windows; this will allow you to analyze the malware specimen on various versions of Windows operating systems. An example configuration containing multiple Windows VMs will look similar to the one shown in the following diagram:
To set up the Linux VM, I will use Ubuntu 16.04.2 LTS Linux distribution (http://releases.ubuntu.com/16.04/). The reason I have chosen Ubuntu is that most of the tools covered in this book are either preinstalled or available through the apt-get package manager. The following is a step-by-step procedure to configure Ubuntu 16.04.2 LTS on VMware and VirtualBox. Feel free to follow the instructions given here depending on the virtualization software (either VMware or VirtualBox) installed on your system:
If you are not familiar with installing and configuring virtual machines, refer to VMware's guide at http://pubs.vmware.com/workstation-12/topic/com.vmware.ICbase/PDF/workstation-pro-12-user-guide.pdf or the VirtualBox user manual (https://www.virtualbox.org/manual/UserManual.html).
- Download Ubuntu 16.04.2 LTS from http://releases.ubuntu.com/16.04/ and install it in VMware Workstation/Fusion or VirtualBox. If you wish to install any other version of Ubuntu Linux, you are free to do so as long as you are comfortable installing packages and solving any dependency issues.
- Install the Virtualization Tools on Ubuntu; this will allow Ubuntu's screen resolution to automatically adjust to match your monitor's geometry and provide additional enhancements, such as the ability to share clipboard content and to copy/paste or drag and drop files across your underlying host machine and the Linux virtual machine. To install virtualization tools on VMware Workstation or VMware Fusion, you can follow the procedure mentioned at https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1022525 or watch the video at https://youtu.be/ueM1dCk3o58. Once installed, reboot the system.
- If you are using VirtualBox, you must install Guest Additions software. To accomplish this, from the VirtualBox menu, select Devices | Insert guest additions CD image. This will bring up the Guest Additions Dialog Window. Then click on Run to invoke the installer from the virtual CD. Authenticate with your password when prompted and reboot.
- Once the Ubuntu operating system and the virtualization tools are installed, start the Ubuntu VM and install the following tools and packages.
- Install pip; pip is a package management system used to install and manage packages written in Python. In this book, I will be running a few Python scripts; some of them rely on third-party libraries. To automate the installation of third-party packages, you need to install pip. Run the following commands in the terminal to install and upgrade pip:
$ sudo apt-get update
$ sudo apt-get install python-pip
$ pip install --upgrade pip
The following are some of the tools and Python packages that will be used in this book. To install these tools and Python packages, run these commands in the terminal:
$ sudo apt-get install python-magic
$ sudo apt-get install upx
$ sudo pip install pefile
$ sudo apt-get install yara
$ sudo pip install yara-python
$ sudo apt-get install ssdeep
$ sudo apt-get install build-essential libffi-dev python python-dev \
libfuzzy-dev
$ sudo pip install ssdeep
$ sudo apt-get install wireshark
$ sudo apt-get install tshark
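After running the install commands, a quick check such as the following sketch can confirm that the Python packages can be imported and that the command-line tools are on the PATH. The import names (magic, pefile, yara, ssdeep) and executable names are assumptions based on the packages installed above, and the helper functions are hypothetical. The code avoids version-specific helpers so that it should run under the same Python interpreter that pip installed the packages for.

```python
import os

# Assumed import names and executables for the packages installed above.
PYTHON_MODULES = ["magic", "pefile", "yara", "ssdeep"]
COMMANDS = ["upx", "yara", "ssdeep", "wireshark", "tshark"]


def missing_modules(names):
    """Return the module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


def missing_commands(names):
    """Return the command names not found as executables on the PATH."""
    directories = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    missing = []
    for name in names:
        candidates = (os.path.join(directory, name) for directory in directories)
        if not any(os.path.isfile(p) and os.access(p, os.X_OK) for p in candidates):
            missing.append(name)
    return missing


print("Missing Python modules: %s" % (missing_modules(PYTHON_MODULES) or "none"))
print("Missing commands: %s" % (missing_commands(COMMANDS) or "none"))
```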
- INetSim (http://www.inetsim.org/index.html) is a powerful utility that simulates various Internet services (such as DNS and HTTP) that malware frequently expects to interact with. You will configure INetSim later in this section, and its use is covered in detail in Chapter 3, Dynamic Analysis. To install INetSim, use the following commands. If you have difficulties installing INetSim, refer to the documentation (http://www.inetsim.org/packages.html):
$ sudo su
# echo "deb http://www.inetsim.org/debian/ binary/" > \
/etc/apt/sources.list.d/inetsim.list
# wget -O - http://www.inetsim.org/inetsim-archive-signing-key.asc | \
apt-key add -
# apt update
# apt-get install inetsim
- You can now isolate the Ubuntu VM within your lab by configuring the virtual appliance to use Host-only network mode. On VMware, bring up the Network Adapter Settings and choose Host-only mode, as shown in the following figure. Save the settings and reboot.
In VirtualBox, shut down the Ubuntu VM, then bring up the Network settings and change the adapter settings to Host-only Adapter, as shown in the following diagram; click on OK.
On VirtualBox, sometimes when you choose the Host-only adapter option, the interface name might appear as Not selected. In that case, you need to first create at least one host-only interface by navigating to Host-only networks | Add host-only network. Click on OK; then bring up the Network settings and change the adapter settings to Host-only Adapter, as shown in the following screenshot. Click on OK.
- Now we will assign a static IP address of 192.168.1.100 to the Ubuntu Linux VM. To do that, power on the Linux VM, open the terminal window, type the command ifconfig, and note down the interface name. In my case, the interface name is ens33. In your case, the interface name might be different; if it is, adjust the following steps accordingly. Open the file /etc/network/interfaces using the following command:
$ sudo gedit /etc/network/interfaces
Add the following entries at the end of the file (make sure you replace ens33 with the interface name on your system) and save it:
auto ens33
iface ens33 inet static
address 192.168.1.100
netmask 255.255.255.0
The /etc/network/interfaces file should now look like the one shown here. The newly added entries are the last four lines:
# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback
auto ens33
iface ens33 inet static
address 192.168.1.100
netmask 255.255.255.0
Then restart the Ubuntu Linux VM. At this point, the IP address of the Ubuntu VM should be set to 192.168.1.100. You can verify that by running the following command:
$ ifconfig
ens33     Link encap:Ethernet  HWaddr 00:0c:29:a8:28:0d
          inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fea8:280d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:21 errors:0 dropped:0 overruns:0 frame:0
          TX packets:49 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:5187 (5.1 KB)  TX bytes:5590 (5.5 KB)
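If you prefer to check the address programmatically rather than by eye, the following sketch pulls the IPv4 address of a given interface out of ifconfig-style text. It is an illustration only: the function name is hypothetical, and the regular expression assumes the output layout shown above (it also tolerates the newer layout that prints inet without the addr: prefix).

```python
import re


def interface_ipv4(ifconfig_output, interface):
    """Return the IPv4 address reported for interface, or None if not found."""
    current = None
    for line in ifconfig_output.splitlines():
        # An interface block starts with its name in the first column;
        # continuation lines are indented.
        header = re.match(r"^(\S+)", line)
        if header:
            current = header.group(1).rstrip(":")
        address = re.search(r"inet (?:addr:)?(\d+\.\d+\.\d+\.\d+)", line)
        if address and current == interface:
            return address.group(1)
    return None


# Shortened copy of the output shown above, plus a loopback block.
sample_output = (
    "ens33     Link encap:Ethernet  HWaddr 00:0c:29:a8:28:0d\n"
    "          inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0\n"
    "          inet6 addr: fe80::20c:29ff:fea8:280d/64 Scope:Link\n"
    "\n"
    "lo        Link encap:Local Loopback\n"
    "          inet addr:127.0.0.1  Mask:255.0.0.0\n"
)
print(interface_ipv4(sample_output, "ens33"))
```

On a live system, the text would come from running ifconfig (for example through the subprocess module) instead of a hard-coded string.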
- The next step is to configure INetSim so that it can listen to and simulate all the services on the configured IP address 192.168.1.100. By default, it listens on the local interface (127.0.0.1), which needs to be changed to 192.168.1.100. To do that, open the configuration file located at /etc/inetsim/inetsim.conf using the following command:
$ sudo gedit /etc/inetsim/inetsim.conf
Go to the service_bind_address section in the configuration file and add the entry service_bind_address 192.168.1.100. The added entry is the last line of the section, which should look like this:
# service_bind_address
#
# IP address to bind services to
#
# Syntax: service_bind_address <IP address>
#
# Default: 127.0.0.1
#
#service_bind_address 10.10.10.1
service_bind_address 192.168.1.100
By default, INetSim's DNS server will resolve all the domain names to 127.0.0.1. Instead of that, we want the domain names to resolve to 192.168.1.100 (the IP address of the Linux VM). To do that, go to the dns_default_ip section in the configuration file and add the entry dns_default_ip 192.168.1.100. The added entry is the last line of the section, which should look like this:
# dns_default_ip
#
# Default IP address to return with DNS replies
#
# Syntax: dns_default_ip <IP address>
#
# Default: 127.0.0.1
#
#dns_default_ip 10.10.10.1
dns_default_ip 192.168.1.100
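A small script can confirm that both entries are active (that is, not commented out) in the configuration file. The following is a minimal sketch with a hypothetical function name; it reports the last uncommented value it sees for each key and makes no claim about how INetSim itself treats duplicate entries.

```python
def active_settings(config_text, keys=("service_bind_address", "dns_default_ip")):
    """Return the uncommented value for each key in inetsim.conf-style text."""
    settings = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out examples
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in keys:
            settings[parts[0]] = parts[1].strip()
    return settings


# Excerpt mirroring the two sections edited above.
sample_config = (
    "# Default: 127.0.0.1\n"
    "#service_bind_address 10.10.10.1\n"
    "service_bind_address 192.168.1.100\n"
    "#dns_default_ip 10.10.10.1\n"
    "dns_default_ip 192.168.1.100\n"
)
print(active_settings(sample_config))
```

To check the real file, read /etc/inetsim/inetsim.conf and pass its contents to active_settings; both keys should report 192.168.1.100.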
Once the configuration changes are done, save the configuration file and launch the INetSim main program. Verify that all the services are running and also check whether inetsim is listening on 192.168.1.100, as shown in the following output. You can stop the service by pressing CTRL+C:
$ sudo inetsim
INetSim 1.2.6 (2016-08-29) by Matthias Eckert & Thomas Hungenberg
Using log directory: /var/log/inetsim/
Using data directory: /var/lib/inetsim/
Using report directory: /var/log/inetsim/report/
Using configuration file: /etc/inetsim/inetsim.conf
=== INetSim main process started (PID 2640) ===
Session ID: 2640
Listening on: 192.168.1.100
Real Date/Time: 2017-07-08 07:26:02
Fake Date/Time: 2017-07-08 07:26:02 (Delta: 0 seconds)
Forking services...
* irc_6667_tcp - started (PID 2652)
* ntp_123_udp - started (PID 2653)
* ident_113_tcp - started (PID 2655)
* time_37_tcp - started (PID 2657)
* daytime_13_tcp - started (PID 2659)
* discard_9_tcp - started (PID 2663)
* echo_7_tcp - started (PID 2661)
* dns_53_tcp_udp - started (PID 2642)
[..........REMOVED.............]
* http_80_tcp - started (PID 2643)
* https_443_tcp - started (PID 2644)
done.
Simulation running.
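Besides reading the startup output, you can probe the simulated services over the network while INetSim is running. The sketch below is one possible check, not part of INetSim: the function name is hypothetical, and it only tests whether a TCP port accepts connections, so UDP-only services such as NTP are out of scope.

```python
import socket


def is_tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
    connection.close()
    return True


# Example, to be run inside the lab while INetSim is running
# (ports taken from the startup output above):
#
#   for port in (53, 80, 443, 6667):
#       print(port, is_tcp_port_open("192.168.1.100", port))
```

Running the check from the Windows VM once it is set up also confirms that the two virtual machines can reach each other over the host-only network.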
- At some point, you need the ability to transfer files between the host and the virtual machine. To enable that on VMware, power off the virtual machine, bring up the Guest Isolation settings, and check both Enable drag and drop and Enable copy and paste.
On VirtualBox, while the virtual machine is powered off, bring up the Advanced settings and make sure that both Shared Clipboard and Drag 'n' Drop are set to Bidirectional. Click on OK.
- At this point, the Linux VM is configured to use Host-only mode, and INetSim is set up to simulate all the services. The last step is to take a snapshot (clean snapshot) and give it a name of your choice so that you can revert to the clean state when required. To take a snapshot on VMware Workstation, click on Take Snapshot. On VirtualBox, the same can be done by clicking on Take Snapshot.
Apart from the drag and drop feature, it is also possible to transfer files from the host machine to the virtual machine using shared folders; refer to the following for VirtualBox (https://www.virtualbox.org/manual/ch04.html#sharedfolders) and to the following for VMware (https://docs.vmware.com/en/VMware-Workstation-Pro/14.0/com.vmware.ws.using.doc/GUID-AACE0935-4B43-43BA-A935-FC71ABA17803.html).
Before setting up the Windows VM, you first need to install a Windows operating system (Windows 7, Windows 8, and so on) of your choice in the virtualization software (such as VMware or VirtualBox). Once you have Windows installed, follow these steps:
- Download Python from https://www.python.org/downloads/. Be sure to download Python 2.7.x (such as 2.7.13); most of the scripts used in this book are written to run on the Python 2.7 version and may not run correctly on Python 3. After you've downloaded the file, run the installer. Make sure you check the options to install pip and Add python.exe to Path, as shown in the following screenshot. Installing pip will make it easier to install any third-party Python libraries, and adding Python to the path will make it easier to run Python from any location.
- Configure your Windows VM to run in Host-only network configuration mode. To do that in VirtualBox, bring up the Network Settings and choose the Host-only mode; save the settings and reboot (this step is similar to the one covered in the Setting Up and Configuring Linux VM section).
- Configure the IP address of the Windows VM to 192.168.1.x (choose any IP address except 192.168.1.100 because the Linux VM is set to use that IP) and set your Default gateway and DNS server to the IP address of the Linux VM (that is, 192.168.1.100), as shown in the following screenshot. This configuration is required so that when we execute the hostile program on the Windows VM, all of the network traffic will be routed through the Linux VM.
- Power on both the Linux VM and the Windows VM, and make sure they can communicate with each other. You can check for connectivity by running the ping command, as shown in this screenshot:
- Windows Defender Service needs to be disabled on your Windows VM as it may interfere when you are executing the malware sample. To do that, press the Windows key + R to open the Run menu, enter gpedit.msc, and hit Enter to launch the Local Group Policy Editor. In the left-hand pane of Local Group Policy Editor, navigate to Windows Defender. In the right-hand pane, double-click on the Turn off Windows Defender policy to edit it; then select Enabled and click on OK.
- To be able to transfer files (drag and drop) and to copy clipboard content between the host machine and the Windows VM, follow the instructions as mentioned in Step 7 of the Setting Up and Configuring Linux VM section.
- Take a clean snapshot so that you can revert to the pristine/clean state after every analysis. The procedure to take a snapshot was covered in Step 10 of the Setting Up and Configuring Linux VM section.
At this point, your lab environment should be ready. The Linux and Windows VMs in your clean snapshot should be in Host-only network mode and should be able to communicate with each other. Throughout this book, I will be covering various malware analysis tools; if you wish to use those tools, you can copy them to the clean snapshot on the virtual machines. To keep your clean snapshot up to date, just transfer/install those tools on the virtual machines and take a new clean snapshot.
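If you use VirtualBox and find yourself reverting to the clean snapshot after every analysis, the step can be scripted. The following sketch builds VBoxManage snapshot commands from Python; the VM name and snapshot name are hypothetical placeholders, and it assumes VBoxManage is on the host's PATH. Restoring a snapshot generally requires the virtual machine to be powered off first.

```python
import subprocess


def virtualbox_snapshot_command(vm_name, snapshot_name, action="take"):
    """Build a VBoxManage command that takes or restores a snapshot."""
    if action not in ("take", "restore"):
        raise ValueError("action must be 'take' or 'restore'")
    return ["VBoxManage", "snapshot", vm_name, action, snapshot_name]


def run_snapshot_command(vm_name, snapshot_name, action="take"):
    """Run the command on the host; raises CalledProcessError on failure."""
    subprocess.check_call(virtualbox_snapshot_command(vm_name, snapshot_name, action))


# "Windows-VM" and "clean" are placeholder names; substitute your own.
print(" ".join(virtualbox_snapshot_command("Windows-VM", "clean", action="restore")))
```

Building the command as a list and printing it first makes it easy to review what will run before calling run_snapshot_command on the host.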
Once you have a lab set up, you will need malware samples for performing analysis. In this book, I have used various malware samples in the examples. Since these samples are from real attacks, I have decided not to distribute them, as there may be legal issues with distributing such samples with the book. You can find them (or similar samples) by searching various malware repositories. The following are some of the sources from where you can get malware samples for your analysis. Some of these sources allow you to download malware samples for free (or after free registration), and some require you to contact the owner to set up an account, after which you will be able to obtain the samples:
- Hybrid Analysis: https://www.hybrid-analysis.com/
- KernelMode.info: http://www.kernelmode.info/forum/viewforum.php?f=16
- VirusBay: https://beta.virusbay.io/
- Contagio malware dump: http://contagiodump.blogspot.com/
- AVCaesar: https://avcaesar.malware.lu/
- Malwr: https://malwr.com/
- VirusShare: https://virusshare.com/
- theZoo: http://thezoo.morirt.com/
You can find links to various other malware sources in Lenny Zeltser's blog post https://zeltser.com/malware-sample-sources/.
If none of the aforementioned methods work for you and you wish to get the malware samples used in this book, please feel free to contact the author.
Setting up an isolated lab environment is crucial before analyzing malicious programs. While performing malware analysis, you will usually run the hostile code to observe its behavior, so having an isolated lab environment will prevent the accidental spreading of malicious code to your system or production systems on your network. In the next chapter, you will learn about the tools and techniques to extract valuable information from the malware specimen using Static Analysis.