Preventing Ransomware

By Abhijit Mohanta , Mounir Hahad , Kumaraguru Velmurugan

About this book

Ransomware has turned out to be the most aggressive malware and has affected numerous organizations in the recent past. The current need is to have a defensive mechanism in place for workstations and servers under one organization.

This book starts by explaining the basics of malware, specifically ransomware. The book provides some quick tips on malware analysis and how you can identify different kinds of malware. We will also take a look at different types of ransomware, and how it reaches your system, spreads in your organization, and hijacks your computer. We will then move on to how the ransom is paid and the negative effects of doing so. You will learn how to respond quickly to ransomware attacks and how to protect yourself. The book gives a brief overview of the internals of security software and Windows features that can be helpful in ransomware prevention for administrators. You will also look at practical use cases in each stage of the ransomware phenomenon. The book talks in detail about the latest ransomware attacks involving WannaCry, Petya, and BadRabbit.

By the end of this book, you will have end-to-end knowledge of the trending malware in the tech industry at present.

Publication date:
March 2018
Publisher
Packt
Pages
266
ISBN
9781788620604

 

Chapter 1. Malware from Fun to Profit

Malware is software with malicious intent that changes the system without the user's knowledge. Malware uses the same technologies as genuine software, but with bad intent. The following are some examples:

  • Software such as TrueCrypt uses algorithms and techniques to encrypt a file to protect privacy, but, at the same time, ransomware uses the same algorithms to encrypt files to extort the user.
  • Similarly, Firefox uses the HTTP protocol to browse the web, while malware uses the HTTP protocol to post its stolen data to its command and control (C&C) server.

This chapter will help you understand malware, including aspects such as self-protection, armoring, and surviving a reboot. Condensing every malware concept into one chapter is tough, so the concepts are explained briefly, enough for readers to follow the terminology used later. To understand the minute technical details, readers should dig further into the keywords. Most things are explained in the context of the Windows operating system. The chapter starts with some basic Windows concepts such as virtual memory and DLLs, and gives simplified illustrations of concepts such as API hooking and rootkits without much technical depth. A section of the chapter covers the various types of malware and some historical background. Going forward, you will be able to correlate ransomware with other malware. Readers are advised to read this chapter carefully, as the concepts explained here are referred to in upcoming chapters.

 

1. The malware story


History has seen a lot of malware. Malware has mutated over decades to what you see now. This section tells you about some historical landmarks in the malware industry.

1.1 Malware in the womb

Like all kinds of research, virus research started with theoretical work. In 1949, John von Neumann gave a series of lectures on self-reproducing automata; the work was later published as Theory of Self-Reproducing Automata.

Proofs of concept then followed von Neumann's theory.

1.2 The birth of malware

Creeper was an experimental virus written by Bob Thomas as a proof of concept for von Neumann's theory. It was a self-replicating program that made copies of itself on the same system and on other systems too. Creeper did no harm to the system other than filling up space.

1.3 Malware started crawling

The Rabbit virus, created in 1974, was a self-replicating virus and caused the system to crash by eating up system resources.

Animal was the first Trojan. It did not do any damage to the system but moved on to different systems.

Frederick Cohen first coined the term virus for software.

Brain was the first boot sector virus, released in 1986. The virus was created in Pakistan and is therefore also called the Lahore virus.

Brain was followed by the Vienna virus and the Lehigh virus.

The Morris worm, created in 1988 by Robert Tappan Morris, had the capability to spread using the internet by exploiting a buffer overflow vulnerability.

Ghostball, released in 1989, was the first multipartite virus: it infected both executable files and boot sectors.

1.4 Malware started playing

Happy99 was a worm that appeared in early 1999. It used to attach itself to emails and display fireworks.

The Melissa worm, released in 1999, was meant for Microsoft Word and created a lot of network traffic.

The ILOVEYOU worm, also known as the Love Bug worm, was released in 2000. It was written in VBScript by a Filipino student and spread to millions of computers in a short amount of time.

The Code Red worm spread among Microsoft systems in 2001. The Nimda worm, the next famous one, appeared in the same year and also infected Microsoft operating systems.

SQL Slammer was seen in 2003 and spread across the internet by exploiting a bug in Microsoft SQL Server.

Bagle and Brontok were mass-mailing worms seen in the 2000s.

Conficker was another notorious piece of malware seen in 2008 which was known to infect Microsoft servers.

1.5 Malware started earning

Most forms of malware, when they first appeared, were not written to generate revenue. Early computers were mostly used for email.

With the growth of technology, computers were used in banking. Computers were commercialized and many people bought computers at low cost. People started using computers for storing personal data, playing games, and banking.

In 2006, Zeus, a form of banking malware, was detected. SpyEye was another banking Trojan along similar lines to Zeus. Other banking Trojans that followed were Tinba, GozNym, and Dyre. These pieces of malware stole bank usernames, passwords, credit card details, and other personal information.

Malware also expanded from desktop users to shopkeepers. As the use of point of sale (POS) devices increased worldwide, malware started targeting them. BlackPOS was seen in 2012. It was followed by Alina, Skimmer, and BackOff.

Malware also came into the extortion business. The first ransomware attack was heard of in 1989 and targeted the healthcare industry. After that, ransomware attacks were not seen much. In 2005, ransomware was seen again. It started with screen lockers and then moved on to crypto-ransomware. CryptoLocker was seen in 2013. Soon, a lot of crypto-ransomware was seen. Locky and Cerber were the famous ones.

In most cases of malware infection, the victim was chosen at random. But there were attacks that targeted individuals. These kinds of attacks used a combination of multiple forms of malware and social engineering. Stuxnet was one of the most famous targeted attacks in history.

The following sections describe malware techniques and types of malware. Operating system concepts will be explained in a very simple manner as they will be used later in the book. Minute technical details will not be explained here.

 

2. Windows operating system basics


Malware analysis is a subject in itself. To technically understand malware, one needs to have a good knowledge of operating system internals. In this chapter, only a few operating system concepts, such as file format, virtual memory, API hooking, and DLL are explained in the simplest possible way so that understanding the malware becomes easy. The concepts have been explained in the context of the Windows operating system.

2.1 File format

File format is one of the most important concepts you need to understand in order to understand malware. Here is a simple task for readers to perform in order to understand the concept of file format:

  1. Open WordPad on a Windows machine by typing wordpad in the Windows search box.
  2. Type this is my text in the newly opened WordPad window, then save the file with the name test.rtf. When you save the file, a window pops up asking whether you want to save it in Rich Text Format (RTF). Just give it the name test.rtf and save it.

 

  3. Now open test.rtf with Notepad. You can simply do this by right-clicking on test.rtf, going to Open with, and browsing and opening with notepad.exe. What do you see?

test.rtf opened in Notepad

Your text lies toward the end and the file starts with {\rtf1. This is how WordPad has saved whatever we wrote into it: in what is called the RTF file format. There is other information saved in the file too. For example, information about the font is stored in a tag that starts with {\fonttbl. Here, the font used is Calibri, as you can see in the screenshot. When you open the file with WordPad, the WordPad program parses the file format and displays the meaningful data to the user. In short, the RTF file format tells the WordPad program how it should display the stored text to the user. File formats are complex structures which can have multiple substructures inside them, in a hierarchical order.
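As a rough illustration of what parsing a file format involves, the following Python sketch checks a byte string for the {\rtf1 signature and for a {\fonttbl group. The function name describe_rtf and the sample bytes are hypothetical, made up for this example; a real RTF parser handles far more than these two markers.

```python
def describe_rtf(data: bytes) -> dict:
    """Report two simple facts about a byte string that may be an RTF document."""
    return {
        # An RTF document begins with the group "{\rtf1".
        "is_rtf": data.startswith(b"{\\rtf1"),
        # Font information, if present, is stored in a "{\fonttbl" group.
        "has_font_table": b"{\\fonttbl" in data,
    }


# Hypothetical sample resembling what a saved test.rtf might contain.
sample = b"{\\rtf1\\ansi{\\fonttbl{\\f0\\fnil Calibri;}}\\f0 this is my text\\par}"

print(describe_rtf(sample))
print(describe_rtf(b"this is my text"))
```

The same idea, recognizing a signature first and then walking the nested structures, applies to the other file formats mentioned in this section.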

There are numerous file formats for different programs. Microsoft Word and Excel can parse legacy DOC and XLS files, which follow the Object Linking and Embedding (OLE) compound file format, while the newer DOCX format is a ZIP-based container. Similarly, the Adobe and Foxit PDF readers can read the PDF file format.

A binary or executable in Windows follows the PE file format. Microsoft Windows has a component called the loader that can parse the .exe with reference to the PE file structure. The loader finds out details such as which code needs to be executed first (this is called the entry point) and how the executable should be placed in virtual memory. Similarly, a Linux executable follows the ELF file format.
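To make the idea of the loader reading the entry point more concrete, here is a minimal Python sketch that reads the AddressOfEntryPoint field from the headers of a PE file. The helper names read_entry_point and build_fake_pe are hypothetical, and the "file" used here is a tiny synthetic byte string with just enough structure for the demonstration; it is not a runnable executable, and the real loader does much more validation.

```python
import struct


def read_entry_point(data: bytes) -> int:
    """Return AddressOfEntryPoint (a relative virtual address) from PE file bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte field at offset 0x3C (e_lfanew) holds the offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header;
    # AddressOfEntryPoint sits 16 bytes into the optional header.
    (entry_rva,) = struct.unpack_from("<I", data, pe_offset + 4 + 20 + 16)
    return entry_rva


def build_fake_pe(entry_rva: int) -> bytes:
    """Build a synthetic, non-runnable byte string with minimal PE headers."""
    pe_offset = 0x40
    buf = bytearray(pe_offset + 4 + 20 + 20)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<I", buf, pe_offset + 4 + 20 + 16, entry_rva)
    return bytes(buf)


print(hex(read_entry_point(build_fake_pe(0x1234))))
```

On a real system, the same function could be pointed at the bytes of an actual .exe read from disk, under the assumption that the file is a well-formed PE.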

There is an exhaustive list of file formats on Wikipedia at https://en.wikipedia.org/wiki/List_of_file_formats.

2.2 Windows executable made simple

What is a Windows executable? What happens when you double-click an .exe? Every operating system has a way to execute a binary or executable. In the case of Windows, an executable file name ends with .exe and the file is in a format called PE. When you double-click a Windows binary (for example, iexplore.exe in C:\Program Files\Internet Explorer is the executable for Internet Explorer), Windows parses the iexplore.exe file according to the PE file format and finds the code that it needs to execute first. The location of this code in the .exe file is called the entry point. Technically, many steps are involved before Windows executes the code at the entry point; for example, Windows maps the executable and its supporting libraries (DLLs) into virtual memory (explained in the next section). When the code is executed in virtual memory, we call it a thread. A process consists of one or more threads. How a Windows process is created is explained in detail in the book Windows Internals, Part 1 by Mark E. Russinovich, one of the best books for learning about Windows operating system internals.

2.3 Windows virtual memory made simple

When an executable (.exe) is double-clicked, a process is created. We talked about this in the last section. Each process has its own virtual memory. The code of the executable and supporting libraries (DLLs) are loaded into virtual memory. Now, each process in 32-bit Windows has a 4 GB virtual memory address space. Does that sound confusing?

If a computer has 2 GB of RAM, then how can each process have 4 GB of virtual memory? Well, here is a simplified explanation. The virtual memory is split into regions called pages. The processor cannot execute all the code at one time, so only the pages which contain currently executing code are loaded into RAM. At any given instant, the RAM holds pages from the virtual memory of various processes. Windows memory management takes care of loading and unloading pages through a method termed paging. Virtual memory gives a process the illusion of having 4 GB of RAM.

There is a tool called Process Hacker which, by default, shows the processes executing on the system. Double-clicking on a process name (in this case, Notepad) brings up a new window for that particular process. The window has various tabs corresponding to a property of a process, such as modules, memory, and threads.

The libraries (DLLs) used by Notepad and the notepad.exe itself are called modules. The Modules tab in the Process Hacker tool shows the loaded modules:

Note

What we see here is the virtual memory, not the physical memory or RAM. Base Address in the screenshot is the start address of a particular module in virtual memory and Size denotes the size of the module.

Modules in notepad.exe's virtual address space

In virtual memory, a module can be split into several pages. It's not necessary that all the code in a particular module will be there in the physical memory or RAM as explained earlier.

If you switch to the Memory tab, you can see the address:

Pages in the notepad.exe virtual address space

The preceding screenshot of the Process Hacker tool shows a page which starts at address 0x1000000 of size 4 KB in the virtual memory of the Notepad process. The page lies in the notepad.exe module. The notepad.exe module is divided into four memory blocks. Each memory block is composed of pages. Process Hacker displays contiguous pages with the same properties as a single memory block.

The four memory blocks start at the 0x1000000, 0x1001000, 0x1009000, 0x100b000 addresses in virtual memory.
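The arithmetic behind pages and memory blocks can be sketched in a few lines of Python. This assumes 4 KB pages, as in the example above, and assumes the four blocks are contiguous so that each block ends where the next one starts; the size of the last block cannot be derived from the start addresses alone. The function names are hypothetical.

```python
PAGE_SIZE = 4 * 1024  # 4 KB pages, as in the example above


def split_address(address: int) -> tuple:
    """Split a virtual address into (page number, offset within the page)."""
    return divmod(address, PAGE_SIZE)


def block_page_counts(starts: list) -> list:
    """Pages in each block, measured up to the start of the next block."""
    return [(nxt - cur) // PAGE_SIZE for cur, nxt in zip(starts, starts[1:])]


blocks = [0x1000000, 0x1001000, 0x1009000, 0x100B000]

print(split_address(0x1001234))      # (4097, 564)
print(block_page_counts(blocks))     # [1, 8, 2]
print((4 * 1024**3) // PAGE_SIZE)    # 1048576 pages in a 4 GB address space
```

Under these assumptions, the first three blocks of notepad.exe span 1, 8, and 2 pages respectively.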

Other than the modules, there are pages allocated for different purposes, such as heap and stack. Heap and stack are used by the program while assigning variables and assigning memory for carrying out certain operations, such as decryption.

The virtual memory of a 32-bit process is 4 GB, which is further divided into user space and kernel space, with 2 GB each by default. User space is specific to a process while kernel space is shared by all processes. Kernel space includes critical device drivers and other critical code of the operating system.

Paging is a much more complex operation than explained here, and Windows uses a combination of hardware and several data structures to implement it.

2.4 Windows DLL made simple

A lot of programs need to perform the same set of operations. For example, WordPad, Notepad, and Adobe Acrobat Reader all need to open a file, close a file, and write to a file. Similarly, Internet Explorer and Mozilla Firefox both need to connect to the internet. Writing code to write to a file or connect to the internet separately for each of these programs would lead to redundancy. So, the concept of the library came into programming. A library has an implementation of some common functionalities which can be used by multiple programs.

DLL is one such concept and is shorthand for dynamic link library. A DLL has many functions in it that can be used by other programs. A DLL is loaded into the virtual address space of the executable and follows the PE file format. The DLL makes its functions available to the programs that need them through an export table (an export table is part of the PE file format).

The functions are called APIs. The export table contains the address of the APIs:

kernel32.dll export CopyFile API

The preceding screenshot shows a tool called CFF Explorer that can show the components of a PE file. The screenshot displays the CopyFile API exported by kernel32.dll. As the name suggests, the API can be used by other programs to copy a file.

Note

DLL is also an executable but you can't just execute it by double-clicking. DLL functions should be called by another executable to execute the code inside a DLL. Windows has a tool called rundll32.exe to execute DLLs.

kernel32.dll (location: C:\Windows\system32) has the functions CreateFile() and WriteFile(), which can be used to create, open, or write to a file. These functions are available in the export table of kernel32.dll. An executable that needs to write to a file imports this DLL and loads it into its virtual memory space. It then calls the WriteFile() function whenever it needs to write to a file. So kernel32.dll can be used by both WordPad and Notepad, which removes the need for them to implement these functions on their own.

2.4.1 How does an API call happen?

When an exe is mapped into virtual memory during the creation of a process, its supporting DLLs are also loaded into the virtual memory of the process. The APIs in a DLL are assigned certain addresses in virtual memory. These addresses are not fixed; Windows assigns them every time a process is created. If a program needs to call an API in a DLL, it needs to use the address of the API.

Note

This is a simplified version of how an executable calls an API. To understand the technical details, the reader should go through the "import table" in the PE file format. The following link explains the PE file format in detail, including the import table: https://msdn.microsoft.com/en-us/library/ms809762.aspx.
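The idea that an API's address is only known once the DLL has been loaded can be modeled with a toy Python sketch. Everything here is hypothetical: the offsets in EXPORT_TABLE are invented, the base address is picked at random, and the load_dll function is a stand-in rather than anything Windows provides. The point is only that a program must look the address up instead of hard-coding it.

```python
import random

# Hypothetical offsets of two APIs from the start of a DLL (not real values).
EXPORT_TABLE = {"CreateFile": 0x1A2B0, "WriteFile": 0x1C4D0}


def load_dll(export_table: dict, rng: random.Random) -> dict:
    """Pretend to load a DLL: pick an aligned base and compute each API address."""
    base = rng.randrange(0x10000, 0x7FFF0000, 0x10000)
    return {name: base + offset for name, offset in export_table.items()}


first_run = load_dll(EXPORT_TABLE, random.Random(1))
second_run = load_dll(EXPORT_TABLE, random.Random(2))

for name in EXPORT_TABLE:
    print(name, hex(first_run[name]), hex(second_run[name]))
```

In this model the distance between two APIs in the same DLL never changes, while their absolute addresses depend on where the DLL was placed, which is why the lookup goes through a table.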

If you wish to know about a Microsoft API, you can search in Microsoft Developer Network (MSDN). Here is an example of a description of the Sleep function in MSDN:

VOID WINAPI Sleep(
  _In_ DWORD dwMilliseconds
);

The Sleep function takes an input parameter in milliseconds. It is good practice to refer to MSDN when there is a reference to a certain Microsoft API in this book. MSDN has detailed descriptions of the APIs.
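As a small, hedged example of a program calling an API exported by a DLL, the following Python sketch calls Sleep from kernel32.dll through the standard ctypes module. The wrapper name sleep_ms is hypothetical, and the non-Windows fallback is only there so that the sketch runs elsewhere; it is not part of the Windows API.

```python
import ctypes
import sys
import time


def sleep_ms(milliseconds: int) -> None:
    """Pause the calling thread, using kernel32's Sleep when running on Windows."""
    if sys.platform == "win32":
        # Sleep takes a single DWORD: the time to wait, in milliseconds.
        ctypes.windll.kernel32.Sleep(ctypes.c_uint32(milliseconds))
    else:
        # Fallback so the sketch also runs on other systems.
        time.sleep(milliseconds / 1000)


start = time.monotonic()
sleep_ms(50)
print(f"slept for about {time.monotonic() - start:.3f} s")
```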

To understand the PE file format and its various structures in depth, the reader can refer to Peering Inside the PE: A Tour of the Win32 Portable Executable File Format by Matt Pietrek, on the Microsoft website at https://msdn.microsoft.com/en-us/library/ms809762.aspx.

2.5 API hooking made simple

Hooking is a frequently referenced word in malware. In very simple terms, malware performs API hooking to modify a legitimate API in such a way that the malware's code executes when a program calls the API. In other words, malware carries out its intent by modifying a genuine function.

Say a malware wants to see all the email that goes out from Internet Explorer. What does it need to do for this?

Before this, we need to understand how Internet Explorer will send the email. Internet Explorer uses the HttpSendRequest() Windows API from wininet.dll to send data to your email server:

  • As mentioned before, a DLL is mapped into the virtual address space of a process. The APIs are loaded at certain addresses in the memory allocated to the DLL, and an executable accesses an API by using its address.
  • The malware wants to intercept all calls made by Internet Explorer to the HttpSendRequest() API. Since the API is called using an address, the malware replaces this address with the address of its own code. We call this API hooking. In this case (see the following diagram), the malware has replaced the address of the API with the address of the hook-HttpSendRequest function, which is part of the injected malware code.
  • When Internet Explorer needs to call HttpSendRequest(), it uses its address (step 1). But the malware has already replaced this address with the address of hook-HttpSendRequest. Hence, control goes to the hook-HttpSendRequest function in the malware code (step 2).
  • Now the malicious code can see whatever data Internet Explorer passes to the HttpSendRequest() API. The malware can post your data to the hacker or do anything else it wants to.
  • The hook-HttpSendRequest() function calls the actual HttpSendRequest() (step 3). The hook function can manipulate the results returned from the original HttpSendRequest() and send those back to Internet Explorer (step 5):

Virtual memory of Internet Explorer showing API hooking
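The flow in the steps above can be modeled with a toy Python sketch. This is a conceptual simulation only: a dictionary stands in for the table of API addresses, an ordinary function stands in for the genuine API, and all names (api_table, install_hook, is_hooked) are hypothetical. It does not touch any real process or Windows API.

```python
def http_send_request(data: str) -> str:
    """Stand-in for the genuine API: pretend to send data and return a status."""
    return f"sent {len(data)} bytes"


# Toy "address table": the program looks the API up here before calling it.
api_table = {"HttpSendRequest": http_send_request}
captured = []


def install_hook(table: dict, name: str) -> None:
    """Replace the table entry with a wrapper that observes the call, then forwards it."""
    original = table[name]

    def hook(data: str) -> str:
        captured.append(data)       # step 2: the hook sees the arguments
        result = original(data)     # step 3: the hook calls the genuine function
        return result               # step 5: the result goes back to the caller

    table[name] = hook


def is_hooked(table: dict, name: str, known_good) -> bool:
    """A defender's check: does the table still point at the genuine function?"""
    return table[name] is not known_good


install_hook(api_table, "HttpSendRequest")
print(api_table["HttpSendRequest"]("hello"))   # step 1: the program calls via the table
print(captured)
print(is_hooked(api_table, "HttpSendRequest", http_send_request))
```

The is_hooked check hints at how integrity-checking tools can spot a hook: they compare the address a program would call against the address of the genuine function.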

Malware can use API hooks to hide its files and processes. A piece of malware can hook functions related to file, process, and registry enumeration in order to hide. Software that uses this technique is called a rootkit, which will be explained later in the chapter. In order to hide a file, malware has to hook the FindFirstFile() and FindNextFile() APIs, which are used to iterate over files on a Windows system. To hide a process, malware needs to hook the Process32First() and Process32Next() APIs, which are used to enumerate processes.

An API hook is specific to a process. If one hooks an API in a particular process, the hook does not propagate to other processes, because each process has its own virtual memory. So one has to hook the API in the other process too. Windows Task Manager and Sysinternals Process Explorer can both be used to see the running processes in a system. If malware hooks the Process32First() and Process32Next() APIs in Task Manager to hide its process, Process Explorer can still see the malicious process if the APIs are not hooked in Process Explorer.

Malware can use various methods to hook APIs. Most of them involve exploiting the data structures in the PE file format.

 

3. Malware components


Malware can have various components:

  • Payload: This is the core component of malware, designed to execute its actual motive
  • Obfuscator: Usually a packer or protector for encrypting or compressing the malware
  • Persistence: How the malware manages to stay in the system
  • Stealth component: Hides the malware from antivirus and other tools, and security analysts
  • Armoring: Protects the malware from being easily identified by researchers
  • Command and control (C&C): This is the control center for the malware

These components are explained in the following sections.

3.1 Payload

The payload is the core of the malware. Malware is created for different purposes. Here is a list:

  • Malware can steal data such as usernames, passwords, and browser data
  • It can steal credentials from the victim's machine
  • It can steal banking information
  • Malware can download other malware
  • Malware can show advertisements to a victim without their consent
  • It can act as ransomware

It's not limited to these and there can be many more functionalities. The part of the malware which executes these functionalities is called the payload. A payload is armed with techniques to protect and hide itself. Finally, before delivery, the malware is packaged with a packer or obfuscator, which adds an extra protective layer around the payload.

3.2 Obfuscator/packer – a wolf in sheep's clothing

One major objective of malware is to evade antivirus software. Malware can be obfuscated using packers and protectors. A packer compresses the data in malware, making it easier to transmit over the network. Obfuscation is a by-product of packing: the compressed malicious code looks very different from the original code, so it is hidden from plain sight as well as from antivirus software. A malware researcher has to reverse engineer the packed code to extract the malicious code. Antivirus researchers write code that does the same for antivirus engines. A packer can use several algorithms to compress the data. LZMA, APLib, LZSS, and ZLib are popular compression algorithms.

When a packer compresses the executable, it adds a decompression stub at the entry point of the exe and then adds the compressed data to the exe. A decompression stub is a piece of code or a function used to decompress the compressed data. It knows the location and size of the compressed data. When a packed executable is executed, the code in the decompression stub is executed first, which decompresses the compressed malicious code in memory. After this, the malicious code takes control:

Packed PE file
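The relationship between the compressed data and the decompression stub can be sketched with Python's standard zlib module. This is only a toy model under stated assumptions: the "payload" is a harmless placeholder string, the stub function is a hypothetical stand-in, and real packers operate on PE sections and often use other algorithms.

```python
import zlib

# Harmless placeholder standing in for the original program code.
payload = b"this stands in for the original program code " * 4

# What a packer would store in the file instead of the original bytes.
packed = zlib.compress(payload)


def stub(compressed: bytes) -> bytes:
    """Stand-in for the decompression stub: restore the original bytes in memory."""
    return zlib.decompress(compressed)


print(len(payload), len(packed))                 # the packed form is smaller
print(b"original program code" in packed)        # is the original text still visible?
print(stub(packed) == payload)                   # the stub recovers the original
```

A scanner that searches the packed bytes for a pattern taken from the original code would need to run or emulate the stub first, which is the obfuscation side effect described above.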

Packers come with additional code to make malware analysis harder.

There are several packers that can be used to pack and protect both genuine software and malware. Here are a few popular ones:

  • UPX
  • Aspack
  • Asprotect
  • PECompact

Researchers came up with generic methods to unpack known packers (UPX, a well-known packer, can be unpacked with the ESP trick). Antivirus products also added code that can unpack many of the known packers. Malware then moved on to custom packers to prevent inexperienced researchers from unpacking them. The number of custom packers increased over time, which made the work of security researchers harder.

3.3 Malware persistence

Malware needs to start again the next time the system reboots so that it can continue with its activity.

Windows has certain features which can help a program to start when Windows boots. Here are a few of them:

  • Startup folders
  • Run entries in registries
  • Windows services
  • Scheduled tasks
  • Files that are executed at Windows start

Startup folders and run entries are referred to in a lot of places in this book. An explanation of these terms follows.

3.3.1 Startup folders

If you keep programs or shortcuts to them in certain directories, they will execute when Windows starts.

C:\Documents and Settings\username\Start Menu\Programs\Startup and C:\Documents and Settings\All Users\Start Menu\Programs\Startup are two of them.

3.3.2 Run entries

The registry is a hierarchical database which keeps track of system settings. It has several registry keys for different purposes. A registry entry is usually a key-value pair. System settings also include the list of programs that need to start when the system boots. Malware researchers usually term them run entries.

Here are some frequently used keys:

  • HKCU\Software\Microsoft\Windows\CurrentVersion\RunOnce
  • HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run
  • HKCU\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\Shell
  • HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\Shell
  • HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\Userinit

The values under these keys contain the absolute path (full path) of the malicious program. When Windows starts, it launches the programs that these registry entries point to. That's how malicious programs start even before the user starts their work.
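An administrator can inspect such entries with a short read-only script. The following Python sketch uses the standard winreg module, which exists only on Windows, to list the values under one of the keys above; the function name list_values is hypothetical, and on other systems or when the key is missing the sketch simply returns an empty list.

```python
try:
    import winreg  # standard library module, available only on Windows
except ImportError:
    winreg = None


def list_values(hive_name: str, subkey: str) -> list:
    """Return (name, data) pairs stored under a registry key, or [] if unavailable."""
    if winreg is None:
        return []
    hive = getattr(winreg, hive_name)
    try:
        with winreg.OpenKey(hive, subkey) as key:
            # QueryInfoKey returns (number of subkeys, number of values, last modified).
            value_count = winreg.QueryInfoKey(key)[1]
            # EnumValue returns (name, data, type); keep only the name and data.
            return [winreg.EnumValue(key, i)[:2] for i in range(value_count)]
    except OSError:
        return []


entries = list_values(
    "HKEY_CURRENT_USER", r"Software\Microsoft\Windows\CurrentVersion\RunOnce"
)
for name, data in entries:
    print(name, "->", data)
```

Any entry whose data points to an unexpected executable path is worth a closer look.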

3.3.3 Windows services

Services are background processes in the Windows operating system. Some services execute independently while others execute under the svchost.exe process.

If you want to view the services installed on your Windows operating system, you can use the msconfig command. It gives a list of services, startup programs, and bootup programs. Many of the services need to be executed before the user logs in. The following registry keys are used to launch an exe as a service before the user logs in:

  • HKLM\SYSTEM\CurrentControlSet\services
  • HKLM\Software\Microsoft\Windows\CurrentVersion\RunServices
  • HKLM\Software\Microsoft\Windows\CurrentVersion\RunServicesOnce

The registry key points to the absolute path of the malware exe file.

Malware can also run as a service under the svchost.exe process. This is a Windows process. As the name suggests, it hosts services (svc is shorthand for services). The following registry key is associated with services executing under svchost:

  • HKLM\Software\Microsoft\Windows NT\CurrentVersion\Svchost

3.3.4 Files executed at Windows start

There are certain batch and init files that are executed at the system start. Here are a few of them:

  • C:\autoexec.bat
  • C:\Windows\wininit.ini
  • C:\Windows\winstart.bat

The malware places its absolute path in these files and it automatically executes at system start.

3.4 Stealth – a game of hide-and-seek

Malware needs to hide from the victim and antivirus. When a malware is executed on Windows, it creates its own file and registry entry in the system. It launches its own process and creates network connections. Malware can hide its files, process, and registry in multiple ways:

  • File properties
  • Injecting code into the legitimate process
  • Using rootkits
  • Fileless malware

3.4.1 File properties – an old-school trick

This is an old-school method still employed sometimes. Extremely simple methods were used by malware to hide their files.

Changing the property of the file to a hidden or system file was the easiest method. The victim is not able to see the file unless they make changes to the settings in the system to view hidden and system files.
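The hidden and system properties are stored as bit flags in a file's attributes. As a minimal sketch, the following Python code decodes those two flags using constants from the standard stat module; the function names are hypothetical, and the st_file_attributes field is only present on Windows, so the sketch falls back to 0 elsewhere.

```python
import os
import stat


def attribute_flags(attrs: int) -> dict:
    """Decode the hidden and system bits from a Windows file attribute value."""
    return {
        "hidden": bool(attrs & stat.FILE_ATTRIBUTE_HIDDEN),
        "system": bool(attrs & stat.FILE_ATTRIBUTE_SYSTEM),
    }


def file_flags(path: str) -> dict:
    """Read a file's attributes; st_file_attributes exists only on Windows."""
    attrs = getattr(os.stat(path), "st_file_attributes", 0)
    return attribute_flags(attrs)


print(attribute_flags(stat.FILE_ATTRIBUTE_HIDDEN))
print(attribute_flags(stat.FILE_ATTRIBUTE_HIDDEN | stat.FILE_ATTRIBUTE_SYSTEM))
print(file_flags(os.curdir))
```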

Sometimes, malware uses filenames to trick the victim. For example, the malware sits on the system with the full name payslip.pdf.exe: the extension of the file is .exe, but the name looks like payslip.pdf. If the victim has enabled the setting to view file extensions, they can see that the actual extension is .exe instead of .pdf and suspect an executable; otherwise, they end up infecting the machine by clicking on the malware file.
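A simple check for this trick can be sketched in Python. The two extension lists below are illustrative assumptions chosen for this example and are far from exhaustive, and the function name is hypothetical.

```python
from pathlib import PurePath

# Illustrative, non-exhaustive lists chosen for this sketch.
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".bat", ".cmd", ".com"}
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".txt", ".jpg"}


def has_deceptive_double_extension(filename: str) -> bool:
    """True if a document-like extension is followed by an executable one."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE_EXTENSIONS
        and suffixes[-2] in DOCUMENT_EXTENSIONS
    )


for name in ["payslip.pdf.exe", "payslip.pdf", "setup.exe", "report.final.pdf"]:
    print(name, has_deceptive_double_extension(name))
```

Mail gateways and endpoint tools apply the same kind of rule, usually with much longer extension lists.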

But these tricks could easily be identified by researchers and antivirus engines, so malware eventually moved on to code injection and rootkits.

3.4.2 Injecting code into a legitimate process

Malware can inject its own code into an already running legitimate process, then make the legitimate code execute malicious code. This can be implemented using traditional thread injection, DLL injection, and process hollowing.

A traditional thread injection is implemented through the following steps:

  1. Open the target process using the OpenProcess() API. The target process is mostly a clean process already executing on the system, such as svchost.exe or explorer.exe.
  2. Allocate space in the target process using the VirtualAllocEx() API.
  3. Write malicious code to the remote process using the WriteProcessMemory() API.
  4. Execute the injected code as a thread in the target process using the CreateRemoteThread() API.

Today, most malware uses a technique called process hollowing or RunPE. Though the method has existed for more than a decade, its usage seems to have picked up in the past few years. The reason could be that it's hard for malware analysts to debug process hollowing. Process hollowing launches a process in suspended mode and then writes its own binary into the newly created process. Then it resumes the target process. This technique is used by most ransomware packers today.

3.4.3 Rootkits

It's important for malware to be stealthy so nobody observes its activity on the system. A rootkit is a technology to hide malware by modifying system functions or data structures exposed by the operating system. Rootkits can hide the following:

  • The malware process
  • The malware file
  • Registry entries created by the malware
  • Network connections

Malware can use techniques such as API hooking to manipulate the APIs in order to hide its files and processes. The API hooking made simple section explains how API hooks can be used to hide files and processes. These are user mode rootkits.

There are kernel mode rootkits too. Rootkits can further modify data structures used by the operating system. For example, Windows maintains the list of processes executing in the system using a doubly linked list. This list lives in the Windows kernel space. Each node in the linked list is a structure called EPROCESS, which contains information about a process. To hide a process, malware can unlink the EPROCESS corresponding to the malware from the list. This makes the malware process invisible. This method is called Direct Kernel Object Manipulation (DKOM). Since this is done in kernel mode, no process can see the hidden process, as kernel space is common to all the processes in Windows, as explained in the Windows virtual memory made simple section of this chapter.
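Why unlinking a node hides it from anything that walks the list can be shown with a toy doubly linked list in Python. This is purely a conceptual model: ProcessNode is a hypothetical stand-in for an EPROCESS entry, the process names are made up, and nothing here touches a real kernel structure.

```python
class ProcessNode:
    """Toy stand-in for an EPROCESS entry in a circular doubly linked list."""

    def __init__(self, name: str):
        self.name = name
        self.prev = self
        self.next = self


def insert_tail(head: ProcessNode, node: ProcessNode) -> None:
    """Append a node just before the list head."""
    node.prev, node.next = head.prev, head
    head.prev.next = node
    head.prev = node


def unlink(node: ProcessNode) -> None:
    """Make the neighbours point past the node; the node itself still exists."""
    node.prev.next = node.next
    node.next.prev = node.prev


def walk(head: ProcessNode) -> list:
    """List the names seen by a tool that enumerates processes via the list."""
    names, cur = [], head.next
    while cur is not head:
        names.append(cur.name)
        cur = cur.next
    return names


head = ProcessNode("<list head>")
nodes = {n: ProcessNode(n) for n in ["System", "explorer.exe", "suspicious.exe"]}
for node in nodes.values():
    insert_tail(head, node)

print(walk(head))   # ['System', 'explorer.exe', 'suspicious.exe']
unlink(nodes["suspicious.exe"])
print(walk(head))   # ['System', 'explorer.exe']
```

The unlinked node is still in memory, which is why detection tools that scan memory for process structures, instead of following the list, can still find it.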

Other methods can be employed by kernel mode rootkits:

  • SSDT hooking
  • IDT hooking
  • IRP hooking

Explaining these is beyond the scope of the book. These techniques can be found on the internet and are described in malware analysis books. Kernel mode rootkits were very popular from 2009 onward. Some of the famous pieces of malware that came with rootkits were TDSS and ZeroAccess.

It's worth knowing about rootkits when you work with malware, although we hardly see any ransomware armored with a rootkit today. The reason could be that ransomware doesn't want to hide. It openly threatens the victim, unlike data-stealing malware, which needs to hide. But rootkits can be used by other malware that may be linked to ransomware. A downloader (malware that downloads other malware) that delivers ransomware to the victim machine can hide in the system using a rootkit so that its presence is not detected.

3.4.4 Fileless malware

This is an ongoing trend. This kind of malware is usually coded in PowerShell, a scripting language used on Windows to automate tasks. A PowerShell script is executed that directly downloads code and injects it into a legitimate process's memory. The downloaded malware is never written to the disk as a file, hence we call these fileless attacks. Most PowerShell malware can be categorized as downloaders.

Here is a list of fileless malware:

  • Poweliks
  • Kovter
  • PowerSniff
  • POSHSPY

SoreBrect is a piece of ransomware that uses the fileless technique.

3.5 Armoring

Security software and analysts always pose a threat to malware. Malware uses several techniques to protect itself. We can consider packers and rootkits as two of those techniques. Here are a few types of software that can pose a threat to malware:

  • Windows troubleshooting tools: Task Manager and Registry Editor are tools that can be used to troubleshoot Windows. Task Manager can show the list of running processes on the system, hence a malware process may be identified. Registry Editor can be used to remove run entries (explained in malware persistence in section 3.3) used by malware. These tools are a threat to malware itself. So malware needs to disarm them.
  • Malware analysis tools: Researchers use a number of tools to analyze malware. Here are a few of the tools:
    • Debuggers: In simple terms, a debugger is a tool that can be used to test and find bugs in software. OllyDbg and IDA Pro are famous debuggers that have been in use for more than a decade. Malware researchers can debug malware with debuggers. In this case, the aim is not to find a bug in the malware, but to find out how the malware works.
    • System monitoring tools: There are other tools that analysts use to monitor files, the registry, processes, and the network. Filemon, Regmon, and ProcMon are the famous ones. Wireshark is one of the most used network sniffing tools.

Malware tries to detect these tools in a number of ways. One well-known trick to detect a debugger is to call the IsDebuggerPresent() API provided by Microsoft, which tells the malware whether it is running under a debugger. Malware also looks for files and processes related to these tools, such as ollydbg.exe, tcpdump.exe, and wireshark.exe.

Malware researchers mostly use virtual machines to execute malware in a restricted environment. VMware, VirtualBox, and QEMU are the most famous ones. A guest operating system is installed in the virtual machine, along with all the tools needed for analysis. The virtual machine can take a snapshot of the guest operating system. A snapshot of a clean instance of the guest is kept and, after analysis, the guest is reverted to that clean snapshot. Malware therefore also tries to figure out whether it is being executed inside a virtual machine.

Here are a few checks that malware can perform when it is executed in a Windows guest OS on VMware:

  • VMware processes in the guest: The guest operating system has a few VMware processes running in it: Vmwaretray.exe, Vmtoolsd.exe, Vmwareuser.exe, and Vmacthlp.exe. Malware can detect these processes with the help of the Windows APIs used for enumerating processes.
  • VMware-related files: Malware can check for the presence of the files vmtray.dll, vmmouse.sys, and vmGuestLib.dll in Windows driver folders.
  • Registry keys: Malware can check for the registry key HKLM\HARDWARE\DEVICEMAP\Scsi\Scsi Port 0\Scsi Bus 0\Target Id 0\Logical Unit Id 0\Identifier for VMware.
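Analysts can turn the same checks around to see how exposed their own analysis machine is. The following minimal Python sketch takes a list of process names (however it was obtained) and reports which of them match the indicators named in this section. The function name visible_indicators is hypothetical, and Vmwaretray.exe is assumed to be the intended spelling of the VMware tray process.

```python
# Indicators taken from the text above; matching is case-insensitive.
# "vmwaretray.exe" is an assumed spelling of the VMware tray process.
VMWARE_PROCESSES = {"vmwaretray.exe", "vmtoolsd.exe",
                    "vmwareuser.exe", "vmacthlp.exe"}
ANALYSIS_TOOLS = {"ollydbg.exe", "tcpdump.exe", "wireshark.exe"}


def visible_indicators(process_names):
    """Return the process names that would give away a VM or analysis tool."""
    running = {name.lower() for name in process_names}
    return sorted(running & (VMWARE_PROCESSES | ANALYSIS_TOOLS))


# Hypothetical process listing from an analysis guest.
sample = ["explorer.exe", "Vmtoolsd.exe", "Wireshark.exe"]
exposed = visible_indicators(sample)
```

An empty result does not mean the machine is undetectable, since files and registry keys such as the ones listed above can be checked as well.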

If the malware is able to detect tools or virtual machines, it can simply exit and analysts can't figure out what happened unless somebody deep dives into it.

With the increase in malware, malware analysis is now often automated. The automated analysis environment is called a sandbox. A sandbox consists of a virtual machine in which a guest operating system is installed with malware analysis tools. A sandbox keeps a clean snapshot of the virtual machine with tools installed that log the various activities of malware, such as file modifications, network connections, and registry changes.

Automation is used to place malware inside the virtual machine and then execute it. After executing the malware, the automation code extracts the logs from the guest operating system and restores the virtual machine to the clean snapshot. The logs can consist of API traces, file modifications, registry modifications, and network activities done by the malware. Cuckoo is one well-known open source sandbox. Other sandboxes include Joe Sandbox.

Malware uses similar techniques to detect a sandbox, as a sandbox comprises a virtual machine and security tools. But there are some techniques specific to sandbox detection too. One of the most popular is using the Sleep() API to wait for a long time before actually executing the malicious part. Most sandboxes are allotted a fixed time frame in which to execute a sample. After the time lapses, the virtual machine is restored to a clean instance. So if the malware sleeps for longer than that, the sandbox cannot observe its actual functionality.
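The effect of the time limit can be shown with a toy model. In the following Python sketch, a sample's behavior is represented as a timeline of (seconds since start, action) pairs, and the sandbox logs only what happens inside its analysis window. The function name, the timeline, and the 120-second window are all assumptions chosen for illustration.

```python
def observed_events(timeline, analysis_window):
    """Return the actions a sandbox logs before its time budget runs out."""
    return [action for seconds, action in timeline
            if seconds <= analysis_window]


# Hypothetical sample that idles for ten minutes before doing anything else.
timeline = [
    (0, "process start"),
    (1, "call Sleep for 600 seconds"),
    (601, "encrypt files"),
    (605, "connect to C&C server"),
]

short_run = observed_events(timeline, analysis_window=120)
long_run = observed_events(timeline, analysis_window=900)
```

With the short window, the log contains only the start and the sleep call, which is why a long sleep at the beginning of a sample is itself worth noting in a report.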

3.6 Command and control server

C&C is the command center for the malware. It can also send instructions or other data to the malware which is on the victim machine. Here are a few functionalities of the C&C server:

  • Malware can update itself with its newer versions from the C&C server.
  • Many forms of malware receive configuration files from the C&C server that tell the malware what to do on the victim machine.
  • Malware can send stolen data to the C&C server. Stolen data can include usernames, passwords, and so on.
  • A number of types of ransomware receive the key to encrypt from the C&C server.

Earlier, C&C servers had fixed IPs and domain names, which were part of the malware code. This made it easy for security vendors to block these IPs and domains using firewalls and intrusion detection products.

To avoid detection, malware started generating the domain names it uses to connect to the C&C server from the victim machine. This is done with a Domain Generation Algorithm (DGA). The algorithm generates thousands of domain names that depend on factors such as the date and time. The hacker runs the same algorithm and registers one of the generated domain names, only for a limited amount of time.

Here is one example to explain how a DGA works. The algorithm creates a domain name out of various parts of the date. Say the date is March 10, 2017. Malware can create the following domain names on that day using permutation and combination:

  • www.10032017March.com
  • www.March03102017.com
  • www.2017March1003.com
  • www.03101630March.com

Many more are possible. For a particular day, the hacker registers only www.March03102017.com for the C&C server, so the malware is able to connect to it. But for law enforcement or security professionals, it is hard to identify which domain is active on that day.
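Researchers who have recovered a DGA can run it themselves to predict the domains a sample will try on a given day. The following Python sketch mimics only the toy scheme described above: it joins the day, the month number, the year, and the month name in every possible order. The function name candidate_domains and the exact choice of date parts are assumptions for illustration, and real DGAs are considerably more involved.

```python
from datetime import date
from itertools import permutations

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]


def candidate_domains(day):
    """List every domain the toy date-based scheme can produce for one day."""
    parts = [
        f"{day.day:02d}",            # for example "10"
        f"{day.month:02d}",          # for example "03"
        str(day.year),               # for example "2017"
        MONTH_NAMES[day.month - 1],  # for example "March"
    ]
    # A set removes duplicates on dates where two parts are equal.
    names = {"www." + "".join(order) + ".com"
             for order in permutations(parts)}
    return sorted(names)


domains = candidate_domains(date(2017, 3, 10))
```

For March 10, 2017, the result includes www.10032017March.com, www.March03102017.com, and www.2017March1003.com from the list above. A defender can feed such a list into a blocklist or watch DNS logs for lookups of those names.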

Conficker was one of the first pieces of malware to make DGAs popular, and it was known to generate 50,000 domain names a day.

 

4. Types of malware


Malware can be categorized into different types based on the damage it causes to the system. Malware does not necessarily use a single method to cause damage; it can employ multiple ways. We will look into some known malware types. The following are some categories:

  • Backdoor
  • Downloader
  • Virus or file infector
  • Worm
  • Botnet
  • Remote Access Tool (RAT)
  • Hacktool
  • Keylogger and password stealer
  • Banking malware
  • POS malware
  • Ransomware
  • Exploit and exploit kits

To be clear, a single piece of malware can act as a backdoor as well as a password stealer, or can be a combination of any of these types. Some of the definitions are simple enough to understand in one line, while others need some detailed explanation.

4.1 Backdoor

A backdoor can be a simple piece of functionality in malware. It opens a port on the victim machine so that the hacker can log in without the victim's knowledge and carry out their work. A piece of backdoor malware can create a new process of its own or inject malicious code that opens a port into legitimate code executing in the system. Backdoor activity is usually part of other malware. Most RATs (explained later) have a backdoor module that opens a port on the victim machine for the hacker to get in.

4.2 Downloader

A downloader is a piece of malicious software that downloads other malware. It carries a URL for the malware that needs to be downloaded and, when executed, fetches that malware. Bedep was mostly known for downloading CryptoLockers. Upatre was another popular downloader.

4.3 Virus or file infector

File infection malware piggybacks its code in clean software. It alters an executable file on a disk in such a way that malware code is executed before or after the clean code in the file is executed. A file infector is often termed a virus in the security industry. A lot of antivirus products tag it as a virus.

In the context of PE executables of Windows, a file infector can work in the following manner:

  1. Malware adds malicious code at the end of a clean executable file.
  2. It changes the entry point of the file to point to the malicious code located at the end. When the exe is double-clicked, the malware code is executed first.
  3. The malicious code keeps the address of the clean code which was earlier the entry point. After completing the malicious activity, the malware code transfers control to the clean code:

Clean and infected PE files
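One side effect of the steps above is that the entry point of an infected file tends to land in the last section, where the appended code lives. The following Python sketch models that check on a simplified section table. The Section class, the addresses, and the function names are hypothetical, no real PE parsing is done, and the heuristic is only a rough one because some packers and unusual compilers produce the same layout in clean files.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """Simplified stand-in for one entry of a PE section table."""
    name: str
    start: int
    size: int


def section_of(sections, address):
    """Return the section that contains the address, or None."""
    for section in sections:
        if section.start <= address < section.start + section.size:
            return section
    return None


def entry_point_in_last_section(sections, entry_point):
    """Rough heuristic: is the entry point inside the last section?"""
    last = max(sections, key=lambda s: s.start)
    return section_of(sections, entry_point) is last


# Hypothetical layout of a clean file: execution starts in .text.
clean = [Section(".text", 0x1000, 0x2000),
         Section(".data", 0x3000, 0x1000),
         Section(".rsrc", 0x4000, 0x1000)]

# The same file after code was appended to the last section and the
# entry point was redirected to it.
infected = [Section(".text", 0x1000, 0x2000),
            Section(".data", 0x3000, 0x1000),
            Section(".rsrc", 0x4000, 0x1800)]

clean_flag = entry_point_in_last_section(clean, 0x1200)
infected_flag = entry_point_in_last_section(infected, 0x5000)
```

A flagged file is a candidate for closer inspection rather than proof of infection.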

A virus can infect a file in several ways. It can place its code at different places in the clean file. File infection is a way to spread in the system.

Many of these file infectors infect every system file on Windows. So the malware code executes irrespective of whether you start Internet Explorer or a calculator program.

Some very famous PE file infectors are Virut, Sality, XPAJ, and Xpiro.

4.4 Worm

A worm spreads in a system by various mechanisms. File infection can also be considered a worm-like behavior.

A worm can spread in several ways:

  • A worm can spread to other computers on the network by brute forcing default usernames and passwords of network shares or other machines.
  • A worm can spread by exploiting vulnerabilities in network protocols.
  • A worm can spread using pen drives. When an autorun worm is executed, it looks for a pen drive attached to the system. The worm creates a copy of itself on the pen drive and also adds an autorun.inf file to it. When the infected pen drive is inserted into a new machine, autorun.inf is executed by Windows, which in turn executes the copied .exe. The copied .exe can now copy itself to different locations on the new machine.

4.5 Botnet

A botnet is a network of infected machines built on the client-server model. A victim machine that is infected with the malware is called a bot. The hacker, also called a bot herder, controls the bots through a C&C server, which can issue commands to them. If a large number of computers are infected, they can be used to direct a lot of traffic toward any server. If the server is not secure enough and cannot handle the huge traffic, it can shut down. This is usually called a distributed denial of service (DDoS) attack. A bot can use internet protocols or custom protocols to communicate with its C&C server.

ZeroAccess and GameOver Zeus are famous botnets of the recent past.

4.6 Keylogger and password stealer

Keyloggers have been well known for a long time. They can monitor keystrokes and log them to a file. The log file can be transferred to the hacker later on.

A password stealer is a similar thing. It can steal usernames and passwords from the following locations:

  • Browsers, which store passwords for social networking sites, movie sites, song sites, email, and gaming sites
  • FTP clients such as FileZilla and SmartFTP, which companies and individuals use to save data on FTP servers
  • Email clients such as Thunderbird and Outlook, which are used to access emails easily
  • Database clients, used mostly by engineers and students
  • Banking applications
  • Password managers such as LastPass and KeePass, in which users store passwords so that they don't have to remember them

Hackers can use these credentials to steal more data or access the private information of somebody or to try to access military installations. They can target executives using this kind of malware to steal their confidential information.

Zeus and Citadel are famous password stealers.

4.7 Banking malware

Banking malware is financial malware. It can include the functionality of keylogging and password stealing from the browser.

Banks have come up with virtual keyboards, which is a major blow to keyloggers. Now, most malware uses a man-in-the-middle (MITM) attack. In this kind of attack, a piece of malware is able to intercept the conversation between the victim and the banking site.

There are two popular MITM mechanisms used by banking malware these days: form grabbing and web injects.

In form grabbing, the malware hooks the browser APIs and sends the intercepted data to its C&C server. Simultaneously, it can send the same data to the bank website too.

Web inject works in the following manner:

  • Malware can perform API hooking in the browser to intercept the web page that was requested by the victim's browser.
  • The original web page is a form in which the victim needs to input various things, such as the amount they need to transfer, credentials, and so on. The malware modifies this intercepted web page to add extra fields, such as CVV number, PIN, and OTP, which are used for additional authentication. These additional fields are injected using an HTML form. This form varies based on the bank. The malware keeps a configuration file that tells it which form needs to be injected into the page of which banking site.
  • After modifying the web page, the malware sends data to the victim's browser. So the victim sees the page with extra fields as modified by the malware. 
  • Hence, the malware is able to steal the additional parameters needed for authentication.

Tinba, Shifu, Carberp, and Zeus are some famous pieces of banking malware.

4.8 POS malware

The way money changes hands is changing. Cash transactions in shops are giving way to card payments, and POS devices are installed in a lot of shops these days. Microsoft offers a Windows POS operating system for these kinds of devices. The POS software in these devices is able to read the credit card information when one swipes a card in the POS device.

If malware infects a POS device, it scans the memory of the POS software for credit card patterns. Credit card numbers are typically 16 digits long. Malware scans for 16-digit patterns in memory to identify and then steal credit card numbers.
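The same pattern matching is used defensively, for example to check logs or memory dumps for card data that should not be there. The following Python sketch looks for standalone 16-digit runs in a text and keeps only those that pass the Luhn checksum. The Luhn check is an addition not mentioned above, included because a bare 16-digit match produces many false positives, and the function names are hypothetical.

```python
import re

# Exactly 16 digits that are not part of a longer run of digits.
CANDIDATE = re.compile(r"(?<!\d)\d{16}(?!\d)")


def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:   # every second digit, counted from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text):
    """Return the 16-digit runs in the text that look like card numbers."""
    return [match for match in CANDIDATE.findall(text) if luhn_valid(match)]


# 4111111111111111 is a widely used test number, not a real card.
sample = "id=1234567890123456;track=4111111111111111;len=16"
found = find_card_numbers(sample)
```

Here the first 16-digit run is rejected by the checksum, and only the test number is reported.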

BlackPOS, Dexter, JackPOS, and BackOff are famous pieces of POS malware.

4.9 Hacktool

Hacktools are often used to retrieve passwords from browsers, operating systems, or other applications. They can work by brute forcing or identifying patterns. Cain and Abel, John the Ripper, and RainbowCrack are older hacktools. Mimikatz is one of the more recent hacktools and has been associated with some top ransomware, such as WannaCry and NotPetya, which use it to decode and steal the credentials of the victim.

4.10 RAT

A RAT acts as a remote control, as the name suggests. It can be used for both good and bad purposes. RATs can be used by system administrators to solve the issues of their clients by accessing the client's machine remotely. But since RATs usually give full access to the person sitting remotely, they can be misused by hackers. RATs have been used in sophisticated hacks many times.

RATs have been misused for multiple purposes, such as the following:

  • Monitoring keystrokes using keyloggers
  • Stealing credentials and data from the victim machine
  • Wiping out all data from a remote machine
  • Creating a backdoor so that a hacker can log in

Gh0st RAT, Poison Ivy, Back Orifice, ProRat, and njRAT are well-known RATs.

4.11 Exploit

Software is written by humans and, obviously, there will be bugs. Hackers take advantage of some of these bugs to compromise a system in an unauthorized manner. We call such bugs vulnerabilities. There are a number of vulnerabilities due to various reasons, mostly due to imperfect programming. If programmers have not considered certain scenarios while programming the software, this can lead to a vulnerability in the software.

Here is a simple C program that uses the function strcpy() to copy a string from source to destination:

C program with the strcpy() function

The programmer has failed to notice that the size of the destination is 10 bytes and the source is 23 bytes. When the strcpy() function copies the source into the destination, the copied string goes beyond the memory allocated to the destination. The memory beyond the destination can hold important data related to the program, which would be overwritten. This kind of vulnerability is called a buffer overflow. Stack overflows and heap overflows are the commonly known kinds of buffer overflow vulnerability. There are other vulnerabilities, such as use-after-free, in which an object is used after it is freed (we don't want to go into this in depth as it requires an understanding of C++ programming concepts and assembly language).

A program that takes advantage of these vulnerabilities for a malicious purpose is called an exploit.

To explain an exploit, we will talk about a stack overflow case. Readers are recommended to read about C programs to understand this. Exploit writing is a complex process and requires knowledge of assembly language, debuggers, and computer architecture. We will try to explain the concept as simply as possible.

The following is a screenshot of a C program. Note that this is not a complete program and is only meant to illustrate the concept:

C program having stack overflow

The main() function takes input from the user (argv[1]) then passes it on to the vulnerable function vulnerable_function. The main function calls the vulnerable function. So after executing the vulnerable function, the CPU should come back to the main function (that is, line no 15). This is how the CPU should execute the program: line 14 | line 4 | line 5 | line 6 | line 15.

Now, when the CPU is at line 6, how does it know that it has to return to line 15 after that? Well, the secret lies in the stack. Before getting into line 4 from line 14, the CPU saves the address of line 15 on the stack. We can call the address of line 15 the return address. The stack is also meant for storing local variables. In this case, the buffer is a local variable in vulnerable_function. Here is what the stack should look like for the preceding program:

Stack showing buffer address and return address

This is the state of the stack when the CPU is executing the vulnerable_function code. We also see that the return address (the address of line 15) is placed on the stack. Now, the size of the buffer is only 16 bytes (see the program). When the user provides an input (argv[1]) that is larger than 16 bytes, the extra length of the input overwrites the return address when strcpy() is executed. This is a classic example of stack overflow. An exploit for such a program overwrites the return address on purpose. As a result, after executing line 6, the CPU goes to whatever address has overwritten the return address. So the user can create a specially crafted input (argv[1]) with a length greater than 16 bytes. The input contains three parts: the address of the buffer, NOPs, and shellcode. The address of the buffer is the virtual memory address of the variable buffer. NOP stands for no operation instruction. As the name implies, it does nothing when executed.

Shellcode is nothing but an extremely small piece of code that can fit in a very small space. Shellcode is capable of doing the following:

  • Opening a backdoor port in the vulnerable software
  • Downloading another piece of malware
  • Spawning a command prompt to the remote hacker, who can access the system of the victim
  • Elevating the privileges of the victim so the hacker has access to more areas and functions in the system:

Input argv[1] to exploit

The following image shows the same stack after the specially crafted input is provided to the program. Here, you can see that the return address is overwritten with the address of the buffer so, instead of going to line 15, the CPU goes to the address of the buffer. After the NOPs, the shellcode is executed:

Shellcode placed in the buffer
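The overwrite of the return address can be modeled without running any native code. The following Python sketch represents the stack frame as a byte array holding the 16-byte buffer followed directly by a 4-byte saved return address. That layout, both addresses, and the function names are assumptions made for illustration, and real frames usually contain more than this (for example, a saved frame pointer or a stack canary). The copy routine deliberately ignores the buffer size, as strcpy() does, and the filler bytes merely stand in for the NOPs and shellcode.

```python
import struct

BUFFER_SIZE = 16                      # size of the local buffer in the text
ORIGINAL_RETURN = 0x00401500          # hypothetical address of "line 15"
BUFFER_ADDRESS = 0x0012FF00           # hypothetical address of the buffer


def make_frame():
    """Model a frame: 16-byte buffer, then the saved return address."""
    return bytearray(BUFFER_SIZE) + bytearray(struct.pack("<I", ORIGINAL_RETURN))


def unchecked_copy(frame, data):
    """Copy input into the buffer without checking its size, like strcpy()."""
    for offset, value in enumerate(data[:len(frame)]):
        frame[offset] = value


def saved_return_address(frame):
    """Read back the 4-byte little-endian return address from the frame."""
    return struct.unpack_from("<I", frame, BUFFER_SIZE)[0]


# A short input fits in the buffer and leaves the return address alone.
safe_frame = make_frame()
unchecked_copy(safe_frame, b"A" * 8)

# A crafted input: 16 filler bytes (0x90 is the x86 NOP opcode, used here
# only as a placeholder) followed by the address of the buffer.
crafted = b"\x90" * BUFFER_SIZE + struct.pack("<I", BUFFER_ADDRESS)
smashed_frame = make_frame()
unchecked_copy(smashed_frame, crafted)
```

After the second copy, saved_return_address(smashed_frame) yields the buffer address instead of the original return address, which is the redirection the figures illustrate.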

The final conclusion is, by providing an input to the vulnerable program, the exploit is able to execute shellcode which can open up a backdoor or download malware. 

The inputs can be as follows:

  • An HTTP request is an input for a web server
  • An HTML page is an input for a web browser
  • A PDF is an input to Adobe Reader

And so on - the list is infinite.

Note

You can explore these using the keywords provided as it cannot be explained in a few lines and goes beyond the scope of this book.

We often see vulnerabilities mentioned in blogs. Usually, a CVE number is mentioned for a vulnerability. One can find the list of vulnerabilities at http://www.cvedetails.com/. The WannaCry ransomware used CVE-2017-0144. Here, 2017 is the year in which the CVE ID was assigned, and 0144 is a sequence number that identifies the vulnerability among the IDs assigned for that year. Microsoft also issues advisories for vulnerabilities in Microsoft software. https://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-0144 gives the details of the vulnerability. The vulnerability description tells us that the bug lies in the SMBv1 server software installed in some Microsoft operating system versions. The page may also reference some of the exploits.
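When processing advisories or blog posts in bulk, it helps to split CVE identifiers into their parts. The following Python sketch does this for the CVE-YYYY-NNNN form, where the last part has four or more digits. The function name parse_cve_id is hypothetical.

```python
import re

# "CVE-", a four-digit year, and a sequence number of four or more digits.
CVE_PATTERN = re.compile(r"^CVE-(\d{4})-(\d{4,})$")


def parse_cve_id(cve_id):
    """Split a CVE identifier into (year, sequence number)."""
    match = CVE_PATTERN.match(cve_id.strip().upper())
    if match is None:
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return int(match.group(1)), int(match.group(2))


year, sequence = parse_cve_id("CVE-2017-0144")
```

For the WannaCry identifier this gives the year 2017 and the sequence number 144.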

 

5. How does antivirus name malware?


With increasing forms of malware, it was important to classify them. In 1991, the Computer Antivirus Research Organization (CARO) came up with a naming convention for malware.

The website http://www.caro.org/articles/naming.html gives directions on how security researchers should name a piece of malware. Other than this, malware is sometimes named after strings found in the malware file. The name of a piece of malware can vary from antivirus to antivirus, based on how each has detected it. The naming convention may also vary between antivirus vendors.

Here is how Microsoft names malware: https://www.microsoft.com/en-us/wdsi/help/malware-naming.

VirusTotal is a website that hosts many antivirus engines. When one uploads a file to VirusTotal, the antivirus engines scan the file and display the results. The following screenshot shows detections from various antivirus products on www.virustotal.com for a particular piece of malware:

Screenshot from virustotal.com

As shown in the screenshot, various antiviruses name the malware in different ways. Microsoft detects it as TrojanDownloader:JS/Nemucod, while others name it in different formats.
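A name in the Microsoft style can be split into its parts mechanically. The following Python sketch assumes the layout Type:Platform/Family.Variant!Suffix, with the variant and the suffix being optional. The function name and the pattern are written for illustration, and names from other vendors will generally not match it.

```python
import re

# Assumed layout: Type:Platform/Family, then optional .Variant and !Suffix.
DETECTION_NAME = re.compile(
    r"^(?P<type>[^:]+):(?P<platform>[^/]+)/(?P<family>[^.!]+)"
    r"(?:\.(?P<variant>[^!]+))?(?:!(?P<suffix>.+))?$"
)


def parse_detection_name(name):
    """Return the parts of a detection name as a dict, or None if no match."""
    match = DETECTION_NAME.match(name)
    return match.groupdict() if match else None


parts = parse_detection_name("TrojanDownloader:JS/Nemucod")
```

For the detection above, the type is TrojanDownloader, the platform is JS (JavaScript), and the family is Nemucod, while the variant and suffix are absent.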

Note

Do not upload files from your organization or from your customers unless you are sure that the file does not contain any sensitive data. Instead, you can search for the hashes (MD5, SHA-1, and SHA-256) of a file in VirusTotal.
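The hashes can be computed locally, so the file itself never leaves the machine. The following Python sketch uses the standard hashlib module. The function names are hypothetical, and the path-based helper reads the whole file into memory, which is fine for small samples only.

```python
import hashlib


def data_hashes(data):
    """Return the MD5, SHA-1, and SHA-256 hex digests of a byte string."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def file_hashes(path):
    """Hash the contents of a file (read fully into memory)."""
    with open(path, "rb") as handle:
        return data_hashes(handle.read())


empty = data_hashes(b"")
```

Any of the resulting hex strings can be pasted into the VirusTotal search box to see whether the file is already known.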

 

6. Summary


In this chapter, we have covered the history of malware, the types of malware, and the techniques malware uses to hide on the system.

The next chapter is about malware analysis. It focuses on the analysis of Windows executables and gives a quick overview of malware analysis that can help system admins reach quick conclusions about malware.

About the Authors

  • Abhijit Mohanta

    Abhijit Mohanta has a decade of experience in cybersecurity. He works as a security Researcher at Juniper Networks. He has worked with Cyphort (now part of Juniper), McAfee, and Symantec as a security researcher. His expertise includes reverse-engineering, automation, malware analysis, Microsoft Windows programming, and machine learning. He has worked on antivirus, sandboxes, and intrusion prevention systems. He has also authored a number of blogs about malware and has a couple of patents pending related to malware detection.

  • Mounir Hahad

    Mounir Hahad, head of threat research at Juniper Networks, is a cybersecurity expert focused on malware research, detection techniques, and threat intelligence. Prior to joining Juniper, he was the head of threat research at Cyphort, a company focused on advanced threat detection and security analytics. He has also held various leadership positions at Cisco and IronPort working on VPN, UTM, email, and web security. He holds a PhD in computer science from the University of Rennes in France.

  • Kumaraguru Velmurugan

    Kumaraguru Velmurugan has 10+ years' experience in malware analysis and remedial measures. He has been associated with different antivirus and sandbox products in his career. He is a passionate reverse-engineer, interested in assembly programming and automation in the cyber security domain. He has authored (and assisted technically in) blogs on interesting key features employed by malware and owns a patent on malware remedial measures.
