Home Security Offensive Shellcode from Scratch

Offensive Shellcode from Scratch

By Rishalin Pillay
books-svg-icon Book
eBook $35.99 $24.99
Print $43.99
Subscription $15.99 $10 p/m for three months
$10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
BUY NOW $10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
eBook $35.99 $24.99
Print $43.99
Subscription $15.99 $10 p/m for three months
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
  1. Free Chapter
    Chapter 1: The Ins and Outs of Shellcode
About this book
Shellcoding is a technique that is executed by many red teams and used in penetration testing and real-world attacks. Books on shellcode can be complex, and writing shellcode is perceived as a kind of "dark art." Offensive Shellcode from Scratch will help you to build a strong foundation of shellcode knowledge and enable you to use it with Linux and Windows. This book helps you to explore simple to more complex examples of shellcode that are used by real advanced persistent threat (APT) groups. You'll get to grips with the components of shellcode and understand which tools are used when building shellcode, along with the automated tools that exist to create shellcode payloads. As you advance through the chapters, you'll become well versed in assembly language and its various components, such as registers, flags, and data types. This shellcode book also teaches you about the compilers and decoders that are used when creating shellcode. Finally, the book takes you through various attacks that entail the use of shellcode in both Windows and Linux environments. By the end of this shellcode book, you'll have gained the knowledge needed to understand the workings of shellcode and build your own exploits by using the concepts explored.
Publication date:
April 2022
Publisher
Packt
Pages
208
ISBN
9781803247427

 

Chapter 1: The Ins and Outs of Shellcode

Welcome to the first chapter of the book, and more importantly, the start of your journey of learning about shellcode and how it can be applied in offensive security.

When you think about offensive security, the first thoughts that may come to mind are penetration testing, hacking, exploits, and so on. One thing that all of those have in common is the use of shellcode. Shellcode is extremely helpful – it can be used in various ways to either perform an exploit, obtain a reverse shell, or control the targeted computer, among other things.

When learning about something new, the best way is to start from the bottom up. This means that you need to get a good solid foundation of the topic and then add to that knowledge as you progress. It can be likened to building a house, where you start with the foundation and then work your way up to the roof. So, in this chapter, we will focus on gaining a good understanding of shellcode.

We will cover the following topics:

  • What is shellcode?
  • Breaking down shellcode
  • Exploring the common types of shellcode
 

What is shellcode?

The term shellcode was originally derived based on its purpose to spawn or create a reverse shell via the execution of code. It has nothing to do with shell scripting, which essentially entails writing scripts of bash commands to complete a task.

Shellcode interacts with the registers and functions of a program by directly manipulating the program in order to perform an outcome. Due to this interaction, it is written in an assembler and then translated into hexadecimal opcodes. We will cover assemblers and opcodes later in this chapter.

When a vulnerability is discovered, shellcode can be used to exploit that vulnerability. Depending on the complexity of the vulnerability, you may make use of a few lines of code to exploit it. In some cases, the size of your shellcode can be quite substantial. The bottom line is that sometimes, obtaining a reverse shell or a specific outcome when using shellcode can be very lightweight. This results in a very efficient attack that can be used if you provide the right input to the program.

Examples of shellcode

Let's take a look at a few samples. We will begin by looking at a simple piece of code that is written in C. The purpose of this code is to return a shell. The privilege level of the returned shell will depend on the privilege level of the target program at the time this shellcode is run. In simple terms, the newly spawned shell will inherit the same permissions as the target program:

#include <stdio.h>
int main()
{
    char *args[2];
    args[0] = "/bin/sh";
    args[1] = NULL;
    execve("/bin/sh", args, NULL);
    return 0;
}

When this compiled and modified further with an editor, it's possible to turn it into input strings that can then be used against a vulnerable program to obtain a shell.

There are additional steps required to make this piece of code useable.

Shellcode is often used with buffer overflow attacks. In its simplest terms, a buffer overflow happens when a program writes data into memory that is larger than what has been have reserved. The end result is that the program may crash, overwrite data, or execute other code.

In the following piece of code, you will notice that the code is expecting an input of a certain number of characters. This is defined by the char input [12] command:

#include <stdio.h>
int main()
{
    char input[12];
    printf("Please enter your password: ");
    // If the password is longer than 12 characters, a buffer overflow will happen;
    scanf("%s", input);
    printf("Your password is %s", input);
    return(0);
}

Because there is no input validation and the program has reserved 12 bytes of memory for the input, if a string of data longer than 12 bytes is entered, then the application will crash. This specific action may not be useful if you are looking at obtaining a reverse shell, but it is useful if your intent is to cause an application to crash.

Using the logic of a buffer overflow, a carefully crafted piece of shellcode can exploit this vulnerability. The end result could be a specific attack such as a stack-based buffer overflow attack, or a heap-based buffer overflow attack. We will cover these later in the book.

Now on to a more complex example of shellcode. In January 2021, a malware sample was shared with a research team at Check Point. This malware sample resembled a loader that belongs to a well-known APT group called Lazarus. This malware made use of a phishing attack that included a document loaded with a macro that was used as a job application on LinkedIn, a popular platform for professionals.

The macro in the document made use of Visual Basic for Applications (VBA) shellcode that did not contain suspicious APIs such as VirtualAlloc, WriteProcessMemory, or CreateThread. These types of APIs are usually detected by endpoint protection products since these relate to memory allocation, writing to memory, and starting a new CPU thread.

Now, when this VBA macro was executed, it made use of a number of interesting techniques. Firstly, it created aliases of the various API calls so that its intent was less obvious. It then made use of various calls such as HeapCreate and HeapAlloc to create an executable memory location. Later, it made use of functions such as FindImage that employed a UuidFromStringA API function that had a list of hardcoded UUID values. This UuidFromStringA ultimately provides a pointer to a memory heap address allowing it to be used to decode data and write it to memory without making use of the more common functions such as memcpy or WriteProcessMemory. The following is a snippet of the shellcode; however, here it's executing the code to start up the Windows calculator application, which is referenced by its executable name calc, but you can see the complexity of the shellcode:

#include <Windows.h>
#include <Rpc.h>
#include <iostream>
#pragma comment(lib, "Rpcrt4.lib")
const char* uuids[] =
 
{ 
    "6850c031-6163-636c-5459-504092741551",
    "2f728b64-768b-8b0c-760c-ad8b308b7e18",
    ..snip..
};
int main() 
{
 
    HANDLE hc = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
    void* ha = HeapAlloc(hc, 0, 0x100000);
    DWORD_PTR hptr = (DWORD_PTR)ha;
    int elems = sizeof(uuids) / sizeof(uuids[0]);
    
    for (int i = 0; i < elems; i++) {
          RPC_STATUS status = UuidFromStringA((RPC_CSTR)uuids[i], (UUID*)hptr);
          if (status != RPC_S_OK) {
               printf("UuidFromStringA() != S_OK\n");
               CloseHandle(ha);
                return -1;
        }
         hptr += 16;
    }
    printf("[*] Hexdump: ");
    for (int i = 0; i < elems*16; i++) {
        printf("%02X ", ((unsigned char*)ha)[i]);
    }
    EnumSystemLocalesA((LOCALE_ENUMPROCA)ha, 0);
    CloseHandle(ha);
    return 0;
}

We will not go further into the analysis of this shellcode, the VBA, or the attack since it's out of scope for this book. The aim of this example is to show you the complexity of what shellcode looks like and how it can make use of multiple elements.

Shellcode versus a payload

As we start to dig into the components of shellcode, let's make a clear differentiation between shellcode and payloads. Often these are referred to as the same thing; however, they are actually different.

Payloads

A payload is a piece of custom code that an attacker wants the system to run. This custom code can be delivered by various means, such as a script or even within shellcode. An example of a payload is a reverse shell that generates a Windows Command Prompt connection. It can also be a bind shell, which is a payload that binds a shell to a listening port on the target machine to which the attacker can connect. A payload might potentially be as basic as a set of commands to run on the target operating system.

Think of the payload as the code that you want to run. It serves the purpose of doing something useful that you want it to do. Payloads can be included within shellcode so that they are executed by a program.

The following is an example of shellcode that has a payload included. The payload is highlighted for reference:

#include <windows.h>
void main() {
  void* exec;
  ...snip...
  unsigned char payload[] =
    "\x38\x45\xff\x48\xf7\xe7\x65\x48\x8b\x58\x60\x48\x8b\x5b\x18\x41\x6b\x5b\x20\x48\x8b\x1b\x48\x8b\x1b\x48\x8b\x5b\x20\x49\x45\xd8\x8b"
    ...snip...
  unsigned int payload_len = 205;
  exec = VirtualAlloc(0, payload_len, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
  RtlMoveMemory(exec, payload, payload_len);
  rv = VirtualProtect(exec, payload_len, PAGE_EXECUTE_READ, &oldprotect);
  th = CreateThread(0, 0, (LPTHREAD_START_ROUTINE)exec, 0, 0, 0);
  WaitForSingleObject(th, -1);
}

In the example, you will notice that we have a payload incorporated into the shellcode. As the shellcode runs, memory is allocated using exec = VirtualAlloc(…), then references the payload using …exec, payload…, and ultimately runs the payload.

Shellcode

Shellcode is frequently used as part of the payload when a software vulnerability is exploited to gain control of or exploit a compromised computer. Think of shellcode as a set of precisely designed commands that may be executed once injected into a running application. In relation to a vulnerability, it's a set of instructions used as a payload. In most cases, the shellcode is written in assembly language. In most situations, a command shell or a Meterpreter shell will be supplied after the target computer has completed the set of instructions. This brings us back to its original purpose, as discussed in the introduction of this chapter, which is to establish a shell.

 

Breaking down shellcode

Shellcodes can be written in various architectures. The main architectures that you are likely see in your day-to-day working life are x86-64 and ARM. There are big differences between the x86-64 and ARM CPU architectures. For instance, the x86-64 architecture makes use of Complex Instruction Set Computing (CISC) while ARM makes use of Reduced Instruction Set Computing (RISC).

The following table highlights some of the key differences between these two instruction sets:

You will be able to find more in-depth information on the differences between the CISC and RISC architectures on the internet. The aim of this book is not to dive into the complexity of CPU architectures. However, having a good idea of the CPU architecture of your target will ultimately help you to better craft your shellcode.

To write shellcode, you need to have a good understanding of assembly language. Computers cannot run code from assembly language, and the reason for this is that computers understand machine code, also known as machine language. Assembly language provides an interface layer to machine language.

Here is a simple Hello World program in assembly language code, which is specific to Linux operating systems:

section.text   
global _start     ;must be declared for linker (ld)_start:            ;tells linker entry point   movedx,len     ;message length   movecx,msg     ;message to write   movebx,1       ;file descriptor (stdout)   moveax,4       ;system call number (sys_write)   int0x80        ;call kernel   moveax,1       ;system call number (sys_exit)   int0x80        ;call kernelsection.datamsg db 'Hello, world!', 0xa  ;string to be printedlen equ $ - msg     ;length of the string

When the preceding code is compiled and executed, it will display the text defined in the kernelsection.datamsg db 'Hello World!' line.

Assembly language consists of three main components. These are executable instructions, assembler directives, and macros. Executable instructions provide instructions to the processor, assembler directives define the assembly, and macros provide a text substitution mechanism. In the next chapter, we will cover assembly language in more detail.

Machine language is a very low-level programming language. It is written in binary, in other words, 1s and 0s. Due to it being binary, it is easily understood by computers. The inverse is that it is very difficult to understand by humans. So, imagine trying to read shellcode that is in the form of machine language – it could be nearly impossible, depending on the complexity of the code. The execution of machine language is super-fast, purely since it is in binary format.

A sample of machine language is as follows:

1110 0001 1010 0010 0010 0011 0000 0011

The key takeaway is that in order to make use of machine language, assembly language is needed.

The more common type of programming language you may come across is a high-level programming language. This type of language is more human friendly and readable. Examples of this type of language are C, C++, and Python. At the beginning of this chapter, the first example of shellcode was written in C – that is what a high-level programming language looks like.

As you progress in the book, you will better understand the uses of the various components that make up shellcode. This includes the various tools that can be used to create shellcode, convert code to assembly language, and obtain machine code.

 

Exploring the common types of shellcode

When penetration testing, different categories of shellcode can be used. Ultimately, shellcode can be broken down into two main categories, local and remote. Within each category, there are various types of shellcode that exist and that perform different functions. In this section, we will explore these various types of shellcode. Keep in mind that this is not a complete list as new types of shellcode are constantly being developed. Let's explore the various types of shellcode that exist, starting with local shellcode and moving on to remote shellcode.

Local shellcode

Local shellcode is run on the target computer and does not perform any network activities. This type of shellcode can be used to escalate privileges, execute a payload, spawn a shell, or break out of a jailed shell. Let's examine some examples of local shellcode.

execve shellcode

execve is a syscall that is used within Linux systems to execute a program on the local system. It is commonly used for privilege escalation when executing a shell. In the first example of shellcode at the beginning of this book, you saw a sample of the execve system call being used within shellcode.

You can learn more about execve by looking at the man page for the system call.

By executing the man execve command on Linux, you will be presented with a full write-up about it:

NAME       
execve - execute program
SYNOPSIS
       #include <unistd.h>
       int execve(const char *filename, char *const argv[],
                  char *const envp[]);
DESCRIPTION       
execve() executes the program pointed to by filename.  filename must be either a binary executable, or a script starting with a line of the form..
..snip..

Generally, execve is used in conjunction with the following:

  • filename: A pointer to a string specifying the path to a binary
  • argv[]: An array of command-line variables
  • envp[]: An array of environment variables

Right at the beginning of this chapter, an example of execve was shown. Here is a recap of the command specifically related to execve:

execve("/bin/sh", args, NULL);

As per the man page, execve can be used to execute a program. Since this syscall is able to execute either an executable or a script, it's commonly used in shellcode.

Buffer overflow

Buffer overflow attacks result from an exploited vulnerability locally. A buffer in relation to memory is an area used by a running program. This location is a temporary location that has temporary data stored by an application. A buffer overflow happens when the length of the input data exceeds (overflows) the limit of the buffer. This overflow causes the program to write data outside of its buffer allocation, perhaps in other sections of memory. This process causes the program to crash. The program crashing is not dangerous in itself, but let's assume the program is written with a binary such as setuid.

The setuid binary ultimately allows a program to run under a special privileged permission, the permission of a user or system/root privilege. So, moving back to the program, if you are able to cause a buffer overflow, ultimately you can make it execute a payload that executes a system call to spawn a reverse shell.

Egg hunter

When it comes to writing shellcode used to exploit a program, one of the challenges that is faced is the limited space. That limited space may hamper what you are trying to execute and ultimately cause your execution to fail. Consider a basic shellcode with the primary purpose of providing a reverse shell. Depending on what you use to generate it, you may end up with a size of 32 bytes or more. Now, what if the target program does not have that amount of free space within its allocated buffer? Well, that simple shellcode will not work.

This is where egg hunting comes into play. The main purpose of egg hunting is to search the memory for a specified egg that is defined when crafting the egg hunter shellcode. This egg is a location in memory that is a unique string, also referred to as a tag. Once this egg is found, the shellcode located directly after the egg will be executed.

In Chapter 4, Developing Shellcode for Windows, we will cover egg hunting in more detail with some examples.

Shellcode reflective DLL injection

Shellcode reflective DLL injection (sRDI) is a mechanism that allows you to turn a DLL into position-free shellcode that can subsequently be injected using your preferred shellcode injection and execution method.

To understand how this technique works, let's look back at some history. DLL injection involves the use of a malicious DLL file that was read from the disk and loaded into a target process. While it worked a few years back, the problem with this technique is that anti-virus manufacturers caught on to it and started to flag these types of files, not to mention the security improvements made by operating system vendors over time. That being said, you may have the ability to use a completely new DLL that has not been seen before and still have the chance of success with a normal DLL injection – but we can assume that this opportunity is unlikely.

Around 2009, we began to see a reflective DLL injection that made use of something called a ReflectiveLoader from the malicious DLL. When injected, this DLL would then drop a thread and work its way back to locate the DLL and map it automatically. Ultimately, DLLMain would be called, and your code would be running in memory.

In 2015, we saw a reflective DLL injection that allowed a function to be called after DLLMain and allowed the passing of user arguments. This is made possible by the use of shellcode and a bootstrap placed before the call of the ReflectiveLoader. This allowed you to load a DLL, call the entry point, and pass data to another exported function.

If you would like to look at some public references for this technique, you can take a look at the sRDI published at https://github.com/monoxgas/sRDI.

Remote shellcode

Remote shellcode runs on another computer through a network or via remote connectivity. Remote shellcodes make use of TCP/IP connections in order to provide access to the target machine shell. Shellcodes of this type are categorized based on how they are set up. For example, you have a bindshell if the shellcode binds to a certain port on the target computer. If the shellcode used establishes a connection back to you, then you have a reverse shell.

Bindshell

A bindshell does exactly what its name implies. It binds the shell to a specific port or socket. In essence, the target machine works as a server waiting for a connection on a specific port. Once a connection is established, a shell is provided. This technique is not really used much, as most targets have a firewall in place to block incoming connections. That being said, there is still a chance of discovering an endpoint that has a firewall rule allowing connections to it.

An example bindshell written in C looks like this:

#include <stdio.h>
...snip..
int main ()
{
    struct sockaddr_in addr;    
  addr.sin_family = AF_INET;
    addr.sin_port = htons(4444);    
  addr.sin_addr.s_addr = INADDR_ANY;
  ...snip.. 
{
  ...snip..
  }
    execve("/bin/sh", NULL, NULL);    
return 0;
}

In the preceding example, the use of AF_NET is used to create an IPv4 socket. We then have the port defined by addr.sin_port and at the end, we have execve, which is used to spawn the shell.

Download and execute

This type of shellcode is slightly different from the rest in that it does not spawn a shell. Instead, it is used to download and execute something. This can be a malicious program, a payload, or malware, among others.

In environments today, web filtering products have a number of enhancements to block potentially malicious traffic. Even newer web browsers have these enhancements, such as SmartScreen on Microsoft Edge. These features present a number of issues when trying to get a target to perform a drive-by download or the execution of shellcode that makes use of visibly malicious patterns.

However, even with these advancements in detection, it is still possible to get shellcode to download and execute something, such as by making use of urlmon.dll and one of its APIs called URLDownloadToFileA, for example.

 

Summary

In this chapter, we looked at the basics of shellcode. You learned what exactly shellcode is and looked at some examples ranging from simple to complex shellcode. We covered the differences between shellcode and payloads and dived into the components of shellcode. As you saw, shellcode requires a good understanding of instruction sets, memory, and various languages. You learned the flow of how shellcode is interpreted by computers in the form of machine language and assembly language. Finally, we explored various types of shellcode used in the field.

In the next chapter, we will dive into assembly language. You will learn what assembly language is, the types of assembly language, the components that make up assembly language, and how they work.

 

Further reading

About the Author
  • Rishalin Pillay

    Rishalin Pillay is an Offensive Cybersecurity expert who holds a number of awards and certifications from multiple companies in the Cybersecurity industry. He is well known for his contributions to online learning courses related to Red Teaming and as the author of Learn Penetration Testing. He holds Content Publisher Gold and Platinum awards for his contributions made towards the Cybersecurity Industry, including the Events Speaker Gold award for influential public speaking at Tier-1 business events.

    Browse publications by this author
Latest Reviews (1 reviews total)
Offensive Shellcode from Scratch
Unlock this book and the full library FREE for 7 days
Start now