Network forensics is a sub-branch of digital forensics in which the data being analyzed is the network traffic going to and from the system under observation. The purposes of this type of observation include collecting information, obtaining legal evidence, establishing a root-cause analysis of an event, analyzing malware behavior, and so on. Professionals familiar with digital forensics and incident response (DFIR) know that even the most careful suspects leave traces and artifacts behind. But forensics generally also includes imaging the systems for memory and hard drives, which can be analyzed later. So, how does network forensics come into the picture? Why do we need to perform network forensics at all? Well, the answer to this question is relatively simple.
Let's consider a scenario where you are hunting for some unknown attackers in a massive corporate infrastructure containing thousands of systems. In such a case, it would be practically impossible to image and analyze every system. The following two scenarios would also be problematic:
- Instances where the disk drives may not be available
- Cases where the attack is in progress, and you may not want to tip off the attackers
Whenever an intrusion or a digital crime happens over the wire, whether it was successful or not, the artifacts left behind can help us understand and recreate not only the intent of the attack, but also the actions performed by the attackers.
If the attack was successful, what activities were conducted by the attackers on the system? What happened next? Generally, most severe attacks, such as advanced persistent threats (APTs), ransomware, espionage, and others, start from a single instance of unauthorized entry into a network and then evolve into a long-term project for the attackers until the day their goals are met. Throughout this period, however, the information flowing in and out of the network passes through many different devices, such as routers, firewalls, hubs, switches, web proxies, and others. Our goal is to identify and analyze all these different artifacts. Throughout this chapter, we will discuss the following:
- Network forensics methodology
- Sources of evidence
- A few necessary case studies demonstrating hands-on network forensics
To perform the exercises covered in this chapter, you will require the following:
- A laptop/desktop computer with an i5/i7 processor or any other equivalent AMD processor with at least 8 GB RAM and around 100 GB of free space.
- VMware Player/VirtualBox installation with Kali OS installed. You can download it from https://www.offensive-security.com/kali-linux-vm-vmware-virtualbox-image-download/.
- Installing Wireshark on Windows: https://www.wireshark.org/docs/wsug_html_chunked/ChBuildInstallWinInstall.html.
- Netcat from Kali Linux (already installed).
- Download NetworkMiner from https://www.netresec.com/?page=Networkminer.
- The PCAP files for this chapter, downloaded from https://github.com/nipunjaswal/networkforensics/tree/master/Ch1.
Every investigation requires a precise methodology. We will discuss the popular network forensics methodology used widely across the industry in the next section.
To install Wireshark on Windows, go to https://www.wireshark.org/docs/wsug_html_chunked/ChBuildInstallWinInstall.html.
To ensure accurate and meaningful results at the end of a network forensic exercise, you, as a forensic investigator, must follow a rigid path through a methodological framework. This path is shown in the following diagram:
- Obtain information: Obtaining information about the incident and the environment is one of the first things to do in a network forensics exercise. The goal of this phase is to familiarize the forensic investigator with the type of incident. The timestamps and timeline of the event and the people, systems, and endpoints involved in the incident are all crucial in building up a detailed picture of the event.
- Strategize: Planning the investigation is one of the critical phases in a network forensics scenario, since logs from various devices can differ in their nature; for example, the volatility of log entries from a firewall would be very different from that of details such as the ARP cache of a system. A good strategy improves the overall outcome of the investigation. Therefore, you should keep the following points in mind while strategizing the entire forensics investigation process:
- Define clear goals and timelines
- Find the sources of evidence
- Analyze the cost and value of the sources
- Prioritize acquisition
- Plan timely updates for the client
- Collect: In the previous phase, we saw how we need to strategize and plan the acquisition of evidence. In the collect phase, we will go ahead and acquire the evidence as per the plan; however, collecting the evidence itself requires you to document all the systems that are accessed and used, capturing and saving the data streams to the hard drive and collecting logs from servers and firewalls. Best practices for evidence collection include the following:
- Make copies of the evidence and generate cryptographic hashes for verifiability
- Never work on the original evidence; use copies of the data instead
- Use industry-standard tools
- Document all your actions
- Analyze: The analysis phase is the core phase where you start working on the data and try your hands at the riddle. In this phase, you will make use of multiple automated and manual techniques using a variety of tools to correlate data from various sources, establishing a timeline of events, eliminating false positives, and creating working theories to support evidence. We will spend most of the time in this book discussing the analysis of data.
- Report: The report that you produce must be in layman's terms; that is, it should be understandable by non-technical readers, such as legal teams, lawyers, juries, insurance teams, and so on. The report should contain executive summaries backed by the technical evidence. This phase is considered one of the essential stages, since the previous four steps need to be explained in this one.
For more on the OSCAR methodology, you can visit https://www.researchgate.net/figure/OSCAR-methodology_fig2_325465892.
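The collection best practices listed above (copy the evidence, hash it for verifiability, and work only on the copy) can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function names sha256_of and make_verified_copy are our own, and SHA-256 is simply one commonly used hash choice.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for very large captures.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_verified_copy(original, working_copy):
    """Copy an evidence file and confirm that the copy hashes identically.

    Returns the hash so that it can be recorded in the case notes.
    """
    original_hash = sha256_of(original)
    shutil.copyfile(original, working_copy)
    if sha256_of(working_copy) != original_hash:
        raise ValueError("working copy does not match the original evidence")
    return original_hash
```

Recording the returned hash alongside the time of acquisition lets anyone later confirm that the working copy you analyzed is identical to the original capture.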
- Tapping the wire and the air
- CAM table on a network switch
- Routing tables on routers
- Dynamic Host Configuration Protocol logs
- DNS server logs
- Domain controller/authentication server/system logs
- IDS/IPS logs
- Firewall logs
- Proxy Server logs
Many commercial vendors provide network taps and SPAN ports on their devices for snooping, which forward all traffic seen on a particular port to the analyzer system. The technique is shown in the following diagram:
In the case of WLAN or Wi-Fi, captures can be performed by putting an external wireless adapter into monitor mode (often loosely referred to as promiscuous mode) and recording all the traffic for a particular wireless access point on a particular channel. This technique is shown in the following diagram:
Network switches contain content-addressable memory tables that store the mapping between a system's MAC address and the physical ports. In a large setup, this table becomes extremely handy, as it can pinpoint a MAC address on the network to a wall-jacked system, since mappings are available to the physical ports. Switches also provide network-mirroring capabilities, which will allow the investigators to see all the data from other VLANs and systems.
Routing tables in a router map ports on the router to the networks that they connect. The following table is a routing table. These tables allow us to investigate the path that the network traffic takes while traveling through various devices:
Most routers also have built-in packet filters and firewall capabilities. This means that they can be configured to log denied traffic, or certain other types of traffic, traveling to and from the network.
Dynamic Host Configuration Protocol (DHCP) servers generally log entries when a specific IP address is assigned to a particular MAC address, when a lease is renewed on the network, the timestamp of the renewal, and so on, and thus have significant value in network forensics. The following screenshot of the router's DHCP table presents a list of dynamically allocated hosts:
Name server query logs can help us understand which hostnames were resolved to which IP addresses at specific times. Consider a scenario where, as soon as a system on the network got infected with malware, it tried to connect back to a certain domain for command and control. Let's see an example as follows:
We can see in the preceding screenshot that a DNS request for the malwaresamples.com website was resolved and the resolved IP address was returned.
Having access to the DNS query packets can reveal Indicators of Compromise (IOCs) for a particular malware on the network while quickly revealing the IP address of the system making the query, so that the system can be dealt with easily.
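To make the idea concrete, here is a minimal sketch of matching DNS queries against a list of known-bad domains to find the systems that made them. The record layout (timestamp, client IP, queried name) is a simplified, hypothetical form of a DNS query log, and the function name hosts_querying_iocs is our own; real name server logs will need their own parsing step first.

```python
def hosts_querying_iocs(dns_records, ioc_domains):
    """Map each client IP to the IOC domains it queried.

    dns_records: iterable of (timestamp, client_ip, queried_name) tuples
                 (a hypothetical, simplified log format).
    ioc_domains: iterable of known-bad domain names.
    """
    iocs = {domain.lower().rstrip(".") for domain in ioc_domains}
    hits = {}
    for timestamp, client_ip, name in dns_records:
        name = name.lower().rstrip(".")
        for ioc in iocs:
            # Match the IOC domain itself or any subdomain of it.
            if name == ioc or name.endswith("." + ioc):
                hits.setdefault(client_ip, []).append((timestamp, name))
    return hits


# Example with made-up records; only the domain name comes from the text above.
records = [
    ("10:00:01", "192.168.1.15", "malwaresamples.com"),
    ("10:00:02", "192.168.1.20", "www.dropbox.com"),
    ("10:00:09", "192.168.1.15", "cdn.malwaresamples.com."),
]
for client, queries in hosts_querying_iocs(records, ["malwaresamples.com"]).items():
    print(client, queries)
```

Matching on the domain suffix rather than on exact equality means that subdomains of a known-bad domain are caught as well, while unrelated names that merely end with the same characters are not.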
Authentication servers can allow an investigator to view login attempts, the time of the login, and various other login-related activities throughout the network. Consider a scenario where a group of attackers tries to use a compromised host to log into the database server by using the compromised machine as a launchpad (pivoting). In such cases, authentication logs will quickly reveal not only the infected system, but also the number of failed/passed attempts from the system to the database server.
From a forensic standpoint, intrusion detection/prevention system (IDS/IPS) logs are among the most helpful. IDS/IPS logs provide not only the IP address, but also the matched signatures, ongoing attacks, malware presence, command-and-control servers, the IP and port for the source and destination systems, a timeline, and much more. We will cover IDS/IPS scenarios in the latter half of this book.
Firewall logs provide a detailed view of activities on the network. Not only do firewall solutions protect a server or a network from unwanted connections, they also help to identify the type of traffic, provide a trust score to the outbound endpoint, block unwanted ports and connection attempts, and much more. We will look at firewalls in more detail in the upcoming chapters.
Web proxies are also one of the most useful sources for a forensic investigator. Web proxy logs help uncover internal threats while providing explicit detail on events such as surfing habits, the source of web-based malware, the user's behavior on the network, and so on.
Since we now have an idea about the various types of logs we can consider for analysis, let us quickly familiarize ourselves with the basics of Wireshark.
Readers who are familiar with the basics of Wireshark can skip this section and proceed with the case studies; readers who are unfamiliar with the basics, or who need to brush up on Wireshark essentials, can continue through this section. Let's look at some of the most basic features of Wireshark. Look at the following screenshot:
Once we execute Wireshark, we are presented with a screen similar to the preceding picture. On the left-hand side, we have a list of the available interfaces to capture packets from. In the middle, we have recent packet capture files, and on the right-hand side, we have online help and user guides. To start a new packet capture, you can select an interface, such as Ethernet, if you are connected over the wire, or Wi-Fi, if you are connected on a wireless network. Similarly, if you need to open a packet-capture file, you can press the Open button, browse to the capture file, and load it in the Wireshark tool. Let's capture packets from the wireless interface by selecting Wi-Fi and pressing the Start button, as shown in the following screenshot:
We can see from the preceding screenshot that we have various types of packets flowing on the network. Let's understand TCP conversations, endpoints, and basic Wireshark filters in the upcoming sections.
You may want to view the list of IP endpoints that your system is communicating with. To achieve this, you can navigate to the Statistics tab and select Conversations, as shown in the following screenshot:
We can see that we have a variety of endpoints that are having conversations, the number of bytes transferred between the endpoints, and the duration of their data exchange. These options become extremely handy when you want to investigate malicious traffic and identify the key endpoints that are being contacted. Additionally, we can see that most of the conversations in the preceding screenshot involve 192.168.1.15, but we may not recognize the IP addresses it's talking to.
We can also make use of the Endpoints option from the Statistics tab, as shown in the following screenshot:
From the preceding screenshot, we can see all the endpoints, and sorting them using the number of packets will give us a clear understanding of the endpoints that are transmitting the highest number of packets, which is again quite handy when it comes to analyzing anomalous network behavior.
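The counting behind the Conversations and Endpoints views is simple enough to sketch ourselves. The following is only an approximation of the idea, not Wireshark's implementation: it assumes the packets have already been reduced to (source IP, destination IP) pairs, and the function names are our own.

```python
from collections import Counter


def packets_per_endpoint(packets):
    """Count packets per IP address, whether it was the sender or the receiver.

    packets: iterable of (src_ip, dst_ip) tuples, one per packet.
    Returns (ip, count) pairs sorted from busiest to quietest.
    """
    counts = Counter()
    for src_ip, dst_ip in packets:
        counts[src_ip] += 1
        counts[dst_ip] += 1
    return counts.most_common()


def packets_per_conversation(packets):
    """Count packets per pair of hosts, ignoring the direction of travel."""
    counts = Counter()
    for src_ip, dst_ip in packets:
        # Sorting the pair makes A->B and B->A land in the same bucket.
        counts[tuple(sorted((src_ip, dst_ip)))] += 1
    return counts.most_common()
```

Sorting endpoints by packet count, as suggested above, then amounts to reading the first few entries of the returned list.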
Domain names were invented to make sites easier to remember through common phrases. The list of IP addresses in the previous section makes little sense to us on its own, but a list that shows the resolution of the IPs into domain names can help us a lot. On clicking the Show address resolution/Resolved Addresses option, we will be presented with the following:
Well, this now makes proper sense, as we have a list of IP addresses with their domain resolutions that can help us eliminate the false positives. We saw in the previous endpoint section that the second-highest number of packets in the endpoints originated from 184.108.40.206. Since we don't have an idea of what IP address this could be, we can easily refer to the address resolutions and figure out that this is dropbox-dns.com, which looks suspicious. Let's search for it on Google using the string client.dropbox-dns.com. Browsing the first result from the search, we have the following result:
We can see from the preceding search result (the official Dropbox website, https://www.dropbox.com/) that the domain is a legitimate Dropbox domain and the traffic to and from it is safe (assuming that Dropbox is permitted on the network or, if it is allowed only for a select group of users, that the traffic is associated with those users). This resolution not only helps us identify domains, but also says a lot about the software running on the target. We have already identified Dropbox as running on the system. The domains in the Resolved Addresses pane in Wireshark also point to the following:
- A Gmail account being accessed
- A Qihoo 360 antivirus
- An HDFC bank account
- The Grammarly plugin
- The Firefox browser
Set up some basic display filters in Wireshark to only view packets of interest, as shown in the following screenshot:
We can see that simply typing in dns as the filter displays DNS packets; however, MDNS protocol packets are displayed as well.
Considering that we only require DNS packets and not MDNS protocol packets, we can set the filter as dns && !mdns, where ! denotes a NOT operation, as shown in the following screenshot:
We can see from this that we don't have an exact filter for MDNS. So, how do we filter the MDNS packets out? We can see that the MDNS protocol communicates over port 5353. Let's filter that out instead of using an !mdns filter, as shown in the following screenshot:
We can see that providing the filter dns and !(udp.port eq 5353) presents us with only the DNS packets. Here, eq means equal, ! means NOT, and udp.port means the UDP port. In layman's terms, we are asking Wireshark to show DNS packets while removing all the packets that communicate over UDP port 5353.
In the latest version of Wireshark, mdns is a valid protocol, and a display filter such as dns && !mdns works fine.
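The logic of the filter dns and !(udp.port eq 5353) can be written out as a small Python predicate. This is only a sketch of the boolean logic, not of how Wireshark evaluates filters: the packet dictionaries and their keys (protocol, udp_sport, udp_dport) are hypothetical stand-ins for dissected packets.

```python
MDNS_PORT = 5353


def is_dns_not_mdns(packet):
    """Mimic the display filter: dns and !(udp.port eq 5353).

    packet: dict with the hypothetical keys 'protocol', 'udp_sport',
            and 'udp_dport'.
    """
    is_dns = packet.get("protocol") == "dns"
    # udp.port matches when either the source or the destination port matches.
    uses_mdns_port = MDNS_PORT in (packet.get("udp_sport"), packet.get("udp_dport"))
    return is_dns and not uses_mdns_port


packets = [
    {"protocol": "dns", "udp_sport": 53124, "udp_dport": 53},
    {"protocol": "dns", "udp_sport": 5353, "udp_dport": 5353},
]
print([is_dns_not_mdns(packet) for packet in packets])
```

The point worth noticing is that udp.port is satisfied by either end of the conversation, so the negation removes a packet if port 5353 appears as its source or as its destination.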
When we filter for HTTP traffic, we also see OCSP and Simple Service Discovery Protocol (SSDP) data alongside the HTTP packets. To filter out the OCSP data, we can type in http && !ocsp, and since SSDP poses a similar problem to MDNS, we can exclude its port by typing !udp.port==1900. This means that the entire filter becomes http && !ocsp && !udp.port==1900, as shown in the following screenshot:
We can see from this that we have successfully filtered HTTP packets. But can we search through them and filter only HTTP POST packets? Yes, we can, using the expression http contains POST && !ocsp, as shown in the following screenshot:
We can see that providing the http contains POST filter filters out all the non-POST HTTP requests. Let's analyze the request by right-clicking and selecting the option to follow the HTTP stream, as shown in the following screenshot:
We can see that this looks like a file that has been sent out somewhere, but since it has headers such as x-360-cloud-security-desc, it looks as though it's the cloud antivirus that is scanning a suspicious file found on the network.
Let's take note of the IP address and match it with the address resolutions, as shown in the following screenshot:
Well, the address resolutions have failed us this time. Let's search the IP on https://who.is/, as shown in the following screenshot:
Yes, it belongs to the Qihoo 360 antivirus.
We can also select HTTP packets based on the response codes, as shown in the following screenshot:
We can see that we have filtered the packets by response code, where 200 denotes a status OK response. This is handy when investigating packet captures from compromised servers, as it gives us a clear picture of the files that have been accessed and shows us how the server responded to particular requests.
It also allows us to figure out whether the implemented protections are working well because, upon receiving a malicious request, in most cases the protecting firewall issues a 404 (Not Found) or a 403 (Forbidden) response code instead of 200 (OK).
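If the HTTP responses have been exported from a capture as text, the same check can be made outside Wireshark. The following sketch assumes that each response is available as a string beginning with its HTTP/1.x status line; the function names status_code and tally_status_codes are our own.

```python
from collections import Counter


def status_code(response_head):
    """Extract the numeric status code from an HTTP/1.x response.

    Returns None if the text does not begin with a valid status line.
    """
    status_line = response_head.split("\r\n", 1)[0]
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/") or not parts[1].isdigit():
        return None
    return int(parts[1])


def tally_status_codes(response_heads):
    """Count how often each status code appears across many responses."""
    codes = (status_code(head) for head in response_heads)
    return Counter(code for code in codes if code is not None)
```

A tally dominated by 403 and 404 responses to suspicious requests suggests that the protections held, whereas 200 responses to the same requests deserve a closer look.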
Let's now jump into some case studies and make use of the basics that we just learned.
Consider a scenario where an attacker has planted a keylogger on one of the systems in the network. Your job as an investigator is to find the following pieces of information:
- Find the infected system
- Trace the data to the server
- Find the frequency of the data that is being sent
- Find what other information is carried besides the keystrokes
- Try to uncover the attacker
- Extract and reconstruct the files that have been sent to the attacker
Additionally, in this exercise, you need to assume that the packet capture (PCAP) file is not available and that you have to do the sniffing-out part as well. Let's say that you are connected to a mirror port on the network where you can see all the data traveling to and from the network.
The capture file for this network capture is available at https://github.com/nipunjaswal/networkforensics/blob/master/Ch1/Noobs%20KeyLogger/Noobs%20Keylogger.pcap.
We can begin our process as follows. We already know that we are connected via a mirror port. Let's sniff around on the interface of choice. If connected to the mirror port, choose the default interface and proceed with collecting packets, as shown in the following screenshot:
Most keyloggers work on the web (HTTP), FTP, and email for delivering the keystrokes back to the attacker. We will try all of these to check whether there's anything unusual with packets from these protocols.
Let's try HTTP first by setting the http filter, as shown in the following screenshot:
There is HTTP data, but everything seems fine.
Let's try a couple of protocols, SMTP and POP, to check for anything unusual with the email protocol, as shown in the following screenshot:
Everything seems fine here as well.
Let's try FTP as well, as shown in the following screenshot:
Well, we have plenty of activity on FTP! We can see that the FTP packets in the capture contain login commands such as PASS, which denote login activity to the server. Of course, this can be either the keylogger or a legitimate login from any user on the network. Additionally, we can see a STOR command, which is used to store files on the FTP server. Let's note down the credentials and the filenames of the uploaded files for our reference and investigate further.
Since we know that the STOR command is used to store data on the server, let's view these data packets by changing the filter to ftp-data, as shown in the following screenshot:
Changing filter to ftp-data
The ftp-data filter shows mostly the files and data transferred, rather than all the other FTP commands.
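Noting down the credentials and uploaded filenames by hand gets tedious on a busy capture. Assuming the FTP control-channel requests have been exported as plain text lines, a sketch such as the following can collect them. USER, PASS, and STOR are standard FTP commands; the function name parse_ftp_commands and the sample values are made up for illustration.

```python
def parse_ftp_commands(control_lines):
    """Collect usernames, passwords, and uploaded filenames.

    control_lines: iterable of client request lines from the FTP control
                   channel, such as 'USER alice'.
    """
    summary = {"users": [], "passwords": [], "stored_files": []}
    buckets = {"USER": "users", "PASS": "passwords", "STOR": "stored_files"}
    for line in control_lines:
        # An FTP request is a command word, a space, and an optional argument.
        command, _, argument = line.strip().partition(" ")
        bucket = buckets.get(command.upper())
        if bucket:
            summary[bucket].append(argument)
    return summary


# Made-up sample lines, not taken from the capture file.
sample = ["USER demo_user", "PASS demo_pass", "TYPE I", "STOR keys_sample.html"]
print(parse_ftp_commands(sample))
```

Because FTP sends these commands in clear text, anything that can see the control channel (including this sketch) can recover the credentials, which is exactly what makes the protocol so useful to an investigator here.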
Let's see what we get when we follow the TCP stream of the packet. We can see that we have the following data being posted to the server:
We can see that the data being transmitted contains the word Ardamax, which is the name of a common piece of keylogger software that records keystrokes from the system it has infected and sends them back to the attacker. Let's save the packet capture in PCAP format by selecting File | Save As and choosing the .pcap format. We will use the .pcap format because the free version of NetworkMiner supports only PCAP files and not the .pcapng format.
Let's open the saved file using NetworkMiner, as shown in the following screenshot:
Opening the saved file using NetworkMiner
We can see we have a number of hosts present in the network capture.
Let's navigate to the Credentials tab, as shown in the following screenshot:
We can see that the username and password captured in the PCAP file are displayed under the Credentials tab in NetworkMiner. In the Wireshark dump, we previously saw the STOR command, which is commonly used to upload files to an FTP server.
Let's browse to the Files tab and see the files that we are interested in:
We can see plenty of files. Let's open the files that we found using the STOR command in the browser, as shown in the following screenshot:
The attacker was not only keylogging, but was also fetching details such as the active window title along with the key logs. So, to sum this up, we have the following answers to the questions that we asked at the beginning of the exercise:
- Find the infected system:
- Trace the data to the server:
- Find the frequency of the data that is being sent: The difference between two consecutive STOR commands for a similar file type is 15 seconds
- Find what other information is carried alongside the keystrokes: Active window titles
- Try to uncover the attacker: Not yet found
- Extract and reconstruct the files sent to the attacker:
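The upload frequency in the answers above was read off the capture by eye. As a rough sketch, the same measurement can be computed from a list of timestamped FTP commands; the (timestamp in seconds, command) tuple layout and the function name stor_intervals are assumptions made for this illustration.

```python
def stor_intervals(events):
    """Return the seconds elapsed between consecutive STOR commands.

    events: iterable of (timestamp_seconds, command) tuples in capture order
            (a hypothetical, simplified export of the FTP control channel).
    """
    times = [
        timestamp
        for timestamp, command in events
        if command.upper().startswith("STOR")
    ]
    # Pair each STOR with the one that follows it and take the difference.
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
```

A list of near-identical intervals points to a timer-driven upload, which is typical of automated software rather than a person transferring files by hand.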
We have plenty of information regarding the hacker. At this point, we can provide the details we found in our analysis in the report, or we can go one step further and try to uncover the identity of the attacker. If you chose to do so, then let's get started in finding out how to uncover this information.
Logging into a computer that you're not authorized to access can result in criminal penalties (fines, imprisonment, or both).
We already found their credentials for the server. Let's try logging into the FTP server to find something of interest, as shown in the following screenshot:
We can see that we are easily able to log into the server. Let's use an FTP client, such as Royal TSX on macOS (or FileZilla on Windows), to view the files that reside on the server, as shown in the following screenshot:
Wow! So much information has been logged. We can also see two directories named John and Jo. The directory Jo is empty, but we may have something in the directory named John. Let's view the contents of John, as shown in the following screenshot:
It looks as though the attacker is applying for jobs and keeps their updated resume on their server. The case study suggests that the attacker behind the keylogger is a newbie. By answering the last question regarding the identity of the attacker, we have successfully conducted our first network forensic analysis exercise. Bear in mind, however, that the resume we found might have been stolen from someone else. This is just the tip of the iceberg: this was an easy scenario, and in the upcoming chapters we will look at a variety of more complex ones.
In the next example, we will look at TCP packets and try to figure out what event caused such network traffic.
Let's analyze another capture file from https://github.com/nipunjaswal/networkforensics/blob/master/Ch1/Two%20to%20Many/twotomany.pcap, about which we currently know no details, and try to reconstruct the chain of events.
We will open the PCAP in Wireshark, as follows:
From the preceding screenshot, we can see that numerous SYN packets are being sent out to the 220.127.116.11 IP address. Looking closely, we can see that most of the packets are being sent from a single port, 36051, to almost every port on 18.104.22.168. Yes, you guessed right: this looks like a port scan. Initially, the SYN packet is sent out, and on receiving a SYN/ACK, the port is considered open.
We know that the originating IP address, 172.16.0.8, is an internal one and that the server being contacted is 22.214.171.124. Can you figure out the following?
- Scan type
- Open ports
Answering the first question requires a more in-depth understanding of TCP-oriented communication and its establishment. TCP works on a three-way handshake, which means that on receiving a synchronize (SYN) packet from the source IP address, the destination IP address sends out a synchronize/acknowledgment (SYN/ACK) packet, which is followed by a final acknowledgment (ACK) packet from the source IP address to complete the three-way handshake. However, as we can see from the preceding screenshot, only a SYN/ACK is sent back from port 80, and no ACK packet has been sent out by the source IP address.
This means that the ACK packet was never sent to the destination by the source, so only the first two steps of the three-way handshake were completed. This two-step, half-open mechanism causes the destination to use up resources, as the port will be held open for a period of time. It is a popular technique leveraged by a scan type called the SYN scan, the half-open scan, or sometimes the stealth scan. Tools such as Nmap make use of such techniques to lower the number of network packets on the wire. Therefore, we can conclude that the type of scan we are dealing with is a SYN scan.
In a half-open scan with Nmap, an RST packet is sent after the SYN/ACK is received, which tears down the half-open connection and prevents resource exhaustion at the destination.
Applying the filter ip.src==126.96.36.199, we can see the responses sent by 188.8.131.52. It is evident that we have received SYN/ACK packets from the open ports, including port 80 (seen earlier) and port 22. We can also see that there has been network loss, and the sender has sent the packets again. Additionally, we can see reset/acknowledgment (RST, ACK) packets, which can denote misconfigurations or an application on the target that is not willing to connect; the reasons for such behavior can differ.
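The reasoning used above to read the scan results (a SYN/ACK reply means the port is open, while an RST reply means it is not accepting connections) can be sketched as a small classifier over the TCP flags of the target's replies. The flag bit values are the standard TCP ones; the function name classify_ports and the input layout are our own assumptions.

```python
# Standard TCP flag bits from the TCP header.
SYN, RST, ACK = 0x02, 0x04, 0x10


def classify_ports(responses):
    """Classify scanned ports from the target's replies to SYN probes.

    responses: iterable of (source_port, tcp_flags) tuples taken from the
               packets that the scanned host sent back to the scanner.
    """
    states = {}
    for port, flags in responses:
        if flags & SYN and flags & ACK:
            # The second step of the handshake: something is listening.
            states[port] = "open"
        elif flags & RST:
            # Keep an earlier "open" verdict if the port already answered SYN/ACK.
            states.setdefault(port, "closed")
    return states
```

For example, feeding it replies of (80, 0x12), (22, 0x12), and (23, 0x14) marks ports 80 and 22 as open and port 23 as closed. Retransmitted replies, which we also saw in the capture, simply confirm the verdict already recorded.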
Over the course of this chapter, we learned about the basics of network forensics. We used Wireshark to analyze a keylogger and packets from a port scan. We discovered various types of network evidence sources and also learned the basic methodology that we should follow when performing network forensics.
In the next chapter, we will look at the basics of protocols and other technical concepts and strategies that are used to acquire evidence, and we will perform hands-on exercises related to them.
All credit for the preceding capture file goes to Chris Sanders' GitHub repository at https://github.com/chrissanders/packets.
To improve your confidence in your network forensics skills, try answering the following questions:
- What is the difference between the ftp and ftp-data display filters in Wireshark?
- Can you build an http filter for webpages with specific keywords?
- We saved files from the PCAP using NetworkMiner. Can you do this using Wireshark? (Yes/No)
- Try repeating these exercises with Tshark.
For further information on Wireshark, refer to https://www.packtpub.com/networking-and-servers/mastering-wireshark