Chapter 4. Nonvolatile Data Acquisition
In this chapter, we will discuss the acquisition of hard disk drives (HDDs). Data acquisition is critical because performing analysis directly on the original hard drive may cause the only drive that contains the data to fail, or you may write to the original drive by mistake.
Therefore, a forensic image of the hard drive must be created before the analysis. The acquisition of the HDD can be conducted either at the incident scene or in the analysis lab, on a live or a powered-off system, and over the network or locally, as we will see in this chapter.
In a nutshell, we will cover the following topics:
Forensic image
Incident response CDs
Live imaging of a hard drive
Linux for the imaging of a hard drive
Virtualization in data acquisition
Evidence integrity
Disk wiping in Linux
Forensic image
Imaging a hard drive is the process of creating an exact forensic image of the victim's or the suspect's hard drive so that the analysis can be conducted on the image instead of the original drive. To create an exact copy of the hard drive, there are two options that can be followed:
Duplication: Here, the destination of the process is a whole hard drive. Some references call this step cloning when the destination hard drive has the same brand, model, and size as the source hard drive. Duplication can be conducted using forensic hardware duplicators, which are hardware devices with two interfaces, one for the source and one for the destination hard drive. Once they start operating, they simply copy blocks of data from the source to the destination, regardless of the structure of the filesystem used on the source hard drive.
Usually, hardware duplication is faster than software tools as it operates in wire...
Incident response CDs
Because speed matters in the Incident Response (IR) process, using incident response CDs can save precious time. IR CDs are usually Linux distributions that contain many incident response and digital forensic tools. They are mainly meant to be booted on the target system in order to acquire different types of possible evidence without the need to disconnect the hard drive.
An IR CD is designed to leave as few traces as possible on the target system, so it boots with write protection enabled by default for all the connected hard drives. This gives the user the ability to grant write access to the destination hard drive only. It is better not to connect the destination hard drive until the system has booted from the incident response CD. Of course, booting from the IR CD means that the system under investigation is down, and you will start the machine and boot from the CD. The memory of the previously running system is not available in this case.
IR CDs also have the ability to acquire the memory...
Live imaging of a hard drive
In case of a live system, you will need to do the following:
Image the volatile data, such as system memory, first, as discussed earlier
Power the system down
Disconnect the hard drive
Image the hard drive separately
However, in some situations, you will need to image the hard drive without switching the system off. Examples are a server hosting a critical service that cannot be taken down, or a system that uses encryption, which will lock the data again if the system is powered off. In such cases, live acquisition is the preferred choice.
Linux for the imaging of a hard drive
Suppose that you already have a dead system and you need to take the machine's hard drive out in order to image it. The first thing to do is make sure that you connect the hard drive to your preferred Linux machine through a write blocker to prevent any accidental writing to the hard drive, which could change the evidence and make it inadmissible.
In the Linux operating system, there is a built-in tool called dd. The dd tool is considered forensically sound because it copies blocks of data regardless of their structure. There are many suggestions about what dd stands for; we can say that it stands for duplicate disk or duplicate data, and if it is used the wrong way, it can be disk destroyer or delete data. This tool can convert and copy files and hard drives.
Suppose the suspicious hard drive, which is the source and is connected through a write blocker, is attached as /dev/sda and the destination hard drive is attached as sdb...
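As a minimal sketch of imaging with dd, the following script images a source and confirms that the copy is bit-for-bit identical. To keep it safe to run, it uses a small temporary file as a stand-in for the source drive; the names SRC and IMG, the 4096-byte block size, and the file names are assumptions made for illustration, not values from this chapter. In a real acquisition, the input would be the write-blocked source device.

```shell
#!/bin/sh
# Sketch: image a source with dd and check that the copy is identical.
# SRC is a stand-in file, not a real device (assumption for illustration).
set -eu

WORK=$(mktemp -d)
SRC="$WORK/source.bin"     # hypothetical stand-in for the source drive
IMG="$WORK/evidence.dd"    # hypothetical image file name

# Build a 1 MiB stand-in source filled with random data.
dd if=/dev/urandom of="$SRC" bs=4096 count=256 2>/dev/null

# Image the source. conv=noerror,sync continues past read errors and pads
# short reads with zeros so that offsets in the image stay aligned.
dd if="$SRC" of="$IMG" bs=4096 conv=noerror,sync 2>/dev/null

# cmp -s exits with status 0 only if both files are identical.
if cmp -s "$SRC" "$IMG"; then
    RESULT="identical"
else
    RESULT="different"
fi
echo "$RESULT"
```

On real hardware, double-check which device is the source and which is the destination before running dd, because swapping if and of overwrites the evidence.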
Virtualization in data acquisition
Virtualization offers great benefits to digital forensics science. In virtualization, everything is a file, including the guest memory and the guest hard drive. What the handler needs to do is to identify the right file of the source that they need to acquire and copy this file to the external storage.
The snapshot concept found in most virtualization programs gives the investigator additional images of the machine at different times. If acquired and analyzed, these can reveal the behavior of the machine over time, that is, before and after the malware infection:
In the previous image, we can see the vmem files of the VMware program. VMware is one of the virtualization programs. This image contains the current memory file and two vmem files for two snapshots taken on two different dates. The files are all the same size because this is like the memory dump process: it copies the entire machine's...
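The idea of identifying the right guest files and copying them to external storage can be sketched as follows. This is only an illustration under stated assumptions: the directory names VM_DIR and DEST and the file names are hypothetical stand-ins created by the script itself, so nothing here reads a real virtual machine folder.

```shell
#!/bin/sh
# Sketch: collect the .vmem files of a guest and copy them to external storage.
# VM_DIR, DEST, and all file names are hypothetical stand-ins.
set -eu

WORK=$(mktemp -d)
VM_DIR="$WORK/vm"          # stand-in for the folder holding the guest files
DEST="$WORK/external"      # stand-in for the external storage
mkdir -p "$VM_DIR" "$DEST"

# Create stand-in files: current memory, one snapshot, and an unrelated log.
printf 'current memory' > "$VM_DIR/guest.vmem"
printf 'snapshot one'   > "$VM_DIR/guest-Snapshot1.vmem"
printf 'not memory'     > "$VM_DIR/guest.log"

# Copy every .vmem file; cp -p preserves the timestamps of the originals.
find "$VM_DIR" -type f -name '*.vmem' -exec cp -p {} "$DEST"/ \;

COPIED=$(find "$DEST" -type f -name '*.vmem' | wc -l | tr -d ' ')

# Confirm that each copy matches its original byte for byte.
MISMATCH=0
for f in "$VM_DIR"/*.vmem; do
    cmp -s "$f" "$DEST/$(basename "$f")" || MISMATCH=$((MISMATCH + 1))
done
echo "copied=$COPIED mismatches=$MISMATCH"
```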
Evidence integrity (the hash function)
What can we do to prove that the evidence hasn't been altered or changed? This step is very important for proving in court, if required, that you didn't add, remove, or edit the evidence during imaging or analysis. Most imaging tools come with several hash function implementations, such as MD5, SHA1, and SHA256. A hash function is a mathematical function that is designed to be irreversible, or one-way. This means that if you have input data A and hash function F, you will get F(A) = H. However, it is considered computationally infeasible to find a function F' such that F'(H) = A. In other words, we can't practically get A, the original data, from H, the hash digest.
For example, if we apply the same hash function to different strings, a good hash function should, for all practical purposes, map each string to a different hash:
As shown in the preceding diagram, each text resulted in a different hash after the same hash function was applied to all the texts. Even if the change...
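The behavior described here can be tried with a short sketch. It assumes the sha256sum utility from GNU coreutils is available, and the strings, the variable names, and the stand-in image file are made up for illustration. It hashes two strings that differ by one character, then hashes a file before and after a single byte is changed.

```shell
#!/bin/sh
# Sketch: the same input always gives the same digest, and a one-character
# or one-byte change gives a different digest. Assumes GNU sha256sum.
set -eu

# Two strings that differ by a single character, plus a repeat of the first.
H1=$(printf 'forensics' | sha256sum | cut -d' ' -f1)
H2=$(printf 'forensicz' | sha256sum | cut -d' ' -f1)
H3=$(printf 'forensics' | sha256sum | cut -d' ' -f1)

# A stand-in image file (hypothetical name), hashed before and after a change.
WORK=$(mktemp -d)
IMG="$WORK/evidence.dd"
dd if=/dev/zero of="$IMG" bs=4096 count=16 2>/dev/null
BEFORE=$(sha256sum "$IMG" | cut -d' ' -f1)

# Overwrite one byte at offset 100 without truncating the file.
printf 'X' | dd of="$IMG" bs=1 seek=100 conv=notrunc 2>/dev/null
AFTER=$(sha256sum "$IMG" | cut -d' ' -f1)

echo "string 1: $H1"
echo "string 2: $H2"
echo "image before: $BEFORE"
echo "image after:  $AFTER"
```

In practice, the digest recorded at acquisition time is compared with a digest computed later; matching values indicate that the image has not changed in between.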
Disk wiping in Linux
When imaging by duplication, the investigator can't reuse the same destination hard drive for two different source hard drives without wiping it in between. Otherwise, files from different cases could be mixed together, which would result in unreliable and untrue findings. After completing work on the duplicate hard drive, you must wipe it and prepare it for another case or hard drive. Don't wait until another case is assigned to you; wiping takes a long time.
This process is equivalent to the imaging process, but the source is a file full of zeros. In the Linux operating system, there is a /dev/zero file. You need to use this file as your input file to the dd tool, and your output file will be the hard drive that needs to be wiped. Note that /dev/null cannot be used as the input here because reading from it returns no data; if random data is preferred over zeros, /dev/urandom can be used instead:
dd if=/dev/zero of=/dev/sda bs=2K conv=noerror,sync
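Before reusing a wiped drive, it is worth confirming that the wipe actually succeeded. The following is a minimal sketch of that check, run against a temporary file that stands in for the drive; TARGET, the sizes, and the counting approach with tr are assumptions for illustration rather than a prescribed procedure.

```shell
#!/bin/sh
# Sketch: wipe a stand-in target with zeros and verify that no non-zero
# byte remains. TARGET is a temporary file, not a real device.
set -eu

WORK=$(mktemp -d)
TARGET="$WORK/target.bin"   # hypothetical stand-in for the drive to wipe

# Fill the stand-in with random data to simulate leftovers from an old case.
dd if=/dev/urandom of="$TARGET" bs=4096 count=256 2>/dev/null

# Wipe with zeros. count bounds the write for a file; on a real device,
# dd stops by itself when the device is full. conv=notrunc keeps the size.
dd if=/dev/zero of="$TARGET" bs=4096 count=256 conv=notrunc 2>/dev/null

# Delete every zero byte from the stream and count what is left.
NONZERO=$(tr -d '\000' < "$TARGET" | wc -c | tr -d ' ')
echo "non-zero bytes remaining: $NONZERO"
```

A count of zero non-zero bytes means the whole target now contains only zeros.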
In this chapter, we covered some IR CDs and discussed live acquisition using IR CDs and FTK Imager, and imaging over the network using IR CDs and the dd tool. We also saw how to preserve evidence integrity and how to wipe a disk for forensic usage.
In the next chapter, we will discuss how to create a timeline of the system activities and why it is important from a digital forensics perspective.