Search icon
Subscription
0
Cart icon
Close icon
You have no products in your basket yet
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Python Digital Forensics Cookbook

You're reading from  Python Digital Forensics Cookbook

Product type Book
Published in Sep 2017
Publisher Packt
ISBN-13 9781783987467
Pages 412 pages
Edition 1st Edition
Languages
Concepts
Authors (2):
Chapin Bryce Chapin Bryce
Profile icon Chapin Bryce
Preston Miller Preston Miller
Profile icon Preston Miller
View More author details

Table of Contents (11) Chapters

Preface 1. Essential Scripting and File Information Recipes 2. Creating Artifact Report Recipes 3. A Deep Dive into Mobile Forensic Recipes 4. Extracting Embedded Metadata Recipes 5. Networking and Indicators of Compromise Recipes 6. Reading Emails and Taking Names Recipes 7. Log-Based Artifact Recipes 8. Working with Forensic Evidence Container Recipes 9. Exploring Windows Forensic Artifacts Recipes - Part I 10. Exploring Windows Forensic Artifacts Recipes - Part II

Recording file attributes

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Any

Now that we can iterate over files and folders, let’s learn to record metadata about these objects. File metadata plays an important role in forensics, as collecting and reviewing this information is a basic task during most investigations. Using a single Python library, we can gather some of the most important attributes of files across platforms.

Getting started

All libraries used in this script are present in Python’s standard library. The os library, once again, can be used here to gather file metadata. One of the most helpful methods for gathering file metadata is the os.stat() function. It's important to note that the stat() call only provides information available with the current operating system and the filesystem of the mounted volume. Most forensic suites allow an examiner to mount a forensic image as a volume on a system and generally preserve the file attributes available to the stat call. In Chapter 8, Working with Forensic Evidence Containers Recipes, we will demonstrate how to open forensic acquisitions to directly extract file information.


To learn more about the os library, visit https://docs.python.org/3/library/os.html.

How to do it…

We will record file attributes using the following steps:

  1. Obtain the input file to process.
  2. Print various metadata: MAC times, file size, group and owner ID, and so on.

How it works…

To begin, we import the required libraries: argparse for argument handling, datetime for interpretation of timestamps, and os to access the stat() method. The sys module is used to identify the platform (operating system) the script is running on. Next, we create our command-line handler, which accepts one argument, FILE_PATH, a string representing the path to the file we will extract metadata from. We assign this input to a local variable before continuing execution of the script:

from __future__ import print_function
import argparse
from datetime import datetime as dt
import os
import sys

__authors__ = ["Chapin Bryce", "Preston Miller"]
__date__ = 20170815
__description__ = "Gather filesystem metadata of provided file"

parser = argparse.ArgumentParser(
description=__description__,
epilog="Developed by {} on {}".format(", ".join(__authors__), __date__)
)
parser.add_argument("FILE_PATH",
help="Path to file to gather metadata for")
args = parser.parse_args()
file_path = args.FILE_PATH

Timestamps are one of the most common file metadata attributes collected. We can access the creation, modification, and access timestamps using the os.stat() method. The timestamps are returned as a float representing the seconds since 1970-01-01. Using the datetime.fromtimestamp() method, we convert this value into a readable format.

The os.stat() module interprets timestamps differently depending on the platform. For example, the st_ctime value on Windows displays the file's creation time, while on macOS and UNIX this same attribute displays the last modification of the file's metadata, similar to the NTFS entry modified time. This is not the only part of os.stat() that varies by platform, though the remainder of this recipe uses items that are common across platforms.
stat_info = os.stat(file_path)
if "linux" in sys.platform or "darwin" in sys.platform:
print("Change time: ", dt.fromtimestamp(stat_info.st_ctime))
elif "win" in sys.platform:
print("Creation time: ", dt.fromtimestamp(stat_info.st_ctime))
else:
print("[-] Unsupported platform {} detected. Cannot interpret "
"creation/change timestamp.".format(sys.platform)
)
print("Modification time: ", dt.fromtimestamp(stat_info.st_mtime))
print("Access time: ", dt.fromtimestamp(stat_info.st_atime))

We continue printing file metadata following the timestamps. The file mode and inode properties return the file permissions and inode as an integer, respectively. The device ID refers to the device the file resides on. We can convert this integer into major and minor device identifiers using the os.major() and os.minor() methods:

print("File mode: ", stat_info.st_mode)
print("File inode: ", stat_info.st_ino)
major = os.major(stat_info.st_dev)
minor = os.minor(stat_info.st_dev)
print("Device ID: ", stat_info.st_dev)
print("\tMajor: ", major)
print("\tMinor: ", minor)

The st_nlink property returns a count of the number of hard links to the file. We can print the owner and group information using the st_uid and st_gid properties, respectively. Lastly, we can gather file size using st_size, which returns an integer representing the file's size in bytes.


Be aware that if the file is a symbolic link, the st_size property reflects the length of the path to the target file rather than the target file’s size.
print("Number of hard links: ", stat_info.st_nlink)
print("Owner User ID: ", stat_info.st_uid)
print("Group ID: ", stat_info.st_gid)
print("File Size: ", stat_info.st_size)

But wait, that’s not all! We can use the os.path() module to extract a few more pieces of metadata. For example, we can use it to determine whether a file is a symbolic link, as shown below with the os.islink() method. With this, we could alert the user if the st_size attribute is not equivalent to the target file's size. The os.path() module can also gather the absolute path, check whether it exists, and get the parent directory. We can also gather the parent directory using the os.path.dirname() function or by accessing the first element of the os.path.split() function. The split() method is more commonly used to acquire the filename from a path:

# Gather other properties
print("Is a symlink: ", os.path.islink(file_path))
print("Absolute Path: ", os.path.abspath(file_path))
print("File exists: ", os.path.exists(file_path))
print("Parent directory: ", os.path.dirname(file_path))
print("Parent directory: {} | File name: {}".format(
*os.path.split(file_path)))

By running the script, we can relevant metadata about the file. Notice how the format() method allows us to print values without concern for their data types. Normally, we would have to convert integers and other data types to strings first if we were to try printing the variable directly without string formatting:

There's more…

This script can be further improved. We have provided a couple of recommendations here:

  • Integrate this recipe with the Iterating over loose files recipe to recursively extract metadata for files in a given series of directories
  • Implement logic to filter by file extension, date modified, or even file size to only collect metadata information on files matching the desired criteria
You have been reading a chapter from
Python Digital Forensics Cookbook
Published in: Sep 2017 Publisher: Packt ISBN-13: 9781783987467
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}