Python Digital Forensics Cookbook


Multiple hands make light work

Recipe Difficulty: Medium

Python Version: 2.7 or 3.5

Operating System: Any

While Python is often described as single threaded because of its Global Interpreter Lock, we can use built-in libraries to spin up new processes to handle tasks. Generally, this is preferred when there is a series of tasks that can run simultaneously and the processing is not already bound by hardware limits, such as network bandwidth or disk speed.

Getting started

All libraries used in this script are present in Python’s standard library. Using the built-in multiprocessing library, we can handle the majority of situations where we would need multiple processes to efficiently tackle a problem.


To learn more about the multiprocessing library, visit https://docs.python.org/3/library/multiprocessing.html.
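
As a quick orientation before the recipe itself (this sketch is our addition, not the book's code), the multiprocessing library also offers a higher-level Pool API that distributes a function across several worker processes in just a few lines:

from multiprocessing import Pool


def square(x):
    # A trivial task to demonstrate distributing work
    return x * x


if __name__ == "__main__":
    pool = Pool(processes=4)
    results = pool.map(square, range(10))
    pool.close()
    pool.join()
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

The recipe below uses the lower-level Process class instead, which gives us direct control over each worker.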

How to do it…

With the following steps, we showcase basic multiprocessing support in Python:

  1. Set up a log to record multiprocessing activity.
  2. Append data to a list using multiprocessing.

How it works…

Let's now look at how we can achieve multiprocessing in Python. Our imports include the multiprocessing library, shortened to mp, as it is quite lengthy otherwise; the logging and sys libraries for process status messages; the time library to slow down execution for our example; and the randint method to generate the number of seconds each process should sleep:

from __future__ import print_function
import logging
import multiprocessing as mp
from random import randint
import sys
import time

Before creating our processes, we set up a function that they will execute. This is where we put the task each process should perform before returning to the main process. In this case, we take the number of seconds the process should sleep as our only argument. To print a status message that allows us to differentiate between the processes, we use the current_process() method to access the name property of each process:

def sleepy(seconds):
    proc_name = mp.current_process().name
    logger.info("{} is sleeping for {} seconds.".format(
        proc_name, seconds))
    time.sleep(seconds)

With our worker function defined, we create our logger instance, borrowing code from the previous recipe, and set it to only record to the console.

logger = logging.getLogger(__file__)
logger.setLevel(logging.DEBUG)
msg_fmt = logging.Formatter("%(asctime)-15s %(funcName)-7s "
                            "%(levelname)-8s %(message)s")
strhndl = logging.StreamHandler(sys.stdout)
strhndl.setFormatter(fmt=msg_fmt)
logger.addHandler(strhndl)

We now define the number of workers we want to spawn and create them in a for loop. Using this technique, we can easily adjust the number of processes we run. Inside the loop, we define each worker using the Process class, setting our target function and the required arguments. Once the process instance is defined, we start it and append the object to a list for later use:

num_workers = 5
workers = []
for w in range(num_workers):
    p = mp.Process(target=sleepy, args=(randint(1, 20),))
    p.start()
    workers.append(p)

By appending the workers to a list, we can join them in sequential order. Joining, in this context, means waiting for a process to complete before execution continues. If we did not join our workers, the main script could run to the end and begin executing subsequent code before the other processes complete. While that wouldn't cause huge problems in our example, it could cause the next snippet of code to start too early:

for worker in workers:
    worker.join()
    logger.info("Joined process {}".format(worker.name))

When we execute the script, we can see the processes start and join over time. Since we stored these handles in a list, they join in order, regardless of how long each worker takes to finish. In one sample run, Process-5 slept for 14 seconds before completing, and meanwhile, Process-4 and Process-3 had already completed.
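
One portability note that is easy to miss (our addition): on platforms that start new processes with the spawn method, most notably Windows, the process-creation code must be protected by an if __name__ == "__main__": guard; otherwise each child re-imports the script and tries to spawn workers of its own. A safe layout for the snippets above is:

if __name__ == "__main__":
    num_workers = 5
    workers = []
    for w in range(num_workers):
        p = mp.Process(target=sleepy, args=(randint(1, 20),))
        p.start()
        workers.append(p)

    for worker in workers:
        worker.join()
        logger.info("Joined process {}".format(worker.name))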

There's more…

This script can be further improved. We have provided a recommendation here:

  • Rather than using function arguments to pass data between processes, review pipes and queues as alternatives for sharing data; a minimal queue sketch follows this list. Additional information about these objects can be found at https://docs.python.org/3/library/multiprocessing.html#exchanging-objects-between-processes.
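
Here is a minimal sketch of the queue-based approach (ours, not the book's), in which a worker places its results on a shared Queue and the parent process drains it:

import multiprocessing as mp


def worker(numbers, out_queue):
    # Put each computed result on the shared queue
    for n in numbers:
        out_queue.put(n * n)


if __name__ == "__main__":
    out_queue = mp.Queue()
    proc = mp.Process(target=worker, args=(range(5), out_queue))
    proc.start()

    # Drain the queue before joining to avoid blocking on a full pipe
    results = [out_queue.get() for _ in range(5)]
    proc.join()
    print(results)  # [0, 1, 4, 9, 16]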