Learning Python Web Penetration Testing

By Christian Martorella

About this book

Web penetration testing is the use of tools and code to attack a website or web app in order to assess its vulnerability to external threats. While there is an increasing number of sophisticated, ready-made tools to scan systems for vulnerabilities, the use of Python allows you to write system-specific scripts, or alter and extend existing testing tools to find, exploit, and record as many security weaknesses as possible. Learning Python Web Penetration Testing will walk you through the web application penetration testing methodology, showing you how to write your own tools with Python for each activity throughout the process. The book begins by emphasizing the importance of knowing how to write your own tools with Python for web application penetration testing. You will then learn to interact with a web application using Python, understand the anatomy of an HTTP request, URL, headers and message body, and later create a script to perform a request, and interpret the response and its headers. As you make your way through the book, you will write a web crawler using Python and the Scrapy library. The book will also help you to develop a tool to perform brute force attacks in different parts of the web application. You will then discover more on detecting and exploiting SQL injection vulnerabilities. By the end of this book, you will have successfully created an HTTP proxy based on the mitmproxy tool.

Publication date:
June 2018


Chapter 1. Introduction to Web Application Penetration Testing

In this chapter, we will look at the following topics:

  • Understanding the web application penetration testing process
  • Typical web application toolkit
  • Training environment

Let's get started!


Understanding the web application penetration testing process

In this section, we will look at what web application penetration testing is and the process behind it. We will cover why it is important to perform these tests, what professional methodologies look like, and, briefly, why it is important to have the skills to write our own tools with Python.

Penetration testing is a type of security testing that evaluates the security of an application from the perspective of an attacker. It is an offensive exercise where you have to think like an attacker and understand the developers as well as the technology involved in order to unveil all the flaws.

The goal is to identify all the flaws and demonstrate how they can be exploited by an attacker, and what the impact will be on our company. Finally, the report will provide solutions to fix the issues that have been detected. It's a manual and dynamic test. Manual means that it heavily depends on the knowledge of the person doing the test, and that is why learning how to write your own penetration testing tools is important, and will give you an edge in your career. Dynamic means that we test the running application; it is not a static analysis of the source code. The security test is useful to validate and verify the effectiveness of the application's security controls, and to identify where these controls are lacking.

So, why should we perform penetration testing? Nowadays, IT has taken the world by storm. Most company processes and data are handled by computers. This is the reason why companies need to invest in security testing: to validate the effectiveness of their security controls and, often, to reveal the lack of them.

One report by EMC (https://www.scmagazine.com/study-it-leaders-count-the-cost-of-breaches-data-loss-and-downtime/article/542793/) states that the average reported annual financial loss per company is 497,037 USD for downtime, 860,273 USD for security breaches, and 585,892 USD for data loss. On top of that comes all the time and company resources put into incident response and into fixing the issue, then testing and deploying the fix.

That is why performing penetration testing will help companies to protect their customers' data, intellectual property, and services. Penetration testing follows a simple methodology formed by four main phases, which are as follows:

  • Reconnaissance: In this phase, we'll gather information to identify the technologies used, the infrastructure supporting the application, software configuration, load balancers, and so on. This phase is also known as fingerprinting.
  • Mapping: We then move into the mapping phase, where we build a map or diagram of the application pages and functionalities. We aim to identify the components and their relationships. One of the techniques to support mapping is spidering or crawling. Also, in this phase, we'll discover nonlinked resources by performing brute force attacks.
  • Vulnerability: Once we have all the components, parameters, forms, and functionalities mapped out, we move to phase three, where we'll start vulnerability discovery.
  • Exploitation: After identifying all the vulnerabilities, we can move to the last phase, which is the exploitation of the vulnerabilities. Depending on the scope of the pen test, once you exploit a vulnerability, you can start the process all over again from your new vantage point. Usually, this is the target's DMZ, from which you would try to get into their internal network segment.
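
As a rough illustration of the reconnaissance phase, the following sketch looks for technology hints in HTTP response headers. It is a minimal example rather than a complete fingerprinting tool: the function name fingerprint and the sample header values are assumptions made for illustration, and real servers may hide or falsify these headers.

```python
def fingerprint(headers):
    """Return a list of technology hints found in HTTP response headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    hints = []

    # The Server header often names the web server and its version.
    if "server" in normalized:
        hints.append("web server: " + normalized["server"])

    # X-Powered-By commonly leaks the language or framework in use.
    if "x-powered-by" in normalized:
        hints.append("platform: " + normalized["x-powered-by"])

    # Default session cookie names can hint at the backend language.
    cookie = normalized.get("set-cookie", "")
    if "PHPSESSID" in cookie:
        hints.append("session cookie suggests PHP")
    elif "JSESSIONID" in cookie:
        hints.append("session cookie suggests Java")

    return hints


# Hypothetical headers, invented for this example.
sample_headers = {
    "Server": "Apache/2.4.18 (Ubuntu)",
    "X-Powered-By": "PHP/7.0.33",
    "Set-Cookie": "PHPSESSID=abc123; path=/",
}

for hint in fingerprint(sample_headers):
    print(hint)
```

Treat the output as a set of leads to confirm by other means, not as proof of what the server runs.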

One step that is not represented here is the reporting phase, where you document all the findings so that you can present them to your customer or company.

Finally, there are two types of penetration tests: black box and white box. A black box test takes place when you don't have any information about the target, which is basically the same situation an attacker is in. A white box test takes place when the customer provides us with documentation, source code, and configurations to accelerate the process, so that we can focus only on the interesting areas.

You may be wondering, what areas should you test during this process? These are some of the most important ones to cover:

  • Configuration and deployment management testing
  • Identity management testing
  • Authentication testing
  • Authorization testing
  • Session management testing
  • Input validation
  • Testing error handling
  • Cryptography
  • Business logic testing
  • Client-side testing

We'll cover some of these areas over the course of this book.


You can expand your knowledge on these areas by reading the OWASP testing guide: https://www.owasp.org/index.php/OWASP_Testing_Project.

So, why build your own tools? Web applications are very different since they're developed using multiple technologies, combinations, flows, and implementations.

This is the reason why there is no single tool that will cover all the scenarios that you will find during your career. Many times, we'll write scripts to test specific issues, to automate certain tasks, or to exploit a vulnerability. During the course of this book, we'll see how to write tools and test different areas such as authentication, input validation, and discovery, and we'll end up writing a simple Hypertext Transfer Protocol (HTTP) proxy that could be the foundation of our own security scanner. Writing your own tools is a valuable skill that will put you ahead of many penetration testers who do not have the capability to adapt tools or write their own. In certain penetration test engagements, this could make all the difference.


Typical web application toolkit

In this section, we'll take a look at the different tools used by security professionals to perform web application penetration tests.

HTTP Proxy

The most important tool for testing web applications is the HTTP proxy. This tool allows you to intercept all the communication between the browser and the server in both directions. These proxies are called man-in-the-middle proxies. They let us understand how an application works and, most importantly, they allow us to intercept and modify the requests and responses.

Usually, the proxy will run on the same machine as the browser you're using for testing the application. The HTTP proxies most used by security professionals are Burp Suite from PortSwigger (https://portswigger.net/burp/proxy.html) and Zed Attack Proxy (ZAP) (https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project). We also have mitmproxy. It is a newer alternative developed in Python and is good for building tools or automating certain scenarios. The downside is that it is console only and has no GUI, which, for our purposes, is a benefit.
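
To make this concrete, here is a minimal sketch of how a Python script could send its traffic through an intercepting proxy running on the same machine, using only the standard library. The proxy address 127.0.0.1:8080 is an assumption, so adjust it to wherever your proxy listens, and the helper name build_proxied_opener is ours.

```python
import urllib.request

# Assumed address of a locally running intercepting proxy (hypothetical).
PROXY_URL = "http://127.0.0.1:8080"


def build_proxied_opener(proxy_url=PROXY_URL):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


opener = build_proxied_opener()
# opener.open("http://www.scruffybank.com/") would now pass through the proxy,
# where the request and the response can be inspected or modified.
```

Intercepting HTTPS traffic this way would normally also require the client to trust the proxy's certificate authority, which this sketch does not set up.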

Crawlers and spiders

Crawlers and spiders are used for mapping web applications, automating the task of cataloging all the content and functionality. The tool automatically crawls the application by following all the links it finds, submitting forms, analyzing the responses for new content, and repeating this process until it covers the whole application.
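
The crawling loop described above can be sketched in a few lines with the standard library. This is a simplified illustration, not how any particular crawler is implemented: it only collects link and form targets, it does not submit forms, and the fetch argument is a hypothetical callable that returns the HTML for a URL. The page names in the example are invented.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from <a href> and <form action> attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "form": "action"}.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the current page URL.
                self.links.add(urljoin(self.base_url, value))


def extract_links(page_url, html):
    """Return the links on a page that stay on the same host."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    host = urlparse(page_url).netloc
    return {link for link in parser.links if urlparse(link).netloc == host}


def crawl(start_url, fetch):
    """Breadth-first crawl; fetch(url) must return the page HTML as a string."""
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for link in extract_links(url, fetch(url)):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# A tiny fake site (hypothetical pages) so the example runs without a network.
pages = {
    "http://www.scruffybank.com/":
        '<a href="/login.php">Login</a> <a href="http://example.org/">Ext</a>',
    "http://www.scruffybank.com/login.php":
        '<form action="check.php"></form> <a href="/">Home</a>',
}

for url in sorted(crawl("http://www.scruffybank.com/", lambda u: pages.get(u, ""))):
    print(url)
```

Passing fetch in as a parameter keeps the crawling logic separate from the HTTP client, so the same loop can be exercised offline or pointed at a real application.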

There are standalone crawlers and spiders such as Scrapy (http://scrapy.org), which is written in Python, and command-line tools such as HTTrack (http://www.httrack.com). There are also crawlers and spiders integrated with proxies such as Burp and ZAP, which benefit from the content that has passed through the proxy to enrich their knowledge of the app.

One good example of why this is valuable is when the application is heavy on JavaScript. Traditional crawlers won't interpret JS, but browsers will. So, the proxy will see the resulting requests and add them to the crawler catalog. We'll see Scrapy in more detail later.

Vulnerability scanners

Now, let's step into more complex tools: the vulnerability scanners.

These tools are considered more complex as they have to automate most of the security testing methodology in one tool. They will do the crawling, discovery, vulnerability detection, and some of the exploitation. The two most used open source web application security scanners are w3af (http://w3af.org/), which is written in Python, and Arachni (http://www.arachni-scanner.com/), which is written in Ruby.
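
One small piece of what such a scanner does, the detection step, can be sketched as a search for database error messages in a response body. The signature list below is a short, illustrative selection and is far from complete, and the function name find_sql_errors is ours; real scanners combine many more checks.

```python
# A few error fragments that database layers are known to emit.
# This list is illustrative only, not exhaustive.
SQL_ERROR_SIGNATURES = [
    "you have an error in your sql syntax",  # MySQL
    "warning: mysql",                        # PHP MySQL function warnings
    "unclosed quotation mark",               # Microsoft SQL Server
    "sqlstate[",                             # PHP PDO exceptions
]


def find_sql_errors(body):
    """Return the signatures that appear in a response body."""
    lowered = body.lower()
    return [sig for sig in SQL_ERROR_SIGNATURES if sig in lowered]


# Hypothetical response body, as an application might return it after
# receiving a single quote in a parameter.
sample_body = (
    "<b>Error:</b> You have an error in your SQL syntax; check the manual "
    "that corresponds to your MySQL server version"
)

print(find_sql_errors(sample_body))
```

A match only suggests that input reached the database unsanitized; it still has to be confirmed manually.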

There are multiple commercial alternatives such as Acunetix (http://www.acunetix.com/), which is one of the cheapest and provides good value for money.

Brute forcers/predictable resource locators

Web brute forcers or discovery tools are used to find content such as files, directories, servlets, or parameters through dictionary attacks. These tools use word lists that security professionals have put together over the last 10 years, and which contain known filenames, directories, or just words found in different products or web applications.
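
The idea behind these tools fits in a short sketch: combine each word with a set of extensions, request the resulting URL, and keep whatever does not come back as 404. The helper names, the extensions, and the get_status callable (which returns the HTTP status code for a URL) are assumptions made for this example; it illustrates the technique and is not the implementation of any of the tools mentioned.

```python
def build_candidates(base_url, words, extensions=("", ".php", ".bak")):
    """Yield one candidate URL per word and extension combination."""
    base = base_url.rstrip("/")
    for word in words:
        for extension in extensions:
            yield f"{base}/{word}{extension}"


def discover(base_url, words, get_status, extensions=("", ".php", ".bak")):
    """Return (url, status) pairs for candidates that do not answer 404."""
    found = []
    for url in build_candidates(base_url, words, extensions):
        status = get_status(url)
        if status != 404:
            found.append((url, status))
    return found


# A fake server (hypothetical resources) so the example runs offline.
known = {
    "http://www.scruffybank.com/admin": 200,
    "http://www.scruffybank.com/backup.bak": 403,
}


def fake_status(url):
    return known.get(url, 404)


for url, status in discover("http://www.scruffybank.com/",
                            ["admin", "backup", "test"], fake_status):
    print(status, url)
```

Note that a 403 is kept as a hit: the server refusing access still tells us the resource exists.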

The precursor for these types of tools was DIRB (http://dirb.sourceforge.net/), which is still available and maintained by Dark Raver. Another great alternative is Wfuzz (http://www.edge-security.com/wfuzz.php), which I developed in the past and is now maintained and developed by Xavier Mendez. You can find this tool in Kali, the most used penetration testing distribution.

Tools such as Burp and ZAP also provide these capabilities. All these tools benefit from word lists such as the ones provided by FuzzDB (https://github.com/fuzzdb-project), a database of word lists for web application testing. We'll see how to build a tool for this purpose similar to Wfuzz.

Specific task tools

We have a vast array of tools that are focused on specific tasks, such as encoders and hashers for Base64, MD5, SHA1, and Unicode.
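
Python's standard library already covers most of these encoding and hashing tasks. The following sketch gathers a few of them in one helper; the function name encode_all and the choice of output formats are ours, and the Unicode escape shown only handles characters in the Basic Multilingual Plane.

```python
import base64
import hashlib
from urllib.parse import quote


def encode_all(text):
    """Return several common encodings and hashes of the given text."""
    data = text.encode("utf-8")
    return {
        "base64": base64.b64encode(data).decode("ascii"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        # safe="" makes quote() percent-encode every reserved character.
        "url": quote(text, safe=""),
        # \uXXXX escapes, valid for Basic Multilingual Plane characters only.
        "unicode_escape": "".join(f"\\u{ord(char):04x}" for char in text),
    }


for name, value in encode_all("admin").items():
    print(f"{name:15} {value}")
```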

There are also tools created to exploit a specific type of vulnerability: for example, SQL injectors such as sqlmap, XSS consoles such as BeEF to demonstrate the impact of an XSS, DOM XSS scanners such as Dominator, and many more. Another important type of tool in the toolkit is the post-exploitation tool.

These tools are needed once you manage to exploit a vulnerability. They help you to control the server, upload files and shells, proxy content to the internal network, and expand your attack internally. There are many other tools to overcome the endless challenges we find while testing new applications and technologies.


Testing environment

In this section, we'll take a look at our testing lab environment. We will start by installing the VirtualBox software to run our lab VM. We'll access the vulnerable web application, get familiar with the text editor, and finally, I will give you an important warning.

The first tool that we need is VirtualBox. This will allow you to run the lab environment virtual machine created for this training. You can download VirtualBox from https://www.virtualbox.org/wiki/Downloads. Choose your host OS and download the installer. After downloading VirtualBox, we can download the virtual machine created for this course from https://drive.google.com/open?id=0ByatLxAqtgoqckVEeGZ4TE1faVE.

Once the file is downloaded, we can proceed with the installation of VirtualBox.

Install VirtualBox, which in my case I have to do by double-clicking on the .dmg file, and follow the installation instructions. Once you're finished, decompress the lab virtual machine. In my case, I use an archive utility on OS X. You can use 7-Zip on other platforms.

Once decompressed, we will start VirtualBox.

Open the VM. Once the VM is loaded in VirtualBox, we'll start the machine and wait for it to boot until we get the login prompt. We'll log in with the user Packt and the password secret.


The root user password is packt2016.

Now, we have our lab ready for action. For the purpose of this book, we have created a vulnerable web application that will allow us to test for different types of vulnerabilities using our own developed tools. The application simulates a very simple banking application.

It is developed in PHP with MySQL and it is served by Apache. Now, we'll open the browser in our VM and load the URL www.scruffybank.com. I created an /etc/hosts entry to redirect that hostname to localhost. This application is running on an Apache server in the VM.
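
To see how such a hosts entry maps the hostname to the local machine, here is a small sketch that parses hosts-file text and looks up a name. The sample content is an assumption about what the entry looks like; the actual file in the VM may differ, and the function name lookup_in_hosts is ours.

```python
def lookup_in_hosts(hosts_text, hostname):
    """Return the address mapped to hostname in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Each entry is an address followed by one or more names.
        address, *names = line.split()
        if hostname in names:
            return address
    return None


# Assumed hosts-file content for the lab (hypothetical).
sample_hosts = """
127.0.0.1   localhost
127.0.0.1   www.scruffybank.com   # lab application
"""

print(lookup_in_hosts(sample_hosts, "www.scruffybank.com"))
```

In the VM, you could pass the contents of /etc/hosts to the same function to check the real entry.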

You should see the index page. If you click on Learn More, you will see further information about the application.

On the top right-hand side, you can access the login page.

Our last tool in the lab is the text editor, where we'll write the scripts. One possible choice would be Atom, a multi-platform open source and free editor developed by the GitHub folks. Feel free to install or use the editor you prefer.

In order to start Atom, go to the desktop item named Atom and the editor will start with a blank file. You can start typing the code, but until you save the file and add an extension, it won't do syntax highlighting.

I will open an example in my home directory called Video-3.py to show what a Python script looks like in Atom.


I want to highlight that many penetration testing activities, if not all of them, must not be performed without the target company's permission. In many countries, these activities are illegal without proper permission. Always use a testing environment whenever you want to try a new tool or technique. And whenever you perform a penetration test for a customer, get written authorization.



In this chapter, we have seen what web application penetration testing is, why it is important to perform the test, what the methodology to follow is when performing a penetration test, the different domains that need to be covered, and why it is important to know how to write your own tools with Python.

We have also seen the tools that make up the web application pen tester's toolkit. This helped us understand how the tools align with the methodology, and it will also serve as inspiration when we need to create our own tools, learn from them, and understand how they work.

We also saw the lab environment that we'll be using throughout this book.

We have installed VirtualBox, run the lab virtual machine, and accessed the testing web app, Scruffy Bank. We saw a quick example of the text editor, and finally, we saw an important warning about the consequences of doing penetration testing without permission from the customer.

In Chapter 2, Interacting with Web Applications, we'll learn how to interact with a web application using Python, understand the anatomy of an HTTP request, URL, headers, message body, and we'll create a script to perform a request and interpret the response and its headers.

About the Author

  • Christian Martorella

    Christian Martorella has been working in the field of information security for the last 18 years and is currently leading the product security team for Skyscanner. Earlier, he was the principal program manager in the Skype product security team at Microsoft. His current focus is security engineering and automation. He has contributed to open source security testing tools such as Wfuzz, theHarvester, and Metagoofil, all included in Kali, the penetration testing Linux distribution.
