Mastering Modern Web Penetration Testing

By Prakhar Prasad

About this book

Web penetration testing is a growing, fast-moving, and absolutely critical field in information security. This book walks through modern web application attacks and cutting-edge hacking techniques to build an enhanced knowledge of web application security.

We will cover web hacking techniques so you can explore the attack vectors used during penetration tests. The book encompasses the latest technologies, such as OAuth 2.0, Web API testing methodologies, and XML vectors used by hackers. Lesser-discussed attack vectors such as RPO (relative path overwrite), DOM clobbering, and PHP Object Injection are also covered in this book.

We'll explain various old-school techniques in depth, such as XSS, CSRF, reconnaissance, and SQL injection through the ever-dependable SQLMap.

Websites nowadays provide APIs to allow integration with third-party applications, thereby exposing a lot of attack surface. We cover testing of these APIs using real-life examples.

This pragmatic guide will be a great benefit and will help you prepare fully secure applications.

Publication date: October 2016
Publisher: Packt
Pages: 298
ISBN: 9781785284588

 

Chapter 1. Common Security Protocols

This is the first chapter of this book and it covers some basic security protocols and mechanisms. These concepts are necessary to grasp the chapters that follow and will be very useful for understanding web applications as a whole.

We'll start off with the same-origin policy (SOP), a restrictive policy that, in a simple sense, prevents web pages from interfering with each other. Then we'll look at cross-origin resource sharing (CORS), which is relatively new and allows controlled resource sharing across origins. Later on, we'll cover different encoding techniques used in web applications, such as URL or percent encoding, double encoding, and Base64 encoding.

 

SOP


The same-origin policy is a security mechanism found in all common browsers that restricts the way a document or script (or other data) loaded from one origin can communicate with and access the properties of another origin. It's a crucial security concept underpinning web applications of all kinds.

To understand the same-origin policy better, let us consider an example. Imagine that you're logged into your webmail, such as Gmail, in one browser tab. You open a page in another browser tab that has some pieces of JavaScript (JS) that attempts to read your Gmail messages. This is when the same-origin policy kicks in: as soon as an attempt is made to access Gmail from some other domain that is not Gmail then the same-origin policy will prevent this interaction from happening. So, basically, the same-origin policy prevented a random web page which was not a part of Gmail from performing actions on your behalf on an actual Gmail web page.

Allow me to explain more specifically what origin actually means. Origin is determined by the combination of the protocol, the port number, and, most importantly, the hostname of the web page. Note that the path of the page does not matter as long as the rest of the mentioned parts match.

Keep in mind that the same-origin policy is not only for JS but for cookies, AJAX, Flash, and so on. Data stored inside localStorage is also governed by this policy, that is, origin-separated.

The following table exhibits different same-origin policy results based on hostname, port number, and protocol when compared with the origin: http://example.com/meme/derp.html.

URL                                       Result   Explanation
http://example.com/random/derp.html       Pass     Path does not matter
http://example.com/other/meme/derp.html   Pass     Path does not matter
http://www.example.com/meme/derp.html     Fail     Different domain
http://example.com:8081/meme/derp.html    Fail     Different port
ftp://example.com/meme/derp.html          Fail     Different protocol
http://demo.example.com/meme/derp.html    Fail     Different domain
http://packtpub.com/meme/derp.html        Fail     Different domain
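The comparisons in the table can be sketched as a small function (an illustrative approximation only; real browsers apply additional rules, such as the document.domain behavior discussed later in this chapter):

```javascript
// Two URLs share an origin only when their protocol, hostname,
// and port all match; the path is deliberately ignored.
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}

const origin = 'http://example.com/meme/derp.html';
console.log(sameOrigin(origin, 'http://example.com/random/derp.html'));  // → true
console.log(sameOrigin(origin, 'http://example.com:8081/meme/derp.html')); // → false
```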

Demonstration of the same-origin policy in Google Chrome

Now that we've geared up with the basics of the same-origin policy, let me demonstrate an example in which I'll try to violate it and trigger the security mechanism:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>SOP Demo</title>
</head>
<body>
  <iframe src="http://example.com" name="demo"></iframe>

  <script>
  // When the cross-origin iframe finishes loading, try to read its
  // location; the same-origin policy makes this access throw an error.
  document.getElementsByName('demo')[0].onload = function() {
    try {
      console.log(frames[0].location.hostname);
    } catch (e) {
      console.log(e);
    }
  };
  </script>
</body>
</html>

As soon as this code runs inside the Chrome browser, it logs the following security error to the console:

I ran the script from output.jsbin.com and Chrome's same-origin policy effectively kicked in and prevented output.jsbin.com from accessing the contents of the example.com iframe.

Switching origins

JS provides a way to change origins if certain conditions are met. The document.domain property allows the origin of the current page to be changed to a different origin; for example, origin A can switch to origin B. This only works if the current page is a subdomain of the target domain.

Let me explain the mentioned concept with an example. Consider a page running under example.com, which has two iframes, abc.example.com and xyz.example.com. If either of these iframes issues document.domain = 'example.com' then further same-origin checks will be based on example.com. However, as I mentioned, a page can't misuse this functionality to impersonate a completely different domain. So malicious.com cannot set its origin to bankofamerica.com and access its data:

This screenshot shows the error thrown by the Google Chrome browser when example.com attempts to impersonate bankofamerica.com by changing its document.domain property.

Quirks with Internet Explorer

As expected, Microsoft Internet Explorer (IE) has its own exceptions to the same-origin policy; it skips the policy checks if the following situations are encountered:

  • IE skips the origin check if the origin falls under the Trust Zone, for example, internal corporate websites.

  • IE doesn't give any importance to port numbers, so http://example.com:8081 and http://example.com:8000 will be considered the same origin; this won't be true for other browsers. Separately, browser bugs can also lead to SOP bypasses; one such example is an SOP bypass in Firefox abusing the PDF reader – https://www.mozilla.org/en-US/security/advisories/mfsa2015-78/.

Cross-domain messaging

Sometimes, there exists a need to communicate across different origins. For a long time, exchanging messages between different domains was restricted by the same-origin policy. Cross-domain messaging (CDM) was introduced with HTML5; it provides the postMessage() method, which allows sending messages or data across different origins.

Suppose there is an origin A on www.example.com which, using postMessage(), can pass messages to origin B at www.prakharprasad.com.

The postMessage() method accepts two parameters:

  • message: This is the data that has to be passed to the receiving window

  • targetOrigin: The origin (scheme, hostname, and port) of the intended receiving window

Sending a postMessage():

receiver.postMessage('Hello', 'http://example.com'); // 'receiver' refers to the target window, e.g. an iframe's contentWindow

Receiving a postMessage():

window.addEventListener('message', function(event) {
  // Always verify the sender's origin before trusting the data
  if (event.origin !== 'http://sender.com') return;
  console.log('Received: ' + event.data, event);
}, false);

AJAX and the same-origin policy

As of today, all interactive web applications make use of AJAX, which is a powerful technique that allows the browser to silently exchange data with the server without reloading the page. A very common example of AJAX in use is different online chat applications or functionality, such as Facebook Chat or Google Hangouts.

AJAX works using the XMLHttpRequest object in JS. This allows a URL to be loaded without issuing a page refresh, as mentioned. This works pretty decently until the same-origin policy is encountered; fetching or sending data to a server or URL at a different origin is a different story altogether. Let us attempt to load the home page of packtpub.com from a web page located at output.jsbin.com through an XMLHttpRequest call. We'll use the following code:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>AJAX</title>
</head>
<body>
  <script>
    var request = new XMLHttpRequest();
    request.open('GET', 'http://packtpub.com', true);
    request.send();
  </script>
</body>
</html>

As soon as this code runs, we get the following security error inside the Google Chrome browser:

This error looks interesting as it mentions the 'Access-Control-Allow-Origin' header and tells us that packtpub.com does not send this header, hence the cross-domain XMLHttpRequest is dropped as per security enforcement. This restriction matters: without it, a web page running at origin A could send an HTTP request to origin B with the user's cookies, read the response page (which may include Cross-Site Request Forgery (CSRF) tokens), and then use those tokens to mount a CSRF attack.

So the same-origin policy basically makes calling separate origin documents through AJAX functions a problem. However, in the next section, we'll attempt to dig deeper into this.

 

CORS


CORS allows cross-domain HTTP data exchange, which means a page running at origin A can send data to and receive data from a server at origin B. CORS is widely used in web applications to load web fonts, CSS, documents, and so on from origins other than the one where the page itself is hosted. Most content delivery networks (CDNs), which provide resource-hosting functionality, typically allow any website or origin to interact with them.

CORS works through HTTP headers that let the web server declare a whitelist of domains that are allowed to connect and interact with the server. This, too, is browser enforced: the browser reads the headers and processes the request accordingly.

The following flow chart shows the CORS flow at different positions:

CORS flowchart diagram (Source: https://www.soasta.com)

CORS headers

There are fewer than a dozen HTTP headers related to CORS, but I'll try to explain a few commonly used ones:

  • Access-Control-Allow-Origin: This is a response header. As soon as a request is made to the server for exchanging data, the server responds with this header, which tells the browser whether the origin of the request is listed in its value. If the header is absent, or the requesting origin is not contained in its value, the request is dropped and a security error is raised (as seen earlier in the last section); otherwise the request is processed.

    Example: Access-Control-Allow-Origin: http://api.example.com

  • Access-Control-Allow-Methods: This is another response header. The server responds with this header to instruct the browser which HTTP methods are allowed. If the server only allows GET, then a POST request will be dropped because it is not mentioned in this list.

    Example: Access-Control-Allow-Methods: GET

  • Origin: This is a request header which tells the server from which domain origin the request was attempted. The origin header is always sent alongside cross-domain requests.

    Example: Origin: http://example.com
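To make the server side of this concrete, here is a minimal sketch of the decision behind Access-Control-Allow-Origin; the whitelist contents and function name are hypothetical:

```javascript
// Hypothetical whitelist of origins the server trusts.
const whitelist = ['http://api.example.com', 'http://app.example.com'];

// Echo the request's Origin back only if it is whitelisted; when no
// CORS header is returned, the browser blocks the cross-origin read.
function corsHeaders(requestOrigin) {
  return whitelist.includes(requestOrigin)
    ? { 'Access-Control-Allow-Origin': requestOrigin }
    : {};
}

console.log(corsHeaders('http://api.example.com'));
console.log(corsHeaders('http://evil.example.net')); // → {}
```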

Pre-flight request

A pre-flight request is just a normal HTTP request that happens before the actual cross-domain communication. The logic behind this is to ensure the client and server are fully compatible (protocol, security, and so on) with each other before the data is actually exchanged. If they are not, then the relevant error is raised.

Please keep in mind that a pre-flight request is only triggered if:

  • Custom (non-safelisted) HTTP headers are sent

  • The body MIME type is something other than text/plain, application/x-www-form-urlencoded, or multipart/form-data

  • The HTTP method is something other than GET, POST, or HEAD
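The conditions above can be expressed as a small check (a simplification; the full CORS specification also safelists a handful of request headers):

```javascript
// Decide whether a cross-origin request triggers a pre-flight request.
const SIMPLE_METHODS = ['GET', 'POST', 'HEAD'];
const SIMPLE_CONTENT_TYPES = [
  'text/plain',
  'application/x-www-form-urlencoded',
  'multipart/form-data',
];

function needsPreflight(method, contentType, customHeaders = []) {
  return !SIMPLE_METHODS.includes(method) ||
         customHeaders.length > 0 ||
         (contentType != null && !SIMPLE_CONTENT_TYPES.includes(contentType));
}

console.log(needsPreflight('GET', null));               // → false
console.log(needsPreflight('PUT', 'application/json')); // → true
```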

The following is a typical pre-flight request-response pair:

Request:

OPTIONS / HTTP/1.1
Origin: http://api.user.com
Access-Control-Request-Method: PUT
Host: api.example.com
Accept-Language: en-US
Connection: keep-alive
User-Agent: Browser

Response:

HTTP/1.1 204 No Content 
Access-Control-Allow-Origin: http://api.user.com
Access-Control-Allow-Methods: GET, POST, PUT
Content-Type: text/html; charset=utf-8

Simple request

A simple CORS request is similar to a pre-flight request without the initial capability exchange sequence occurring. In a typical simple CORS request, the following sequence happens:

Origin A (requesting page): http://example.com

Origin B (resource server): http://cdn.prakharprasad.com

  1. Origin A attempts to access the home page of a CDN running at origin B, http://cdn.prakharprasad.com, using CORS.

  2. Origin A sends a GET request to the Origin B web server.

  3. The Origin B server responds with an Access-Control-Allow-Origin header; if it permits origin A, the browser lets the page read the response.

 

URL encoding – percent encoding


In this section, I'll explain percent encoding, which is a commonly used encoding technique to encode URLs.

URL encoding is a way in which certain characters are encoded or substituted by % followed by the hexadecimal equivalent of the character. Developers often use encoding because there are certain cases when an intended character or representation is sent to the server but, when received, the character changes or gets misinterpreted because of transport issues. Certain protocols such as OAuth also require some of their parameters, such as redirect_uri, to be percent encoded to make them distinct from the rest of the URL.

Example: < is represented as %3C in percent-encoded form.

URL encoding is done typically on URI characters that are defined in RFC 3986. The RFC mentions that the characters must be separated into two different sets: reserved characters and unreserved characters.

Reserved characters have special meanings in the context of URLs and must be encoded into another form, the percent-encoded form, to avoid any sort of ambiguity. A classic example of such ambiguity is /, which is used to separate paths in a URL; if the necessity arises to transmit a literal / character in a URL, then we must encode it accordingly, so that the receiver or parser of the URL does not get confused and parse the URL incorrectly. In that case, / is encoded into %2F, which will be decoded back into / by the URL parser.
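In JS, encodeURIComponent() performs exactly this substitution on reserved characters:

```javascript
// Reserved characters are replaced by '%' plus their hex code,
// so a literal '/' inside a value survives URL parsing intact.
console.log(encodeURIComponent('/'));     // → %2F
console.log(encodeURIComponent('a/b'));   // → a%2Fb
console.log(decodeURIComponent('a%2Fb')); // → a/b
```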

Unreserved characters

The following characters are not encoded as part of the URL encoding technique:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 - _ . ~

Reserved characters

The following characters are encoded as part of the URL encoding technique:

!	*	'	(	)	;	:	@	&	=	+	$	,	/	?	#	[	]

Encoding table

The following is a list of characters with their encoded form:

Character   Encoded
:           %3A
/           %2F
#           %23
?           %3F
&           %26
@           %40
%           %25
+           %2B
<space>     %20
;           %3B
=           %3D
$           %24
,           %2C
<           %3C
>           %3E
^           %5E
`           %60
\           %5C
[           %5B
]           %5D
{           %7B
}           %7D
|           %7C
"           %22

Encoding unreserved characters

Although the percent encoding technique typically encodes reserved characters, it is also possible to encode unreserved characters by providing the equivalent ASCII hexadecimal code for the character, preceded by %.

For example, if we had to encode A into percent encoding, we can simply write %41; here, 41 is hexadecimal for decimal 65, which, in turn, is the ASCII code for capital A.
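Standard encoders such as encodeURIComponent() leave unreserved characters alone, but the substitution can be done by hand; the helper name below is my own:

```javascript
// Percent-encode any single character: '%' plus the two-digit
// hex form of its code point; works for unreserved characters too.
function percentEncodeChar(ch) {
  return '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0');
}

console.log(percentEncodeChar('A'));    // → %41
console.log(decodeURIComponent('%41')); // → A
```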

A web-based URL encoder/decoder can be found here:

http://meyerweb.com/eric/tools/dencoder/

 

Double encoding


Double percent encoding is the same as percent encoding, with the twist that each character is encoded twice instead of once. This technique comes in pretty handy when attempting to evade filters that blacklist certain encoded characters: we can double encode instead and let the application decode the input back to the original form. It only works where recursive decoding is done.

It is the same technique that was used in the infamous IIS 5.0 directory traversal exploit in 2001.

Double encoding sometimes works well in Local File Inclusion (LFI) or Remote File Inclusion (RFI) scenarios as well, in which we need to encode our path payload. Typically ../../ or ..\..\ is used to traverse back to the parent directory; some filters detect this and block the attempt. We can utilize the double technique to evade this.

Introducing double encoding

In percent encoding, if we had %3C as our percent-encoded character, it gets decoded into <. In double encoding, the percent-encoded character is encoded again, which means the % prefix gets encoded to %25, followed by the hex code of the original character. So if I had to encode < using double encoding, I'd first encode it into its percent-encoded format, which is %3C, and then percent encode the % character again. The result of this is %253C. Normally, this should be decoded only once, but there are scenarios where the developer makes the mistake of decoding it multiple times, or situations in which this happens by design. This effectively results in filter bypasses, depending on the scenario:

  • Normal URL: http://www.example.com/derp/one/more/time.html

  • Percent encoded: http%3A%2F%2Fwww.example.com%2Fderp%2Fone%2Fmore%2Ftime.html

  • Double encoded: http%253A%252F%252Fwww.example.com%252Fderp%252Fone%252Fmore%252Ftime.html
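Double encoding is simply percent encoding applied twice; in JS this can be demonstrated by nesting encodeURIComponent():

```javascript
// The '%' produced by the first pass becomes '%25' in the second.
const once  = encodeURIComponent('<');  // '%3C'
const twice = encodeURIComponent(once); // '%253C'

console.log(once, twice);
// A recursive decoder recovers the original character:
console.log(decodeURIComponent(decodeURIComponent(twice))); // → <
```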

IIS 5.0 directory traversal code execution – CVE-2001-0333

In 2001, a directory traversal vulnerability in Microsoft's popular IIS 5.0 web server appeared. The vulnerability was critical because it was a zero authentication code execution vulnerability. The vulnerability was due to double decoding of a URL passed into the request.

Microsoft issued security bulletin MS01-026 to address this flaw and also described the vulnerability in their own words. I'll quote the technical advisory published at Microsoft's website:

A vulnerability that could enable an attacker to run operating system commands on an affected server. When IIS receives a user request to run a script or other server-side program, it performs a decoding pass to render the request in a canonical form, then performs security checks on the decoded request. A vulnerability results because a second, superfluous decoding pass is performed after the security checks are completed. If an attacker submitted a specially constructed request, it could be possible for the request to pass the security checks, but then be mapped via the second decoding pass into one that should have been blocked -- specifically, it could enable the request to execute operating system commands or programs outside the virtual folder structure. These would be executed in the security context of the IUSR_machinename account which, by virtue of its membership in the Everyone group, would grant the attacker capabilities similar to those of a non-administrative user interactively logged on at the console.

This excerpt specifically mentions that the vulnerability results because a second, superfluous decoding pass is performed after the security checks are completed. In other words, the IIS server mistakenly decoded the URL twice, allowing an attacker to traverse path names and execute commands through the cmd.exe parser; the code runs with the rights of the IIS web server account.

Whenever IIS was asked to serve a CGI page with ../../ in the path, which goes outside the root directory, the request would be blocked, as it is a clear path traversal outside of the root directory.

For example, the following request will be blocked because it contains ../../ for directory traversal inside the path name.

Normal URL:

http://example.com/scripts/../../winnt/system32/cmd.exe?/c+dir+c:\

Using the superfluous second decoding pass, as Microsoft calls it, we can perform path traversal and execute commands by hitting the command-line parser of Windows.

The following double-encoded URL will bypass the check and execute code under the context of the IIS server account.

Double-encoded URL:

http://example.com/scripts/%252E%252E%252F%252E%252E%252Fwinnt/system32/cmd.exe?/c+dir+c:\

Using double encoding to evade XSS filters

We have covered a directory traversal security check bypass through the double encoding technique. In this section, I'll cover how we can evade some XSS filters or checks that perform double decoding of the input.

Assuming that we have an XSS filter that detects <, >, /, or their percent-encoded forms, we can apply the double encoding technique to our XSS payload, provided our input gets recursively decoded.

Original request with XSS payload (blocked):

http://www.example.com/search.php?q=<script>alert(0)</script>

Percent-encoded XSS payload (blocked):

http://www.example.com/search.php?q=%3Cscript%3Ealert(0)%3C%2Fscript%3E

Double-percent-encoded payload (allowed):

http://www.example.com/search.php?q=%253Cscript%253Ealert(0)%253C%252Fscript%253E

Basically, we can tabulate the encodings that we've just done:

Character   Percent encoded   Double encoded
<           %3C               %253C
>           %3E               %253E
/           %2F               %252F

Before I end this topic, I must say the double encoding technique for bypassing countermeasures is very powerful, provided our requirements (such as recursive decoding) are met. It can also be applied to other attack techniques such as SQL injection.

Double encoding can be further extrapolated into triple encoding and so on. For triple encoding, all we need to do is write %25, then 25, then the hex code; the triple encoding for < will be %25253C.
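This generalizes to a simple loop; each extra pass inserts another 25 after the leading %. The helper name below is my own:

```javascript
// Apply percent encoding n times in a row.
function encodeNTimes(s, n) {
  for (let i = 0; i < n; i++) s = encodeURIComponent(s);
  return s;
}

console.log(encodeNTimes('<', 1)); // → %3C
console.log(encodeNTimes('<', 2)); // → %253C
console.log(encodeNTimes('<', 3)); // → %25253C
```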

 

Base64 encoding


Base64 is an encoding mechanism that was originally made for encoding binary data into a textual format. It was first used in e-mail systems that required binary attachments, such as images and rich-text documents, to be sent in ASCII format.

Base64 is commonly used in websites as well, not for encoding binary data but for obscuring things such as request parameter values, sessions, and so on. You might be aware that security through obscurity is not beneficial in any way; in this case, developers are generally not aware that even a slightly skilled person can decode the hidden value disguised as a Base64 string. Base64 encoding is also used to embed media such as images, fonts, and so on through data URIs.

JS also provides built-in functions for encoding/decoding Base64 strings:

  • btoa(): Encode a string to Base64

  • atob(): Decode a Base64-encoded string
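A quick round trip shows them in action (both functions are available in browsers and in Node.js 16+):

```javascript
// btoa() encodes a string to Base64; atob() decodes it back.
const encoded = btoa('Web Hacking');
console.log(encoded);       // → V2ViIEhhY2tpbmc=
console.log(atob(encoded)); // → Web Hacking
```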

Character set of Base64 encoding

Base64 encoding uses a character set of 64 printable ASCII characters. The following characters are used to encode binary to text:

  • A to Z characters

  • a to z characters

  • 0 to 9 digits

  • + (plus character)

  • / (forward-slash character)

In addition, the = (equals) character is used for padding at the end of the encoded output; it is not part of the 64-character set itself.

The following table is used for indexing the values to their respective Base64 encoding alternatives:

Value  Enc    Value  Enc    Value  Enc    Value  Enc
0      A      16     Q      32     g      48     w
1      B      17     R      33     h      49     x
2      C      18     S      34     i      50     y
3      D      19     T      35     j      51     z
4      E      20     U      36     k      52     0
5      F      21     V      37     l      53     1
6      G      22     W      38     m      54     2
7      H      23     X      39     n      55     3
8      I      24     Y      40     o      56     4
9      J      25     Z      41     p      57     5
10     K      26     a      42     q      58     6
11     L      27     b      43     r      59     7
12     M      28     c      44     s      60     8
13     N      29     d      45     t      61     9
14     O      30     e      46     u      62     +
15     P      31     f      47     v      63     /

The encoding process

The encoding process is as follows:

  1. Binary or non-binary data is read from left to right.

  2. Three consecutive 8-bit blocks of the input are joined to make a 24-bit-long group.

  3. The 24-bit group is divided into four individual 6-bit groups.

  4. Each 6-bit group is converted into the Base64-encoded format using the previous lookup table.

Example:

Let us take the word God. We'll make a table to demonstrate the process more easily:

Alphabet                  G          o          d
8-bit groups              01000111   01101111   01100100
6-bit groups              010001     110110     111101    100100
6-bit values in decimal   17         54         61        36
Base64 lookup             R          2          9         k

Therefore, the Base64 equivalent for God becomes R29k.
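The same steps can be followed in code by packing the three bytes into a 24-bit number and indexing the alphabet with each 6-bit slice; the function name below is my own:

```javascript
// The 64-character Base64 alphabet, indexed 0-63.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encode exactly three input bytes (one 24-bit group) into four
// Base64 characters.
function base64Triple(s) {
  const bits = (s.charCodeAt(0) << 16) | (s.charCodeAt(1) << 8) | s.charCodeAt(2);
  return ALPHABET[(bits >> 18) & 63] + ALPHABET[(bits >> 12) & 63] +
         ALPHABET[(bits >> 6) & 63] + ALPHABET[bits & 63];
}

console.log(base64Triple('God')); // → R29k
```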

However, a problem arises when the character groups do not exactly form the 24-bit pattern. Let me illustrate this. Consider the word PACKT. We cannot divide this word into 24-bit groups equally. Hypothetically speaking, the first 24-bit group is PAC and the second group KT?, where ? signifies a missing 8-bit character. This is where the padding mechanism of Base64 kicks in. I'll explain that in the next section.

Padding in Base64

Wherever a character (8 bits) is missing when forming the 24-bit groups, = is appended in its place, one for every missing character. So, for one missing character, = is used; for two missing characters, == is used:

Input           Output                 Padding   Padding length
Web Hacking     V2ViIEhhY2tpbmc=       =         1
Why God Why ?   V2h5IEdvZCBXaHkgPw==   ==        2
Format          Rm9ybWF0               (none)    0
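The padding length can be predicted from the input length alone (for single-byte/ASCII input); the helper name below is my own:

```javascript
// Padding depends only on input length mod 3:
// 0 leftover bytes → none, 1 leftover → '==', 2 leftover → '='.
function paddingFor(s) {
  return ['', '==', '='][s.length % 3];
}

console.log(paddingFor('Why God Why ?')); // 13 % 3 = 1 → '=='
console.log(paddingFor('Web Hacking'));   // 11 % 3 = 2 → '='
console.log(paddingFor('Format'));        //  6 % 3 = 0 → ''
```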

 

Summary


In this chapter, we've learnt about the same-origin policy, CORS, and different types of encoding mechanisms that are prevalent on the Web. The topics discussed here will be required in later chapters as the need arises. You can fiddle around with other encoding techniques, such as Base32 and ROT13, for your own understanding.

You can read about ROT13 at: http://www.geocachingtoolbox.com/index.php?page=caesarCipher.

In the next chapter, we will learn different reconnaissance techniques, which will enable us to learn more about our target so that we can increase our attack surface.

About the Author

  • Prakhar Prasad

    Prakhar Prasad is a web application security researcher and penetration tester from India. He has been a successful participant in various bug bounty programs and has discovered security flaws on websites such as Google, Facebook, Twitter, PayPal, Slack, and many more. He secured the tenth position worldwide in the year 2014 at HackerOne's platform. He is OSCP and OSWP certified, which are some of the most widely respected certifications in the information security industry. He occasionally performs training and security assessment for various government, non-government, and educational organizations.

