Penetration Testing with the Bash shell

Chapter 1. Getting to Know Bash

The Bourne Again SHell (bash) is arguably one of the most important pieces of software in existence. Without bash shell's many utilities and the problem-solving potential it gives its users by integrating and interfacing system utilities in a programmable way (called bash scripting), many of the very important security-related problems of the modern world would be very tedious to solve. Utilities such as grep, wget, vi, and awk enable their users to do very powerful string processing, data mining, and information management. System administrators, developers, security engineers, and penetration testers all across the world for many years have sworn by its sheer problem-solving potential and effectiveness in enabling them to tackle their day-to-day technical challenges.

Why are we discussing the bash shell? Why is it so popular among system administrators, penetration testers, and developers? There may be other reasons, but fundamentally the bash shell is the most standardized and, on most popular operating systems, is built from a single code base: one official source. This means you can rely on a certain base set of execution behaviors for a bash script or collection of commands regardless of the operating system hosting the bash implementation. By contrast, operating systems commonly ship their own distinct implementations of the Korn Shell (ksh) and other shells.

The only disadvantage, if any, of the Linux or Unix environment that bash is native to is that for most people, especially those accustomed to the Graphical User Interface (GUI), the learning curve may be a little steep. This is mainly because of the way information is represented. The general Linux/Unix culture and conventions can often be difficult for newcomers to appreciate, possibly due to the lack of the tooltips, hints, rich graphical interaction design, and user experience engineering that GUIs often benefit from. This book, and especially this chapter, will introduce some of the witty but brilliant Linux/Unix culture and conventions so that you can get comfortable enough with the bash shell to find your own way around and follow the more advanced topics later on in the book.

Throughout the book, the bash environment or the host operating system that will be discussed will be Kali Linux. Kali Linux is a distribution adapted from Debian, and it is packed with utilities focused purely on technical security problem solving and testing. Because knowing how to wield your terminal is strongly associated with knowing your operating system and its various nuances, this chapter and the following chapters will introduce some topics related to the Kali Linux operating system, its configuration setup, and default behavior to enable you to properly use your terminal utilities.

If you're already a seasoned "basher", feel free to skip this chapter and move on to the more security-focused topics in this book.

Getting help from the man pages

Bash shells typically come bundled with a very useful utility called man files, short for manual files. It's a utility that gives you a standardized format to document the purpose and usage of most of the utilities, libraries, and even system calls available to you in your Unix/Linux environment.

In the following sections, we will frequently make use of the conventions and descriptive style used in man files so that you can comfortably switch over to using the man pages to support what you've learnt in the following sections and chapters.

Using man files is pretty easy; all you need to do is fire off the following command from your terminal:

man [SECTION NUMBER] [MAN PAGE NAME]

In the previous command, [SECTION NUMBER] is the number of the man page section to be referenced and [MAN PAGE NAME] is, well, the name of the man page. Usually, it is the name of the command, system call, or library itself. For example, if you want to look up the man page for the man command itself, you would execute the following command from your terminal:

man 1 man

In the previous command, 1 tells man to use section 1 and the man argument suffixing the command is the name of the man page, which is also the name of the command to which the page is dedicated.

Man page sections are numbered according to a specification of their own. Here's how the numbers are assigned:

1. General commands: You usually use this section to look up information about commands used on the command line. In the previous example, we used it to look up information about the man command itself.
2. System calls: This section documents the arguments and purpose of common system calls provided by the host operating system.
3. C library functions: This section is very useful for C developers and for developers who use languages implemented in C, such as Python. It gives you information about the arguments, defining header files, behavior, and purpose of certain fundamental C library function calls.
4. Special files: This section documents special-purpose files, typically those in the /dev/ directory, for instance, character devices, pseudo terminals, and so on. Try picking a couple of files in the /dev/ directory of your operating system and executing the following command:
```
man 4 [FILENAME]
```
For instance:
```
man 4 pts
man 4 tty
man 4 urandom
```
5. File formats and conventions: This section documents common file formats used to structure information about the system, for instance, logfile formats, the password file format, and so on. These are usually the files that common operating system utilities read or generate.
6. Games and screensavers: This section contains information about games and screensavers.
7. Miscellanea: This section contains information about miscellaneous commands and other information. It is reserved for documentation of anything that does not fit into the other categories.
8. System administration commands and daemons: This section is dedicated to administration commands and information about system daemons.

For a synopsis and full description of these sections, try checking out the intro man files for each of them. You can reach these files by executing the following command for each section number:

man [SECTION NUMBER] intro

I've documented all the man page section numbers and their traditional purpose here. Of course, it is up to developers to uphold these conventions, but generally all you will be interested in is section 1, and if you're going to do some reverse engineering, sections 2, 3, and 4 will also be of great help.

The man page layout is standardized to contain a certain collection of sections. Each section of the man page describes a given property of the command, system call, or library being discussed. The following list explains the purpose of the common sections in a man file:

Name: This is the name of the command, function, system call, or file format.
Synopsis: This is a formal description of the usage specification for the command, system call, or file format. The way usage specifications are written takes a little understanding to appreciate properly. You may notice square brackets in the specification; these are not to be interpreted as literal parts of the command invocation. Instead, they indicate that whatever appears inside the brackets is an optional argument. Also, the "|" character indicates that either the symbols preceding it or those following it can be specified as part of the command invocation, but not both; think of it as an exclusive OR.
Description: This is an informal description and discussion of the man page topic, detailing its purpose and more information about the options and possible arguments mentioned in the Synopsis section.
Examples: This is a collection of examples for the usage of the man page topic.
See also: This is a collection of references, web pages, and other resources containing further information about the topic being discussed.

For more about the Linux manual pages, please see the Further reading section at the end of this chapter.
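
As a small illustration of why the section number matters, the name passwd has both a command page (section 1) and a file format page (section 5). The following sketch assumes your man implementation supports the -w switch, which prints where a page is stored instead of displaying it, as the man-db package on Kali does. The helper name page_location is made up for this example:

```shell
# page_location is a hypothetical helper: it prints where the man page for
# NAME in SECTION is stored, or a short notice if no such page is installed.
page_location() {
    local section=$1 name=$2
    man -w "$section" "$name" 2>/dev/null \
        || echo "no page for $name in section $section"
}

page_location 1 passwd   # the passwd command
page_location 5 passwd   # the /etc/passwd file format
```

On a system with both pages installed, the two calls should print two different file paths, one per section.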

Navigating and searching the filesystem

Navigating and searching the Linux filesystem is one of the most essential skills that developers, system administrators, and penetration testers need to master in order to realize the full potential of their bash consoles and utilities. To properly master this skill, you will need a good understanding of the organization of your host operating system, though a thorough discussion of the Kali Linux operating system's inner workings and organization is beyond the scope of this book.

Navigating a filesystem requires the use of a small collection of tools and utilities. Here's a breakdown of these tools:

Command name	Common name	Purpose
`cd`	Change Directory	This changes your current working directory
`ls`	List	This lists the contents of the current working directory
`pwd`	Print Working Directory	This displays the current working directory
`find`	Find	This locates or verifies the existence of a file based on the values of certain attributes

Navigating directories

Navigating directories is popularly done by using the cd command, which is probably one of the simplest commands to use. All you need to do is supply the directory you wish to change to and cd will do the rest. It also has very useful shorthands to speed up the most common tasks users perform when navigating their filesystems.

The following is what the command usage specification looks like:

cd [ -L | -P ] [directory]

In the syntax specification, [directory] is the directory you wish to change your current working directory to and [-L|-P] may be any one of the following:

-L: When changing directory, symbolic links are followed logically. The current directory path will include the name of the symbolic link and not its target. This is described in the documentation as making the symbolic link logical, since it forces the name of the symbolic link to be treated as a logical element in the path being set as the working directory. This is the default behavior in bash.
Note
Symbolic links are constructs on a filesystem that allow one file or directory to act purely as a reference to another file or directory. These links affect the way path resolution occurs: when a symbolic link is followed, a path written under one name leads to a file or directory that actually lives under another name, rather than resolving strictly as it is written.
-P: This is the opposite of the -L option. It specifies that should the directory being set as the current directory be a symbolic link, it should be resolved completely before being set as the current directory. This means that if you visit a symbolic link, your current path will not reflect the name of the symbolic link you used to reach it, unless of course the link has the same name as its target.

The following is a typical usage example of the cd command:

cd /

The preceding command will change your current directory to the root directory, which is named /; everything hosted on your filesystem is usually reachable from this directory.

The following are some more examples:

cd ~: This command is used to navigate to the current user's home directory
cd ../: This command is used to navigate to the directory directly above the current one

In the preceding command, one can have cd navigate an arbitrary number of directories above the current one, for instance, by supplying it a command as follows:

cd ../../../../../

The following are some other commands that can be used to navigate to different directories:

cd .: This command is used to navigate to the current directory
cd -: This command is used to navigate to the previous directory
cd --: The -- marks the end of the options; with no directory following it, this command navigates to the current user's home directory
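
To try these shorthands without wandering around your real filesystem, the following sketch builds a throwaway directory tree with mktemp. The directory names a and b and the variable names are assumptions made purely for illustration:

```shell
# Build a scratch tree so the example does not depend on any real directory.
base=$(mktemp -d)
mkdir -p "$base/a/b"

cd "$base/a/b"
cd ..                  # up one level: now in a
up_one=$PWD

cd "$base"             # move somewhere else...
cd - > /dev/null       # ...and jump back; cd - also prints the directory
previous=$PWD

cd "$base/a/b"
cd ../..               # up two levels: now in the scratch root
up_two=$PWD

echo "after cd ..:    $up_one"
echo "after cd -:     $previous"
echo "after cd ../..: $up_two"
```

Both of the first two lines should report the a directory, and the last one the scratch root.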

To see whether you have indeed changed your current working directory to the one you've specified, you can invoke the pwd command that will print your working directory. The syntax for the pwd command is as follows:

pwd [-L|-P] [--help] [--version]
pwd [--logical | --physical ]

The -L or --logical and -P or --physical invocation options serve the same purpose as in the cd command.
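
The difference between the logical and physical views is easiest to see with an actual symbolic link. The following is a minimal sketch that assumes a scratch directory containing a directory called real and a link to it called link; both names are invented for this example:

```shell
# Scratch area: a real directory and a symbolic link that points at it.
# pwd -P is used up front so that links in the temp path itself do not
# confuse the comparison.
base=$(cd "$(mktemp -d)" && pwd -P)
mkdir "$base/real"
ln -s "$base/real" "$base/link"

cd -L "$base/link"
logical=$(pwd -L)     # keeps the name of the link
physical=$(pwd -P)    # resolves the link to its target

cd -P "$base/link"
after_p=$(pwd)        # cd -P resolved the link before changing directory

echo "cd -L then pwd -L: $logical"
echo "cd -L then pwd -P: $physical"
echo "cd -P then pwd:    $after_p"
```

The first line should end in link, while the other two should end in real.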

Listing directory contents

It's not enough to just move between directories. You will eventually want to find out what's inside these directories. You can do this by using the ls command.

The following is the usage specification for the ls command—adapted from its man page:

ls [-aAlbBCdDfFghHiIklLmNopqQrRsStTuvwxXZ1] [FILE/DIRECTORY]

The previous command specification is another popular Linux/Unix convention. It's a shorthand to specify that any of the letters appearing in the brackets can be specified as part of the command invocation. Also, any number of them may be specified at the same time. For instance, consider the following commands:

ls -Ham
ls -and
ls -Rotti

According to the command specification, they are all acceptable ways to use the ls command. Whether or not any of these will actually do something useful depends on how each switch affects the ls command's behavior. As a general note when reading usage specifications such as the one for ls, keep in mind that some options may have opposing effects and that certain combinations may have no effect.

The [FILE] or [DIRECTORY] argument would be any path or file at which you wish to fire ls. Without any arguments, ls will list the current working directory's entries.

Note

Switch is popular jargon for an option, that is, anything directly following the hyphen that is specified as part of the command invocation. For example, -l is a switch.

Here's what some of the switches do—we will only discuss some of the most important switches here for the sake of brevity. Keep in mind that the ls command lists directory contents, so all its options will be focused on organizing and presenting a given directory's contents in a specified way.

The following are some of the ls command's invocation options:

-a, --all: This displays all the directory entries and does not omit directories or files whose names start with ".".
-d, --directory: This lists directories themselves and not their contents. This will also force ls not to dereference symbolic links.
-h: This prints sizes in human-readable format, for instance, instead of the number of bytes only it will display file sizes in gigabytes, kilobytes, or megabytes where applicable.
-i: This prints the inode number of each file.
Note
Inodes or i-nodes are data structures assigned to files that represent detailed information about their access rights, access times, sizes, owners, and the location of the file on the actual block devices—the physical medium hosting the file—as well as other important housekeeping-orientated details.
-l: This lists the entries in long format.
-R, --recursive: This recursively lists directory contents. This tells ls to descend through all the levels of the specified path and enumerate all the reachable file paths, instead of stopping once the specified directory is listed, as is the default.
-S: This lists the entries sorted by file size.
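-X: This sorts entries alphabetically by extension, for example, all MP3s before PDFs. Note the uppercase X; the lowercase -x switch instead lists entries across rows rather than down columns.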

The following are some examples of these options in action. For instance, if you'd like to sort a bunch of files by their size while displaying human-readable file sizes along with all the access rights and modification times, which seems like a lot of work, you would run the following command:

ls -alSh

Your output could look something like the following screenshot:

Another very useful example would be checking the volume of logins to the system. This can be done by looking at the output of the following command:

ls -alSh /var/log/auth*

Generally, keeping track of the contents of the /var/log/ directory will always be a good way to grab a good synopsis of the activity on a system.
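
If you'd like to see what a few of these switches do on files whose sizes you control, the following sketch sets up a scratch directory. The filenames a_small.txt, b_big.txt, and .hidden are made up for the example, and the exact long-format output will differ from system to system:

```shell
# Scratch directory with a hidden file and two files of different sizes.
dir=$(mktemp -d)
printf 'x' > "$dir/a_small.txt"          # 1 byte
printf '%1000s' ' ' > "$dir/b_big.txt"   # 1000 bytes
: > "$dir/.hidden"                       # empty dot file

plain=$(ls "$dir")        # dot files are omitted by default
all=$(ls -a "$dir")       # -a also shows ., .., and .hidden
by_size=$(ls -S "$dir")   # -S sorts largest first, so b_big.txt leads
self=$(ls -d "$dir")      # -d lists the directory itself, not its contents

echo "$by_size"
ls -lSh "$dir"            # combined switches: long format, by size, readable sizes
```

Note how the default listing is alphabetical, while -S puts b_big.txt ahead of a_small.txt.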

Searching the filesystem

Another important skill is being able to find resources on your filesystem in a compact yet powerful way. One of the ways you can do this is by using the aptly named find command. The following is the usage specification for find:

find [-H] [-L] [-P] [-D debugopts] [-Olevel] [path...] [expression]

You can find out more about the find command by checking out the man file on it. This can be done by executing the following command:

man 1 find

This was discussed in the Getting help from the man pages section earlier in this chapter.

Moving on, the first three switches, namely, -H, -L, and -P, all control the way symbolic links are treated. The following list tells what they do:

-H: This tells find not to follow symbolic links, except for those named on the command line. Symbolic links encountered during the search are treated as normal files and are not resolved to their targets. Put simply, if a directory contains a symbolic link, the symbolic link is treated like any other file, but a symbolic link given as one of the starting paths is resolved.
-L: This forces find to follow symbolic links in the directories being processed.
-P: This forces find to treat symbolic links as normal files. If a symbolic link is encountered during execution, find will inspect the properties of the symbolic link itself and not its target. This is the default behavior.

The -D switch makes find print debug information, which helps if you need to know a little about what find is up to while it's searching for the files you want. -Olevel (an uppercase letter O, not a zero) controls how find optimizes the search by reordering tests. The level part can be specified as any number between 0 and 3 (inclusive).

The [path...] part of the argument is used to tell find where to look for files. You can also use the . and .. shorthands to specify the current directory and the directory one level up respectively, as with the cd command.

The next argument, or rather group of arguments, is quite an important one: the [expression]. It consists of all the arguments that control the following:

Options: This tells find how to carry out the search as a whole, for instance, how deep to descend into directories
Tests: This tells how to identify the files it is looking for
Actions: This tells what find should do with the files once they are found

The following is the structural breakdown of the find expression:

[expression] := [options][[tests][OPERATOR][tests][OPERATOR]...][actions]

[options] :=  [-d][-daystart][-depth][-follow][-help]...
[tests] := [-amin n][-atime n][-cmin n][-cnewer file]...
[OPERATOR] := [()][!][-not][-a][-and][-or]...
[actions] := [-delete][-exec command [;|{} +]][-execdir command]...

Note

The previous code only serves as information about the structure of the expression, to let you know which options go where. Many of the switches for each section have been omitted for brevity. The := characters mean that whatever is on the left-hand side is defined by whatever is defined on the right-hand side.

So now that you know where everything goes, let's look at what some of these arguments do. The find command has quite a number of very powerful options and operational modes, and one could quite literally write an entire book about find itself. So to make sure you don't get shortchanged (buying a book about "command line hacking" and instead learning only about find), we will only discuss some of the most common options and arguments penetration testers, system administrators, and developers use. The rest of the find command's power can be learned from the Linux manual files.

The following is a summary of some of the find command's possible arguments for options, tests, and actions.

Directory traversal options

The following are some of the option arguments you can use with find:

-maxdepth n: This specifies that tests must only be applied to entries at most n levels below the starting path. This option is useful for keeping a search shallow. For instance, if the files you're interested in sit near the top of a directory tree and the deeper levels hold only uninteresting files, such as the contents of nested lib directories, you can skip those levels by specifying this option.
-mindepth n: This specifies that tests should only be applied to files at depth of at least n directories lower than the specified path.
-daystart: This forces any -amin, -atime, -cmin, -ctime, or equivalent time-related tests to measure time from the beginning of the current day, rather than from 24 hours ago, as is the default behavior.
-mount: This forbids find from traveling into other filesystems.
Note
The find command allows you to specify numeric arguments using convenient shorthands to indicate a "greater than" or "less than" type comparison with the specified value:
+n: This indicates the attribute must be greater than n
-n: This indicates the attribute must be less than n
n: This forces find to compare n as is, and the attribute must have the exact value of n
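
The depth options are easier to reason about with a concrete tree in front of you. The following sketch assumes a scratch tree with one file at each of three depths (the names top.txt, mid.txt, and deep.txt are invented) and uses the -type f test, which is not covered in this chapter, to restrict the results to regular files:

```shell
# Scratch tree: top.txt at depth 1, mid.txt at depth 2, deep.txt at depth 3.
root=$(mktemp -d)
mkdir -p "$root/one/two"
touch "$root/top.txt" "$root/one/mid.txt" "$root/one/two/deep.txt"

cd "$root"
shallow=$(find . -maxdepth 1 -type f | sort)            # only ./top.txt
deeper=$(find . -mindepth 2 -type f | sort)             # mid.txt and deep.txt
band=$(find . -mindepth 2 -maxdepth 2 -type f | sort)   # depth 2 only: mid.txt

printf 'maxdepth 1:\n%s\n' "$shallow"
printf 'mindepth 2:\n%s\n' "$deeper"
printf 'depth 2 only:\n%s\n' "$band"
```

Combining -mindepth and -maxdepth with the same value is a handy way to inspect exactly one level of a directory tree.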

File testing options

Tests are applied to a file and either return true or false: either the file being tested has the desired attribute or it doesn't. More than one test can also be supplied, in which case a logical combination, which can also be specified, is applied. By default, if no Boolean operator is supplied to combine two tests, a logical AND is assumed. This means both tests must be true for the file to be found or reported. The following are some of the file testing options:

-amin n: This specifies that the last access time of the file should be n minutes ago. For example:
- -amin 20: This means the file must have been accessed exactly 20 minutes ago
- -amin +35: This means the file must have been accessed more than 35 minutes ago
-atime n: This specifies that the file should have been accessed n*24 hours ago, meaning n days. Any fractional part of this number is ignored.
-mmin n: This specifies that the file should have been modified n minutes ago.
-mtime n: This is the same as -atime, except it matches against the file's modification time.
-executable | -readable | -writable: This matches any file that has access rights indicating that the file is executable, readable, or writable, respectively.
-perm mode: This specifies that the file's permission bits should match mode. The -perm option offers a myriad of different ways to specify the access mode being tested; here's how it works.
Note
The access mode bits can be prefixed with any one of the following:
- mode: This means no prefix and the mode must be matched exactly.
- -mode: This means the file's mode must have at least the specified bits set. This will match files with other bits set as long as the specified bits are set as well.
- /mode: This means that any of the specified bits must be set for the file.
The mode itself can also be specified in two different ways: symbolically, using characters to indicate user types and access modes, or as an octal mode specification.
-iname nAmE: This specifies that the name of the file should match nAmE if the case is ignored; in other words, case-insensitive name matching.
-regex pattern: This matches the specified pattern as a regular expression against the file's pathname. Your regular expression must describe the entire pathname.
Note
Regular expressions are merely ways to describe a set of strings with a specified number of properties in common. If you want to describe a string, you must be able to detail all the properties of the string from beginning to end. If you fail to describe even a single character in some way, the regular expression won't match!
Regular expressions are in themselves a language, with a syntax and rules of their own. This means you will need to know how to speak this language in order to use regular expressions properly. To find out how to do this, see the Further reading section at the end of this chapter.
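
Before moving on, here is a minimal sketch of a few of these tests at work. It assumes three scratch files with deliberately different names and permissions (all invented for the example) and again uses -type f to limit results to regular files:

```shell
root=$(mktemp -d)
touch "$root/Notes.TXT" "$root/script.sh" "$root/data.csv"
chmod 600 "$root/Notes.TXT"   # owner read/write only
chmod 755 "$root/script.sh"   # owner rwx, everyone else r-x
chmod 644 "$root/data.csv"    # owner rw, everyone else r

cd "$root"
by_name=$(find . -type f -iname 'notes.txt')   # case-insensitive name match
exact=$(find . -type f -perm 600)              # mode must be exactly 600
at_least=$(find . -type f -perm -111)          # all three execute bits set
recent=$(find . -type f -mmin -5 | sort)       # modified less than 5 minutes ago

# Two tests joined with an explicit OR; the parentheses are escaped so the
# shell hands them to find untouched.
either=$(find . -type f \( -iname '*.csv' -o -perm 755 \) | sort)

echo "iname:     $by_name"
echo "perm 600:  $exact"
echo "perm -111: $at_least"
printf 'csv or 755:\n%s\n' "$either"
```

Since the files were just created, the -mmin -5 test should report all three of them.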

The following are a few simple examples of the -regex option's usage:

Find all the files directly under the /etc/ directory whose names start with the letter p followed by any number of lowercase letters using the following command:
```
find / -regex '^/etc/p[a-z]*$'
```
Find all the files on the filesystem that are called config, ignoring case, and accommodating abbreviations such as confg, cnfg, and cnfig using the following command:
```
find / -regex '^[/a-z_]*[cC]+[Oo]*[nN]+[fF]+[iI]*[gG]+$'
```
See the following screenshot for a practical example of the previous command:

The regular expression used here must describe the entire file's path! For instance, consider the difference in results between the following two regular expressions:

find / -regex '^[/a-z_]*/$' #matches only the / directory
find / -regex '^[/a-z_]*/*$' #matches every path made up only of lowercase letters, underscores, and slashes

Tip

Bash script comments

Any text fed to the bash interpreter that is preceded by a hash character is considered a comment, and it will not be interpreted.
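
To convince yourself that -regex really is matched against the whole path, you can try it on a scratch tree rather than on /. The following sketch assumes a made-up etc directory holding three empty files:

```shell
root=$(mktemp -d)
mkdir "$root/etc"
touch "$root/etc/passwd" "$root/etc/profile" "$root/etc/hosts"

cd "$root"
# Describes only the filename, so no complete path can match it.
partial=$(find . -regex 'p[a-z]*')

# Describes the path from the starting point down to the filename.
whole=$(find . -regex '\./etc/p[a-z]*' | sort)

echo "filename-only pattern matched: ${partial:-nothing}"
printf 'whole-path pattern matched:\n%s\n' "$whole"
```

The first pattern finds nothing, even though two filenames start with p, because find . reports paths such as ./etc/passwd and the pattern does not account for the leading ./etc/ part.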

File action options

The following are some of the action arguments you can use with find:

-delete: This action forces find to delete any file for which the specified test returns true. For instance, consider the following command:
```
find / -regex '^/[a-z_\-]*/[Vv][iI][rR][uU][sS]$' -delete
```
This command will find and delete anything named virus, ignoring case, that sits in a directory directly under the root.
-exec: This allows you to specify an arbitrary command to execute on all files that match.
The way this argument works is to build a command line, which is probably passed to some exec* type system call, for every result of the find operation. The find command will use any argument after the -exec switch as a literal argument to the command being executed and any instance of the {} characters as a placeholder for the name of the file, until a ; character is encountered. The ; must be escaped or quoted so that the shell passes it on to find instead of interpreting it.
For instance, consider the following use of the -exec argument:
```
find /etc/ -maxdepth 1 -name passwd -exec stat {} \;
```
The actual command line(s) that will be run will look something like the following command, since the only file that will match will be /etc/passwd:
```
stat /etc/passwd
```
See the following screenshot for a comparison of the stat and find -exec commands:
-execdir: This works the same way -exec does, except it runs the specified command from the directory containing the matched file. This works great if you'd like to execute commands based on the contents of a directory that has certain files. For instance, you may want to edit all the .bashrc files for users that don't have .vimrc, which is a configuration script for the VIM text editor. We will discuss more about the .bashrc code later.
-print0: This prints the file's full name to standard output. This argument also has the added benefit of terminating filenames with a NULL character, or 0x0 character, so as to allow filenames to contain newlines. It also helps make sure that any program interpreting the output of find will be able to determine the separation between filenames, as they will be strictly separated by NULL characters.
Note
NULL characters are traditionally used to mark the end of a character string. The NULL character itself is represented at memory level as a 0 value so that compilers and operating systems can clearly recognize the delimitation between strings appearing in memory.
-ls: This lists the current file in the format produced by ls -dils, and the output is printed to standard output. The -d switch lists a directory itself rather than its contents, -i prints the inode number, -l uses the long listing format, and -s prints the allocated size of the file in blocks.
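
The following sketch shows -exec and -print0 side by side. It assumes two scratch files, one of which has a space in its name, and pairs -print0 with xargs -0, a commonly used companion that reads NULL-separated names; neither the filenames nor the use of xargs comes from the discussion above:

```shell
root=$(mktemp -d)
touch "$root/plain.log" "$root/with space.log"

cd "$root"
# -exec ... \; runs the command once per match; {} is replaced by the path.
counted=$(find . -name '*.log' -exec echo "found:" {} \; | wc -l)

# -print0 separates names with NULL characters, and xargs -0 splits on them,
# so "with space.log" reaches ls as one argument instead of two.
listed=$(find . -name '*.log' -print0 | xargs -0 ls -d | sort)

echo "matches: $counted"
echo "$listed"
```

Without the NULL separators, a tool reading the names from a pipe would have no reliable way to tell where one filename ends and the next begins.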

There are a couple more actions you can specify. For the rest of them, please see the manual file on the find command, which you can access using the man find command.

So as far as searching your filesystem for files, directories, or generally any other interesting things, that's pretty much it. The next fundamental skill you'll need to master is redirecting output from one command to another.

Using I/O redirection

I/O redirection is one of the easiest things to master when it comes to bash scripting. It's as simple as knowing where you want your input to go and where it's coming from. It may seem like this is not a very interesting topic and you might not see why you need to know this, but redirecting output, if you truly get to understand what it's all about, will be what you're doing on your command line almost 80 percent of the time! It's essentially the one thing that allows you to combine different utilities and have them work together quite effectively on the command line in a compact and simple way. For instance, you may want to search through the output from nmap or tcpdump or a key-logger by feeding its output to another file or program to analyze.

Redirecting output

To redirect the output of a program invoked from the command line into a file, all you need to do is add a > symbol at the end of the command line for the said program and follow it with a filename.

For instance, using the most recent example, if you want to redirect the output of the find command to a file named something like writeable-files.txt, this is how it would be done:

find / -writable > writeable-files.txt

There is one small detail about this kind of I/O redirection though, as with many of the common bash shorthands: there's usually quite a bit going on under the hood. If used as demonstrated previously, the only output that will actually appear in the chosen file (for the previous example it is writeable-files.txt) would be the output actually printed to the standard output file, commonly referred to as file descriptor 1, which is the default destination for normal output. Anything printed to standard error, file descriptor 2, will still appear on the terminal.

Note

File descriptors are constructs in operating systems that represent access to an actual section of the physical storage mechanism or a file. File descriptors are nothing more than numbers that are associated to other data structures managed by the kernel that represent open files. Each process has its own "private" set of file descriptors.

Whenever you open a file using a text editor or generally perform any editing of a resource stored on a physical medium, a file descriptor representing the involved file is passed to the kernel through a system call. The kernel then uses this number to look up other details about the file in a data structure only the kernel should have access to.

The file descriptor's primary purpose is to help abstract and logically isolate details about the actual process involved with accessing the storage mechanism. After all, reading and writing to files is quite an essential operation to computer systems and it would be quite tedious—and error-prone—to do many things if writing to a file meant accommodating actions such as spinning/stopping the hard drive disk, interpreting different filesystems' organization, and handling read/write errors!

Output destined for or coming from any file descriptor can be redirected, provided that you have the correct access rights from your bash shell! Here's the code to do that:

[command line] a>&b > [output file]

In the previous command, a and b are both file descriptors: a>&b makes file descriptor a write to wherever file descriptor b currently points. If a is not explicitly set, it defaults to 1, which is standard output.

What about output destined for the standard error file? How do you redirect that? Well as it turns out this is pretty easy too, and here's the code to do it:

[command] 2> [output file]

As you can see in the previous example, we specified the redirection symbol as 2>, which simply means the following:

Redirect everything from file descriptor 2, which is standard error, to [output file].
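
A quick way to see the two streams being separated is to use a command that writes to both. In the following sketch, noisy is a made-up function standing in for a real tool, and the filenames are likewise assumptions:

```shell
work=$(mktemp -d)

# noisy is a hypothetical stand-in for a real tool: it writes one line to
# standard output and one line to standard error.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# File descriptor 1 goes to one file, file descriptor 2 to another.
noisy > "$work/out.txt" 2> "$work/err.txt"

echo "out.txt holds: $(cat "$work/out.txt")"
echo "err.txt holds: $(cat "$work/err.txt")"
```

Each file ends up with exactly one line, which shows that > on its own never touches standard error.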

You can also combine or bond the two standard output files, namely send both standard output and standard error to a single file, if there is anything interesting being printed to the standard error output. Redirections are processed from left to right, so the file must be named first. It is done using the following command:

[command line] > [output file] 2>&1

There's also a simpler abbreviation for this and here's what it looks like:

[command line] &> [output file]

This means the following:

Redirect everything from file descriptor 1 (standard output) and file descriptor 2 (standard error) to [output file].
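Because bash processes redirections from left to right, their order changes the result when the two streams are combined. The sketch below, which again uses a hypothetical noisy helper and temporary files, shows both orderings side by side.

```shell
#!/usr/bin/env bash
# Sketch: noisy() and the temporary files are illustrative assumptions.
noisy() {
  echo "result line"
  echo "warning line" >&2
}

both=$(mktemp)
split=$(mktemp)

# fd 1 is pointed at the file first, then fd 2 is pointed at the same place.
noisy > "$both" 2>&1

# Here fd 2 copies the old fd 1 (the capture pipe) before fd 1 moves to the file,
# so the warning escapes the file and ends up in $leaked instead.
leaked=$(noisy 2>&1 > "$split")

both_count=$(grep -c 'line' "$both")    # lines that reached the first file
split_count=$(grep -c 'line' "$split")  # lines that reached the second file
echo "$both_count $split_count $leaked"
rm -f "$both" "$split"
```

With the first ordering both lines land in the file; with the second, only standard output does.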

The previous redirection commands all create the specified file if it does not exist; if it does, the output being redirected will overwrite whatever is currently in the file. What will you do if you'd like to append text to a file? Well, the following command shows how that works:

[command line] [n]>> [filename.txt]

As before, n is an optional file descriptor number that defaults to 1. Bash also accepts &>> [filename.txt], which appends both standard output and standard error to the file.
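The difference between overwriting and appending can be sketched as follows. The temporary log file and the messages are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: the log file and its contents are illustrative placeholders.
logfile=$(mktemp)

echo "first run"  > "$logfile"    # > truncates the file before writing
echo "second run" >> "$logfile"   # >> adds to the end of the existing contents
{ echo "an error message" >&2; } 2>> "$logfile"  # append standard error only

appended_lines=$(grep -c '' "$logfile")  # count every line in the file
last_line=$(tail -n 1 "$logfile")
echo "$appended_lines lines, last: $last_line"
rm -f "$logfile"
```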

Redirecting input

If you can redirect output, you should also be able to redirect input using the following command:

[command line] < [input file]

It's pretty straightforward really: if > means send the output of the left operand to the file on the right, then < means feed the contents of the file on the right to the left operand as its input.

As with output redirection, you can also control which file descriptors you'd like to include in the redirection using the following command:

[command line] [n]< [input file]

In the previous command, [n] is the file descriptor number, as with output redirection; if it is omitted, it defaults to 0, which is standard input. The following are a few examples you can test out on your terminal console:

cat < `tty` > keylogs.txt
The preceding command redirects all the input written to the terminal into the file called keylogs.txt until you press Ctrl + D. It achieves this by getting the path of the current tty device associated with the terminal console using the tty command.
wc -l < /etc/passwd
The preceding command redirects input from the /etc/passwd file, which contains all the usernames and other user account-oriented details, to the wc command, which is used to count lines, words, and bytes. Using the -l switch causes the wc command to count all the lines, or more specifically all the newline characters it encounters, until the end of file (EOF) is reached.
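The following sketch shows input redirection on standard input and on a numbered descriptor. The sample file, its contents, and descriptor 3 are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch: the sample file and descriptor 3 are illustrative choices.
sample=$(mktemp)
printf 'alice\nbob\ncarol\n' > "$sample"

# wc reads the file on standard input; tr strips padding some wc versions add.
line_count=$(wc -l < "$sample" | tr -d ' ')

exec 3< "$sample"        # open the file for reading on fd 3
read -r first_line <&3   # read one line from fd 3 instead of standard input
exec 3<&-                # close fd 3

echo "$line_count lines, first is $first_line"
rm -f "$sample"
```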

Getting to know grep

The Global Regular Expression Print (grep) utility is a staple for all command-line jockeys. The grep utility in its most basic functionality gives its users the ability to run regular expressions on a given input file or stream and prints the matching results. More advanced features of grep allow you to specify which attributes of the matching text you'd like to print, whether you'd like the output colorized, or even how many lines around the matching output you should print. It's packed with many very useful features, and once mastered they become an essential part of any penetration tester, developer, or system administrator's arsenal.

Tip

To properly make use of grep, you will need at least a basic understanding of, and some practice with, regular expressions. Regular expressions will not be covered in their entirety here, though simple examples and basic elements of regular expression language will be covered. For more extensive reading on regular expressions and how they work, see the Further reading section at the end of the chapter.

Regular expression language – a crash course

Regular expressions are merely strings that describe a collection of strings using a special language; in formal language theory terms, any collection or set of strings is termed a language. Being able to wield this language is an invaluable skill. It will help you with many things, from static source code analysis, reverse engineering, and malware fingerprinting to larger vulnerability assessment and exploit development.

The regular expression language supported by grep is filled with useful shorthands to simplify the description of a set of common strings, for instance, describing a string consisting of any decimal number, any lowercase or uppercase alphabetic character or even any printable character. So given that any string or collection of strings must be composed of a collection of smaller strings, if you know how to match or describe any alphabetic character or any decimal number, you should be able to describe anything composed of characters from those character classes. A character class is simply a language composed of length 1 strings from a specific collection of characters.

First of all, we need to define some "control" characters. Given that you will be describing strings using other strings, there needs to be a way to designate special meaning to given characters or substrings in your regular expression. Otherwise, all you'd be able to do is compare one string to another, character by character. The control characters are as follows:

^: The following regular expression must be matched at the beginning of a line, for example, ^this is the start of the line.
$: The preceding regular expression must be matched at the end of a line, for example, this is the end of the line$.
[]: The description of a character class, or a list of characters, is contained within the brackets, and strings that match contain characters in the specified list. Certain character classes can be described using shorthands. We will see some of them throughout the rest of the chapter.
(): This logically groups regular expressions together.
|: This is a logical OR of two regular expressions, for instance, ([expression]) | ([expression]).
?: This matches the preceding regular expression at most once, which makes it optional. For example, keith? will match any string that contains "keit", whether or not it is followed by an "h".
+: This matches the preceding regular expression at least once.
{n}: This matches the preceding regular expression exactly n times.
{n,m}: This matches the preceding regular expression at least n times and at most m times. For example [0-9]{0,10} will match any decimal number containing between 0 and 10 digits.
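The control characters above can be tried out with a short sketch like the following. The sample lines are made up for illustration, and grep is invoked with -E so that ?, +, {n}, and | are treated as extended regular expression operators.

```shell
#!/usr/bin/env bash
# Sketch: the sample lines are invented for illustration.
sample=$(mktemp)
printf 'root:x:0\nkeit\nkeith\nkeithh\nuser42\n' > "$sample"

anchored=$(grep -E '^root' "$sample")           # ^ ties the match to the line start
optional=$(grep -E -c '^keith?$' "$sample")     # h is optional: keit and keith
repeated=$(grep -E -c '^keith+$' "$sample")     # h one or more times: keith, keithh
counted=$(grep -E '[0-9]{2}$' "$sample")        # exactly two digits at the line end
either=$(grep -E -c '^(root|user)' "$sample")   # lines starting with root OR user

echo "$anchored | $optional | $repeated | $counted | $either"
rm -f "$sample"
```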

The following is a small collection of some of the shorthands grep supports as an extended regular expression language:

[:alnum:]: This matches alphanumeric characters, that is, any decimal digit or alphabetical character
[:alpha:]: This matches strictly alphabetical characters, a-z and A-Z
[:digit:]: This strictly matches decimal digits 0-9
[:punct:]: This matches any punctuation character

There are a number of other character class shorthands available; see the manual page for grep for more information. Note that these shorthands are only valid inside a bracket expression, so a single digit is written as [[:digit:]].
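A brief sketch of the character class shorthands in use follows. The sample lines are invented; each shorthand sits inside an outer pair of brackets, which is the form grep expects.

```shell
#!/usr/bin/env bash
# Sketch: the sample lines are invented for illustration.
sample=$(mktemp)
printf 'abc\n123\nabc123\n!?\n' > "$sample"

only_digits=$(grep -E '^[[:digit:]]+$' "$sample")      # lines made of digits only
only_alpha=$(grep -E '^[[:alpha:]]+$' "$sample")       # lines made of letters only
alnum_count=$(grep -E -c '^[[:alnum:]]+$' "$sample")   # letters and/or digits
only_punct=$(grep -E '^[[:punct:]]+$' "$sample")       # punctuation only

echo "$only_digits $only_alpha $alnum_count $only_punct"
rm -f "$sample"
```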

Regular expressions are simply collections of these control characters and character classes. You can combine them in any way you like as long as all the brackets, braces, and parentheses are balanced.

Now that you have some basic background in regular expressions, let's look at the grep utility's usage specification using the following command:

grep [options] PATTERN [file list]
[options] := [matcher selection][matching control][output control][file selection][other]
PATTERN := a pattern used to match with content in the file list.
[matcher selection] := [-E|--extended-regexp][-F|--fixed-strings]...
[matching control] := [-e|--regexp][-f|--file][-i|--ignore-case]... 
[output control] := [-c|--count][-L|--files-without-match]...
[file selection] := [-a | --text][--binary-files=TYPE][--exclude]...
[file list] := [file name] [file name] ... [file name]

Please remember this is a mere summary of the structure of the command and does not mention all possible options. For more information about the grep utility's regular expression syntax, please see the Further reading section at the end of this chapter, as well as the man page for Perl-compatible regular expressions, which can be reached by executing the command man 3 pcresyntax. Kali Linux might not have that man page installed; in that case, you can learn more from the man page on POSIX.2 regular expressions, which you can reach using the command man 7 regex.

Building on this specification, let's look at some of the options in detail.

Regular expression matcher selection options

Part of the invocation of grep requires you to let grep know what method you would like to use to match your pattern with the contents of the file. This is because grep is capable of more than just running regular expressions.

The following are the options for matcher selection:

-E or --extended-regexp: This interprets the PATTERN argument as an extended regular expression
Note
Extended regular expression language is pretty much what everyone uses today, but this wasn't always the case. Way back in Unix's heyday, regular expressions were represented using something called POSIX (Portable Operating System Interface) basic regular expression language. Some years later, Unix developers added some functionality to the regular expression language and a new standard for representing this new, more shorthand-laden language was created called the Extended Regular Expression (ERE) language standard.
-F or --fixed-strings: This tells grep to interpret PATTERN as a list of fixed strings separated by newlines to look for in the given file list
-P or --perl-regexp: This allows grep to interpret PATTERN as a Perl regular expression
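The effect of the matcher selection can be sketched by running the same pattern under -E and -F. The sample lines are invented for illustration; -P is left out because not every grep build includes Perl regular expression support.

```shell
#!/usr/bin/env bash
# Sketch: the sample lines are invented for illustration.
sample=$(mktemp)
printf 'a.c\nabc\naXc\n' > "$sample"

regex_hits=$(grep -E -c 'a.c' "$sample")  # . matches any character
fixed_hits=$(grep -F -c 'a.c' "$sample")  # . is taken literally

# With -F, a newline inside PATTERN separates alternative fixed strings.
multi_hits=$(grep -F -c "$(printf 'abc\naXc')" "$sample")

echo "$regex_hits $fixed_hits $multi_hits"
rm -f "$sample"
```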

Regular expression matching control options

The following options allow you to control a little about how the data being matched should be treated, whether you'd like to match whole words in your input or whole lines or funnel in a number of patterns from a given file.

The following are the options for matching control:

-e PATTERN or --regexp=PATTERN: This uses the PATTERN argument supplied here as the pattern to match against the input files. It is useful for supplying several patterns or a pattern that begins with a hyphen.
The following command is an example of the usage for the preceding option:
```
cat /etc/passwd | grep -e '^root'
```
The preceding example matches the line that starts with the word root.
-f FILE or --file=FILE: This reads a list of patterns, one per line, from the supplied file.
For example, consider a file called patterns.txt containing the following text:
```
^root
^www
^nobody
```
This file can be used with the -f option as follows:
```
grep -f patterns.txt < /etc/passwd
```
-v or --invert-match: This inverts the matching, which means it selects or reports only the lines that don't match.
-w or --word-regexp: This reports lines from the input files that contain a match forming a whole word.
For example, see the output of the following commands:
```
root@kali:~# grep r -w < /etc/passwd

root@kali:~# grep ro -w < /etc/passwd

root@kali:~# grep root -w < /etc/passwd
root:x:0:0:root:/root:/bin/bash
```
As you can see from the previous output, and maybe some of your own testing, the patterns in the first two runs did not match a complete word in the contents of the /etc/passwd file. However, the pattern in the last run does, so it's the only one that actually produces output.
-x or --line-regexp: This reports or prints only the lines from the input file that match the pattern in their entirety.
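The matching control options can be compared on a small, invented passwd-style sample, as sketched below. The file names and entries are assumptions for illustration; -c is used only to keep the output short.

```shell
#!/usr/bin/env bash
# Sketch: the sample entries and pattern file are invented for illustration.
sample=$(mktemp)
patterns=$(mktemp)
printf 'root:x:0:0\nrooted:x:1:1\nwww-data:x:33:33\nnobody:x:65534:65534\n' > "$sample"
printf '^root\n^www\n' > "$patterns"

from_file=$(grep -c -f "$patterns" "$sample")      # any pattern from the file
inverted=$(grep -c -v '^root' "$sample")           # lines NOT starting with root
whole_word=$(grep -c -w 'root' "$sample")          # root as a whole word only
whole_line=$(grep -c -x 'root:x:0:0' "$sample")    # the entire line must match

echo "$from_file $inverted $whole_word $whole_line"
rm -f "$sample" "$patterns"
```

Note that -w rejects rooted because the match is followed by another word character, whereas the colon after root counts as a word boundary.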

Output control options

The grep utility also allows you to control how it reports information about successful matches. You can also specify which attributes of the matches to report on.

The following are some of the output control options:

-c or --count: This doesn't report the matched data; instead, it prints the number of matching lines.
-L or --files-without-match: This prints only the names of files that contain no matches.
-l or --files-with-matches: This prints only the names of files that contain matches.
-m NUM or --max-count=NUM: This stops reading a file after NUM matching lines. The same limit applies when the input comes from standard input or an input redirection.
-o or --only-matching: This prints only the matching parts of the input data, each on a separate line.
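A small sketch of the output control options follows. The two log files and their contents are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: the log files and their contents are invented for illustration.
workdir=$(mktemp -d)
printf 'error: disk error\nok\nerror: net\n' > "$workdir/a.log"
printf 'ok\nok\n' > "$workdir/b.log"

match_lines=$(grep -c 'error' "$workdir/a.log")                     # matching lines
match_parts=$(grep -o 'error' "$workdir/a.log" | grep -c 'error')   # individual matches
first_only=$(grep -m 1 'error' "$workdir/a.log")                    # stop after one line
with_match=$(cd "$workdir" && grep -l 'error' a.log b.log)          # files with a match
without_match=$(cd "$workdir" && grep -L 'error' a.log b.log)       # files without one

echo "$match_lines $match_parts $first_only $with_match $without_match"
rm -rf "$workdir"
```

The first line of a.log contains the word twice, which is why -o reports one more match than -c reports lines.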

File selection options

The following options allow you to specify where the input files should come from and also control some of the attributes of the input data as a whole.

The following are the options for the file selection:

-a or --text: This forces binary files to be processed as text. This allows you to operate grep much like the strings utility, which returns all the printable strings from a given file, with the added benefit of being able to match the strings using regular expressions.
For example:
```
grep 'printf' -m 1 --color --text `which echo`
```
Note
The which command
The which command prints the canonical file path of the supplied argument. Here, it appears in back-ticks so that the bash shell will substitute this command for the value it produces, which effectively means grep will be running through the binary for the echo command.
--binary-files=TYPE: This checks if a file supplied as input is a binary file. If yes, then it treats the file as the specified TYPE, which can be binary (the default), without-match, or text.
-D ACTION or --devices=ACTION: If an input file is a device, FIFO, or socket, this uses ACTION to process it. By default, ACTION is read, which means the device is read as if it were an ordinary file; if ACTION is skip, devices are silently skipped.
--exclude=GLOB: This skips any files whose name matches GLOB; wildcards are honored in the matching.
-r or --recursive: This processes all the files under each directory given on the command line, descending into nested directories. In GNU grep, -R does the same and also follows all symbolic links.
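The file selection options can be sketched on a throwaway directory tree. All file names and contents below are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: the directory tree and file contents are invented for illustration.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/nested"
printf 'password=secret\n'  > "$workdir/src/app.conf"
printf 'password=hunter2\n' > "$workdir/src/nested/db.conf"
printf 'password=old\n'     > "$workdir/src/app.bak"
printf 'header\000marker string\n' > "$workdir/blob.bin"  # NUL byte marks it as binary

# -r walks the tree; -l prints one file name per matching file.
all_files=$(grep -r -l 'password' "$workdir/src" | grep -c '')
no_backups=$(grep -r -l --exclude='*.bak' 'password' "$workdir/src" | grep -c '')

# -a treats the binary file as text, so the matching part can be printed.
text_hit=$(grep -a -o 'marker string' "$workdir/blob.bin")

echo "$all_files $no_backups $text_hit"
rm -rf "$workdir"
```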

Well, that's pretty much it as far as grep goes. Hopefully, you'll be able to make use of these options to find what you're looking for. It takes a little practice and getting used to, but once mastered, grep is an invaluable utility.

