About this book

The most basic interface to a computer—the command line—remains the most flexible and powerful way of processing data and performing and automating various day-to-day tasks.

Command Line Fundamentals begins by exploring the basics, and then focuses on the most common tool, the Bash shell (which is standard on all Linux and iOS systems). As you make your way through the book, you'll explore the traditional Unix command-line programs as implemented by the GNU project. You'll also learn to use redirection and pipelines to assemble these programs to solve complex problems.

By the end of this book, you'll have explored the basics of shell scripting, allowing you to easily and quickly automate tasks.

Publication date:
December 2018
Publisher
Packt
Pages
314
ISBN
9781789807769

 

Introduction to the Command Line

Learning Objectives

By the end of this chapter, you will be able to:

  • Describe the basics of a filesystem
  • Navigate a filesystem with command-line tools
  • Perform file management tasks using the command line
  • Utilize shell history and tab completion to efficiently compose commands
  • Utilize shell-editing shortcuts to efficiently work with the command line
  • Write and use wildcard expressions to manage groups of files and folders

This chapter gives a brief history of the command line, explains filesystems, and describes how to get into and out of the command line.

 

Introduction

Today, with the widespread use of computing devices, graphical user interfaces (GUIs) are all-pervasive and easily learned by almost anyone. However, we should not ignore one of the most powerful tools from a bygone era, which is the command-line interface (CLI).

GUIs and CLIs approach user interaction from different angles. While GUIs emphasize user-friendliness, instant feedback, and visual aesthetics, CLIs target automation and repeatability of tasks, and composition of complicated task workflows that can be executed in one shot. These features result in the command line having widespread utility even today, nearly half a century since its invention. For instance, it is useful for web administrators to administer a web server via a shell command-line interface: instead of running a local CLI on your machine, you remotely control one that is running thousands of miles away, as if it were right in front of you. Similarly, it is useful for developers who create the backends of websites. This role requires them to learn how to use a command line, since they often need to replicate the web server environment on their local machine for development.

Even outside the purely tech-oriented professions, almost everyone works with computers, and automation is a very helpful tool that can save a lot of time and drudgery. The CLI is specifically built to help automate things. Consider the task of a graphic designer, who downloads a hundred images from a website and resizes all of them into a standard size and creates thumbnails; a personnel manager, who takes 20 spreadsheet files with personnel data and converts all names to upper case, checking for duplicates; or a web content creator, who quickly replaces a person's name with another across an entire website's content.

Using a GUI for these tasks would usually be tedious, considering that these tasks may need to be performed on a regular basis. Hence, rather than repeating these manually using specific applications, such as a download manager, photo editor, spreadsheet, and so on, or getting a custom application written, the professional in each case can use the command line to automate these jobs, consequently reducing drudgery, avoiding errors, and freeing the person to engage in the more important aspects of their job. Besides this, every new version of a GUI invalidates a lot of what you learned earlier. Menus change, toolbars look different, things move around, and features get removed or changed. It is often a re-learning exercise filled with frustration. On the other hand, much of what we learn about the command line is almost 100% compatible with the command line of 30 years ago, and will remain so for the foreseeable future. Rarely is a feature added that will invalidate what was valid before.

Everyone should use the command line because it can make life so much easier, but there is an aura of mystery surrounding the command line. Popular depictions of command-line users are stereotypical asocial geniuses. This skewed perception makes people feel it is very arcane, complex, and difficult to learn—as if it were magic and out of the reach of mere mortals. However, just like any other thing in the world, it can be learned incrementally step-by-step, and unlike learning GUI programs, which have no connection to one another, each concept or tool you learn in the command line adds up.

 

Command Line: History, Shells, and Terminology

It is necessary for us to explore a little bit of computing history to fully comprehend the rationale behind why CLIs came into being.

History of the Command Line

At the dawn of the computing age, computers were massive electro-mechanical calculators, with little or no interactivity. Stacks of data and program code in the form of punched cards would be loaded into a system, and after a lengthy execution, punched cards containing the results of the computation would be spit out by the machines.

This was called batch processing (this paradigm is still used in many fields of computing even today). The essence of batch processing is to prepare the complete input dataset and the program code by hand and feed it to the machine in a batch. The computation is queued up for execution, and as soon as it finishes, the output is delivered, following which the next computation in the queue is processed.

As the field progressed, the age of the teletypewriter (TTY) arrived. Computers would take input and produce human—readable output interactively through a typewriter-like device. This was the first time that people sat at a terminal and interacted continuously with the system, looking at results of their computations live.

Eventually, TTYs with paper and mechanical keyboards were replaced by TTYs with text display screens and electronic keyboards. This method of interaction with a computer via a keyboard and text display device is called a command-line interface (CLI), and works as follows:

  1. The system prompts the user to type a sentence (a command line).
  2. The system executes the command, if valid, and prints out the results.
  3. This sequence repeats indefinitely, and the user conducts their work step by step.

In a more generic sense, a CLI is also called a REPL, which stands for Read, Evaluate, Print, Loop, and is defined as follows:

  1. Read an input command from the user.
  2. Evaluate the command.
  3. Print the result.
  4. Loop back to the first step.

The concept of a REPL is seen in many places—even the flight control computer on NASA's 1998 Deep Space 1 mission spacecraft had a REPL controlled from Earth, which allowed scientists to troubleshoot a failure in real-time and prevent the mission from failing.

Command-Line Shells

CLIs that interface with the operating system are called shells. As shells evolved, they went from being able to execute just one command at a time, to multiple commands in sequence, repeat commands multiple times, re-invoke commands from the past, and so on. Most of this evolution happened in the UNIX world, and the UNIX CLI remains up to date the de facto standard.

There are many different CLIs in UNIX itself, which are analogous to different dialects of a language—in other words, the way they interpret commands from the user varies. These CLIs are called shells because they form a shell between the internals of the operating system and the user.

There are several shells that are widely used, such as the Bourne shell, Korn shell, and C shell, to name a few. Shells for other operating systems such as Windows exist too (PowerShell and DOS). In this book, we will learn a modern reincarnation of the Bourne shell, called Bash (Bourne Again Shell), which is the most widely used, and considered the most standard. The Bash shell is part of the GNU project from the Free Software Foundation that was founded by Richard Stallman, which provides free and open source software.

During this book, we will sometimes introduce common abbreviations for lengthy terms, which the students should get accustomed to.

Command-Line Terminology

Before we can delve into the chapters, we will learn some introductory command-line terms that will come handy throughout the book.

  • Commands: They refer to the names that are typed to execute some function. They can be built into the shell or be external programs. Any program that's available on the system is a command.
  • Arguments: The strings typed after a command are called its arguments. They tell the command how to operate or what to operate on. They are typically options or names of some data resource such as a file, URL, and so on.
  • Switches/Options/Flags: These are arguments that typically start with a single or double hyphen and request a certain optional behavior from a command. Usually, an option has a short form, which is a hyphen followed by a single character, and a longer version of the same option, as a double hyphen followed by an entire word. The long option is easier to remember and often makes the command easier to read. Note that options are always case-sensitive.

The following are some examples of switches and arguments in commands:

ls -l --color --classify
grep -n --ignore-case 'needle' haystack.txt 'my data.txt'

In the preceding snippet, ls and grep are commands, –l, --color, –classify, -n, and --ignore-case are flags, and 'needle', haystack.txt and 'my data.txt' are arguments.

 

Exploring the Filesystem

The space in which a command line operates is called a filesystem (FS). A lot of shell activity revolves around manipulating and organizing files; thus, learning the basics of filesystems is imperative to learning the command line. In this topic, we will learn about filesystems, and how to navigate, examine, and modify them via the shell. For regular users of computers, some of these ideas may seem familiar, but it is necessary to revisit them to have a clear and unambiguous understanding.

Filesystems

The UNIX design philosophy is to represent every object on a computer as a file; thus, the main objects that we manipulate with a command line are files. There are many different types of file-like objects under UNIX, but for our purposes, we will deal with simple data files, typically ASCII text files, that are human readable.

From this UNIX perspective, the system is accessible under what is termed a filesystem (FS). An FS is a representation of the system that's analogous to a series of nested boxes, each of which is called a directory or folder. Most of us are familiar with this folder structure, which we would have encountered when using a GUI file manager.

A directory that contains another directory is called the parent of the latter. The latter is called a sub-directory of the former. On UNIX-like systems, the outermost directory is called the root directory, and each directory can contain either files or other directories in turn. Some files are not data, but rather represent devices or other resources on the system. To be concise, we will refer to folders, regular files, and special files as FS objects.

Typically, every user of a system has their own distinct home directory, named after the user's name, where they store their own data. Various other directories used by the operating system, called system directories, exist on the filesystem, but we need not concern ourselves with them for the purposes of this book. For the sake of simplicity, we will assume that our entire filesystem resides on only a single disk or partition (although this is not true in general):

Figure 1.1: An illustration of an example structure of a typical filesystem
Figure 1.1: An illustration of an example structure of a typical filesystem

The notation used to refer to a location in a filesystem is called a path. A path consists of the list of directories that need to be navigated to reach some FS object. The list is separated by a forward slash, which is called a path separator. The complete location of an FS object, including its path from the root directory onward, is called a fully qualified pathname.

Paths can be absolute or relative. An absolute path starts at the root directory, whereas a relative path starts at what is called the current working directory (CWD). Every process that runs on a system is started with its CWD set to some location. This includes the command-line process itself. When an FS object is accessed within the CWD, the name of the object alone is enough to refer to it.

The root directory itself is represented by a single forward slash; thus, any absolute path starts with a single forward slash. The following is an example of an absolute path relative to the root directory:

/home/robin/Lesson1/data/cupressaceae/juniperus/indica

Special syntax is used to refer to the current, parent, and user's home directories:

  • ./ refers to the current directory explicitly. The CWD is implicit in many cases, but is useful when the current directory needs to be explicitly specified as an argument to some commands. For instance, the same directory that we've just seen can be expressed relative to the CWD (/home/robin, in this case) as follows: one pathname specifying ./ explicitly and one without:
    ./Lesson1/data/cupressaceae/juniperus/indica
    Lesson1/data/cupressaceae/juniperus/indica
  • ../ refers to the parent directory. This can be extended further, such as ../../../, and so on. For instance, the preceding directory can be expressed relative to the parent of the CWD, as follows:
    ../robin/Lesson1/data/cupressaceae/juniperus/indica

    The ../ takes us to one level up to the parent of all the user home directories, and then we go back down to robin and the rest of the path.

  • ~/ refers to the home directory of the current user.

    ~robin/ refers to the home directory of a user called "robin". This is a useful shorthand, because the home directory of a user could be configured to be anywhere in the filesystem. For example, macOS keeps the users' home directories in /Users, whereas Linux systems keep it in /home.

    Note

    The trailing slash symbol at the end of a directory pathname is optional. The shell does not mandate this. It is usually typed only to make it obvious that it is the name of a directory rather than a file.

Navigating Filesystems

We will now look briefly at the most common commands for moving around the filesystem and examining its contents:

  • The cd (change directory) command changes the CWD to the path specified as its argument—if the path is non-existent, it prints an error message. Specifying just a single hyphen as the argument to cd changes the CWD to the last directory that was navigated from.
  • The pwd (print working directory) command simply displays the absolute path of the CWD.
  • The pushd and popd (push directory and pop directory) commands are used to bookmark the CWD and return to it later, respectively. They work by pushing and popping entries on to an internal directory stack, hence the names pushd and popd. Since they use a stack, you can push multiple values and pop them later in reverse order.
  • The tree command displays the hierarchical structure of a directory as a text-based diagram.
  • The ls (list) command displays the contents of one or more specified directories (by default, the CWD) in various formats.
  • The cat (concatenate) command outputs the concatenation of the contents of the files specified to it. If only one file is specified, it simply displays the file. This is a quick way to look at a file's content, if the files are small. cat can also apply some transformations on its output, such as numbering the lines or suppressing multiple blank lines.
  • The less command can be used to interactively scroll through one or more files easily, search for a string, and so on. This command is called a pager (it lets text content be viewed page by page). On most systems, less is configured to be the default pager. Other commands that require a pager interface will request the default pager from the system for this purpose. Here are some of the most useful keyboard shortcuts for less:

    (a) The up or down and Page Up or Page Down keys scroll vertically.

    (b) The Enter and spacebar keys scroll down by one line and one screenful, respectively.

    (c) < and > or g and G characters will scroll to the beginning and end of the file, respectively.

    (d) / followed by a string and then Enter searches for the specified string. The occurrences are also highlighted.

    (e) n and N jump to the next or previous match, respectively.

    (f) Esc followed by u turns off the highlights.

    (g) h shows a help screen, with the list of shortcuts and commands that are supported.

    (h) q exits the application or exits the help screen if it is being shown.

    There are many more features for navigating, searching, and editing that less provides, which we will not cover in this basic introduction.

Commonly Used Options for the Commands

The following options are used with the ls command:

  • The -l option (which stands for long list) shows the contents with one entry per line—each column in the listing shows some specific information, namely permissions, link count, owner, group, size, and modification time, followed by the name, respectively. For the purposes of this book, we will only consider the size and the name. Information about the type of each FS object is indicated in the first character of the permissions field. For example, - for a file, and d for a directory.
  • The --reverse option sorts the entries in reverse order. This is an example of a long option, where the option is a complete word, which is easy to remember. Long options are usually aliases for short options—in this case, the corresponding short option is -r.
  • The --color option is used to make different kinds of FS objects display in different colors—there is no corresponding short option for this.

The following options are used with the tree command:

  • The -d option prints only directories and skips files
  • The -o option writes the output to a file rather than the display
  • The -H option generates a formatted HTML output, and typically would be used along with -o to generate an HTML listing to serve on a website

Before going ahead with the exercises, let's establish some conventions for the rest of this book. Each chapter of this book includes some test data to practice on. Throughout this book, we will assume that each chapter's data is in its own folder called Lesson1, Lesson2, and so on.

In all of the exercises that follow, it is assumed that the work is in the home directory of the logged-in user (here, the user is called robin).

Exercise 1: Exploring Filesystem Contents

In this exercise, we will navigate through a complex directory structure and view files using the commands learned so far. The sample data used here is a dataset of conifer trees, hierarchically structured as per botanic classification, which will be used in future activities and exercises too.

  1. Open the command-line shell.
  2. Navigate to the Lesson1 directory and examine the contents of the folder with the ls command:
    robin ~ $ cd Lesson1 
    robin ~/Lesson1 $ ls 
    data data1

    In the preceding code snippet, the part of the first line up to the $ symbol is called a prompt. The system is prompting for a command to be typed. The prompt shows the current user, in this case robin, followed by the CWD ~/Lesson1. The text shown after the command is what the command itself prints as output.

    Note

    Recall that ~ means the home directory of the current user.

  3. Use the cd command to navigate to the data directory and examine its contents with ls:
    robin ~/Lesson1 $ cd data 
    robin ~/Lesson1/data $ ls 
    cupressaceae  pinaceae  podocarpaceae  taxaceae

    Note

    Notice that the prompt shown afterward displays the new CWD. This is not always true. Depending on the configuration of the system, the prompt may vary, and may even be a simple $ symbol with no other information shown.

  4. The ls command can be provided with one or more arguments, which are the names of files and folders to list. By default, it lists only the CWD. The following snippet can be used to view the subdirectories within the taxaceae and podocarpaceae directories:
    robin ~/Lesson1/data $ ls taxaceae podocarpaceae 
    podocarpaceae/: 
    acmopyle     dacrydium      lagarostrobos  margbensonia  parasitaxus   podocarpus    saxegothaea 
    afrocarpus   falcatifolium  lepidothamnus  microcachrys  pherosphaera  prumnopitys   stachycarpus
    dacrycarpus  halocarpus     manoao         nageia        phyllocladus  retrophyllum  sundacarpus
    taxaceae/: 
    amentotaxus  austrotaxus  cephalotaxus  pseudotaxus  taxus  torreya 

    The dataset contains a directory for every member of the botanical families of coniferous trees. Here, we can see the top-level directories for each botanical family. Each of these has subdirectories for the genii, and those in turn for the species.

  5. You can also use ls to request a long output in color, as follows:
    robin ~/Lesson1/data $ ls -l --color 
    total 16 
    drwxr-xr-x 36 robin robin 4096 Aug 20 14:01 cupressaceae 
    drwxr-xr-x 15 robin robin 4096 Aug 20 14:01 pinaceae 
    drwxr-xr-x 23 robin robin 4096 Aug 20 14:01 podocarpaceae 
    drwxr-xr-x  8 robin robin 4096 Aug 20 14:01 taxaceae 
  6. Navigate into the taxaceae folder, and then use the tree command to visualize the directory structure at this point. For clarity, specify the -d option, which instructs it to display only directories and exclude files:
    robin ~/Lesson1/data $ cd taxaceae
    robin ~/Lesson1/data/taxaceae $ tree -d 

    You should get the following output on running the preceding command:

    Figure 1.2: The directory structure of the taxaceae folder (not shown entirely)
    Figure 1.2: The directory structure of the taxaceae folder (not shown entirely)
  7. cd can be given a single hyphen as an argument to jump back to the last directory that was navigated from:
    robin ~/Lesson1/data/taxaceae $ cd taxus
    robin ~/Lesson1/data/taxaceae/taxus $ cd - 
    /home/robin/Lesson1/data/taxaceae 

    Observe that it prints out the absolute path of the directory it is changing to.

    Note

    The home directory is stored in /home on UNIX-based systems. Other operating systems such as Mac OS may place them in other locations, so the output of some of the following commands may slightly differ from that shown here.

  8. We can move upwards in the hierarchy by using .. any number of times. Type the first command that follows to reach the home directory, which is three levels up. Then, use cd - to return to the previous location:
    robin ~/Lesson1/data/taxaceae $ cd ../../.. 
    robin ~ $ cd - 
    /home/robin/Lesson1/data/taxaceae 
    robin ~/Lesson1/data/taxaceae $
  9. Use cd without any arguments to go to the home directory. Then, once again, use cd - to return to the previous location:
    robin ~/Lesson1/data/taxaceae $ cd 
    robin ~ $ cd - 
    /home/robin/Lesson1/data/taxaceae 
    robin ~/Lesson1/data/taxaceae $
  10. Now, we will explore commands that help us navigate the folder structure, such as pwd, pushd, and popd. Use the pwd command to display the path of the CWD, as follows:
    robin ~/Lesson1/data/taxaceae $ pwd 
    /home/robin/Lesson1/data/taxaceae

    The pwd command may not seem very useful when the CWD is being displayed in the prompt, but it is useful in some situations, for example, to copy the path to the clipboard for use in another command, or to share it with someone.

  11. Use the pushd command to navigate into a folder, while remembering the CWD:
    robin ~/Lesson1/data/taxaceae $ pushd taxus/baccata/ 
    ~/Lesson1/data/taxaceae/taxus/baccata ~/Lesson1/data/taxaceae 

    Use it once again, saving this location to the stack too:

    robin ~/Lesson1/data/taxaceae/taxus/baccata $ pushd ../sumatrana/
    ~/Lesson1/data/taxaceae/taxus/sumatrana ~/Lesson1/data/taxaceae/taxus/baccata ~/Lesson1/data/taxaceae 

    Using it yet again, now we have three folders on the stack:

    robin ~/Lesson1/data/taxaceae/taxus/sumatrana $  pushd ../../../pinaceae/cedrus/deodara/ 
    ~/Lesson1/data/pinaceae/cedrus/deodara ~/Lesson1/data/taxaceae/taxus/sumatrana ~/Lesson1/data/taxaceae/taxus/baccata ~/Lesson1/data/taxaceae 
    robin ~/Lesson1/data/pinaceae/cedrus/deodara $

    Notice that it prints out the list of directories that have been saved so far. Since it is a stack, the list is ordered according to recency, with the first entry being the one we just changed into.

  12. Use popd to walk back down the directory stack, successively visiting the folders we saved earlier. Notice the error message when the stack is empty:
    robin ~/Lesson1/data/pinaceae/cedrus/deodara $ popd 
    ~/Lesson1/data/taxaceae/taxus/sumatrana ~/Lesson1/data/taxaceae/taxus/baccata ~/Lesson1/data/taxaceae 
    robin ~/Lesson1/data/taxaceae/taxus/sumatrana $ popd 
    ~/Lesson1/data/taxaceae/taxus/baccata ~/Lesson1/data/taxaceae 
    robin ~/Lesson1/data/taxaceae/taxus/baccata $ popd 
    ~/Lesson1/data/taxaceae 
    robin ~/Lesson1/data/taxaceae $ popd 
    bash: popd: directory stack empty 

    The entries on the directory stack are added and removed from the top of the stack as pushd and popd are used, respectively.

  13. Each of the folders for a species has a text file called data.txt that contains data about that tree from Wikipedia, which we can view with cat. Use the cat command to view the file's content, after navigating into the taxus/baccata directory:
    robin ~/Lesson1/data/taxaceae $ cd taxus/baccata 
    robin ~/Lesson1/data/taxaceae/taxus/baccata $ cat data.txt

    The output will look as follows:

    Figure 1.3: A screenshot showing a partial output of the data.txt file
    Figure 1.3: A screenshot showing a partial output of the data.txt file

    Notice that the output from the last command scrolled outside the view rapidly. cat is not ideal for viewing large files. You can scroll through the window manually to see the contents, but this may not extend to the whole output. To view files in a more user-friendly, interactive fashion, we can use the less command.

  14. Use ls to see that there is a file called data.txt, and then use the less command to view it:
    robin ~/Lesson1/data/taxaceae/taxus/baccata $ ls -l 
    total 40 
    -rw-r--r-- 1 robin robin 38260 Aug 16 01:08 data.txt
    robin ~/Lesson1/data/taxaceae/taxus/baccata $ less data.txt

    The output is shown here:

Figure 1.4: A screenshot showing the output of the less command
Figure 1.4: A screenshot showing the output of the less command

In this exercise, we have practiced the basic commands used to view directories and files. We have not covered all of the options available with these commands in detail, but what we have learned so far will serve for most of our needs.

Given this basic knowledge, we should be able to find our way around the entire filesystem and examine any file that we wish.

Manipulating a Filesystem

So far, we have looked at commands that only examine directories and files. Now we will learn how to manipulate filesystem objects. We will not be manipulating the contents of files yet, but only their location in the filesystem.

Here are the most common commands that are used to modify a filesystem. The commonly used options for some of these commands are also mentioned:

  • mkdir (make directory) creates the directory specified as its argument. It can also create a hierarchy of directories in one shot.

    The -p or --parents flag can be used to tell mkdir to create all the parent directories for the path if they do not exist. This is useful when creating a nested path in one shot.

  • rmdir (remove directory) is used to remove a directory. It only works if the directory is empty.

    The -p or --parents flag works similarly to how it does in mkdir. All the directories along the path that's specified are deleted if they are empty.

  • touch is used to create an empty file or update an existing file's timestamp.
  • cp (copy) is used to copy files or folders between directories. When copying directories, it can recursively copy all subdirectories, too. The syntax for this command is as follows:
    cp <sources> <dest>

    Here, <sources> is the paths of one or more files and folders to be copied, and <dest> is the path of the folder where <sources> are copied. This can be a filename, if <sources> is a single filename. The following options can be used with this command:

    The -r or --recursive flag is necessary when copying folders. It recursively copies all of the folder's contents to the destination.

    The -v or --verbose flag makes cp print out the source and destination pathname of every file it copies.

  • mv (move) can be used to rename an object and/or move it to another directory.

    Note

    The mv command performs both renaming and moving. However, these are not two distinct functions. If you think about it, renaming a file and moving it to a different path on the same disk are the same thing. Inherently, a file's content is not related to its name. A change to its name is not going to affect its contents. In a sense, a pathname is also a part of a file's name.

  • rm (remove) deletes a file permanently, and can also be used to delete a directory, recursively deleting all the subdirectories. Unlike sending files to the Trashcan or Recycle Bin in a GUI interface, files deleted with rm cannot be recovered. This command has the following options:

    The -r or --recursive flag deletes folders recursively.

    The -v or --verbose flag makes rm print out the pathname of every file it deletes.

    The -i or --interactive=always options allows review and confirmation before each entry being deleted. Answering n rather than y to the prompts (Enter must be pressed after y or n) will either skip deleting some files or skip entire directories.

    -I or --interactive=once prompts only once before removing more than three files, or when removing recursively, whereas -i prompts for each and every file or directory.

Exercise 2: Manipulating the Filesystem

In this exercise, we will learn how to manipulate the FS and files within it. We will modify the directories in the Lesson1 folder by creating, copying, and deleting files/folders using the commands that we learned about previously:

  1. Open a command-line shell and navigate to the directory for this lesson:
    robin ~ $ cd Lesson1/
    robin ~/Lesson1 $
  2. Create some directories, using mkdir, that classify animals zoologically. Type the commands shown in the following snippet:
    robin ~/Lesson1 $ mkdir animals 
    robin ~/Lesson1 $ cd animals 
    robin ~/Lesson1/animals $ mkdir canis 
    robin ~/Lesson1/animals $ mkdir canis/familiaris 
    robin ~/Lesson1/animals $ mkdir canis/lupus 
    robin ~/Lesson1/animals $ mkdir canis/lupus/lupus 
    robin ~/Lesson1/animals $ mkdir leopardus/colocolo/pajeros 
    mkdir: cannot create directory 'leopardus/colocolo/pajeros': No such file or directory 
  3. Notice that mkdir normally creates subdirectories that are only in already-existing directories, so it raises an error when we try to make leopardus/colocolo/pajeros. Use the --parents or -p switch to overcome this error:
    robin ~/Lesson1/animals $ mkdir -p leopardus/colocolo/pajeros
    robin ~/Lesson1/animals $ mkdir --parents panthera/tigris
    robin ~/Lesson1/animals $ mkdir panthera/leo
  4. Now, use tree to view and verify the directory structure we created:
    robin ~/Lesson1/animals $ tree 

    The directory structure is shown here:

    Figure 1.5: The directory structure of the animals folder
    Figure 1.5: The directory structure of the animals folder
  5. Now use the rmdir command to delete the directories. Try the following code snippets:
    robin ~/Lesson1/animals $ rmdir canis/familiaris/ 
    robin ~/Lesson1/animals $ rmdir canis 
    rmdir: failed to remove 'canis': Directory not empty 
    robin ~/Lesson1/animals $ rmdir canis/lupus 
    rmdir: failed to remove 'canis/lupus': Directory not empty
  6. Notice that it raises an error when trying to remove a directory that is not empty. You need to empty the directory first, removing canis/lupus/lupus, and then use the -p option to remove both canis/lupus and its parent, canis:
    robin ~/Lesson1/animals $ rmdir canis/lupus/lupus 
    robin ~/Lesson1/animals $ rmdir -p canis/lupus 
  7. Now, use tree to view the modified directory structure, as follows:
    robin ~/Lesson1/animals $ tree 

    The directory structure is shown here:

    Figure 1.6: A screenshot of the output displaying the modified folder structure of the animals folder
    Figure 1.6: A screenshot of the output displaying the modified folder structure of the animals folder
  8. Create some directories with the following commands:
    robin ~/Lesson1/animals $ mkdir -p canis/lupus/lupus 
    robin ~/Lesson1/animals $ mkdir -p canis/lupus/familiaris 
    robin ~/Lesson1/animals $ ls 
    canis  leopardus  panthera
  9. Create some dummy files with the touch command, and then view the entire tree again:
    robin ~/Lesson1/animals $ touch canis/lupus/familiaris/dog.txt 
    robin ~/Lesson1/animals $ touch panthera/leo/lion.txt 
    robin ~/Lesson1/animals $ touch canis/lupus/lupus/wolf.txt 
    robin ~/Lesson1/animals $ touch panthera/tigris/tiger.txt 
    robin ~/Lesson1/animals $ touch leopardus/colocolo/pajeros/colocolo.txt
    robin ~/Lesson1/animals $ tree 

    The output will look as follows:

    Figure 1.7: A screenshot of the output displaying the revised folder structure 
of the animals folder
    Figure 1.7: A screenshot of the output displaying the revised folder structure of the animals folder
  10. Use cp to copy the dog.txt and wolf.txt files from the familiaris and lupus directories into a new directory called dogs, as follows:
    robin ~/Lesson1/animals $ mkdir dogs 
    robin ~/Lesson1/animals $ cp canis/lupus/familiaris/dog.txt dogs/
    robin ~/Lesson1/animals $ cp canis/lupus/lupus/wolf.txt dogs/
    robin ~/Lesson1/animals $ tree

    The output will look as follows:

    Figure 1.8: A screenshot of the output displaying the revised folder structure of the animals folder, along with the newly copied files
    Figure 1.8: A screenshot of the output displaying the revised folder structure of the animals folder, along with the newly copied files
  11. Now clone the entire panthera directory into a new directory called cats using cp:
    robin ~/Lesson1/animals $ mkdir cats 
    robin ~/Lesson1/animals $ cp -r panthera cats 
    robin ~/Lesson1/animals $ tree 

    The output will look as follows:

    Figure 1.9: A screenshot of the output displaying the revised folder structure 
of the animals folder
    Figure 1.9: A screenshot of the output displaying the revised folder structure of the animals folder
  12. Now use the --verbose option with cp to copy the files with verbose progress displayed and print the output using the tree command:
    robin ~/Lesson1/animals $ mkdir bigcats
    robin ~/Lesson1/animals $ cp -r --verbose leopardus/ panthera/ bigcats
    'leopardus/' -> 'bigcats/leopardus'
    'leopardus/colocolo' -> 'bigcats/leopardus/colocolo'
    'leopardus/colocolo/pajeros' -> 'bigcats/leopardus/colocolo/pajeros'
    'leopardus/colocolo/pajeros/colocolo.txt' -> 'bigcats/leopardus/colocolo/pajeros/colocolo.txt'
    'panthera/' -> 'bigcats/panthera'
    'panthera/tigris' -> 'bigcats/panthera/tigris'
    'panthera/tigris/tiger.txt' -> 'bigcats/panthera/tigris/tiger.txt'
    'panthera/leo' -> 'bigcats/panthera/leo' 
    'panthera/leo/lion.txt' -> 'bigcats/panthera/leo/lion.txt'
    robin ~/Lesson1/animals $ tree bigcats 

    The output of the tree command is shown here:

    Figure 1.10: A screenshot of the output displaying the folder structure of the animals folder after a recursive directory copy
    Figure 1.10: A screenshot of the output displaying the folder structure of the animals folder after a recursive directory copy
  13. Now use mv to rename the animals folder to beasts:
    robin ~/Lesson1/animals $ cd .. 
    robin ~/Lesson1 $ mv animals beasts 
    robin ~/Lesson1 $ cd beasts 
    robin ~/Lesson1/beasts $ ls 
    bigcats  canis  cats  dogs  leopardus  panthera
  14. Use mv to move an individual file to a different path. We move dogs/dog.txt to the CWD as fido.txt and move it back again:
    robin ~/Lesson1/beasts $ mv dogs/dog.txt fido.txt 
    robin ~/Lesson1/beasts $ ls 
    bigcats  canis  cats  dogs  fido.txt  leopardus  panthera 
    robin ~/Lesson1/beasts $ mv fido.txt dogs/
  15. Use mv to relocate an entire folder. Move the whole canis folder into dogs:
    robin ~/Lesson1/beasts $ mv canis dogs 
    robin ~/Lesson1/beasts $ tree dogs 

    The revised folder structure is shown here:

    Figure 1.11: A screenshot of the output displaying the folder structure of the animals folder after relocating a folder
    Figure 1.11: A screenshot of the output displaying the folder structure of the animals folder after relocating a folder
  16. Use the -v or --verbose option with mv to make it report each item being moved. In this case, there was only one file being moved, but this can be a long list:
    robin ~/Lesson1/beasts $ mkdir panthers 
    robin ~/Lesson1/beasts $ mv --verbose panthera panthers 
    renamed 'panthera' -> 'panthers/panthera' 
    robin ~/Lesson1/beasts $ tree panthers 

    The output is shown here:

    Figure 1.12: A screenshot of the output displaying the folder structure of the animals folder 
after moving a folder
    Figure 1.12: A screenshot of the output displaying the folder structure of the animals folder after moving a folder
  17. Use tree to view the dogs folder (before we use rm to delete it):
    robin ~/Lesson1/beasts $ tree dogs 

    The output is shown here:

    Figure 1.13: A screenshot of the output displaying the folder structure of the animals folder before the deletion of files
    Figure 1.13: A screenshot of the output displaying the folder structure of the animals folder before the deletion of files
  18. Delete the files one by one with rm:
    robin ~/Lesson1/beasts $ rm dogs/fido.txt
    robin ~/Lesson1/beasts $ rm dogs/wolf.txt 
    robin ~/Lesson1/beasts $ rm dogs/canis/lupus/familiaris/dog.txt 
    robin ~/Lesson1/beasts $ rm dogs/canis/lupus/lupus/wolf.txt 
    robin ~/Lesson1/beasts $ tree dogs 

    The output is shown here:

    Figure 1.14: The folder structure of the animals folder after the deletion of files
    Figure 1.14: The folder structure of the animals folder after the deletion of files
  19. Remove the complete directory structure with the -r or --recursive switch of rm:
    robin ~/Lesson1/beasts $ ls 
    bigcats  cats  dogs  leopardus  panthers 
    robin ~/Lesson1/beasts $ rm -r dogs 
    robin ~/Lesson1/beasts $ ls 
    bigcats  cats  leopardus  panthers

    As we can see, the entire dogs directory was silently removed without warning.

  20. Use the -i flag to remove items interactively. Each individual operation is prompted for confirmation:

    Note

    Depending on your system configuration, the prompts you see for the following command and the one in step 21 may be in a different order or worded differently. The system will prompt you for every deletion to be performed, regardless.

    robin ~/Lesson1/beasts $ rm -r -i panthers 
    rm: descend into directory 'panthers'? y 
    rm: descend into directory 'panthers/panthera'? y 
    rm: descend into directory 'panthers/panthera/leo'? y 
    rm: remove regular empty file 'panthers/panthera/leo/lion.txt'? n 
    rm: remove directory 'panthers/panthera/leo'? n 
    rm: descend into directory 'panthers/panthera/tigris'? n 
    robin ~/Lesson1/beasts $ ls 
    bigcats  cats  leopardus  panthers 

    Now use the -I flag to remove items interactively. Confirmation is asked only a few times, and not for each file:

    robin ~/Lesson1/beasts $ rm -r -I bigcats 
    rm: remove 1 argument recursively? y 
    robin ~/Lesson1/beasts $ ls 
    cats  leopardus  panthers 
  21. Use the -v or --verbose option to make rm report each file or directory that's removed:
    robin ~/Lesson1/beasts $ rm -r -v panthers/ 
    removed 'panthers/panthera/leo/lion.txt' 
    removed directory 'panthers/panthera/leo' 
    removed 'panthers/panthera/tigris/tiger.txt' 
    removed directory 'panthers/panthera/tigris' 
    removed directory 'panthers/panthera' 
    removed directory 'panthers/'
  22. Now clear the entire folder we used for this exercise so that we can move on to the next lesson with a blank slate:
    robin ~/Lesson1/beasts $ cd .. 
    robin ~/Lesson1 $ ls 
    beasts  data  data1 
    robin ~/Lesson1 $ rm -r beasts 
    robin ~/Lesson1 $ ls 
    data  data1 

In this exercise, we learned how to change or extend the structure of the filesystem tree. We have yet to learn how to create and manipulate the content within files, which will be covered in future chapters.

Activity 1: Navigating the Filesystem and Viewing Files

For this activity, use the conifer tree dataset that has been supplied as a hierarchy of folders representing each tree's Family, Genus, and Species. Every species has an associated text file called data.txt containing information about the species, which has been mined from a Wikipedia page. Your aim is to navigate this hierarchy via the command line and answer basic questions about certain species by looking it up the data in those text files. Navigate through the directories within the example dataset provided for this lesson and answer the following questions:

  1. Provide two common names for the species Cedrus Deodara, which belongs to the Pinaceae family.
  2. Look up information about Abies Pindrow in the Pinaceae family and fill in the following blank: "The name pindrow derives from the tree's name in _______".
  3. How many species of the Taxus genus in the family Taxaceae are documented in this dataset?
  4. How many species in total are documented in this dataset?

Follow these steps to complete this activity:

  1. Use the cd command to navigate to the appropriate folder and use less to read the relevant information.
  2. Use the cd command to navigate to the appropriate folder and view the file with less. Use the / command to search for the phrase "derives from" and read the rest of the sentence to get the answer.
  3. Navigate to the right folder and run the tree command, which reports the number of directories in it. Each directory is a species.
  4. Navigate to the top-level data folder and run the tree command, which reports the number of files. Each file is associated with one species.

The expected answers for the preceding questions are as follows:

  1. Any two of the following: deodar cedar, Himalayan cedar, deodar, devdar, devadar, devadaru
  2. Nepali
  3. 12
  4. 770

    Note

    The solution for this activity can be found on page 270.

Activity 2: Modifying the Filesystem

For this activity, you will be using the conifer tree sample dataset that is in the ~/Lesson1/data folder. You need to collect the data for all trees from the family taxaceae and the genus torreya into one folder. Each file should be named <species>.txt, where <species> is the name of the species/folder. Execute the following steps to complete this objective:

  1. Use the cd command to go into the Lesson1 folder and create a new folder called activity2.
  2. Navigate to the folder for the genus specified and view the subfolders which represent each species.
  3. Use the cp command to copy a data file from one sub-directory of the data/taxaceae/torreya folder into the output folder.
  4. Use the mv command to rename the file as per the species name.
  5. Repeat steps 3 and 4 for all the species that are requested.

The expected listing of the activity2 folder is as follows:

Figure 1.15: A screenshot of the expected listing of the activity2 folder
Figure 1.15: A screenshot of the expected listing of the activity2 folder

Note

The solution for this activity can be found on page 270.

So far, we have explored the space in which a shell command-line operates. In a GUI, we deal with an abstract space of windows, menus, applications, and so on. In contrast, a CLI is based on a lower layer of the operating system, which is the filesystem.

In this topic, we have learned what a filesystem is and how to navigate it, and examined its structure or looked at the contents of files in it using the command line. We also learned how to modify the FS structure and perform simple file management tasks.

We learned how the shell is a way to provide precise, unambiguous, and repeatable instructions to the computer. You may have noticed the fact that most command-line tools perform just one simple function. This stems from one of the UNIX design philosophies: Do only one thing but, do it well. These small commands can be combined like the parts of a machine into constructs that can automate tasks and process data in complex ways.

The focus of this topic was mainly to get familiar with the FS, the arena where most of the command-line work happens. In the next topic, we will learn how to reduce effort when composing commands, making use of several convenience features in Bash.

 

Shell History Recall, Editing, and Autocompletion

In the previous section, we have experienced the fact that we need to repeatedly type some commands, and often type out the pathnames of files and folders. Indeed, it can get quite tedious if we work with long or hard-to-spell pathnames (both of which are present in our tree dataset). To counter this, we can use a few convenient features of modern command-line shells to reduce typing effort. We will explore these useful keyboard shortcuts for the command line in this section.

The GNU Bash shell uses an interface library called readline. This same interface is used by several programs (for example, gdb, python, and Node.js); hence, what you learn now applies to the CLIs of all those.

The readline interface supports emacs and vi modes. The keyboard shortcuts in these modes are derived from the ones in the iconic editors of those names. Since the default is the emacs mode, we will study only that.

Note

When indicating shortcuts, the convention is to show a combination of the Ctrl key and another key using the caret symbol '^' with the key. For example, Ctrl + C is indicated by ^C.

Command History Recall

The Bash shell retains a history of the past commands that were typed. Depending on the system configuration, anywhere from a few hundred to a few thousand commands could be maintained in the history log. Any command from the history can be brought back and re-executed (after optionally modifying it).

Basic History Navigation Shortcuts

History is accessed by using the following shortcuts:

  • The up and down arrow keys move through the command history.
  • Esc + < and Esc + > or Page Up and Page Down or Alt + < and Alt + > move to the first and last command in the history. The other shortcuts listed may or may not work depending on the system's configuration.
  • Ctrl + S and Ctrl + R let you incrementally search for a command in the history forward and backward, respectively, by typing any substring of the command.

Navigating through the history of past commands with the up and down arrow keys or with Esc + < and Esc + > is quite straightforward. As you navigate, the command appears on the prompt, and can be executed by pressing Enter immediately, or after editing it.

Note

In the aforementioned shortcuts, remember that < and > implies that the Shift key is held down, since these are the secondary symbols on the keyboard.

To view the entire history, we can use the history command:

robin ~ $ history

An example output is shown here:

Figure 1.16: A screenshot of the output displaying the shell command history
Figure 1.16: A screenshot of the output displaying the shell command history

This command can perform other tasks related to history management as well, but we will not concern ourselves with that for this book.

Incremental Search

This feature lets you find a command in the history that matches a few characters that you type. To perform a forward incremental search, press Ctrl + S, upon which the shell prompt changes to something like this:

robin ~ $ ^S
(i-search)`':

When we press Ctrl + R instead, we see the following prompt:

robin ~ $ ^R
(reverse-i-search)`':

i-search stands for incremental search. When these prompts are displayed, the shell expects a few characters that appear within a command to be typed. As they are typed, the command which matches those characters as a substring is displayed. If there is more than one command that matches the input, the list of matches can be iterated with Ctrl + R and Ctrl + S backward and forward, respectively.

The incremental search happens from the point where you have currently navigated in the history (with arrow keys and so on). If there are no more matches in the given direction, the prompt changes to something similar to what is shown here:

(failed reverse-i-search)`john': man join

At this point, we can do the following:

  • Backspace the search string that was typed, to widen the range of matches and find one.
  • Change the search direction. We can press Ctrl + S if we were searching backward or press Ctrl + R if we were searching forward, to return to any previous match that was crossed over.
  • Press Esc to exit the search and accept whatever match was shown last.
  • Press Ctrl + G to exit the search and leave the command line empty.

    Note

    On some systems, Ctrl + S does not activate incremental search. Instead, it performs an unrelated function. To make sure it works as we require it to, type the following command once in the console before the exercises here: stty -ixon.

Remember that the search happens relative to the current location in history, so if you start a search without navigating upward in the history, then searching forward would have no effect, since there are no commands after the current history location (that is, the present). This means that searching backward with Ctrl + R is generally the more frequently used and useful feature. Most of the time, a history search comes in handy for retyping a long command from the recent past, or for retrieving a complex command typed long ago, whose details have been forgotten.

As you progress in your command-line knowledge and experience, you will find that although it is easy to compose complicated command lines when you have a certain problem to solve, it is not easy to recollect them after a long period of time has passed. Keeping this in mind, it makes sense to conserve your mental energy, and reuse old commands from history, rather than try to remember or recreate them from scratch. Indeed, it is possible to configure Bash to save your entire history infinitely so that you never lose any command that you ever typed on the shell.

Exercise 3: Exploring Shell History

In this exercise, we will use the history search feature to repeat some commands from an earlier exercise. Make sure that you are in the Lesson1 directory before starting:

  1. Create a temporary directory called data2 to work with:
    robin ~/Lesson1 $ mkdir data2 
    robin ~/Lesson1 $ cd data2 
    robin ~/Lesson1/data2 $
  2. Press Ctrl + R to start a reverse incremental search, and then type "animals". The most recent command with that string will be shown.
  3. Press Ctrl + R two times to search backward until we get the command we need, and then press Enter to execute it:
    (reverse-i-search)`animals': mkdir animals
    robin ~/Lesson1/data2 $ mkdir animals
    robin ~/Lesson1/data2 $ cd animals
  4. Find the command that created the directory for the species of the domestic dog canis/lupus/familiaris. The string familiaris is quite unique, so we can use that as a search pattern. Press Esc + < to reach the start of the history and Ctrl + S to start searching forward from that point. Type "fa" and press Ctrl + S two more times to get the command we are searching for. Finally, press Enter to execute it:
    (i-search)`fa': mkdir -p canis/lupus/familiaris
    robin ~/Lesson1/data2/animals $ mkdir -p canis/lupus/familiaris 
  5. Repeat the same command, except change the directory to create canis/lupus/lupus. Press the up arrow to get the same command again. Change the last word to lupus and press Enter to create the new directory:
    robin ~/Lesson1/data2/animals $ mkdir -p canis/lupus/lupus

In this brief exercise, we have seen how to retrieve commands that we typed previously. We can move through the history linearly or search for a command, saving ourselves a lot of retyping.

Command-Line Shortcuts

There are many keyboard shortcuts on Bash that let you modify an already typed command. Usually, it is more convenient to take an existing command from the history and edit it to form a new one, rather than retype everything.

Navigation Shortcuts

The following are some navigation shortcuts:

  • The left or right arrow keys, as well as Home or End work as per standard conventions. Ctrl + A and Ctrl + E are alternatives for Home and End.
  • Alt + F and Alt + B jump by one word forward and backward, a word being a contiguous string that consists of numbers and letters.

Clipboard Shortcuts

The following are some clipboard shortcuts:

  • Alt + Backspace cuts the word to the left of the cursor
  • Alt + D cuts the word to the right of the cursor, including the character under the cursor
  • Ctrl + W cuts everything to the left of the cursor until a whitespace character is encountered
  • Ctrl + K cuts everything from the cursor to the end of the line
  • Ctrl + U cuts everything from the cursor to the start of the line, excluding the character under the cursor
  • Ctrl + Y pastes what was just cut
  • Alt + Y cycles through the previously cut entries one by one (works only after pasting with Ctrl + Y)

Other Shortcuts

The following are some other shortcuts that may come in useful:

  • Alt + \ deletes all whitespace characters that are at the cursor, that is, it joins two words that are separated by whitespaces.
  • Ctrl + T swaps the current and previous character. This is useful to correct typos.
  • Alt + T swaps the current and previous word.
  • Ctrl + Shift + _ undoes the last keypress.
  • Alt + R reverts all changes to a line. This is useful to revert a command from history back to what it was originally.
  • Alt + U converts the characters from the cursor position until the next word boundary to uppercase.
  • Alt + L converts the characters from the cursor position until the next word boundary to lowercase.
  • Alt + C capitalizes the first letter of the word under the cursor and moves to the next word.

There are several other shortcuts, but these are the most useful. It is not necessary to memorize all of these, but the navigation and cut/paste shortcuts are certainly worth learning by heart.

Note

The clipboard that the readline interface in Bash uses is distinct from the clipboard provided in the GUI. The two are independent mechanisms and should not be confused with each other. When you use any other command-line interface that uses readline, for example, the Python shell, it gets its own independent clipboard.

Exercise 4: Using Shell Keyboard Shortcuts

In this exercise, we will try out some of the command-line shortcuts. For simplicity, we will introduce the echo command to help with this exercise. This command merely prints out its arguments without causing any side effects. The examples here are contrived to help illustrate the editing shortcuts:

  1. Run the following command:
    robin ~/Lesson1/data2/animals $ echo one two three four five/six/seven
    one two three four five/six/seven
  2. Press the up arrow key to get the same command again. Press Alt + B three times. The cursor ends up at five. Type "thousand" followed by a space, and press Enter to execute the edited command:
    robin ~/Lesson1/data2/animals $ echo one two three four thousand five/six/seven
    one two three four thousand five/six/seven
  3. Now use the cut and paste shortcuts as follows: press the up arrow key to get the previous command, press Alt + Backspace to cut the last word seven into the clipboard, press Alt + B twice (the cursor ends up at five), use Ctrl + Y to paste the word that we cut, type a forward slash, and finally press Enter:
    robin ~/Lesson1/data2/animals $ echo one two three four thousand seven/five/six/ 
    one two three four thousand seven/five/six/
  4. Press the up arrow key to get the previous command, press Alt + B four times (the cursor ends up at thousand), press Alt + D to cut that word (notice that an extra space was left behind), press End to go to the end of the line, use Ctrl + Y to paste the word that we cut, and press Enter to execute the command:
    robin ~/Lesson1/data2/animals $ echo one two three four  seven/five/six/thousand
    one two three four seven/five/six/thousand
  5. Press the up arrow key to get the previous command again, press Alt + B three times (the cursor ends up at five), press Ctrl + K to cut to the end of the line, press Alt + B to go back one word (the cursor ends up at seven), use Ctrl + Y to paste the word that we cut, type a forward slash, and press Enter to execute the command:
    robin ~/Lesson1/data2/animals $ echo one two three four  five/six/thousand/seven/ 
    one two three four five/six/thousand/seven/
  6. Press the up arrow key to get the previous command once more, press Alt + B three times (the cursor ends up at six), press Ctrl + U to cut to the start of the line, press Alt + F to move forward one word (the cursor ends up at /thousand), press Ctrl + Y to paste the content we cut earlier, press Home and type echo, and then press the spacebar and then Enter to execute the command:
    robin ~/Lesson1/data2/animals $ echo sixecho one two three four  five//thousand/seven/ 
    sixecho one two three four five//thousand/seven/

In this exercise, we have explored how to use the editing shortcuts to efficiently construct commands. With some practice, it becomes quite unnecessary to compose a command from scratch. Instead, we compose them from older ones.

Command-Line Autocompletion

We all use auto-suggest on our mobile devices, but surprisingly, this feature has existed on Bash for decades. Bash provides the following context-sensitive completion when you type commands:

  • File and pathname completion
  • Command completion, which suggests the names of programs and commands
  • Username completion
  • Options completion
  • Customized completion for any program (many programs such as Git add their own completion logic)

Completion is invoked on Bash by entering a few characters and pressing the Tab key. If there is only one possible completion, it is immediately inserted on to the command line; otherwise, the system beeps. Then, if Tab is pressed again, all the possible completions are shown. If the possible completions are too numerous, a confirmation prompt is shown before displaying them.

Note

Depending on the system's configuration, the number of possible command completions seen will vary, since different programs may be installed on different systems.

Exercise 5: Completing a Folder Path

In this exercise, we will explore hands-on how the shell autocompletes folder paths for us:

  1. Open a new command shell and return to the directory that we recreated from history in the earlier exercise:
    robin ~ $ cd Lesson1/data2/animals
    robin ~/Lesson1/data2/animals $ 
  2. Type cd canis/ and press Tab three times. It completes the command to cd canis/lupus/ and shows two possible completions:
    robin ~/Lesson1/data2/animals $ cd canis/lupus/
    familiaris/ lupus/
    robin ~/Lesson1/data2/animals $ cd canis/lupus/
  3. Type f and press Tab to choose the completion familiaris:
    robin ~/Lesson1/data2/animals $ cd canis/lupus/familiaris/

Exercise 6: Completing a Command

In this exercise, we will use command completion to suggest commands (after each sequence here, clear the command line with Ctrl + U or Alt + Backspace):

  1. Type "les" and press Tab to produce the completion:
    robin ~/Lesson1/data2/animals $ less
  2. Type "rmd" and press Tab to produce the completion:
    robin ~/Lesson1/data2/animals $ rmdir
  3. If we do not type enough characters, the number of possible completions may be a large one. For instance, type "g" and press Tab twice (it beeps the first time to indicate that there is no single completion). The shell shows a confirmation prompt before showing all possible commands that start with "g", since there are too many:
    robin ~/Lesson1/data2/animals $ g
    Display all 184 possibilities? (y or n)

    In such cases, it is more practical to say n, because poring over so many possibilities is time-consuming, and defeats the purpose of completion.

Exercise 7: Completing a Command using Options

In this exercise, we will use command completion using options to suggest the long options for commands (after each sequence here, clear the command line with Ctrl + U):

  1. Type "ls --col" and press Tab to produce the completion:
    robin ~/Lesson1/data2/animals $ ls --color
  2. Type "ls --re" and press Tab twice to produce the list of two possible completions:
    robin ~/Lesson1/data2/animals $ ls --re 
    --recursive  --reverse 
  3. Then, type "c" and press Tab to select --recursive as the completion:
    robin ~/Lesson1/data2/animals $ ls --recursive

After performing these exercises, we have learned how the shell autocompletes text for us based on the context. The autocompletion is extensible, and many programs such as docker and git install completions for their commands, too.

Activity 3: Command-Line Editing

You are provided with the following list of tree species' names:

  1. Pinaceae Cedrus Deodara
  2. Cupressaceae Thuja Aphylla
  3. Taxaceae Taxus Baccata
  4. Podocarpaceae Podocarpus Alba

Each line has the family, genus, and species written like this: Podocarpaceae Lepidothamnus Intermedius. You need to type out each of these entries and use command-line shortcuts to convert them into a command that prints out the path of the data.txt file associated with the species.

You need to work out the most efficient way to compose a command, reducing typing effort and errors. Use the conifer tree sample data for this chapter that is in the ~/Lesson1/data folder and follow these steps to complete this activity:

  1. Navigate to the data folder.
  2. Type out a line from the file manually, for example, Podocarpaceae Lepidothamnus Intermedius.
  3. Use as few keystrokes as possible to generate a command that prints out the name of the file associated with that species, in this case: echo podocarpaceae/lepidothamnus/intermedius/data.txt.
  4. Repeat steps 3 and 4 for all the entries.

You should obtain the following paths for the data.txt files for the given species:

pinaceae/cedrus/deodara/data.txt
cupressaceae/thuja/aphylla/data.txt
taxaceae/taxus/baccata/data.txt
podocarpaceae/podocarpus/alba/data.txt

Note

If you are typing any piece of text multiple times, you can save time by typing that only once and then using the cut and paste functionality. You might want to experiment with the behavior of the two "cut word" shortcuts for this particular case. The solution for this activity can be found on page 272.

In this topic, we have examined the more hands-on interactive facilities that command-line shells provide. Without the time-saving features of history, completion, and editing shortcuts, the command line would be very cumbersome. Indeed, some old primitive command shells from the 1980s such as MS-DOS lacked most, if not all, of these features, making it quite a challenge to use them effectively.

Going forward, we will delve deeper into file management operations by utilizing a powerful concept called wildcard expansion, also known as shell globbing.

 

Shell Wildcards and Globbing

In the preceding exercises and activities, notice that we often perform the same operation on multiple files or folders. The point of a computer is to never have to manually instruct it to do something more than once. If we perform any repeated action using a computer, there is usually some way that it can be automated to reduce the drudgery. Hence, in the context of the shell too, we need an abstraction that lets us handle a bunch of files together. This abstraction is called a wildcard.

The term wildcard originates from card games where a certain card can substitute for whatever card the player wishes. When any command is sent to the shell, before it is executed, the shell performs an operation called wildcard expansion or globbing on each of the strings that make up the command line. The process of globbing replaces a wildcard expression with all file or pathnames that match it.

Note

This wildcard expansion is not performed on any quoted strings that are quoted with single or double quotes. Quoted arguments will be discussed in detail in a future chapter.

Wildcard Syntax and Semantics

A wildcard is any string that contains any of the following special characters:

  • A ? matches one occurrence of any character. For example, ?at matches cat, bat, and rat, and every other three letter string that ends with "at".
  • A * matches zero or more occurrences of any character. For example, image.* matches image.png, image.jpg, image.bmp.zip, and so on.
  • A ! followed by a pair of parentheses containing another wildcard expands to strings that do not match the contained expression.

    Note

    The exclamation operator is an "extended glob" syntax and may not be enabled by default on your system. To enable it, the following command needs to be executed: shopt -s extglob.

There are a few more advanced shell glob expressions, but we will restrict ourselves to these most commonly used ones for now.

Wildcard Expansion or Globbing

When the shell encounters a wildcard expression on the command line, it is internally expanded to all the files or pathnames that match it. This process is called globbing. Even though it looks as though one wildcard argument is present, the shell has converted that into multiple ones before the command runs.

Note that a wildcard can match paths across the whole filesystem:

  • * matches all the directories and files in the current directory
  • /* matches everything in the root directory
  • /*/* matches everything exactly two levels deep from the root directory
  • /home/*/.bashrc matches a file named .bashrc that is in every user's home directory

At this point, a warning is due: this powerful matching mechanism of wildcards can end up matching files that the user never intended if the wildcard was not specified correctly. Hence, you must exercise great care when running commands that use wildcards and modify or delete files. For safety, run echo with the glob expression to view what files it gets expanded to. Once we are sure that the wildcard is correct, we can run the actual command that affects the files.

Note

Since the shell expands wildcards as individual arguments, we can run into a situation where the number of arguments exceeds the limit that the system supports. We should be aware of this limitation when using wildcards.

Let's dive into an exercise and see how we can use wildcards.

Exercise 8: Using Wildcards

In this exercise, we will practice the use of wildcards for file management by creating folders and moving files with specific file formats to those folders.

Note

Some of the commands used in this exercise produce many screenfuls of output, so we only show them partially or not at all.

  1. Open the command line shell and navigate to the ~/Lesson1/data1 folder:
    robin ~ $ cd Lesson1/data1 

    There are over 11,000 files in this folder, all of which are empty dummy files, but their names come from a set of real-world files.

  2. Use a wildcard to list all the GIF files: *.gif matches every file that ends with .gif:
    robin ~/Lesson1/data1 $ ls *.gif

    The output is shown here:

    Figure 1.17: A screenshot of the output displaying a list of all GIF files within the folder
    Figure 1.17: A screenshot of the output displaying a list of all GIF files within the folder
  3. Create a new folder named gif, and use the wildcard representing all GIF files to move all of them into that folder:
    robin ~/Lesson1/data1 $ mkdir gif 
    robin ~/Lesson1/data1 $ mv *.gif gif 
  4. Verify that there are no GIF files left in the CWD:
    robin ~/Lesson1/data1 $ ls *.gif 
    ls: cannot access '*.gif': No such file or directory 
  5. Verify that all of the GIFs are in the gif folder:
    robin ~/Lesson1/data1 $ ls gif/ 

    The output is shown here:

    Figure 1.18: A screenshot of a partial output of the gif files within the folder
    Figure 1.18: A screenshot of a partial output of the gif files within the folder
  6. Make a new folder called jpeg and use multiple wildcard arguments with mv to move all JPEG files into that folder:
    robin ~/Lesson1/data1 $ mkdir jpeg 
    robin ~/Lesson1/data1 $ mv *.jpeg *.jpg jpeg 
  7. Verify with ls that no JPEG files remain in the CWD:
    robin ~/Lesson1/data1 $ ls *.jpeg *.jpg 
    ls: cannot access '*.jpeg': No such file or directory 
    ls: cannot access '*.jpg': No such file or directory
  8. List the jpeg folder to verify that all the JPEGs are in it:
    robin ~/Lesson1/data1 $ ls jpeg 

    The output is shown here:

    Figure 1.19: A screenshot of a partial output of the .jpeg files within the folder
    Figure 1.19: A screenshot of a partial output of the .jpeg files within the folder
  9. List all .so (shared object library) files that have only a single digit as the trailing version number:
    robin ~/Lesson1/data1 $ ls *.so.? 

    The output is shown here:

    Figure 1.20: A screenshot of a partial output of the .jpeg .so files ending with a dot, followed by a one-character version number
    Figure 1.20: A screenshot of a partial output of the .so files ending with a dot, followed by a one-character version number
  10. List all files that start with "google" and have an extension;
    robin ~/Lesson1/data1 $ ls google*.* 
    google_analytics.png  google_cloud_dataflow.png  google_drive.png  google_fusion_tables.png google_maps.png  google.png
  11. List all files that start with "a", have the third character "c", and have an extension:
    robin ~/Lesson1/data1 $ ls a?c*.* 
    archer.png  archive_entry.h  archive.h  archlinux.png  avcart.png
  12. List all of the files that do not have the .jpg extension:
    robin ~/Lesson1/data1 $ ls !(*.jpg) 

    The output is shown here:

    Figure 1.21: A screenshot of a partial output of the non-.jpeg files in the folder
    Figure 1.21: A screenshot of a partial output of the non-.jpeg files in the folder
  13. Before we conclude this exercise, get the sample data back to how it was before in preparation for the next activity. First, move the files within the jpeg and gif folders back to the current directory:
    robin ~/Lesson1/data1 $ mv gif/* .
    robin ~/Lesson1/data1 $ mv jpeg/* .

    Then, delete the empty folders:

    robin ~/Lesson1/data1 $ rm -r gif jpeg

Now, having learned the basic syntax, we can write wildcards to match almost any group of files and paths, so we rarely ever need to specify filenames individually.

Even in a GUI, it takes more effort than this to select groups of files in a file manager (for example, all .gifs) and this can be error-prone or frustrating when hundreds or thousands of files are involved.

Activity 4: Using Simple Wildcards

The supplied sample data in the Lesson1/data1 folder has about 11,000 empty files of various types. Use wildcards to copy each file to a directory representing its category, namely images, binaries, and misc., and count how many of each category exist. Through this activity, you will get familiar with using simple wildcards for file management. Follow these steps to complete this activity:

  1. Create the three directories representing the categories specified.
  2. Move all of the files with the extensions .jpg, .jpeg, .gif, and .png to the images folder.
  3. Move all of the files with the extensions .a, .so, and .so, followed by a period and a version number, into the binaries folder.
  4. Move the remaining files with any extension into the misc folder.
  5. Count the files in each folder using a shell command.

You should get the following answers: 3,674 images, 5,368 binaries, and 1,665 misc.

Note

The solution for this activity can be found on page 273.

Activity 5: Using Directory Wildcards

The supplied sample data inside the Lesson1/data folder has a taxonomy of tree species. Use wildcards to get the count of the following:

  1. The species whose family starts with the character p, and the genus has a as the second character.
  2. The species whose family starts with the character p, the genus has i as the second character, and species has u as the second character.
  3. The species whose family as well as genus starts with the character t.

This activity will help you get familiar with using simple wildcards that match directories.

Follow these steps to complete this activity:

  1. Navigate to the data folder.
  2. Use the tree command with a wildcard for each of the three conditions to get the count of species.

You should get the following answers: 83 species, 26 species, and 19 species.

Note

The solution for this activity can be found on page 273.

 

Summary

We have introduced a lot of material in this first chapter, which is probably quite novel to anyone approaching the command line for the first time. Even in this brief exploration, we can start to see how seemingly complicated filesystem tasks can be completed with minimal effort.

In the coming chapter, we will add to our toolbox of useful shell programs that process text data. In later chapters, we will learn about the mechanisms to tie these commands together, such as piping and redirection, to perform complex data-processing tasks. We will also learn about regular expressions and shell expansion constructs that let us manipulate textual data in powerful ways.

About the Author

  • Vivek N

    Vivek N is a self-taught programmer who has been programming for almost 30 years now, since the age of 8, with experience in X86 Assembler, C, Delphi, Python, JavaScript, and C++. He has been working with the command line since the days of DOS 4.01 and is keen to introduce the new generation of computer users to the power it holds to make their life easier.

    Browse publications by this author

Latest Reviews

(2 reviews total)
This book was not available for online reader, and it was difficult to find how to download.
Fast delivery! Excellent resource.
Command Line Fundamentals
Unlock this book and the full library for FREE
Start free trial