Learning Ansible

By Madhurranjan Mohaan, Ramesh Raithatha

About this book

Automation includes provisioning new servers, making sure the servers adhere to their role and maintain the desired state from a configuration perspective, and orchestrating various actions across environments and deploying code as expected to all these servers. This is where Ansible steps in. It is secure, highly reliable, and minimalistic in nature. It automates configuration management, application deployment, and many other IT needs.

Learning Ansible will equip you with the necessary skills to automate/improve your infrastructure from a configuration management perspective. You will also be able to use Ansible for one-click deployments, provisioning, and orchestrating your infrastructure.

Publication date:
November 2014


Chapter 1. Getting Started with Ansible


We keep moving forward, opening new doors, and doing new things, because we're curious and curiosity keeps leading us down new paths.

 -- Walt Disney

If exploring new paths is what you like, then in this chapter, we're going to lead you down an exciting path with Ansible. Almost always, when one starts to invest time trying to learn a new tool or language, the expectation is to install a new tool, get a "Hello, world!" example out of the way, tweet about it (these days), and continue learning more features in the general direction of solving the problem at hand. The aim of this chapter is to make sure all of this (and more) is achieved.

In this chapter, we will cover the following topics:

  • What is Ansible?

  • The Ansible architecture

  • Configuring Ansible

  • Configuration management

  • Working with playbooks

  • Variables and their types

  • Working with inventory files

  • Working with modules

At the end of this chapter, you will be able to create basic playbooks and understand how to use Ansible.


What is Ansible?

Ansible is an IT orchestration engine that can be used for several use cases. Compared to other automation tools, Ansible gives you an easy way to configure your orchestration engine without the overhead of a client or a central server setup. That's right! No central server! It comes preloaded with a wide range of modules that make your life simpler.

In this chapter, you will learn the basics of Ansible and how to set up Ansible on your system. Ansible is an open source tool (with enterprise editions available) developed using Python and runs on Windows, Mac, and UNIX-like systems. You can use Ansible for configuration management, orchestration, provisioning, and deployments, which covers many of the problems that are solved under the broad umbrella of DevOps. We won't be talking about culture here as that's a book by itself!


You could refer to the book, Continuous Delivery and DevOps – A Quickstart Guide by Packt Publishing for more information.

Let's try to answer some questions that you may have right away.

  • Can I use Ansible if I am starting afresh, have no automation in my system, and would like to introduce that (and as a result, increase my bonus for the next year)?

    A short answer to this question is Ansible is probably perfect for you. The learning curve with Ansible is way shorter than most other tools currently present in the market. For the long answer, you need to read the rest of the chapters!

  • I have other tools in my environment. Can I still use Ansible?

    Yes, again! If you already have other tools in your environment, you can still augment those with Ansible as it solves many problems in an elegant way. A case in point is a puppet shop that uses Ansible for orchestration and provisioning of new systems but continues to use Puppet for configuration management.

  • I don't have Python in my environment and introducing Ansible would bring in Python. What do I do?

    You need to remember that, on most Linux systems, a version of Python is present at boot time and you don't have to explicitly install Python. You should still go ahead with Ansible if it solves particular problems for you. Always question what problems you are trying to solve and then check whether a tool such as Ansible would solve that use case.

  • I have no configuration management at present. Can I start today?

    The answer is yes!

In many of the conferences we presented, the preceding four questions popped up most frequently. Now that these questions are answered, let's dig deeper.

The architecture of Ansible is agentless. Yes, you heard that right; you don't have to install any client-side software. It works purely over SSH connections; so, if you have a well-oiled SSH setup, then you're ready to roll Ansible into your environment in no time. This also means that you can install it on only one system (either a Linux or Mac machine) and control your entire infrastructure from that machine.

Yes, we understand that you must be thinking about what happens if this machine goes down. You would probably have multiple such machines in production, but this was just an example to elucidate the simplicity of Ansible. You could even run some of these machines from where you kick off Ansible scripts in a Demilitarized Zone (DMZ) to deal with your production machines.

The following table shows a small comparison of agentless versus agent-based configuration management systems:

| Agent-based systems | Agentless systems |
| --- | --- |
| These systems need an agent and its dependencies installed. | No specific agent or third-party dependencies are installed on these systems. However, in most cases, you need an SSH daemon that's up and running. |
| These systems need to invoke the agent to run the configuration management tool; the agent can run as a service or a cron job, so no external invocation is necessary. | These systems invoke the run remotely. |
| Parallel agent runs might be slow if they all hit the same server and the server cannot process several concurrent connections effectively. However, if the agents run without a server, the runs would be faster. | Parallel runs might be faster than when all agents contact the same machine, but they might be constrained by the number of SSH connections, since the runs are invoked remotely. |
| The agent's installation and permissions need to be taken care of, along with the configuration of the agent itself, for example, the server it should talk to. | Remote connections can log in as a specific user with the right level of access, since everything is SSH-based. |

Ansible primarily runs in the push mode, but you can also run Ansible using ansible-pull, where you install Ansible on each agent, download the playbooks locally, and run them on the individual machines. If there is a large number of machines (large is a relative term; in our view, greater than 500 and requiring parallel updates) and you plan to deploy updates to the machines in parallel, this might be the right way to go about it.
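As a quick sketch of the pull mode, a node can fetch and apply a playbook on itself with the ansible-pull command; the repository URL and playbook name below are hypothetical placeholders:

```shell
# Clone the given playbook repository to /opt/ansible-playbooks on this node
# and run the playbook named local.yml from it against the local machine.
$ ansible-pull -d /opt/ansible-playbooks -U git://github.com/example/playbooks.git local.yml
```

You would typically schedule such a command via cron on each machine so that nodes converge on their own.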

To speed up the default SSH connections, you can always enable ControlPersist and pipelining, which make Ansible faster and more secure. Ansible works like any other UNIX command and doesn't require any daemon process to be running all the time.

Tools such as Chef and Puppet are agent-based and they need to communicate with a central server to pull updates. These can also run without a centralized server to scale a large number of machines commonly called Serverless Chef and Masterless Puppet, respectively.

When you start with something new, the first aspect you need to pay attention to is the nomenclature. The faster you're able to pick up the terms associated with the tool, the faster you're comfortable with that tool. So, to deploy, let's say, a package on one or more machines with Ansible, you would write a playbook that has a single task, which in turn uses a package module (such as yum) that would then go ahead and install the package on the machines listed in an inventory file. If you feel overwhelmed by the nomenclature, don't worry, you'll soon get used to it. Like the yum module, Ansible comes loaded with more than 200 modules, purely written in Python. We will talk about modules in detail in the later chapters.
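To make the nomenclature concrete, here is a minimal sketch: an inventory file listing the machines, and a one-task playbook that uses the yum module. The filenames, group name, and hostnames are all illustrative:

```yaml
# inventory (INI format), listing the target machines:
#   [webservers]
#   web1.example.com
#   web2.example.com

# install_ntp.yml -- a playbook with a single task
---
- hosts: webservers
  tasks:
    - name: Ensure the ntp package is installed
      yum: name=ntp state=present
```

Running ansible-playbook -i inventory install_ntp.yml would then apply the task to both machines in the webservers group.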

It is now time to install Ansible to start trying out various fun examples.

Installing Ansible

Installing Ansible is rather quick and simple. You can directly use the source code by cloning it from the GitHub project (https://github.com/ansible/ansible), install it using your system's package manager, or use Python's package management tool (pip). You can use Ansible on any Windows, Mac, or UNIX-like system. Ansible doesn't require any database and doesn't need any daemons running. This makes it easier to maintain the Ansible versions and upgrade without any breaks.

We'd like to call the machine where we will install Ansible our command center. Many people also refer to it as the Ansible workstation.


Note that, as Ansible is developed using Python, you would need Python Version 2.4 or a higher version installed. Python is preinstalled, as specified earlier, on the majority of operating systems. If this is not the case for you, refer to https://wiki.python.org/moin/BeginnersGuide/Download to download/upgrade Python.

Installing Ansible from source

Installing from source is as easy as cloning a repository. You don't require any root permissions while installing from source. Let's clone the repository and activate virtualenv, which is an isolated environment in Python where you can install packages without interfering with the system's Python packages. The commands and the resultant output for cloning the repository are as follows:

$ git clone git://github.com/ansible/ansible.git
Initialized empty Git repository in /home/vagrant/ansible/.git/
remote: Counting objects: 67818, done.
remote: Compressing objects: 100% (84/84), done.
remote: Total 67818 (delta 49), reused 2 (delta 0)
Receiving objects: 100% (67818/67818), 21.06 MiB | 238 KiB/s, done.
Resolving deltas: 100% (42682/42682), done.

[node ~]$ cd ansible/
[node ansible]$ source ./hacking/env-setup

Setting up Ansible to run out of checkout...


Remember, you may wish to specify your host file with -i


Ansible needs a couple of Python packages, which you can install using pip. If you don't have pip installed on your system, install it using the following command. If easy_install is not available either, you can install it using Python's setuptools package on Red Hat systems or using Brew on the Mac:

$ sudo easy_install pip
<A long output follows> 

Once you have installed pip, install the paramiko, PyYAML, jinja2, and httplib2 packages using the following command lines:

$ sudo pip install paramiko PyYAML jinja2 httplib2
Requirement already satisfied (use --upgrade to upgrade): paramiko in /usr/lib/python2.6/site-packages
Requirement already satisfied (use --upgrade to upgrade): PyYAML in /usr/lib64/python2.6/site-packages
Requirement already satisfied (use --upgrade to upgrade): jinja2 in /usr/lib/python2.6/site-packages
Requirement already satisfied (use --upgrade to upgrade): httplib2 in /usr/lib/python2.6/site-packages
Downloading/unpacking markupsafe (from jinja2)
  Downloading MarkupSafe-0.23.tar.gz
  Running setup.py (path:/tmp/pip_build_root/markupsafe/setup.py) egg_info for package markupsafe
Installing collected packages: markupsafe
  Running setup.py install for markupsafe
    building 'markupsafe._speedups' extension
    gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.6 -c markupsafe/_speedups.c -o build/temp.linux-x86_64-2.6/markupsafe/_speedups.o
    gcc -pthread -shared build/temp.linux-x86_64-2.6/markupsafe/_speedups.o -L/usr/lib64 -lpython2.6 -o build/lib.linux-x86_64-2.6/markupsafe/_speedups.so
Successfully installed markupsafe
Cleaning up...


By default, Ansible will be running against the development branch. You might want to check out the latest stable branch. Check what the latest stable version is using the following command line:

$ git branch -a

Check out the latest stable version you want to use. Version 1.7.1 was the latest version available at the time of writing:

[node ansible]$ git checkout release1.7.1
Branch release1.7.1 set up to track remote branch release1.7.1 from origin.
Switched to a new branch 'release1.7.1'

[node ansible]$ ansible --version
ansible 1.7.1 (release1.7.1 268e72318f) last updated 2014/09/28 21:27:25 (GMT +000)

You now have a working setup of Ansible. One of the benefits of running Ansible from source is that you can enjoy new features immediately, without waiting for your package manager to make them available to you.

Installing Ansible using the system's package manager

Ansible also provides a way to install itself using the system's package manager. We will look into installing Ansible via Yum, Apt, Homebrew, and pip.

Installing via Yum

If you are running a Fedora system, you can install Ansible directly. For CentOS- or RHEL-based systems, you should add the EPEL repository first, as follows:

$ sudo yum install ansible


On CentOS 6 or RHEL 6, you can install the EPEL repository by running rpm -Uvh against the package URL, as follows:

$ sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

You can also install Ansible from an RPM file. You need to use the make rpm command against the git checkout of Ansible, as follows:

$ git clone git://github.com/ansible/ansible.git
$ cd ./ansible
$ make rpm
$ sudo rpm -Uvh ~/rpmbuild/ansible-*.noarch.rpm


You should have rpm-build, make, and python2-devel installed on your system to build an rpm.

Installing via Apt

Ansible is available for Ubuntu in a Personal Package Archive (PPA). To configure the PPA and install Ansible on your Ubuntu system, use the following command lines. Note that the apt-add-repository utility is provided by the software-properties-common package (named python-software-properties on older Ubuntu releases):

$ sudo apt-get install software-properties-common
$ sudo apt-add-repository ppa:rquillo/ansible
$ sudo apt-get update
$ sudo apt-get install ansible

You can also compile a deb file for Debian and Ubuntu systems, using the following command line:

$ make deb
Installing via Homebrew

You can install Ansible on Mac OSX using Homebrew, as follows:

$ brew update
$ brew install ansible
Installing via pip

You can install Ansible via Python's package manager pip. If you don't have pip installed on your system, install it. You can use pip to install Ansible on Windows too, using the following command line:

$ sudo easy_install pip

You can now install Ansible using pip, as follows:

$ sudo pip install ansible

Once you're done installing Ansible, run ansible --version to verify that it has been installed:

$ ansible --version

You will get the following as the output of the preceding command line:

ansible 1.7.1

Hello Ansible

Let's start by checking whether two remote machines are reachable; in other words, let's start by pinging the two machines, following which we'll echo hello ansible on both remote machines. The following are the steps that need to be performed:

  1. Create an Ansible inventory file. This can contain one or more groups. Each group is defined within square brackets. This example has one group called servers (the two hostnames shown are placeholders; use your own machines):

    $ cat inventory
    [servers]
    machine1
    machine2
  2. Now, we have to ping the two machines. In order to do that, first run ansible --help to view the available options, as shown below (only copying the subset that we need for this example):

    ansible --help
    Usage: ansible <host-pattern> [options]
                           module arguments
    -i INVENTORY, --inventory-file=INVENTORY
                           specify inventory host file
    -m MODULE_NAME, --module-name=MODULE_NAME
                           module name to execute

    We'll now ping the two servers using the Ansible command line with the ping module:

    $ ansible servers -i inventory -m ping

  3. Now that we can ping these two servers, let's echo hello ansible! on both of them.

Consider the following command:

$ ansible servers -i inventory -a '/bin/echo hello ansible!'

The preceding command line is the same as the following one:

$ ansible servers -i inventory -m command -a '/bin/echo hello ansible!'

If you move the inventory file to /etc/ansible/hosts, the Ansible command will become even simpler, as follows:

$ ansible servers -a '/bin/echo hello ansible!'

There you go. The 'Hello Ansible' program works! Time to tweet!


You can also specify the inventory file by exporting it in a variable named ANSIBLE_HOSTS. The preceding command, without the -i option, will work even in that situation.
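For example (the inventory path shown is illustrative):

```shell
$ export ANSIBLE_HOSTS=~/inventory
$ ansible servers -a '/bin/echo hello ansible!'
```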

Now that we've seen the 'Hello, world!' example, let's dig a little deeper into the architecture. Once you've got a handle on the architecture, you will start realizing the power of Ansible.


The Ansible architecture

As you can see from the following diagram, the idea is to have one or more command centers from where you can blast out commands onto remote machines or run a sequenced instruction set via playbooks:

The host inventory file determines the target machines where these plays will be executed. The Ansible configuration file can be customized to reflect the settings in your environment. The remote servers should have Python installed along with a library named simplejson in case you are using Python Version 2.5 or an earlier version.

The playbooks consist of one or more tasks that are expressed either with core modules that come with Ansible or custom modules that you can write for specific situations. The plays are executed sequentially from top to bottom, so there is no explicit order that you have to define. However, you can perform conditional execution on tasks so that they can be skipped (an Ansible term) if the conditions are not met.

You can also use the Ansible API to run scripts. These are situations where you would have a wrapper script that would then utilize the API to run the playbooks as needed. The playbooks are declarative in nature and are written in YAML Ain't Markup Language (YAML). This takes the declarative simplicity of such systems to a different level.

Ansible can also be used to provision new machines in data centers and/or the cloud based on your infrastructure, and to configure them based on the role of the new machine. For such situations, Ansible has the power to execute a certain number of tasks in the local mode, that is, on the command center, and other tasks on the actual machine after it boots up.

In this case, a local action can spawn a new machine using an API of some sort, wait for the machine to come up by checking whether the standard ports are up, and then log in to the machine and execute commands. The other aspect to consider is that Ansible can run tasks either serially or in N parallel threads. This leads to different permutations and combinations when you're using Ansible for deployment.
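The wait-for-the-machine step can be expressed with the wait_for module run as a local action; the variable holding the new machine's IP address is hypothetical:

```yaml
# Run on the command center itself; block until SSH answers on the new machine.
- name: Wait for SSH to come up on the newly provisioned machine
  local_action: wait_for host={{ new_machine_ip }} port=22 delay=10 timeout=300
```

Subsequent tasks in the play can then log in to the machine and configure it.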

Before we proceed with full-fledged examples and look at the power of Ansible, we'll briefly look at the Ansible configuration file. This will let you map out the configuration to your setup.


Configuring Ansible

An Ansible configuration file uses an INI format to store its configuration data. In Ansible, you can override nearly all of the configuration settings either through Ansible playbook options or environment variables. When you run an Ansible command, it looks for its configuration file in a predefined order, as follows:

  1. ANSIBLE_CONFIG: Firstly, the Ansible command checks the environment variable, which points to the configuration file

  2. ./ansible.cfg: Secondly, it checks the file in the current directory

  3. ~/.ansible.cfg: Thirdly, it checks the file in the user's home directory

  4. /etc/ansible/ansible.cfg: Lastly, it checks the file that is automatically generated when installing Ansible via a package manager
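For example, to force a particular configuration file for the current shell session (the path shown is illustrative):

```shell
$ export ANSIBLE_CONFIG=~/custom-ansible.cfg
```

This takes precedence over ./ansible.cfg, ~/.ansible.cfg, and /etc/ansible/ansible.cfg.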

If you have installed Ansible through your system's package manager or pip, then you should already have a copy of ansible.cfg under the /etc/ansible directory. If you installed Ansible through the GitHub repository, you can find ansible.cfg under the examples directory, where you cloned your Ansible repository.

Configuration using environment variables

You can use most of the configuration parameters directly via environment variables by prefixing the parameter name with ANSIBLE_ (the name should be in uppercase). Consider the following command line for example:

$ export ANSIBLE_SUDO_USER=root

The ANSIBLE_SUDO_USER variable can then be used as part of the playbooks.

Configuration using ansible.cfg

Ansible has many configuration parameters; you might not need to use all of them. We will consider some of the configuration parameters, as follows, and see how to use them:

  • hostfile: This parameter indicates the path to the inventory file. The inventory file consists of a list of hosts that Ansible can connect to. We will discuss inventory files in detail later in this chapter. Consider the following command line for example:

    hostfile = /etc/ansible/hosts
  • library: Whenever you ask Ansible to perform any action, whether it is a local action or a remote one, it uses a piece of code to perform the action; this piece of code is called a module. The library parameter points to the path of the directory where Ansible modules are stored. Consider the following command line for example:

    library = /usr/share/ansible
  • forks: This parameter is the default number of processes that you want Ansible to spawn. It defaults to five maximum processes in parallel. Consider the following command line for example:

    forks = 5
  • sudo_user: This parameter specifies the default user that should be used against the issued commands. You can override this parameter from the Ansible playbook as well (this is covered in a later chapter). Consider the following command line for example:

    sudo_user = root
  • remote_port: This parameter is used to specify the port used for SSH connections, which defaults to 22. You might never need to change this value unless you are using SSH on a different port. Consider the following command line for example:

    remote_port = 22
  • host_key_checking: This parameter is used to disable the SSH host key checking; this is set to True by default. Consider the following command line for example:

    host_key_checking = False
  • timeout: This is the default value for the timeout of SSH connection attempts:

    timeout = 60
  • log_path: By default, Ansible doesn't log anything; if you would like to send the Ansible output to a logfile, then set the value of log_path to the file you would like to store the Ansible logs in. Consider the following command line for example:

    log_path = /var/log/ansible.log
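Putting the preceding parameters together, a minimal ansible.cfg might look like the following; all values are illustrative and every line is optional:

```ini
[defaults]
hostfile = /etc/ansible/hosts
library = /usr/share/ansible
forks = 5
sudo_user = root
remote_port = 22
host_key_checking = False
timeout = 60
log_path = /var/log/ansible.log
```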

In the latter half of this chapter, we'll focus on Ansible's features and, primarily, how it can be used for configuration management. We recommend that you try out the given examples.


Configuration management

There has been a huge shift across companies of all sizes throughout the world in the field of infrastructure automation. CFEngine was one of the first tools to demonstrate this capability way back in the 1990s; more recently, there have been Puppet, Chef, and Salt besides Ansible. We will try and compare Ansible with Puppet and Chef during the course of this book since we've had a good experience with all three tools. We will also point out specifically how Ansible would solve a problem compared to Chef or Puppet.

All of them are declarative in nature and expect to move a machine to the desired state that is specified in the configuration. For example, in each of these tools, in order to start a service at a point in time and start it automatically on restart, you would need to write a declarative block or module; every time the tool runs on the machine, it will aspire to obtain the state defined in your playbook (Ansible), cookbook (Chef), or manifest (Puppet).

The difference in the toolsets is minimal at a simple level, but as more situations arise and the complexity increases, you will start finding differences between them. In Puppet, you need to take care of the ordering yourself, and the Puppet server will create the sequence of instructions to execute every time you run it on a different box. To exploit the power of Chef, you will need a good Ruby team. Your team needs to be good at Ruby to customize either Puppet or Chef, and you will face a steeper learning curve with both tools.

With Ansible, the case is different. It uses the simplicity of Chef when it comes to the order of execution, the top-to-bottom approach, and allows you to define the end state in the YAML format, which makes the code extremely readable and easy for everyone, from Development teams to Operations teams, to pick up and make changes. In many cases, even without Ansible, operations teams are given playbook manuals to execute instructions from, whenever they face issues. Ansible mimics that behavior. Do not be surprised if you end up having your project manager change the code in Ansible and check it into git because of its simplicity!

Let's now start looking at playbooks, variables, inventory files, and modules.


Working with playbooks

Playbooks are one of the core features of Ansible and tell Ansible what to execute. They are like a to-do list for Ansible that contains a list of tasks; each task internally links to a piece of code called a module. Playbooks are simple human-readable YAML files, whereas modules are a piece of code that can be written in any language with the condition that its output should be in the JSON format. You can have multiple tasks listed in a playbook and these tasks would be executed serially by Ansible. You can think of playbooks as an equivalent of manifests in Puppet, states in Salt, or cookbooks in Chef; they allow you to enter a list of tasks or commands you want to execute on your remote system.

The anatomy of a playbook

Playbooks can have a list of remote hosts, user variables, tasks, handlers (covered later in this chapter), and so on. You can also override most of the configuration settings through a playbook. Let's start looking at the anatomy of a playbook. The purpose of our example playbook, setup_apache.yml, is to ensure that the httpd package is installed and the service is started.

The setup_apache.yml file is an example of a playbook. The file comprises three main parts, as follows:

  • hosts: This lists the host or host group against which we want to run the tasks. The hosts field is mandatory and every playbook should have it (except roles). It tells Ansible where to run the listed tasks. When provided with a host group, Ansible will take the host group from the playbook and will try looking for it in an inventory file (covered later in the chapter). If there is no match, Ansible will skip all the tasks for that host group. The --list-hosts option along with the playbook (ansible-playbook <playbook> --list-hosts) will tell you exactly which hosts the playbook will run against.

  • remote_user: This is one of the configuration parameters of Ansible (for example, remote_user: tom) that tells Ansible to use the specified user (in this case, tom) while logging into the system.

  • tasks: Finally, we come to tasks. All playbooks should contain tasks. Tasks are a list of actions you want to perform. A task entry contains the name of the task (that is, the help text for the user about the task), the module that should be executed, and the arguments required by the module. Let's look at the tasks listed in the playbook:

       - name: Install httpd package
         yum: name=httpd state=latest
         sudo: yes
       - name: Starting httpd service
         service: name=httpd state=started
         sudo: yes


Most of the examples in the book would be executed on CentOS, but the same set of examples with a few changes would work on Ubuntu as well.
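Putting the parts together, the complete setup_apache.yml playbook might look like the following sketch; the remote_user value is taken from the example above:

```yaml
---
- hosts: host1
  remote_user: tom
  tasks:
    - name: Install httpd package
      yum: name=httpd state=latest
      sudo: yes
    - name: Starting httpd service
      service: name=httpd state=started
      sudo: yes
```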

In the preceding case, there are two tasks. The name parameter represents what the task is doing and is present only to improve readability, as we'll see during the playbook run. The name parameter is optional. The modules, yum and service, have their own set of parameters. Almost all modules have the name parameter (there are exceptions such as the debug module), which indicates what component the actions are performed on. Let's look at the other parameters:

  • In the yum module's case, the state parameter with the latest value indicates that the latest version of the httpd package should be installed. The command to execute more or less translates to yum install httpd.

  • In the service module's scenario, the state parameter with the started value indicates that httpd service should be started, and it roughly translates to /etc/init.d/httpd start.

  • The sudo: yes parameter indicates that the tasks should be executed with sudo access. If the sudoers file does not allow the user to run the particular command, then the playbook will fail when it is run.


You might have questions about why there is no package module that internally figures out the system and runs either yum, apt, or other package options depending on the operating system. Ansible populates the package manager value into a fact named ansible_pkg_mgr.

In general, we need to remember that the number of packages that have a common name across different operating systems is a small subset of the number of packages that are actually present. For example, the httpd package is called httpd in Red Hat systems and apache2 in Debian-based systems. We also need to remember that every package manager has its own set of options that make it powerful; as a result, it makes more sense to use explicit package manager names so that the full set of options are available to the end user writing the playbook.
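If you do need one playbook to span both families, you can branch on the gathered package manager fact (ansible_pkg_mgr in the setup module's output) yourself; a hedged sketch:

```yaml
# Only one of these two tasks runs on a given machine, based on the fact.
- name: Install Apache on Red Hat family systems
  yum: name=httpd state=latest
  when: ansible_pkg_mgr == "yum"

- name: Install Apache on Debian family systems
  apt: name=apache2 state=latest
  when: ansible_pkg_mgr == "apt"
```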

Let's look at the folder structure before we run the playbook. We have a folder named example1; within that, there are different files related to the Ansible playbook. With advanced examples, we'll see various folders and multiple files.

The hosts file is set up locally with one host named host1 that corresponds to what is specified in the setup_apache.yml playbook:

$ cat hosts
host1

Now, it's time (yes, finally!) to run the playbook. Run the following command line:

# ansible-playbook -i hosts playbooks/setup_apache.yml

Wow! The example worked. Let's now check whether the httpd package is installed and up and running on the machine, for example, by running service httpd status on host1.

The end state, according to the playbook, has been achieved. Let's briefly look at what exactly happens during the playbook run:

# ansible-playbook -i hosts playbooks/setup_apache.yml

The command, ansible-playbook, is what you would use in order to invoke the process that runs a playbook. The familiar -i option points to the inventory host file. We'll look at other options with ansible-playbook in a later section.

Next, we'll look into the Gathering Facts task that, when run, is displayed as follows:

GATHERING FACTS ***************************************************************
ok: [host1]

The first default task when any playbook is run is the Gathering Facts task. The aim of this task is to gather useful metadata about the machine in the form of variables; these variables can then be used as a part of tasks that follow in the playbook. Examples of facts include the IP Address of the machine, the architecture of the system, and hostname. The following command will show you the facts collected by Ansible about the host:

ansible -m setup host1 -i hosts

You can disable the gathering of facts by setting gather_facts just below the hosts line in the Ansible playbook. We'll look at the pros and cons of fact gathering in a later chapter:

- hosts: host1
    gather_facts: False

TASK: [Install httpd package] *************************************************
changed: [host1]

TASK: [Starting httpd service] ************************************************
changed: [host1]

Following the fact gathering, the actual tasks are executed. Each task reports whether running it changed the state of the machine. In this case, the httpd package was not present and the service was not started, so both tasks report changed (as seen in the preceding screenshot). Let's rerun the playbook now and look at the output after both tasks have already run once.

As you would expect, the two tasks in question report ok, which means that the desired state was already met before the task ran. It's important to remember that many tasks, such as the Gathering Facts task, only obtain information about a particular component of the system and do not change anything on the system; hence, these tasks never display changed in their output.

The PLAY RECAP section in the first and second run are shown as follows. You will see the following output during the first run:

You will see the following output during the second run:

As you can see, the difference is that the first run's recap shows changed=2, which means that the system state was changed twice by the two tasks. This output is very useful to watch: once a system has reached its desired state, running the playbook against it again should report changed=0.
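The recap lines themselves appear only as screenshots; their shape is roughly as follows, with the ok counts illustrative. First run:

```
PLAY RECAP ********************************************************************
host1                      : ok=3    changed=2    unreachable=0    failed=0
```

Second run:

```
PLAY RECAP ********************************************************************
host1                      : ok=3    changed=0    unreachable=0    failed=0
```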

If you're thinking of the word Idempotency at this stage, you're absolutely right and deserve a pat on the back! Idempotency is one of the key tenets of configuration management. Wikipedia defines an idempotent operation as one that, if applied twice to any value, gives the same result as if it were applied once. The earliest example you would have encountered in your childhood is multiplication by the number 1, where 1*1=1 every single time.

Most configuration management tools have taken this principle and applied it to infrastructure as well. In a large infrastructure, it is highly recommended to monitor or track the number of changed tasks in your infrastructure and alert the concerned teams if you find oddities; this applies to any configuration management tool in general. In an ideal state, the only time you should see changes is when you're introducing a new change in the form of a Create, Read, Update, or Delete (CRUD) operation on various system components. If you're wondering how you can do this with Ansible, keep reading the book and you'll eventually find the answer!

Let's proceed. You could also have written the preceding tasks as follows, without the name parameter; however, named tasks are far more readable from an end user's perspective when they are run:

   - yum: name=httpd state=latest
     sudo: yes
   - service: name=httpd state=started
     sudo: yes

Let's run the playbook again to spot any difference in the output, as shown in the following screenshot:

As you can see, the difference is in the readability. Wherever possible, it's recommended to keep the tasks as simple as possible (the KISS principle of Keep It Simple Stupid) to allow for maintainability of your scripts in the long run.

Now that we've seen how you can write a basic playbook and run it against a host, let's look at other options that would help you while running playbooks.

One of the first options anyone picks up is the debug option. To understand what happens when you run a playbook, you can run it with the verbose (-v) option. Every extra v gives the end user more debug output. Let's see an example of debugging a single task using the following verbosity options.

  • The -v option provides the default output, as shown in the preceding screenshot.

  • The -vv option adds a little more information, as shown in the following screenshot:

  • The -vvv option adds a lot more information, as shown in the following screenshot. This shows the SSH command Ansible uses to create a temporary file on the remote host and run the script remotely.

Within a playbook, it often becomes important to view the values of certain variables. To help with this, there is a debug module that you can add to your setup_apache.yml playbook, as shown in the following screenshot:

Let's run the setup_apache.yml playbook and see how the debug module works in the following screenshot:

This is also the first use of the metadata that we gathered from the machine. Here, we output the value assigned to the ansible_distribution variable. Every variable that holds gathered metadata starts with the ansible_ prefix. The debug module can be used generously to help you while developing tasks. Also, as expected, there is no change in the state of the machine; hence, changed=0.

The next useful option with ansible-playbook is --list-tasks, which simply lists all the tasks that will be executed when you run a playbook. When several tasks across multiple included playbooks run as part of one playbook, this option helps you analyze the playbook before it is run. Let's look at an example in the following screenshot:

You also have the --start-at-task option. It will start executing the playbook at the task you specify. Let's look at an example in the following screenshot:



As you can see, there is no Install httpd package task in the preceding screenshot, since it was skipped. Another related option that we should look at is --step. With this, Ansible prompts the user before executing (or skipping) each task. Let's look at an example in the following screenshot:

In the preceding example, we didn't execute the Starting httpd service task. There are several more useful options that we will cover later in this chapter and in Chapter 2, Developing, Testing, and Releasing Playbooks, along with relevant examples. For now, let's jump into variables.


Variables and their types

Variables are used to store values that can be later used in your playbook. They can be set and overridden in multiple ways. Facts of machines can also be fetched as variables and used. Ansible allows you to set your variables in many different ways, that is, either by passing a variable file, declaring it in a playbook, passing it to the ansible-playbook command using -e / --extra-vars, or by declaring it in an inventory file (discussed later in this chapter).

Before we look at the preceding ways in a little more detail, let's look at some of the ways in which variables in Ansible can help you, as follows:

  • Use them to specify the package name when you have different operating systems running, as follows:

    - set_fact: package_name=httpd
      when: ansible_os_family == "RedHat"
    - set_fact: package_name=apache2
      when: ansible_os_family == "Debian"

    The preceding tasks will set a variable named package_name to either httpd or apache2, depending on the OS family of your machine. We'll look at other facts that are fetched as variables shortly.

  • Use them to store user values using a prompt in your Ansible playbook:

    - name: Package to install
      pause: prompt="Provide the package name which you want to install "
      register: package_name

    The preceding task will prompt the user to provide the package name. The user input would then be stored in a variable named package_name.

  • Store a list of values and loop over it.

  • Reduce redundancy when multiple playbooks refer to the same variable name. You can then change the value in just one place and it gets reflected in every invocation of the variable.

The types of variables that Ansible supports are String, Number, Float, List, Dictionary, and Boolean.
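As an illustration of these types (all names and values here are made up), a variable file could contain:

```yaml
package_name: httpd          # String
max_clients: 200             # Number
timeout: 2.5                 # Float
packages:                    # List
  - httpd
  - vim-common
database:                    # Dictionary
  name: mydb
  port: 3306
debug_mode: True             # Boolean
```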

Variable names

All variable names in Ansible should start with a letter. The name can have letters, numbers, and an underscore.

Valid variable names in Ansible

The following are a few examples of valid variable names in Ansible:

  • package_name

  • package_name2

  • user_input_package

  • Package

Invalid variable names in Ansible

The following are a few examples of invalid variable names in Ansible:

  • mysql version (multiple words)

  • mysql.port (a dot)

  • 5 (a number)

  • user-input (a hyphen)

You can define variables in Ansible at different hierarchy levels; let's see what those levels are and how you can override variables in that hierarchy.

Variables in an included task file

Variables in an included task file will override variables defined at any other level of the hierarchy, except extra variables passed through the command line. We will see how command-line variables work later in this chapter. This override feature allows you to change the value of a variable across different tasks, making it more dynamic. It is one of the widely used variable features: you assign a default value to a variable and override it during specific tasks. Let's see an example of how this works in the following screenshot:

In the preceding example, we created two playbooks. One will set a fact for the package name and install Apache depending on the OS distribution; the second one will actually be executed by Ansible, which will first call install_apache.yml and make sure the Apache service is running on the remote host. To fetch the package name, we will directly use the package_name variable that was set by the install_apache.yml playbook.

Variables in a playbook

Variables in a playbook are set by the vars: key. This key takes a key-value pair, where the key is the variable name and the value is the actual value of the variable. This variable will overwrite other variables that are set by a global variable file or from an inventory file. We will now see how to set a variable in a playbook using the preceding example. This is demonstrated in the following screenshot:

The preceding example will first set the package_name variable with a default value of httpd; this value will be further overridden by a task, install_apache.yml, that we included. You can define multiple variables, each in a separate line.
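Based on this description, the playbook in the screenshot would look roughly like this (the task file name is taken from the text):

```yaml
---
- hosts: host1
  vars:
    package_name: httpd
  tasks:
    - include: install_apache.yml
```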

Variables in a global file

Variables in Ansible can also be defined in a separate file; this allows you to separate your data (that is, variables) from your playbook. You can define as many variable files as you want; you just need to tell your playbook the files it needs to look at for variables. The format to define variables in a file is similar to the format of playbook variables. You provide the variable name and its value in a key-value pair and it follows a YAML format. Let's see how it works in the following screenshot:

The preceding example defines several variables. For most of them, we directly provide a default value, whereas, for AWS_ACCESS_KEY and AWS_SECRET_KEY, we fetch the value from an environment variable using the env lookup plugin within a Jinja template (more on this in later chapters). Anything that follows a hash (#) is not interpreted by Ansible and counts as a comment. You can also place a comment after a variable definition. For example, consider the following line:

mount_point: "/dev/sdf"    # Default mount point
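Putting this together, the whole variable file described above would be along these lines (the lookup syntax for the AWS keys follows the text):

```yaml
---
mount_point: "/dev/sdf"    # Default mount point
AWS_ACCESS_KEY: "{{ lookup('env', 'AWS_ACCESS_KEY') }}"
AWS_SECRET_KEY: "{{ lookup('env', 'AWS_SECRET_KEY') }}"
```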

You tell Ansible which variable files need to be checked by using the vars_files key. You can specify multiple variable files in a playbook. Ansible will check for a variable in a bottom-to-top manner. Let's see how this works, in the following screenshot:

In the preceding example, Ansible will first check for a variable named package_name in the var2.yml file. It will stop further lookup if it finds the variable in var2.yml; if not, it will try searching for the variable in var3.yml, and other files if there are any more.
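In playbook syntax, that vars_files list is simply (filenames per the text):

```yaml
---
- hosts: host1
  vars_files:
    - var2.yml
    - var3.yml
```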

Facts as variables

You've already seen an example of how to use a fact, such as ansible_distribution, that is obtained as a fact from a machine. Let's look at a bigger list that you can access when you run gather_facts. The same set of facts can be seen by running the setup module without using the ansible-playbook command, and by using the ansible command as shown in the following command lines:

$ ansible host1 -i inventory -m setup
host1 | success >> {
    "ansible_facts": {
        "ansible_all_ipv4_addresses": [...],
        "ansible_all_ipv6_addresses": [...],
        "ansible_architecture": "x86_64",
        "ansible_distribution_major_version": "6",
        "ansible_distribution_release": "Final",
        "ansible_distribution_version": "6.4",
        "ansible_domain": "localdomain",
        ...
        "ansible_swapfree_mb": 2559,
        "ansible_swaptotal_mb": 2559,
        "ansible_system": "Linux",
        "ansible_system_vendor": "innotek GmbH",
        "ansible_user_id": "vagrant",
        "ansible_userspace_architecture": "x86_64",
        "ansible_userspace_bits": "64",
        ...

These facts are now exposed as variables and can be used as part of playbooks. You will find more examples regarding this later in this book.

Command-line variables

Command-line variables are a great way to override file/playbook variables. This feature allows you to give your users the power to pass the value of a variable directly from the ansible-playbook command. You can use the -e or --extra-vars option of the ansible-playbook command, passing a string of space-separated key=value pairs. Consider the following command line:

ansible-playbook -i hosts --private-key=~/.ssh/ansible_key playbooks/apache.yml \
--extra-vars "package_name=apache2"

The preceding ansible-playbook command will overwrite the package_name variable if it is mentioned in any of the variable files. Command-line variables will not override the variables that are set by the set_fact module. To prevent this type of overriding, we can use Ansible's conditional statements, which will override a variable only if it is not defined. We will discuss more about conditional statements in later chapters.

One of the commonly used command-line variables is hosts. Till now, we saw some Ansible playbook examples where we directly passed the hostname to the playbook. Instead of passing the hostname directly, you can use an Ansible variable and leave it to the end user to provide the hostname. Let's see how this works in the following screenshot:

In the preceding playbook, instead of directly using the hostname, we will now pass a variable named nodes. The ansible-playbook command for such a playbook will look as follows:

ansible-playbook -i hosts --private-key=~/.ssh/ansible_key playbooks/apache.yml \
--extra-vars "nodes=host1"

The preceding ansible-playbook command will set the value of nodes as host1. The Ansible playbook will now use the value of the nodes variable against hosts. You can also pass a group of hosts by passing the hostname to the nodes variable (more on grouping hosts will be seen in the next section).
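The head of such a playbook would be along these lines (the task shown is the same install task used earlier in this chapter):

```yaml
---
- hosts: "{{ nodes }}"
  tasks:
    - name: Install httpd package
      yum: name=httpd state=latest
      sudo: yes
```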

The last thing we'd like to cover in this section is how to avoid typing out all the options every single time. You can export the options as environment variables, as shown in the following command lines, so that you don't have to type them all.

$ env | grep ANSIBLE
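As a sketch of what those exports might look like — the exact variable names depend on your Ansible version, and the ones below are assumptions based on Ansible 1.x environment variables:

```shell
# Assumed Ansible 1.x environment variable names:
export ANSIBLE_HOSTS=hosts                          # inventory file, replaces -i
export ANSIBLE_PRIVATE_KEY_FILE=~/.ssh/ansible_key  # replaces --private-key
env | grep ANSIBLE
```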

Once you type the preceding command lines, the playbook command would resemble the following command line:

ansible-playbook playbooks/apache.yml --extra-vars "nodes=host1"

Variables in an inventory file

All of the preceding variables are applied globally to all hosts against which you are running your Ansible playbook. You might sometimes need to use a specific list of variables for a specific host. Ansible supports this by declaring your variables inside an inventory file. There are different ways to declare variables inside an inventory file; we will look at how an inventory file works and how to use Ansible with it in the next section of this chapter.

Apart from the user-defined variables, Ansible also provides some system-related variables called facts. These facts are available to all hosts and tasks, and are collected every time you run the ansible-playbook command, unless disabled manually. You can directly access these facts by using a Jinja template, for example, as follows:

- name: Show how debug works
  debug: msg={{ ansible_distribution }}

The ansible_distribution part in the preceding command line is a fact, which will be initialized by Ansible when you run the ansible-playbook command. To check all the facts Ansible collects, run the following command line:

ansible -m setup host1 -i host1,

The preceding example will run the setup module on host1 and list all the facts that Ansible can collect from the remote host. It will also collect facts from facter (a discovery program) if you have it installed on your system. The -i option in the preceding example specifies an inventory file; in this case, instead of passing a file, we directly used the hostname.


When using a hostname directly instead of an inventory file, you need to add a trailing comma "," with the hostname. You can even specify multiple hostnames separated by a comma.


Working with inventory files

An inventory file is the source of truth for Ansible (there is also an advanced concept called dynamic inventory, which we will cover later). It follows the INI format and tells Ansible which remote hosts it knows about; a host or group provided by the user is valid only if it is present in the inventory.

Ansible can run its tasks against multiple hosts in parallel. To do this, you can directly pass the list of hosts to Ansible using an inventory file. For such parallel execution, Ansible allows you to group your hosts in the inventory file; you then pass the group name to Ansible, which will search for that group in the inventory file and run its tasks against all the hosts listed in it.

You can pass the inventory file to Ansible using the -i or --inventory-file option followed by the path to the file. If you do not explicitly specify any inventory file to Ansible, it will take the default path from the hostfile parameter of ansible.cfg, which defaults to /etc/ansible/hosts.

The basic inventory file

Before diving into the concept, first let's look at a basic inventory file in the following screenshot:
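The screenshot shows a plain list of hosts, one per line; it would be roughly as follows (the hostnames are illustrative, drawn from examples later in this section):

```
example.com
web001
db001
192.168.2.1
```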

Ansible can take either a hostname or an IP address within the inventory file. In the preceding example, we specified four servers; Ansible will take these servers and search for the hostname that you provided, to run its tasks. If you want to run your Ansible tasks against all of these hosts, then you can pass all to the hosts parameter while running the ansible-playbook or to the ansible command; this will make Ansible run its tasks against all the hosts listed in an inventory file.

The command that you would run is shown in the following screenshot:

In the preceding screenshot, the Ansible command took all the hosts from an inventory file and ran the ping module against each of them. Similarly, you can use all with the ansible-playbook by passing all to the hosts. Let's see an example for an Ansible playbook in the following screenshot:

Now, when you run the preceding Ansible playbook, it will execute its tasks against all hosts listed in an inventory file.

This command will spawn off four parallel processes, one for each machine. The default number of parallel processes is five. For a larger number of hosts, you can increase the number of parallel processes with the -f or --forks=<value> option.

Coming back to the features of the file, one of the drawbacks with this type of simple inventory file is that you cannot run your Ansible tasks against a subset of the hosts, that is, if you want to run Ansible against two of the hosts, then you can't do that with this inventory file. To deal with such a situation, Ansible provides a way to group your hosts and run Ansible against that group.

Groups in an inventory file

In the following example, we grouped the inventory file into three groups, that is, application, db, and jump:
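A grouped inventory of this kind would look as follows (host names drawn from examples later in this section):

```
[application]
example.com
web001

[db]
db001

[jump]
192.168.2.1
```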

Now, instead of running Ansible against all the hosts, you can run it against a set of hosts by passing the group name to the ansible-playbook command. When Ansible runs its tasks against a group, it will take all the hosts that fall under that group. To run Ansible against the application group, you need to run the command line shown in the following screenshot:

This time we directly passed the group name instead of running Ansible against all hosts. You can have multiple groups in an inventory file, and you can even club similar groups together under one parent group (we will see how this works in the next section). You can use groups with the ansible-playbook command as well by passing the group name to hosts.

In the preceding screenshot, Ansible will run its tasks against the hosts example.com and web001.

You can still run Ansible against a single host by directly passing the hostname or against all the hosts by passing all to them.

Groups of groups

Grouping is a good way to run Ansible on multiple hosts together. Ansible also provides a way to group multiple groups together. For example, let's say you have multiple application and database servers running on the east coast, grouped as application and db. You can then create a master group called eastcoast. Using this group, you can run Ansible on your entire east coast data center instead of running it against each group one by one.

Let's take a look at an example shown in the following screenshot:
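In Ansible's INI inventory syntax, a group of groups is declared with the :children suffix; the eastcoast example would be written as:

```
[eastcoast:children]
application
db
```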

You can use a group of groups in the same way you use Ansible with groups in the preceding section. This is demonstrated in the following screenshot:

You can directly refer to such an inventory group in the ansible-playbook command as follows:

Regular expressions with an inventory file

An inventory file would be very helpful if you have many servers. Let's say you have a large number of web servers that follow the same naming convention, for example, web001, web002, …, web00N, and so on. Listing all these servers separately will result in a dirty inventory file, which would be difficult to manage with hundreds to thousands of lines. To deal with such situations, Ansible allows you to use regex inside its inventory file. The following screenshot shows an example of multiple servers:

From the preceding screenshot, we can deduce the following:

  • web[001:200] will match web001, web002, web003, web004, …, web199, web200 for the application group

  • db[001:020] will match db001, db002, db003, …, db019, db020 for the db group

  • 192.168.2.[1:3] will match 192.168.2.1, 192.168.2.2, and 192.168.2.3 for the jump group
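The inventory entries the preceding bullets describe would be written as:

```
[application]
web[001:200]

[db]
db[001:020]

[jump]
192.168.2.[1:3]
```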

External variables

Ansible allows you to define external variables in many ways, from an external variable file within a playbook, by passing it from the Ansible command using the -e / --extra-vars option, or by passing it to an inventory file. You can define external variables in an inventory file either on a per-host basis, for an entire group, or by creating a variable file in the directory where your inventory file exists.

Host variables

Using the following inventory file, you can set the variable db_name for the db001 host, and both db_name and db_port for a second db host:
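Host variables are appended after the hostname in the inventory; a sketch of such a file (the second hostname and all values here are hypothetical):

```
[db]
db001 db_name=wordpress
db002 db_name=drupal db_port=5432
```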

Group variables

Let's move to variables that can be applied to a group. Consider the following example:

The preceding inventory file will provide two variables and their respective values for the application group, app_type=search and app_port=9898.
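In INI inventory syntax, group variables are set in a :vars section; the example just described would be:

```
[application:vars]
app_type=search
app_port=9898
```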


Host variables will override group variables.

Variable files

Apart from host and group variables, you can also have variable files. Variable files can be either for hosts or groups that reside in the folder of your inventory file. All of the host variable files will go under the host_vars directory, whereas all group variable files will go under the group_vars directory.

The following is an example of a host variable file (assuming your inventory file is under the /etc/ansible directory):

cat /etc/ansible/host_vars/web001

As our inventory file resides under the /etc/ansible directory, we will create a host_vars directory inside /etc/ansible and place our variable file inside that. The name of the variable file should be the hostname, mentioned in your inventory file.
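Such a variable file is plain YAML key-value pairs; for example (the values here are illustrative):

```yaml
---
app_type: search
app_port: 8080
```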

The following is an example of a group variable file:

cat /etc/ansible/group_vars/db

The format of a group variable file is the same as that of a host variable file. The only difference is that the variables are accessible to all the hosts of that group. The name of the variable file should be the group name mentioned in your inventory file.


Inventory variables follow a hierarchy; at the top of this is the common variable file (we discussed this in the previous section, Variables and their types), which will override any host variables, group variables, and inventory variable files. After this come the host variables, which will override group variables; lastly, group variables will override inventory variable files.

Overriding configuration parameters with an inventory file

You can override some of Ansible's configuration parameters directly through the inventory file. These configuration parameters will override all the other parameters that are set either through ansible.cfg, environment variables, or passed to the ansible-playbook/ansible command. The following is the list of parameters you can override from an inventory file:

  • ansible_ssh_user: This parameter is used to override the user that will be used for communicating with the remote host.

  • ansible_ssh_port: This parameter will override the default SSH port with the user-specified port. It's a general, recommended sysadmin practice to not run SSH on the standard port 22.

  • ansible_ssh_host: This parameter is used to override the host for an alias.

  • ansible_connection: This specifies the connection type to be used with the remote host. The values are ssh, paramiko, or local.

  • ansible_ssh_private_key_file: This parameter will override the private key used for SSH; this will be useful if you want to use some specific keys for a specific host. A common use case is if you have hosts spread across multiple data centers, multiple AWS regions, or different kinds of applications. Private keys can potentially be different in such scenarios.

  • ansible_shell_type: By default, Ansible uses the sh shell; you can override this using the ansible_shell_type parameter. Changing this to csh, ksh, and so on will make Ansible use the commands of that shell.

  • ansible_python_interpreter: Ansible, by default, tries to look for a Python interpreter within /usr/bin/python; you can override the default Python interpreter using this parameter.

Let's take a look at the following example:

web001 ansible_ssh_user=myuser ansible_ssh_private_key_file=myuser.rsa

The preceding example will override the SSH user and the SSH private keys for the web001 host. You can set similar variables for groups and variable files.


Working with modules

Now that we've seen how playbooks, variables, and inventories come together, let's look at modules in greater detail. Ansible provides more than 200 modules, grouped under top-level categories such as System, Network, Database, and Files, that you can readily use in your infrastructure. The module index page has more details regarding these categories. We'll explore commonly used modules here and look at more advanced modules in later chapters.

Command modules

We start with four modules that are pretty similar to each other: command, raw, script, and shell. They are used to execute tasks on remote machines. For most tasks involving these modules, we need to take care of idempotency ourselves, using conditional clauses that we will see in the coming chapters. Using parameters such as creates and removes can also introduce idempotency. Let's see when and where we can use each of these modules.

The command module

This takes the command name along with its arguments. However, shell variables and operations such as <, >, |, and & will not work, as the command is not processed by a shell; this behavior is similar to the fork function in C programming. As a result, running the command module is secure and predictable. Also, the command module gives you the following parameters:

  • chdir: This is used to change into a specific directory before executing the command

  • creates: A filename; if it already exists, the command will be skipped

  • removes: A filename; if it does not exist, the command will be skipped
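For instance, the creates parameter can make a one-off command idempotent; a sketch (the script path and marker file here are illustrative, assuming the script writes the marker when it finishes):

```yaml
- name: Run one-time setup script
  command: /opt/setup.sh creates=/opt/.setup_done
  sudo: yes
```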

Let's write a task to reboot a machine, as follows:

   - name: Reboot machine
     command: /sbin/shutdown -r now
     sudo: yes

On running the preceding command, we can see the following output:

As expected, without the conditional clause, this task will execute every single time as part of running the playbook.

The raw module

This module is used only when the other command modules do not fit the bill. It runs the given command directly over SSH on the remote machine, without requiring Python there. Use cases include running tasks on machines that don't have Python installed; networking devices, such as routers and switches, are classic cases. Let's look at a quick example that installs the vim package, as follows:

   - name: Install vim
     raw: yum -y install vim-common
     sudo: yes

On running the preceding task, we see that the package is installed. However, even after the package is installed, a rerun of the task does not report that anything changed. It's best not to use the raw module unless you have to.

The script module

This module copies a script to a remote machine and executes it. It supports the creates and removes parameters. Let's look at an example that counts the directories within a particular directory. Remember, you don't have to copy the script to the remote machine yourself; the module does it for you, as follows:

   - name: List directories in /etc
     script: list_number_of_directories.sh /etc
     sudo: yes

On running the preceding command, we get the following output:

Here, 83 is the number of directories in the /etc directory, which can be verified by running the following command:

$ ls -l /etc | egrep '^d' | wc -l

The shell module

Finally, we come to the shell module. The major difference between the command and shell modules is that the shell module uses a shell (/bin/sh, by default) to run the commands, so you can use shell environment variables and other shell features, such as redirection. Let's look at an example where we redirect the list of all files in /tmp to a file and, in a subsequent task, concatenate (using cat) that file. The tasks are as follows:

   - name: List files in /tmp and redirect to a file
     shell: /bin/ls -l /tmp > /tmp/list
     sudo: yes

   - name: Cat /tmp/list
     shell: /bin/cat /tmp/list

The output is shown as follows:


We've turned off color for screenshots that involve debugging just to make them more readable. The preceding output might not look that great but it can be formatted. Using callbacks and register, you can format an output in a better manner. We'll demonstrate that in later chapters.

File modules

We'll now switch to some very useful file modules, namely, file, template, and copy. There are others as well but we intend to cover the most important ones and those that are used often. Let's start with the file module.

The file module

The file module allows you to change the attributes of a file. It can touch a file, create or delete recursive directories, and create or delete symlinks.

The following example makes sure that httpd.conf has the right permissions and owner:

   - name: Ensure httpd conf has right permissions and owner/group
     file: path=/etc/httpd/conf/httpd.conf owner=root group=root mode=0644
     sudo: yes

On running the preceding command, you should see the following output:

If we check the output on the machine, we will see the following:

As shown in the preceding screenshot, there is no change, as the file already has the expected permissions and ownership. However, it's important to make sure that important configuration files are under the control of Ansible (or any configuration management tool in general) so that, if there are changes, they are reverted the next time the playbook is run. This is one of the reasons for having your infrastructure as code: you control everything that matters from the code that is checked in. If there are genuine changes, they have to be checked into the main Ansible repository that you maintain, and the change has to flow from there.
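If you want to spot-check the attributes the task enforces by hand, stat prints them directly. We use a temporary stand-in file here, since /etc/httpd may not exist on the machine you're experimenting on:

```shell
# A manual spot check of owner/group/mode, using a stand-in file
# instead of /etc/httpd/conf/httpd.conf.
touch /tmp/perm_demo.conf
chmod 0644 /tmp/perm_demo.conf
stat -c '%a' /tmp/perm_demo.conf    # prints the octal mode, for example 644
```

Note that the -c format option is specific to GNU coreutils stat.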

The next example will create a symlink to the httpd conf file, as follows:

   - name: Create a symlink in /tmp for httpd.conf
     file: src=/etc/httpd/conf/httpd.conf dest=/tmp/httpd.conf owner=root group=root state=link
     sudo: yes

The output of the preceding task is as follows:

If we check on the machine, we will see the following output:

The output is as expected. You might notice that we're running the ls command to verify the output. This is not always necessary, but it's highly recommended that you test everything that you automate right from the start. In the next chapter, we'll show you the possible methods in which you can automate these tests. For now, they are manual.
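For symlinks specifically, readlink is a quick way to confirm the link points where you expect. The paths below are stand-ins for the httpd.conf example:

```shell
# Checking a symlink by hand: readlink prints the link's target.
# Stand-in paths are used here instead of the httpd.conf example.
touch /tmp/link_target.conf
ln -sf /tmp/link_target.conf /tmp/link_demo.conf
readlink /tmp/link_demo.conf    # prints /tmp/link_target.conf
```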

Debugging in Ansible

Now, let's create a hierarchy of directories with 777 permissions. For this particular example, we're going to use Ansible 1.5.5 for the purpose of showcasing how to debug with Ansible:

   - name: Create recursive directories
     file: path=/tmp/dir1/dir2/dir3 owner=root group=root mode=0777
     sudo: yes

Do you see something not right with the preceding example? (In hindsight, if we've asked you the question, it means something is wrong!)

Let's run it and check. On running it, we see the following output:

We would expect this task's output to be changed. However, it shows ok. Let's verify it on the system.

There you go! The recursive directories are not present. Why did this happen without an error?

To find the answer, run the playbook with the -vv option you learned about earlier. The following output is received:

This was a bug in Version 1.5.5 that was fixed in Version 1.6; from Version 1.6 onwards, the task errors out if state=directory is not specified. However, there is a possibility that you might find other such issues. Make sure you check the documentation; it might be a bug that you want to raise or possibly fix. To fix the preceding task on Version 1.5.5, we change the state value to directory, as shown in the following command lines:

   - name: Create recursive directories
     file: path=/tmp/dir1/dir2/dir3 owner=root group=root mode=0777 state=directory
     sudo: yes

On running the preceding command line with the debug mode this time, we will see the following output:

Looking at the tree output on the machine, we see that the directory has been created. This is depicted in the following screenshot:
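Outside Ansible, the corrected task is equivalent to a mkdir -p followed by a chmod, which you can use to reproduce and verify the result by hand:

```shell
# What the fixed task does, outside Ansible: mkdir -p creates the
# whole chain of directories, and chmod sets 0777 on the leaf.
mkdir -p /tmp/dir1/dir2/dir3
chmod 0777 /tmp/dir1/dir2/dir3
stat -c '%a' /tmp/dir1/dir2/dir3    # prints 777
```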

The moral of the story is: learn the debugging techniques of a tool so that you can resolve issues at the earliest!

Let's move to another very useful module, template. This is as useful as the template resource in Chef/Puppet.

The template module

Ansible uses the Jinja2 templating language to create templates. Jinja2 is modeled after Django's templates (Django is a popular Python web framework) and is similar to ERB, which Puppet and Chef use.

Templating is also a way to create a file on a machine. Let's now create a simple template using the following command lines:

$cat test
This is a test file on {{ ansible_hostname }}

The test file is in the same directory as example1.yml.

To reference the template, we'll add the following to the playbook:

   - name: Create a test template
     template: src=test dest=/tmp/testfile mode=644

On running the preceding playbook, we get the following output:

As you can see in the following screenshot, Ansible created testfile inside /tmp and applied the template to the file.
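Conceptually, what the template module just did is substitute the value of a variable into the placeholder and write out the result. A crude shell analogue (using sed as a stand-in for the real Jinja2 rendering, and the local hostname in place of the gathered fact) looks like this:

```shell
# A crude analogue of what the template module does: substitute a
# value into the placeholder. Jinja2 does the real rendering; sed and
# the local hostname are stand-ins for illustration only.
printf 'This is a test file on {{ ansible_hostname }}\n' > /tmp/test.tmpl
sed "s/{{ ansible_hostname }}/$(hostname)/" /tmp/test.tmpl > /tmp/testfile_demo
cat /tmp/testfile_demo
```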

The user running the playbook is vagrant in this case and the file created will also be owned by the same user. The ansible_hostname variable is populated during the gather facts phase. Let's take a minor detour and disable gather facts by adding the following to the playbook:

- hosts: host1
  gather_facts: False

Now, on running the playbook, the debug task fails as follows:

On commenting out the task and running it again, we get an error with the template, as shown in the following screenshot:

Now, there are several cases where you might not want to gather facts. In such cases, to refer to the host, Ansible provides another useful variable, inventory_hostname, which you can use. The modified template is as follows:

cat playbooks/test
This is a test file on {{ inventory_hostname }}

On deleting the test file and rerunning the Ansible playbook, we find the same result as before:

As expected, Ansible created testfile and did not fail because we used the inventory_hostname variable this time:

The Ansible template module also has a validate parameter that allows you to run a command to validate the file before copying it. This is like a hook that Ansible provides to make sure files that might break the service are not written. A classic example is that of the Apache httpd configuration: you can verify that the Apache httpd configuration files have the right syntax using the httpd or apachectl command. Since the validate command needs to take the file to check as an argument, we'll use httpd. It works as follows:

$httpd -t -f /etc/httpd/conf/httpd.conf
Syntax OK

If we introduce an error by uncommenting the Virtual hosts line, we get the following output when we rerun the $httpd -t -f /etc/httpd/conf/httpd.conf command:

httpd: Syntax error on line 1003 of /etc/httpd/conf/httpd.conf: /etc/httpd/conf/httpd.conf:1003: <VirtualHost> was not closed.

We'll demonstrate the same technique for a new virtual host file in Apache. We'd like to verify that the new virtual host file that we're adding has the right syntax. Let's look at the virtual host file. There is an error, as shown in the following screenshot:

The playbook will have the following template task. The validate option takes a %s parameter, which is the source file that we're planning to write to the machine:

   - name: Create a virtual host
     template: src=test.conf dest=/etc/httpd/conf.d/test.conf mode=644 validate='httpd -t -f %s'
     sudo: yes

Now, on running the preceding command line, we get the following output:

Ansible points out the error and the file is not written. This is a great feature that Ansible templates offer; if we have scripts for different packages/services that can verify the validity of files, then we'll have a lot fewer breakages due to incorrect configuration files. It is especially useful to have the validate feature when you write your own modules. We will cover how to write custom modules in a later chapter.
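The validate mechanism works with any checker that exits nonzero on bad input, not just httpd. As a small stand-in demonstration (sh -n parses a shell script without running it, much like httpd -t parses a configuration file), the scripts and paths below are our own:

```shell
# validate works with any checker that exits nonzero on bad input.
# As a stand-in for httpd -t, sh -n parses a script without running
# it, so broken input is caught before it is installed.
printf 'echo ok\n' > /tmp/good.sh
sh -n /tmp/good.sh && echo "good.sh: syntax OK"
printf 'if true; then\n' > /tmp/bad.sh     # unterminated if: invalid
sh -n /tmp/bad.sh || echo "bad.sh: syntax error caught"
```

Any such checker can be wired into the validate parameter with the %s placeholder, just as httpd -t was above.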

Let's move to the next module, copy.

The copy module

Using copy, you can copy a file from your local location to remote machines. This is another way to create a remote file with predetermined content (the template being the first). Let's look at an example as follows:

 - name: Copy file remotely
   copy: src=test2.conf dest=/etc/test2.conf owner=root group=root mode=0644

On running the preceding command, we see the following output:

Any idea why this failed? If you figured out that it's because sudo: yes is not set on the task, share a high five with the nearest person. Once we add it, the run goes through without errors, as shown in the following screenshot:

The copy module also supports the validate parameter just like the template module.


With simple modules, we've tried to highlight possible errors that you might come across during your experience with Ansible.

The source control module – git

We'll now look at a very important module, the source control module git. Ansible has modules to support subversion, bzr, and hg, apart from github_hooks. We'll look at possibly the most popular source control system today, git.

Let's check out a git repository from GitHub using an Ansible task. We'll first install git with a yum task, shown as follows:

   - yum: name=git state=installed
     sudo: yes

Now, let's check out the popular gitignore repository. The Ansible task is shown as follows:

   - name: Checkout gitignore repository
     git: repo=https://github.com/github/gitignore.git
     sudo: yes

On running the playbook (using -vv here), we will see the following output:

If we check the machine, we will see that the repository has been checked out in the expected directory.

On rerunning the task, we see the following output:

This basically means that the task is idempotent, as it compares the before and after SHA (Secure Hash Algorithm) values of the checkout.
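That before/after comparison can be reproduced by hand with git rev-parse. The repository path and commit below are our own stand-ins, created locally so no network access is needed:

```shell
# The idempotence check by hand: compare the repository's HEAD SHA
# before and after an update attempt. The repo here is a local
# stand-in created for the demonstration.
git init -q /tmp/sha_demo
cd /tmp/sha_demo
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"
before=$(git rev-parse HEAD)
# ...an update that fetches nothing new would happen here...
after=$(git rev-parse HEAD)
[ "$before" = "$after" ] && echo "unchanged"
```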

There are several other parameters for the git module, such as depth and version (which version to check out). Quite often, we need to check out a git repository via the SSH protocol: the public key has to be added to the repository, after which the checkout can happen. The git module has two other parameters, accept_hostkey and key_file, to help you with such checkouts. The following is an example with a sample repository in one of our GitHub accounts, which we'll check out on the remote machine. This example assumes that the private key pair is already present in ~/.ssh.

Consider the following task for example:

   - name: Checkout git repo via ssh
     git: repo=git@github.com:madhurranjan/node_nagios.git

The output for the preceding command line is as follows:



We end this chapter by summarizing what we've learned. We looked at an introduction to Ansible, wrote our very first playbook, learned ways to use the Ansible command line, learned how to debug playbooks, how Ansible does configuration management at a basic level, how to use variables, how inventory files work, and finally looked at modules. We hope you had a good time learning so far, which is why we recommend a coffee break before you start the next chapter! You can think of the following questions during your break to see whether you've understood the chapter:

  • What does Ansible offer?

  • How can variables be used in Ansible?

  • What files in your environment can be templatized?

  • How can you create inventory files for your current inventory?

We will look at more advanced modules and examples in later chapters. In the next chapter, we will look at how to develop Ansible playbooks and the best practices you should follow. We will focus on how to develop Ansible playbooks, test them, and the release cycle for the Ansible playbooks themselves. Make sure you do this in the right manner and incorporate best practices right from the beginning; this will save you a lot of time once your code matures.

About the Authors

  • Madhurranjan Mohaan

    Madhurranjan Mohaan is a passionate engineer who loves solving problems. He has more than 8 years of experience in the software industry. He worked as a network consultant with Cisco before starting his DevOps journey in 2011 at ThoughtWorks, where he learned the nuances of DevOps and worked on all aspects from Continuous Integration to Continuous Delivery. He is currently working at Apigee, where he enjoys dealing with systems at scale.

    Madhurranjan has also worked with various tools in configuration management, CI, deployment, and monitoring and logging space, apart from cloud platforms such as AWS. He has been a presenter at events such as DevOpsDays and Rootconf, and loves teaching and singing. He is a first-time author and wishes to share his learning with readers from across the world.

  • Ramesh Raithatha

    Ramesh Raithatha is a DevOps engineer by profession and a cyclist at heart. He is currently working at Apigee and has worked on various facets of the IT industry, such as configuration management, continuous deployment, automation, and monitoring. He likes exploring stunning natural surroundings on two wheels in his leisure time.

