Learning Ansible 2.7 - Third Edition

By Fabio Alessandro Locati

About this book

Ansible is an open source automation platform that assists organizations with tasks such as application deployment, orchestration, and task automation. With the release of Ansible 2.7, even complex tasks can be handled much more easily than before.

Learning Ansible 2.7 will help you take your first steps toward understanding the fundamentals and practical aspects of Ansible by introducing you to topics such as playbooks, modules, and support for Linux, Berkeley Software Distribution (BSD), and Windows. In addition to this, you will focus on various testing strategies, deployment, and orchestration to build on your knowledge. The book will then help you get accustomed to features including cleaner architecture, task blocks, and playbook parsing, which can help you streamline automation processes. Next, you will learn how to integrate Ansible with cloud platforms such as Amazon Web Services (AWS) before gaining insights into Ansible Tower, the enterprise version of Ansible, and Ansible Galaxy. This will help you use Ansible to interact with different operating systems and improve your working efficiency.

By the end of this book, you will be equipped with the Ansible skills you need to automate complex tasks for your organization.

Publication date: April 2019
Publisher: Packt
Pages: 266
ISBN: 9781789954333

 

Chapter 1. Getting Started with Ansible

Information and communications technology (ICT) is often described as a fast-growing industry. I think that the best quality of the ICT industry is not related to its ability to grow at a super-high speed, but is related to its ability to revolutionize itself, and the rest of the world, at an astonishing pace.

Every 10 to 15 years there are major shifts in how this industry works, and every shift solves problems that were very hard to manage up to that point, creating new challenges. Also, at every major shift, many of the best practices of the previous iteration are classified as anti-patterns, and new best practices are created. Although it might appear that those changes are impossible to predict, this is not always true. Obviously, it is not possible to know exactly what changes will occur and when they will take place, but looking at companies with a large number of servers and many lines of code usually reveals what the next steps will be.

The current shift has already happened in big companies such as Amazon Web Services (AWS), Facebook, and Google. It is the implementation of IT automation systems to create and manage servers.

In this chapter we will cover the following topics:

  • IT automation
  • What is Ansible?
  • The secure shell
  • Installing Ansible
  • Creating a test environment with Vagrant
  • Version control systems
  • Using Ansible with Git
 

Technical requirements


To support the learning of Ansible, I suggest having a machine where you can install Vagrant. Using Vagrant will allow you to try many operations, even destructive ones, without fear.

Additionally, AWS and Azure accounts are suggested, since some examples will run on those platforms.

All examples in this book are available in the GitHub repository at https://github.com/PacktPublishing/-Learning-Ansible-2.X-Third-Edition/.

 

IT automation


IT automation is, in its broadest sense, the set of processes and software that help with the management of the IT infrastructure (servers, networking, and storage). In the current shift, we are seeing a widespread implementation of such processes and software.

At the beginning of IT history, there were very few servers, and a lot of people were needed to keep them working properly, usually more than one person for each machine. Over the years, servers became more reliable and easier to manage, so it became possible for a single system administrator to manage multiple servers. In that period, administrators installed software, upgraded it, and changed configuration files entirely by hand. This was obviously a very labor-intensive and error-prone process, so many administrators started to implement scripts and other means to make their lives easier. Those scripts were (usually) pretty complex, and they did not scale very well.

In the early years of this century, data centers started to grow considerably to meet companies' needs. Virtualization helped in keeping prices low, and the fact that many of these services were web services meant that many servers were very similar to each other. At this point, new tools were needed to replace the scripts that were used before: configuration management tools.

CFEngine was one of the first tools to demonstrate configuration management capabilities, way back in the 1990s; more recent examples include Puppet, Chef, and Salt, besides Ansible.

Advantages of IT automation

People often wonder if IT automation really brings enough advantages, considering that implementing it has some direct and indirect costs. The main benefits of IT automation are the following:

  • The ability to provision machines quickly
  • The ability to recreate a machine from scratch in minutes
  • The ability to track any change performed on the infrastructure

For these reasons, it's possible to reduce the cost of managing the IT infrastructure by reducing the repetitive operations often performed by system administrators.

Disadvantages of IT automation

As with any other technology, IT automation does come with some disadvantages. From my point of view, these are the biggest disadvantages:

  • Small tasks that were once used to train new system administrators are now automated away.
  • If an error is made, it will be propagated everywhere.

The consequence of the first is that new ways to train junior system administrators will need to be implemented.

The second one is trickier. There are a lot of ways to limit this kind of damage, but none of those will prevent it completely. The following mitigation options are available:

  • Always have backups: Backups will not prevent you from nuking your machine – they will only make the restore process possible.
  • Always test your infrastructure code (playbooks/roles) in a non-production environment: Companies have developed different pipelines to deploy code, and those usually include environments such as dev, test, staging, and production. Use the same pipeline to test your infrastructure code. If a buggy application reaches the production environment, it could be a problem. If a buggy playbook reaches the production environment, it can be catastrophic.
  • Always peer-review your infrastructure code: Some companies have already introduced peer-reviews for the application code, but very few have introduced it for the infrastructure code. As I was saying in the previous point, I think that infrastructure code is way more critical than application code, so you should always peer-review your infrastructure code, whether you do it for your application code or not.
  • Enable SELinux: SELinux is a security kernel module that is available on all Linux distributions (it is installed by default on Fedora, Red Hat Enterprise Linux, CentOS, Scientific Linux, and Unbreakable Linux). It allows you to limit users' and processes' powers in a very granular way. I suggest using SELinux instead of other similar modules (such as AppArmor) because it is able to handle more situations and permissions. SELinux will prevent a huge amount of damage because, if correctly configured, it will prevent many dangerous commands from being executed.
  • Run the playbooks from a limited account: Even though user and privilege escalation schemes have been in Unix code for more than 40 years, it seems as if not many companies use them. Using a limited user for all your playbooks, and escalating privileges only for commands that need higher privileges, will help prevent you from nuking a machine while trying to clean an application temporary folder.
  • Use horizontal privilege escalation: The sudo command is well known, but it is often used in its most dangerous form. The sudo command supports the -u parameter, which allows you to specify a user that you want to impersonate. If you have to change a file that is owned by another user, please do not escalate to root to do so, just escalate to that user. In Ansible, you can use the become_user parameter to achieve this (a sketch follows this list).
  • When possible, don't run a playbook on all your machines at the same time: Staged deployments can help you detect a problem before it's too late. There are many problems that are not detectable in dev, test, staging, and QA environments. The majority of them are related to a load that is hard to emulate properly in those non-production environments. A new configuration you have just added to your Apache HTTPd or MySQL servers could be perfectly OK from a syntax point of view, but disastrous for your specific application under your production load. A staged deployment will allow you to test your new configuration on your actual load without risking downtime in case something was wrong.
  • Avoid guessing commands and modifiers: A lot of system administrators will try to remember the right parameter, and try to guess if they don't remember it exactly. I've done it too, a lot of times, but this is very risky. Checking the man page or the online documentation will usually take you less than two minutes, and often, by reading the manual, you'll find interesting notes you did not know. Guessing modifiers is dangerous because you could be fooled by a non-standard modifier (that is, -v is not a verbose mode for grep, and -h is not a help command for the MySQL CLI).
  • Avoid error-prone commands: Not all commands have been created equal. Some commands are (way) more dangerous than others. If you can assume that a cat-based command is safe, you have to assume that a dd-based command is dangerous, since it performs copies and conversions of files and volumes. I've seen people using dd in scripts to transform DOS files to Unix (instead of dos2unix) and many other, very dangerous, examples. Please avoid such commands, because they could result in a huge disaster if something goes wrong.
  • Avoid unnecessary modifiers: If you need to delete a simple file, use rm ${file}, not rm -rf ${file}. The latter is often used by people who have learned "to be sure, always use rm -rf" because, at some point in their past, they have had to delete a folder. Dropping the unnecessary modifiers will prevent you from deleting an entire folder if the ${file} variable is set wrongly.
  • Always check what could happen if a variable is not set: If you want to delete the contents of a folder and you use the rm -rf ${folder}/* command, you are looking for trouble. If the ${folder} variable is not set for some reason, the shell will read a rm -rf /* command, which is deadly (considering that the rm -rf / command will not work on the majority of current OSes because it requires a --no-preserve-root option, while rm -rf /* will work as expected). I'm using this specific command as an example because I have seen such situations: the variable was pulled from a database that, due to some maintenance work, was down, and an empty string was assigned to that variable. What happened next is probably easy to guess. In case you cannot avoid using variables in dangerous places, at least check that they are not empty before using them. This will not save you from every problem, but may catch some of the most common ones.
  • Double-check your redirections: Redirections (along with pipes) are the most powerful elements of Unix shells. They can also be very dangerous: a cat /dev/urandom > /dev/sda command can destroy a disk, even though cat-based commands are usually overlooked because they are not normally dangerous. Always double-check all commands that include a redirection.
  • Use specific modules wherever possible: In this list, I've used shell commands because many people will try to use Ansible as if it's just a way to distribute them: it's not. Ansible provides a lot of modules, and we'll see them throughout this book. They will help you create more readable, portable, and safe playbooks, as the sketch after this list shows.
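
To make the last two points concrete, here is a minimal, hypothetical playbook fragment (the host group, path, and user name are illustrative, not taken from the book's examples). It uses the become_user parameter for horizontal privilege escalation and the file module instead of an error-prone rm -rf:

---
# Hypothetical example: hosts, paths, and users are placeholders.
- hosts: webservers
  become: true
  become_user: appuser    # impersonate the owner of the files, not root
  tasks:
    # Instead of: shell: rm -rf /tmp/app-cache
    - name: Ensure the application cache folder is absent
      file:
        path: /tmp/app-cache
        state: absent

    - name: Recreate the application cache folder
      file:
        path: /tmp/app-cache
        state: directory
        owner: appuser
        mode: '0755'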

Types of IT automation

There are a lot of ways to classify IT automation systems, but by far the most important is related to how the configurations are propagated. Based on this, we can distinguish between agent-based systems and agent-less systems.

Agent-based systems

Agent-based systems have two different components: a server, and a client called the agent.

There is only one server, which contains the entire configuration for the whole environment, while there are as many agents as there are machines in the environment.

Note

In some cases, more than one server could be present to ensure high availability, but treat it as if it's a single server, since they will all be configured in the same way.

Periodically, the client will contact the server to see if a new configuration for its machine is present. If a new configuration is present, the client will download it and apply it.

Agent-less systems

In agent-less systems, no specific agent is present. Agent-less systems do not always respect the server/client paradigm, since it's possible to have multiple servers and even the same number of servers and clients. Communications are initiated by the server, which will contact the client(s) using standard protocols (usually SSH and PowerShell).

Agent-based versus agent-less systems

Aside from the differences previously outlined, there are other contrasting factors that arise because of those differences.

From a security standpoint, an agent-based system can be less secure. Since all machines have to be able to initiate a connection to the server machine, this machine could be attacked more easily than in an agent-less case, where the machine is usually behind a firewall that will not accept any incoming connections.

From a performance point of view, agent-based systems run the risk of saturating the server, so the roll-out could be slower. It also needs to be considered that, in a pure agent-based system, it is not possible to force-push an update immediately to a set of machines; you have to wait until those machines check in. For this reason, multiple agent-based systems have implemented out-of-band ways to provide such features. Tools such as Chef and Puppet are agent-based, but can also run without a centralized server to scale to a large number of machines, in modes commonly called Serverless Chef and Masterless Puppet, respectively.

An agent-less system is easier to integrate into an infrastructure that is already present (brownfield), since it will be seen by the clients as a normal SSH connection, and therefore no additional configuration is needed.

 

What is Ansible?


Ansible is an agent-less IT automation tool developed in 2012 by Michael DeHaan, a former Red Hat associate. The Ansible design goals are for it to be minimal, consistent, secure, highly reliable, and easy to learn. The Ansible company was bought by Red Hat in October 2015, and now operates as part of Red Hat, Inc.

Ansible primarily runs in push mode using SSH, but you can also run Ansible using ansible-pull, where you install Ansible on each managed node, download the playbooks locally, and run them on the individual machines. If there are a large number of machines (large is a relative term, but in this case consider it to mean more than 500) and you plan to deploy updates to the machines in parallel, this might be the right way to go about it. As we discussed before, both agent-based and agent-less systems have their pros and cons.
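
As a quick, hedged illustration (the repository URL is a placeholder, and local.yml is ansible-pull's conventional default playbook name), a pull-mode run, typically scheduled via cron on each node, looks like this:

$ ansible-pull -U https://github.com/example/ansible-config.git local.yml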

In the next section, we will discuss Secure Shell (SSH), which is a core part of Ansible and the Ansible philosophy.

Secure Shell

Secure Shell (also known as SSH) is a network service that allows you to log in and access a shell remotely over a fully encrypted connection. The SSH daemon is today the standard for UNIX system administration, after having replaced the unencrypted telnet. The most frequently used implementation of the SSH protocol is OpenSSH.

In the last few years, Microsoft has implemented OpenSSH for Windows. I think this proves just how much of a de facto standard SSH has become.

Since Ansible performs SSH connections and commands in the same way any other SSH client would, no specific configuration is needed on the OpenSSH server.

To speed up default SSH connections, you can always enable ControlPersist and pipelining, which make Ansible faster and more secure.
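
As a minimal sketch (these are standard ansible.cfg options; the 60-second timeout is an arbitrary choice), the relevant configuration looks like this:

[ssh_connection]
# Keep the SSH connection open and reuse it instead of reconnecting per task
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
# Send modules over the open connection rather than copying files first
pipelining = True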

 

Why Ansible?


We will compare Ansible with Puppet and Chef during the course of this book, since many people have good experience with those tools. We will also point out specifically how Ansible would solve a problem compared to Chef or Puppet.

Ansible, like Puppet and Chef, is declarative in nature, and is expected to move a machine to the desired state specified in the configuration. For example, in each of these tools, in order to start a service at a point in time and start it automatically on restart, you would need to write a declarative block or module; every time the tool runs on the machine, it will aspire to obtain the state defined in your playbook (Ansible), cookbook (Chef), or manifest (Puppet).
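
In Ansible, for instance, that service example could be expressed with a single task like the following (the service name httpd is just an illustration):

- name: Ensure the httpd service is started and enabled at boot
  service:
    name: httpd
    state: started
    enabled: true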

The difference in the toolsets is minimal at a simple level, but as more situations arise and the complexity increases, you will start finding differences between them. In Puppet, you do not set the order in which the tasks will be executed; the Puppet server decides the sequence and the parallelization at runtime, making it easier to end up with difficult-to-debug errors. To exploit the power of Chef, your team will need to be good at the Ruby language, and the same is true for customizing Puppet, so both tools come with a bigger learning curve.

With Ansible, the case is different. It uses the simplicity of Chef when it comes to the order of execution – the top-to-bottom approach – and allows you to define the end state in YAML format, which makes the code extremely readable and easy for everyone, from development teams to operations teams, to pick up and make changes. In many cases, even without Ansible, operations teams are given playbook manuals to execute instructions from whenever they face issues. Ansible mimics that behavior. Do not be surprised if you end up having your project manager change the code in Ansible and check it into Git because of its simplicity!
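
To give a feel for that readability, here is a minimal, hypothetical playbook (the host group and package name are illustrative); tasks run from top to bottom, exactly as written:

---
- hosts: webservers
  become: true
  tasks:
    - name: Install the Apache HTTP server
      yum:
        name: httpd
        state: present

    - name: Ensure Apache is running and enabled at boot
      service:
        name: httpd
        state: started
        enabled: true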

 

Installing Ansible


Installing Ansible is rather quick and simple. You can use the source code directly, by cloning it from the GitHub project (https://github.com/ansible/ansible); install it using your system's package manager; or use Python's package management tool (pip). You can use Ansible on any Unix-like system, such as macOS and Linux (Windows is not natively supported as a control machine, although Windows hosts can be managed). Ansible doesn't require any databases, and doesn't need to have any daemons running. This makes it easier to maintain Ansible versions and upgrade without any breaks.

We'd like to call the machine where we will install Ansible our Ansible workstation. Some people also refer to it as the command center.

Installing Ansible using the system's package manager

It is possible to install Ansible using the system's package manager, and, in my opinion, this is the preferred option if your system's package manager ships at least Ansible 2.0. We will look into installing Ansible via Yum, Apt, Homebrew, and pip.

Installing via Yum

If you are running a Fedora system, you can install Ansible directly, since, from Fedora 22 onward, Ansible 2.0+ is available in the official repositories. You can install it as follows:

$ sudo dnf install ansible

For RHEL and RHEL-based (CentOS, Scientific Linux, and Unbreakable Linux) systems, versions 6 and 7 have Ansible 2.0+ available in the EPEL repository, so you should ensure that you have the EPEL repository enabled before installing Ansible as follows:

$ sudo yum install ansible

Note

On RHEL 6, you have to run the $ sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm command to install EPEL, while on RHEL 7, $ sudo yum install epel-release is enough.

Installing via Apt

Ansible is available for Ubuntu and Debian. To install Ansible on those operating systems, use the following command:

$ sudo apt-get install ansible

Installing via Homebrew

You can install Ansible on macOS using Homebrew, as follows:

$ brew update
$ brew install ansible

Installing via pip

You can install Ansible via pip. If you don't have pip installed on your system, install it first; on a Unix-like system, one way to do this is with easy_install. (pip can also install the Ansible package on Windows, but remember that a Windows control machine is not officially supported.)

$ sudo easy_install pip

You can now install Ansible using pip, as follows:

$ sudo pip install ansible

Once you're done installing Ansible, run ansible --version to verify that it has been installed:

$ ansible --version

You will get many lines as output from the preceding command line, as follows:

ansible 2.7.1
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/fale/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /bin/ansible
  python version = 2.7.15 (default, Oct 15 2018, 15:24:06) [GCC 8.1.1 20180712 (Red Hat 8.1.1-5)]

Installing Ansible from source

In case the previous methods do not fit your use case, you can install Ansible directly from source. Installing from source does not require any root permissions. Let's clone the repository and set up the environment with the provided env-setup script; optionally, you can work inside a virtualenv, an isolated environment in Python where you can install packages without interfering with the system's Python packages. The commands and the resulting output are as follows:

$ git clone git://github.com/ansible/ansible.git
Cloning into 'ansible'...
remote: Counting objects: 116403, done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 116403 (delta 3), reused 0 (delta 0), pack-reused 116384
Receiving objects: 100% (116403/116403), 40.80 MiB | 844.00 KiB/s, done.
Resolving deltas: 100% (69450/69450), done.
Checking connectivity... done.
$ cd ansible/
$ source ./hacking/env-setup
Setting up Ansible to run out of checkout...
PATH=/home/vagrant/ansible/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/vagrant/bin
PYTHONPATH=/home/vagrant/ansible/lib:
MANPATH=/home/vagrant/ansible/docs/man:
Remember, you may wish to specify your host file with -i
Done!
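
If you prefer to isolate the Python dependencies, a minimal virtualenv setup (assuming the virtualenv tool is already installed) looks like this:

$ virtualenv ansible-venv
$ source ansible-venv/bin/activate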

Ansible needs a couple of Python packages, which you can install using pip. If you don't have pip installed on your system, install it using the following command. If you don't have easy_install installed, you can install it using Python's setuptools package on Red Hat systems, or by using Brew on macOS:

$ sudo easy_install pip
<A long output follows>

Once you have installed pip, install the paramiko, PyYAML, jinja2, and httplib2 packages using the following command lines:

$ sudo pip install paramiko PyYAML jinja2 httplib2
Requirement already satisfied (use --upgrade to upgrade): paramiko in /usr/lib/python2.6/site-packages
Requirement already satisfied (use --upgrade to upgrade): PyYAML in /usr/lib64/python2.6/site-packages
Requirement already satisfied (use --upgrade to upgrade): jinja2 in /usr/lib/python2.6/site-packages
Requirement already satisfied (use --upgrade to upgrade): httplib2 in /usr/lib/python2.6/site-packages
Downloading/unpacking markupsafe (from jinja2)
  Downloading MarkupSafe-0.23.tar.gz
  Running setup.py (path:/tmp/pip_build_root/markupsafe/setup.py) egg_info for package markupsafe
Installing collected packages: markupsafe
  Running setup.py install for markupsafe
    building 'markupsafe._speedups' extension
    gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.6 -c markupsafe/_speedups.c -o build/temp.linux-x86_64-2.6/markupsafe/_speedups.o
    gcc -pthread -shared build/temp.linux-x86_64-2.6/markupsafe/_speedups.o -L/usr/lib64 -lpython2.6 -o build/lib.linux-x86_64-2.6/markupsafe/_speedups.so
Successfully installed markupsafe
Cleaning up...

Note

By default, Ansible will run against the development branch. You might want to check out the latest stable branch instead. Check what the latest stable version is with the $ git branch -a command.

Copy the latest version you want to use.

Version 2.7.1 was the latest version available at the time of writing. Check out the latest version using the following command lines:

[node ansible]$ git checkout v2.7.1
Note: checking out 'v2.7.1'.
[node ansible]$ ansible --version
ansible 2.7.1 (v2.7.1 c963ef1dfb) last updated 2018/10/25 20:12:52 (GMT +000)

You now have a working setup of Ansible ready. One of the benefits of running Ansible from source is that you can enjoy the new features immediately, without waiting for your package manager to make them available for you.

 

Creating a test environment with Vagrant


To be able to learn Ansible, we will need to make quite a few playbooks and run them.

Note

Doing this directly on your computer will be very risky. For this reason, I would suggest using virtual machines.

It's possible to create a test environment with cloud providers in a few seconds, but it is often more useful to have those machines locally. To do so, we will use Vagrant, a piece of software by HashiCorp that allows users to quickly set up virtual environments independently of the virtualization backend used on the local system. It supports many virtualization backends (known as providers in the Vagrant ecosystem), such as Hyper-V, VirtualBox, Docker, VMware, and libvirt. This allows you to use the same syntax no matter what operating system or environment you are in.

First, we will install Vagrant. On Fedora, it is enough to run the following code:

$ sudo dnf install -y vagrant

On Red Hat/CentOS/Scientific Linux/Unbreakable Linux, we will need to install libvirt first, enable it, and then install Vagrant from the HashiCorp website:

$ sudo yum install -y qemu-kvm libvirt virt-install bridge-utils libvirt-devel libxslt-devel libxml2-devel libguestfs-tools-c
$ sudo systemctl enable libvirtd
$ sudo systemctl start libvirtd
$ sudo rpm -Uvh https://releases.hashicorp.com/vagrant/2.2.1/vagrant_2.2.1_x86_64.rpm
$ vagrant plugin install vagrant-libvirt

If you use Ubuntu or Debian, you can install it using the following code:

$ sudo apt install virtualbox vagrant

For the following examples, I'll be virtualizing CentOS 7 machines. This is for multiple reasons; the main ones are as follows:

  • CentOS is free and 100% compatible with Red Hat, Scientific Linux, and Unbreakable Linux.
  • Many companies use Red Hat/CentOS/Scientific Linux/Unbreakable Linux for their servers.
  • These distributions have SELinux support built in and enabled by default, and, as we have seen earlier, SELinux can help you make your environment much more secure.

To test that everything went well, we can run the following commands:

$ sudo vagrant init centos/7 && sudo vagrant up

If everything went well, you should expect an output ending with something like this:

==> default: Configuring and enabling network interfaces...
default: SSH address: 192.168.121.60:22
default: SSH username: vagrant
default: SSH auth method: private key
==> default: Rsyncing folder: /tmp/ch01/ => /vagrant

So, you can now execute vagrant ssh, and you will find yourself in the machine we just created.
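
You can also target the new machine with Ansible directly from your workstation. The following is a hedged sketch: the IP address comes from the vagrant up output shown earlier, and the private key path assumes the libvirt provider's default layout:

$ ansible all -i '192.168.121.60,' -u vagrant --private-key=.vagrant/machines/default/libvirt/private_key -m ping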

Note

There will be a Vagrantfile in the current folder, created by the vagrant init command. In this file, you can tweak the directives that define the virtual environment.
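
Stripped of its generated comments, a minimal Vagrantfile for the preceding box looks roughly like this:

Vagrant.configure("2") do |config|
  # Use the CentOS 7 base box we initialized above
  config.vm.box = "centos/7"
end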

 

Version control systems


In this chapter, we have already encountered the expression infrastructure code to describe the Ansible code that will create and maintain your infrastructure. We use the expression infrastructure code to distinguish it from the application code, which is the code that composes your applications, websites, and so on. This distinction is needed for clarity, but in the end, both types are a bunch of text files that the software will be able to read and interpret.

For this reason, a version control system will help you a lot. Its main advantages are as follows:

  • The ability to have multiple people working simultaneously on the same project.
  • The ability to perform code-reviews in a simple way.
  • The ability to have multiple branches for multiple environments (that is, dev, test, QA, staging, and production).
  • The ability to track a change so that we know when it was introduced, and who introduced it. This makes it easier to understand why that piece of code is there, months or years later.

These advantages are provided to you by the majority of version control systems out there.

Version control systems can be divided into three major groups, based on the three different models that they can implement:

  • Local data model
  • Client-server model
  • Distributed model

The first category, the local data model, is the oldest (circa 1972) approach and is used for very specific use cases. This model requires all users to share the same filesystem. Famous examples of it are the Revision Control System (RCS) and Source Code Control System (SCCS).

The second category, the client-server model, arrived later (circa 1990) and tried to solve the limitations of the local data model, creating a server that respected the local data model and a set of clients that dealt with the server instead of with the repository itself. This additional layer allowed multiple developers to use local files and synchronize them with a centralized server. Famous examples of this approach are Apache Subversion (SVN), and Concurrent Versions System (CVS).

The third category, the distributed model, arrived at the beginning of the twenty-first century, and tried to solve the limitations of the client-server model. In fact, in the client-server model, you could work on the code offline, but you needed to be online to commit the changes. The distributed model allows you to handle everything on your local repository (like the local data model), and to merge different repositories on different machines in an easy way. In this new model, it's possible to perform all the actions of the client-server model, with the added benefit of being able to work completely offline, as well as the ability to merge changes between peers without passing through a centralized server. Examples of this model are BitKeeper (proprietary software), Git, GNU Bazaar, and Mercurial.

There are some additional advantages that will be provided by only the distributed model, such as the following:

  • The possibility of making commits, browsing history, and performing any other action even if the server is not available.
  • Easier management of multiple branches for different environments.

When it comes to infrastructure code, we have to consider that the infrastructure that retains and manages your infrastructure code is frequently described by that same infrastructure code. This is a recursive situation that can create problems: if that infrastructure breaks, you may not be able to reach the very code you need to rebuild it. A distributed version control system will prevent this problem, since every machine holds a full copy of the repository.

As for the simplicity of managing multiple branches, even if this is not a hard rule, often distributed version control systems have much better merge handling than the other kinds of version control systems.

 

Using Ansible with Git


For the reasons that we have just seen, and because of its huge popularity, I suggest always using Git for your Ansible repositories.

There are a few suggestions that I always provide to the people I talk to, so that Ansible gets the best out of Git:

  • Create environment branches: Creating environment branches, such as dev, prod, test, and stg, will allow you to easily keep track of the different environments and their respective update statuses (see the sketch after this list). I often suggest keeping the master branch for the development environment, since I find that many people are used to pushing new changes directly to master. If you use master for the production environment, people can inadvertently push changes to production when they meant to push them to development.
  • Always keep environment branches stable: One of the big advantages of having environment branches is the possibility of destroying and recreating any environment from scratch at any given moment. This is only possible if your environment branches are in a stable (not broken) state.
  • Use feature branches: Using different branches for specific long-running features (such as a refactor or some other big change) will allow you to carry on your day-to-day operations while your new feature is in the Git repository (so you won't lose track of who did what and when they did it).
  • Push often: I always suggest that people push commits as often as possible. This will make Git work as both a version control system and a backup system. I have seen laptops broken, lost, or stolen with days or weeks of un-pushed work on them far too often. Don't waste your time – push often. Also, by pushing often, you'll detect merge conflicts sooner, and conflicts are always easier to handle when they are detected early, instead of waiting for multiple changes.
  • Always deploy after you have made a change: I have seen a developer create a change in the infrastructure code, test it in the dev and test environments, push it to the production branch, and then go to lunch before deploying the changes to production. His lunch did not end well. One of his colleagues inadvertently deployed the code to production (he was trying to deploy a small change of his own that he had made in the meantime) and was not prepared to handle the other developer's deployment. The production infrastructure broke, and they lost a lot of time figuring out how such a small change (the only one the person who made the deployment was aware of) could have created such a big mess.
  • Choose multiple small changes rather than a few huge changes: Making small changes, whenever possible, will make debugging easier. Debugging an infrastructure is not very easy. There is no compiler that will point out "obvious problems" (even though Ansible performs a syntax check of your code, no other test is performed), and the tools for finding what is broken are not always as good as you would imagine. The infrastructure as code paradigm is new, and its tools are not yet as good as the ones for application code.
  • Avoid binary files as much as possible: I always suggest keeping your binaries outside your Git repository, whether it is an application code repository or an infrastructure code repository. In the application code case, I think it is important to keep your repository light (Git, like the majority of version control systems, does not perform very well with binary blobs), while, for the infrastructure code case, it is vital, because you'll be tempted to put a huge number of binary blobs in it, since very often it is easier to put a binary blob in the repository than to find a cleaner (and better) solution.
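
As a minimal sketch of the branch layout suggested above (the branch and remote names are illustrative), setting it up could look like this:

$ git checkout -b dev             # development happens here (or on master)
$ git checkout -b test dev        # promote changes from dev to test...
$ git checkout -b prod test       # ...and from test to prod, keeping prod stable
$ git checkout -b feature/big-refactor dev   # long-running feature branch
$ git push --all origin           # push often: Git doubles as a backup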
 

Summary


In this chapter, we have seen what IT automation is, its advantages and disadvantages, what kind of tools you can find, and how Ansible fits into this big picture. We have also seen how to install Ansible and how to create a Vagrant virtual machine. In the end, we analyzed the version control systems and spoke about the advantages Git brings to Ansible, if used properly.

In the next chapter, we will start looking at the infrastructure code that we mentioned in this chapter without yet explaining exactly what it is, and we will learn how to write it. We'll also see how to automate simple operations that you probably perform every single day, such as managing users, managing files, and handling file content.

About the Author

  • Fabio Alessandro Locati

    Fabio Alessandro Locati, commonly known as Fale, is a director at Otelia, a public speaker, an author, and an open source contributor. His main areas of expertise are Linux, automation, security, and cloud technologies. Fale has more than 12 years of working experience in IT, with many of them spent consulting for many companies, including dozens of Fortune 500 companies. This has allowed him to consider technologies from different points of view, and to think critically about them.

