Getting Started with Terraform

By Kirill Shirinkin

About this book

Terraform is a tool used to efficiently build, configure, and improve production infrastructure. It can manage existing infrastructure as well as create custom in-house solutions.

This book shows you when and how to implement infrastructure as a code practices with Terraform. It covers everything necessary to set up complete management of infrastructure with Terraform, starting with the basics of using providers and resources.

This book is a comprehensive guide that begins with very small infrastructure templates and takes you all the way to managing complex systems, all using concrete examples that evolve over the course of the book. It finishes with the complete workflow of managing a production infrastructure as code – this is achieved with the help of version control and continuous integration. At the end of this book, you will be familiar with advanced techniques such as multi-provider support and multiple remote modules.

Publication date:
January 2017
Publisher
Packt
Pages
206
ISBN
9781786465108

 

Chapter 1. Infrastructure Automation

Before starting to learn Terraform, you first need to understand certain concepts of modern infrastructure. To be able to use a new tool, one needs to understand what problem it solves. To that end, this chapter will cover the following topics:

  • Learning what Infrastructure as Code is and why it is needed

  • Understanding the benefits of a declarative approach to configuration management

  • Explaining what configuration management tools are missing

  • Laying out requirements for high-level infrastructure automation

  • Taking a quick look at the main tools for provisioning infrastructure

  • A short overview and history of Terraform

  • What you will learn in this book

 

What is Infrastructure as Code and why is it needed?


The number of servers used by almost any project is growing rapidly, mostly due to the increasing adoption of cloud technologies. As a result, traditional ways of managing IT infrastructure are becoming less and less relevant.

The manual approach works well for a farm of a dozen, perhaps even a couple dozen servers. But when we're talking about hundreds of them, doing anything by hand is definitely not going to play out well.

It's not only about servers, of course. Every cloud provider offers extra services on top, be it a virtual networking service, object storage, or a monitoring solution, which you don't need to maintain yourself. These services function like Software as a Service (SaaS). And actually, we should treat various SaaS products as part of our infrastructure as well. If you use New Relic for monitoring purposes, then it is part of your infrastructure too, with the difference that you don't need to manage servers for it yourself. But how you use it, and whether you use it correctly, is up to you.

Unsurprisingly, companies of every size, from small start-ups to huge enterprises, are adopting new techniques and tools to manage and automate their infrastructures. These techniques eventually got a name: Infrastructure as Code (IaC).

Dating from around 2009, the term Infrastructure as Code is all about approaching your IT infrastructure tasks the same way you develop software. This includes practices such as the following:

  • Heavy use of source control to store all infrastructure-related code

  • Collaboration on this code in the same fashion as applications are developed

  • Using Unit and Integration testing and even applying Test-driven development to infrastructure code

  • Introducing Continuous Integration and Continuous Delivery to test and release infrastructure code

Infrastructure as Code is a foundation of DevOps culture, because once both operations and developers approach their work in the same way, following the principles laid out previously, they already have some common ground.

Not to mention that if your infrastructure is treated like code, the border between development and operations becomes so blurry that the very existence of this separation eventually becomes quite questionable.

Of course, the introduction of Infrastructure as Code requires a new kind of tooling.

 

Declarative vs Procedural tools for Infrastructure as Code


What is infrastructure code, specifically? That depends highly on your particular infrastructure setup.

In the simplest case, it might be just a bunch of shell scripts and component-specific configuration files (Nginx configuration, cron jobs, and so on) stored in source control. Inside these shell scripts, you specify the exact steps the computer needs to take to achieve the state you need:

  1. Copy this file to that folder.

  2. Replace all occurrences of ADDRESS with mysite.com.

  3. Restart the Nginx service.

  4. Send an e-mail about successful deployment.
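The four steps above could be sketched as a shell function like the following; the paths, domain name, and mail address are hypothetical placeholders, not taken from any real setup:

```shell
#!/bin/sh
# Procedural deployment sketch of the four steps above. Every run
# executes every step in order, whether it is needed or not.
# All paths and addresses below are hypothetical placeholders.
set -e

deploy() {
  src="$1"; dest_dir="$2"; domain="$3"

  # 1. Copy this file to that folder.
  cp "$src" "$dest_dir/"

  # 2. Replace all occurrences of ADDRESS with the domain name.
  sed -i "s/ADDRESS/$domain/g" "$dest_dir/$(basename "$src")"

  # 3. Restart the Nginx service.
  systemctl restart nginx

  # 4. Send an e-mail about successful deployment.
  echo "Deployed $domain" | mail -s "Deployment finished" ops@example.com
}

# Real usage would be something like:
#   deploy ./site.conf /etc/nginx/conf.d mysite.com
```

Note that every run executes every step unconditionally, restarts and e-mails included, which is exactly the procedural property discussed next.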

This is what we call procedural programming. It's not bad. For example, the build steps of Continuous Integration tools such as Jenkins are a perfect fit for a procedural approach; after all, a sequence of commands is exactly what you need in this case.

However, you can only go so far with shell scripts when it comes to configuring servers and higher-level pieces. The more common and mature approach these days is to use tools that provide a declarative, rather than a procedural, way to define your infrastructure. With declarative definitions, you don't need to think about how to do something; you only write what should be there.

Perhaps the main benefit is that rerunning a declarative definition will never do the same job twice, whereas executing the same shell script will most likely break something on the second run. A proper configuration management tool will ensure that the server ends up in exactly the state defined in your code. This property of modern configuration and provisioning tools is called idempotency.

Let's look at an example. Let's say that you have a box in your network that hosts a package repository. For some reason, instead of using a DNS server, you want to hardcode the IP address of this box in the /etc/hosts file under the domain name repository.internal.

Note

In Unix-like systems, the /etc/hosts file contains a local text database of DNS records. The system tries to resolve a DNS name by looking at this file first, and only asks the DNS server afterwards.

It's not a complex task, given that you only need to add a new line to the /etc/hosts file. To achieve this, you could have a script like the following:

echo 192.168.0.5 repository.internal >> /etc/hosts

Running it once will do the job: the required entry will be added to the end of the /etc/hosts file. But what will happen if you execute it again? You guessed it: exactly the same line will be appended again. And even worse, what if the IP address of the repository box changes? Then, after executing your script, you will end up with two different host entries for the same domain name.

You can ensure idempotency yourself inside the script with heavy use of conditional checks. But why reinvent the wheel when there is already a tool that does exactly this job? It would be much better to simply define the end result, without composing a sequence of commands to achieve it.
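As a sketch of what such hand-rolled idempotency might look like in shell (the ensure_hosts_entry helper is my own invention, not a standard utility):

```shell
#!/bin/sh
# Hand-rolled idempotency for the hosts entry: append the line if it is
# missing, otherwise rewrite it in place, so a changed IP does not leave
# a duplicate entry behind. ensure_hosts_entry is a hypothetical helper.

ensure_hosts_entry() {
  file="$1"; ip="$2"; name="$3"
  if grep -q " $name\$" "$file" 2>/dev/null; then
    # Entry already present: update its IP in place.
    sed -i "s/^.* $name\$/$ip $name/" "$file"
  else
    # Entry missing: append it, as the original one-liner did.
    echo "$ip $name" >> "$file"
  fi
}

# Real usage would be:
#   ensure_hosts_entry /etc/hosts 192.168.0.5 repository.internal
```

Run it as many times as you like: the file keeps exactly one entry for repository.internal, and changing the IP updates the existing line. This guard logic is what configuration management tools give you for free.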

And that is exactly what configuration management tools such as Puppet and Chef do, by providing you with a special Domain Specific Language (DSL) for defining the desired state of the machine. The one certain downside is the necessity of learning a new DSL: a special small language focused on solving one particular task. It's not a complete programming language, nor does it need to be; its only job is to describe the state of your server.

Let's look at how the same task could be done with the help of a Puppet manifest:

host { 'repository.internal':
  ip => '192.168.0.5',
}

Applying this manifest multiple times will never add extra entries, and changing the IP address in the manifest will be reflected correctly in the hosts file, changing the existing entry rather than creating a new one.

Note

There is an additional benefit I should mention: on top of idempotency, you often get platform agnosticism. This means that the same definition can be used on completely different operating systems without any change. For example, by using the package resource in Puppet, you don't care whether the underlying system uses rpm or deb.

Now you should better understand why, when it comes to configuration management, tools that provide a declarative way of doing things are preferred.

Modern configuration management tools such as Chef or Puppet have completely solved the problem of setting up a single machine. There is an increasing number of high-quality libraries (be it cookbooks or modules) for configuring all kinds of software in an (almost) OS-agnostic way. But configuring what goes inside a single server is only part of the picture. The other part, located a layer above, also requires new tooling.

 

Infrastructure as Code in the Cloud


Quite often, servers are only one part of an infrastructure. With cloud platforms such as Amazon Web Services (AWS), Google Cloud Platform, and OpenStack advancing more and more, there is an increasing need to automate and streamline the way people work with the services these platforms provide. If you rely heavily on at least one cloud provider for major parts of your project, you will start meeting challenges in applying consistent usage patterns.

The approach of modern configuration management tools, while having been around for quite some time and having been adopted by many companies, has some inconveniences when it comes to managing anything but servers.

There is a strong likelihood that you would want these patterns to be written once and then applied automatically. Even more, you need to be able to reproduce every action and test its result, following the aforementioned Infrastructure as Code principles. Otherwise, working with cloud providers will either end up in so-called ClickOps, where you work with infrastructure primarily by clicking buttons in the cloud provider's web interface, or you will script all the processes by using the provider's APIs directly. And even though scripting APIs sounds like a big step towards true Infrastructure as Code, you can achieve much more using existing tools built for this exact task.

There is a certain need for a configuration tool that operates one level higher than the setup of a single server: a tool that would allow writing a blueprint defining all of the high-level pieces at once, including servers, cloud services, and even external SaaS products. A tool like this goes by different names: infrastructure orchestrator, infrastructure provisioner, infrastructure templating tool, and so on. No matter what you call it, at some point in time your infrastructure will really need it.

 

Requirements for an infrastructure provisioner


Before proceeding to the existing solutions, let's lay out a list of the most important requirements for a tool such as this, so that we are able to choose wisely.

Supports a wide variety of services

AWS alone already has dozens of entities to take care of. Other players (DigitalOcean, Google Cloud, Microsoft Azure, and so on) increase this number significantly. And if you add smaller SaaS providers to the game, you get hundreds of resources to manage.

Idempotency

Just as with single-server configuration, reapplying an infrastructure template should not do the job twice. If you have a template defining 50 different resources, from EC2 instances to S3 buckets, then you do not want to duplicate or recreate all of them every time you apply the template. You want only the missing parts to be created, the existing ones to be brought to the desired state, and the ones that have become obsolete to be destroyed.

Dependency resolution

It is important to be able not just to define 2 app servers, 1 DB server, and 2 security groups, but also to point them to each other using some lookup mechanism. Especially when creating a complete environment from scratch, you want to ensure the correct order of creation to achieve a flawless bootstrap of each component.

Note

Here and throughout the book, the term environment will mean the complete set of resources that an infrastructure consists of. It includes the network setup, all servers, and all related resources.

Robust integration with existing tools

Even though it is pretty awesome to have all your infrastructure in one beautiful template, you still need to take care of what happens on each particular server: applications need to be deployed, databases need to be configured, and so on. This is not the job of an infrastructure provisioning tool. But a tool like this should certainly integrate easily with tools such as Chef, which already solve this problem.

Platform agnosticism

Ideally, templates should be platform agnostic. This means that if I define a template for 2 app servers and 1 DB server, all talking to each other, I should be able to switch easily from AWS to local Vagrant without rewriting the template. Platform agnosticism is difficult to achieve and, at the same time, might not be needed that often. Completely changing the underlying platform is a rather rare event that happens perhaps once or twice in a product's lifetime.

Smart update management

This is a tricky one, and at the moment of writing, no tool can do it flawlessly in every case (and, honestly, it is unlikely any ever will). What happens when I change the type of three EC2 instances from m3.medium to c4.xlarge? Will my m3.medium instances be shut down and replaced one by one with new ones? Will they be instantly destroyed, leading to a few minutes of downtime? Will the tool simply ignore the updated instance type? Or will it spin up new nodes without removing the old ones, leaving me with three new and three old EC2 instances to remove manually? Solutions to this problem differ from platform to platform, which makes it harder for a tool to be platform agnostic.

Ease of extension

The last requirement is of particular importance: there must be an easy way to extend the tool to support other resources. For example, if a tool lacks support for AWS Kinesis, or for a particular feature or property of an already supported service, and there is no plan to support it officially, then there has to be a way to implement it yourself quickly.

 

Which tools exist for infrastructure provisioning?


Now that we have a problem to solve and a list of requirements for the tool that should solve it, we can go into the specifics of the different existing tools.

Scripting

Almost every cloud provider has an API, and if there is an API, you can script it. You could also go beyond a single script and develop a small, focused tool just for your company to create environments. The disadvantage: more software to develop and support in-house.

Configuration management

Most configuration management tools already have a way to create cloud resources. Chef has Chef Provisioning, which allows you to write recipes that define not entities on a single server, but multiple servers and components such as AWS security groups and networking pieces. There are also Puppet modules that wrap cloud APIs into Puppet resources. Ansible likewise has modules supporting providers such as AWS, OpenStack, and others.

While the idea of using a single tool for both levels, the complete high-level infrastructure definition and the inside-a-server configuration, is tempting, it has some drawbacks. One of them is the lack of support for many required services and the general immaturity of these solutions.

Also, the ways to use these tools for this purpose are somewhat ambiguous; there are no well-defined workflows. Let's take AWS as an example. The recommended way to set up a firewall in an AWS environment is to use Security Groups (SGs). SGs are a separate entity, available via the web interface or via the API.

What should you do if you want to create an AWS security group that allows connections from an app server to a database server? Should you put this code to a database package or an application package? AWS Security Group clearly doesn't belong to either of them.

The only meaningful solution is to create a separate package dedicated to creating the security groups, which performs searches against the nodes API to define inbound and outbound rules for these groups.

It's also unclear from where to execute this kind of code. From a workstation? From a separate AWS-resources node that has permissions to do this sort of thing? How do you secure it? How do you distribute keys? And, more importantly, how do you make this process reproducible and ready to be used in CI/CD pipelines? There is no clear answer to these questions from the configuration management tools' point of view.

The other downside is that you might not even have, or want to have, complete configuration management in your organization. Implementing these tools brings huge benefits, but a steep learning curve and a lack of in-house expertise can be significant blockers to their adoption.

CloudFormation/Heat

Both AWS and OpenStack have a built-in way to define all of their resources in one template. This often works nicely in environments that are AWS-only or OpenStack-only. But as soon as you want to add another provider to the mix, you need another tool.

Terraform

Finally, there is Terraform, the tool this book is about, and the one we will use to codify complete infrastructure or, at least, the top layer of it.

 

A short overview of Terraform


Terraform is an open source utility created by HashiCorp, the same company that created Vagrant, Packer, Consul, and other popular infrastructure tools. It was initially released in July 2014 and has since come a long way to become one of the most important tools for infrastructure provisioning and management.

This is how Terraform is described by HashiCorp:

... a tool for safely and efficiently building, combining, and launching infrastructure. From physical servers to containers to SaaS products, Terraform is able to create and compose all the components necessary to run any service or application. (https://www.hashicorp.com/blog/terraform.html)

Terraform easily meets most of the requirements listed earlier:

  • At the time of writing, it supports over 30 different providers, from huge ones such as AWS to smaller ones such as various SaaS DNS providers.

  • Terraform provides a special configuration language to declare your infrastructure in simple text templates.

  • Terraform also implements complex graph logic, which allows it to resolve dependencies intelligently and reliably.

  • When it comes to servers, Terraform has multiple ways of configuring and wiring them up with existing configuration management tools.

  • Terraform is not platform agnostic in the sense described earlier, but it allows you to use multiple providers in a single template, and there are ways to make it somewhat platform agnostic. We will talk about these ways closer to the end of the book.

  • Terraform keeps track of the current state of infrastructure it created and applies delta changes when something needs to be updated, added, or deleted. It also provides a way to import existing resources and target only specific resources.

  • Terraform is easily extendable with plugins, which are written in the Go programming language.

Over the next seven chapters, we will learn how to use Terraform and all of its features.

 

Journey ahead and how to read this book


This is a book about Terraform, and you will learn everything that there is to learn about this tool. There are two main parts of this book, split into six chapters of pure learning.

In the next three chapters, we will learn the basics. In Chapter 2, Deploying First Server, you will learn the basics of Terraform, the main entities it uses, and how to deploy our first server with it. We will also get a short introduction to AWS EC2.

In Chapter 3, Resource Dependencies and Modules, we will discover how exactly Terraform operates with its resources and how to refactor our code. In Chapter 4, Storing and Supplying Configuration, you will learn all the possible ways to configure your templates with the various APIs Terraform provides.

If you are already familiar with Terraform basics, Chapter 2, Deploying First Server, through Chapter 4, Storing and Supplying Configuration, might be a bit too boring for you. They show how to use the tool as a first-time user and don't cover many advanced topics that you will get to once you run Terraform in production. Feel free to skip the next three chapters if you have already used Terraform. For advanced topics, head over to Chapter 5, Connecting with Other Tools, Chapter 6, Scaling and Updating Infrastructure, and Chapter 7, Collaborative Infrastructure.

In Chapter 5, Connecting with Other Tools, you will learn how to connect Terraform with many different tools, from configuration management to infrastructure testing tools. We will find out how to provision and reprovision machines and how to use Terraform in tandem with literally any other tool.

In Chapter 6, Scaling and Updating Infrastructure, we will cover infrastructure updates with Terraform, from very simple cases (such as changing one property of a non-essential resource) to complex upgrade scenarios for whole clusters of machines.

Finally, in Chapter 7, Collaborative Infrastructure, you will learn how to collaborate on infrastructure work with Terraform. We will also master integration testing for Terraform environments.

Be prepared: this book is not only about Terraform. It's about Infrastructure as Code and the various topics surrounding it, such as Immutable Infrastructure. Terraform will be the main tool we study, but definitely not the only one. Configuration management tools, testing tools, half a dozen small helper utilities, and as many AWS services: get ready to learn the whole toolset required to embrace Infrastructure as Code because, as you will soon notice, Terraform is a tool that must be supported by other software.

In the final chapter, Chapter 8, Future of Terraform, we will run through multiple topics related to Terraform that did not make it into the other chapters. It also has a non-conventional piece on the future of Terraform that you may or may not want to read before proceeding to learn it.

So, without further delay, let's proceed to creating our first server with Terraform.

 

Summary


In this chapter, you learned a lot about Infrastructure as Code principles and some of the tools that allow you to apply them. There are many mature tools that take care of configuring what goes inside a single server, but there are not that many options when it comes to defining the level above a single server. We also listed requirements for a tool that would take care of configuring this higher level. Then we came to the conclusion that Terraform meets many, if not all, of these requirements. In the next chapter, we will finally get our hands dirty, install Terraform, and get to know how to use it to create a single AWS EC2 server.

About the Author

  • Kirill Shirinkin

    Kirill Shirinkin is an IT consultant who focuses on Cloud technologies and DevOps practices. He has worked in companies of different sizes and areas, from an online language learning leader to a major IT provider for the global travel industry and one of the largest management consultancies. He is also a cofounder of online mentorship platform mkdev.me, where he leads a team and teaches his students all about DevOps.
