
Getting Started with Terraform

Infrastructure Automation

Before learning Terraform, you first need to understand certain concepts of modern infrastructure. To use a new tool well, you need to understand what problem it solves. To that end, this chapter covers the following topics:

Learning what Infrastructure as Code is and why it is needed
Understanding the benefits of a declarative approach to configuration management
Explaining the missing points of configuration management tools
Laying out requirements for high-level infrastructure automation
Taking a quick look at the main tools for provisioning infrastructure
A short overview and history of Terraform
What you will learn in this book

What is Infrastructure as Code and why is it needed?

The number of servers used by almost any project is growing rapidly, mostly due to the increasing adoption of cloud technologies. As a result, traditional ways of managing IT infrastructure are becoming less and less relevant.

The manual approach works well for a farm of a dozen, perhaps even a couple of dozen servers. But when we're talking about hundreds of them, doing anything by hand is definitely not going to work out well.

It's not only about servers, of course. Every cloud provider offers extra services on top, be it a virtual networking service, object storage, or a monitoring solution, which you don't need to maintain yourself. These services function as Software as a Service (SaaS). In fact, we should treat various SaaS products as part of our infrastructure as well. If you use New Relic for monitoring purposes, then it is your infrastructure too, with the difference that you don't need to manage servers for it yourself. But how you use it and whether you use it correctly is up to you.

Unsurprisingly, companies of any size, from small start-ups to huge enterprises, are adopting new techniques and tools to manage and automate their infrastructures. These techniques eventually got a new name: Infrastructure as Code (IaC).

Dating back to around 2009, the term Infrastructure as Code is all about approaching your IT infrastructure tasks the same way you develop software. This includes practices such as the following:

Heavy use of source control to store all infrastructure-related code
Collaboration on this code in the same fashion as applications are developed
Using unit and integration testing and even applying Test-driven development to infrastructure code
Introducing continuous integration and continuous delivery to test and release infrastructure code

Infrastructure as Code is a foundation for DevOps culture because both operations and developers approach their work in the same way, and by following the principles laid out before, they already have some common ground.

This is not to say that if your infrastructure is treated as code, then the border between development and operations becomes so blurry that the very existence of this separation can eventually become quite questionable.

Of course, the introduction of Infrastructure as Code requires new kinds of tools.

Declarative versus procedural tools for Infrastructure as Code

What is infrastructure code specifically? It depends highly on your particular infrastructure setup.

In the simplest case, it might be just a bunch of shell scripts and component-specific configuration files (Nginx configuration, cron jobs, and so on) stored in source control. Inside these shell scripts, you specify the exact steps the computer needs to take to achieve the state you need:

Copy this file to that folder.
Replace all occurrences of ADDRESS with mysite.com.
Restart the Nginx service.
Send an e-mail about successful deployment.

This is what we call procedural programming. It's not bad. For example, the build steps of continuous integration tools such as Jenkins are a perfect fit for a procedural approach; after all, a sequence of commands is exactly what you need in this case.

However, you can only go so far with shell scripts when it comes to configuring servers and higher-level pieces. The more common and mature approach these days is to use tools that provide a declarative, rather than a procedural, way to define your infrastructure. With declarative definitions, you don't need to think about how to do something; you only write what should be there.

Perhaps the main benefit is that rerunning a declarative definition will never do the same job twice, whereas executing the same shell script will most likely break something on the second run. A proper configuration management tool will ensure that the server is in exactly the state defined in your code. This property of modern configuration and provisioning tools is called idempotency.

Let's look at an example. Let's say that you have a box in your network that hosts a package repository. For some reason, instead of using a DNS server, you want to hardcode the IP address of this box in the /etc/hosts file with the domain name repository.internal.

In Unix-like systems, the /etc/hosts file contains a local text database that maps host names to IP addresses. The system typically tries to resolve a name by looking at this file first, and asks a DNS server only afterwards.

Not a complex task to do, given that you only need to add a new line to the /etc/hosts file. To achieve this, you could have a script like the following:

echo 192.168.0.5 repository.internal >> /etc/hosts

Running it once will do the job: the required entry will be added to the end of the /etc/hosts file. But what will happen if you execute it again? You guessed right: exactly the same line will be appended again. And, even worse, what if the IP address of the repository box changes? Then, if you execute your script, you will end up with two different host entries for the same domain name.

You can ensure idempotency yourself inside the script with heavy use of conditional checks. But why reinvent the wheel when there is already a tool to do exactly this job? It would be so much better to just define the end result without composing a sequence of commands to achieve it.
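To make the cost of hand-rolled idempotency concrete, here is a minimal sketch of such a script. The function name ensure_host_entry and its optional third argument (the file to edit) are illustrative assumptions, not part of any existing tool, and the sketch only looks at the first host name on each line, ignoring aliases.

```shell
#!/bin/sh
# Hypothetical helper: make sure FILE maps NAME to IP exactly once.
# Usage: ensure_host_entry IP NAME [FILE]   (FILE defaults to /etc/hosts)
ensure_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    tmp=$(mktemp) || return 1
    status=0

    # Keep every line whose second field is not NAME, dropping stale entries.
    awk -v name="$name" '$2 != name' "$file" > "$tmp"
    # Add the single desired entry.
    printf '%s %s\n' "$ip" "$name" >> "$tmp"

    # Write the file only if the result differs from what is already there.
    if ! cmp -s "$tmp" "$file"; then
        cat "$tmp" > "$file" || status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Calling ensure_host_entry 192.168.0.5 repository.internal any number of times leaves a single entry, and calling it with a new address replaces the old one. That is already well over a dozen lines for one line of configuration, which is why simply declaring the end result is so attractive.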

And that is exactly what configuration management tools such as Puppet and Chef do by providing you with a special Domain Specific Language (DSL) to define the desired state of the machine. One definite downside is the necessity to learn a new DSL: a special small language focused on solving one particular task. It's not a complete programming language, nor does it need to be; in this case, its only job is to describe the state of your server.

Let's look at how the same task could be done with the help of a Puppet manifest:

host { 'repository.internal': 
  ip => '192.168.0.5', 
}

Applying this manifest multiple times will never add extra entries, and changing the IP address in the manifest will be reflected correctly in the hosts file, changing the existing entry rather than creating a new one.

There is an additional benefit I should mention: on top of idempotency, you often get platform agnosticism. What this means is that the same definition could be used for completely different operating systems without any change. For example, by using the package resource in Puppet, you don't care whether the underlying system uses rpm or deb.

Now you should better understand that, when it comes to configuration management, tools that provide the declarative way of doing things are preferred.

Modern configuration management tools such as Chef or Puppet completely solve the problem of setting up a single machine. There is an increasing number of high-quality libraries (be it cookbooks or modules) for configuring all kinds of software in an (almost) OS-agnostic way. But configuring what goes inside a single server is only part of the picture. The other part, which is located a layer above, also requires new tooling.

Infrastructure as Code in the Cloud

Quite often, servers are only one part of infrastructure. With cloud platforms such as Amazon Web Services (AWS), Google Cloud Platform, and OpenStack advancing more and more, there is an increased need for automating and streamlining the way people work with the services these platforms provide. If you rely heavily on at least one cloud provider for major parts of your project, you will start meeting challenges in applying consistent patterns of their usage.

The approach of modern configuration management tools, while having been around for quite some time and having been adopted by many companies, has some inconveniences when it comes to managing anything but servers.

There is a strong likelihood you would want these patterns to be written once and then applied automatically. Even more, you need to be able to reproduce every action and test the result of it, following the aforementioned Infrastructure as Code principles. Otherwise, you will either end up with so-called ClickOps, where you work with infrastructure primarily by clicking buttons in the web interface of a cloud provider, or script all the processes by using the APIs of this provider directly. And, even if scripting APIs sounds like a big step towards true Infrastructure as Code, you can achieve much more using existing tools for this exact task.

There is a certain need for a configuration tool that operates one level higher than the setup of a single server; a tool that would allow writing a blueprint that defines all of the high-level pieces at once: servers, cloud services, and even external SaaS products. A tool like this goes by different names: infrastructure orchestrator, infrastructure provisioner, infrastructure templating, and so on. No matter what you call it, at some point in time, your infrastructure will really need it.

Requirements for infrastructure provisioner

Before proceeding to the existing solutions, let's lay out a list of the most important requirements for a tool such as this, so we are able to choose one wisely.

Supports a wide variety of services

AWS alone already has dozens of entities to take care of. Other players (DigitalOcean, Google Cloud, Microsoft Azure, and so on) increase this number significantly. And if you want to add smaller SaaS providers to the game, you get hundreds of resources to manage.

Idempotency

The same as with a single-server configuration, reapplying an infrastructure template should not do the job twice. If you have a template defining 50 different resources, from EC2 instances to S3 buckets, then you do not want to duplicate or recreate all of them every time you apply the template. You want only missing parts to be created, existing ones to be in the desired state, and the ones which have become obsolete to be destroyed.
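The three outcomes described above (create, leave in place, destroy) amount to comparing the desired set of resources with the current one. The following shell sketch shows that comparison under simplifying assumptions: each resource is just a name on its own line, the plan function and the file layout are hypothetical, and the check that an existing resource's properties still match the template is left out.

```shell
#!/bin/sh
# Hypothetical helper: compare desired and current resource lists.
# Usage: plan DESIRED_FILE CURRENT_FILE   (one resource name per line)
plan() {
    desired=$(mktemp) || return 1
    current=$(mktemp) || return 1
    # comm needs sorted input; -u also drops accidental duplicates.
    sort -u "$1" > "$desired"
    sort -u "$2" > "$current"

    comm -23 "$desired" "$current" | sed 's/^/create  /'   # only in desired
    comm -12 "$desired" "$current" | sed 's/^/keep    /'   # in both lists
    comm -13 "$desired" "$current" | sed 's/^/destroy /'   # only in current

    rm -f "$desired" "$current"
}
```

If the desired and current lists are identical, plan prints only keep lines: there is nothing to do, however many times the template is applied.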

Dependency resolution

It is important to be able not just to define two app servers, one DB server, and two security groups, but also to point them to each other using a lookup mechanism. Especially when creating a complete environment from scratch, you want to ensure the correct order of creation to achieve a flawless bootstrap of each component.

Here, and further in the book, the term environment will mean a complete set of resources that an infrastructure consists of. It includes a network setup, all servers, and all related resources.
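Working out a safe creation order from such references is a topological sort of the dependency graph. As a small illustration, the standard tsort utility can do this for a hand-written list of dependencies. The creation_order function, the resource names, and the edge file format are assumptions made for this sketch; a real provisioning tool would derive the graph from the template itself.

```shell
#!/bin/sh
# Hypothetical helper: print resources in an order that respects dependencies.
# Usage: creation_order EDGE_FILE
# Each line of EDGE_FILE is "A B", meaning A must exist before B is created.
creation_order() {
    tsort "$1"
}

# Example graph: both servers need the security group, and the app server
# also needs the database server to be up first.
edges=$(mktemp)
cat > "$edges" <<'EOF'
network security-group
security-group db-server
security-group app-server
db-server app-server
EOF

creation_order "$edges"
rm -f "$edges"
```

In the printed order every resource comes after the ones it depends on, so network is listed first and app-server last. If the dependencies form a cycle, no valid order exists, and GNU tsort reports a loop in its input.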

Robust integration with existing tools

Even though it is pretty awesome to have all infrastructure in one beautiful template, you still need to take care of what is happening on each particular server: applications need to be deployed, databases need to be configured, and so on. This is not a job for an infrastructure provisioning tool. But, certainly, a tool like this should easily integrate with other tools, such as Chef, that already solve this problem.

Platform agnosticism

Ideally, templates should be platform agnostic. This means that if I define a template for two app servers and one DB server that all talk to each other, I should be able to easily switch from AWS to local Vagrant without rewriting the template. Platform agnosticism is difficult to achieve and, at the same time, might not be needed that often. Completely changing the underlying platform is a rather rare event that happens perhaps once or twice in a product's lifetime.

Smart update management

This is a tricky one, and at the time of writing, no tool can do it flawlessly in every case (and, honestly, it is unlikely that one ever will). What happens when I change the type of three EC2 instances from m3.medium to c4.xlarge? Will my m3.medium instances be shut down and replaced one by one with new ones? Will they be destroyed instantly, leading to a few minutes of downtime? Will the tool just ignore the updated instance type? Or will it create new nodes without touching the old ones, so that I end up with three new nodes and three old EC2 instances that I have to remove manually? Solutions to this problem differ from platform to platform, which makes it more complicated for the tool to be platform agnostic.

Ease of extension

The last requirement is of particular importance: there must be an easy way to extend this tool to support other resources. For example, if a tool lacks support for AWS Kinesis or for a particular feature or property of an already supported service, and there is no plan to support it officially, then there has to be a way to implement it yourself quickly.

Which tools exist for infrastructure provisioning?

Now that we have a problem to solve and a list of requirements for the tool that should solve it, we can go into the specifics of the different existing tools.

Scripting

Almost every cloud provider has an API, and if there is an API, you can script it. You could also go beyond a single script and develop a small, focused tool just for your company to create environments. The disadvantage is that you get more software to develop and support in-house.

Configuration management

Most configuration management tools already have a way to create cloud resources. Chef has Chef provisioning, which allows you to write recipes that define not entities on a single server but multiple servers and components, such as AWS security groups and networking parts. There are also Puppet modules that wrap cloud APIs into Puppet resources. Ansible also has modules to support providers such as AWS, OpenStack, and others.

While the idea of using a single tool for both levels (the high-level definition of the complete infrastructure and the configuration inside a server) is tempting, it has some drawbacks. One of them is the lack of support for many required services and the immaturity of these solutions in general.

Also, the ways to use these tools for this purpose are rather ambiguous. There are no well-defined workflows. Let's take AWS as an example. The recommended way to set up a firewall in an AWS environment is to use security groups (SGs). An SG is a separate entity, which is available via the web interface or the API.

What should you do if you want to create an AWS security group that allows connections from an app server to a database server? Should you put this code in a database package or in an application package? An AWS security group clearly doesn't belong to either of them.

The only meaningful solution is to create a separate package which is dedicated to creating the security groups and performs searches against the nodes API to define inbound and outbound rules for these groups.

It's also unclear from where to execute this kind of code. From a workstation? From a separate AWS-resources node that has permissions to do this sort of thing? How do you secure it? How do you distribute keys? And, more importantly, how do you make this process reproducible and ready to be used in CI/CD pipelines? There is no clear answer to these questions from the configuration management tools' point of view.

The other downside is that you might not even have, or want to have, a complete configuration management setup in your organization. Implementing one brings huge benefits, but a steep learning curve and a lack of in-house expertise can be significant blockers to its adoption.

CloudFormation/Heat

Both AWS and OpenStack have a built-in way to define all of their resources in one template. Often, it works nicely in environments that are only AWS or only OpenStack. But, as soon as you want to add another provider to the mix, you need another tool.

Terraform

Finally, there is Terraform, the tool this book is about, and the one we will use to codify a complete infrastructure, or at least the top layer of it.

A short overview of Terraform

Terraform is an open source utility, created by the HashiCorp company, the same company that created Vagrant, Packer, Consul, and other popular infrastructure tools. It was initially released in July 2014, and since then, has come a long way to become one of the most important tools for infrastructure provisioning and management.

This is how Terraform is described by HashiCorp:

... a tool for safely and efficiently building, combining, and launching infrastructure. From physical servers to containers to SaaS products, Terraform is able to create and compose all the components necessary to run any service or application. (https://www.hashicorp.com/blog/terraform.html).

Terraform easily fits most of the requirements listed here:

At the time of writing, it supports over 30 different providers, from huge ones such as AWS to smaller ones such as multiple SaaS DNS providers.
Terraform provides a special configuration language to declare your infrastructure in simple text templates.
Terraform also implements complex graph logic, which allows it to resolve dependencies intelligently and reliably.
When it comes to servers, Terraform has multiple ways of configuring and wiring them up with existing configuration management tools.
Terraform is not platform agnostic in the sense described earlier, but it allows you to use multiple providers in a single template, and there are ways to make it somewhat platform agnostic. We will talk about these ways towards the end of the book.
Terraform keeps track of the current state of the infrastructure it created and applies delta changes when something needs to be updated, added, or deleted. It also provides a way to import existing resources and target only specific resources.
Terraform is easily extendable with plugins, which should be written in the Go programming language.

Over the next seven chapters, we will learn how to use Terraform and all of its features.

Journey ahead and how to read this book

This is a book about Terraform, and you will learn everything that there is to learn about this tool. There are two main parts to this book, split into six chapters of pure learning.

In the next three chapters, we will learn the basics. In Chapter 2, Deploying First Server, you will learn the basics of Terraform, the main entities it uses, and how to deploy your first server with it. You will also get a short introduction to AWS EC2.

In Chapter 3, Resource Dependencies and Modules, we will discover how exactly Terraform operates with its resources and how to refactor our code. In Chapter 4, Storing and Supplying Configuration, you will learn all the possible ways you can configure your templates with the various APIs Terraform provides.

If you are already familiar with the Terraform basics, Chapter 2, Deploying First Server, to Chapter 4, Storing and Supplying Configuration, might be a bit boring for you. They are about how to use this tool as a first-time user, and they don't cover many advanced topics that you will get to once you run Terraform in production. Feel free to skip the next three chapters if you have already used Terraform. For advanced topics, head over to Chapter 5, Connecting with Other Tools, Chapter 6, Scaling and Updating Infrastructure, and Chapter 7, Collaborative Infrastructure.

In Chapter 5, Connecting with Other Tools, you will learn how to connect Terraform with many different tools, from configuration management to infrastructure testing tools. We will find out how to provision and reprovision machines and how to use Terraform alongside literally any other tool.

In Chapter 6, Scaling and Updating Infrastructure, we will cover infrastructure updates with Terraform, from the very simple cases (such as changing one property of a non-essential resource) to complex upgrade scenarios of whole clusters of machines.

Finally, in Chapter 7, Collaborative Infrastructure, you will learn how to collaborate on infrastructure work with Terraform. We will also master integration testing for Terraform environments.

Be prepared: this book is not only about Terraform. It's about Infrastructure as Code and various topics surrounding it, such as Immutable Infrastructure. Terraform will be the main tool we study, but definitely not the only one. Configuration management tools, testing tools, half a dozen small helper utilities, and the same number of AWS services: get ready to learn the whole toolset required to embrace Infrastructure as Code because, as you will soon notice, Terraform is a tool that must be supported by other software.

In the final chapter, Chapter 8, Future of Terraform, we will run through multiple topics related to Terraform that did not make it into the other chapters. That chapter also includes a non-conventional piece on the future of Terraform, which you may or may not want to read before proceeding to learn it.

So, without further delay, let's proceed to creating our first server with Terraform.

Key benefits

*An up-to-date and comprehensive resource on Terraform that lets you quickly and efficiently launch your infrastructure

*Learn how to implement your infrastructure as code and make secure, effective changes to your infrastructure

*Learn to build multi-cloud fault-tolerant systems and simplify the management and orchestration of even the largest scale and most complex cloud infrastructures

Description

Terraform is a tool used to efficiently build, configure, and improve production infrastructure. It can manage existing infrastructure as well as custom in-house solutions. This book shows you when and how to implement Infrastructure as Code practices with Terraform. It covers everything necessary to set up the complete management of infrastructure with Terraform, starting with the basics of using providers and resources. It is a comprehensive guide that begins with very small infrastructure templates and takes you all the way to managing complex systems, all using concrete examples that evolve over the course of the book. The book ends with the complete workflow of managing a production infrastructure as code, which is achieved with the help of version control and continuous integration. Readers will also learn how to combine multiple providers in a single template, manage different code bases with many complex modules, and set up continuous integration for infrastructure code. By the end, readers will be able to use Terraform to build, change, and combine infrastructure safely and efficiently.

Who is this book for?

This book is for developers and operators who already have some exposure to working with infrastructure but want to improve their workflow and introduce Infrastructure as Code practices. Knowledge of essential Amazon Web Services components (EC2, VPC, IAM) will help contextualize the examples provided. A basic understanding of Jenkins and shell scripts will be helpful for the chapters on the production usage of Terraform.

What you will learn

*Understand what Infrastructure as Code (IaC) means and why it matters

*Install, configure, and deploy Terraform

*Take full control of your infrastructure in the form of code

*Manage complete infrastructure, starting with a single server and scaling beyond any limits

*Discover a great set of production-ready practices to manage infrastructure

*Set up CI/CD pipelines to test and deliver Terraform stacks

*Construct templates to simplify more complex provisioning tasks



Getting Started with Terraform: Manage production infrastructure as a code, Second Edition
