About this book

Chef is a configuration management tool that handles the hardest part of infrastructure work: deploying servers and applications to any environment. Chef-Solo is an open source version of the chef-client that allows you to use cookbooks with nodes without requiring access to a Chef server. Managing servers is one of the most critical tasks for any server administrator, and Chef-Solo makes the process of booting and provisioning many machines at the same time much easier.

Configuration Management with Chef-Solo will take you through the workflow of managing one or more servers. It includes many sample recipes to start with. You will gradually look at the different interaction points and learn how Chef-Solo helps minimize the effort needed to build and manage different machines. You will also learn how to run servers while executing Ruby code. This hands-on guide will help you understand the importance of this configuration management tool.

Publication date:
June 2014
Publisher
Packt
Pages
116
ISBN
9781783982462

 

Chapter 1. Introduction to Chef and Chef-Solo

Chef is a configuration management system that automates the process of deploying servers to any physical, virtual, or cloud location. Each setup has a basic structure of one Chef server and several nodes managed by chef-client. A Chef infrastructure is managed by Ruby code, which allows you to test, build, and replicate your infrastructure.

This chapter will guide you through the basics of Chef and how it can help you in building an infrastructure. We will discuss Chef and Chef-Solo, and look at some common problems in building an infrastructure and how Chef can help us solve them.

We will cover the following topics in this chapter:

  • Chef explanations and concepts

  • Chef-Solo

  • Terminology for Chef

  • Different use cases

  • Concepts

 

Getting started with Chef


Chef is a complete framework for automating infrastructure operations, whether you are building servers or applications from scratch or adding new configurations to existing systems. Servers are managed by code written in Ruby, which makes it possible to test and reproduce machines.

A basic Chef infrastructure contains at least one server and one node. Each node is set up and maintained by chef-client, which is responsible for executing recipes and configuring environments to run applications. A recipe contains the abstract-level configuration of a server or an application.

Tiny code blocks in recipes contain a set of commands that execute on a system sequentially and gradually configure the whole environment. The complete process is fully automated; Chef can set up several nodes without human intervention.

For instance, if you want 100 servers with Python/Django running Nginx with uWSGI and you want to have the same installations on each node, Chef can make this happen in minutes; it also provides you with the switch to turn your nodes on and off. It can work with a revision control system and pull recent updates from the repository. You can easily revert the system to the previous state if something does not go according to your needs. With Chef, system administrators can spend less time on maintenance and more time on innovation.

Traditional infrastructure management is slow and tedious; it involves many steps to build servers and run applications. With Chef, all your configurations are in one place and you do not have to worry about the separate configurations of different servers. Chef is highly recommended when scaling your application, as you can easily split your app across different servers by using roles and nodes. You do not have to install the same application 10 times on one machine after another; just create a new node in Chef server and, in a few minutes, the server will be ready to handle the application. Also, there is no need to maintain separate documentation for servers, as the recipes' code is self-explanatory and easy for a new user to grasp.

Chef is developed by Chef Software, Inc., which recently released Version 11.0. The Chef server was completely rewritten in Version 11.0, replacing Apache CouchDB with PostgreSQL and Ruby with Erlang. The improvement is substantial: a single Chef server can now handle more than 1000 nodes (clients).

Chef is provided in the following three versions:

  • Private Chef: This is an enterprise version that supports multi-tenancy to provide a highly scalable server to handle several nodes. It should be located in the client's premises and managed behind a firewall.

  • Hosted Chef: This is a SaaS offering managed by Chef Software, Inc. It is a cloud-based service and highly available (24/7 x 365), with roles and resource-based access controls. It does not require a firewall.

  • Open source Chef: This is a community-driven version with almost identical features, and it should be managed locally and behind the firewall. The latest features initially were released for the commercial version and then gradually released in the open source version. The system administrator will be responsible for applying updates, defining roles, data migrations, and ensuring that the infrastructure scales appropriately.

Chef has been primarily divided into the following three parts:

  • Chef server: Chef server is responsible for handling nodes and providing cookbooks to clients.

  • chef-client: The chef-client actually executes the recipes and configures the system. It also resolves each application dependency. The Chef architecture is based on the Thin server, Thick client model. There is no need for continuous communication with the server, as the client retrieves the cookbooks from the server and processes recipes on the client end. The server distributes data to each node including cookbooks, templates, files, and other items. The server contains the copy of all items. This approach ensures that each node has persistent data and files.

  • Knife: Knife is a tool that provides an interface between local-repo and the server. It is used to retrieve cookbooks, policies, roles, environments, and other items.

 

Understanding Chef-Solo


Chef-Solo is an open source version of chef-client. It executes recipes from local cookbooks. It is a limited version of chef-client with far fewer features. It does not have the following features:

  • Node data storage: Node data storage is used to keep values consistent across each node in a large infrastructure.

  • Search indexes: A search index is a full list of the objects stored by the server, including roles, nodes, environments, and data bags. Searches are fully text-based, and queries can use wildcard, range, and fuzzy matching. With Knife, a search is made by using the knife search subcommand.

    For example, to search by a platform ID, use the following command:

    knife search node 'rackspace:*' -i
    

    The result for the preceding command would be as follows:

    4 items found
    ip-1B45DE89.rackspace.internal
    ip-1B45DE89.rackspace.internal
    ip-1B45DE89.rackspace.internal
    ip-1B45DE89.rackspace.internal
    

    Similarly, you can search by instance type, node, environment, nested attributes, and multiple arguments.

  • Centralized distribution of cookbooks: As Chef-Solo works individually, it does not have the ability for distribution of cookbooks. Even if you have deployed Chef server, Chef-Solo will not be able to retrieve recipes from a centralized source.

  • Centralized API for integration with other infrastructure components: There is no centralized API for Chef-Solo to retrieve other configurations from a different machine. For instance, if your application needs database connectivity, you will not be able to get the IP of the database source. There are multiple solutions to address this problem, which we will discuss in the upcoming chapters.

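  • Authentication: Chef-Solo has no authentication module; anyone who can run the chef-solo command can execute the recipes. For example, the following sudoers entry lets a non-root user run chef-solo without a password:

    # chef-solo privileges
    test ALL=(ALL) NOPASSWD: /usr/bin/chef-solo
    # test is the name of a non-root user.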
    
  • Persistent attributes: Chef-Solo has no server on which to save node attributes between runs; it just executes the recipes from a local cookbook.

Although Chef-Solo has fewer features, it provides the core use of developing cookbooks.

Moreover, Chef-Solo provides a simple way to start. You can build the system by using cookbooks and it's extremely useful for booting new machines.

Like chef-client, Chef-Solo can be used for servers, applications, or any physical machine.

 

Terminologies


We will now discuss some Chef terminology. As we have already discussed, Chef comes in two different types, namely Chef server and Chef-Solo. The following terms apply to Chef server as well as Chef-Solo.

List of terminologies

A generalized list of Chef terms is given in the following section.

Node

Any physical, virtual, or cloud machine where chef-client will run is termed as a node. There are the following four types of nodes that can be managed by chef-client:

  • Cloud node: This is a server hosted on any cloud-based service such as Rackspace, Amazon Virtual Private Cloud, OpenStack, Google Compute Engine, or Windows Azure. Different plugins are available to support different cloud types, and they can be used to create instances on these services.

  • Physical node: Physical node is a server or a virtual machine that has the capability of sending, receiving, and forwarding data through a network channel. In simple words, a network machine that runs the chef-client.

  • Virtual node: A virtual node runs as a software implementation but behaves like a proper machine, for example, VirtualBox, Docker, and so on.

  • Network node: This is a device attached to the network and capable of sending, receiving, and forwarding data, and managed by chef-client. Routers, switches, and firewalls are a perfect example of network nodes.

Workstation

A workstation is a machine from which Knife configures nodes and sends instructions to them. Knife is used to manage nodes, cookbooks and recipes, roles, and environments. A commercial Knife version can be used to search index data on the server.

For a production environment, workstation authentication is managed by an RSA or a DSA key pair. Authentication ensures that a workstation is properly registered with the server.

Moreover, the Chef-repo is maintained on the workstation and distributed to chef-clients from there. Once the distribution is done, chef-client executes the recipes and installs everything on the system.

Cookbooks

Cookbooks are collections of recipes. Each cookbook defines the policy and scenario for installing and configuring a particular machine. For instance, installing PostgreSQL needs libpq-dev and other packages. The cookbook contains all the components that need to be installed on the system.

Additional configurations can be set up using cookbooks:

  • Attributes to set on nodes

  • Definitions of resources

  • Dependency control, for example, with Berkshelf or Librarian-Chef

  • File distributions

  • Libraries that extend Chef-Solo with custom Ruby code

  • Templates

  • Custom resources and providers

  • Roles

  • Metadata of recipes

  • Versions

Cookbooks are written in Ruby code. It's good to have knowledge about Ruby, but it's not mandatory. The Ruby code used in cookbooks is very simple and self-explanatory. While using cookbooks, you do not need to maintain the documentation of the server setup.

The sole purpose of a cookbook is to give a reasonable set of resources to a chef-client for the infrastructure automation.

Recipes

Recipes are the fundamental configuration elements in cookbooks. A recipe contains a set of commands that need to be executed step by step. A recipe can also include other recipes.

Each code block contains a set of instructions. For example, take a look at the following code:

# To update
execute "update apt" do
  command "apt-get update --fix-missing"
end

# For installing some packages
%w{
    curl
    screen
    make
    python2.7-dev
    vim
    python-setuptools
    libmysqlclient-dev
    mysql-client
}.each do |pkg|
  package pkg do
    action :install
  end
end

Recipes are written in Ruby code and contain a set of resources; each code block is wrapped in a resource. In the previous example, execute and package are the resources that handle the code inside each block.

There are certain rules for writing recipes:

  • It should maintain a proper order. For instance, if you want to use MySQL, you must specify the libmysqlclient-dev package first and then install MySQL.

  • Recipes must be placed in the cookbook folder.

  • It must define everything that needs to be installed in a particular environment.

  • A recipe must be declared in run_list, or be included by another recipe, in order to be executed.

  • Any additional recipe that you specify should be contained in the same cookbook folder or you should have some dependency resolved to include the recipe (Berkshelf allows you to include the recipe from github.com).

Resources

A resource is an integral part of any recipe. It defines the actions to be taken in a recipe. It could be a service, a package, a user, a group, and so on. For example, a resource instructs chef-client to check whether a particular package needs to be installed, whether a service needs to be restarted, and which directory or file needs to be created by which user. Each resource is written in a code block, and resources execute in the same order as they appear in the recipe. Chef-Solo ensures that each action is taken as specified in the recipe. After the successful execution of resources, it returns a success code. If there is an error, it returns an error code and chef-client exits with an error.

The following is an example of a directory resource:

directory "/var/log/project" do
  owner "root"
  group "root"
  recursive true
  action :create
end

The chef-client will look up the directory resource and call load_current_resource to determine the current state of the directory. If the directory does not exist, the client will create it under /var/log; if the directory already exists, nothing will happen and the resource will be marked as completed.
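The check-then-act behavior described above can be sketched in plain Ruby. This is only an illustration of the idea, not Chef's implementation: ensure_directory is a hypothetical helper, and the target path lives in a temporary folder so that the sketch can run without root privileges.

```ruby
require "fileutils"
require "tmpdir"

# Toy version of a directory resource with a :create action.
# Returns :created when it had to change the system and :up_to_date
# when the desired state was already in place.
def ensure_directory(path)
  return :up_to_date if Dir.exist?(path)

  FileUtils.mkdir_p(path) # like `recursive true`: also creates missing parents
  :created
end

Dir.mktmpdir do |root|
  target = File.join(root, "var", "log", "project")
  puts ensure_directory(target) # first run creates the directory
  puts ensure_directory(target) # second run finds nothing to do
end
```

Running the helper twice shows why a recipe can be applied repeatedly: the second call only inspects the current state and leaves the system untouched.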

The following is an example of a Git resource:

git "/home/user/webapps/project" do
  repository "git@github.com:opscode-cookbooks/chef_handler.git"
  reference "master"
  action :sync
  enable_submodules true
  user "root"
  group "root"
end

This Git resource pulls the code from the repository along with all submodules. It checks out the master branch, and the :sync action ensures that recent changes are pulled from the remote repository into the local copy.

The resource is mainly divided into the following four parts:

  • Type

  • Name

  • Attributes

  • Actions

The coding convention of the resource is shown in the following code:

resourcetype "name" do
   attribute "value"
   action :action
end

In the preceding code, resourcetype is the type of a resource, for example, directory, file, apache_site, and so on.

As discussed earlier, each resource has its own actions, and the action keyword selects which of them to execute.

Each resource also has its own attributes and a default action. For example, the directory resource has a create and a delete action, and its default action is create. Its attributes include group, inherits, mode, owner, path, provider, recursive, and rights.
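To make the four parts of a resource (type, name, attributes, and action) more concrete, the following plain-Ruby sketch mimics the block syntax with a toy class. ToyResource and the directory helper defined here are hypothetical names used only for illustration; Chef's real resource classes are considerably more involved.

```ruby
# Toy resource: collects a type, a name, attributes, and an action.
# This is a hypothetical class for illustration, not Chef's own code.
class ToyResource
  attr_reader :type, :name, :attributes, :action_name

  def initialize(type, name, default_action)
    @type = type
    @name = name
    @attributes = {}
    @action_name = default_action
  end

  # `action :create` inside the block overrides the default action.
  def action(value)
    @action_name = value
  end

  # Any other call with an argument inside the block, such as
  # `owner "root"`, is recorded as an attribute.
  def method_missing(attr, *args)
    return super if args.empty?

    @attributes[attr] = args.first
  end

  def respond_to_missing?(_attr, _include_private = false)
    true
  end
end

# Hypothetical helper that mimics the `directory "name" do ... end` form.
def directory(name, &block)
  resource = ToyResource.new(:directory, name, :create)
  resource.instance_eval(&block) if block
  resource
end

res = directory "/var/log/project" do
  owner "root"
  group "root"
  recursive true
end

puts "#{res.type} #{res.name} #{res.attributes} action=#{res.action_name}"
```

Because the block never calls action, the toy resource keeps its default action of :create, which mirrors the default action of the directory resource described above.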

Roles

In simple words, a role is a reusable configuration for several nodes, for example, database, web, and so on. Roles define patterns and processes that need to be installed on different nodes. When a role is applied to a node, attributes defined in the role take precedence over the matching cookbook attributes.

Each node can have zero or more roles assigned to it, and the run_list of each role will then be executed on the node.

An attribute can be defined both in a node and in a role. A role should have at least the following attributes:

  • name: This attribute gives the name of the role

  • description: This contains the description of the role

  • run_list: This lists the recipes that need to be executed with this role

Roles can be declared in two ways: in Ruby or in JSON. In the case of JSON, some additional attributes, such as chef_type and json_class, need to be defined. Detailed information about roles is available in the next chapter.
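As a rough sketch of the JSON form, the following Ruby script builds a role with the attributes listed above and prints it as JSON. The role name, description, and recipes are made-up examples; the chef_type and json_class values follow the usual convention for role files.

```ruby
require "json"

# Hypothetical "web" role; the name, description, and recipes are examples.
role = {
  "name" => "web",
  "description" => "Web server role",
  "chef_type" => "role",
  "json_class" => "Chef::Role",
  "run_list" => ["recipe[nginx::default]", "recipe[project::default]"]
}

role_json = JSON.pretty_generate(role)
puts role_json
```

The printed document could be saved as a file such as roles/web.json (a hypothetical path) in the Chef repository.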

Attributes

In simple terms, attributes are variables that are defined in a cookbook to be used by recipes and templates to create configuration files. When chef-client executes the recipes, it loads all the attributes from cookbooks, recipes, and roles.

Attributes are declared in the attributes folder of a cookbook. When a recipe is executed, the attributes are evaluated in the context of the current node and applied to that node.

For example, Nginx cookbook attributes are given as follows:

default["nginx"]["dir"]          = "/etc/nginx"
default["nginx"]["listen_ports"] = [ "80","443" ]

Similarly, Git attributes are given as follows:

default["project"][:project_path] = "/home/chef/webapps/project"
default["project"][:repository] = "git@github.com:opscode-cookbooks/chef_handler.git"

We have already touched on attribute precedence in the Roles section. We will discuss attributes in more detail in the upcoming chapters.
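The idea that role attributes win over cookbook defaults can be sketched as a recursive hash merge. This is a simplified model that assumes only two sources of attributes; Chef's real precedence rules have more levels. The deep_merge helper and the role value below are hypothetical.

```ruby
# Recursively merge two attribute hashes; values from `higher` win.
def deep_merge(lower, higher)
  lower.merge(higher) do |_key, low, high|
    low.is_a?(Hash) && high.is_a?(Hash) ? deep_merge(low, high) : high
  end
end

cookbook_defaults = {
  "nginx" => { "dir" => "/etc/nginx", "listen_ports" => %w[80 443] }
}

# Hypothetical role attributes that change only the configuration directory.
role_attributes = {
  "nginx" => { "dir" => "/opt/nginx/conf" }
}

node = deep_merge(cookbook_defaults, role_attributes)
puts node["nginx"]["dir"]
puts node["nginx"]["listen_ports"].inspect
```

Only the key set by the role changes; the listen_ports value from the cookbook is carried over untouched.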

Templates

A template is a simple configuration file that has placeholders for attributes. Template files are written in Embedded Ruby (.erb) format. For example, to deploy Nginx, you need the nginx.conf.erb file.

A sample of a template file is mentioned as follows:

server {
    listen 80;
    server_name <%= node["project"]["domain"] %>;
    access_log <%= node["project"]["logs_dir"] %>/<%= node["project"]["project_name"] %>_access.log;
    error_log <%= node["project"]["logs_dir"] %>/<%= node["project"]["project_name"] %>_error.log;
    location / {
        client_max_body_size 20M;
        client_body_temp_path /tmp/;
        expires -1;
        proxy_pass        http://localhost:8000;
        proxy_set_header  X-Real-IP  $remote_addr;
        send_timeout 5m;
        keepalive_timeout 5m;
        gzip on;
    }
}

In the preceding example, the following attributes will be replaced and the configuration file will be copied to a specific directory with a real value:

node["project"]["domain"] = "mydomain.com"
node["project"]["logs_dir"] = "/var/logs/project"
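Placeholder substitution can be tried outside Chef with the ERB library that ships with Ruby. The following sketch renders a shortened version of the template against a plain hash standing in for the node object; the attribute values, including the project_name key, are assumptions made for illustration.

```ruby
require "erb"

# A plain hash stands in for Chef's node object; the values are illustrative.
node = {
  "project" => {
    "domain" => "mydomain.com",
    "logs_dir" => "/var/logs/project",
    "project_name" => "project"
  }
}

template = <<~TEMPLATE
  server {
      listen 80;
      server_name <%= node["project"]["domain"] %>;
      access_log <%= node["project"]["logs_dir"] %>/<%= node["project"]["project_name"] %>_access.log;
  }
TEMPLATE

# Each <%= ... %> tag is evaluated and replaced by its value.
rendered = ERB.new(template).result(binding)
puts rendered
```

Inside a real cookbook, the template resource performs this rendering and writes the result to the target path.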

Data bags

A data bag is a global variable defined in JSON and accessible from a server.

The following is an example:

{
    "project": {
        "dbdomain": "dbdomain.com",
        "dbuser": "database_user",
        "dbpassword": "database_password"
    },
    "run_list": [
        "recipe[git::default]",
        "recipe[nginx::default]",
        "recipe[project::default]"
    ]
}

In the preceding example, dbdomain, dbuser, and dbpassword are the values stored in the data bag.
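To see how such a JSON document breaks down, the following sketch parses a shortened copy with Ruby's json library and splits each run_list entry into a cookbook name and a recipe name. The regular expression is an assumption about the recipe[cookbook::recipe] notation used above, not Chef's own parser.

```ruby
require "json"

node_json = <<~NODE_JSON
  {
      "project": { "dbdomain": "dbdomain.com", "dbuser": "database_user" },
      "run_list": ["recipe[git::default]", "recipe[nginx::default]"]
  }
NODE_JSON

data = JSON.parse(node_json)

# Split "recipe[cookbook::recipe]" into its cookbook and recipe names.
recipes = data["run_list"].map do |entry|
  match = entry.match(/\Arecipe\[(?<cookbook>[^:\]]+)(?:::(?<recipe>[^\]]+))?\]\z/)
  raise ArgumentError, "unexpected run_list entry: #{entry}" unless match

  [match[:cookbook], match[:recipe] || "default"]
end

puts data["project"]["dbuser"]
recipes.each { |cookbook, recipe| puts "#{cookbook} -> #{recipe}" }
```

The sketch treats an entry without a recipe name, such as recipe[git], as the cookbook's default recipe.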

 

Different use cases


The best way to learn Chef is to see real-world examples. We will not install and create nodes in the current chapter, but we will explain the dependencies of different environments. More detailed information about installation of these environments will be explained in Chapter 2, Setting Up an Environment for Chef-Solo.

PHP WordPress

This section assumes that you are already familiar with WordPress. To set up a WordPress site, we need the following packages installed on any machine:

  • Apache Web server

  • MySQL Database server

  • PHP

Chef-Solo will be responsible for downloading WordPress and configuring Apache and PHP along with its dependencies. Chef Software, Inc. already has a WordPress recipe. We can use the same recipe to install a new blog.

Python/Django application

In this example, we will configure a node for a Python application built with the Django framework and a MySQL database.

To run the web server, we will use Nginx. The complete requirements will look like the following:

  • Python

  • Django framework

  • Nginx

As you can see, quite a long list of packages needs to be installed for the application to work properly. All the installation instructions are written in recipe files, and after these recipes are executed, the server will be up and running.

Chef lets us automate this work to achieve efficiency, scalability, reusability, and documentation.

 

An overview of Chef


The idea of Chef is to automate the whole process; it is rare that any single individual knows everything in such a large infrastructure. Also, Chef has a large community that participates to solve large problems.

The basic principle of Chef is that the user knows each bit of the environment, for example, which packages need to be installed, the requirements of the applications, and so on. The people best placed to describe an application's environment are the team who developed the system. While creating a system, they can write down the requirements and easily convert them into Ruby code.

Moreover, any large organization uses a number of different packages to handle web applications. For instance, you might have a Python application running on Nginx with uWSGI, caching with Memcached and Redis, and using Elasticsearch for quick search.

All the mentioned packages have a list of dependencies. Once you have the recipes, you can run them once or several times on the same system and the results will be identical. The chef-client keeps track of the changes in resources. If any resource item is updated, then it runs the resource again and marks it as successful.

Chef is designed on a Thin Server, Thick Client approach. The client does not need continuous communication to the server. Once the client receives the set of instructions (cookbooks), it will execute and report to the server. The server is responsible for the distribution of cookbooks, templates, environments, and so on, via the workstation. Also, the server will keep a copy of all configurations.

Moreover, while creating the recipes, an order should be maintained. As we have already discussed, recipes are made up of several resources. Each resource contains a set of actions, and the resources are executed sequentially, one by one, so their order matters. For example, if you try to install PostgreSQL before libpq-dev, you will get a missing dependency error and the recipe will exit with an error.
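The effect of ordering can be modeled with a small plain-Ruby sketch in which each step is a lambda and the run stops at the first failure. The converge method and the installed list are hypothetical stand-ins for a Chef run and the state of the machine; no packages are actually installed.

```ruby
# `installed` stands in for the state of the machine.
installed = []

# Each step is a name plus a lambda; a failing lambda raises an error.
steps = [
  ["libpq-dev", -> { installed << "libpq-dev" }],
  ["postgresql", lambda {
    raise "missing dependency: libpq-dev" unless installed.include?("libpq-dev")

    installed << "postgresql"
  }]
]

# Run the steps in order and abort the whole run on the first error.
def converge(steps)
  steps.each do |name, step|
    step.call
    puts "#{name}: ok"
  end
  true
rescue RuntimeError => e
  puts "run aborted: #{e.message}"
  false
end

converge(steps)         # correct order: both steps succeed
installed.clear
converge(steps.reverse) # wrong order: stops at the first step
```

In the reversed run, nothing after the failing step is attempted, which corresponds to a recipe exiting with an error.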

 

Summary


In this chapter, we have discussed Chef, chef-client, and Chef-Solo. We also discussed some core concepts of Chef and had a look at different use cases. Also, we elaborated upon certain Chef terminologies such as nodes, workstations, cookbooks, recipes, resources, roles, attributes, templates, and data bags.

In the next chapter, we will discuss the cookbooks in more detail and set up a Linux machine with Chef, and will execute some open source recipes.

About the Author

  • Naveed ur Rahman

    Naveed ur Rahman is a self-taught programmer and an avid traveler. When he is not experimenting with the latest in programming and deployment, he is out camping and watching cricket.

    His adventures in programming began at a very young age when he got introduced to GW-BASIC. Now, he has experience working for one of the biggest tech names in the Middle East.

    Having worked at the largest technology company in the Middle East, Naveed has helped teams create and deploy applications written in various languages using configuration management tools.
