You're reading from  Practical Ansible - Second Edition

Product type: Book
Published in: Sep 2023
Publisher: Packt
ISBN-13: 9781805129974
Edition: 2nd Edition
Authors (3):
James Freeman

James Freeman is an accomplished IT professional with over 25 years' experience in the technology industry. He has more than a decade of first-hand experience in solving real-world enterprise problems in production environments using Ansible, open source, and AWS. As part of this work, he frequently introduces Ansible as a new technology to businesses and CTOs for the first time. In addition, he has co-authored five books and one video training course on Ansible, facilitated bespoke Ansible workshops and training sessions, and presented at both international conferences and meetups on Ansible.

Fabio Alessandro Locati

Fabio Alessandro Locati – commonly known as Fale – is an EMEA associate principal solutions architect at Red Hat, a public speaker, an author, and an open source contributor. His primary areas of expertise are Linux, automation, security, and cloud technologies. Fale has more than 15 years of working experience in IT, with many of them spent consulting for various organizations, including dozens of Fortune 500 companies. Fale has written Learning Ansible 2.7, Learning Ansible 2, and OpenStack Cloud Security, and has been part of the review process of multiple books.

Daniel Oh

Daniel Oh is a principal technical marketing manager at Red Hat. He provides runtimes, frameworks, fast data access, and high-performance messaging in flexible, easy-to-use, cost-effective, open, and collaborative ways. He's also a CNCF ambassador and DevOps Institute ambassador who evangelizes how to design and develop cloud-native serverless microservices and deploy them to multi/hybrid cloud-native platforms based on CNCF projects. Daniel loves to share his developer experiences with DevOps folks in terms of how to evolve traditional microservices to cloud-native, event-driven, and serverless applications via technical workshops, brown bag sessions, hackathons, and hands-on labs across regions at many international conferences.

Troubleshooting and Testing Strategies

Like any other kind of code, Ansible code can contain issues and bugs. Ansible tries to make execution as safe as possible by checking the task syntax before the task is executed. This check, however, catches only a small number of error types, such as incorrect task parameters; it will not protect you from others.

It’s also important to remember that Ansible is declarative: we describe the desired state in Ansible code rather than stating a sequence of steps to obtain it. This difference makes Ansible code less prone to logical errors.

Nevertheless, a bug in a playbook could mean a potential misconfiguration on all your machines, and this should be taken very seriously. The risk is even greater when critical parts of the system are changed, such as the SSH daemon or sudo configuration, since you could lock yourself out of the system.

There are many ways to prevent or mitigate a...

Technical requirements

This chapter assumes that you have set up your control host with Ansible, as detailed in Chapter 1, Getting Started with Ansible, and are using the most recent version available – the examples in this chapter were tested with Ansible 2.9. Although we will give specific examples of hostnames in this chapter, you are free to substitute them with your hostnames and/or IP addresses. Details of how to do this will be provided at the appropriate places.

The examples in this chapter can be found in this book’s GitHub repository at https://github.com/PacktPublishing/Practical-Ansible-Second-Edition/tree/main/Chapter%2012.

Digging into playbook execution problems

There are cases where an Ansible execution will be interrupted partway through. Many things can cause this.

The network is the most frequent cause of problems I’ve found while executing Ansible playbooks. Since the machine issuing the commands and the one performing them are usually linked through the network, a problem in the network will immediately show itself as an Ansible execution problem.

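When a task is prone to transient failures such as these, you can tell Ansible to retry it by registering its result in a variable and using the until keyword, optionally together with the retries and delay keywords.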
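As a minimal sketch of this retry pattern, the following playbook polls a URL until the request succeeds. The URL, the variable name result, and the retry counts are placeholders chosen for illustration and are not taken from the book's repository:

```yaml
---
- hosts: localhost
  tasks:
  - name: Retry an HTTP request until it succeeds (placeholder URL)
    ansible.builtin.uri:
      url: https://example.com
    register: result
    # Re-run the task until its registered result reports success
    until: result is succeeded
    # Assumed values: try up to 5 more times, waiting 10 seconds between tries
    retries: 5
    delay: 10
```

If the task still fails after the last retry, Ansible marks it as failed in the usual way, so this approach only smooths over temporary problems rather than hiding persistent ones.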

Sometimes, and this is particularly true for some modules, such as ansible.builtin.shell or ansible.builtin.command, the return code is non-zero, even though the execution was successful. In those cases, you can ignore the error by adding the following line to your task:

ignore_errors: yes

For instance, if you run the /bin/false command, it will always return 1. To execute this in a playbook so that you can avoid it blocking there...
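A playbook along these lines might look like the following sketch. It is an illustration under stated assumptions rather than the book's own listing: the task names and the second task are hypothetical, and /bin/false may live at a different path on some systems:

```yaml
---
- hosts: localhost
  tasks:
  - name: Run a command that always returns 1
    ansible.builtin.command: /bin/false
    # Report the failure, but carry on with the next task
    ignore_errors: yes
  - name: Confirm that the play continued (hypothetical follow-up task)
    ansible.builtin.debug:
      msg: "The previous failure was ignored"
```

Ansible still prints the failure for the first task but marks it as ignored. When only certain return codes are acceptable, the failed_when keyword offers finer control than ignoring every error.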

Using host facts to diagnose failures

Some execution failures derive from the state of the target machine. The most common problem of this kind is the case where Ansible expects a file or variable to be present, but it’s not there.

Sometimes, it can be enough to print the machine facts to find the problem.

To do so, we need to create a simple playbook called print_facts.yaml, which contains the following content:

---
- hosts: all
  tasks:
  - name: Display all variables/facts known for a host
    ansible.builtin.debug:
      var: hostvars[inventory_hostname]

This technique will give you a lot of information about the state of the target machine during Ansible execution.
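When the full dump is too noisy, a narrower check can help. The sketch below assumes the suspected cause is a missing file; the path /etc/myapp.conf and the variable name expected_file are placeholders, not names from the book:

```yaml
---
- hosts: all
  tasks:
  - name: Check whether the expected file exists (placeholder path)
    ansible.builtin.stat:
      path: /etc/myapp.conf
    register: expected_file
  - name: Report the result alongside a gathered fact
    ansible.builtin.debug:
      msg: >-
        {{ inventory_hostname }}
        ({{ ansible_facts['distribution'] }}):
        file present = {{ expected_file.stat.exists }}
```

This relies on fact gathering being enabled, which is the default, so that ansible_facts['distribution'] is available.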

Testing with a playbook

One of the most complex things in the IT field is not creating software and systems but debugging them when they have problems. Ansible is no exception. No matter how good you are at creating Ansible playbooks, sooner or later, you’ll find yourself debugging a playbook that is not behaving as you thought it would.

The simplest way of performing basic tests is to print out the values of variables during execution. Let’s learn how to do this with Ansible:

  1. First of all, we need a playbook called debug.yaml with the following content:
    ---
    - hosts: localhost
      tasks:
      - ansible.builtin.shell: /usr/bin/uptime
        register: result
      - ansible.builtin.debug:
          var: result
  2. Run it with the following command:
    $ ansible-playbook debug.yaml

You will receive an output similar to the following:

PLAY [localhost] *******************************************...
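Printing the whole registered variable can be verbose. As a hedged variation on debug.yaml, the following sketch prints only the standard output and reserves the full dump for runs with increased verbosity; the task names and message text are illustrative:

```yaml
---
- hosts: localhost
  tasks:
  - name: Run uptime
    ansible.builtin.shell: /usr/bin/uptime
    register: result
  - name: Print only the standard output
    ansible.builtin.debug:
      msg: "uptime reported: {{ result.stdout }}"
  - name: Print the full result only when run with -vv or higher
    ansible.builtin.debug:
      var: result
      verbosity: 2
```

With this layout, a normal run stays readable, and adding -vv to the ansible-playbook command brings back the complete structure when you need it.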

Using check mode

Although you might be confident in the code you have written, it still pays to test it before running it for real in a production environment. In such cases, it is a good idea to be able to run your code, but with a safety net in place. This is what check mode is for. Follow these steps:

  1. First of all, we need to create a simple playbook to test this feature. Let’s create a playbook called check-mode.yaml that contains the following content:
    ---
    - hosts: localhost
      tasks:
      - name: Touch a file
        ansible.builtin.file:
          path: /tmp/myfile
          state: touch
  2. Now, we can run the playbook in check mode by specifying the --check option in the invocation:
    $ ansible-playbook check-mode.yaml --check

This will output everything as if it were really performing the operation, as follows:

PLAY [localhost] *******************************************...
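Individual tasks can also opt out of check mode or react to it. The following sketch extends check-mode.yaml under a few assumptions: the uptime task and the final debug task are hypothetical additions used only to show the check_mode keyword and the ansible_check_mode variable:

```yaml
---
- hosts: localhost
  tasks:
  - name: Touch a file (only simulated when --check is given)
    ansible.builtin.file:
      path: /tmp/myfile
      state: touch
  - name: Read-only command that should run even in check mode
    ansible.builtin.command: /usr/bin/uptime
    # Force real execution; safe here because the command changes nothing
    check_mode: false
    changed_when: false
  - name: Report whether this is a dry run
    ansible.builtin.debug:
      msg: "Check mode is {{ ansible_check_mode }}"
```

Use check_mode: false sparingly and only for tasks without side effects, since it bypasses the safety net that --check is meant to provide.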

Solving host connection issues

Ansible is often used to manage remote hosts or systems. To do this, Ansible will need to be able to connect to the remote host, and only after that will it be able to issue commands. Sometimes, the problem is that Ansible is unable to connect to the remote host. A typical example of this is when you try to manage a machine that hasn’t booted yet. Being able to quickly recognize these kinds of problems and fix them promptly will help you save a lot of time.

Follow these steps to get started:

  1. Let’s create a playbook called remote.yaml with the following content:
    ---
    - hosts: all
      tasks:
      - name: Touch a file
        ansible.builtin.file:
          path: /tmp/myfile
          state: touch
  2. We can try to run the remote.yaml playbook against a non-existent FQDN, as follows:
    $ ansible-playbook -i host.example.com, remote.yaml

In this case...
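For hosts that are merely slow to come up, such as a machine that is still booting, one possible approach is to wait for the connection before doing anything else. This sketch is a variation on remote.yaml; the delay and timeout values are assumptions to be tuned for your environment:

```yaml
---
- hosts: all
  # Fact gathering needs a working connection, so postpone it
  gather_facts: false
  tasks:
  - name: Wait for the host to become reachable (assumed timings)
    ansible.builtin.wait_for_connection:
      delay: 5
      timeout: 300
  - name: Gather facts once the connection works
    ansible.builtin.setup:
  - name: Touch a file
    ansible.builtin.file:
      path: /tmp/myfile
      state: touch
```

This does not help with a hostname that cannot be resolved at all; in that case, the wait simply ends with a failure once the timeout expires.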

Passing working variables via the CLI

One thing that can help during debugging, and definitely helps with code reusability, is passing variables to playbooks via the command line. Every time your application (an Ansible playbook or any other kind of application) receives input from a third party (a human, in this case), it should ensure that the value is reasonable. An example of this would be to check that the variable has been set and is not an empty string. This is a golden rule of security, but it should also be applied when the user is trusted, since the user might mistype the variable’s name. The application should identify this and protect the whole system by protecting itself. Follow these steps:

  1. The first thing we want to have is a simple playbook that prints the content of a variable. Let’s create a playbook called printvar.yaml that contains the following content:
    ---
    - hosts: localhost
      tasks:
      - ansible.builtin...
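The validation described above can be sketched with the ansible.builtin.assert module. The file name checkvar.yaml and the variable name variable are hypothetical and chosen only for illustration:

```yaml
# checkvar.yaml (hypothetical file name)
# Possible invocation:
#   ansible-playbook checkvar.yaml --extra-vars "variable=test"
---
- hosts: localhost
  tasks:
  - name: Fail early if the variable is missing or empty
    ansible.builtin.assert:
      that:
      - variable is defined
      - variable | string | length > 0
      fail_msg: "Pass a non-empty value, for example: -e variable=test"
  - name: Print the variable
    ansible.builtin.debug:
      var: variable
```

Running the playbook without --extra-vars, or with a mistyped variable name, stops at the first task with the message in fail_msg instead of continuing with an undefined value.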

Limiting the host’s execution

While testing a playbook, it might make sense to test on a restricted number of machines; for instance, just one. Let’s get started:

  1. To limit the target hosts, we first need a playbook. Create a playbook called helloworld.yaml that contains the following content:
    ---
    - hosts: all
      tasks:
      - ansible.builtin.debug:
          msg: "Hello, World!"
  2. We also need to create an inventory with at least two hosts. In my case, I created a file called inventory that contains the following content:
    [hosts]
    host1.example.com
    host2.example.com
    host3.example.com

Let’s run the playbook in the usual way with the following command:

$ ansible-playbook -i inventory helloworld.yaml

By doing this, we will receive the following output:

PLAY [all] ****************************************************************************************
TASK [Gathering Facts...
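To restrict the run to a single machine, the --limit option can be added to the same command. The sketch below repeats helloworld.yaml and shows possible invocations as comments; choosing host2.example.com as the target is an arbitrary assumption:

```yaml
# Preview which hosts would be targeted, without running any task:
#   ansible-playbook -i inventory helloworld.yaml --limit host2.example.com --list-hosts
#
# Run the playbook against that single host only:
#   ansible-playbook -i inventory helloworld.yaml --limit host2.example.com
---
- hosts: all
  tasks:
  - ansible.builtin.debug:
      msg: "Hello, World!"
```

The playbook itself still targets all; the limit is applied on top of the hosts pattern at run time, so no change to the code is needed while testing.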

Flushing the code cache

Everywhere in IT, caches are used to speed up operations, and Ansible is no exception.

Usually, caches are good, and for this reason, they are used heavily. However, they can create problems if they cache a value that should not have been cached, or if they are not flushed after the value has changed.

Flushing caches in Ansible is very straightforward: run the same ansible-playbook command as before with the addition of the --flush-cache option, as follows:

$ ansible-playbook -i inventory helloworld.yaml --flush-cache

Ansible can use multiple cache plugins to save host variables, as well as execution variables. Sometimes, those variables might be left behind and influence the following executions. When Ansible finds a variable that should be set in the step it just started, Ansible might assume that the step has already been completed, and therefore pick up that old variable as if it has just been...
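Cached facts can also be cleared from inside a play. The following is a minimal sketch, assuming that stale facts are the suspected cause; it uses the clear_facts meta action and then gathers facts again:

```yaml
---
- hosts: all
  # Skip the automatic gathering so that stale cached facts are not reused
  gather_facts: false
  tasks:
  - name: Drop any gathered or cached facts for the targeted hosts
    ansible.builtin.meta: clear_facts
  - name: Gather fresh facts
    ansible.builtin.setup:
```

This affects only the hosts targeted by the play, which can be preferable to flushing the whole cache when a single machine is misbehaving.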

Checking for bad syntax

Determining whether a file has the right syntax is fairly easy for a machine, but it can be harder for humans. This does not mean that machines can fix the code for you, but they can quickly identify whether a problem is present. To use Ansible’s built-in syntax checker, we need a playbook with a syntax error. Let’s get started:

  1. Let’s create a syntaxcheck.yaml file with the following content:
    ---
    - hosts: all
      tasks:
      - ansible.builtin.debug:
        msg: "Hello, World!"
  2. Now, we can use the --syntax-check option:
    $ ansible-playbook syntaxcheck.yaml --syntax-check

By doing this, we will receive the following output:

ERROR! 'msg' is not a valid attribute for a Task
The error appears to be in '/home/fale/ansible/Ansible2Cookbook/Ch11/syntaxcheck.yaml': line 4, column 7, but may
be elsewhere in the file depending on the exact syntax problem...
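The error is reported because msg is indented at the same level as ansible.builtin.debug, so it is parsed as an attribute of the task rather than as a parameter of the module. A corrected version of the file, shown here as a sketch, nests msg one level deeper:

```yaml
---
- hosts: all
  tasks:
  - ansible.builtin.debug:
      # msg now belongs to the debug module, not to the task
      msg: "Hello, World!"
```

Running the same --syntax-check command against this version should no longer report the error.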

Summary

In this chapter, you learned about the various options that Ansible provides so that you can look for problems in your Ansible code. More specifically, you learned how to use host facts to diagnose failures, how to include testing within a playbook, how to use check mode, how to solve host connection issues, how to pass variables from the CLI, how to limit the execution to a subset of hosts, how to flush the code cache, and how to check for bad syntax.

In the next chapter, you will learn how to get started with Ansible Automation Controller.

Questions

Answer the following questions to test your knowledge of this chapter:

  1. True or False: The ansible.builtin.debug module allows you to print the value of a variable or a fixed string during Ansible’s execution.
    1. True
    2. False
  2. Which command-line option allows you to limit the execution to specific hosts?
    1. --limit
    2. --max
    3. --restrict
    4. --force
    5. --except

Further reading

Ansible’s official documentation about error handling can be found at https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_error_handling.html.
