Argo CD in Practice

By Spiros Economakis , Liviu Costea

Early Access

This is an Early Access product. Early Access chapters haven’t received a final polish from our editors yet. Every effort has been made in the preparation of these chapters to ensure the accuracy of the information presented. However, the content in this book will evolve and be updated during the development process.


About this book

GitOps follows the practices of Infrastructure as Code (IaC), allowing developers to use their day-to-day tools and practices, such as source control and pull requests, to manage apps. With this book, you'll understand how to apply GitOps to bootstrap clusters in a repeatable manner, build CD pipelines for cloud-native apps running on Kubernetes, and minimize deployment failures.

You’ll start by installing Argo CD in a cluster, setting up user access using single sign-on, performing declarative configuration changes, and enabling observability and disaster recovery. Once you have a production-ready setup of Argo CD, you’ll explore how CD pipelines can be built using the pull method, how that increases security, and how the reconciliation process occurs when multi-cluster scenarios are involved. Next, you’ll go through the common troubleshooting scenarios, from installation to day-to-day operations and learn how performance can be improved. Later, you’ll explore the tools that can be used to parse the YAML you write for deploying apps. You can then check if it is valid for new versions of Kubernetes, verify if it has any security or compliance misconfigurations, or if it follows the best practices for cloud native apps running on Kubernetes.

By the end of this book, you’ll be able to build a real-world CD pipeline using Argo CD.

Publication date:
November 2022
Publisher
Packt
Pages
234
ISBN
9781803233321

 

1 GitOps and Kubernetes

In this chapter, we're going to see what GitOps is and why the idea makes a lot of sense in a Kubernetes cluster. We will be introduced to specific components, like the API Server and the Controller Manager, that make the cluster react to state changes. We will start with imperative APIs, move on to the declarative ones, and see that going from applying a file, then a folder, to applying a Git repo was just one small step. And when that step was taken, GitOps appeared.

The main topics will be:

  • What is GitOps
  • Kubernetes architecture
  • Imperative commands
  • Declarative commands
  • Building a naive GitOps controller
  • Infrastructure as code and GitOps
 

Technical requirements

For this chapter, you will need access to a Kubernetes cluster; a local one like minikube (https://minikube.sigs.k8s.io/docs/) or kind (https://kind.sigs.k8s.io) will do. We are going to interact with the cluster and send commands to it, so you also need kubectl installed (https://kubernetes.io/docs/tasks/tools/#kubectl).

We are going to write some code, so a code editor will be needed. I am using Visual Studio Code (https://code.visualstudio.com), and we are going to use the Go language, which needs to be installed as well: https://golang.org (the code was tested with Go 1.16.7, the current version at the time of writing). The code can be found at https://github.com/PacktPublishing/ArgoCD-in-Practice in the ch01 folder.

 

What is GitOps

The GitOps term was coined back in 2017 by people from Weaveworks, the authors of a GitOps tool called Flux. Since then, I have seen GitOps turn into a buzzword, up to being named the next important thing after DevOps. If you search for definitions and explanations, you will find a lot of them: it has been defined as operations via pull requests (https://www.weave.works/blog/gitops-operations-by-pull-request) or as taking development practices (version control, collaboration, compliance, CI/CD) and applying them to infrastructure automation (https://about.gitlab.com/topics/gitops/).

Still, I think there is one that stands out. I am referring to the one created by the GitOps Working Group (https://github.com/gitops-working-group/gitops-working-group), which is part of the Application Delivery TAG (Technical Advisory Group) from the CNCF. The App Delivery TAG specializes in building, deploying, managing, and operating cloud-native applications (https://github.com/cncf/tag-app-delivery). The working group is made up of people from different companies with the purpose of building a vendor-neutral, principle-led definition of GitOps, so these are good reasons to take a closer look at their work.

The definition is focused on the principles of GitOps, and five have been identified so far (the definition is still a draft):

  • Declarative configuration
  • Version controlled immutable storage
  • Automated delivery
  • Software agents
  • Closed loop

It starts with declarative configuration, which means we want to express our intent, an end state, not specific actions to execute. It is not an imperative style, where you say let's start three more containers; instead, you declare that you want three containers for this application, and an agent will take care of reaching that number, which might mean stopping two running containers if there are five up right now.
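To make the declarative idea concrete, here is a minimal Go sketch (not from the book's repo; the function name is mine) of how an agent could turn a declared container count into actions, including the example above of five running containers against a declared three:

```go
package main

import "fmt"

// actionsToConverge computes what an agent would do given a declared
// (desired) container count and the actual count observed right now.
// It never receives actions as input – only the intent.
func actionsToConverge(desired, actual int) string {
	switch {
	case actual < desired:
		return fmt.Sprintf("start %d container(s)", desired-actual)
	case actual > desired:
		return fmt.Sprintf("stop %d container(s)", actual-desired)
	default:
		return "nothing to do"
	}
}

func main() {
	// The example from the text: we declare 3, but 5 are running,
	// so the agent must stop 2.
	fmt.Println(actionsToConverge(3, 5)) // stop 2 container(s)
}
```

The point of the sketch is that the actions are an output of the comparison, never an input given by a human.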

Git is implied here by version-controlled and immutable storage, which is fair: while it is the most widely used source control system right now, it is not the only one, and GitOps could be implemented with other source control systems.

Automated delivery means there shouldn't be any manual actions once the changes reach the version control system. After the configuration is updated, it is up to software agents to make sure the necessary actions are taken to reach the newly declared configuration. Because we are expressing a desired state, the actions to reach it need to be calculated: they result from the difference between the actual state of the system and the desired state from version control. This is what the closed loop part means.

While GitOps originated in the Kubernetes world, this definition is trying to take that out of the picture and bring the preceding principles to the whole software world. In our case, it is still interesting to see what made GitOps possible and dive a little bit deeper into what those software agents are in Kubernetes or how the closed loop is working here.

 

Kubernetes and GitOps

It is hard not to hear about Kubernetes these days; it is probably one of the best-known open source projects at the moment. It was born around 2014, when a group of engineers from Google started building a container orchestrator based on the experience they had accumulated working on Google's own internal orchestrator, Borg. The project was open sourced in 2014 and reached version 1.0.0 in 2015, a milestone that encouraged many companies to take a closer look at it.

Another reason that led to its fast and enthusiastic adoption by the community is the governance of the CNCF (Cloud Native Computing Foundation – https://www.cncf.io). After making the project open source, Google started discussions with the Linux Foundation (https://www.linuxfoundation.org) about creating a new nonprofit organization that would lead the adoption of open source cloud-native technologies. That's how the CNCF was created, with Kubernetes as its seed project and KubeCon as its major developer conference. By CNCF governance, I am referring mostly to the fact that every project or organization inside the CNCF has a well-established structure of maintainers, rules for how they are nominated and how decisions are taken in these groups, and a guarantee that no single company can hold a simple majority. This ensures that no decision is taken without community involvement and that the community plays an important role throughout a project's lifecycle.

Architecture

Kubernetes has become so big and extensible that it is really hard to define without resorting to abstractions like a platform for building platforms. This is because it is just a starting point: you get many pieces, but you have to put them together in a way that works for you (and GitOps is one of those pieces). Calling it a container orchestration platform is not entirely accurate either, because it can also run VMs, not just containers (for more details, please check https://ubuntu.com/blog/what-is-kata-containers); still, the orchestration part holds true.

Its components are split into two main parts. The first is the control plane, made up of a REST API server with a database for storage (usually etcd), a Controller Manager that runs multiple control loops, a Scheduler that assigns nodes to our pods (a pod is a logical grouping of containers that run on the same node – more at https://kubernetes.io/docs/concepts/workloads/pods/), and a Cloud Controller Manager that handles any cloud-specific work. The second part is the data plane: while the control plane is about managing the cluster, this one is about what happens on the nodes, running the user workloads. A node that is part of a Kubernetes cluster runs a container runtime (Docker, CRI-O, or containerd, among others), kubelet, which handles the connection between the REST API Server and the node's container runtime, and kube-proxy, responsible for abstracting the network at the node level. The following figure shows how all the components work together and the central role played by the API Server.

We are not going to go into the details of all these components. For us, the important ones are the REST API Server, which makes the declarative part possible, and the Controller Manager, which makes the system converge to the desired state, so we want to dissect them a little.

Figure 1.1 – Kubernetes architecture

Note

When looking at an architecture diagram, you need to know that it is only able to catch a part of the whole picture. For example, here it seems that the Cloud Provider with its API is an external system, but actually, all the nodes and the control plane are created in that Cloud Provider.

HTTP REST API server

Viewed from the perspective of its HTTP REST API server, Kubernetes looks like any classic application with REST endpoints and a database for storing state – in our case, usually etcd – and with multiple replicas of the web server for high availability. What is important to emphasize is that anything we want to do in Kubernetes, we need to do via the API; we can't connect directly to any other component. This is also true for the internal components: they can't talk directly to each other, they have to go through the API.

From our client machines, we don't query the API directly (with curl, for example); instead, we use the kubectl client application, which hides some of the complexity, such as authentication headers, preparing the request content, parsing the response body, and so on.

Whenever we run a command like kubectl get pods, an HTTPS call is made to the API server. The server fetches the pod details from the database, builds a response, and pushes it back to the client. The kubectl client application receives it, parses it, and displays a nice output suited to a human reader. To see what exactly happens, we can use kubectl's global verbosity flag (--v): the higher the value we set, the more details we get.

As an exercise, try kubectl get pods --v=6, which just shows that a GET request is performed, then keep increasing --v to 7, 8, 9, and more, and you will see the HTTP request headers, the response headers, part or all of the JSON response, and many other details.

The API Server itself is not responsible for actually changing the state of the cluster; it only updates the database with the new values, and other things happen based on those updates. The actual state changes are made by controllers and components such as the scheduler or kubelet. We are going to drill down into controllers, as they are important for our understanding of GitOps.

Controller Manager

When reading about Kubernetes (or maybe listening to a podcast), you will hear the word controller quite often. The idea behind it comes from industrial automation and robotics, and it is about the converging control loop.

Let's say we have a robotic arm and we give it a simple command to move to a 90-degree position. The first thing it will do is analyze its current state; maybe it is already at 90 degrees and there is nothing to do. If it isn't in the right position, the next step is to calculate the actions to take in order to get there. Finally, it will try to apply those actions to reach its target position.

We start with the observe phase, where we compare the desired state with the current state; then comes the diff phase, where we calculate the actions to apply; and in the action phase, we perform those actions. After the actions are performed, the observe phase starts again: if the arm is still not in the right position (maybe something blocked it from getting there), actions are calculated and applied once more, and so on, until it reaches the position – or runs out of battery. This control loop goes on and on until, in the observe phase, the current state matches the desired state, so there are no actions left to calculate and apply:

Figure 1.2 – Control loop
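The robotic arm's converging loop can be sketched in a few lines of Go. This is a toy model I wrote for illustration (not from the book's repo); the arm moves one degree per iteration, and the loop carries a safety bound standing in for the battery running out:

```go
package main

import "fmt"

// controlLoop keeps observing, diffing, and acting until the current
// position matches the desired one (or the safety bound is hit).
func controlLoop(current, desired int) int {
	for steps := 0; steps < 1000; steps++ { // safety bound, like a battery
		// Observe phase: compare desired state with current state.
		diff := desired - current
		if diff == 0 {
			break // converged: nothing to calculate or apply
		}
		// Diff + action phase: move one degree toward the target.
		if diff > 0 {
			current++
		} else {
			current--
		}
	}
	return current
}

func main() {
	fmt.Println(controlLoop(0, 90))   // converges to 90
	fmt.Println(controlLoop(120, 90)) // also converges to 90, from above
}
```

Note that the loop works no matter which side of the desired state we start from – the actions are recalculated on every iteration, which is exactly the property that makes the loop converging.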

There are many controllers in Kubernetes; the ReplicaSet is one example. It is responsible for running a fixed number of pods: you create it via kubectl and ask for three instances, which is the desired state. It starts by checking the current state (how many pods are running right now), calculates the actions to take (how many pods to start or terminate in order to have three instances), and then performs those actions. There is also the Horizontal Pod Autoscaler (HPA), which, based on some metrics, is able to increase or decrease the number of pods of a Deployment (a Deployment is a construct built on top of Pods and ReplicaSets that allows us to define how updates to the Pods are made – https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). The Deployment relies on a ReplicaSet it builds internally to update the number of pods; after the number is modified, it is still the ReplicaSet that runs the control loop to reach the desired number of pods.

A controller's job is to make sure the actual state matches the desired state, and controllers never stop trying to reach that final state. Moreover, each one is specialized in a type of resource, taking care of a small piece of the cluster.

In the preceding examples, we talked about internal Kubernetes controllers, but we can also write our own – and that's what Argo CD really is: a controller whose control loop makes sure that the state declared in a Git repo matches the state in the cluster. Well, to be precise, it is not a controller but an operator, the difference being that controllers work with internal Kubernetes objects, while operators deal with two domains: Kubernetes and something else. In our case, the Git repo is the outside part handled by the operator, and it does that using something called custom resources, a way to extend Kubernetes functionality – https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/.

So far, we have looked at the Kubernetes architecture, with the API Server connecting all the components, and at how controllers always work within control loops to get the cluster to the desired state. Next, we will get into the details of how we can define the desired state: we will start with the imperative way, continue with the more important declarative way, and show how all of this gets us one step closer to GitOps.

 

Imperative and declarative APIs

We discussed a little the differences between the imperative style, where you clearly specify the actions to take (such as start three more pods), and the declarative one, where you specify your intent (such as there should be three pods running for this deployment) and the actions need to be calculated (you might increase or decrease the pod count, or do nothing if three are already running). Both the imperative and declarative ways are implemented in the kubectl client.

Imperative – direct commands

Whenever we create, update, or delete a Kubernetes object, we can do it in an imperative style.

To create the namespace run:

kubectl create namespace test-imperative

Then, in order to see the created namespace use:

kubectl get namespace test-imperative

Create a deployment inside that namespace:

kubectl create deployment nginx-imperative --image=nginx -n test-imperative 

Then, you can use the following command to see the created deployment:

kubectl get deployment -n test-imperative nginx-imperative

For updating any of the resources we created, we can use specific commands, like kubectl label to modify the resource labels, kubectl scale to modify the number of pods in a Deployment, ReplicaSet, or StatefulSet, or kubectl set for changes like environment variables (kubectl set env), container images (kubectl set image), resources for a container (kubectl set resources) and a few more.

If we want to add a label to the namespace, we run the command:

kubectl label namespace test-imperative namespace=imperative-apps

In the end, you can remove the objects created previously with the following commands:

kubectl delete deployment -n test-imperative nginx-imperative
kubectl delete namespace test-imperative

Imperative commands are clear about what they do, and they make sense for small objects like namespaces. But for more complex ones, like deployments, we can end up passing a lot of flags: the container image, image tag, pull policy, whether a secret is needed to pull the image (for private image registries), the same again for init containers, and many other options. Next, let's see whether there are better ways to handle such a multitude of flags.

Imperative – with config files

Imperative commands can also make use of configuration files, which makes things easier because it significantly reduces the number of flags we would otherwise need to pass. We can use a file to describe what we want created.

This is what a namespace configuration file looks like – the simplest version possible (without any labels or annotations). The following files can also be found at https://github.com/PacktPublishing/ArgoCD-in-Practice/tree/main/ch01/imperative-config.

Copy the following content into a file called namespace.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: imperative-config-test

Then run:

kubectl create -f namespace.yaml

Copy the following content and save it in a file called deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: imperative-config-test
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx

Then run:

kubectl create -f deployment.yaml

By running the preceding commands, we create one namespace and one deployment, similar to what we did with the direct imperative commands. You can see this is easier than passing all the flags to kubectl create deployment. What's more, not all fields are available as flags, so using a configuration file can become mandatory in many cases.

We can also modify objects via the config file. Here is an example of how to add labels to the namespace. Update the namespace file we used before with the following content (notice the extra two rows starting with labels; the updated namespace can be seen in the official repo at https://github.com/PacktPublishing/ArgoCD-in-Practice/tree/main/ch01/imperative-config in the namespace-with-labels.yaml file):

apiVersion: v1
kind: Namespace
metadata:
  name: imperative-config-test
  labels:
    name: imperative-config-test

And then we can run:

kubectl replace -f namespace.yaml

And then to see if the label was added, run:

kubectl get namespace imperative-config-test -o yaml 

This is a good improvement over passing all the flags on the command line, and it makes it possible to store those files in version control for future reference. Still, you need to specify your intention: if the resource is new, you use kubectl create, while if it already exists, you use kubectl replace. There are also some limitations: the kubectl replace command performs a full object update, so if someone modified something else in between (like adding an annotation to the namespace), those changes will be lost.
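To see why a full replace can drop concurrent changes, here is a small Go sketch of my own (not from the book's repo) that models object metadata as a flat map: a replace swaps the stored object for ours entirely, while a patch-style update only touches the fields we send:

```go
package main

import "fmt"

// object is a toy stand-in for Kubernetes object metadata.
type object map[string]string

// replaceObject behaves like kubectl replace: the stored object is
// discarded wholesale, so anything added in between is lost.
func replaceObject(stored, ours object) object {
	return ours
}

// patchObject only overlays the fields we send, keeping the rest of
// what is already stored.
func patchObject(stored, ours object) object {
	out := object{}
	for k, v := range stored {
		out[k] = v
	}
	for k, v := range ours {
		out[k] = v
	}
	return out
}

func main() {
	// Someone else added an annotation after our last read.
	stored := object{"name": "test", "annotation": "added-by-someone-else"}
	ours := object{"name": "test", "label": "imperative-apps"}

	fmt.Printf("after replace: %q\n", replaceObject(stored, ours)["annotation"]) // gone
	fmt.Printf("after patch:   %q\n", patchObject(stored, ours)["annotation"])   // kept
}
```

This is exactly the gap that kubectl apply (covered next) closes: it computes a targeted update instead of replacing the whole object.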

Declarative – with config files

We just saw how easy it is to use a config file to create something, so it would be great if we could simply modify the file and call some update/sync command on it. We could modify the labels inside the file instead of using kubectl label, and the same for other changes, like scaling the pods of a deployment, setting container resources, container images, and so on. And there is such a command: you can pass it any file, new or modified, and it will figure out the right adjustments to make via the API server – kubectl apply.

Please create a new folder called declarative-files and place a namespace.yaml file in it with the following content (the files can also be found at https://github.com/PacktPublishing/ArgoCD-in-Practice/tree/main/ch01/declarative-files):

apiVersion: v1
kind: Namespace
metadata:
  name: declarative-files

Then run:

kubectl apply -f declarative-files/namespace.yaml

In which case, the console output should be:

namespace/declarative-files created

Next, we can modify the namespace.yaml file and add a label to it directly in the file:

apiVersion: v1
kind: Namespace
metadata:
  name: declarative-files
  labels:
    namespace: declarative-files

Then again run:

kubectl apply -f declarative-files/namespace.yaml

In which case, the console output should be:

namespace/declarative-files configured

What happened in both cases above? Before running any command, our client (or our server – there is a note a few pages ahead explaining when client-side or server-side apply is used) compared the existing state from the cluster with the desired state from the file and calculated the actions needed to reach the desired state. In the first apply, it realized that the namespace didn't exist and needed to be created, while in the second one it found that the namespace existed but was missing the label, so it added it.
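The decision apply makes can be condensed into a tiny Go sketch. This is a deliberately simplified model of mine (real apply does a three-way merge over full objects, not just labels), mirroring the two console outputs we just saw:

```go
package main

import "fmt"

// decideAction mimics, at a very high level, what apply computes:
// create the object if it is missing, update it if the desired
// labels differ from the current ones, otherwise do nothing.
func decideAction(exists bool, currentLabels, desiredLabels map[string]string) string {
	if !exists {
		return "created"
	}
	for k, v := range desiredLabels {
		if currentLabels[k] != v {
			return "configured"
		}
	}
	return "unchanged"
}

func main() {
	// First apply: the namespace does not exist yet.
	fmt.Println(decideAction(false, nil, nil)) // created

	// Second apply: it exists, but the label from the file is new.
	fmt.Println(decideAction(true,
		map[string]string{},
		map[string]string{"namespace": "declarative-files"})) // configured
}
```

A third run with identical labels would report unchanged, which is why re-applying the same file is always safe.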

Next, let's add the deployment in its own file called deployment.yaml in the same declarative-files folder:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: declarative-files
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx

And we will run the following command that will create the deployment in the namespace:

kubectl apply -f declarative-files/deployment.yaml

If you want, you can make changes to the deployment.yaml file (labels, container resources, images, environment variables, and so on) and then run the preceding kubectl apply command again, and the changes you made will be applied to the cluster.

Declarative - with config folder

In this section, we will create a new folder called declarative-folder and two files inside of it.

namespace.yaml (the code can also be found here https://github.com/PacktPublishing/ArgoCD-in-Practice/tree/main/ch01/declarative-folder):

apiVersion: v1
kind: Namespace
metadata:
  name: declarative-folder

deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: declarative-folder
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx

And we will run:

kubectl apply -f declarative-folder 

Most likely you will see an error, which is expected, don't worry:

namespace/declarative-folder created
Error from server (NotFound): error when creating "declarative-folder/deployment.yaml": namespaces "declarative-folder" not found

That is because the two resources are created at the same time, but the deployment depends on the namespace, so the namespace needs to be ready when the deployment is created. The message says the namespace was created, but the API calls were made at the same time, so on the server the namespace was not yet available when the deployment started its creation flow. We can fix this by running the same command again:

kubectl apply -f declarative-folder

and in the console we should see:

deployment.apps/nginx created
namespace/declarative-folder unchanged

Because the namespace already existed, this time the deployment could be created inside it, while no change was made to the namespace.

The kubectl apply command took the whole content of the declarative-folder folder, made the calculations for each resource found in those files, and then called the API Server with the changes. We can apply entire folders, not just files, though it can get trickier when resources depend on each other. We can modify those files and call the apply command on the folder again, and the changes will be applied. Now, if this is how we build applications in our clusters, then we had better save all those files in source control for future reference, so it becomes easier to apply changes over time.

But what if we could apply a Git repo directly, not just folders and files? After all, a Git repo, locally, is just a folder. And in the end, that's what a GitOps operator is: a kubectl apply that knows how to work with Git repositories.

NOTE:

The apply command was initially implemented completely on the client side. This means the logic for finding the changes ran on the client, and then specific imperative APIs were called on the server. More recently, however, the apply logic has moved to the server side: all objects have an apply method (from a REST API perspective, it is a PATCH with the content type application/apply-patch+yaml), and it is enabled by default starting with version 1.16 (more on the subject here: https://kubernetes.io/docs/reference/using-api/server-side-apply/).
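As a sketch of what such a server-side apply call looks like on the wire, the Go snippet below builds (but does not send) the PATCH request for a namespace manifest. The API server address and the fieldManager name are hypothetical, and a real call would also need TLS configuration and an authentication token:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// buildApplyRequest prepares a server-side apply call for a single
// manifest. Only the request is constructed here; nothing is sent.
func buildApplyRequest(url string, manifest []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPatch, url, bytes.NewReader(manifest))
	if err != nil {
		return nil, err
	}
	// This content type is what marks the PATCH as a server-side
	// apply rather than a regular patch.
	req.Header.Set("Content-Type", "application/apply-patch+yaml")
	return req, nil
}

func main() {
	manifest := []byte("apiVersion: v1\nkind: Namespace\nmetadata:\n  name: declarative-files\n")
	// Hypothetical local API server endpoint for the namespace object.
	url := "https://127.0.0.1:6443/api/v1/namespaces/declarative-files?fieldManager=my-operator"

	req, err := buildApplyRequest(url, manifest)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.Header.Get("Content-Type"))
}
```

The fieldManager query parameter identifies who owns the applied fields, which is how the server can later detect conflicts between different appliers.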

 

Simple GitOps operator

Now that we have seen how the control loop works, experimented with declarative commands, and know how to work with basic Git commands, we have enough information to build a basic GitOps operator. We need three things:

  • Initially clone a Git repo, then keep pulling from it to stay in sync with the remote
  • Take what we find in the Git repo and try to apply it
  • Do all of this in a loop, so that changes to the Git repo keep getting applied

The code is written in Go, a language created at Google, and many Ops tools are built with it, including Docker, Terraform, Kubernetes, and Argo CD.

NOTE:

For real-life controllers and operators, there are frameworks that should be used, such as the Operator Framework (https://operatorframework.io), Kubebuilder (https://book.kubebuilder.io), or the sample-controller (https://github.com/kubernetes/sample-controller).

All the code for our implementation can be found at https://github.com/PacktPublishing/ArgoCD-in-Practice/tree/main/ch01/basic-gitops-operator while the YAML manifests we will be applying are at https://github.com/PacktPublishing/ArgoCD-in-Practice/tree/main/ch01/basic-gitops-operator-config.

The syncRepo function receives the repo URL to clone and keep in sync, along with the local path where to do it. It then tries to clone the repo using the git.PlainClone function from the go-git library (https://github.com/go-git/go-git). If it fails with the error git.ErrRepositoryAlreadyExists, it means we have already cloned the repo and need to pull from the remote to get the latest updates. And that's what we do next: we open the Git repo locally, load the worktree, and call the Pull method. This method can return an error when everything is up to date and there is nothing to download from the remote, a case that is normal for us (this is the condition: if err == git.NoErrAlreadyUpToDate).

func syncRepo(repoUrl, localPath string) error {

   _, err := git.PlainClone(localPath, false, &git.CloneOptions{
       URL:      repoUrl,
       Progress: os.Stdout,
   })
 
   if err == git.ErrRepositoryAlreadyExists {
       repo, err := git.PlainOpen(localPath)
       if err != nil {
           return err
       }
       w, err := repo.Worktree()
       if err != nil {
           return err
       }
       err = w.Pull(&git.PullOptions{
           RemoteName: "origin",
           Progress:   os.Stdout,
       })
       if err == git.NoErrAlreadyUpToDate {
           return nil
       }
       return err
   }
   return err
}

Next, inside the applyManifestsClient function, we apply the content of a folder from the repo we downloaded. Here we create a simple wrapper over the kubectl apply command, passing as a parameter the folder with the YAML manifests from the repo we cloned. Instead of shelling out to kubectl apply, we could call the Kubernetes API directly with the PATCH method (with the content type application/apply-patch+yaml) – that is, invoke apply on the server side. But that complicates the code, as each file in the folder would need to be read and transformed into its corresponding Kubernetes object in order to be passed as a parameter to the API call. The kubectl apply command already does all this, so shelling out is the simplest possible implementation.

func applyManifestsClient(localPath string) error {

   dir, err := os.Getwd()
   if err != nil {
       return err
   }
   cmd := exec.Command("kubectl", "apply", "-f", path.Join(dir, localPath))
   cmd.Stdout = os.Stdout
   cmd.Stderr = os.Stderr
   err = cmd.Run()
   return err
}

Finally, the main function is where we call these functionalities – sync the Git repo and apply the manifests to the cluster – in a loop, at a 5-second interval (I went with a small interval for demo purposes; in real scenarios, Argo CD, for example, does this synchronization every 3 minutes). We define the variables we need, including the Git repo we want to clone, so if you fork it, please update the gitopsRepo value. Next, we call the syncRepo function, check for errors, and if all is good, continue by calling applyManifestsClient. The last rows show how a timer is implemented in Go, using a channel.

NOTE: Complete code file

For a better overview, we also add the package and import declaration; this is the complete implementation that you can copy into the main.go file.

package main
import (
   "fmt"
   "os"
   "os/exec"
   "path"
   "time"
   "github.com/go-git/go-git/v5"
)
func main() {
   timerSec := 5 * time.Second
   gitopsRepo := "https://github.com/PacktPublishing/ArgoCD-in-Practice.git"
   localPath := "tmp/"
   pathToApply := "ch01/basic-gitops-operator-config"
   for {
       fmt.Println("start repo sync")
       err := syncRepo(gitopsRepo, localPath)
       if err != nil {
           fmt.Printf("repo sync error: %s", err)
           return
       }
       fmt.Println("start manifests apply")
       err = applyManifestsClient(path.Join(localPath, pathToApply))
       if err != nil {
           fmt.Printf("manifests apply error: %s", err)
       }
       syncTimer := time.NewTimer(timerSec)
       fmt.Printf("\n next sync in %s \n", timerSec)
       <-syncTimer.C
   }
}

To make the preceding code work, go to a new folder and run (replacing <your-username>):

go mod init github.com/<your-username>/basic-gitops-operator

This creates the go.mod file where the Go modules we need will be recorded. Then create a file called main.go and copy the preceding pieces of code into it: the three functions syncRepo, applyManifestsClient, and main (also add the package and import declarations that come with the main function). Then run:

go get .

This will download all the modules (don't miss the last dot).
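For reference, after these commands the generated go.mod should look roughly like this; the module path matches what you passed to go mod init, and the go-git version (plus any indirect requirements go adds) may differ:

```
module github.com/<your-username>/basic-gitops-operator

go 1.17

require github.com/go-git/go-git/v5 v5.4.2
```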

And the last step is to actually execute everything we put together with the command:

go run main.go

Once the application starts running, you will notice a folder tmp created and inside it, you will find the manifests to be applied to the cluster. And the console output should be something like this:

start repo sync
Enumerating objects: 36, done.
Counting objects: 100% (36/36), done.
Compressing objects: 100% (24/24), done.
Total 36 (delta 8), reused 34 (delta 6), pack-reused 0
start manifests apply
namespace/nginx created
Error from server (NotFound): error when creating "<>/argocd-in-practice/ch01/basic-gitops-operator/tmp/ch01/basic-gitops-operator-config/deployment.yaml": namespaces "nginx" not found
manifests apply error: exit status 1
next sync in 5s 
start repo sync
start manifests apply
deployment.apps/nginx created
namespace/nginx unchanged

You can see the same error we got earlier when applying an entire folder; it happens here too, but on the operator's second run the deployment is created successfully. If you look in your cluster, you should find a namespace called nginx and, inside it, a deployment also called nginx. Feel free to fork the repo and make changes both to the operator and to the config it applies.

NOTE: Apply namespace first

Argo CD solved this namespace creation problem by identifying Namespace manifests and applying them first.

We created a simple GitOps operator, showing the steps of cloning a Git repo and keeping it in sync with the remote, then taking the contents of the repo and applying them. If there is no change to the manifests, the kubectl apply command has nothing to modify in the cluster. And we did that in a loop that closely imitates the control loop we introduced earlier in the chapter. In principle, this is also what happens in the Argo CD implementation, but at much higher scale and performance, and with a lot of features added.

 

Infrastructure as Code and GitOps

You can find a lot of articles and blog posts comparing IaC (Infrastructure as Code) and GitOps, covering their differences and, usually, how GitOps builds upon IaC principles. I would say the two have a lot in common: they are very similar practices that use source control for storing the state. When you say Infrastructure as Code these days, you are referring to practices where the infrastructure is created through automation, not manually, and where the infrastructure is saved as code in source control, just like application code.

With Infrastructure as Code, you expect that changes are applied using pipelines, a big advantage over provisioning things manually. This allows us to create the same environments every time we need them, reducing the number of inconsistencies between, for example, staging and production, which translates into less time developers spend debugging special situations and problems caused by configuration drift.

The way of applying changes can be both imperative and declarative; most of the tools support both ways, while some are only declarative in nature (like Terraform or CloudFormation). Some started out imperative but adopted declarative configuration as it gained more traction (https://next.redhat.com/2017/07/24/ansible-declares-declarative-intent/).

Having your infrastructure in source control adds the benefit of using pull requests that get peer reviewed, a process that generates discussions, ideas, and improvements until the changes are approved and merged. It also makes our infrastructure changes clear to everyone and auditable.

We went through all these principles when we discussed the GitOps definition created by the Application Delivery TAG at the beginning of this chapter. More importantly, there are some parts of the GitOps definition that are not in the IaC one, such as software agents or the closed loop. IaC is usually applied with a CI/CD system, resulting in a push mode, where your pipeline connects to your system (cloud, database cluster, VMs, and so on) and performs the changes. GitOps, on the other hand, is about agents working to reconcile the state of the system with the one declared in source control. There is a loop in which the differences are calculated and applied until the states match, and we saw how this reconciliation happens again and again until no more differences are discovered; this is the actual closed loop.

This doesn't happen in an IaC setup; there are no operators/controllers when applying infrastructure changes. The updates are done in a push mode, which means the GitOps pull way is better in terms of security: it is not the pipeline that holds the production credentials, but your agent that stores them, and it can run in the same account as your production system, or at least in a separate but trusted one.

Having agents apply your changes also means GitOps can only support the declarative way. We need to be able to specify the state we want to reach; we don't have much control over how to get there, as we offload that burden onto our controllers and operators.

Can a tool that was previously defined as IaC be applied in a GitOps manner? Yes, and I think we have a good example with Terraform and Atlantis (https://www.runatlantis.io). This is a way of running an agent (Atlantis) in a remote setup, so commands are executed not from the pipeline but by the agent. That means it does fit the GitOps definition, though if we go into details, we might find some mismatches regarding the closed loop.

In my opinion, Atlantis applies infrastructure changes in a GitOps way, while applying Terraform from your pipeline is IaC.

So, there are not too many differences between these practices; they are more closely related than different. Both store the state in source control and open the path to making changes via pull requests. As for the differences, GitOps comes with the idea of agents and the control loop, which improves security, and it can only be declarative.

 

Summary

In this chapter, we discovered what GitOps means and which parts of Kubernetes make it possible. We checked how the API server connects everything, saw how controllers work, introduced a few of them, and explained how they react to state changes in an endless control loop. We took a closer look at Kubernetes' declarative nature, starting from imperative commands and then opening the path to applying not just a folder, but a Git repo. In the end, we implemented a very simple controller so you can get an idea of what Argo CD does.

In the next chapter, we are going to start exploring Argo CD, how it works, its concepts and architecture, and details around synchronization principles.

 

Further reading

Kubernetes controllers architecture: https://kubernetes.io/docs/concepts/architecture/controller/.

We used kubectl apply in order to make changes to the cluster, but if you want to see how to use Kubernetes API from Go code, here are some examples: https://github.com/kubernetes/client-go/tree/master/examples.

More on kubectl declarative config options can be found at: https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/.

GitOps Working Group presents the GitOps principles as OpenGitOps: https://opengitops.dev.

About the Authors

  • Spiros Economakis

Spiros started as a Software Engineer in 2010 and went through a series of jobs and roles, from Software Engineer and Software Architect to Head of Cloud. In 2013, he founded his own startup, which was his first touch with DevOps culture; with a small team, he built a couple of CI/CD pipelines for a microservice architecture and mobile app releases. Since then, he has influenced DevOps culture and automation in most of the companies he has been involved with.

In 2019, he started as an SRE at Lenses (acquired by Celonis) and soon influenced the organization with Kubernetes, GitOps, and cloud, transitioning to a position as Head of Cloud, where he introduced GitOps across the whole company and used Argo CD for bootstrapping Kubernetes clusters with utilities and continuous delivery of applications. He now works at an open-source company called Mattermost as a Senior Engineering Manager/SRE, where he transformed the old GitOps approach (Flux CD) to GitOps v2.0 with Argo CD and built a scalable architecture for multi-tenancy.

  • Liviu Costea

Liviu Costea started as a developer in the early 2000s, and his career path took him through different roles, from Developer to Coding Architect and from Team Lead to CTO. In 2012, he transitioned to DevOps when, at a small company, someone had to start working on pipelines and automation because the traditional way wasn't scalable anymore.

In 2018, he started with the Platform Team and then was the Tech Lead of the Release Team at Mambu, where they designed most of the CI/CD pipelines, adopting GitOps practices. They have been live with Argo CD since 2019. More recently, he joined Juni, a promising startup, where they are planning their GitOps adoption. For his contributions to OSS projects, including Argo CD, he was named a CNCF Ambassador in August 2020.
