In the previous chapter, you learned about the core components of Kubernetes and the basics of its infrastructure, and why putting Kubernetes in production is a challenging journey. We introduced the production-readiness characteristics for the Kubernetes clusters, along with our recommended checklist for the services and configurations that ensure the production-readiness of your clusters.
We also introduced a group of infrastructure design principles that we learned through building production-grade cloud environments. We use them as our guideline through this book whenever we make architectural and design decisions, and we highly recommend that cloud infrastructure teams consider these when it comes to architecting new infrastructure for Kubernetes and cloud platforms in general.
In this chapter, you will learn about the important architectural decisions that you will need to tackle while designing your Kubernetes infrastructure. We will explore the alternatives and the choices that you have for each of these decisions, along with the possible benefits and drawbacks. In addition to that, you will learn about the cloud architecture considerations, such as scaling, availability, security, and cost. We do not intend to make final decisions but provide the guidance because every organization has different needs and use cases. Our role is to explore them, and guide you through the decision-making process. When possible, we will state our preferred choices, which we will follow through this book for the practical exercises.
In this chapter, we will cover the following topics:
When it comes to Kubernetes infrastructure design, there are a few, albeit important, considerations to take into account. Almost every cloud infrastructure architecture shares the same set of considerations; however, we will discuss these considerations from a Kubernetes perspective, and shed some light on them.
Public cloud infrastructure, such as AWS, Azure, and GCP, introduced scaling and elasticity capabilities at unprecedented levels. Kubernetes and containerization technologies arrived to build upon these capabilities and extend them further.
When you design a Kubernetes cluster infrastructure, you should ensure that your architecture covers the following two areas:
To achieve the first requirement, there are parts that depend on the underlying infrastructure, either public cloud or on-premises, and other parts that depend on the Kubernetes cluster itself.
The first part is usually solved when you choose to use a managed Kubernetes service such as EKS, AKS, or GKE, as the cluster's control plane and worker nodes will be scalable and supported by other layers of scalable infrastructure.
However, in some use cases, you may need to deploy a self-managed Kubernetes cluster, either on-premises or in the cloud, and in this case, you need to consider how to support scaling and elasticity to enable your Kubernetes clusters to operate at their full capacity.
In all public cloud infrastructure, there is the concept of compute auto scaling groups, and Kubernetes clusters are built on them. However, because of the nature of the workloads running on Kubernetes, scaling needs should be synchronized with the cluster scheduling actions. This is where Kubernetes cluster autoscaler comes to our aid.
Cluster autoscaler (CAS) is a Kubernetes cluster add-on that you optionally deploy to your cluster, and it automatically scales up and down the size of worker nodes based on the set of conditions and configurations that you specify in the CAS. Basically, it triggers cluster upscaling when there is a pod that cannot schedule due to insufficient compute resources, or it triggers cluster downscaling when there are underutilized nodes, and their pods can be rescheduled and placed in other nodes. You should take into consideration the time a cloud provider takes to execute the launch of a new node, as this could be a problem for time-sensitive apps, and in this case, you may consider CAS configuration that enables node over provisioning.
For more information about CAS, refer to the following link: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler.
To achieve the second scaling requirement, Kubernetes provides two solutions to achieve autoscaling of the pods:
We highly recommend using HPA and VPA for your production deployments (it is not essential for non-production environments). We will give examples on how to use both of them in deploying production-grade apps and services in Chapter 8, Deploying Seamless and Reliable Applications.
Uptime means reliability and is usually the top metric that the infrastructure teams measure and target for enhancement. Uptime drives the service-level objectives (SLOs) for services, and the service level agreements (SLAs) with customers, and it also indicates how stable and reliable your systems and Software as a Service (SaaS) products are. High availability is the key for increasing uptime, and when it comes to Kubernetes clusters' infrastructure, the same rules still apply. This is why designing a highly available cluster and workload is an essential requirement for a production-grade Kubernetes cluster.
You can architect a highly available Kubernetes infrastructure on different levels of availability as follows:
As you may notice, Kubernetes has different architectural flavors when it comes to high availability setup, and I can say that having different choices makes Kubernetes more powerful for different use cases. Although the second choice is the most common one as of now, as it strikes a balance between the ease of implementation and operation, and the high availability level. We are optimistically searching for a time when we can reach the fourth level, where we can easily deploy Kubernetes clusters across cloud providers and gain all the high availability benefits without the burden of tough operations and increased costs.
As for the cluster availability itself, I believe it goes without saying that Kubernetes components should run in a highly available mode, that is, having three or more nodes for a control plane, or preferably letting the cloud manage the control plane for you, as in EKS, AKE, or GKE. As for workers, you have to run one or more autoscaling groups or node groups/pools, and this ensures high availability.
The other area where you need to consider achieving high availability is for the pods and workloads that you will deploy to your cluster. Although this is beyond the scope of this book, it is still worthwhile mentioning that developing new applications and services, or modernizing your existing ones so that they can run in a high availability mode, is the only way to make use of the raft of capabilities provided by the powerful Kubernetes infrastructure underneath it. Otherwise, you will end up with a very powerful cluster but with monolithic apps that can only run as a single instance!
Kubernetes infrastructure security is rooted at all levels of your cluster, starting from the network layer, going through the OS level, up to cluster services and workloads. Luckily, Kubernetes has strong support for security, encryption, authentication, and authorization. We will learn about security in Chapter 6, Securing Kubernetes Effectively, of this book. However, during the design of the cluster infrastructure, you should give attention to important decisions relating to security, such as securing the Kubernetes API server endpoint, as well as the cluster network design, security groups, firewalls, network policies between the control plane components, workers nodes, and the public internet.
You will also need to plan ahead in terms of the infrastructure components or integrations between your cluster and identity management providers. This usually depends on your organization's security policies, which you need to align with your IT and security teams.
Another aspect to consider is the auditing and compliance of your cluster. Most organizations have cloud governance policies and compliance requirements, which you need to be aware of before you proceed with deploying your production on Kubernetes.
If you decide to use a multi-tenant cluster, the security requirements could be more challenging, and setting clear boundaries among the cluster tenants, as well as cluster users from different internal teams, may result in decisions such as deploying a service mesh, hardening cluster network policies, and implementing a tougher Role-Based Access Control (RBAC) mechanism. All of this will impact your decisions while architecting the infrastructure of your first production cluster.
The Kubernetes community is keen on compliance and quality, and for that there are multiple tools and tests to ensure that your cluster achieves an acceptable level of security and compliance. We will learn about these tools and tests in Chapter 6, Securing Kubernetes Effectively.
Cloud cost management is an important factor for all organizations adopting cloud technology, both for those just starting and those who are already in the cloud. Adding Kubernetes to your cloud infrastructure is expected to bring cost savings, as containerization enables you to highly utilize your computer resources on a scale that was not possible with VMs ever before. Some organizations achieved cost savings up to 90% after moving to containers and Kubernetes.
However, without proper cost control, costs can rise again, and you end up with a lot of wasted infrastructure cost with uncontrolled Kubernetes clusters. There are many tools and best practices to consider in relation to cost management, but we mainly want to focus on the actions and the technical decisions that you need to consider during infrastructure design.
We believe that there are two important aspects that require decisions, and these decisions will definitely affect your cluster infrastructure architecture:
There are no definitive correct decisions, but we will try to explore the choices in the next section, and how we can reach a decision.
These are other considerations to be made regarding cost optimization where an early decision can be made:
We highly recommend using spot instances for worker nodes, and you can run them in their node group/pool and assign to them the types of workloads where you are not concerned with them being disrupted.
In Chapter 9, Monitoring, Logging, and Observability, and Chapter 10, Operating and Maintaining Efficient Kubernetes Clusters, we will learn about cost observability and cluster operations.
Usually, when an organization starts building a Kubernetes infrastructure, they invest most of their time, effort, and focus in urgent and critical demands for infrastructure design and deployment, which we usually call Day 0 and Day 1. It is unlikely that an organization will devote its attention to operational and manageability concerns that we will face in the future (Day 2).
This is justified by the lack of experience in Kubernetes, and the types of operational challenges, or by being driven by gaining the benefits of Kubernetes that mainly relate to development, such as increasing a developer's productivity and agility, and automating releases and deployment.
All of this leads to organizations and teams being less prepared for Day 2. In this book, we try to maintain a balance between design, implementation, and operations, and shed some light on the important aspects of the operation and learn how to plan for it from Day 0, especially in relation to reliability, availability, security, and observability.
These are the common operational and manageability challenges that most teams face after deploying Kubernetes in production. This is where you need to rethink and consider solutions beforehand in order to handle these challenges properly:
etcd
, kube-proxy
, Docker images, and configuration for the cluster add-ons, become challenging to manage during the cluster life cycle. This requires the correct tools to be in place from the outset. Automation and IaC tools, such as Terraform, Ansible, and Helm, are commonly used to help in this regard.There are other operational challenges. However, we found that most of these can be mitigated if we stick to the following infrastructure best practices and standards:
Defining your set of standards covers processes for operation runbooks and playbooks, as well as technology standardization – using Docker, Kubernetes, and standard tools across teams. These tools should have specific characteristics: open source but battle-tested in production, the ability to support the other principles, such as IaC code, immutability, being cloud-agnostic, and being simple to use and deploy with a minimum of infrastructure.
Managing Kubernetes infrastructure is about management complexity. Hence, having a solid infrastructure design, applying best practices and standards, increasing the team's Kubernetes-specific skills, and expertise will all result in a smooth operational and manageability journey.
Kubernetes and its ecosystem come with vast choices for everything you can do related to deploying, orchestrating, and operating your workloads. This flexibility is a huge advantage, and enables Kubernetes to suit different use cases, from regular applications on-premises and in the cloud to IoT and edge computing. However, choices come with responsibility, and in this chapter, we learn about the technical decisions that you need to evaluate and take regarding your cluster deployment architecture..
One of the important questions to ask and a decision to make is where to deploy your clusters, and how many of them you may need in order to run your containerized workloads? The answer is usually driven by both business and technical factors; elements such as the existing infrastructure, cloud transformation plan, cloud budget, the team size, and business growth target. All of these aspects could affect this, and this is why the owner of the Kubernetes initiative has to collaborate with organization teams and executives to reach a common understanding of the decision drivers, and agree on the right direction for their business.
We are going to explore some of the common Kubernetes deployment architecture alternatives, with their use cases, benefits, and drawbacks:
We have discussed the various clusters deployments architectures and models available and their use cases. In the next section, we will learn work on designing the Kubernetes infrastructure that we will use in this book, and the technical decisions around it..
In this chapter, we have discussed and explored various aspects of Kubernetes clusters design, and the different architectural considerations that you need to take into account. Now, we need to put things together for the design that we will follow during this book. The decisions that we will make here do not mean that they are the only right ones, but this is the preferred design that we will follow in terms of having minimally acceptable production clusters for this book's practical exercise. You can definitely use the same design, but with modifications, such as cluster sizing.
In the following sections, we will explore our choices regarding the cloud provider, provisioning and configuration tools, and the overall infrastructure architecture, and in the chapters to follow, we will build upon these choices and use them to provision production-like clusters as well as deploy the configuration and services above the cluster.
As we learned in the previous sections, there are different ways in which to deploy Kubernetes. You can deploy it locally, on-premises, or in a public cloud, private cloud, hybrid, multi-cloud, or an edge location. Each of these infrastructure type has use cases, benefits, and drawbacks. However, the most common one is the public cloud, followed by the hybrid model. The remaining choices are still limited to specific use cases.
In a single book like ours, we cannot discuss each of these infrastructure platforms, so we decided to go with the common choice for deploying Kubernetes, by using one of the public clouds (AWS, Azure, or GCP). You still can use another cloud provider, a private cloud, or even an on-premises setup, and most of the concepts and best practices discussed in this book are still applicable.
When it comes to choosing one of the public clouds, we do not advocate one over the others, and we definitely recommend using the cloud provider that you already use for your existing infrastructure, but if you are just embarking on your cloud journey, we advise you to perform a deeper benchmarking analysis between the public clouds to see which one is better for your business.
In the practical exercises in this book, we will use AWS and the Elastic Kubernetes Service (EKS). We explained in the previous chapter regarding the infrastructure design principle that we always prefer a managed service over its self-managed counterpart, and this applies here when it comes to choosing between EKS and building our self-managed clusters over AWS.
When you plan for your cluster, you need to decide both the cluster and node sizes. This decision should be based on the estimated utilization of your workloads, which you may know beforehand based on your old infrastructure, or it can be calculated approximately and then adjusted after going live in production. In either case, you will need to decide on the initial cluster and node sizes, and then keep adjusting them until you reach the correct utilization level to achieve a balance between cost and reliability. You can target a utilization level of between 70 and 80% unless you have a solid justification for using a different level.
These are the common cluster and node size choices that you can consider either individually or in a combination:
etcd
, kube-proxy
, and so on) is higher than managing the same compute power for a larger node, in addition to which small nodes have a lower limit for pods per node.In a decentralized approach, the teams or individuals within an organization are allowed to create and manage their own Kubernetes clusters. This approach provides flexibility for the teams to get the best out of their clusters, and customize them to fit their use cases; on the other hand, this increases the operational overhead, cloud cost, and makes it difficult to enforce standardization, security, best practices, and tools across the clusters. This approach is more appropriate for organizations that are highly decentralized, or when they are going through cloud transformation, product life cycle transitional periods, or exploring and innovating new technologies and solutions.
In a centralized approach, the teams or individuals share a single cluster or small group of identical clusters that use a similar set of standards, configurations, and services. This approach overcomes and decreases the drawbacks in the decentralized model; however, it can be inflexible, slow down the cloud transformations, and decreases teams' agility. This approach is more suitable for organizations working towards maturity, platform stability, increasing cloud cost reduction, enforcing and promoting standards and best practices, and focusing on products rather than the underlaying platform.
Some organizations can run a hybrid models from the aforementioned alternatives, such as having large, medium, and small nodes to get the best of each type according to their apps needs. However, we recommend that you run experiments to decide which model suits your workload's performance, and meets your cloud cost reduction goal.
In the early days of Kubernetes, we used to deploy it from scratch, which was commonly called Kubernetes the Hard Way. Fast forward and the Kubernetes community got bigger and a lot of tools emerged to automate the deployment. These tools range from simple automation to complete one-click deployment.
In the context of this book, we are not going to explain each of these tools in the market (there are a lot), nor to compare and benchmark them. However, we will propose our choices with a brief reasoning behind the choices.
When you deploy Kubernetes for the first time, most likely you will use a command-line tool with a single command to provision the cluster, or you may use a cloud provider web console to do that. In both ways, this approach is suitable for experimental and learning purposes, but when it comes to real implementation across production and development environments a provisioning tool becomes a must.
The majority of organizations that consider deploying Kubernetes already have an existing cloud infrastructure or they are going through a cloud migration process. This makes Kubernetes not the only piece of the cloud infrastructure that they will use. This is why we prefer a provisioning tool that achieves the following:
We can find these characteristics in Terraform, and this is why we chose to use it in the production clusters that we managed, as well as in this practical exercise in this book. We highly recommend Terraform for you as well, but if you prefer another portioning tool, you can skip this chapter and then continue reading this book and apply the same concepts and best practices.
Kubernetes configuration is declarative by nature, so, after deploying a cluster, we need to manage its configuration. The add-ons deployed provide services for various areas of functionality, including networking, security, monitoring, and logging. This is why a solid and versatile configuration management tool is required in your toolset.
The following are solid choices:
Our preferred order of suitable tools is as follows:
We can debate this order, and we believe that any of these tools can fulfill the configuration management needs for Kubernetes clusters. However, we prefer to use Ansible for its versatility and flexibility as it can be used for Kubernetes and also for other configuration management needs for your environment, which makes it preferable over Helm. On the other hand, Ansible is preferred over Terraform because it is a provisioning tool at heart, and while it can handle configuration management, it is not the best tool for that.
In the hands-on exercises in this book, we decided to use Ansible with Kubernetes module and Jinja2 templates.
Each organization has its own way of managing cloud accounts. However, we recommend having at least two AWS accounts, one for production and another for non-production. The production Kubernetes cluster resides in the production account, and the non-production Kubernetes cluster resides in the non-production account. This structure is preferred for security, reliability, and operational efficiency.
Based on the technical decisions and choices that we made in the previous sections, we propose the following AWS architecture for the Kubernetes clusters that we will use in this book, which you can also use to deploy your own production and non-production clusters:
In the previous architecture diagram, we decided to do the following:
We will discuss the details of these design specs in the next chapters, in addition to the remainder of the technical aspects of the cluster's architecture.
Provisioning a Kubernetes cluster can be a task that takes 5 minutes with modern tools and managed cloud services; however, thus this is far from a production-grade Kubernetes infrastructure and it is only sufficient for education and trials. Building a production-grade Kubernetes cluster requires hard work in designing and architecting the underlying infrastructure, the cluster, and the core services running above it.
By now, you have learned about the different aspects and challenges you have to consider while designing, building, and operating your Kubernetes clusters. We explored the different architecture alternatives to deploy Kubernetes clusters, and the important technical decisions associated with this process. Then, we discussed the proposed cluster design, which we will use during the book for the practical exercises, and we highlighted our selection of infrastructure platform, tools, and architecture.
In the next chapter, we will see how to put everything together and use the design concepts we discussed in this chapter to write IaC and follow industry best practices with Terraform to provision our first Kubernetes cluster.
For more information on the topics covered in this chapter, please refer to the following links:
Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.
If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.
Please Note: Packt eBooks are non-returnable and non-refundable.
Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:
If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:
Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.
You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.
Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.
When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.
For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.