The adoption of cloud technology has changed the way enterprises run their IT services. By leveraging new approaches to how resources are used, several cloud solutions came into play in different categories: private, public, hybrid, and community. Whatever cloud category is used, this trend has been felt by many organizations, which need to introduce an orchestration engine into their infrastructure to embrace elasticity and scalability and, to a certain extent, achieve a unique user experience. Nowadays, a remarkable orchestration solution, which falls into the private cloud category, has brought thousands of enterprises to the next era of data center generation: OpenStack. At the time of writing, OpenStack has been deployed in several large to medium enterprise infrastructures, running different types of production workload. The maturity of this cloud platform has been boosted by the joint effort of several large organizations and its vast developer community around the globe. With every new release, OpenStack brings more features, which makes it an attractive solution for organizations seeking to invest in it, with returns in operational workloads and flexible infrastructure.
In this edition, we will keep explaining the novelties of OpenStack within the latest releases and discuss the opportunities that OpenStack can offer for a better cloud experience.
Deploying OpenStack is still a challenging step that needs a good understanding of its beneficial returns to a given organization in terms of automation, orchestration, and flexibility. If expectations are set properly, this challenge will turn into a valuable opportunity that deserves an investment.
After collecting infrastructure requirements, starting an OpenStack journey will need a good design and consistent deployment plan with different architectural assets.
The Japanese military leader, Miyamoto Musashi, wrote the following, very impressive thought on perception and sight, in The Book of Five Rings, Start Publishing LLC:
"In strategy, it is important to see distant things as if they were close and to take a distanced view of close things."
Our OpenStack journey will start by going through the following points:
- Getting acquainted with the logical architecture of the OpenStack ecosystem by revisiting its components
- Learning how to design an OpenStack environment by choosing the right core services for the right environment
- Enlarging the OpenStack ecosystem by joining new projects within the latest stable releases
- Designing the first OpenStack architecture for a large-scale environment
- Planning for growth by going through first-deployment best practices and capacity planning
Cloud computing is about providing various types of infrastructural services, such as Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). The challenge set by the public cloud is about agility, speed, and self-service. Most companies have expensive IT systems, which they have developed and deployed over the years, but these systems are siloed and need human intervention. In many cases, IT systems are struggling to match the agility and speed of public cloud services. The traditional data center model and siloed infrastructure might become unsustainable in today's agile service delivery environment. In fact, today's enterprise data center must focus on speed, flexibility, and automation for delivering services to reach the level of next-generation data center efficiency.
The big move to a software infrastructure has allowed administrators and operators to deliver a fully automated infrastructure within a minute. The next-generation data center reduces the infrastructure to a single, big, agile, scalable, and automated unit. The end result is a programmable, scalable, and multi-tenant-aware infrastructure. This is where OpenStack comes into the picture: it promises the features of a next-generation data center operating system. The ubiquitous influence of OpenStack was felt by many big global cloud enterprises such as VMware, Cisco, Juniper, IBM, Red Hat, Rackspace, PayPal, and eBay, to name but a few. Today, many of them are running a very large scalable private cloud based on OpenStack in their production environment. If you intend to be a part of a winning, innovative cloud enterprise, you should jump to the next-generation data center and gain valuable experience by adopting OpenStack in your IT infrastructure.
To read more about the success stories of many companies, visit https://www.openstack.org/user-stories.
Before delving into the OpenStack architecture, we need to refresh our knowledge, fill any gaps, and learn more about the basic concepts and usage of each core component.
In order to get a better understanding of how it works, it will be beneficial to first briefly look at the parts that make it work. In the following sections, we will look at various OpenStack services, which work together to provide the cloud experience to the end user. Despite the different services catering to different needs, they follow a common theme in their design that can be summarized as follows:
- Most OpenStack services are developed in Python, which aids rapid development.
- All OpenStack services provide REST APIs. These APIs are the main external communication interfaces for services and are used by the other services or end users.
- The OpenStack service itself may be implemented as different components. The components of a service communicate with each other over the message queue. The message queue provides various advantages such as queuing of requests, loose coupling, and load distribution among the worker daemons.
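The message-queue pattern described in the last point can be sketched in a few lines of Python. This is only an illustration using the standard library's in-process queue, not the message broker or RPC layer that OpenStack actually uses; the function name `run_workers` and the request names are hypothetical.

```python
import queue
import threading


def run_workers(requests, worker_count=3):
    """Distribute requests from a shared queue among worker threads."""
    tasks = queue.Queue()
    handled = {}                 # request -> name of the worker that handled it
    lock = threading.Lock()

    def worker(name):
        while True:
            request = tasks.get()
            if request is None:  # sentinel: no more work for this worker
                break
            with lock:
                handled[request] = name

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    # The producer only enqueues requests; it never calls a worker directly,
    # which is what keeps the two sides loosely coupled.
    for request in requests:
        tasks.put(request)
    for _ in threads:
        tasks.put(None)          # one sentinel per worker
    for thread in threads:
        thread.join()
    return handled


if __name__ == "__main__":
    result = run_workers([f"boot-vm-{n}" for n in range(6)])
    for request, name in sorted(result.items()):
        print(request, "->", name)
```

Requests wait in the queue until a worker is free, and whichever worker is idle picks up the next one, which is the buffering and load-distribution behavior the list describes.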
With this common theme in mind, let's now put the essential core components under the microscope and go a bit further by asking the question: What is the purpose of each component?
From an architectural perspective, Keystone is the simplest service in the OpenStack composition. It is the core component and provides an identity service comprising authentication and authorization of tenants in OpenStack. Communications between different OpenStack services are authorized by Keystone to ensure that the right user or service is able to utilize the requested OpenStack service. Keystone integrates with numerous authentication mechanisms such as username/password and token-based authentication systems. Additionally, it is possible to integrate it with an existing backend such as the Lightweight Directory Access Protocol (LDAP) and the Pluggable Authentication Module (PAM).
With the evolution of Keystone, many features have been implemented within recent OpenStack releases leveraging a centralized and federated identity solution. This allows users to use their credentials in an existing, centralized, sign-on backend and decouples the authentication mechanism from Keystone.
The federated identity solution became more stable in the OpenStack Juno release, in which Keystone acts as a Service Provider (SP) and consumes user identity information from a trusted Identity Provider (IdP) in the form of SAML assertions or OpenID Connect claims. An IdP can be backed by LDAP, Active Directory, or SQL.
Swift is one of the storage services available to OpenStack users. It provides an object-based storage service and is accessible through REST APIs. Compared to traditional storage solutions, such as file shares or block-based access, object storage takes the approach of dealing with stored data as objects that can be stored in and retrieved from the object store. A very high-level overview of object storage goes like this: to store the data, the object store splits it into smaller chunks and stores them in separate containers. These containers are maintained in redundant copies spread across a cluster of storage nodes to provide high availability, auto-recovery, and horizontal scalability.
We will leave the details of the Swift architecture for later. Briefly, it has a number of benefits:
- It has no central brain, which means no Single Point Of Failure (SPOF)
- It is self-healing, which means auto-recovery in the case of failure
- It is highly scalable, handling petabytes of storage by scaling horizontally
- It offers better performance, which is achieved by spreading the load over the storage nodes
- It can use inexpensive hardware for redundant storage clusters
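To make the idea of redundant copies spread across storage nodes more concrete, here is a minimal Python sketch. It is a deliberately simplified hash-based placement and not Swift's actual placement algorithm; the function `place_replicas`, the node names, and the default of three replicas are assumptions made for illustration.

```python
import hashlib


def place_replicas(object_name, nodes, replicas=3):
    """Pick `replicas` distinct nodes for an object from a hash of its name."""
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    # Walk the node list from the hashed position, wrapping around, so that
    # every copy lands on a different node.
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


if __name__ == "__main__":
    cluster = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    print(place_replicas("photos/cat.jpg", cluster))
```

Because no single node decides the placement and any client can recompute it from the object name, there is no central lookup to fail, and losing one node still leaves the other copies reachable.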
You may wonder whether there is another way to provide storage to OpenStack users. Indeed, the management of the persistent block storage is available in OpenStack by using the Cinder service. Its main capability is to provide block-level storage to the virtual machine. Cinder provides raw volumes that can be used as hard disks in virtual machines.
Some of the features that Cinder offers are as follows:
- Volume management: This allows the creation or deletion of a volume
- Snapshot management: This allows the creation or deletion of a snapshot of volumes
- Attaching or detaching volumes from instances
- Cloning volumes
- Creating volumes from snapshots
- Copy of images to volumes and vice versa
It is very important to keep in mind that, like Keystone services, Cinder features can be delivered by orchestrating various backend volume providers through configurable drivers for vendors' storage products, such as those from IBM, NetApp, Nexenta, and VMware.
At an architectural level, Cinder has proven to be an ideal replacement for the old nova-volume service that existed before the Folsom release. It is important to know that Cinder has organized and created a catalog of block-based storage devices with several differing characteristics. However, we must consider the limitations of commodity storage, such as redundancy and auto-scaling.
In the OpenStack Grizzly release, a joint feature was implemented to allow creating backups of Cinder volumes. A common use case has seen Swift evolve as a storage backup solution. Over the next few releases, Cinder was enriched with more backup target stores such as NFS, Ceph, GlusterFS, POSIX file systems, and the proprietary IBM solution, Tivoli Storage Manager. This extensible backup feature is defined by means of Cinder backup drivers, which have become richer with every new release. In the OpenStack Mitaka release, Cinder showed the breadth of its backup options by bridging two different cloud computing environments with an additional backup driver targeting Google Cloud Platform. This allows OpenStack operators to leverage a hybrid cloud backup solution that strengthens a disaster recovery strategy for persistent data. What about security? This issue has been addressed since the Kilo release, so Cinder volumes can be encrypted before starting any backup operations.
Apart from the block and object storage we discussed in the previous section, since the Juno release, OpenStack has also had a file-share-based storage service called Manila. It provides storage as a remote file system. In operation, it resembles the Network File System (NFS) or Samba storage service that we are used to on Linux, whereas Cinder resembles a Storage Area Network (SAN) service. In fact, NFS and Samba, or the Common Internet File System (CIFS), are supported as backend drivers for the Manila service. The Manila service provides the orchestration of shares on the share servers.
More details on storage services will be covered in Chapter 5, OpenStack Storage - Block, Object, and File Share.
Each storage solution in OpenStack has been designed for a specific set of purposes and implemented for different targets. Before taking any architectural design decisions, it is crucial to understand the difference between existing storage options in OpenStack today, as outlined in the following table:
- Swift: accessed as objects through a REST API
- Cinder: accessed as block devices; a volume can only be used by one client and is accessible within a single VM
- Manila: accessible within multiple VMs
The Glance service provides a registry of images and metadata that the OpenStack user can launch as a virtual machine. Various image formats are supported and can be used based on the choice of hypervisor. Glance supports images for KVM/QEMU, Xen, VMware, Docker, and so on.
As a new user of OpenStack, one might often wonder: What is the difference between Glance and Swift? Both handle storage, so why do I need to integrate both?
Swift is a storage system, whereas Glance is an image registry. Glance is a service that keeps track of virtual machine images and the metadata associated with them. Metadata can be information such as a kernel, disk images, disk format, and so on. Glance makes this information available to OpenStack users over REST APIs. Glance can use a variety of backends for storing images. The default is to use directories, but in a massive production environment it can use other approaches such as NFS and even Swift.
Swift, on the other hand, is designed for object storage, where you can keep data such as virtual disks, images, backup archives, and so on.
The mission of Glance is to be an image registry. From an architectural point of view, the goal of Glance is to focus on advanced ways to store and query image information via the Image Service API. A typical use case for Glance is to allow a client (which can be a user or an external service) to register a new virtual disk image, while a storage system focuses on providing a highly scalable and redundant data store. At this level, as a technical operator, your challenge is to provide the right storage solution to meet cost and performance requirements. This will be discussed at the end of the book.
As you may already know, Nova is the original core component of OpenStack. From an architectural level, it is considered one of the most complicated components of OpenStack. Nova provides the compute service in OpenStack and manages virtual machines in response to service requests made by OpenStack users.
What makes Nova complex is its interaction with a large number of other OpenStack services and internal components, which it must collaborate with to respond to user requests for running a VM.
Let's break down the Nova service itself and look at its architecture as a distributed application that needs orchestration between different components to carry out tasks.
The nova-api component accepts and responds to end-user compute API calls. The end users or other components communicate with the OpenStack nova-api interface to create instances via the OpenStack API or EC2 API.
The nova-compute component is primarily a worker daemon that creates and terminates VM instances via the hypervisor's APIs (XenAPI for XenServer, libvirt for KVM, and the VMware API for VMware).
The nova-network component accepts networking tasks from the queue and then performs these tasks to manipulate the network (such as setting up bridging interfaces or changing iptables rules).
The nova-scheduler component takes a VM instance's request from the queue and determines where it should run (specifically, which compute host it should run on). At an application architecture level, the term scheduling or scheduler refers to a systematic search for the best fit within a given infrastructure to improve its performance.
The nova-conductor service provides database access to compute nodes. The idea behind this service is to prevent direct database access from the compute nodes, thus enhancing database security in case one of the compute nodes gets compromised.
Zooming out to the general components of OpenStack, we find that Nova interacts with several services such as Keystone for authentication, Glance for images, and Horizon for the web interface. For example, the Glance interaction is central: the nova-api process can send queries to Glance, while nova-compute downloads images to launch instances.
Nova also provides console services that allow end users to access the console of the virtual instance through a proxy such as nova-console, nova-novncproxy, and nova-consoleauth.
Neutron provides a real Network as a Service (NaaS) capability between interface devices that are managed by OpenStack services such as Nova. There are various characteristics that should be considered for Neutron:
- It allows users to create their own networks and then attach server interfaces to them
- Its pluggable backend architecture lets users take advantage of commodity gear or vendor-supported equipment
- It provides extensions to allow additional network services to be integrated
Neutron has many core network features that are constantly growing and maturing. Some of these features are useful for routers, virtual switches, and SDN networking controllers.
Neutron introduces the following core resources:
- Ports: Ports in Neutron refer to the virtual switch connections. These connections are where instances and network services are attached to networks. When attached to subnets, the defined MAC and IP addresses of the interfaces are plugged into them.
- Networks: Neutron defines networks as isolated Layer 2 network segments. Operators will see networks as logical switches that are implemented by the Linux bridging tools, Open vSwitch, or some other virtual switch software. Unlike physical networks, either the operators or users in OpenStack can define this.
- Subnet: Subnets in Neutron represent a block of IP addresses associated with a network. IP addresses from this block are allocated to the ports.
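The relationship between subnets and ports can be sketched as follows. This is a minimal illustration built on Python's standard `ipaddress` module, not Neutron's data model or API; the `Subnet` class, the `create_port` method, and the MAC addresses are hypothetical, and the sketch ignores details such as reserving a gateway address.

```python
import ipaddress


class Subnet:
    """A block of IP addresses associated with a network (simplified)."""

    def __init__(self, cidr):
        self.network = ipaddress.ip_network(cidr)
        self._free = iter(self.network.hosts())  # usable host addresses, in order
        self.ports = {}

    def create_port(self, port_id, mac_address):
        """Create a port and allocate the next free address of the block to it."""
        try:
            ip = next(self._free)
        except StopIteration:
            raise RuntimeError("subnet has no free addresses") from None
        port = {"mac_address": mac_address, "ip_address": str(ip)}
        self.ports[port_id] = port
        return port


if __name__ == "__main__":
    subnet = Subnet("192.168.10.0/29")
    print(subnet.create_port("port-1", "fa:16:3e:00:00:01"))
    print(subnet.create_port("port-2", "fa:16:3e:00:00:02"))
```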
Neutron provides additional resources as extensions. The following are some of the commonly used extensions:
- Routers: Routers provide gateways between various networks.
- Private IPs: Neutron defines two types of networks. They are as follows:
  - Tenant networks: Tenant networks use private IP addresses. Private IP addresses are visible within the instance, and this allows the tenant's instances to communicate while maintaining isolation from other tenants' traffic. Private IP addresses are not visible to the Internet.
  - External networks: External networks are visible and routable from the Internet. They must use routable subnet blocks.
- Floating IPs: A floating IP is an IP address allocated on an external network that Neutron maps to the private IP of an instance. Floating IP addresses are assigned to an instance so that they can connect to external networks and access the Internet. Neutron achieves the mapping of floating IPs to the private IP of the instance by using Network Address Translation (NAT).
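The floating IP mapping described above amounts to a translation table applied in both directions. The following Python sketch shows the idea only; it is not how Neutron implements NAT, and the `FloatingIpTable` class, its method names, and the example addresses are assumptions made for illustration.

```python
class FloatingIpTable:
    """One-to-one mapping between floating IPs and instance private IPs."""

    def __init__(self):
        self._private_by_floating = {}

    def associate(self, floating_ip, private_ip):
        if floating_ip in self._private_by_floating:
            raise ValueError(f"{floating_ip} is already associated")
        self._private_by_floating[floating_ip] = private_ip

    def translate_inbound(self, destination_ip):
        """Traffic from outside: rewrite the floating IP to the private IP."""
        return self._private_by_floating.get(destination_ip, destination_ip)

    def translate_outbound(self, source_ip):
        """Traffic from the instance: rewrite the private IP to the floating IP."""
        for floating_ip, private_ip in self._private_by_floating.items():
            if private_ip == source_ip:
                return floating_ip
        return source_ip


if __name__ == "__main__":
    table = FloatingIpTable()
    table.associate("203.0.113.10", "10.0.0.5")
    print(table.translate_inbound("203.0.113.10"))
    print(table.translate_outbound("10.0.0.5"))
```

Addresses without an association pass through unchanged in this sketch, which mirrors the fact that only instances with a floating IP are reachable from the external network.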
Neutron also provides advanced services that add further networking capabilities to OpenStack, as follows:
- Load Balancing as a Service (LBaaS) to distribute the traffic among multiple compute node instances.
- Firewall as a Service (FWaaS) to secure layer 3 and 4 network perimeter access.
- Virtual Private Network as a Service (VPNaaS) to build secured tunnels between instances or hosts.
You can refer to the latest updated Mitaka release documentation for more information on networking in OpenStack at http://docs.openstack.org/mitaka/networking-guide/.
The three main components of the Neutron architecture are:
- Neutron server: It accepts API requests and routes them to the appropriate Neutron plugin for action.
- Neutron plugins: They perform the actual work for the orchestration of backend devices, such as plugging in or unplugging ports, creating networks and subnets, or IP addressing.
Agents and plugins differ depending on the vendor technology of a particular cloud for the virtual and physical Cisco switches, NEC, OpenFlow, Open vSwitch, Linux bridging, and so on.
- Neutron agents: Neutron agents run on the compute and network nodes. The agents receive commands from the plugins on the Neutron server and bring the changes into effect on the individual compute or network nodes. Different types of Neutron agents implement different functionality. For example, the Open vSwitch agent implements L2 connectivity by plugging and unplugging ports onto Open vSwitch (OVS) bridges and they run on both compute and network nodes, whereas L3 agents run only on network nodes and provide routing and NAT services.
Neutron is a service that manages network connectivity between the OpenStack instances. It ensures that the network will not be turned into a bottleneck or limiting factor in a cloud deployment and gives users real self-service, even over their network configurations.
Another advantage of Neutron is its ability to provide a way to integrate vendor networking solutions and a flexible way to extend network services. It is designed to provide a plugin and extension mechanism that presents an option for network operators to enable different technologies via the Neutron API. More details about this will be covered in Chapter 6, OpenStack Networking - Choice of Connectivity Types and Networking Services, and Chapter 7, Advanced Networking - A Look at SDN and NFV.
Keep in mind that Neutron allows users to manage and create networks or connect servers and nodes to various networks.
The scalability advantage will be discussed in a later topic in the context of the Software Defined Network (SDN) and Network Function Virtualization (NFV) technology, which is attractive to many network administrators who seek a high level of network multi-tenancy.
Ceilometer provides a metering service in OpenStack. In a shared, multi-tenant environment such as OpenStack, metering resource utilization is of prime importance.
Ceilometer collects data associated with resources. Resources can be any entity in the OpenStack cloud such as VMs, disks, networks, routers, and so on. Resources are associated with meters. The utilization data is stored in the form of samples in units defined by the associated meter. Ceilometer has an inbuilt summarization capability.
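The relationship between resources, meters, samples, and summarization can be illustrated with a short Python sketch. It does not use the Ceilometer API; the sample tuple layout, the `summarize` function, and the meter and resource names are assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean


def summarize(samples):
    """Group samples by (resource, meter) and compute basic statistics.

    Each sample is a (resource_id, meter_name, volume) tuple, where the
    volume is expressed in the unit defined by the meter.
    """
    grouped = defaultdict(list)
    for resource_id, meter_name, volume in samples:
        grouped[(resource_id, meter_name)].append(volume)
    return {
        key: {
            "count": len(volumes),
            "min": min(volumes),
            "max": max(volumes),
            "avg": mean(volumes),
        }
        for key, volumes in grouped.items()
    }


if __name__ == "__main__":
    collected = [
        ("vm-1", "cpu_util", 10.0),
        ("vm-1", "cpu_util", 30.0),
        ("vm-2", "cpu_util", 50.0),
    ]
    for key, stats in summarize(collected).items():
        print(key, stats)
```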
Ceilometer allows data collection from various sources, such as the message bus, polling resources, centralized agents, and so on.
As an additional design change in the Telemetry service in OpenStack since the Liberty release, the Alarming service has been decoupled from the Ceilometer project to make use of a new incubated project code-named Aodh. The Telemetry Alarming service is dedicated to managing alarms and triggering them based on collected metering and scheduled events.
More Telemetry service enhancements have been proposed to adopt a Time Series Database as a Service project code-named Gnocchi. This architectural change will tackle the challenge of metrics and event storage at scale in the OpenStack Telemetry service and improve its performance.
Telemetry and system monitoring are covered in more detail in Chapter 10, Monitoring and Troubleshooting - Running a Healthy OpenStack Cluster.
Debuting in the Havana release is the OpenStack Orchestration project Heat. Initial development for Heat was limited to a few OpenStack resources including compute, image, block storage, and network services. Heat has boosted the emergence of resource management in OpenStack by orchestrating different cloud resources resulting in the creation of stacks to run applications with a few pushes of a button. From simple template engine text files referred to as HOT templates (Heat Orchestration Template), users are able to provision the desired resources and run applications in no time. Heat is becoming an attractive OpenStack project due to its maturity and extended support resources catalog within the latest OpenStack releases. Other incubated OpenStack projects such as Sahara (Big Data as a Service) have been implemented to use the Heat engine to orchestrate the creation of the underlying resources stack. It is becoming a mature component in OpenStack and can be integrated with some system configuration management tools such as Chef for full stack automation and configuration setup.
Heat uses template files in YAML or JSON format; indentation is important!
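As a rough illustration of what such a template contains, the following Python sketch builds a single-server template as a dictionary and prints it in JSON form. The overall shape (a template version, a description, and a map of resources with a type and properties) follows the HOT format, but the resource name `web_server`, the helper `build_template`, and the image and flavor values are placeholders that would have to match what exists in a given cloud.

```python
import json


def build_template(image, flavor):
    """Return a minimal single-server template as a plain dictionary."""
    return {
        "heat_template_version": "2015-10-15",
        "description": "Single-server stack (illustrative only)",
        "resources": {
            # "web_server" is an arbitrary name chosen for this example.
            "web_server": {
                "type": "OS::Nova::Server",
                "properties": {"image": image, "flavor": flavor},
            }
        },
    }


if __name__ == "__main__":
    # The image and flavor names are placeholders.
    print(json.dumps(build_template("cirros", "m1.small"), indent=2))
```

Building the template as a data structure and serializing it sidesteps hand-written indentation mistakes, although templates are more commonly written directly as YAML text files.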
The Orchestration project in OpenStack is covered in more detail in Chapter 8, Operating the OpenStack Infrastructure - The User Perspective.
Horizon is the web dashboard that pulls all the different pieces together from the OpenStack ecosystem.
Horizon provides a web frontend for OpenStack services. Currently, it includes all the OpenStack services as well as some incubated projects. It was designed as a stateless and data-less web application. It does nothing more than initiate actions in the OpenStack services via API calls and display information that OpenStack returns to Horizon. It does not keep any data except the session information in its own data store. It is designed to be a reference implementation that can be customized and extended by operators for a particular cloud. It forms the basis of several public clouds, most notably the HP Public Cloud, and at its heart is its extensible modular approach to construction.
Horizon is based on a series of modules called panels that define the interaction of each service. Its modules can be enabled or disabled, depending on the service availability of the particular cloud. In addition to this functional flexibility, Horizon is easy to style with Cascading Style Sheets (CSS).
Message Queue provides a central hub to pass messages between different components of a service. This is where information is shared between different daemons by facilitating the communication between discrete processes in an asynchronous way.
One major advantage of the queuing system is that it can buffer requests and provide unicast and group-based communication services to subscribers.
The database stores most of the build-time and run-time states for the cloud infrastructure, including instance types that are available for use, instances in use, available networks, and projects. It provides persistent storage for preserving the state of the cloud infrastructure. It is the second essential piece for sharing information among all OpenStack components.
Let's try to see how OpenStack works by chaining all the service cores covered in the previous sections in a series of steps:
- Authentication is the first action performed. This is where Keystone comes into the picture. Keystone authenticates the user based on credentials such as the username and password.
- The service catalog is then provided by Keystone. This contains information about the OpenStack services and the API endpoints.
- You can use the OpenStack CLI to get the catalog:
$ openstack catalog list
The service catalog is a JSON structure that exposes the resources available on a token request.
- Typically, once authenticated, you can talk to an API node. There are different APIs in the OpenStack ecosystem (the OpenStack API and EC2 API):
The following figure shows a high-level view of how OpenStack works:
- Another element in the architecture is the instance scheduler. Schedulers are implemented by OpenStack services that are architected around worker daemons. The worker daemons manage the launching of instances on individual nodes and keep track of resources available to the physical nodes on which they run. The scheduler in an OpenStack service looks at the state of the resources on a physical node (provided by the worker daemons) and decides the best candidate node to launch a virtual instance on. Examples of this architecture are nova-scheduler, which selects the compute node to run a virtual machine, and the Neutron L3 scheduler, which decides which L3 network node will host a virtual router.
The scheduling process in OpenStack Nova can use different algorithms such as simple, chance, and zone. A more advanced approach is to use filters and weights, ranking servers according to their available resources.
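The filter-and-weigh approach can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the nova-scheduler code; the `schedule` function, the host dictionary keys, and the choice of free RAM as the only weight are assumptions made for the example.

```python
def schedule(hosts, ram_mb, vcpus):
    """Return the name of the best host for an instance, or None if none fits."""
    # Filtering: discard hosts that cannot hold the requested instance.
    candidates = [
        host for host in hosts
        if host["free_ram_mb"] >= ram_mb and host["free_vcpus"] >= vcpus
    ]
    if not candidates:
        return None
    # Weighing: rank the remaining hosts, here by the amount of free RAM.
    best = max(candidates, key=lambda host: host["free_ram_mb"])
    return best["name"]


if __name__ == "__main__":
    compute_hosts = [
        {"name": "compute-1", "free_ram_mb": 2048, "free_vcpus": 2},
        {"name": "compute-2", "free_ram_mb": 8192, "free_vcpus": 4},
        {"name": "compute-3", "free_ram_mb": 16384, "free_vcpus": 1},
    ]
    print(schedule(compute_hosts, ram_mb=4096, vcpus=2))
```

In the usage example, the host with the most free RAM is filtered out because it lacks vCPUs, so the weighing step only ranks hosts that can actually run the instance.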
It is important to understand how different services in OpenStack work together, leading to a running virtual machine. We have already seen how a request is processed in OpenStack via APIs.
Let's figure out how things work by referring to the following simple architecture diagram:
The process of launching a virtual machine involves the interaction of the main OpenStack services that form the building blocks of an instance including compute, network, storage, and the base image. As shown in the previous diagram, OpenStack services interact with each other via a message bus to submit and retrieve RPC calls. The information of each step of the provisioning process is verified and passed by different OpenStack services via the message bus. From an architecture perspective, subsystem calls are defined and handled in OpenStack API endpoints involving Nova, Glance, Cinder, and Neutron.
On the other hand, the inter-communication of APIs within OpenStack requires an authentication mechanism to be trusted, which involves Keystone.
Starting with the identity service, the following steps summarize briefly the provisioning workflow based on API calls in OpenStack:
- Calling the identity service for authentication
- Generating a token to be used for subsequent calls
- Contacting the image service to list and retrieve a base image
- Processing the request to the compute service API
- Processing compute service calls to determine security groups and keys
- Calling the network service API to determine available networks
- Choosing the hypervisor node by the compute scheduler service
- Calling the block storage service API to allocate volume to the instance
- Spinning up the instance in the hypervisor via the compute service API call
- Calling the network service API to allocate network resources to the instance
It is important to keep in mind that tokens handled in OpenStack on every API call and service request are valid only for a limited time. One of the major causes of a failed provisioning operation in OpenStack is the expiration of the token during subsequent API calls. Additionally, the management of tokens has gone through a few changes across OpenStack releases. Two different approaches were used in OpenStack prior to the Liberty release:
- Universally Unique Identifier (UUID): Within Keystone version 2, a UUID token is generated and passed along with every API call between client services and back to Keystone for validation. This approach has been shown to degrade the performance of the identity service.
- Public Key Infrastructure (PKI): Within Keystone version 3, tokens are no longer validated at each API call by Keystone. API endpoints can verify the token by checking the Keystone signature added when initially generating the token.
Starting from the Kilo release, handling tokens in Keystone has progressed by introducing more sophisticated cryptographic authentication token methods, such as Fernet. The new implementation helps to tackle the token performance issues noticed with UUID and PKI tokens. Fernet is fully supported in the Mitaka release and the community is pushing to adopt it as the default. On the other hand, PKI tokens are deprecated in favor of Fernet tokens in releases after Kilo.
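Two ideas from this discussion, a token that expires and a token that can be checked from its signature without a lookup of stored token data, can be illustrated with the Python standard library. This sketch is not the UUID, PKI, or Fernet token format and does not use any Keystone code; the signing key, the `user|expiry|signature` layout, and the function names are hypothetical.

```python
import hashlib
import hmac
import time

SECRET = b"demo-signing-key"  # hypothetical key, for illustration only


def issue_token(user, ttl_seconds, now=None):
    """Return a signed token that carries its own expiry time."""
    now = time.time() if now is None else now
    payload = f"{user}|{int(now + ttl_seconds)}"
    signature = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def validate_token(token, now=None):
    """Check signature and expiry using only the token and the key."""
    now = time.time() if now is None else now
    try:
        user, expires_at, signature = token.split("|")
    except ValueError:
        return False
    expected = hmac.new(
        SECRET, f"{user}|{expires_at}".encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False              # tampered or forged token
    return now < int(expires_at)  # an expired token is rejected


if __name__ == "__main__":
    token = issue_token("alice", ttl_seconds=60)
    print(validate_token(token))
    print(validate_token(token, now=time.time() + 120))
```

The second call in the usage example simulates a long-running provisioning operation: the same token that was valid at the start is rejected once its lifetime has passed, which is the failure mode described above.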
More advanced topics regarding additions introduced in Keystone are covered briefly in Chapter 3, OpenStack Cluster - The Cloud Controller and Common Services.
Let us first go through the architecture that can be deployed.
Deployment of OpenStack depends on the components that were covered previously. It confirms your understanding of how to start designing a complete OpenStack environment. Of course, given the versatility and flexibility of such a cloud management platform, OpenStack offers several possibilities that can be considered an advantage. However, owing to such flexibility, it's a challenge to come up with the right design decision that suits your needs.
At the end of the day, it all comes down to the use cases that your cloud is designed to service.
Many enterprises have successfully designed their OpenStack environments by going through three phases of design: designing a conceptual model, designing a logical model, and finally, realizing the physical design. It's obvious that complexity increases from the conceptual to the logical design and from the logical to the physical design.
In the first conceptual phase, we will reflect at a high level on what we need from certain generic classes of the OpenStack architecture:
- Stores virtual machine images; provides a user interface
- Stores disk files; provides a user interface
- Provides a user interface
- Provides a user interface
- Provides network connectivity; provides a user interface
- Provides measurements, metrics, and alerts; provides a user interface
- Provides a scale-out file share system for OpenStack; provides a user interface
- Provides a graphical user interface
- Provides an orchestration engine for stack creation; provides a user interface
Let's map the generic basic classes in the following simplified diagram:
Keep in mind that the illustrated diagram will be refined over and over again since we will aim to integrate more services within our first basic design. In other words, we are following an incremental design approach, within which we should exploit the flexibility of the OpenStack architecture.
At this level, we can have a vision and direction of the main goal without worrying about the details.
Based on the conceptual design, you will most probably have a good idea of the different OpenStack core components, which will lay the foundation for the logical design.
We will start by outlining the relationships and dependencies between the core services of OpenStack. In this section, we will look at the deployment architecture of OpenStack. We will start by identifying the nodes that run OpenStack services: the cloud controller, network nodes, and the compute node. You may wonder why such a consideration, which looks like a physical design classification, appears at this stage. However, seeing the cloud controller and compute nodes as simple packages that encapsulate a bunch of OpenStack services will help you refine your design at an early stage. Furthermore, this approach will help you plan in advance for further high availability and scalability requirements, and will allow you to introduce them later in more detail.
Chapter 3, OpenStack Cluster – The Cloud Controller and Common Services, describes in depth how to distribute OpenStack services between cloud controllers and compute nodes.
Thus, the physical model design will be elaborated based on the previous theoretical phases by assigning parameters and values to our design. Let's start with our first logical iteration:
Obviously, in a highly available setup, we should achieve a degree of redundancy in each service within OpenStack. You may wonder about the critical OpenStack services claimed in the first part of this chapter: the database and message queue. Why can't they be separately clustered or packaged on their own? This is a pertinent question. Remember that we are still in the second logical phase where we try to dive slowly into the infrastructure without getting into the details. Besides, we keep on going from a generic and simple design to targeting specific use-cases. Decoupling infrastructure components such as RabbitMQ or MySQL from now on may lead to skipping the requirements of a simple design.
What about high availability? The previous figure includes several essential solutions for a highly scalable and redundant OpenStack environment, such as virtual IP (VIP), HAProxy, and Pacemaker. These technologies will be discussed in more detail in Chapter 9, OpenStack HA and Failover.
Compute nodes are relatively simple, as they are intended just to run the virtual machines' workload. In order to manage the VMs, the nova-compute service is assigned to each compute node. Besides, we should not forget that the compute nodes will not be isolated; a Neutron agent and an optional Ceilometer compute agent may run on these nodes.
Network nodes will run Neutron agents for DHCP and L3 connectivity.
You should now have a deeper understanding of the storage types within Swift, Cinder, and Manila.
However, we have not yet covered third-party software-defined storage options for Swift and Cinder.
More details will be covered in Chapter 5, OpenStack Storage, and File Share. For now, we will design from a basis where we have to decide how Cinder, Manila, and/or Swift will be a part of our logical design.
You will have to ask yourself questions such as: How much data do I need to store? Will my future use cases result in a wide range of applications that run heavy-analysis data? What are my storage requirements for incrementally backing up a virtual machine's snapshots? Do I really need control over the filesystem on the storage or is just a file share enough? Do I need a shared storage between VMs?
Many will ask the following question: If one can be satisfied by ephemeral storage, why offer block/share storage? To answer this question, you can think of ephemeral storage as storage where the end user can no longer access the virtual disk associated with their VM once it is terminated. Ephemeral storage should mainly be used in production when the VM state is non-critical, where users or applications don't store data on the VM. If you need your data to be persistent, you must plan for a storage service such as Cinder or Manila.
Remember that the current design applies for medium to large infrastructures. Ephemeral storage can also be a choice for certain users; for example, when they consider building a test environment. Considering the same case for Swift, we claimed previously that object storage might be used to store machine images, but when do we use such a solution? Simply put, when you have a sufficient volume of critical data in your cloud environment and start to feel the need for replication and redundancy.
OpenStack allows a wide range of network configurations, and variations such as tunneled networks (GRE, VXLAN, and so on) with Neutron are not intuitively obvious; they cannot be implemented sensibly without first identifying their use case in our design. This important step therefore requires you to distinguish between network topologies by understanding why each choice was made and why it may work for a given use case.
OpenStack has moved from simplistic network features to more complicated ones, but of course the reason is that it offers more flexibility! This is why OpenStack is here. It brings as much flexibility as it can! Without taking any random network-related decisions, let's see which network modes are available. We will keep on filtering until we hit the first correct target topology:
- Flat network design without tenant traffic isolation (nova-network Flat DHCP)
- Isolated tenant traffic and a predefined, fixed private IP space size; limited number of tenant networks (4K VLANs limit)
- Isolated tenant traffic; limited number of tenant networks (4K VLANs limit)
- Increased number of tenant networks; increased packet size (Neutron tunneled networking: GRE, VXLAN, and so on)
The preceding table shows a simple differentiation between two different logical network designs for OpenStack. Every mode shows its own requirements: this is very important and should be taken into consideration before the deployment.
For our example, since we aim to deploy a very flexible, large-scale environment, we will choose Neutron for network management instead of nova-network.
Note that it is also possible to keep on going with nova-network, but you have to worry about any Single Point Of Failure (SPOF) in the infrastructure. The choice was made for Neutron, since we started from a basic network deployment. We will cover more advanced features in the subsequent chapters of this book.
We would like to exploit a major advantage of Neutron compared to nova-network, which is the virtualization of Layers 2 and 3 of the OSI network model.
Let's see how we can expose our logical network design. For performance reasons, it is highly recommended to implement a topology that can handle different types of traffic by using separate logical networks.
In this way, as your network grows, it will still be manageable in case a sudden bottleneck or an unexpected failure affects a segment.
Let us look at the different networks in the OpenStack environment.
We will start by looking at the physical networking requirements of the cloud.
The main feature of a data network is that it provides the physical path for the virtual networks created by the OpenStack tenants. It separates the tenant data traffic from the infrastructure communication path required for communication between the OpenStack components themselves.
In a smaller deployment, the traffic for management and communication between the OpenStack components can be on the same physical link. This physical network provides a path for communication between the various OpenStack components such as REST API access and DB traffic, as well as for managing the OpenStack nodes. For a production environment, the network can be further subdivided to provide better isolation of traffic and contain the load on the individual networks.
The Storage network
The storage network provides physical connectivity and isolation for storage-related traffic between the VMs and the storage servers. As the traffic load for the storage network is quite high, it is a good idea to isolate the storage network load from the management and tenant traffic.
The features of an external or a public network are as follows:
- It provides global connectivity and uses routable IP addressing
- It is used by the virtual router to perform SNAT from the VM instances and provide external access to traffic originating from the VM and going to the Internet
SNAT refers to Source Network Address Translation. It allows traffic from a private network to go out to the Internet. OpenStack supports SNAT through its Neutron APIs for routers. More information can be found at http://en.wikipedia.org/wiki/Network_address_translation.
- It is used to provide a DNAT service for traffic from the Internet to reach a service running on the VM instance
The features of the tenant network are as follows:
- It provides a private network between virtual machines
- It uses private IP space
- It provides isolation of tenant traffic and allows multi-tenancy requirements for networking services
The next step is to validate our network design in a simple diagram:
Finally, we will bring our logical design to life in the form of a physical design.
We can start with a limited number of servers, just enough to set up the first deployment of our environment effectively.
Keep in mind that selecting commodity hardware is what makes our massively scalable architecture achievable.
Since the architecture is being designed to scale horizontally, we can add more servers to the setup. We will start by using commodity-class, cost-effective hardware.
To anticipate the cost of our infrastructure, it helps to make some basic hardware calculations as a first estimate of our requirements.
Considering the possibility of experiencing contentions for resources such as CPU, RAM, network, and disk, you cannot wait for a particular physical component to fail before you take corrective action, which might be more complicated.
Let's inspect a real-life example of the impact of underestimating capacity planning. A cloud-hosting company set up two medium servers, one for an e-mail server and the other to host the official website. The company, which is one of our several clients, grew in a few months and eventually ran out of disk space. The expected time to resolve such an issue is a few hours, but it took days. The problem was that all the parties did not make proper use of the cloud, due to the on demand nature of the service. This led to Mean Time To Repair (MTTR) increasing exponentially. The cloud provider did not expect this!
Incidents like this highlight the importance of proper capacity planning for your cloud infrastructure. Capacity management is considered a day-to-day responsibility where you have to stay updated with regard to software or hardware upgrades.
Through a continuous monitoring process of service consumption, you will be able to reduce the IT risk and provide a quick response to the customer's needs.
From your first hardware deployment, keep running your capacity management processes by looping through tuning, monitoring, and analysis.
The next step will take into account your tuned parameters and introduce the right change within your hardware/software, in coordination with the change management process.
Let's make our first calculation based on certain requirements. For example, let's say we aim to run 200 VMs in our OpenStack environment.
The following are the calculation-related assumptions:
- 200 virtual machines
- No CPU oversubscribing
Processor oversubscription is assessed by taking the total number of CPUs assigned to all the powered-on virtual machines and multiplying it by the clock speed (GHz) of a hardware CPU core. If this number is greater than the GHz purchased, the environment is oversubscribed.
- GHz per physical core = 2.6 GHz
- Physical core hyper-threading support = use factor 2
- GHz per VM (AVG compute units) = 2 GHz
- GHz per VM (MAX compute units) = 16 GHz
- Cores per CPU (Intel Xeon E5-2648L v2) = 10
- CPU sockets per server = 2
The formula for calculating the total number of CPU cores is as follows:
(number of VMs x number of GHz per VM) / number of GHz per core
(200 * 2) / 2.6 = 153.846
We need about 153 CPU cores for 200 VMs.
The formula for calculating the number of CPU sockets is as follows:
Total number of CPU cores / number of cores per socket
153 / 10 = 15.3
We will need around 15 sockets.
The formula for calculating the number of servers is as follows:
Total number of sockets / Number of sockets per server
15 / 2 = 7.5
You will need around seven to eight dual-socket servers; we round up to eight.
The number of virtual machines per server with eight dual-socket servers is calculated as follows:
Number of virtual machines / number of servers
200 / 8 = 25
We can deploy 25 virtual machines per server.
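The CPU estimate above can be sketched in a few lines of Python. This is only an illustration under the assumptions listed earlier; the function names `cpu_sizing` and `oversubscription_ratio` are hypothetical helpers, not part of OpenStack. Unlike the manual calculation, the sketch rounds up at every step, so it reports 154 cores and 16 sockets rather than 153 and 15, yet it arrives at the same eight servers and 25 VMs per server. The ratio check is a simplified reading of the oversubscription note: demanded GHz divided by purchased GHz.

```python
import math


def cpu_sizing(vms, ghz_per_vm, ghz_per_core, cores_per_socket, sockets_per_server):
    """Return (cores, sockets, servers, vms_per_server), rounding up at each step."""
    cores = math.ceil(vms * ghz_per_vm / ghz_per_core)
    sockets = math.ceil(cores / cores_per_socket)
    servers = math.ceil(sockets / sockets_per_server)
    vms_per_server = math.ceil(vms / servers)
    return cores, sockets, servers, vms_per_server


def oversubscription_ratio(vms, ghz_per_vm, servers, sockets_per_server,
                           cores_per_socket, ghz_per_core):
    """Demanded GHz divided by purchased GHz; a value above 1.0 means oversubscribed."""
    demanded = vms * ghz_per_vm
    purchased = servers * sockets_per_server * cores_per_socket * ghz_per_core
    return demanded / purchased


# Assumptions from the text: 200 VMs, 2 GHz average per VM, 2.6 GHz per core,
# 10 cores per socket, 2 sockets per server.
print(*cpu_sizing(200, 2.0, 2.6, 10, 2))                            # 154 16 8 25
print(round(oversubscription_ratio(200, 2.0, 8, 2, 10, 2.6), 2))   # 0.96
```

With eight servers, the ratio stays just below 1.0, which is consistent with the "no CPU oversubscribing" assumption. Hyper-threading is deliberately left out of the sketch, as it is in the manual calculation.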
Based on the previous example, 25 VMs can be deployed per compute node. Memory sizing is also important to avoid making unreasonable resource allocations.
Let's make an assumption list (keep in mind that it always depends on your budget and needs):
- 2 GB RAM per VM
- 8 GB RAM maximum dynamic allocations per VM
- Compute nodes supporting slots of: 2, 4, 8, and 16 GB sticks
- RAM required per compute node:
8 * 25 = 200 GB
Considering the number of sticks supported by your server, you will need around 256 GB installed. Therefore, the total number of RAM sticks installed can be calculated in the following way:
Total installed RAM / maximum RAM stick size
256 / 16 = 16
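The memory estimate can be expressed the same way. The following is a rough sketch; `ram_sizing` is a hypothetical helper, and rounding the required RAM up to the next power of two is our own assumption to mirror the jump from 200 GB to "around 256 GB installed".

```python
def ram_sizing(vms_per_node, max_ram_per_vm_gb, stick_sizes_gb=(2, 4, 8, 16)):
    """Return (required_gb, installed_gb, sticks) for one compute node."""
    required = vms_per_node * max_ram_per_vm_gb
    # Assumption: installed RAM is rounded up to the next power of two.
    installed = 1 << (required - 1).bit_length()
    # Use the largest supported stick size to minimize the number of slots.
    sticks = installed // max(stick_sizes_gb)
    return required, installed, sticks


# 25 VMs per node, 8 GB maximum dynamic allocation per VM.
print(ram_sizing(25, 8))  # (200, 256, 16)
```

If your servers have fewer than 16 slots, this is the point where either larger sticks or fewer VMs per node must be considered.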
To fulfill the plans that were drawn for reference, let's have a look at our assumptions:
- 200 Mbits/second is needed per VM
- Minimum network latency
To do this, it might be possible to serve our VMs by using a 10 Gbit/s link for each server, which will give:
10,000 Mbits/second / 25 VMs = 400 Mbits/second
This is a very satisfying value. We need to consider another factor: highly available network architecture. Thus, an alternative is using two data switches with a minimum of 24 ports for data.
Thinking about growth from now, two 48-port switches will be in place.
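As a quick sanity check, the per-VM bandwidth can be compared against the stated requirement. This is a minimal sketch; `bandwidth_per_vm_mbits` and `meets_requirement` are hypothetical names, and the calculation assumes the link is shared evenly by all VMs on the server, which real traffic rarely is.

```python
def bandwidth_per_vm_mbits(link_mbits, vms_per_server):
    """Even share of the server uplink available to each VM, in Mbits/second."""
    return link_mbits / vms_per_server


def meets_requirement(link_mbits, vms_per_server, required_mbits_per_vm):
    """True when the even share covers the per-VM requirement."""
    return bandwidth_per_vm_mbits(link_mbits, vms_per_server) >= required_mbits_per_vm


# One 10 Gbit/s link per server, 25 VMs per server, 200 Mbits/second needed per VM.
print(bandwidth_per_vm_mbits(10_000, 25))   # 400.0
print(meets_requirement(10_000, 25, 200))   # True
```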
What about the growth of the rack size? In this case, you should think about switch aggregation that uses the Multi-Chassis Link Aggregation (MCLAG/MLAG) technology between the aggregated switches. This feature allows each server rack to divide its links between the pair of switches to achieve powerful active-active forwarding while using the full bandwidth capability with no requirement for a spanning tree.
MCLAG is a Layer 2 link aggregation protocol between the servers that are connected to the switches, offering a redundant, load-balancing connection to the core network and replacing the spanning-tree protocol.
The network configuration also depends heavily on the chosen network topology. As shown in the previous example network diagram, you should be aware that all nodes in the OpenStack environment must communicate with each other. Based on this requirement, administrators will need to identify the units that are planned for use and count the required number of public and floating IP addresses. This calculation depends on which network type the OpenStack environment will run, including the use of Neutron or the former nova-network service. It is crucial to determine which OpenStack units will need to be assigned public and floating IPs. Our first basic example assumes the use of public IPs for the following units:
- Cloud Controller Nodes: 3
- Compute Nodes: 15
- Storage Nodes: 5
In this case, we will initially need at least 23 public IP addresses (3 + 15 + 5). Moreover, when implementing a highly available setup using virtual IPs fronted by load balancers, these will be considered as additional public IP addresses.
The use of Neutron for our OpenStack network design will involve a preparation for the number of virtual devices and interfaces interacting with the network node and the rest of the private cloud environment including:
- Virtual routers for 20 tenants: 20
- Virtual machines in 15 Compute Nodes: 375
In this case, we will initially need at least 395 floating IP addresses given that every virtual router is capable of connecting to the public network.
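The address counts can be tallied with a small script. This is a sketch under the assumptions of our example; the function names are hypothetical, and the optional `virtual_ips` argument is our own addition to account for load-balancer VIPs. Note that summing all three listed node types gives 23 public addresses.

```python
def public_ip_count(nodes, virtual_ips=0):
    """Sum the nodes that need a public IP, plus any load-balancer virtual IPs."""
    return sum(nodes.values()) + virtual_ips


def floating_ip_count(tenant_routers, compute_nodes, vms_per_node):
    """One floating IP per tenant virtual router and per virtual machine."""
    return tenant_routers + compute_nodes * vms_per_node


nodes = {"cloud_controller": 3, "compute": 15, "storage": 5}
print(public_ip_count(nodes))         # 23
print(floating_ip_count(20, 15, 25))  # 395
```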
Additionally, increasing the available bandwidth should be taken into consideration in advance. For this purpose, we will need to consider the use of NIC bonding, therefore multiplying the number of NICs by 2. Bonding will empower cloud network high availability and achieve boosted bandwidth performance.
Considering the previous example, you need to plan for an initial storage capacity per server that will serve 25 VMs each.
A simple calculation, assuming 100 GB ephemeral storage per VM, will require a space of 25*100 = 2.5 TB of local storage on each compute node.
You can assign 250 GB of persistent storage per VM to have 25*250 = 6.25 TB of persistent storage per compute node.
Most probably, you have an idea about the replication of object storage in OpenStack, which implies the usage of three times the required space for replication.
In other words, if you are planning for X TB for object storage, your storage requirement will be 3X.
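The storage figures follow the same pattern. Here is a minimal sketch with hypothetical helper names; it uses decimal units (1 TB = 1,000 GB) and the default of three replicas for object storage mentioned above. For 25 VMs, 250 GB per VM works out to 6.25 TB.

```python
def storage_per_node_tb(vms_per_node, gb_per_vm):
    """Storage needed on one compute node, in TB (1 TB = 1,000 GB)."""
    return vms_per_node * gb_per_vm / 1000


def object_storage_raw_tb(usable_tb, replicas=3):
    """Raw capacity needed so that `usable_tb` survives replication."""
    return usable_tb * replicas


print(storage_per_node_tb(25, 100))  # 2.5  (ephemeral)
print(storage_per_node_tb(25, 250))  # 6.25 (persistent)
print(object_storage_raw_tb(10))     # 30   (X = 10 TB gives 3X = 30 TB)
```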
Other considerations, such as using SSDs for the best storage performance, can be useful for better throughput, where you can invest in more boxes to get increased IOPS.
For example, working with SSD with 20K IOPS installed in a server with eight slot drives will bring you:
(20K * 8) / 25 = 6.4K Read IOPS and 3.2K Write IOPS
That is not bad for a production starter!
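The IOPS estimate can be reproduced as follows. This is a rough sketch; `iops_per_vm` is a hypothetical helper, and the write figure assumes that write IOPS are half of read IOPS, which is our reading of the 6.4K/3.2K numbers rather than a property of any particular drive.

```python
def iops_per_vm(iops_per_drive, drives_per_server, vms_per_server, write_ratio=0.5):
    """Return (read_iops, write_iops) available to each VM on a server."""
    read_iops = iops_per_drive * drives_per_server / vms_per_server
    # Assumption: write IOPS are a fixed fraction of read IOPS.
    write_iops = read_iops * write_ratio
    return read_iops, write_iops


# Eight SSDs rated at 20K IOPS each, shared by 25 VMs.
print(iops_per_vm(20_000, 8, 25))  # (6400.0, 3200.0)
```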
Well, let's bring some best practices under the microscope by exposing the OpenStack design flavor.
In a typical OpenStack production environment, the minimum requirement for disk space per compute node is 300 GB with a minimum RAM of 128 GB and a dual 8-core CPU.
Let's imagine a scenario where, due to budget limitations, you start your first compute node with costly hardware that has 600 GB disk space, 16-core CPUs, and 256 GB of RAM.
Assuming that your OpenStack environment continues to grow, you may decide to purchase more hardware: large, and at an incredible price! A second compute instance is placed to scale up.
Shortly after this, you may find out that demand is increasing. You may start splitting requests across different compute nodes but keep scaling up with the hardware. At some point, you will be alerted about reaching your budget limit!
There are certainly times when the best practices aren't in fact the best for your design. The previous example illustrated a commonly overlooked requirement for the OpenStack deployment.
If the minimal hardware requirement is strictly followed, it may result in an exponential cost with regard to hardware expenses, especially for new project starters.
Thus, you should choose exactly what works for you and consider the constraints that exist in your environment.
Keep in mind that best practices are a guideline; apply them when you find what you need to be deployed and how it should be set up.
On the other hand, do not stick to values, but stick to the spirit of the rules. Let's bring the previous example under the microscope again: scaling up carries more risk and is more likely to lead to failure than scaling out (horizontally). The reason behind a scale-out design is to allow for a fast scale of transactions at the cost of duplicated compute functionality, using smaller systems at a lower cost. That is how OpenStack was designed: degraded units can be discarded and failed workloads can be replaced.
Transactions and requests in the compute node may grow tremendously in a short time, to a point where a single big compute node with 16 CPU cores starts failing performance-wise, while a few small compute nodes with 4 CPU cores can proceed to complete the job successfully.
As we have shown in the previous section, planning for capacity is quite an intensive exercise, but it is crucial to setting up an initial, successful OpenStack cloud strategy.
Planning for growth should be driven by the natural design of OpenStack and how it is implemented. We should consider that growth is based on demand, where workloads in OpenStack take an elastic form and not a linear one. Although the previous resource computation example can be helpful to estimate a few initial requirements for our designed OpenStack layout, reaching acceptable capacity planning still needs more action. This includes a detailed analysis of cloud performance in terms of workload growth. In addition, by using more sophisticated monitoring tools, operators should consistently track the usage of each unit running in the OpenStack environment, which includes, for example, its overall resource consumption over time and cases of unit overutilization resulting in performance degradation. As we have conducted a rough estimation of our future hardware capabilities, this calculation model can be hardened by sizing the instance flavor for each compute host after the first deployment and can be adjusted on demand if resources are carefully monitored.
This chapter has revisited the basic components of OpenStack and exposed new features such as Telemetry, Orchestration, and File Share projects.
We continued refining our logical design for future deployment by completing a first design layout. As an introductory chapter, we have rekindled the flames on each OpenStack component by discussing briefly each use case and role in its ecosystem. We have also covered a few tactical tips to plan and mitigate the future growth of the OpenStack setup in a production environment.
As a main reference for the rest of the book, we will be breaking down each component and new functionality in OpenStack by extending the basic layout covered in this chapter.
We will continue the OpenStack journey to deploy what was planned in a robust and effective way: the DevOps style.