At the Vancouver OpenStack Conference in May 2015, US retail giant Walmart announced that they had deployed an OpenStack cloud with 140,000 cores of compute, supporting 1.5 billion page views on Cyber Monday. CERN, a long-time OpenStack user, announced that their OpenStack private cloud had grown to 100,000 cores, running computational workloads on two petabytes of disk in production. In the years since then, telecommunications giants across the globe, including AT&T, Verizon, and NTT, have all begun the process of moving the backbone of the internet from purpose-built hardware onto virtualized network functions running on OpenStack.
The scale of the OpenStack project and its deployment is staggering—a given semi-annual release of the OpenStack software contains tens of thousands of commits from hundreds of developers from dozens of companies.
In this chapter, we'll look at what OpenStack is and why it has been so influential. We'll also take the first steps in architecting a cloud.
OpenStack is best defined by its use cases, as users and contributors approach the software with many different goals in mind. For hosting providers such as Rackspace, OpenStack provides the infrastructure for a multitenant shared services platform. For others, it might provide a mechanism for provisioning data and compute for a distributed business intelligence application. Even so, a few answers to the question of what OpenStack is hold true regardless of your organization's use case.
One of the initial goals of OpenStack was to provide Application Programming Interface (API) compatibility with Amazon Web Services. As the popularity of the platform has increased, the OpenStack API has become a de facto standard on its own. In the November 2017 User Survey, the standard APIs were listed as the number one business driver for the adoption of OpenStack as a private cloud platform. As such, many of the enterprise organizations that we've worked with to create OpenStack clouds are using them as an underlying Infrastructure as a Service (IaaS) layer for one or more Platform as a Service (PaaS) or Hybrid Cloud deployments.
Every feature or function of OpenStack is exposed in one of its REST APIs. There are command-line interfaces for OpenStack (the legacy nova client and the newer openstack common client) as well as a standard web interface (Horizon). However, most interactions between the components and end users happen over the API. This is advantageous for the following reasons:
- Everything in the system can be automated
- Integration with other systems is well-defined
- Use cases can be clearly defined and automatically tested
The APIs are well-defined and versioned REST APIs, and there are native clients and SDKs for more than a dozen programming languages. For a full list of current SDKs, refer to: http://api.openstack.org.
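To make the API-first point concrete, the following is a minimal sketch of a raw token request against the Keystone v2.0 identity API, which is the API version that the authentication URL used later in this chapter points at. The helper name token_request_body is hypothetical, the OS_* variable names are the ones the command-line client reads, and the sketch assumes the credential values contain no characters that need JSON escaping. Newer deployments expose the v3 identity API, which uses a different URL and request body.

```shell
# Build the JSON body for a Keystone v2.0 password-based token request.
# Hypothetical helper; assumes the values need no JSON escaping.
token_request_body() {
    printf '{"auth": {"tenantName": "%s", "passwordCredentials": {"username": "%s", "password": "%s"}}}\n' \
        "$OS_TENANT_NAME" "$OS_USERNAME" "$OS_PASSWORD"
}

# Example call (requires a reachable Keystone endpoint in OS_AUTH_URL):
#   curl -s -X POST "${OS_AUTH_URL%/}/tokens" \
#        -H "Content-Type: application/json" \
#        -d "$(token_request_body)"
```

The command-line clients and SDKs wrap requests of this kind, which is why anything they can do can also be automated directly against the API.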
OpenStack is an open source software project with a huge number of contributors from a wide range of organizations. OpenStack was originally created by NASA and Rackspace. Rackspace is still a significant contributor to OpenStack, but these days, contributions to the project come from a wide array of companies, including the traditional open source contributors (Red Hat, IBM, and HP) as well as companies that are dedicated entirely to OpenStack (Mirantis and CloudBase). Contributions come in the form of drivers for particular pieces of infrastructure (for example, Cinder block storage drivers or Neutron SDN drivers), bug fixes, or new features in the core projects.
OpenStack is governed by a foundation. Membership in the foundation is free and open to anyone who wishes to join. There are currently thousands of members in the foundation. Leadership on technical issues is provided by a 13-member technical committee, which is generally elected by the individual members. Strategic and financial issues are decided by a board of directors, which includes members appointed by corporate sponsors and members elected by the individual members.
For more information on joining or contributing to the OpenStack Foundation, refer to: http://www.openstack.org/foundation.
OpenStack is written in the Python programming language and is usually deployed on the Linux operating system. The source code is readily available on the internet and commits are welcome from the community at large. Before code is committed to the project, it has to pass through a series of gates, which include unit testing and code review.
Finally, OpenStack provides the software modules necessary to build an automated private cloud platform. Although OpenStack has traditionally been focused on providing IaaS capabilities in the style of Amazon Web Services, new projects have been introduced lately, which begin to provide capabilities that might be associated more with Platform as a Service. This book will focus on implementing the core set of OpenStack components described as follows.
The most important aspect of OpenStack pertaining to its usage as a private cloud platform is the tenant model. The authentication and authorization services that provide this model are implemented in the identity service, Keystone. Every virtual or physical object governed by the OpenStack system exists within a private space referred to as a tenant or project. The latest version of the Keystone API has differentiated itself further to include a higher level construct named a domain. Regardless of the terminology, the innate ability to securely segregate compute, network, and storage resources is the most fundamental capability of the platform. This is what differentiates it from traditional data center virtualization and makes it a private cloud platform.
OpenStack is a modular system. Although some OpenStack Architects choose to implement a reference architecture of all of the core components shipped by an OpenStack distributor, many will only implement the services required to meet their business cases.
Reference implementations are typically used for development use cases where the final production state of the service might not be well-defined. Production deployments will likely gate the availability of some services to reduce the amount of configuration and testing required for implementation. Reference deployments will typically not vary from the distributor's implementation so that the distributor's deployment and testing tools can be reused without modification.
In this book, we'll be focusing on the following core components of OpenStack.
OpenStack Compute (Nova) is one of the original components of OpenStack. It provides the ability to provision a virtual machine, an application container, or a physical system, depending on configuration. All provisioning is image-based, and the OpenStack Image Service (Glance) is a prerequisite for the Compute service. Some kind of networking is also required to launch a compute instance.
Networking was originally provided by the Compute service in OpenStack, but the use of Nova Networking was deprecated in the Newton release of OpenStack, and it is no longer supported. Networking is provided by the Neutron service, which offers a wide range of functionality.
In OpenStack, we refer to provisioned compute nodes as instances and not virtual machines. Although this might seem like a matter of semantics, it's a useful device for a few reasons. The first reason is that it describes the deployment mechanism; all compute in OpenStack is the instantiation of a Glance image with a specified hardware template, the flavor.
The flavor describes the characteristics of the instantiated image, and it normally represents a number of cores of compute with a given amount of memory and storage. Storage may be provided by the Compute service or the block storage service, Cinder. Although quotas are defined to limit the number of cores and the amount of memory and storage available to a given user (the tenant), charge-back is traditionally established by the flavor (for example, instantiating a particular image on an m1.small flavor may cost a tenant a certain number of cents an hour).
The second reason that the term instance is useful is that virtual machines in OpenStack do not typically have the same life cycle as they do in traditional virtualization. Although we might expect virtual machines to have a multiyear life cycle like physical machines, we would expect instances to have a life cycle which is measured in days or weeks. Virtual machines are backed up and recovered, whereas instances are rescued or evacuated. Legacy virtualization platforms assume resizing and modifying behaviors are in place; cloud platforms such as OpenStack expect redeployment of virtual machines or adding additional capacity through additional instances, not adding additional resources to existing virtual machines.
The third reason that we find it useful to use the term instance is that the Compute service has evolved over the years to launch a number of different types of compute. Some OpenStack deployments may only launch physical machines, whereas others may launch a combination of physical, virtual, and container-based instances. The same construct applies, regardless of the compute provider.
Some of the lines between virtual machines and instances are becoming more blurred as more enterprise features are added to the OpenStack Compute service. Later on, we'll discuss some of the ways in which we can launch instances which act more like virtual machines for more traditional compute workloads.
Ephemeral backing storage for compute instances is provided by the Nova service. This storage is referred to as ephemeral because its life cycle coterminates with the life cycle of the compute instance. That is, when an instance is terminated, the ephemeral storage associated with the instance is deleted from the compute host on which it resided. The first kind of persistent storage provided in the OpenStack system was object storage, based on the S3 service available in Amazon Web Services.
Object Storage is provided by the Swift service in OpenStack. Just as Nova provides an EC2-compatible compute API, Swift provides an S3-compatible object storage API. Applications which are written to run on the Amazon EC2 service and read and write their persistent data to the S3 Object Storage service do not need to be rewritten to run on an OpenStack system.
A number of third-party applications provide an S3 or Swift-compatible API, and they may be substituted for Swift in a typical OpenStack deployment. These include open source object stores, such as Gluster or Ceph, or proprietary ones, such as Scality or Cloudian. The Swift service is broken down into a few components, and third-party applications may use the Proxy component of Swift for API services and implement only a backend, or may replace the Swift service entirely. All OpenStack-compatible object stores will consume the tenant model of OpenStack and accept Keystone tokens for authentication.
Traditional persistent storage is provided to OpenStack workloads through the Cinder block storage component. The life cycle of Cinder volumes is maintained independent of compute instances, and volumes may be attached to or detached from one or more compute instances to provide a backing store for filesystem-based storage.
OpenStack ships with a reference implementation of Cinder that uses LVM to manage local storage on a Cinder storage node and iSCSI to share the resulting block devices with instances. This implementation lacks high availability, and it is typically only used in test environments. Production deployments tend to leverage a software-based or hardware-based block storage solution, such as Ceph or NetApp, chosen based on performance and availability requirements.
The last of the foundational services in OpenStack is Neutron, the Network service. Neutron provides an API for creating ports, subnets, networks, and routers. Additional network services, such as firewalls and load balancers, are provided in some OpenStack deployments.
As with Cinder, the reference implementation, based on Open vSwitch, is typically used in test environments or smaller deployments. Large-scale production deployments will leverage one of the many available software-based or hardware-based SDN solutions, which have Neutron drivers. These solutions range from open source implementations such as Juniper's OpenContrail to proprietary solutions such as VMware's NSX platform.
In spite of immense interest, huge investment, and public success, we've seen a number of cases where well-intentioned OpenStack projects fail or are at least perceived as a failure by the people who have funded them. When OpenStack projects fail, the technology itself is rarely the root cause. Thomas Bittman at Gartner noticed this trend and wrote an influential blog post entitled Why are Private Clouds Failing? in September 2014.
Bittman's findings echo many of our experiences from the field. In short, the reason that most private cloud projects fail is that improper expectations were set from the beginning, and the business goals for the cloud weren't realized by the end result.
First and foremost, OpenStack deployments should be seen as an investment with returns and not a project to reduce operational costs. Although we've certainly seen dramatic reductions in operational workloads through the automation that OpenStack provides, it is difficult to accurately quantify those reductions in order to justify the operational investment required to run an efficient cloud platform. Organizations that are entirely focused on cutting costs through automation should first look at automating existing virtual environments instead of deploying new environments.
We've also seen a lot of projects that had poorly quantified goals. OpenStack is an enabler of use cases and not an IT panacea. If the use cases are not agreed upon before investment in the platform begins, it will prove very difficult to justify the investment in the end. This is why the role of the Architect is so critical in OpenStack deployments—it is their job to ensure that concrete requirements are written upfront so that all of the stakeholders can quantify the success of the platform once deployed.
With this in mind, let's take a look at some typical use cases for OpenStack deployments.
As we mentioned before, OpenStack was originally created with code contributions from NASA and Rackspace. NASA's interest in OpenStack sprang from their desire to create a private elastic compute cloud, whereas the primary goal for Rackspace was to create an open source platform that could replace their public shared hosting infrastructure. As of April 2015, the Rackspace Public Cloud offering had been ported to OpenStack and had passed the OpenStack Powered Platform certification.
The Rackspace implementation offers both Compute and Object Storage services, but some implementations may choose to offer only Compute or Object Storage and receive certifications for those services. DreamHost, another public OpenStack-based cloud provider, for example, has chosen to break their managed services down into DreamCompute and DreamObjects, which implement the services separately. The DreamObjects service was implemented and offered first as a complement to DreamHost's existing shared web hosting, and the DreamCompute service was introduced later.
Most public hosting providers focus primarily on the Compute service, and many do not yet offer software-defined networking through the Neutron network service (DreamCompute being a notable exception). Architects of hosting platforms will focus first on tenancy issues, second on chargeback issues, and finally on scale.
The first production deployment of OpenStack outside NASA and Rackspace was at a Canadian not-for-profit organization named Cybera. Cybera deployed OpenStack as a technology platform in 2011 for its DAIR program, which provides free compute and storage to Canadian researchers, entrepreneurs, and small businesses.
Architects at Cybera, NASA, and CERN have all commented on how their services have much of the same concerns as in the public hosting space. They provide compute and storage resources to researchers and don't have much insight into how those resources will actually be used. Thus, concerns about secure multitenancy will apply to these environments just as much as they do in the hosting space.
HPC clouds will have an added focus on performance, though. Although hosting providers will look to economize on commodity hardware, research clouds will look to maximize performance by configuring their compute, storage, and network hardware to support high volume and throughput operations. Where most clouds will work best by growing low-to-mid range hardware horizontally with commodity hardware, high-performance clouds tend to be very specific about the performance profiles of their hardware selection. Cybera has published performance benchmarks, comparing its DAIR platform to EC2. Architects of research clouds may also look to use hardware pass-through capabilities or other low-level hypervisor features to enable specific workloads.
Over the past couple of years, a third significant use case has emerged for OpenStack—enterprise application development environments. While public hosting and high-performance compute implementations may have huge regions with hundreds of compute nodes and thousands of cores, enterprise implementations tend to have regions of 20 to 50 compute nodes. Enterprise adopters have a strong interest in software-defined networking.
The primary driver for enterprise adoption of OpenStack has been the increasing use of continuous integration and continuous delivery in the application development workflow. A typical Continuous Integration and Continuous Delivery (CI/CD) workflow will deploy a complete application for every developer commit that passes basic unit tests, in order to perform automated integration testing. These application deployments live only as long as it takes to run the integration tests, and then an automated process tears down the deployment once the tests pass or fail. This workflow is easily facilitated with a combination of OpenStack compute and network services.
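The deploy, test, and tear-down cycle described above can be sketched as a small shell function. This is an illustration under stated assumptions, not a complete pipeline: ci_cycle and run_integration_tests are hypothetical names, IMAGE_ID and NET_ID are placeholders that a real job would have to supply, and a single m1.tiny instance stands in for a complete application deployment.

```shell
# One CI cycle: deploy an instance, run the tests, always tear down.
# IMAGE_ID, NET_ID and run_integration_tests are hypothetical placeholders.
ci_cycle() {
    name="$1"
    test_rc=0
    # --wait blocks until the instance has finished building.
    openstack server create --flavor m1.tiny --image "$IMAGE_ID" \
        --nic net-id="$NET_ID" --wait "$name" || return 1
    # Record the test result without aborting, so cleanup still runs.
    run_integration_tests "$name" || test_rc=$?
    # Tear down whether the tests passed or failed.
    openstack server delete --wait "$name"
    return "$test_rc"
}
```

Returning the test result after the delete keeps the pass or fail signal for the CI system while ensuring that short-lived instances do not accumulate in the tenant. A fuller version would probably also clean up an instance whose creation failed partway.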
While Architects of hosting or High-Performance Computing (HPC) clouds spend a lot of time focusing on tenancy and scaling issues, Architects of enterprise deployments will spend a lot of time focusing on how to integrate OpenStack compute into their existing infrastructure. Enterprise deployments will frequently leverage existing service catalog implementations and identity management solutions. Many enterprise deployments will also need to integrate with existing IPAM and asset tracking systems.
One of the largest areas for development and deployment of the OpenStack platform has been in the telecommunications industry. Network Functions Virtualization (NFV) provides a common IaaS platform for that industry, which is in the process of replacing the purpose-built hardware devices that provide network services with virtualized appliances that run on commodity hardware. Some of these services are routing, proxies, content filtering, as well as packet core services and high-volume switching. Most of these appliances have intense compute requirements, and they are largely stateless. These workloads are well-suited for the OpenStack compute model.
NFV use cases typically leverage hardware features that can directly attach compute instances to physical network interfaces on compute nodes. Instances are also typically very sensitive to CPU and memory topology (NUMA), and virtual cores tend to be mapped directly to physical cores. Orchestration, either through Heat or TOSCA, has also been a large focus for these deployments.
Architects of NFV solutions will focus primarily on virtual instance placement and performance issues and less on tenancy and integration issues.
OpenStack is designed to be used at scale. Many IT projects might comprise a few physical assets deployed within an existing network, storage, and compute landscape, but OpenStack deployments are, by definition, new network, storage, and compute landscapes. Any project of this size and scope requires significant coordination between different teams within an IT organization. This kind of coordination requires careful planning and, in our experience, a lot of documentation.
This book is written to provide best practices for a relatively new role within many organizations: the Cloud Architect. The Cloud Architect's primary function is to take business requirements for Infrastructure or Platform as a Service and design a solution that meets those requirements. This requires an in-depth knowledge of the capabilities of the infrastructure software paired with competency in network and storage architecture.
The typical Cloud Architect will have a background in compute and will lean heavily on the Network and Storage Architects within an organization to round out their technical knowledge. Since OpenStack is based on the Linux operating system, most OpenStack Architects will have a deep knowledge of that platform. But as we mentioned earlier, OpenStack is typically delivered as an API, and OpenStack Architects will need to have fluency in application development as well.
OpenStack Architects are responsible first and foremost for authoring and maintaining a set of design and deployment documentation. It's difficult to describe an ocean if you've never seen one, so this book will walk you through each of these documents as you create them.
The first document that we will create is the design document. This may be named something different in your organization, but the goal of the design document is to explain the reasoning behind all of the choices that were made in the implementation of the platform. The format may vary from team to team, but we want to capture the following points:
- Background: This is the history behind the decision to start the project. If the document will only be consumed internally, this can be pretty short. If it's going to be consumed externally, this is an opportunity to provide organizational context for your vendors and partners.
- Summary: This is a detailed summary of the entire document. Typically, this part of the deliverable will be used by managers and by technology and business leaders to understand the business impact of the overall recommendation. Requirements and the resulting architecture should be summarized.
- Requirements: This is the meat of the document. Requirements can be in whatever format is acceptable for your project management team. We prefer the user story format and will use that in the examples in this book.
- Physical architecture: This is an explanation of roles and physical machines that take those roles. This should include a network diagram.
- Service architecture: This is a summary of available services and their relationships. This section should include a service diagram.
- Tenant architecture: A section should be included that describes the expected landscape inside the cloud. This includes things such as available compute flavors, images, identity management architecture, and IPAM or DDI.
- Roadmap: This section is optional and often lives in another document. It's an opportunity to identify areas for improvement in future releases of the platform.
The design document often goes through a number of revisions as the project is developed. An important step at the end of each iteration of the platform is to reconcile any changes made to the platform with the design document.
Beware of scope creep in the design document. This artifact has a tendency to turn into documentation on how OpenStack works. Remember to focus on explaining the decisions you made rather than cataloging every option that was available at the time.
Every implementation of OpenStack should start with a deployment plan. The design document describes what's being deployed and why, whereas the deployment plan describes how. Like the design document, the content of a deployment plan varies from organization to organization. It should at least include the following things:
- Hardware: This is a list of the compute, storage, and network hardware available for the deployment.
- Network addressing: This is a table of IP and MAC addresses for the network assets in the deployment. For deployments of hundreds of compute nodes, this should probably be limited to a set of VLANs and subnets available for the deployment.
- Deployment-specific configuration: We'll assume that the configuration of the OpenStack deployment is automated. These are any settings that an engineer would need to adjust before launching the automated deployment of the environment.
- Requirements: These are things that need to be in place before the deployment can proceed. Normally, this is hardware configuration, switch configuration, LUN masking, and so on.
A good deployment plan will document everything that an engineering team needs to know to take the design document and instantiate it in the physical world. One thing that we like to leave out of the deployment plan is step-by-step instructions on how to deploy OpenStack. That information typically lives in an Installation Guide, which may be provided by a vendor or written by the operations team.
- An individual, usually a Linux or Cloud Architect, installs OpenStack on a single machine to verify that the software can be deployed without too much effort.
- The Architect enlists the help of other team members, typically Network and Storage Architects or Engineers to deploy a multiple-node installation. This will leverage some kind of shared ephemeral or block storage.
- A team of Architects or Engineers crafts the first deployment of OpenStack that is customized for the organization's use cases or environmental concerns. Professional services from a company such as Red Hat, Mirantis, HP, IBM, Canonical, or Rackspace are often engaged at this point in the process.
From here on out, it's off to the races. We'll follow a similar pattern in this book. In this first chapter, we'll start with the first step—the all-in-one deployment.
Taking the time to document the very first deployment might seem a bit obsessive, but it provides us with the opportunity to begin iterating on the documentation that is the key to successful OpenStack deployments. We'll start with the following template.
This deployment provides a compute capacity for 60 m1.medium instances or 30 m1.large instances.
Change the specifications in the table to meet your deployment. It's important to specify the expected capacity in the deployment document. For a basic rule of thumb, just divide the amount of available system memory by the instance memory. We'll talk more about accurately forecasting capacity in a later chapter.
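The rule of thumb above can be worked through with shell arithmetic. This is a rough sketch: the 240 GiB host memory figure is an assumption chosen for illustration, and the flavor sizes are the traditional defaults of 4 GiB for m1.medium and 8 GiB for m1.large. Substitute the values from your own hardware table and flavor list.

```shell
# Rule-of-thumb capacity estimate: host memory divided by flavor memory.
# HOST_MEM_MB is an assumed figure; use the value from your hardware table.
HOST_MEM_MB=245760        # 240 GiB, assumed for illustration
M1_MEDIUM_MB=4096         # traditional default m1.medium memory
M1_LARGE_MB=8192          # traditional default m1.large memory

medium_capacity=$((HOST_MEM_MB / M1_MEDIUM_MB))
large_capacity=$((HOST_MEM_MB / M1_LARGE_MB))

echo "m1.medium instances: $medium_capacity"
echo "m1.large instances:  $large_capacity"
```

With these assumed inputs the estimate comes to 60 m1.medium or 30 m1.large instances. It ignores memory reserved for the host itself and any overcommit ratio, so treat it as a starting point only.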
Change the network addresses in this section to meet your deployment. We'll only use a single network interface for the all-in-one installation.
This deployment will use the RDO all-in-one reference architecture. This reference architecture uses a minimum amount of hardware as the basis for a monolithic installation of OpenStack, typically only used for testing or experimentation. For more information on the all-in-one deployment, refer to: https://www.rdoproject.org/.
For the first deployment, we'll just use the RDO distribution out of the box. In later chapters, we'll begin to customize our deployment and add notes to this section to describe where we've diverged from the reference architecture.
- Red Hat Enterprise Linux 7 (or CentOS 7)
- Network Manager must be disabled
- Network interfaces must be configured as per the Network Addressing section in
- The RDO OpenStack repository must be enabled (from: https://rdoproject.org/)
To enable the RDO repository, run the following command as the root user on your system:
# yum install -y https://rdoproject.org/repos/rdo-release.rpm
Assuming that we've correctly configured our host machine as per our deployment plan, the actual deployment of OpenStack is relatively straightforward. The installation instructions can either be captured in an additional section of the deployment plan, or they can be captured in a separate document—the Installation Guide. Either way, the installation instructions should be immediately followed by a set of tests that can be run to verify that the deployment went correctly.
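First, check whether the RDO repository is enabled:
# rpm -q rdo-release
If the RDO repository has not been enabled, enable it using the following command:
# yum install -y https://rdoproject.org/repos/rdo-release.rpm
Then, install the packstack utility from that repository:
# yum install -y openstack-packstack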
Next, run the packstack utility to install OpenStack:
# packstack --allinone
The packstack utility configures and applies a set of Puppet manifests to your system to install and configure the OpenStack distribution. The --allinone option instructs packstack to configure the set of services defined in the reference architecture for RDO.
Once the installation has completed successfully, use the following steps to verify the installation.
First, verify the Keystone identity service by attempting to get an authorization token. The OpenStack command-line client uses a set of environment variables to authenticate your session. Two configuration files that set those variables will be created by the packstack installation utility. The keystonerc_admin file can be used to authenticate an administrative user, and the keystonerc_demo file can be used to authenticate a nonprivileged user. An example keystonerc is shown as follows:
export OS_USERNAME=demo
export OS_TENANT_NAME=demo
export OS_PASSWORD=<random string>
export OS_AUTH_URL=http://192.168.0.10:5000/v2.0/
export PS1='[\u@\h \W(keystone_demo)]\$ '
This file will be used to populate your command-line session with the necessary environment variables and credentials that will allow you to communicate with the OpenStack APIs that use the Keystone service for authentication.
In order to use the keystonerc file to load your credentials, source the contents into your shell session from the directory you ran the packstack command from. It will provide no output except for a shell prompt change:
# . ./keystonerc_demo
Your command prompt will change to remind you that you're using the sourced OpenStack credentials.
In order to load these credentials, the preceding source command must be run every time a user logs in. These credentials are not persistent. If you do not source your credentials before running OpenStack commands, you will most likely get the following error:
You must provide a username through either --os-username or env[OS_USERNAME].
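Scripts that wrap OpenStack commands can check for the sourced credentials up front rather than waiting for the client to fail. The following is a minimal sketch: require_os_credentials is a hypothetical helper, and the variables it checks are the ones set by the example keystonerc file shown earlier.

```shell
# Hypothetical helper: fail early if the keystonerc variables are not loaded.
require_os_credentials() {
    for var in OS_USERNAME OS_PASSWORD OS_AUTH_URL; do
        # Indirect lookup of the variable named in $var (POSIX-compatible).
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing $var: source keystonerc_demo or keystonerc_admin first" >&2
            return 1
        fi
    done
    return 0
}
```

Calling require_os_credentials at the top of a script, and exiting if it fails, gives a clearer message than the client error and names the file that needs to be sourced.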
To verify the Keystone service, run the following command to get a Keystone token:
# openstack token issue
+-----------+----------------------------------+
| Property  | Value                            |
+-----------+----------------------------------+
| expires   | 2015-07-14T05:01:41Z             |
| id        | a20264cd091847ac965cde8cbba7b0b9 |
| tenant_id | 202bd2fa2a3a40639bb0bccc9a57e37d |
| user_id   | 68d90544e0064c4c838d47d80811b895 |
+-----------+----------------------------------+
Next, verify the Glance image service:
# openstack image list
This should output a table listing a single image, the CirrOS image that is installed with the packstack command. We'll use the ID of that Glance image to verify the Nova Compute service. Before we do that, we'll verify the Neutron Network service:
# openstack network list
This should output a table listing a network available to use for testing. We'll use the ID of that network to verify the Nova Compute service with the following commands:
First, add the root user's SSH public key to OpenStack as a key pair named demo:
# openstack keypair create --public-key ~/.ssh/id_rsa.pub demo
Now, create an instance named instance01:
# openstack server create --flavor m1.tiny \
    --image <image_id> \
    --key-name demo \
    --nic net-id=<networkid> \
    instance01
This command will create the instance and output a table of information about the instance that you've just created. To check the status of the instance as it is provisioned, use the following command:
# openstack server show instance01
When the status becomes ACTIVE, the instance has successfully launched. The demo key pair registered earlier with the openstack keypair create command lets you log in to the instance with the root user's SSH private key once it's running.
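Rather than rerunning the show command by hand, the status check can be scripted. The following is a sketch under stated assumptions: wait_for_active is a hypothetical helper, and the default of 30 attempts at 10-second intervals is an arbitrary illustrative choice. It relies on the client's -f value -c status output options to print only the status field.

```shell
# Poll until the named server reaches ACTIVE, or give up on ERROR or timeout.
# The retry count and sleep interval are arbitrary illustrative values.
wait_for_active() {
    name="$1"
    tries="${2:-30}"
    interval="${3:-10}"
    while [ "$tries" -gt 0 ]; do
        # Print only the status column of the server record.
        status=$(openstack server show "$name" -f value -c status)
        case "$status" in
            ACTIVE) return 0 ;;
            ERROR)  echo "$name entered ERROR state" >&2; return 1 ;;
        esac
        tries=$((tries - 1))
        sleep "$interval"
    done
    echo "timed out waiting for $name" >&2
    return 1
}
```

For example, wait_for_active instance01 returns zero once the instance created above is ACTIVE, which makes it usable as a gate in an automated verification script.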
At this point, you should have a working OpenStack installation on a single machine. To familiarize yourself with the OpenStack Horizon user interface, see the documentation on the OpenStack website at https://docs.openstack.org/nova/queens/user/launch-instances.html.
This chapter provided background information on OpenStack and the component services that make up an OpenStack deployment. We looked at some typical use cases for OpenStack and discussed the role of the Cloud Architect in an organization that is embarking on an OpenStack private cloud deployment.
We also began the documentation for our OpenStack deployments by creating a deployment plan and an installation guide.
Finally, we completed an all-in-one OpenStack installation on a single server and verified the core set of services. This installation can be used to familiarize yourself with the OpenStack system. In the next chapter, we'll break down the different areas of design for OpenStack clouds and expand our documentation and deployment.
Please refer to the following links for further reading: