This chapter looks at how networking has changed and evolved in private data centers in the past few years. It focuses on the emergence of Amazon Web Services (AWS) for public cloud and OpenStack for private cloud, and on how these have changed the way developers want to consume networking. It looks at some of the networking services and features that AWS and OpenStack provide out of the box, and shows examples of how these cloud platforms have made networking a commodity much like infrastructure.
In this chapter, the following topics will be covered:
An overview of cloud approaches
The difference between Spanning Tree networks and Leaf-Spine networking
Changes that have occurred in networking with the introduction of public cloud
The Amazon Web Services approach to networking
The OpenStack approach to networking
The cloud provider market is currently saturated with a multitude of different private, public, and hybrid cloud solutions, so choice is not a problem for companies looking to implement public, private, or hybrid cloud solutions.
Consequently, choosing a cloud solution can sometimes be quite a daunting task, given the array of different options that are available.
The battle between public and private cloud is still in its infancy, with only around 25 percent of the industry using public cloud, despite its perceived popularity, with solutions such as Amazon Web Services, Microsoft Azure, and Google Cloud taking a large majority of that market share. However, this still means that 75 percent of the cloud market share is available to be captured, so the cloud computing market will likely go through many iterations in the coming years.
Public clouds are essentially a set of data centers and infrastructure that are made publicly available over the Internet to consumers. Despite the name, a public cloud is not magical or fluffy in any way. Amazon Web Services launched their public cloud based on the idea that they could rent out their servers to other companies when they were not using them outside the busy periods of the year.
Public cloud resources can be accessed via a Graphical User Interface (GUI) or, programmatically, via a set of API endpoints. This allows end users of the public cloud to create infrastructure and networking to host their applications.
Public clouds are used by businesses for various reasons; for example, the time it takes to configure and start using public cloud resources is relatively low. Once credit card details have been provided on a public cloud portal, end users have the freedom to create their own infrastructure and networking, which they can run their applications on.
This infrastructure can be elastically scaled up and down as required, all at a cost of course to the credit card.
Public cloud has become very popular as it removes a set of historical impediments associated with shadow IT. Developers are no longer hampered by the restrictions enforced upon them by bureaucratic and slow internal IT processes. Therefore, many businesses are seeing public cloud as a way to skip over these impediments and work in a more agile fashion allowing them to deliver new products to market at a greater frequency.
When a business moves its operations to a public cloud, it is taking the bold step to stop hosting its own data centers and instead use a public cloud provider, such as Amazon Web Services, Microsoft Azure, IBM BlueMix, Rackspace, or Google Cloud.
Businesses that have moved to public cloud may find they no longer have a need for a large internal infrastructure team or network team. Instead, all infrastructure and networking is provided by the third-party public cloud, so in some quarters the move can be viewed as giving up on internal IT.
Public cloud has proved a very successful model for many start-ups, given the agility it provides, where start-ups can put out products quickly using software-defined constructs without having to set up their own data center and remain product focused.
However, the Total Cost of Ownership (TCO) of running all of a business's infrastructure in a public cloud is a hotly debated topic, and public cloud can be an expensive model if it isn't managed and maintained correctly. The debate over public versus private cloud TCO rages on, as some argue that public cloud is a great short-term fix but that growing costs over a long period of time mean it may not be a viable long-term solution compared with private cloud.
Private cloud is really just an extension of the initial benefits introduced by virtualization solutions, such as VMware, Hyper-V, and Citrix Xen, which were the cornerstone of the virtualization market. The private cloud world has moved on from just providing virtual machines, to providing software-defined networking and storage.
With the launch of public clouds, such as Amazon Web Services, private cloud solutions have sought to provide like-for-like capability by putting a software-defined layer on top of their current infrastructure. This infrastructure can be controlled in the same way as the public cloud via a GUI or programmatically using APIs.
Private cloud solutions such as Apache CloudStack and open source solutions such as OpenStack have been created to bridge the gap between the private cloud and the public cloud.
This has allowed vendors the agility of private cloud operations in their own data center by overlaying software-defined constructs on top of their existing hardware and networks.
However, the major benefit of private cloud is that this can be done within the security of a company's own data centers. Not all businesses can use public cloud, for compliance, regulatory, or performance reasons, so private cloud is still required for some businesses for particular workloads.
Hybrid cloud can often be seen as an amalgamation of multiple clouds. This allows a business to seamlessly run workloads across multiple clouds linked together by a network fabric. The business could select the placement of workloads based on cost or performance metrics.
A hybrid cloud can often be made up of private and public clouds. So, as an example, a business may have a set of web applications that it wishes to scale up for particular busy periods and are better suited to run on public cloud so they are placed there. However, the business also needs a highly regulated, PCI-compliant database, which would be better-suited to being deployed in a private on-premises cloud. So a true hybrid cloud gives a business these kinds of options and flexibility.
Hybrid cloud really works on the premise of using different clouds for different use cases, where each horse (application workload) needs to run a particular course (cloud). So, sometimes, a vendor-provided Platform as a Service (PaaS) layer can be used to place workloads across multiple clouds; alternatively, different configuration management tools or container orchestration technologies can be used to orchestrate application workload placement across clouds.
The choice between public, private, or hybrid cloud really depends on the business, so there is no real right or wrong answer. Companies will likely use hybrid cloud models as their culture and processes evolve over the next few years.
If a business is using a public, private, or hybrid cloud, the common theme with all implementations is that they are moving towards a software-defined operational model.
So what does the term software-defined really mean? In simple terms, software-defined means running a software abstraction layer over hardware. This software abstraction layer allows graphical or programmatic control of the hardware. So, constructs such as infrastructure, storage, and networking can be software defined to help simplify operations and manageability as infrastructure and networks scale out.
When running private clouds, modifications often need to be made to incumbent data centers to make them private cloud ready, so the private data center needs to evolve to meet those needs.
When considering the private cloud, companies' private data centers have traditionally implemented 3-tier layer 2 networks based on the Spanning Tree Protocol (STP), which doesn't lend itself well to modern software-defined networks. So, we will look at what STP is in more depth, as well as modern Leaf-Spine network architectures.
The implementation of STP provides a number of options for network architects in terms of implementation, but it also adds a layer of complexity to the network. Implementation of the STP gives network architects the certainty that it will prevent layer 2 loops from occurring in the network.
A typical representation of a 3-tier layer 2 STP-based network can be shown as follows:
The Core layer provides routing services to other parts of the data center and contains the core switches
The Aggregation layer provides connectivity to adjacent Access layer switches and the top of the Spanning Tree core
The bottom of the tree is the Access layer; this is where bare metal (physical) or virtual machines connect to the network and are segmented using different VLANs.
The use of layer 2 networking and STP means that the access layer of the network will use VLANs spread throughout the network. The VLANs sit at the access layer, which is where virtual machines or bare metal servers are connected. Typically, these VLANs are grouped by type of application, and firewalls are used to further isolate and secure them.
Traditional networks are normally segregated into some combination of the following:
Frontend: It typically has web servers that require external access
Business Logic: This often contains stateful services
Backend: This typically contains database servers
Applications communicate with each other by tunneling between these firewalls, with specific Access Control List (ACL) rules that are serviced by network teams and governed by security teams.
When using STP in a layer 2 network, all switches go through an election process to determine the root switch. The role is granted to the switch with the lowest bridge ID, where a bridge ID comprises the bridge priority and the MAC address of the switch.
Once elected, the root switch becomes the base of the spanning tree. All other switches in the spanning tree are deemed non-root; they calculate their shortest path to the root and then block any redundant links, so there is one clear path. The process of working out the shortest path is referred to as network convergence. (For more information refer to the following link: http://etutorials.org/Networking/Lan+switching+fundamentals/Chapter+10.+Implementing+and+Tuning+Spanning+Tree/Spanning-Tree+Convergence/)
Network architects designing the layer 2 Spanning Tree network need to be careful about the placement of the root switch, as all network traffic will need to flow through it, so it should be selected with care and given an appropriate bridge priority as part of the network reference architecture design. If at any point, switches have been given the same bridge priority then the bridge with the lowest MAC address wins.
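The election rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison (lowest priority first, lowest MAC address as the tie-breaker), not an implementation of STP itself; the Bridge class, the switch names, and the MAC addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Bridge:
    name: str        # hypothetical switch name, for illustration only
    priority: int    # configured bridge priority (lower is preferred)
    mac: str         # switch MAC address, e.g. "00:1a:2b:00:00:01"

    @property
    def bridge_id(self) -> Tuple[int, int]:
        # Priority is compared first, then the MAC address as an integer.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges: List[Bridge]) -> Bridge:
    """Return the bridge with the lowest bridge ID."""
    return min(bridges, key=lambda b: b.bridge_id)


switches = [
    Bridge("core-1", 4096, "00:1a:2b:00:00:02"),
    Bridge("core-2", 4096, "00:1a:2b:00:00:01"),
    Bridge("access-1", 32768, "00:1a:2b:00:00:00"),
]

root = elect_root(switches)
print(root.name)  # core-2: same priority as core-1, but a lower MAC address
```

Note that access-1 has the lowest MAC address of all three switches but still loses, because priority is compared before the MAC address. This is why an explicit bridge priority on the intended root matters.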
Network architects should also design the network for redundancy so that, if a root switch fails, there is a nominated backup root switch that will take over. Because the lowest bridge ID wins the election, the backup should be given the next-lowest bridge priority value after the nominated root switch. In the scenario where the root switch fails, the election process will begin again and the network will reconverge, which can take some time.
The use of STP is not without its risks. If it does fail, due to user configuration error, data center equipment failure, software failure on a switch, or bad design, then the consequences for a network can be huge. Loops might form within the bridged network, which can result in broadcast, multicast, or unknown-unicast storms that can potentially take down the entire network, leading to long network outages. Troubleshooting STP issues is significantly complex for network architects and engineers, so it is paramount that the network design is sound.
In recent years, with the emergence of cloud computing, we have seen data centers move away from STP in favor of a Leaf-Spine networking architecture. The Leaf-Spine architecture is shown in the following diagram:
In a Leaf-Spine architecture:
Spine switches are connected into a set of core switches
Spine switches are then connected with Leaf switches with each Leaf switch deployed at the top of rack, which means that any Leaf switch can connect to any Spine switch in one hop
Leaf-Spine architectures are promoted by companies such as Arista, Juniper, and Cisco. A Leaf-Spine architecture is built on layer 3 routing principles to optimize throughput and reduce latency.
Both Leaf and Spine switches communicate with each other via external Border Gateway Protocol (eBGP) as the routing protocol for the IP fabric. eBGP establishes a Transmission Control Protocol (TCP) connection to each of its BGP peers before BGP updates can be exchanged between the switches. Leaf switches in the implementation will sit at top of rack and can be configured in Multichassis Link Aggregation (MLAG) mode using Network Interface Controller (NIC) bonding.
MLAG was originally used with STP so that two or more switches could be bonded to emulate a single switch, so they appeared as one switch to STP. Because the switches are peered, this provided multiple uplinks for redundancy in the event of a failure, and it worked around the need to disable redundant paths. Leaf switches can often have internal Border Gateway Protocol (iBGP) configured between the pairs of switches for resiliency.
In a Leaf-Spine architecture, Spine switches do not connect to other Spine switches, and Leaf switches do not connect directly to other Leaf switches unless bonded top of rack using MLAG NIC bonding. All links in a Leaf-Spine architecture are set up to forward with no looping. Leaf-Spine architectures are typically configured to implement Equal Cost Multipathing (ECMP), which allows all routes to be configured on the switches so that they can access any Spine switch in the layer 3 routing fabric.
ECMP means that each Leaf switch's routing table has a next hop configured to forward to each Spine switch. In an ECMP setup, each Leaf switch has multiple equal-cost paths across the Spine switches, so if a Spine or Leaf switch fails, there is no impact as long as there are other active paths through another adjacent Spine switch. ECMP is used to load balance flows and supports the routing of traffic across multiple paths. This is in contrast to STP, which switches off all but one path to the root when the network converges.
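To make the load-balancing behavior concrete, the following is a minimal Python sketch of per-flow ECMP next-hop selection. It assumes a hash over the flow 5-tuple, which is a common approach. Real switches do this in hardware with vendor-specific hash functions, so the function name, the use of SHA-256, and the Spine names here are illustrative assumptions only.

```python
import hashlib
from typing import List, Tuple

# (source IP, destination IP, source port, destination port, protocol)
Flow = Tuple[str, str, int, int, str]


def pick_next_hop(flow: Flow, next_hops: List[str]) -> str:
    """Hash the flow 5-tuple and map it onto one of the equal-cost next hops."""
    if not next_hops:
        raise ValueError("no active next hops")
    digest = hashlib.sha256("|".join(map(str, flow)).encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


spines = ["spine-1", "spine-2", "spine-3", "spine-4"]
flow = ("10.0.1.10", "10.0.2.20", 49152, 443, "tcp")

first = pick_next_hop(flow, spines)
# The same flow always hashes to the same Spine, so its packets stay in order.
assert first == pick_next_hop(flow, spines)

# If that Spine fails, it is withdrawn and the flow is rehashed onto a survivor.
survivors = [s for s in spines if s != first]
print(pick_next_hop(flow, survivors))
```

Different flows hash to different Spine switches, which spreads load across all paths, while a failed Spine simply drops out of the candidate list rather than triggering a fabric-wide reconvergence.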
Normally, Leaf-Spine architectures designed for high performance use 10G access ports at Leaf switches mapping to 40G Spine ports. When device port capacity becomes an issue, a new Leaf switch can be added by connecting it to every Spine on the network and pushing the new configuration to every switch. This means that network teams can easily scale out the network horizontally without managing or disrupting the switching protocols or impacting the network performance.
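The scale-out step can be illustrated with a small Python sketch that models the fabric as a set of Leaf-to-Spine links. The switch names and counts are hypothetical; the point is only that adding one Leaf adds one link per Spine and leaves every existing link untouched.

```python
from typing import List, Set, Tuple


def fabric_links(leaves: List[str], spines: List[str]) -> Set[Tuple[str, str]]:
    """Every Leaf connects to every Spine; no Leaf-Leaf or Spine-Spine links."""
    return {(leaf, spine) for leaf in leaves for spine in spines}


spines = ["spine-1", "spine-2"]
leaves = ["leaf-1", "leaf-2", "leaf-3"]

before = fabric_links(leaves, spines)
after = fabric_links(leaves + ["leaf-4"], spines)

# Adding one Leaf adds exactly one new link per Spine.
new_links = after - before
print(len(before), len(after), sorted(new_links))
```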
An illustration of the protocols used in a Leaf-Spine architecture is shown later, with Spine switches connected to Leaf switches using BGP and ECMP, and Leaf switches sitting top of rack and configured for redundancy using MLAG and iBGP:
The benefits of a Leaf-Spine architecture are as follows:
Consistent latency and throughput in the network
Consistent performance for all racks
Network once configured becomes less complex
Simple scaling of new racks by adding new Leaf switches at top of rack
Consistent performance, subscription, and latency between all racks
East-west traffic performance is optimized (virtual machine to virtual machine communication) to support microservice applications
Removes VLAN scaling issues, controls broadcast and fault domains
Modern switches have now moved towards open source standards, so they can use the same pluggable framework. The open standard for virtual switches is Open vSwitch, which was born out of the necessity to come up with an open standard that allowed a virtual switch to forward traffic to different virtual machines on the same physical host and physical network. Open vSwitch uses the Open vSwitch Database (OVSDB), which has a standard extensible schema.
Hyper-V has recently moved to support Open vSwitch using the implementation created by Cloudbase (https://cloudbase.it/), which is doing some fantastic work in the open source space and is testament to how Microsoft's business model has evolved and embraced open source technologies and standards in recent years. Who would have thought it? Microsoft technologies now run natively on Linux.
Open vSwitch exchanges OpenFlow messages between virtual and physical switches in order to communicate, and it can be programmatically extended to fit the needs of vendors. In the following diagram, you can see the Open vSwitch architecture. Open vSwitch can run on a server using the KVM, Xen, or Hyper-V virtualization layer:
The ovsdb-server contains the OVSDB schema that holds all switching information for the virtual switch. The ovs-vswitchd daemon talks OpenFlow to any Control & Management Cluster, which could be any SDN controller that can communicate using the OpenFlow protocol.
Controllers use OpenFlow to install flow state on the virtual switch, and OpenFlow dictates what actions to take when packets are received by the virtual switch.
When Open vSwitch receives a packet it has never seen before and has no matching flow entries, it sends this packet to the controller. The controller then decides how to handle the packet, based on the flow rules, and either blocks or forwards it. Open vSwitch also supports configuring Quality of Service (QoS) and collecting statistics.
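The table-miss behavior described above can be sketched as a toy model in Python. This is not the Open vSwitch or OpenFlow API: the VirtualSwitch class, the simplified match on source and destination MAC address, and the example controller policy are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

Match = Tuple[str, str]   # (source MAC, destination MAC), a simplified match key
Action = str              # "forward" or "drop"


class VirtualSwitch:
    """Toy model of a flow table backed by a controller (not the OVS API)."""

    def __init__(self, controller: Callable[[Match], Action]) -> None:
        self.flow_table: Dict[Match, Action] = {}
        self.controller = controller
        self.controller_lookups = 0

    def receive(self, src_mac: str, dst_mac: str) -> Action:
        match = (src_mac, dst_mac)
        if match not in self.flow_table:
            # Table miss: ask the controller, then install its decision as a flow.
            self.controller_lookups += 1
            self.flow_table[match] = self.controller(match)
        return self.flow_table[match]


def example_controller(match: Match) -> Action:
    # Hypothetical policy: block one source address, forward everything else.
    blocked_sources = {"de:ad:be:ef:00:01"}
    return "drop" if match[0] in blocked_sources else "forward"


switch = VirtualSwitch(example_controller)
print(switch.receive("aa:aa:aa:00:00:01", "aa:aa:aa:00:00:02"))  # forward (miss)
print(switch.receive("aa:aa:aa:00:00:01", "aa:aa:aa:00:00:02"))  # forward (hit)
print(switch.receive("de:ad:be:ef:00:01", "aa:aa:aa:00:00:02"))  # drop (miss)
print(switch.controller_lookups)                                 # 2
```

Only the first packet of each flow reaches the controller; subsequent packets are handled from the installed flow entry, which is what keeps the controller off the fast path.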
A Leaf-Spine architecture allows overlay networks to be easily built, meaning that cloud and tenant environments are easily connected to the layer 3 routing fabric. Hardware VXLAN Tunnel Endpoint (VTEP) IPs are associated with each Leaf switch, or with a pair of Leaf switches in MLAG mode, and are connected via Virtual Extensible LAN (VXLAN) to the Open vSwitch installed on the hypervisor of each physical compute host.
This allows an SDN controller, provided by vendors such as Cisco, Nokia, and Juniper, to build an overlay network that creates VXLAN tunnels to the physical hypervisors using Open vSwitch. When a new compute host is scaled out, new VXLAN tunnels are created automatically: SDN controllers can create them on the Leaf switch because they are peered with the Leaf switch's hardware VTEP.
Modern switch vendors, such as Arista, Cisco, Cumulus, and many others, use OVSDB, and this allows SDN controllers to integrate at the Control & Management Cluster level. As long as an SDN controller uses the OVSDB and OpenFlow protocols, it can seamlessly integrate with the switches and is not tied to specific vendors. This gives end users a greater depth of choice when choosing switch vendors and SDN controllers, which can be matched up as they communicate using the same open standard protocols.
It is unquestionable that the emergence of AWS, which was launched in 2006, changed and shaped the networking landscape forever. AWS has allowed companies to rapidly develop their products on the AWS platform. AWS has created an innovative set of services for end users, so they can manage infrastructure, load balancing, and even databases. These services have led the way in making the DevOps ideology a reality, by allowing users to elastically scale up and down the infrastructure they need to develop products on demand, so infrastructure wait times are no longer an inhibitor to development teams. The rich AWS feature set allows users to create infrastructure by clicking on a portal, while more advanced users can programmatically create infrastructure using configuration management tooling, such as Ansible, Chef, Puppet, and Salt, or Platform as a Service (PaaS) solutions.
In 2016, the AWS Virtual Private Cloud (VPC) secures a set of Amazon EC2 instances (virtual machines) that can be connected to any existing network using a VPN connection. This simple construct has changed the way that developers want and expect to consume networking.
In 2016, we live in a consumer-based society with mobile phones allowing us instant access to the Internet, films, games, or an array of different applications to meet our every need, instant gratification if you will, so it is easy to see the appeal that AWS has to end users.
AWS allows developers to provision instances (virtual machines) in their own personal network, to their desired specification, by selecting different flavors (CPU, RAM, and disk) using a few button clicks on the AWS portal's graphical user interface or, alternatively, using a simple call to an API or scripting against the AWS-provided SDKs.
So now a valid question: why should developers be expected to wait long periods of time for either infrastructure or networking tickets to be serviced in on-premises data centers when AWS is available? It really shouldn't be a hard question to answer. The solution surely has to be either to move to AWS or to create a private cloud solution that enables the same agility. However, the answer isn't always that straightforward; there are the following arguments against using AWS and public cloud:
Not knowing where the data is actually stored and in which data center
Not being able to hold sensitive data offsite
Not being able to assure the necessary performance
High running costs
All of these points are genuine blockers for some businesses that may be highly regulated, need to be PCI compliant, or are required to meet specific regulatory standards. These points may inhibit some businesses from using public cloud, so, as with most solutions, it isn't a case of one size fits all.
In private data centers, there is a cultural issue that teams have been set up to work in silos and are not set up to succeed in an agile business model, so a lot of the time using AWS, Microsoft Azure, or Google Cloud is a quick fix for broken operational models.
Ticketing systems, a staple of broken internal operational models, are not a concept that aligns itself to speed. An IT ticket raised to an adjacent team can take days or weeks to complete, so requests are queued before virtual or physical servers can be provided to developers. This is prominent for network changes too, with changes such as a simple modification to ACL rules taking an age to be implemented due to ticketing backlogs.
Developers need to have the ability to scale up servers or prototype new features at will, so long wait times for IT tickets to be processed hinder delivery of new products to market or bug fixes to existing products. It has become common in internal IT for some Information Technology Infrastructure Library (ITIL) practitioners to treat the number of tickets processed over a week as the main metric for success. This shows complete disregard for the customer experience of their developers. There are some operations that have traditionally lived with internal or shadow IT and need to shift to the developers, but there needs to be a change in operational processes at a business level to invoke these changes.
Put simply, AWS has changed the expectations of developers and the expectations placed on infrastructure and networking teams. Developers should be able to service their needs as quickly as making an alteration to an application on their mobile phone, free from slow internal IT operational models associated with companies.
But for start-ups and businesses that can use AWS, which aren't constrained by regulatory requirements, it skips the need to hire teams to rack servers, configure network devices, and pay for the running costs of data centers. It means they can start viable businesses and run them on AWS by putting in credit card details the same way as you would purchase a new book on Amazon or eBay.
The reaction to AWS was met with trepidation from competitors, as it disrupted the cloud computing industry and has led to PaaS solutions such as Cloud Foundry and Pivotal coming to fruition to provide an abstraction layer on top of hybrid clouds.
When a market is disrupted, it provokes a reaction, and from this reaction spawned the idea for a new private cloud. In 2010, a joint venture by Rackspace and NASA launched an open source cloud-software initiative known as OpenStack, which came about as NASA couldn't put their data in a public cloud.
The OpenStack project intended to help organizations offer cloud computing services running on standard hardware and directly set out to mimic the model provided by AWS. The main difference with OpenStack is that it is an open source project that can be used by leading vendors to bring AWS-like ability and agility to the private cloud.
Since its inception in 2010, OpenStack has grown to have over 500 member companies as part of the OpenStack Foundation, with platinum members and gold members that comprise the biggest IT vendors in the world that are actively driving the community.
The platinum members of the OpenStack foundation are:
OpenStack is an open source project, which means its source code is publicly available and its underlying architecture is available for analysis, unlike AWS, which acts like a magic box of tricks whose inner workings beneath its shiny exterior are not really known.
OpenStack is primarily used to provide an Infrastructure as a Service (IaaS) function within the private cloud, where it makes commodity x86 compute, centralized storage, and networking features available to end users to self-service their needs, be it via the Horizon dashboard or through a set of common APIs.
Many companies are now implementing OpenStack to build their own data centers. Rather than doing it on their own, some companies are using different vendor-hardened distributions of the community upstream project. It has been shown that using a vendor-hardened distribution of OpenStack when starting out means that an OpenStack implementation is far likelier to be successful. Initially, for some companies, implementing OpenStack can be seen as complex, as it is a completely new set of technology that a company may not be familiar with yet. OpenStack implementations are less likely to fail when using professional service support from known vendors, and it can create a viable alternative to enterprise solutions, such as AWS or Microsoft Azure.
Vendors, such as Red Hat, HP, Suse, Canonical, Mirantis, and many more, provide different distributions of OpenStack to customers, complete with different methods of installing the platform. Although the source code and features are the same, the business model for these OpenStack vendors is that they harden OpenStack for enterprise use and their differentiator to customers is their professional services.
Oracle OpenStack for Oracle Linux, or O3L
Oracle OpenStack for Oracle Solaris
VMware Integrated OpenStack (VIO)
OpenStack vendors will support build out, on-going maintenance, upgrades, or any customizations a client needs, all of which are fed back to the community. The beauty of OpenStack being an open source project is that if vendors customize OpenStack for clients and create a real differentiator or competitive advantage, they cannot fork OpenStack or uniquely sell this feature. Instead, they have to contribute the source code back to the upstream open source OpenStack project.
This means that all competing vendors contribute to the success of OpenStack and benefit from each other's innovative work. The OpenStack project is not just for vendors though, and everyone can contribute code and features to push the project forward.
OpenStack maintains a release cycle where an upstream release is created every six months and is governed by the OpenStack Foundation. It is important to note that many public clouds, such as AT&T, Rackspace, and GoDaddy, are based on OpenStack too, so it is not exclusive to private clouds, but it has undeniably become increasingly popular as a private cloud alternative to AWS public cloud and is now widely used for Network Function Virtualization (NFV).
So how do AWS and OpenStack work in terms of networking? Both AWS and OpenStack are made up of some mandatory and optional projects that are all integrated to make up their reference architectures. Mandatory projects include compute and networking, which are the staple of any cloud solution, whereas others are optional bolt-ons to enhance or extend capability. This means that end users can cherry-pick the projects they are interested in to make up their own personal portfolio.
Having discussed both AWS and OpenStack, first, we will explore the AWS approach to networking, before looking at an alternative method using OpenStack and compare the two approaches. When first setting up networking in AWS, a tenant network in AWS is instantiated using VPC, which post 2013 deprecated AWS classic mode; but what is VPC?
A VPC is the new default setting for new customers wishing to access AWS. VPCs can also be connected to customer networks (private data centers), allowing the AWS cloud to extend a private data center for agility. A private data center is connected to an AWS VPC using what AWS refers to as a customer gateway and a virtual private gateway. A virtual private gateway, in simple terms, is just two redundant VPN tunnels, which are instantiated from the customer's private network.
Customer gateways expose a set of external static addresses from a customer site, which typically use Network Address Translation-Traversal (NAT-T) to hide the source address. UDP port 4500 should be accessible in the external firewall in the private data center. Multiple VPCs can be supported from one customer gateway device.
A VPC gives an isolated view of everything an AWS customer has provisioned in AWS public cloud. Different user accounts can then be set up against VPC using the AWS Identity and Access Management (IAM) service, which has customizable permissions.
By default, when an instance (virtual machine) is instantiated in a VPC, it will either be placed on a default subnet or custom subnet if specified.
When an instance is spun up in AWS, it will automatically be assigned a mandatory private IP address by Dynamic Host Configuration Protocol (DHCP), as well as a public IP and DNS entry unless dictated otherwise. Private IPs are used in AWS to route east-west traffic between instances when a virtual machine needs to communicate with adjacent virtual machines on the same subnet, whereas public IPs are reachable from the Internet.
If a persistent public IP address is required for an instance, AWS offers the elastic IP addresses feature, which is limited to five per VPC account. With it, the IP address of any failed instance can be quickly mapped to another instance. It is important to note that it can take up to 24 hours for a public IP address's DNS Time To Live (TTL) to propagate when using AWS.
In terms of throughput, AWS instances support a Maximum Transmission Unit (MTU) of 1,500 bytes, so this needs to be taken into account when considering application performance.
Security groups in AWS are a way of grouping permissive ACL rules, so they don't allow explicit denies. AWS security groups act as a virtual firewall for instances, and they can be associated with one or more instances' network interfaces. In a VPC, you can associate a network interface with up to five security groups, adding up to 50 rules to a security group, with a maximum of 500 security groups per VPC. A VPC in an AWS account automatically has a default security group, which will be automatically applied if no other security groups are specified.
Default security groups allow all outbound traffic and all inbound traffic only from other instances in a VPC that also use the default security group. The default security group cannot be deleted. Custom security groups when first created allow no inbound traffic, but all outbound traffic is allowed.
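The permissive-only model can be illustrated with a short Python sketch. It is a simplified model of how inbound rules are evaluated, not the AWS API: the InboundRule class, its field names, and the example addresses are assumptions for illustration. Inbound traffic is allowed only when at least one rule matches, and there is no way to express an explicit deny.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InboundRule:
    protocol: str      # e.g. "tcp"
    from_port: int     # start of the allowed port range
    to_port: int       # end of the allowed port range (inclusive)
    source_cidr: str   # e.g. "203.0.113.0/24"


def inbound_allowed(rules: List[InboundRule], protocol: str,
                    port: int, source_ip: str) -> bool:
    """Traffic is allowed only if at least one permissive rule matches."""
    source = ipaddress.ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and source in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


# A newly created custom security group has no inbound rules.
custom_group: List[InboundRule] = []
print(inbound_allowed(custom_group, "tcp", 443, "203.0.113.7"))   # False

# Adding a permissive rule opens HTTPS from one source range only.
custom_group.append(InboundRule("tcp", 443, 443, "203.0.113.0/24"))
print(inbound_allowed(custom_group, "tcp", 443, "203.0.113.7"))   # True
print(inbound_allowed(custom_group, "tcp", 22, "203.0.113.7"))    # False
```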
Permissive ACL rules associated with security groups govern inbound traffic and are added using the AWS console (GUI) as shown later in the text, or they can be programmatically added using APIs. Inbound ACL rules associated with security groups can be added by specifying type, protocol, port range, and the source address. Refer to the following screenshot:
A VPC has access to different regions and availability zones of shared compute, which dictate the data center that the AWS instances (virtual machines) will be deployed in. Regions in AWS are geographic areas that are completely isolated by design, whereas availability zones are isolated locations within a specific region, so an availability zone is a subset of a region.
AWS gives users the ability to place their resources in different locations for redundancy, as the health of a specific region or availability zone can sometimes suffer issues. Therefore, AWS users are encouraged to use more than one availability zone when deploying production workloads on AWS. Users can also replicate their instances and data across regions if they choose to.
Within each isolated AWS region, there are child availability zones. Each availability zone is connected to sibling availability zones using low latency links. All communication from one region to another is across the public Internet, so using geographically distant regions will incur additional latency. Encryption of data should also be considered when hosting applications that send data across regions.
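The recommendation to use more than one availability zone can be illustrated with a minimal placement sketch. The function name spread_across_zones and the instance names are hypothetical, and the zone identifiers are only example values; real deployments would typically rely on AWS tooling to do this.

```python
from itertools import cycle


def spread_across_zones(instance_names, zones):
    """Assign instances to availability zones in round-robin order."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    zone_cycle = cycle(zones)
    # Each instance takes the next zone in turn, so no zone is favoured.
    return {name: next(zone_cycle) for name in instance_names}


if __name__ == "__main__":
    placement = spread_across_zones(
        ["web1", "web2", "web3"], ["eu-west-1a", "eu-west-1b"]
    )
    for name, zone in placement.items():
        print(name, zone)
```

With this placement, the loss of a single availability zone removes only a share of the instances rather than the whole workload.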
AWS also allows Elastic Load Balancing (ELB) to be configured within a VPC as a bolt-on service. ELB can either be internal or external. When ELB is external, it allows the creation of an Internet-facing entry point into your VPC using an associated DNS entry and balances load between different instances. Security groups are assigned to ELBs to control the access ports that need to be used.
The following image shows an elastic load balancer, load balancing 3 instances:
OpenStack is deployed in a data center on multiple controllers. These controllers contain all the OpenStack services, and they can be installed on either virtual machines, bare metal (physical) servers, or containers. The OpenStack controllers should host all the OpenStack services in a highly available and redundant fashion when they are deployed in production.
Different OpenStack vendors provide different installers to install OpenStack. Some examples of installers from the most prominent OpenStack distributions are Red Hat Director (based on OpenStack TripleO), Mirantis Fuel, HP's HPE installer (based on Ansible), and Juju for Canonical. All of these install the OpenStack controllers and are used to scale out compute nodes on the OpenStack cloud, acting as an OpenStack workflow management tool.
Swift is the object storage service for OpenStack and can be used as a redundant storage backend that stores replicated copies of objects on multiple servers. Swift is not like traditional block or file-based storage; objects can be any unstructured data.
Ironic is the bare metal provisioning service for OpenStack. Originally a fork of part of the Nova codebase, it allows provisioning of images onto bare metal servers and uses IPMI and iLO or DRAC interfaces to manage physical hardware.
A Project, often referred to in OpenStack as a tenant, gives an isolated view of everything that a team has provisioned in an OpenStack cloud. Different user accounts can then be set up against a Project (tenant) using the keystone identity service, which can be integrated with Lightweight Directory Access Protocol (LDAP) or Active Directory to support customizable permission models.
The following network functions are provided by the neutron project in an OpenStack cloud:
Creating instances (virtual machines) mapped to networks
Assigning IP addresses using its in-built DHCP service
Applying DNS entries to instances from named servers
Assigning private and floating IP addresses
Creating or associating network subnets
Applying security groups
OpenStack networking is split into Modular Layer 2 (ML2) and Layer 3 (L3) agents that are configured on the OpenStack controllers. OpenStack's ML2 plugin allows OpenStack to integrate with switch vendors that use either Open vSwitch or Linux Bridge and acts as a vendor-agnostic plugin, so vendors can create their own plugins to make their switches OpenStack compatible. The ML2 agent runs on the hypervisor, communicating over Remote Procedure Call (RPC) to the compute host server.
OpenStack compute hosts are typically deployed using a hypervisor that uses Open vSwitch. Most OpenStack vendor distributions use the KVM hypervisor by default in their reference architectures, so this is deployed and configured on each compute host by the chosen OpenStack installer.
Compute hosts in OpenStack are connected to the access layer of the STP 3-tier model, or in modern networks connected to the Leaf switches, with VLANs connected to each individual OpenStack compute host. Tenant networks are then used to provide isolation between tenants and use VXLAN and GRE tunneling to connect the layer 2 network.
Open vSwitch runs in kernel space on the KVM hypervisor and enforces firewall rules defined by OpenStack security groups, with flow data pushed down to the switches via OVSDB. The neutron L3 agent allows OpenStack to route between tenant networks and uses neutron routers, which are deployed within the tenant network, to accomplish this. Without a neutron router, networks are isolated from each other and from everything else.
When setting up simple networking using neutron in a Project (tenant) network, two different networks will be configured: an internal network and an external network. The internal network will be used for east-west traffic between instances. This is created as shown in the following horizon dashboard with an appropriate Network Name:
Finally, DHCP is enabled on the network, and any named Allocation Pools (which restrict the addresses that can be used in a subnet to a specified range) are optionally configured alongside any named DNS Name Servers, as shown below:
An external network will also need to be created to make the internal network accessible from outside of OpenStack. When external networks are created by an administrative user, the set External Network checkbox needs to be selected, as shown in the next screenshot:
The created router will then need to be associated with the networks; this is achieved by adding an interface on the router for the private network, as illustrated in the following screenshot:
This then completes the network setup; the final configuration for the internal and external network is displayed below, which shows one router connected to an internal and external network:
In OpenStack, instances are provisioned onto the internal private network by selecting the private network NIC when deploying instances. OpenStack has the convention of assigning public (floating) IP addresses from a pool on an external network to instances that need to be externally routable outside of OpenStack.
To set up a set of floating IP addresses, an OpenStack administrator will set up an allocation pool on the external network, as shown in the following screenshot:
OpenStack, like AWS, uses security groups to set up firewall rules between instances. Whereas the AWS security groups described earlier allow all outbound communication by default, OpenStack security groups handle both ingress and egress ACL rules. Bespoke security groups are created to group ACL rules, as shown below:
Ingress and egress rules can then be created against a security group. SSH access is configured as an ACL rule against the parent security group, which is pushed down to Open vSwitch into kernel space on each hypervisor, as seen in the next screenshot:
An instance is launched by selecting Launch Instance in horizon and setting the following parameters:
Flavor (CPU, RAM, and disk space)
Image Name (base operating system)
A security group should also be selected to govern the ACL rules for the instance; in this instance, the testsg1 security group is selected as shown in the following screenshot:
A floating IP address from the external network floating IP address pool is then selected and associated with the instance:
The floating IP address NATs an OpenStack instance's internal private IP address to the external network's floating IP address, which allows the instance to be accessible from outside of OpenStack.
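As a rough illustration of the floating IP mechanism, the sketch below models an allocation pool and the one-to-one NAT mapping between floating and private addresses. It is a toy model rather than the neutron API: the class FloatingIPPool and its methods are hypothetical, and the addresses come from ranges reserved for documentation and private use.

```python
from ipaddress import ip_address


class FloatingIPPool:
    """Toy model of a floating IP allocation pool on an external network."""

    def __init__(self, start: str, end: str):
        first, last = int(ip_address(start)), int(ip_address(end))
        if first > last:
            raise ValueError("pool start must not be after pool end")
        # Every address in the inclusive range starts out unallocated.
        self._free = [str(ip_address(i)) for i in range(first, last + 1)]
        self._nat = {}  # floating IP -> private IP (one-to-one NAT)

    def associate(self, private_ip: str) -> str:
        """Take the next free floating IP and map it to private_ip."""
        if not self._free:
            raise RuntimeError("floating IP pool exhausted")
        floating_ip = self._free.pop(0)
        self._nat[floating_ip] = private_ip
        return floating_ip

    def disassociate(self, floating_ip: str) -> None:
        """Remove the mapping and return the address to the pool."""
        self._nat.pop(floating_ip)  # raises KeyError if not associated
        self._free.append(floating_ip)

    def translate_inbound(self, floating_ip: str) -> str:
        """Return the private IP that inbound traffic is forwarded to."""
        return self._nat[floating_ip]


if __name__ == "__main__":
    pool = FloatingIPPool("203.0.113.10", "203.0.113.12")
    fip = pool.associate("10.0.0.5")
    print(fip, "->", pool.translate_inbound(fip))
```

The instance itself only ever sees its private address; the translation happens at the edge of the tenant network, which is why a floating IP can be moved between instances without reconfiguring them.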
An availability zone in OpenStack is just a virtual separation of compute resources. In OpenStack, an availability zone can be further segmented into host aggregates. It is important to note that a compute host can be assigned to only one availability zone, but can be a part of multiple host aggregates in that same availability zone.
Nova uses a concept named nova scheduler rules, which dictates the placement of instances on compute hosts at provisioning time. A simple example of a nova scheduler rule is the AvailabilityZoneFilter filter, which means that if a user selects an availability zone at provisioning time, then the instance will land only on one of the compute hosts grouped under that availability zone.
Another example is the AggregateInstanceExtraSpecsFilter filter, which means that if a custom flavor (CPU, RAM, and disk) is tagged with a key value pair and a host aggregate is tagged with the same key value pair, then when a user deploys with that flavor, the AggregateInstanceExtraSpecsFilter filter will place all instances on compute hosts under that host aggregate.
These host aggregates can be assigned to specific teams, which means that teams can be selective about which applications they share their compute with, and this can be used to prevent noisy neighbor syndrome. There is a wide array of filters that can be applied in OpenStack in all sorts of orders to dictate instance scheduling. OpenStack allows cloud operators to create anything from a traditional cloud model with large groups of contended compute to more bespoke use cases where the isolation of compute resources is required for particular application workloads.
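The way these two filters narrow down candidate hosts can be sketched with a simplified model. This is not Nova's implementation: the ComputeHost class, the function names, and the example tag are assumptions chosen to mirror the behaviour described above.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeHost:
    name: str
    availability_zone: str
    # Key value pairs inherited from the host aggregates this host belongs to.
    aggregate_metadata: dict = field(default_factory=dict)


def availability_zone_filter(hosts, requested_zone):
    """Keep only hosts in the requested zone; no zone means no restriction."""
    if requested_zone is None:
        return list(hosts)
    return [h for h in hosts if h.availability_zone == requested_zone]


def aggregate_extra_specs_filter(hosts, flavor_extra_specs):
    """Keep only hosts whose aggregate tags match every flavor key value pair."""
    return [
        h for h in hosts
        if all(h.aggregate_metadata.get(k) == v
               for k, v in flavor_extra_specs.items())
    ]


def schedule(hosts, requested_zone=None, flavor_extra_specs=None):
    """Apply the filters in order and return the names of surviving hosts."""
    candidates = availability_zone_filter(hosts, requested_zone)
    candidates = aggregate_extra_specs_filter(candidates, flavor_extra_specs or {})
    return [h.name for h in candidates]


if __name__ == "__main__":
    hosts = [
        ComputeHost("compute1", "DC1", {"team": "payments"}),
        ComputeHost("compute2", "DC1"),
        ComputeHost("compute3", "DC2", {"team": "payments"}),
    ]
    print(schedule(hosts, "DC1", {"team": "payments"}))
```

Each filter only removes hosts from the candidate list, so the order in which filters run changes how much work later filters do but not which hosts survive.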
The following example shows host aggregates with groups and shows a host aggregate named 1-Host-Aggregate, grouped under an Availability Zone named DC1 containing two compute hosts (hypervisors), which could be allocated to a particular team:
The Nova compute service will issue a request for a new instance (virtual machine) using the image selected from the glance images service
Once the request for a new instance is processed, a new row will be written into the nova database in Galera
Nova will look at the nova scheduler rules defined on the OpenStack controllers and will use those rules to place the instance on an available compute node (KVM hypervisor)
If an available hypervisor is found that meets the nova scheduler rules, then the provisioning process will begin
Nova will check whether the image already exists on the matched hypervisor. If it doesn't, the image will be transferred from glance to the hypervisor and booted from local disk
Nova will issue a neutron request, which will create a new VPort in OpenStack and map it to the neutron network
The VPort information will then be written to both the nova and neutron databases in Galera to correlate the instance with the network
A private IP address will then be assigned, and the instance will start to boot on the private network
The neutron metadata service will then be contacted to retrieve cloud-init information on boot, which will assign a DNS entry to the instance from the named server, if specified
Once cloud-init has run, the instance will be ready to use
Floating IPs can then be assigned to the instance to NAT to external networks to make the instances publicly accessible
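The steps above can be sketched as a single function that records each stage as it happens. This is a heavily simplified walk-through, not OpenStack code: the function provision_instance, its parameters, and the event strings are all hypothetical, and the databases, scheduler, and metadata service are reduced to plain Python structures.

```python
def provision_instance(name, image, hypervisors, image_cache, free_private_ips):
    """Walk through a simplified version of the provisioning steps.

    hypervisors: names of hypervisors that already passed the scheduler rules
    image_cache: dict mapping hypervisor name -> set of locally cached images
    free_private_ips: list of unused addresses on the private network
    """
    events = [f"request instance {name} from image {image}"]

    # Scheduling: stop early if no hypervisor satisfies the rules.
    if not hypervisors:
        events.append("no valid host found")
        return None, events
    host = hypervisors[0]
    events.append(f"scheduled on {host}")

    # Image handling: copy the image only if the host does not have it yet.
    cached = image_cache.setdefault(host, set())
    if image not in cached:
        cached.add(image)
        events.append(f"image {image} copied to {host}")

    # Networking: create a port and assign a private IP address.
    port = {"instance": name, "ip": free_private_ips.pop(0)}
    events.append(f"port created with private IP {port['ip']}")

    events.append("cloud-init metadata retrieved")
    events.append("instance ready")
    return {"name": name, "host": host, "port": port}, events


if __name__ == "__main__":
    instance, log = provision_instance(
        "vm1", "ubuntu", ["kvm1"], {}, ["10.0.0.5", "10.0.0.6"]
    )
    for line in log:
        print(line)
```

Running the function a second time with the same image cache skips the image copy, which reflects why repeat deployments of an image onto the same hypervisor tend to start faster.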
Like AWS, OpenStack also offers a Load-Balancer-as-a-Service (LBaaS) option that allows incoming requests to be distributed evenly among designated instances using a Virtual IP (VIP). The features and functionality supported by LBaaS are dependent on the vendor plugin that is used.
The supported load balancers all expose varying degrees of features to the OpenStack LBaaS agent. The main driver for utilizing LBaaS on OpenStack is that it acts as a broker to the load balancing solution, allowing users to configure the load balancer via the OpenStack API or the horizon GUI.
LBaaS allows load balancing to be set up within a tenant network in OpenStack. Using LBaaS means that if, for any reason, a user wishes to switch to a new load balancer vendor from their incumbent one, the move is made much easier. As all calls and administration are done via the LBaaS APIs or Horizon, no changes are required to the orchestration scripting used to provision and administer the load balancer. Users are not tied into each vendor's custom APIs, and the load balancing solution becomes a commodity.
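The broker idea can be sketched as a common interface with interchangeable vendor drivers. This is not the real LBaaS plugin API; the class names, the create_pool method, and the two vendors are hypothetical and exist only to show why the orchestration code does not change when the vendor does.

```python
from abc import ABC, abstractmethod


class LoadBalancerDriver(ABC):
    """Hypothetical vendor driver interface (not the real LBaaS plugin API)."""

    @abstractmethod
    def create_pool(self, vip: str, members: list) -> dict:
        """Create a load balanced pool behind a virtual IP."""


class VendorADriver(LoadBalancerDriver):
    def create_pool(self, vip, members):
        # A real driver would call vendor A's own API here.
        return {"vendor": "A", "vip": vip, "members": list(members)}


class VendorBDriver(LoadBalancerDriver):
    def create_pool(self, vip, members):
        # A real driver would call vendor B's own API here.
        return {"vendor": "B", "vip": vip, "members": list(members)}


def provision_load_balancer(driver: LoadBalancerDriver, vip, members):
    """Orchestration code that depends only on the common interface."""
    return driver.create_pool(vip, members)


if __name__ == "__main__":
    members = ["10.0.0.5", "10.0.0.6", "10.0.0.7"]
    for driver in (VendorADriver(), VendorBDriver()):
        print(provision_load_balancer(driver, "10.0.0.100", members))
```

Swapping VendorADriver for VendorBDriver leaves provision_load_balancer untouched, which is the property that makes the load balancer a replaceable commodity.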
In this chapter, we have covered some of the basic networking principles that are used in today's modern data centers, with special focus on the AWS and OpenStack cloud technologies, which are two of the most popular solutions.
Having read this chapter, you should now be familiar with the difference between Leaf-Spine and Spanning Tree network architectures. AWS networking should have been demystified, and you should now have a basic understanding of how private and public networks can be configured in OpenStack.
In the forthcoming chapters, we will build on these basic networking constructs and look at how they can be programmatically controlled using configuration management tools and used to automate network functions. But first, we will focus on some of the software-defined networking controllers that can be used to extend the capability of OpenStack even further than neutron in private clouds, and on some of the feature sets and benefits they bring to ease the pain of managing network operations.