Apache CloudStack Architecture

Introducing cloud

Before embarking on a journey to understand and appreciate CloudStack, let's revisit the basic concepts of cloud computing and how CloudStack can help us in achieving our private, public, or hybrid cloud objectives.

Let's start this article with a plain and simple definition of cloud. Cloud is a shared multi-tenant environment built on a highly efficient, highly automated, and preferably virtualized IT infrastructure where IT resources can be provisioned on demand from anywhere over a broad network, and can be metered. Virtualization is the technology that has made enabling these features simpler and more convenient. A cloud can be deployed in various models, including private, public, community, or hybrid clouds. These deployment models can be explained as follows:

  • Private cloud: In this deployment model, the cloud infrastructure is operated solely for an organization and may exist on premise or off premise. It can be managed by the organization or a third-party cloud provider.
  • Public cloud: In this deployment model, the cloud service is provided to the general public or a large industry group, and is owned and managed by the organization providing cloud services.
  • Community cloud: In this deployment model, the cloud is shared by multiple organizations and is supported by a specific community that has shared concerns. It can be managed by the organization or a third party provider, and can exist on premise or off premise.
  • Hybrid cloud: This deployment model comprises two or more types of cloud (public, private, or community) and enables data and application portability between the clouds.

A cloud—be it private, public, or hybrid—has the following essential characteristics:

  • On-demand self service
  • Broad network access
  • Resource pooling
  • Rapid elasticity or expansion
  • Measured service
  • Shared by multiple tenants

Cloud has three possible service models, which means there are three types of cloud services that can be provided. They are:

  • Infrastructure as a service (IaaS): This type of cloud service model provides IT infrastructure resources as a service to the end users. This model provides the end users with the capability to provision processing, storage, networks, and other fundamental computing resources that the customer can use to run arbitrary software including operating systems and applications. The provider manages and controls the underlying cloud infrastructure and the user has control over the operating systems, storage and deployed applications. The user may also have some control over the networking services.
  • Platform as a service (PaaS): In this service model, the end user is provided with a platform that is provisioned over the cloud infrastructure. The provider manages the network, operating system, and storage, and the end user has control over the applications and may have control over the hosting environment of the applications.
  • Software as a service (SaaS): This layer provides software as a service to the end users, such as an online calculation engine. The end users can access this software using a thin client interface such as a web browser. The end users do not manage the underlying cloud infrastructure such as network, servers, OS, storage, or even individual application capabilities, but may have some control over the application configuration settings.

As depicted in the preceding diagram, each layer of cloud computing is built upon the layer below it. In this book, we will be mainly dealing with the bottom layer, Infrastructure as a Service.

Providing Infrastructure as a Service essentially means that the cloud provider assembles the building blocks for providing these services, including the computing, networking, and storage hardware. These resources are exposed to the consumers through a request management system, which in turn is integrated with an automated provisioning layer. The cloud system also needs to meter and bill the customer on various chargeback models. The concept of virtualization enables the provider to leverage and pool resources in a multi-tenant model. Thus, the features provided by virtualization and resource pooling, combined with modern clustering infrastructure, enable efficient use of IT resources to provide high availability and scalability, increase agility, optimize utilization, and provide a multi-tenancy model.

It is easy to confuse the cloud with a virtualized datacenter, but there are many differences, such as:

  • The cloud is the next stage after the virtualization of datacenters. It is characterized by a service layer over the virtualization layer. Instead of bare computing resources, services are built over the virtualization platforms and provided to the users. Cloud computing provides the request management layer, provisioning layer, metering and billing layers along with security controls and multi-tenancy.
  • Cloud resources are available to consumers on an on demand model wherein the resources can be provisioned and de-provisioned on an as needed basis. Cloud providers typically have huge capacities to serve variable workloads and manage variable demand from customers. Customers can leverage the scaling capabilities provided by cloud providers to scale up or scale down the IT infrastructure needed by the application and the workload. This rapid scaling helps the customer save money by using the capacity only when it is needed.
  • The resource provisioning in the cloud is governed by policies and rules, and the process of provisioning is automated.
  • Metering, Chargeback, and Billing are essential governance characteristics of any cloud environment as they govern and control the usage of precious IT resources.

Thus setting up a cloud is basically building capabilities to provide IT resources as a service in a well-defined manner. Services can be provided to end users in various offerings, depending upon the amount of resources each service offering provides. The amount of resources can be broken down to multiple resources such as the computing capacity, memory, storage, network bandwidth, storage IOPS, and so on. A cloud provider can provide and meter multiple service offerings for the end users to choose from.

Though the cloud provider makes upfront investments in creating the cloud capacity, from a consumer's point of view the resources are available on demand on a pay-per-use model. Thus the customer gets billed for consumption, just as with the electricity or telecom services that individuals use. The billing may be based on hours of compute usage, the amount of storage used, bandwidth consumed, and so on.
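
The pay-per-use idea can be sketched in a few lines of Python. The rates, the `Usage` record, and the `bill` function are hypothetical and chosen only for illustration; a real provider defines its own chargeback model.

```python
from dataclasses import dataclass

# Hypothetical unit prices; a real provider defines its own chargeback model.
RATES = {
    "compute_hours": 0.05,  # price per VM-hour
    "storage_gb": 0.10,     # price per GB stored in the billing period
    "bandwidth_gb": 0.02,   # price per GB transferred
}


@dataclass
class Usage:
    compute_hours: float
    storage_gb: float
    bandwidth_gb: float


def bill(usage: Usage, rates: dict = RATES) -> float:
    """Return the charge for one billing period, rounded to cents."""
    total = (
        usage.compute_hours * rates["compute_hours"]
        + usage.storage_gb * rates["storage_gb"]
        + usage.bandwidth_gb * rates["bandwidth_gb"]
    )
    return round(total, 2)
```

A customer who consumes nothing in a period is billed nothing, which is the essential difference from paying upfront for capacity.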

Having understood the cloud computing model, let's look at the architecture of a typical Infrastructure as a Service cloud environment.

Infrastructure layer

The Infrastructure layer is the base layer and comprises all the hardware resources upon which IT is built. These include computing resources, storage resources, network resources, and so on.

Computing resources

Virtualization is provided by a hypervisor, which has various functions such as enabling the virtual machines on a host to interact with the hardware. The physical servers host the hypervisor layer, and the physical server resources are accessed through the hypervisor. The hypervisor layer also enables access to the network and storage. There are various hypervisors on the market such as VMware, Hyper-V, XenServer, and so on. These hypervisors are responsible for making it possible for one physical server to host multiple virtual machines, and for enabling resource pooling and multi-tenancy.

Storage

Like the compute capacity, we need storage that is accessible to the compute layer.

The Storage in cloud environments is pooled just like the Compute and accessed through the virtualization layer. Certain types of services just offer storage as a service where the storage can be programmatically accessed to store and retrieve objects.

Pooled, virtualized storage is enabled through technologies such as Network Attached Storage (NAS) and Storage Area Network (SAN), which allow the infrastructure to allocate storage on demand in an automated, policy-based manner.

The storage provisioning using such technologies helps in providing storage capacity on demand to users and also enables the addition or removal of capacity as per the demand. The cost of storage can be differentiated according to the different levels of performance and classes of storage.

Typically, SAN is used for storage capacity in the cloud where statefulness is required. Direct-attached Storage (DAS) can be used for stateless workloads, which can drive down the cost of service. The storage involved in cloud architecture can be made redundant to avoid a single point of failure. There can be multiple paths for access to the disk arrays to provide redundancy in case of connectivity failures.

The storage arrays can also be configured in a way that there is incremental backup of the allocated storage. The storage should be configured such that health information of the storage units is updated in the system monitoring service, which ensures that the outage and its impact are quickly identified and appropriate action can be taken in order to restore it to its normal state.

Networks and security

Network configuration includes defining the subnets, on-demand allocation of IP addresses, and defining the network routing tables to enable the flow of data in the network. It also includes enabling high availability services such as load balancing. The security configuration, in turn, aims to secure the data flowing in the network. This includes isolating the data of different tenants from each other and from the management data of the cloud, using techniques such as network isolation and security groups.

Networking in the cloud is supposed to deal with the isolation of resources between multiple tenants as well as provide tenants with the ability to create isolated components. Network isolation in the cloud can be done using techniques such as VLAN, VXLAN, VCDNI, or STT.

Applications are deployed in a multi-tenant environment and consist of components that are to be kept private, such as a database server that may be accessed only from selected web servers, with traffic from any other source denied. This is enabled using network isolation, port filtering, and security groups. These services help with segmenting and protecting various layers of application deployment architecture and also allow isolation of tenants from each other.
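
The database example above can be sketched as a default-deny rule check. This is a simplified illustration, not the rule engine of any particular cloud platform; the `IngressRule` type, the subnet, and the port are assumptions made for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    protocol: str     # "tcp" or "udp"
    port: int
    source_cidr: str  # only traffic from this range is allowed


def is_allowed(rules, protocol, port, source_ip):
    """Default deny: traffic passes only if some rule matches it."""
    src = ip_address(source_ip)
    return any(
        r.protocol == protocol
        and r.port == port
        and src in ip_network(r.source_cidr)
        for r in rules
    )


# Hypothetical database tier: MySQL reachable only from the web tier subnet.
db_rules = [IngressRule("tcp", 3306, "10.1.1.0/24")]
```

With these rules, a web server at 10.1.1.15 can reach the database on port 3306, while the same request from any other subnet, or to any other port, is rejected.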

The provider can use security domains and layer-3 isolation techniques to group various virtual machines. Access to these domains can be controlled using the provider's port filtering capabilities or by more stateful packet filtering, implemented with context switches or firewall appliances. Network isolation techniques such as VLAN tagging and security groups allow such configuration. Various levels of virtual switches can be configured in the cloud to isolate the different networks in the cloud environment.

Services such as NAT, gateway, VPN, port forwarding, IPAM systems, and access control management are used in the cloud to provide networking capabilities and accessibility. Some of these services are explained as follows:

  • NAT: Network address translation can be configured in the environment to allow a virtual machine in a private network to communicate with a machine on another network or on the public Internet. A NAT device modifies the IP address information in the headers of IP packets while they are in transit across a routing device. A machine in a private network cannot have direct access to the public network, so in order for it to communicate with the Internet, its packets are sent to a routing device or a virtual machine with NAT configured that has direct access to the Internet. NAT modifies the IP packet header so that the private IP address of the machine is not visible to the external networks.
  • IPAM System/DHCP: An IP address management system or DHCP server helps with the automatic assignment of IP addresses to the virtual machines according to the configuration of the network and the IP range allocated to it. A virtual machine provisioned in a network can be assigned an IP address specified by the user or one taken from the IPAM. IPAM tracks all the available IP addresses in the network. When a new IP address is to be allocated to a device, it is taken from the available IP pool, and when a device is terminated or releases the IP address, the address is given back to the IPAM system.
  • Identity and access management: An access control list describes the permissions of various users on different resources in the cloud. It is important to define an access control list for users in a multi-tenant environment, because it restricts the actions that a user can perform on any resource in the cloud. A role-based access mechanism is used to assign roles to users' profiles, which describe the roles and permissions of users on different resources.
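
The allocate-and-release cycle described for IPAM can be illustrated with a toy address pool. The `IpPool` class is a hypothetical sketch built on Python's standard `ipaddress` module; a real IPAM or DHCP service also handles lease times, reservations, and persistence.

```python
from ipaddress import ip_network


class IpPool:
    """Toy IPAM: hands out addresses from a subnet and takes them back."""

    def __init__(self, cidr: str):
        # hosts() skips the network and broadcast addresses
        self._free = [str(ip) for ip in ip_network(cidr).hosts()]
        self._leases = {}  # device id -> IP address

    def allocate(self, device_id: str) -> str:
        if device_id in self._leases:
            return self._leases[device_id]
        if not self._free:
            raise RuntimeError("address pool exhausted")
        ip = self._free.pop(0)
        self._leases[device_id] = ip
        return ip

    def release(self, device_id: str) -> None:
        ip = self._leases.pop(device_id, None)
        if ip is not None:
            self._free.append(ip)  # address returns to the available pool
```

Releasing an address when a virtual machine is terminated is what lets a small range serve many short-lived machines over time.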

Use of switches in cloud

A switch is a LAN device that works at the data link layer (layer 2) of the OSI model and acts as a multiport bridge. Switches store a table of MAC addresses and ports. Let us see the various types of switches and their usage in the cloud environment:

  • Layer 3 switches: A layer-3 switch is a special type of switch that operates at layer 3, the Network layer of the OSI model. It is a high performance device that is used for network routing. A layer-3 switch has an IP routing table for lookups and it also forms a broadcast domain. Basically, a layer-3 switch is a switch that has a router's IP routing functionality built in.

    A layer-3 switch is used for routing where better performance than a router is needed. Layer-3 switches are used instead of routers in large networks such as corporate networks. The performance of a layer-3 switch is better than that of a router because of some hardware-level differences. It supports the same routing protocols as network routers do. The layer-3 switch sits above the layer-2 switches and can be used to configure routing and communication between different VLANs or subnets.

  • Layer 4-7 switches: These switches use packet information up to OSI layer 7 and are also known as content switches, web switches, or application switches. They are typically used for load balancing among a group of servers, which can be performed on HTTP, HTTPS, VPN, or any TCP/IP traffic using a specific port. In the cloud, they allow policy-based switching, for example to limit the amount of traffic on specific end-user switch ports or to prioritize the traffic of specific applications. These switches also make forwarding decisions, as NAT services do, and manage the state of individual sessions from beginning to end, thus acting like firewalls. In addition, they balance traffic across a cluster of servers according to the information and status of individual sessions. Hence these switches are placed above layer-3 switches or above a cluster of servers in the environment. They forward packets according to their configuration, for example by sending a packet to the server that is supposed to handle the request. This forwarding is generally based on the current server loads or on sticky bits that bind a session to a particular server.
  • Layer-3 traffic isolation: This provides traffic isolation across layer-3 devices and is referred to as Virtual Routing and Forwarding (VRF). It virtualizes the routing table in a layer-3 switch, producing a set of virtualized routing tables, each with a unique set of forwarding entries. Whenever traffic enters, it is forwarded using the routing table associated with the same VRF. This enables logical isolation of traffic as it crosses a common physical network infrastructure. VRFs provide access control, path isolation, and shared services. Security groups are also an example of layer-3 isolation capabilities; they restrict the traffic to the guests based on defined rules. The rules are defined based on the port, protocol, and source/destination of the traffic.
  • Virtual switches: A virtual switch is a software program that allows one guest VM to communicate with another and is similar to the Ethernet switch explained earlier. Virtual switches provide a bridge between the virtual NICs of the guest VMs and the physical NIC of the host. A virtual switch consists of port groups at one end and an uplink at the other. The port groups are connected to the virtual machines, and may or may not be connected to different subnets, while the uplink is mapped to the physical NIC of the host. There are various types of virtual switches used with various virtualization technologies, such as VMware vSwitch, Xen, or Open vSwitch. VMware also provides a distributed virtual switch that spans multiple hosts. The virtual switch runs over the hypervisor layer on the host.
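
The forwarding behavior described for layer 4-7 switches, where a server is chosen by current load and the session is then bound to it, can be sketched as follows. The `StickyBalancer` class and the server names are hypothetical; real content switches track far more session state.

```python
class StickyBalancer:
    """Picks the least-loaded server, then pins the session to it."""

    def __init__(self, servers):
        self.load = {s: 0 for s in servers}  # server -> active sessions
        self.sessions = {}                   # session id -> server

    def route(self, session_id: str) -> str:
        if session_id in self.sessions:      # sticky: reuse the earlier choice
            return self.sessions[session_id]
        server = min(self.load, key=self.load.get)
        self.load[server] += 1
        self.sessions[session_id] = server
        return server

    def close(self, session_id: str) -> None:
        server = self.sessions.pop(session_id, None)
        if server is not None:
            self.load[server] -= 1
```

Every request that carries the same session identifier reaches the same server until the session is closed, while new sessions go to whichever server currently has the fewest.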
Management layer

    The Management layer in a cloud computing space provides management capabilities to manage the cloud setup.

    It provides features and functions such as reporting, configuration for the automation of tasks, configuration of parameters for the cloud setup, patching, and monitoring of the cloud components.


    The cloud is a highly automated environment and all tasks such as provisioning the virtual machine, allocation of resources, networking, and security are done in a self-service mode through automated systems.

    The automation layer in cloud management software is typically exposed through APIs. The APIs allow the creation of SDKs, scripts, and user interfaces.


    The Orchestration layer is the most critical interface between the IT organization and its infrastructure, and helps in the integration of the various pieces of software in the cloud computing platform.

    Orchestration is used to join together various individual tasks which are executed in a specified sequence with exception handling features. Thus a provisioning task for a virtual machine may involve various commands or scripts to be executed. The orchestration engine binds these individual tasks together and creates a provisioning workflow which may involve provisioning a virtual machine, adding it to your DNS, assigning IP Addresses, adding entries in your firewall and load balancer, and so on.

    The orchestration engine acts as an integration engine and also provides the capabilities to run an automated workflow through various subsystems. As an example, the service request to provision cloud resources may be sent to an orchestration engine which then talks to the cloud capacity layer to determine the best host or cluster where the workload can be provisioned. As a next step, the orchestration engine chooses the component to call to provision the resources.

    The orchestration platform helps in easy creation of complex workflows and also provides ease of management, since all integrations are handled by a specialized orchestration engine, which keeps the subsystems loosely coupled.

    The orchestration engine is executed in the cloud system as an asynchronous job scheduler which orchestrates the service APIs to fulfill and execute a process.
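
A provisioning workflow with exception handling, as described above, can be sketched as an ordered list of steps, each paired with an action that undoes it. The `run_workflow` function and the task functions are hypothetical stand-ins for real subsystem calls, and the firewall step fails on purpose to show the rollback path.

```python
def run_workflow(steps, context):
    """Run (do, undo) step pairs in order; undo completed steps on failure."""
    done = []
    try:
        for do, undo in steps:
            do(context)
            done.append(undo)
    except Exception as exc:
        for undo in reversed(done):
            undo(context)
        context["error"] = str(exc)
        return False
    return True


# Hypothetical task functions standing in for real subsystem calls.
def create_vm(ctx): ctx["log"].append("vm created")
def delete_vm(ctx): ctx["log"].append("vm deleted")
def assign_ip(ctx): ctx["log"].append("ip assigned")
def release_ip(ctx): ctx["log"].append("ip released")
def add_firewall_rule(ctx): raise RuntimeError("firewall unreachable")
def remove_firewall_rule(ctx): ctx["log"].append("rule removed")


provision_vm = [
    (create_vm, delete_vm),
    (assign_ip, release_ip),
    (add_firewall_rule, remove_firewall_rule),
]
```

Because the steps are plain data, the same engine can run any workflow, which is one simple way to keep the individual subsystems loosely coupled.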

    Task Execution

    The Task execution layer is at the lower level of the management operations that are performed using the command line or any other interface. The implementation of this layer can vary as per the platform on which the execution takes place. This layer is triggered by the layers above it in the management layer.

    Service Management

    The Service Management layer helps in compliance, provides the means to implement automation, and adopts IT service management best practices, such as the IT Infrastructure Library (ITIL), as per the policies of the organization. This is used to build processes for different types of incident resolution and also to provide change management.

    The self-service capability in a cloud environment provides users with a self-service catalog consisting of various service options from which the user can request and provision resources from the cloud. The service layer can comprise various levels of services, from basic provisioning of virtual machines with some predefined templates and configurations to advanced provisioning of servers with additional configuration options.

Understanding CloudStack modules

Having gone through the various components in the cloud setup, let's look at the CloudStack platform that offers a complete platform to set up a Private or Hybrid cloud for enterprises.

CloudStack is a platform for IT infrastructure as a service that allows you to pool computing resources. These pooled resources can be used to build public, private, and hybrid IaaS cloud services that provide IT infrastructure such as compute nodes (hosts), networks, and storage as a service to the end users on demand.

In this section we are going to discuss the underlying architecture of CloudStack that makes it possible to set up on demand computing services which can be leveraged as a public, private, or hybrid cloud and makes requesting, provisioning, configuring, and managing the IT infrastructure possible.

The architecture of CloudStack follows a hierarchical structure, which can be deployed in order to manage thousands of physical servers from a single management interface. Users can either define a basic networking zone or configure an advanced networking zone with the desired networking features and capabilities. CloudStack also allows integration with the AWS public cloud by using the APIs of some of its services.

The basic CloudStack deployment, commonly known as dev-cloud, consists of a single management server, with or without CloudDB on the same machine, that can be used to manage a group of hypervisor host(s).

The management server controls or collaborates with the hypervisor layers on the physical hosts over the management network and thus controls the IT infrastructure.

Let's look at the architecture of private cloud infrastructure that the CloudStack management platform will manage. The physical servers in the various datacenters that are to be managed using CloudStack platform are segregated into various levels of logical partitions that help in managing them easily. Partitioning of the cloud into various zones, pods, and clusters makes them manageable. We will first look at the architecture, how we will set up the private cloud, and then focus on how CloudStack controls the underlying cloud infrastructure.

Cloud deployment model

The CloudStack deployment model covers the basic components through which CloudStack provides all its functionality; it also covers the logical segregation of resources that makes them easier to manage.

Zone

Zones are the highest level of hierarchy in which the datacenter of an organization is logically partitioned to provide physical isolation and redundancy. The zone may be a complete datacenter that has a geographic location with its own power supply and network uplinks. The zone can also be distributed across various geographic locations. Typically a datacenter contains a single zone but it can also contain multiple zones. The logical division into zones enables the instances and the data stores at a specified location to comply with an organization's data storage policies and other compliance policies.

A zone is divided into various other logical units. A zone comprises the following:

  • One or more pods. A pod is the second level of the hierarchy and consists of one or more clusters. We will discuss pods in detail next.
  • Secondary storage, which is shared by all the pods in the zone.

End users can select a zone while requesting a guest VM.

A zone can be associated with domains; a zone can be public or private. A public zone is visible to all the users whereas private zones can be reserved for some specific domain which means that the users that belong to that domain or its subdomain have the permissions to create guest VMs in that private zone.
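
The visibility rule for public and private zones can be expressed as a small check. Representing a domain as a slash-separated path such as `ROOT/eng/qa` is an assumption made for this sketch, as is the `can_use_zone` function itself.

```python
from typing import Optional


def can_use_zone(zone_domain: Optional[str], user_domain: str) -> bool:
    """zone_domain None means a public zone; otherwise a domain path."""
    if zone_domain is None:
        return True
    # The user's domain must be the zone's domain or one of its subdomains.
    return user_domain == zone_domain or user_domain.startswith(zone_domain + "/")
```

Appending the separator before the prefix test matters: without it, a user in `ROOT/engineering` would wrongly match a zone reserved for `ROOT/eng`.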

The pod(s) in a zone consist of one or more clusters, which in turn contain host(s).

The hosts in a zone can access and be accessed by other hosts in that zone if there is no restriction from any firewall and a flat network is defined. If a host in a zone tries to access hosts in other zones, however, the traffic has to go through VPN tunnels that are configured with firewall rules. Of all the traffic, only key management traffic is allowed between the hosts in different clusters.

The zones must be configured by the cloud administrator with the number of pods in the zone, the number of clusters in a pod, and the number of hosts in each of the clusters. The clusters also contain primary storage servers, so the cloud administrator must also decide upon the number and the capacity of the primary storage servers to be placed in each of the clusters. As a minimum configuration, each zone must have at least one pod, each pod must have at least one cluster with at least one host, and each cluster must have at least one primary storage. The capacity of the secondary storage that is used by all the hosts in all the pods of a zone is also to be configured by the administrator.
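
The hierarchy and its minimum configuration can be modeled with a few nested records. The class and field names below are illustrative only and are not CloudStack's own data model; the `validate` function simply checks the minimums listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    name: str
    hosts: List[str] = field(default_factory=list)
    primary_storage: List[str] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    pods: List[Pod] = field(default_factory=list)
    secondary_storage: List[str] = field(default_factory=list)


def validate(zone: Zone) -> List[str]:
    """Return a list of problems; an empty list means the minimum is met."""
    problems = []
    if not zone.pods:
        problems.append(f"zone {zone.name}: needs at least one pod")
    for pod in zone.pods:
        if not pod.clusters:
            problems.append(f"pod {pod.name}: needs at least one cluster")
        for c in pod.clusters:
            if not c.hosts:
                problems.append(f"cluster {c.name}: needs at least one host")
            if not c.primary_storage:
                problems.append(f"cluster {c.name}: needs primary storage")
    return problems
```

Returning every problem at once, rather than stopping at the first, suits an administrator who is planning a whole zone layout before building it.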

Once these components are configured they are consumed by the management server to provide cloud services to the customers. A zone is hypervisor specific: a zone consists only of hosts with the same hypervisor. Hosts with different hypervisors are added to different zones.

Pod

A pod is the second level of hierarchy in the deployment of CloudStack. A zone is further logically divided into one or more entities known as pods. A pod can be thought of as a physical rack in a datacenter in which all the hosts reside in the same network subnet. The pod defines the management subnet of the system VMs (discussed later); this network is used for communication with the CloudStack management server. A pod contains one or more clusters and a cluster in turn contains one or more hosts. There are one or more primary storage servers in a pod. A pod is invisible to users, who don't know which pod their machine is being provisioned into. The logical division into pods is for administrative purposes only. As with the zone, all the hosts in a pod have the same hypervisor type, as defined during the creation of the zone.

Cluster

A cluster is the third level of hierarchical division in the deployment of CloudStack, inside pods. The hosts are grouped into logical groups called clusters. A cluster can be a XenServer server pool, a VMware cluster that is preconfigured in vCenter, or a set of KVM servers.

Hosts within a cluster are accessible to each other, and guests residing on a host in a cluster can also be "live migrated" to another host in the same cluster using a shared storage device. The live migration occurs without any interruption to the users and thus provides a high-availability option. In order to support live migration of VMs between hosts in a cluster, those hosts should have similar properties such as using the same hypervisor version and the same hardware. They should also have access to the same primary storage servers that are shared among the hosts in a cluster. A cluster contains at least one primary storage server.
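
The conditions for live migration listed above can be collected into one eligibility check. This is a sketch under stated assumptions: the `Host` record is hypothetical, and the CPU model stands in for "the same hardware", which in practice is a more detailed comparison made by the hypervisor.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Host:
    name: str
    cluster: str
    hypervisor: str            # e.g. "KVM"
    hypervisor_version: str
    cpu_model: str             # stand-in for "the same hardware"
    storage_pools: FrozenSet[str]


def can_live_migrate(src: Host, dst: Host, vm_pool: str) -> bool:
    """True if a VM whose disks live on vm_pool may move from src to dst."""
    return (
        src.name != dst.name
        and src.cluster == dst.cluster
        and (src.hypervisor, src.hypervisor_version)
        == (dst.hypervisor, dst.hypervisor_version)
        and src.cpu_model == dst.cpu_model
        and vm_pool in src.storage_pools
        and vm_pool in dst.storage_pools
    )
```

The shared storage test is what makes the move fast: only the running state of the virtual machine travels between hosts, while its disks stay where they are.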


In this section we will discuss the various storage options available in the CloudStack system.

Primary storage

A cluster contains at least one primary storage server that is shared among all the hosts in that cluster. This storage server is one of the critical components of the cluster and is used to host the guest virtual machine instances. The primary storage is unique to each of the clusters inside the pods.

The primary storage can be a SAN such as iSCSI, NAS such as NFS or CIFS, or it can be a DAS such as VMFS or EXT file systems. The primary storage should be supported by the underlying hypervisor. Building primary storage on high performance hardware with multiple high speed disks increases the performance. The primary storage should also be reachable from all the hosts in the cluster. This storage hosts the volumes of the guest virtual machines in the cluster, and the allocation of guest virtual disks to the primary storage is managed by CloudStack.

The primary storage server is basically a machine with a large quantity of disk space, the capacity of which is dependent on the users' need. Primary storage can be a shared storage server or local storage, and hosts the storage for guest virtual machines.

The primary storage pool can be created using any of the technologies mentioned earlier for any given cluster. If it is created using iSCSI LUN or an NFS share, the CloudStack management server asks each of the hosts' hypervisors in the cluster to mount the NFS share or the LUN. The hypervisor then communicates with the primary storage pool as it is presented as a datastore in case of VMware, storage repository in case of a XenServer or a mount point in case of KVM.

It is recommended that the hosts have a dedicated management interface for communication with the primary storage pool. The mechanism for giving the host an additional interface for primary storage traffic is to create a management interface. The management interface of the hosts and the primary storage interfaces must be in the same CIDR.

The primary storage pool provides storage for the root volumes and the disk space of the guest VMs. When a guest virtual machine is created, its root volume is automatically created from this primary storage. When the VM is deleted, the data volumes associated with it are disabled, and the VM can still be recovered afterwards. Deleting the data when a virtual machine is deleted provides security in a multi-tenant environment by ensuring that no data is shared with or made available to a new guest.

These storage servers provide volume storage to the running virtual machines. The volumes can be the root device or other data volumes attached to the machine. These extra volumes can be attached to the virtual machine dynamically. With the Oracle VM hypervisor, however, data volumes can be attached to the guest only when the machine is in the stopped state. After a VM instance is destroyed, it can no longer be started, but it is still in the database and its volume is not yet deleted, so the destroyed VM can be recovered. After the destroyed state comes the expunged state, which signifies permanent deletion of the volume. The time before expunging is set to 86400 seconds by default, but this can be changed. The administrator sets up and configures iSCSI on the host not only the first time but also when the host recovers from a failure: whenever there is a host failure, the administrator has to set up and configure the iSCSI LUNs on the host again.
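
The destroyed and expunged states can be pictured as a small state machine driven by a clock. The `Vm` class is a simplified sketch and not CloudStack code; the state names other than destroyed and expunged are assumptions, and the delay uses the 86400-second default mentioned above.

```python
EXPUNGE_DELAY = 86400  # seconds; the default mentioned above


class Vm:
    def __init__(self):
        self.state = "Running"
        self.destroyed_at = None

    def destroy(self, now: float) -> None:
        self.state = "Destroyed"
        self.destroyed_at = now

    def recover(self, now: float) -> bool:
        """Recovery works only before the expunge delay has elapsed."""
        self.expunge_if_due(now)
        if self.state != "Destroyed":
            return False
        self.state = "Stopped"
        self.destroyed_at = None
        return True

    def expunge_if_due(self, now: float) -> None:
        if self.state == "Destroyed" and now - self.destroyed_at >= EXPUNGE_DELAY:
            self.state = "Expunged"  # volumes are permanently deleted here
```

A virtual machine destroyed at time 0 can be recovered an hour later, but once a full day has passed it moves to the expunged state and recovery is refused.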

In case of XenServer, it uses clustered logical volume manager to store the VM images on iSCSI and fiber channel volumes. In this case CloudStack can support over provisioning only if the storage server itself allows that, otherwise over-provisioning is not supported. In case of KVM, shared mount point storage is supported but the mount path must be same across all the hosts in the cluster. The administrator must ensure that the storage is available otherwise CloudStack will not attempt to mount or unmount the storage.

With NFS storage, CloudStack manages the over-provisioning of storage independently of the hypervisor.
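The arithmetic behind over-provisioning is simple: the capacity that may be promised to volumes is the physical capacity multiplied by a factor. The sketch below assumes a single factor per pool; the function names and the `overprovisioning_factor` parameter are hypothetical and chosen only for illustration.

```python
def allocatable_bytes(physical_bytes: int, overprovisioning_factor: float) -> int:
    """Capacity that may be promised to volumes on a thin-provisioned pool."""
    if overprovisioning_factor < 1:
        raise ValueError("factor must be at least 1")
    return int(physical_bytes * overprovisioning_factor)


def can_allocate(physical_bytes: int, allocated_bytes: int,
                 requested_bytes: int, overprovisioning_factor: float) -> bool:
    # A new volume fits if the total promised space stays within the scaled limit.
    limit = allocatable_bytes(physical_bytes, overprovisioning_factor)
    return allocated_bytes + requested_bytes <= limit
```

With a factor of 2, a 100 GiB pool can have up to 200 GiB of volumes allocated on it; with a factor of 1, allocation stops at the physical size.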

The primary storage can also be a pool of local storage in case of vSphere, XenServer, OVM, and KVM. CloudStack supports multiple primary storage pools in a cluster. It also supports dynamic addition of storage servers as your requirements increase.

As primary storage is a critical component, its capacity must be monitored by the administrator, and additional storage space must be attached to it when needed. This can be done by creating a storage pool that is associated with the cluster. When the primary storage is iSCSI, capacity can be added by adding another iSCSI LUN; when it is NFS, another NFS server can be added once the first one reaches its size limit. Thus, CloudStack supports multiple storage pools in a cluster, and a single SAN or storage server can also provide primary storage to multiple clusters.


Volumes are the basic unit of storage in CloudStack. All the storage space that is provided to a guest instance is provided in the form of volumes. These volumes are created from the primary storage servers described above.

The additional storage space and the storage space for the root disk drives of the VMs are provided by volumes. These volumes depend on the type of hypervisor, because the disk image format differs between hypervisors. A volume that has been created for a guest on one hypervisor type therefore cannot be attached to a guest VM that uses another hypervisor type.

The root disk of a guest VM is a volume from the primary storage that contains all the files for booting the OS, and additional volumes provide storage for data. Multiple additional volumes can be mounted to a guest VM. Users are provided with multiple disk offerings (discussed later), pre-created by the administrator, from which they can select to create different types of volumes. In addition, a volume can be used to create templates. A volume can be detached from one instance and attached to another, but both must use the same hypervisor type.
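The hypervisor-type rule for attaching volumes can be expressed as a simple check. The following is a minimal sketch, assuming hypothetical `Volume` and `Instance` classes; it is not how CloudStack models these objects internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Volume:
    name: str
    hypervisor: str  # e.g. "KVM" or "XenServer"; values are illustrative


@dataclass
class Instance:
    name: str
    hypervisor: str
    volumes: List[Volume] = field(default_factory=list)


def attach(volume: Volume, instance: Instance) -> None:
    # Disk image formats differ per hypervisor, so a mismatch is rejected.
    if volume.hypervisor != instance.hypervisor:
        raise ValueError(
            f"{volume.name} ({volume.hypervisor}) cannot be attached to "
            f"{instance.name} ({instance.hypervisor})"
        )
    instance.volumes.append(volume)


def detach(volume: Volume, instance: Instance) -> None:
    # Detaching leaves the volume intact so it can be attached elsewhere.
    instance.volumes.remove(volume)
```

A data volume created for a KVM guest can move between KVM guests with `detach` followed by `attach`, while an attempt to attach it to a XenServer guest raises an error.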

The volumes are provided to the virtual machines from the primary storage. CloudStack doesn't provide the functionality of backing up the primary storage but it does provide the functionality for backing up of individual volumes in primary storage using snapshots.

Secondary storage

This storage space is used to store the templates, ISO images, and snapshots that can be used to deploy IT infrastructure using CloudStack. Every zone has at least one secondary storage server, and the secondary storage is shared by all the pods in that zone.

CloudStack also provides the functionality to automatically replicate the secondary storage across zones so that one can implement a disaster recovery solution by backing up the application across zones, allowing easy recovery from a zone failure. Thus, CloudStack provides a common storage solution across the cloud. Unlike primary storage, secondary storage only uses Network File System (NFS) as it is to be accessed by all the hosts in the clusters across the zones irrespective of the hypervisors on the hosts.

The secondary storage is used to store templates that are basically OS images that are used to boot VMs with some more additional configuration information and installed applications.

Secondary storage also stores ISO images, which are disk images used for booting operating systems, and disk volume snapshots, which are used for backing up the data of VMs for data recovery or for creating new templates. These items are available to all the hosts in one zone.

A secondary storage device can be OpenStack Object Storage (Swift) with at least 100 GB of space, an NFS storage appliance, or a Linux NFS server. The secondary storage and the VMs that it serves must be located in the same zone, and the storage should be accessible by all the hosts in that zone. The scalability of a zone depends on the capacity of the secondary storage device. A device with a high read:write ratio, larger disk drives, and lower IOPS than primary storage is best suited for secondary storage.

The administrator can replace the secondary storage server with a new one after implementation; to achieve this, all the files need to be copied from the old server to the new one.

CloudStack management server

The CloudStack management server helps us to manage the IT infrastructure as defined in the previous sections. The CloudStack management server provides a single point of configuration for the cloud. Basic functionalities of the management server include:

  • It provides the web user interface for both administrators and users
  • It provides the CloudStack APIs
  • It manages the assignment of guest VMs to particular hosts
  • It assigns public and private IP addresses to particular accounts
  • It allocates storage to guests as virtual disks
  • It manages snapshots, templates, and ISO images, and their replication across multiple datacenters
  • It acts as a single point of configuration for the cloud

The management server provides an easy-to-use web user interface that administrators and users can leverage to manage and request IT infrastructure on demand. It also provides the users and administrators with the CloudStack APIs. The management server can be configured to be highly available to prevent a single point of failure. This is achieved by deploying it in a multi-node configuration and placing it behind a load balancer so that requests can be served by any of the nodes acting as the management server. High availability can be ensured for the CloudDB by setting up a MySQL cluster. Setting up CloudStack in this fashion, with a highly available management server and a highly available CloudDB, leaves no single point of failure.

If the management server does suffer an outage, the operation of the running guest VMs is not impacted. However, new VMs cannot be created, existing VMs cannot be managed through the management server, and all the interfaces, along with dynamic load balancing and HA, stop working. The CloudStack management server comprises various parts that handle the functionalities of the cloud.

The basic units of CloudStack management server are:

  • Interface: The User Interface and Application Program interface. The management server provides different types of interfaces for both administrators and users. The user interface is the Management server console which the admins and end users use for configuring, requesting and provisioning IT infrastructure. The API is used for programmatic access to the server's functionalities.
  • Business Logic: This part takes care of the business logic and sits below the interfaces. When a user or admin requests any kind of operation, it is processed through the business logic and then passed down to the orchestration engine to perform the operation. For example, if a user requests a VM using the API or UI, the request is passed through the business logic section to process the necessary information such as the host the VM is to be deployed to, the workflow process, the required user authentication to perform that operation, the availability and/or applicability of the required network configuration, and so on. The request is then passed down to the orchestration engine that performs the operation to create that virtual machine.
  • Orchestration Engine: The orchestration engine is critical to the CloudStack deployment and is used for configuring, provisioning, and scheduling operations. After the request or command is processed by the business logic, it is passed to the orchestration engine for processing. The orchestration engine is responsible for performing that action, for example provisioning and configuring virtual machines, allocating storage, and so on. The orchestration engine also helps in scheduling operations.
  • Controllers: The controllers are the basic underlying parts of the CloudStack management server that talk to the underlying hypervisor or hardware for compute, network, and storage resources. These controllers, also known as "Gurus" or "Providers", help in provisioning the resources that are requested by the admin or the user and are used to build the guest virtual machine as per the request of the user.

On a more granular level, the basic functions of the management server are divided among the server's various modules, which are described as follows.

API layer

The API layer is the top-most layer of the CloudStack management server and is where the management server listens for requests. An API call is basically a call to one of the functional components of CloudStack, and the API layer passes the call on to the component concerned. The calls can belong to the OAM&P API, the EC2 API, the End User API, or any other pluggable service API engine. The translation of third-party APIs such as Amazon Web Services is done using CloudBridge (discussed later). These API calls are fired as per the request by the users or the admin for the execution of some task; the CloudStack management server is responsible for executing the given task as per the request.

Access control

The access control component of the management server is responsible for the access control and authentication of the users requesting services. This layer is the second layer, just beneath the API layer, which cross-checks the authorization of the users requesting the action. The user must authenticate, and the access control component maps the users to the domain, project, and other groups to which the user belongs. The request action should always have an authentication token that authorizes the user for the action and also specifies the permissions, which indicate whether he has the rights to perform the action that he is requesting. This is also recorded in the logs of the actions performed.
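The check performed by the access control layer can be sketched as a token lookup followed by a permission test, with every decision written to a log. Everything in this example is an assumption made for illustration: the `User` class, the token table, and the permission names are hypothetical and do not reflect CloudStack's real data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

AuditEntry = Tuple[Optional[str], str, bool]  # (user name, action, allowed)


@dataclass
class User:
    name: str
    domain: str
    projects: Set[str] = field(default_factory=set)
    permissions: Set[str] = field(default_factory=set)


def authorize(tokens: Dict[str, User], token: str, action: str,
              audit_log: List[AuditEntry]) -> bool:
    """Resolve the token to a user, check the permission, and log the outcome."""
    user = tokens.get(token)
    # An unknown token fails authentication; a known user may still lack the right.
    allowed = user is not None and action in user.permissions
    audit_log.append((user.name if user else None, action, allowed))
    return allowed
```

A request is passed on to the next layer only when `authorize` returns `True`; denied requests still leave an entry in the log.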


The kernel is made up of several different components, each performing its own tasks. It is the central component for distributing, integrating, and handling the tasks and operations going in and out. The kernel distributes the tasks among the various other components of CloudStack, drives long-running VM operations, performs synchronization between the resources and the database, and generates events that are caught by different components, which then act on them.

  • Virtual Machine Manager: The virtual machine manager is responsible for the management of the virtual machines in the CloudStack environment. All the virtual machine requests, requirements, and states are handled by the virtual machine manager. It is responsible for the management of the resources allocated to the virtual machines in the CloudStack environment. It also manages the live migration, and other actions that are to be performed on the virtual machines such as start, stop, delete, assign IP addresses, and so on. In addition, it ensures that resources are allocated to the virtual machines, as per their needs or specifications.
  • Storage Manager: This component of the management server is responsible for the management of storage space attached to the CloudStack as resource. It creates, allocates, or deletes the storage volumes or space as per the end users' request. The storage manager is responsible for all the actions that are concerned with the storage from the users or virtual machines. The storage manager evaluates requests from users and performs the specific action to the storage resources or generates an error if necessary.
  • Network Manager: Network manager handles all the networking of the virtual machines in the CloudStack environment. The network manager is responsible for managing the network configurations of the virtual machines and any other resources in the environment. It has functions such as IP address management, load balancing, firewall configuration, and others that are performed as per the user's request. These configuration operations are predefined in the services that the user chooses. The users are unaware of the operations that are being performed in the back end by such managers.
  • Snapshot Manager: The snapshot manager is the component responsible for managing the snapshots of the virtual machines or any other resources in the environment. When a user requests the creation of a snapshot or any other operation based on snapshots, such as creating a virtual machine from a snapshot, this component takes care of the request. Snapshots are used for backups and restoration. They are taken on the primary storage and then moved onto the secondary storage. There are basically two types of snapshot: incremental snapshots, where only the data modified since the last snapshot is captured, and full snapshots, where a full snapshot is taken every time.
  • Async Job Manager: The jobs that take place in the CloudStack can be synchronous or asynchronous. This component manages the asynchronous jobs that are requested and are to be performed. Commands are usually executed asynchronously; the manager schedules the jobs as per the priority.
  • Template Manager: The template manager is responsible for handling templates and their operations. Whenever there is a request for creating a template, creating a VM from a given template, deleting a template, or a similar task, the template manager is notified and handles all the operations pertaining to it.
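The difference between the full and incremental snapshots handled by the snapshot manager can be illustrated with a toy block map. This sketch models a disk as a dictionary of block contents, which is an assumption made purely for the example; it ignores deleted blocks and does not describe the on-disk format of any hypervisor.

```python
from typing import Dict

Blocks = Dict[int, bytes]  # block index -> block contents


def full_snapshot(disk: Blocks) -> Blocks:
    # A full snapshot copies every block, regardless of what changed.
    return dict(disk)


def incremental_snapshot(disk: Blocks, previous: Blocks) -> Blocks:
    # An incremental snapshot keeps only blocks that differ from the previous state.
    return {i: data for i, data in disk.items() if previous.get(i) != data}


def restore(base: Blocks, *increments: Blocks) -> Blocks:
    # Replaying the increments over the base rebuilds the latest disk state.
    disk = dict(base)
    for inc in increments:
        disk.update(inc)
    return disk
```

The incremental snapshot is smaller whenever only part of the disk changed, at the cost of needing the base and every increment to restore.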

Below all these components lie some more core components that make the end-to-end interaction in CloudStack possible. Some of these are discussed as follows:

  • Agent Manager: Agents are a critical resource in the cloud architecture. Agents are deployed on all the resources that are managed in the CloudStack environment. They provide a communication channel between the management server and the resources. They provide information, as well as assist in performing operations on the resources.
  • Resource Layer: The resource layer is the layer that provides fuel to our engine, that is, the resources. The resources can be of many kinds, such as XenServer resources, KVM resources, vSphere resources, F5 resources, and so on.

The CloudStack management server is the core component that is responsible for managing the actions that make the whole deployment a success. Let's take a close look into the process flow within the CloudStack—when a new request comes in, how is it fulfilled?

CloudStack operations

The user is presented with a number of options with respect to the interface through which he or she can submit his/her request or demand. There are user interfaces such as CloudPortal, Command Line Interface, API calls, or any other clients. The user submits the request using such a console.

When a request is submitted by the user, that request is authenticated and the access rights are checked to confirm the user's right to perform the specified request. If the user's authentication fails, the access control layer of the CloudStack management server denies further processing of the request. If the user's request is successfully authenticated, the request is passed by the Access layer to the kernel.

Security check

After all the security checks are passed, the request is passed down to the kernel and the kernel distributes the tasks to its different components for execution one by one. Let's take an example of a user requesting a new virtual machine service with some software packages installed on it. After passing through the security checks, the request is passed to the virtual machine manager.

The virtual machine manager

The virtual machine manager is responsible for the deployment plan for provisioning the virtual machine, which covers the host on which the machine is to be created, the cluster to which that host belongs, and the pod and zone of the cluster. The virtual machine manager then starts creating the virtual machine by allocating resources from the host to it.

The virtual machine manager initiates the creation of NICs that have to be attached to the virtual machine and assigns this task to the network manager. The network manager takes care of the preparation of NICs and the assignment of IP addresses that are to be attached to the virtual machine that is being created.

The virtual machine manager also triggers the storage allocation to the virtual machine, and requests the storage manager to create volume specified. This volume is then attached to the new VM.

If the virtual machine is to be created using a template, then the template manager is contacted for the details and other resources such as the OS, packages, and other resources to be associated with the request.
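The sequence coordinated by the virtual machine manager can be sketched as an ordered list of steps. The classes and the `deploy` function below are hypothetical; they record which manager would be involved at each step rather than performing any real provisioning.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DeploymentPlan:
    zone: str
    pod: str
    cluster: str
    host: str


@dataclass
class VmSpec:
    name: str
    disk_gb: int
    template: Optional[str] = None


@dataclass
class DeployedVm:
    spec: VmSpec
    plan: DeploymentPlan
    steps: List[str] = field(default_factory=list)


def deploy(spec: VmSpec, plan: DeploymentPlan) -> DeployedVm:
    vm = DeployedVm(spec, plan)
    # 1. Resources are allocated from the host chosen in the deployment plan.
    vm.steps.append(f"vm manager: allocate resources on {plan.host}")
    # 2. The network manager prepares the NIC and assigns an IP address.
    vm.steps.append("network manager: prepare NIC and IP address")
    # 3. The storage manager creates the requested volume for the VM.
    vm.steps.append(f"storage manager: create {spec.disk_gb} GB volume")
    # 4. The template manager is consulted only for template-based VMs.
    if spec.template is not None:
        vm.steps.append(f"template manager: resolve {spec.template}")
    return vm
```

Keeping the plan (zone, pod, cluster, host) separate from the VM specification mirrors the text: the plan says where the VM goes, and the managers decide how it is built.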

Server resources

After all the resources are allocated, the request is passed on to the deployment planner and the server resources. The server resource helps in translation of the CloudStack commands to the resources' APIs. The resource APIs perform tasks as per the instructions and create the virtual machine.
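The translation role of the server resource resembles the adapter pattern: one generic command, one implementation per hypervisor. The sketch below is an illustration under that assumption; the class names are hypothetical and the returned strings are placeholders for real hypervisor API calls.

```python
from abc import ABC, abstractmethod
from typing import Dict


class ServerResource(ABC):
    """Translates a generic command into a hypervisor-specific call."""

    @abstractmethod
    def execute(self, command: str, vm_name: str) -> str:
        ...


class KvmResource(ServerResource):
    def execute(self, command: str, vm_name: str) -> str:
        # Placeholder result; a real resource would call the hypervisor's API.
        return f"kvm:{command}:{vm_name}"


class XenServerResource(ServerResource):
    def execute(self, command: str, vm_name: str) -> str:
        # Placeholder result; a real resource would call the hypervisor's API.
        return f"xenserver:{command}:{vm_name}"


RESOURCES: Dict[str, ServerResource] = {
    "KVM": KvmResource(),
    "XenServer": XenServerResource(),
}


def dispatch(hypervisor: str, command: str, vm_name: str) -> str:
    # The caller issues the same command no matter which hypervisor runs the host.
    return RESOURCES[hypervisor].execute(command, vm_name)
```

With this shape, supporting another hypervisor means adding one more `ServerResource` subclass without touching the code that issues commands.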


Once the virtual machine is created, the OS and the software packages are installed on it. The software packages also include the agent that is to be deployed inside the virtual machine, which helps the virtual machine and the CloudStack management server to communicate with each other.

Job result

After the request is provisioned, the job status is reported back to the user. The user gets the details of the virtual machine that has been newly created upon his request. He can now log in to the virtual machine and use it as he wishes. The job results and the properties of the virtual machine are stored in the databases. The logs of the process are stored for reference and the database records for the resource are updated.


The CloudDB is the primary MySQL database that is used by management server in the CloudStack deployment for storing all the configuration information. The CloudDB can be installed on the same server or on a different server. The CloudDB can also be set up as a MySQL cluster for high availability. The CloudStack management server communicates with the master database and the replicas are in sync with the master continuously. The administrator must configure the MySQL database or CloudDB with a username and password for security.

The administrator can also set up MySQL replication to allow manual failover in case of a database failure.

CloudDB is a critical component for the working of CloudStack as it contains information about the offerings, host profiles, account credentials, configuration information, network information, and so on. The database is accessed for information on an on-demand basis, which shows the criticality of the management server and database and the need to configure them properly.

CloudStack networking architecture

There are two main types of network configurations that can be deployed using CloudStack: the basic and advanced types.

The basic type is similar to AWS-style layer 3 isolation using security groups. The security groups ensure isolation and control the egress and ingress traffic. The IPAM system manages the IP addresses, and tenants don't usually get a contiguous range of IP addresses or a subnet; the assignment of IP addresses to tenants from the IPAM system is basically random. This configuration can scale to millions of virtual machines on thousands of hosts.
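How a security group filters ingress traffic can be sketched as a rule match on protocol, port range, and source CIDR. The `IngressRule` class and the default-deny behavior in `ingress_allowed` are assumptions made for this illustration; the real rule model may differ.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class IngressRule:
    protocol: str    # e.g. "tcp" or "udp"
    port_start: int
    port_end: int
    cidr: str        # allowed source range, e.g. "10.0.0.0/8"


def ingress_allowed(rules: Iterable[IngressRule], protocol: str,
                    port: int, source_ip: str) -> bool:
    # Assumption: traffic is dropped unless at least one rule matches.
    src = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (rule.protocol == protocol
                and rule.port_start <= port <= rule.port_end
                and src in ipaddress.ip_network(rule.cidr)):
            return True
    return False
```

Because isolation is expressed per rule rather than per subnet, tenants can share one flat network and still be kept apart.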

The advanced type offers full layer 3 subnets where security and isolation are provided using VLANs, Stateless Transport Tunneling (STT), and Generic Routing Encapsulation (GRE). Other features such as NAT, VPN, and so on can also be configured.

Network service providers

CloudStack network services are made possible with the help of a network service provider, which is basically a network element, either hardware or a virtual appliance. It can be a Cisco or Juniper device that provides firewall services in the same physical network, or an F5 load balancer that provides load balancing for the virtual machines registered with it. It can also be a CloudStack virtual router that provides networking configuration for VLANs or an overlay network, which helps in dividing a network into multiple tiers.

There can be single or multiple network service providers which are used to provide network services for a single network. There can be multiple instances of the same service provider in a single network. In the case where various network service providers are configured to provide network services, the users have the option to select from the several network offerings that are created by the administrator.

CloudStack network offerings

CloudStack provides various network offerings. These network offerings are a group of network services such as firewall, DHCP, DNS, and so on, that are provided as an offering to the users. These network offerings also provide the specifications of the service providers and are tagged to specific underlying network.

The cloud administrator can define new network offerings, which can be segregated based on tags. It is up to the administrator to determine which network offerings to provide throughout the cloud. Users are allowed to access the network offerings based on their tags.

The administrator can also choose specific network service providers to be provided as an offering. A network offering can be in any of three states: Enabled, Disabled, or Inactive. By default, offerings are in the Disabled state when created. Some network offerings are used only by the system and are not visible to users. The tags that are used with a network offering cannot be updated, but the physical network tags can be updated or deleted.

CloudStack is deployed with three default network offerings for the end users: a virtual network offering, a shared network offering without a security group, and a shared network offering with a security group. New network offerings can also be created by the administrator to suit the needs of the environment, and they include different networking services based on the configuration defined.
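A network offering can be pictured as a named bundle of services with a state and a set of tags. The sketch below is a simplified model: the class, the offering names, and in particular the rule that an offering is visible when all of its tags are held by the user are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, List


class OfferingState(Enum):
    ENABLED = "Enabled"
    DISABLED = "Disabled"
    INACTIVE = "Inactive"


@dataclass
class NetworkOffering:
    name: str
    services: FrozenSet[str]
    tags: FrozenSet[str] = frozenset()
    state: OfferingState = OfferingState.DISABLED  # new offerings start disabled
    system_only: bool = False


def visible_offerings(offerings: Iterable[NetworkOffering],
                      user_tags: FrozenSet[str]) -> List[str]:
    # Assumption: users see enabled, non-system offerings whose tags they hold.
    return [o.name for o in offerings
            if o.state is OfferingState.ENABLED
            and not o.system_only
            and o.tags <= user_tags]
```

An offering created by the administrator stays hidden until its state is switched to enabled, and system-only offerings never appear in the user's list.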

The shared network offerings are created when the user provisions a VM using that network. The users can also create networks from these network offerings. The services in a network offering can include DHCP, DNS, source NAT, static NAT, port forwarding, load balancing, firewall, VPN, or any other optional service backed by a service provider. Some of these services are provided using third-party hardware equipment such as Juniper or NetScaler devices.

Types of network in CloudStack

CloudStack provides various types of network services for end users. CloudStack supports multiple network services from third parties. It helps with providing complex network configurations in the cloud.

Physical network

A zone in the CloudStack deployment can be associated with one or more physical networks. A physical network can be used to carry one or more types of network traffic. A zone can use the basic network configuration or advanced network configuration, which will decide the type of network traffic that flows through the physical networks.

In a zone with basic network configuration, only one physical network can be present. There are basically three types of network traffic that are allowed. They are:

  • Guest Network traffic: This is the traffic flowing over the guest network for communication between the guest VMs when they are running. All the guest networks which are of type isolated share the same subnet, which is set at the zone level. Guest traffic of a VM within one zone is carried in one network, and VMs in different zones cannot communicate with each other directly. In order for VMs in different zones to communicate, they must do so via a router through a public IP address.
  • Management traffic: This traffic is generated by the internal resources of CloudStack. It basically comprises the traffic between the hosts in the clusters and the system VMs (VMs that CloudStack uses to perform various tasks in the cloud). The administrator must configure the IP ranges of the system VMs. This type of network traffic is usually untagged. The management traffic contains all the UDP traffic for heartbeats. It is highly recommended to isolate the management traffic from the other network traffic.
  • Storage traffic: This is the traffic flowing between the primary and secondary storage servers. For example, VM templates are placed on the secondary storage, and when a user requests a VM based on a template, that template data has to flow from the secondary storage server to the primary storage server. Another example is when a user creates a snapshot; snapshots are stored in the secondary storage, so the snapshot data has to flow to the secondary storage. The storage network traffic is generally configured to be on a separate NIC to ensure better performance.

In a zone with the advanced network configuration, additional types of traffic flow besides those found in a zone with the basic network configuration. In a basic zone, VM traffic is publicly routable by default, whereas in an advanced zone, the traffic on the public-labeled network is exposed:

  • Public network traffic: This kind of traffic flows between VMs and the Internet; this requires the VM to have a public IP, which can be assigned to the VM through the UI. In the case of an advanced network zone, one public IP is assigned per account to be used for source NAT. Using hardware devices such as a Juniper SRX firewall, a single public IP can be used across all the accounts. Users can also request additional public IPs. This public IP can also be used to implement and configure a NAT instance between the guest and public networks.

All these types of network traffic can be multiplexed on the same underlying physical network using VLANs. It is up to the administrator to configure the network traffic, map these traffic types to the underlying physical network, and configure the labels on the hypervisor. All of this can be done using the admin user interface of CloudStack. Once the zone is created, the traffic labels can be changed from the user interface, whereas changing the physical networks also requires changing some database entries.

Virtual network

In order to enable multi-tenancy on a single physical network, the physical network has to be divided into several logical constructs, each of which is known as a virtual network. All the information about the virtual networks and their settings is configured and stored in CloudStack. These settings are activated only when the first VM is started and assigned to the network, and the virtual network is deleted, or garbage collected, when all the VMs are removed from it. Thus, CloudStack helps in preserving network resources and reducing waste. CloudStack allows a virtual network to be shared or isolated. The various types of virtual networks are discussed in the following sections.
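The activate-on-first-VM and garbage-collect-on-last-VM behavior amounts to reference counting. The following is a minimal sketch with a hypothetical `VirtualNetwork` class; the `implemented` flag stands in for whatever resources (such as a VLAN) the real system would allocate and release.

```python
class VirtualNetwork:
    """Hypothetical model: settings exist up front, resources only while in use."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.vms = set()
        self.implemented = False  # True while network resources are allocated

    def start_vm(self, vm_name: str) -> None:
        # The first VM on the network triggers activation of its settings.
        if not self.vms:
            self.implemented = True
        self.vms.add(vm_name)

    def remove_vm(self, vm_name: str) -> None:
        self.vms.discard(vm_name)
        # When the last VM leaves, the network resources are garbage collected.
        if not self.vms:
            self.implemented = False
```

The network definition itself survives garbage collection in this model, so starting a new VM on it later simply activates it again.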

Isolated networks

These networks, as the term suggests, are isolated and can be accessed only by the virtual machines of a single account, with the exception of domain administrators. Resources such as VLANs are allocated to these networks and garbage collected dynamically. An isolated network can be upgraded or downgraded only as a whole, because the change applies to the entire network.

Shared networks

As the name suggests, a shared network can be accessed by the VMs of different accounts. If isolation is needed on this type of network, it can be achieved by using security groups, as of CloudStack 4.0. These networks are created by the administrator, who can also designate the shared network to a certain domain. The administrator has the responsibility to designate the network resources such as the VLAN and the physical network it is mapped to. This network must be created before a guest VM is provisioned on it.
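The access rules for isolated and shared networks can be compared side by side in a short check. This is a simplified sketch: the `GuestNetwork` class and its fields are hypothetical, and the exception for domain administrators on isolated networks is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GuestNetwork:
    name: str
    kind: str                            # "isolated" or "shared"
    owner_account: Optional[str] = None  # used by isolated networks
    domain: Optional[str] = None         # optional scope for shared networks


def can_use(network: GuestNetwork, account: str, account_domain: str) -> bool:
    if network.kind == "isolated":
        # Only VMs of the owning account may join an isolated network.
        return account == network.owner_account
    # A shared network is open to every account unless scoped to one domain.
    return network.domain is None or network.domain == account_domain
```

An isolated network is bound to one account, whereas a shared network is bound at most to a domain, which is why extra isolation on a shared network has to come from security groups.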

CloudStack also supports the creation of guest network with the isolation type set to "STT". When configuring a zone, the administrator should create a new physical network for guest traffic and select "STT" as the isolation type.


We introduced you to the building blocks of cloud and the architecture of cloud computing in this article. We discussed the various deployment and the service models of cloud computing. We saw the various layers that are essential components of a cloud solution. We also described the architecture and the several components in Apache CloudStack.

You've been reading an excerpt of:

Apache CloudStack Cloud Computing