
How-To Tutorials - Cloud Computing


The Future of Cloud lies in revisiting the designs and limitations of today’s notion of ‘serverless computing’, say UC Berkeley researchers

Savia Lobo
17 Dec 2018
5 min read
Last week, researchers at UC Berkeley released a research paper titled 'Serverless Computing: One Step Forward, Two Steps Back', which highlights some pitfalls in current serverless architectures. The researchers also explore the challenges that should be addressed to utilize the full potential that the cloud can offer to innovative developers.

Cloud isn't being used to the fullest

The researchers describe the cloud as "the biggest assemblage of data capacity and distributed computing power ever available to the general public, managed as a service". Today, however, the cloud is mostly used as an outsourcing platform for standard enterprise data services. In order to leverage the actual potential of the cloud to the fullest, creative developers need suitable programming frameworks. The majority of cloud services are simply multi-tenant, easier-to-administer clones of legacy enterprise data services such as object storage, databases, queueing systems, and web/app servers.

Of late, the buzz around serverless computing (a platform in the cloud where developers simply upload their code, and the platform executes it on their behalf as needed at any scale) has been on the rise. This is because public cloud vendors have started offering new programming interfaces under the banner of serverless computing. The researchers support this with a Google search trend comparison in which the term "serverless" recently matched the historic peak of popularity of the phrase "Map Reduce" or "MapReduce".

Source: arxiv.org

They point out that the notion of serverless computing is vague enough to allow optimists to project any number of possible broad interpretations onto what it might mean. Hence, in this paper, they assess the field based on the serverless computing services that vendors are actually offering today, and examine why these services are a disappointment given that the cloud has a bigger potential.

A serverless architecture based on FaaS (Function-as-a-Service)

Functions-as-a-Service (FaaS) is the commonly used and more descriptive name for the core of the serverless offerings from public cloud providers. Typical FaaS offerings today support a variety of languages (for example, Python, Java, JavaScript, and Go), allow programmers to register functions with the cloud provider, and enable users to declare events that trigger each function. The FaaS infrastructure monitors the triggering events, allocates a runtime for the function, executes it, and persists the results. The user is billed only for the computing resources used during function invocation.

Building applications on FaaS requires not only data management in both persistent and temporary storage but also mechanisms to trigger and scale function execution. According to the researchers, cloud providers are quick to emphasize that serverless is not only FaaS; it is FaaS supported by a "standard library": the various multi-tenanted, autoscaling services provided by the vendor, for instance S3 (large object storage), DynamoDB (key-value storage), SQS (queuing services), and more.

Current FaaS solutions are good for simple workloads of independent tasks, such as parallel tasks embedded in Lambda functions or jobs to be run by the proprietary cloud services. However, when it comes to use cases that involve stateful tasks, FaaS shows surprisingly high latency. These realities limit the attractive use cases for FaaS today, discouraging new third-party programs that go beyond the proprietary service offerings from the vendors.
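As a rough sketch of that FaaS workflow (register a function, then let an invocation trigger it), the AWS CLI commands below illustrate the idea; the function name, runtime, handler, and role ARN are hypothetical placeholders rather than anything taken from the paper:

# package a trivial Python handler and register it with the provider
zip function.zip handler.py
aws lambda create-function \
  --function-name demo-faas \
  --runtime python3.9 \
  --handler handler.lambda_handler \
  --zip-file fileb://function.zip \
  --role arn:aws:iam::123456789012:role/demo-lambda-role

# invoke it on demand; billing covers only the resources consumed by this invocation
aws lambda invoke --function-name demo-faas response.json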
Limitations of the current FaaS offerings

No recoverability

Function invocations are shut down by the Lambda infrastructure automatically after 15 minutes. Lambda may keep the function's state cached in the hosting VM in order to support a 'warm start', but there is no way to ensure that subsequent invocations run on the same VM. Hence, functions must be written assuming that state will not be recoverable across invocations.

I/O bottlenecks

Lambdas usually connect to cloud services or shared storage across a network interface, which means moving data across nodes or racks. With FaaS, things appear even worse than the network topology would suggest. Recent studies show that a single Lambda function can achieve, on average, 538 Mbps of network bandwidth; this is an order of magnitude slower than a single modern SSD. Worse, AWS appears to pack Lambda functions from the same user together on a single VM, so the limited bandwidth is shared by multiple functions. The result is that as compute power scales up, per-function bandwidth shrinks proportionately. With 20 Lambda functions, average network bandwidth was 28.7 Mbps, roughly 2.5 orders of magnitude slower than a single SSD.

Communication through slow storage

Lambda functions can only communicate through an autoscaling intermediary service. As a corollary, a client of Lambda cannot address the particular function instance that handled the client's previous request: there is no "stickiness" for client connections. Hence, maintaining state across client calls requires writing the state out to slow storage and reading it back on every subsequent call.

No specialized hardware

FaaS offerings today only allow users to provision a time slice of a CPU hyperthread and some amount of RAM; in the case of AWS Lambda, one determines the other. There is no API or mechanism to access specialized hardware.

These constraints, combined with some significant shortcomings in the standard library of FaaS offerings, substantially limit the scope of feasible serverless applications. The researchers conclude, "We see the future of cloud programming as far, far brighter than the promise of today's serverless FaaS offerings. Getting to that future requires revisiting the designs and limitations of what is being called 'serverless computing' today."

They believe cloud programmers need a programmable framework that goes beyond FaaS and dynamically manages the allocation of resources in order to meet user-specified performance goals for both compute and data. The program analysis and scheduling issues involved are likely to open up significant opportunities for more formal research, especially for data-centric programs.

To know more about this research in detail, read the complete research paper.

Introducing GitLab Serverless to deploy cloud-agnostic serverless functions and applications
Introducing 'Pivotal Function Service' (alpha): an open, Kubernetes based, multi-cloud serverless framework for developer workloads
Introducing numpywren, a system for linear algebra built on a serverless architecture


Achieving High-Availability on AWS Cloud

Packt
11 Aug 2015
18 min read
In this article, by Aurobindo Sarkar and Amit Shah, authors of the book Learning AWS, we will introduce some key design principles and approaches to achieving high availability in your applications deployed on the AWS cloud. As a good practice, you want to ensure that your mission-critical applications are always available to serve your customers. The approaches in this article address availability across the layers of your application architecture, including the availability aspects of key infrastructure components, ensuring there are no single points of failure. In order to address availability requirements, we will use the AWS infrastructure (Availability Zones and Regions), AWS Foundation Services (EC2 instances, storage, security and access control, networking), and AWS PaaS services (DynamoDB, RDS, CloudFormation, and so on).

Defining availability objectives

Achieving high availability can be costly. Therefore, it is important to align your application availability requirements with your business objectives. There are several options to achieve the level of availability that is right for your application. Hence, it is essential to start with a clearly defined set of availability objectives and then make the most prudent design choices to achieve those objectives at a reasonable cost.

Typically, not all systems and services need to achieve the highest levels of availability possible; at the same time, ensure you do not introduce a single point of failure in your architecture through dependencies between your components. For example, a mobile taxi ordering service needs its ordering-related services to be highly available; however, a specific customer's travel history need not be available at the same level.

The best way to approach high availability design is to assume that anything can fail, at any time, and then consciously design against it.

"Everything fails, all the time." - Werner Vogels, CTO, Amazon.com

In other words, think in terms of availability for each and every component in your application and its environment, because any given component can turn into a single point of failure for your entire application. Availability is something you should consider early in your application design process, as it can be hard to retrofit later. Key among these considerations are your database and application architecture (for example, a RESTful architecture). In addition, it is important to understand that availability objectives can influence and/or impact the design, development, testing, and running of your system on the cloud. Finally, ensure you proactively test all your design assumptions and reduce uncertainty by injecting or forcing failures instead of waiting for random failures to occur.

The nature of failures

There are many types of failures that can happen at any time. These can be the result of disk failures, power outages, natural disasters, software errors, and human errors. In addition, there are several points of failure in any given cloud application. These include DNS or domain services, load balancers, web and application servers, database servers, application services, and data center-related failures. You will need to ensure you have a mitigation strategy for each of these types and points of failure. It is highly recommended that you automate and implement detailed audit trails for your recovery strategy, and thoroughly test as many of these processes as possible.
In the next few sections, we will discuss various strategies to achieve high availability for your application. Specifically, we will discuss the use of AWS features and services such as:

VPC
Amazon Route 53
Elastic Load Balancing and auto scaling
Redundancy
Multi-AZ and multi-region deployments

Setting up VPC for high availability

Before setting up your VPC, you will need to carefully select your primary site and a disaster recovery (DR) site. Leverage AWS's global presence to select the regions and availability zones that best match your business objectives. The primary site is usually the region closest to the majority of your customers, and the DR site could be in the next closest region or in a different country, depending on your specific requirements.

Next, we need to set up the network topology, which essentially includes setting up the VPC and the appropriate subnets. The public-facing servers are configured in a public subnet, whereas the database servers and other application servers hosting services such as directory services will usually reside in private subnets. Ensure you choose different sets of IP addresses across the different regions for a multi-region deployment, for example, 10.0.0.0/16 for the primary region and 192.168.0.0/16 for the secondary region, to avoid IP addressing conflicts when these regions are connected via a VPN tunnel. Appropriate routing tables and ACLs will also need to be defined to ensure traffic can traverse between them.

Cross-VPC connectivity is required so that data transfer can happen between the VPCs (say, from the private subnets in one region over to the other region). The secure VPN tunnels are basically IPSec tunnels powered by VPN appliances; a primary and a secondary tunnel should be defined (in case the primary IPSec tunnel fails). It is imperative that you consult your network specialists through all of these tasks.

An ELB is configured in the primary region to route traffic across multiple availability zones; however, you need not necessarily commission an ELB for your secondary site at this time. This will help you avoid ELB costs in your DR or secondary site. However, always weigh these costs against the total cost/time required for recovery; it might be worthwhile to just commission the extra ELB and keep it running.

Gateway servers and NAT will need to be configured, as they act as gatekeepers for all inbound and outbound Internet access. Gateway servers are defined in the public subnet with the appropriate licenses and keys to access your servers in the private subnet for server administration purposes. NAT is required for servers located in the private subnet to access the Internet and is typically used for automatic patch updates. Again, consult your network specialists for these tasks.

Elastic Load Balancing and Amazon Route 53 are critical infrastructure components for scalable and highly available applications; we discuss these services in the next section.

Using ELB and Route 53 for high availability

In this section, we describe different levels of availability and the role ELBs and Route 53 play from an availability perspective.

Instance availability

The simplest guideline here is to never run a single instance in a production environment. The simplest way to improve greatly on a single-server scenario is to spin up multiple EC2 instances and stick an ELB in front of them. The incoming request load is shared by all the instances behind the load balancer.
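As a hedged sketch of this pattern using the classic ELB commands that were current when this article was written (the load balancer name, zones, and instance IDs are placeholders):

# create a load balancer listening on HTTP port 80 across two zones
aws elb create-load-balancer \
  --load-balancer-name web-elb \
  --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80" \
  --availability-zones us-east-1a us-east-1b

# place the EC2 instances behind it
aws elb register-instances-with-load-balancer \
  --load-balancer-name web-elb \
  --instances i-0123456789abcdef0 i-0fedcba9876543210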
ELB uses the least outstanding requests routing algorithm to spread HTTP/HTTPS requests across healthy instances; this algorithm favors instances with the fewest outstanding requests. Even though it is not recommended to have different instance sizes between or within the AZs, the ELB will adjust the number of requests it sends to smaller or larger instances based on response times. In addition, ELBs use cross-zone load balancing to distribute traffic across all healthy instances regardless of AZ. Hence, ELBs help balance the request load even if there are unequal numbers of instances in different AZs at any given time (perhaps due to a failed instance in one of the AZs). There is no bandwidth charge for cross-zone traffic (if you are using an ELB).

Instances that fail can be seamlessly replaced using auto scaling while the other instances continue to operate. Though auto-replacement of instances works really well, storing application state or caching locally on your instances can make problems hard to detect. When an instance failure is detected, traffic is shifted to the healthy instances, which then carry the additional load.

Health checks are used to determine the health of the instances and the application. TCP- and/or HTTP-based heartbeats can be created for this purpose. It is worthwhile implementing health checks iteratively to arrive at the right set that meets your goals. In addition, you can customize the frequency and the failure thresholds as well. Finally, if all your instances are down, then AWS will return a 503 error.

Zonal availability or availability zone redundancy

Availability zones are distinct geographical locations engineered to be insulated from failures in other zones. It is critically important to run your application stack in more than one zone to achieve high availability. However, be mindful of component-level dependencies across zones and cross-zone service calls, which can lead to substantial latencies in your application or to application failures during availability zone failures.

For sites with very high request loads, a 3-zone configuration might be the preferred configuration to handle zone-level failures. In this situation, if one zone goes down, the other two AZs can ensure continuing high availability and a better customer experience. In the event of a zone failure, a Multi-AZ configuration faces several challenges resulting from the rapidly shifting traffic to the other AZs. In such situations, the load balancers need to expire connections quickly, and lingering connections to caches must be addressed. In addition, careful configuration is required for smooth failover: ensure all clusters are appropriately auto scaled, avoid cross-zone calls in your services, and avoid mismatched timeouts across your architecture.

ELBs can be used to balance across multiple availability zones. Each load balancer will contain one or more DNS records. The DNS record will contain multiple IP addresses, and DNS round-robin can be used to balance traffic between the availability zones. You can expect the DNS records to change over time. Using multiple AZs can result in traffic imbalances between AZs due to clients caching DNS records; however, ELBs can help reduce the impact of this caching.

Regional availability or regional redundancy

ELB and Amazon Route 53 have been integrated to support a single application across multiple regions. Route 53 is AWS's highly available and scalable DNS and health checking service.
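As a small, hedged example of defining such a health check from the CLI (the IP address, path, and thresholds below are illustrative only):

# health check that probes an HTTP endpoint every 30 seconds, failing after 3 missed checks
aws route53 create-health-check \
  --caller-reference web-health-check-001 \
  --health-check-config "IPAddress=203.0.113.10,Port=80,Type=HTTP,ResourcePath=/health,RequestInterval=30,FailureThreshold=3"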
Route 53 supports high availability architectures by health checking load balancer nodes and rerouting traffic away from failed nodes, and by supporting the implementation of multi-region architectures. In addition, Route 53 uses Latency Based Routing (LBR) to route your customers to the endpoint that has the least latency. If multiple primary sites are implemented with appropriate health checks configured, then in case of failure, traffic shifts away from that site to the next closest region.

Region failures can present several challenges as a result of rapidly shifting traffic (similar to the case of zone failures). These include auto scaling, the time required for instance startup, and the cache fill time (as we might initially need to fall back to our data sources). Another difficulty usually arises from the lack of information or clarity on what constitutes the minimal or critical stack required to keep the site functioning as normally as possible; for example, any or all services might need to be considered critical in these circumstances.

The health checks are essentially automated requests sent over the Internet to your application to verify that it is reachable, available, and functional. This can include both your EC2 instances and your application. As answers are returned only for the resources that are healthy and reachable from the outside world, end users can be routed away from a failed application. Amazon Route 53 health checks are conducted from within each AWS region to check whether your application is reachable from that location.

The DNS failover is designed to be entirely automatic. After you have set up your DNS records and health checks, no manual intervention is required for failover. Ensure you create appropriate alerts so that you are notified when this happens. Typically, it takes about 2 to 3 minutes from the time of the failure to the point where traffic is routed to an alternate location. Compare this to the traditional process, where an operator receives an alarm, manually configures the DNS update, and waits for the DNS changes to propagate. The failover happens entirely within the Amazon Route 53 data plane.

Depending on your availability objectives, there is an additional strategy (using Route 53) that you might want to consider for your application. For example, you can create a backup static site to maintain a presence for your end customers while your primary dynamic site is down. In the normal course, Route 53 will point to your dynamic site and maintain health checks for it. Furthermore, you will need to configure Route 53 to point to the S3 storage where your static site resides. If your primary site goes down, then traffic can be diverted to the static site (while you work to restore your primary site). You can also combine this static backup site strategy with a multi-region deployment.

Setting up high availability for application and data layers

In this section, we will discuss approaches for implementing high availability in the application and data layers of your application architecture. The auto healing feature of AWS OpsWorks provides a good recovery mechanism from instance failures. All OpsWorks instances have an agent installed; if an agent does not communicate with the service for a short duration, then OpsWorks considers the instance to have failed. If auto healing is enabled at the layer level and an instance becomes unhealthy, then OpsWorks first terminates the instance and then starts a new one as per the layer configuration.
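Auto healing is a per-layer setting; assuming the layer ID is held in a shell variable, a hedged sketch of enabling it from the CLI looks like this:

# $LAYER_ID identifies the OpsWorks layer whose instances should be auto healed
aws opsworks update-layer --layer-id "$LAYER_ID" --enable-auto-healing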
In the application layer, we can also do cold starts from preconfigured images, or warm starts from scaled-down instances, for the web servers and application servers in a secondary region. By leveraging auto scaling, we can quickly ramp these servers up to handle full production loads. In this configuration, you would deploy the web servers and application servers across multiple AZs in your primary region, while the standby servers need not be launched in your secondary region until you actually need them. However, keep the preconfigured AMIs for these servers ready to launch in your secondary region.

The data layer can comprise SQL databases, NoSQL databases, caches, and so on. These can be AWS managed services such as RDS, DynamoDB, and S3, or your own SQL and NoSQL databases such as Oracle, SQL Server, or MongoDB running on EC2 instances. AWS services come with HA built in, while using database products running on EC2 instances offers a do-it-yourself option. It can be advantageous to use AWS services if you want to avoid taking on database administration responsibilities. For example, as your databases grow, you might choose to shard them, which is easy to do. However, resharding your databases while taking live traffic can be a very complex undertaking and presents availability risks. Choosing the AWS DynamoDB service in such a situation offloads this work to AWS, resulting in higher availability out of the box.

AWS provides many different data replication options, and we will discuss a few of them in the following paragraphs. DynamoDB automatically replicates your data across several AZs to provide higher levels of data durability and availability. In addition, you can use data pipelines to copy your data from one region to another. DynamoDB also provides streams functionality that can be leveraged to replicate to another DynamoDB table in a different region. For very high volumes, the low-latency Kinesis service can also be used for this replication across multiple regions distributed all over the world. You can also enable the Multi-AZ setting for the AWS RDS service to ensure AWS replicates your data to a different AZ within the same region. In the case of Amazon S3, the bucket contents can be copied to a different bucket, and the failover can be managed on the client side. Depending on the volume of data, always think in terms of multiple machines, multiple threads, and multipart uploads to significantly reduce the time it takes to upload data to S3 buckets.

While using your own database (running on EC2 instances), use your database-specific high availability features for in-region and cross-region database deployments. For example, if you are using SQL Server, you can leverage the SQL Server AlwaysOn feature for synchronous and asynchronous replication across the nodes. If the volume of data is high, then you can also use SQL Server log shipping to first upload your data to Amazon S3 and then restore it into your SQL Server instance on AWS. A similar approach for Oracle databases uses the OSB Cloud Module and RMAN. You can also replicate your non-RDS databases (on-premise or on AWS) to AWS RDS databases. You will typically define two nodes in the primary region with synchronous replication and a third node in the secondary region with asynchronous replication. NoSQL databases such as MongoDB and Cassandra have their own asynchronous replication features that can be leveraged for replication to a different region.
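As a brief, hedged illustration of the RDS Multi-AZ setting mentioned above (the instance identifier is a placeholder):

# provision a synchronous standby copy of the database in another AZ of the same region
aws rds modify-db-instance \
  --db-instance-identifier mydb \
  --multi-az \
  --apply-immediately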
In addition, you can create read replicas for your databases in other AZs and regions. In this case, if your master database fails, followed by a failure of your secondary database, then one of the read replicas can be promoted to master. In hybrid architectures, where you need to replicate between on-premise and AWS data sources, you can do so through a VPN connection between your data center and AWS. In case of connectivity issues, you can also temporarily store pending data updates in SQS and process them when connectivity is restored. Usually, data is actively replicated to the secondary region while all other servers, such as the web servers and application servers, are kept in a cold state to control costs. However, for high availability of web-scale or mission-critical applications, you can also choose to deploy your servers in an active configuration across multiple regions.

Implementing high availability in the application

In this section, we will discuss a few design principles to apply in your application from a high availability perspective. We will briefly discuss using highly available AWS services to implement common features in mobile and Internet of Things (IoT) applications. Finally, we also cover running packaged applications on the AWS cloud.

Designing your application services to be stateless and following a microservices-oriented architecture approach can help the overall availability of your application. In such architectures, if a service fails, that failure is contained or isolated to that particular service, while the rest of your application services continue to serve your customers. This approach can lead to an acceptably degraded experience rather than outright failure, or worse. You should also store user or session information in a central location, such as AWS ElastiCache, and spread that information across multiple AZs for high availability. Another design principle is to rigorously implement exception handling in your application code, and in each of your services, to ensure a graceful exit in case of failures.

Most mobile applications share common features, including user authentication and authorization; data synchronization across devices; user behavior analytics; retention tracking; storing, sharing, and delivering media globally; sending push notifications; storing shared data; streaming real-time data; and so on. There are a host of highly available AWS services that can be used to implement such mobile application functionality. For example, you can use Amazon Cognito to authenticate users, Amazon Mobile Analytics for analyzing user behavior and tracking retention, Amazon SNS for push notifications, and Amazon Kinesis for streaming real-time data. In addition, other AWS services such as S3, DynamoDB, IAM, and so on can also be effectively used to cover most mobile application scenarios. For mobile applications, you need to be especially sensitive about latency issues; hence, it is important to leverage AWS regions to get as close to your customers as possible.

Similar to mobile applications, for IoT applications you can use the same highly available AWS services to implement common functionality such as device analytics and device messaging/notifications. You can also leverage Amazon Kinesis to ingest data from hundreds of thousands of sensors that are continuously generating massive quantities of data. Aside from your own custom applications, you can also run packaged applications, such as SAP, on AWS.
Such deployments would typically include replicated standby systems, Multi-AZ and multi-region deployments, hybrid architectures spanning your own data center and the AWS cloud (connected via VPN or the AWS Direct Connect service), and so on. For more details, refer to the specific package guides for achieving high availability on the AWS cloud.

Summary

In this article, we reviewed some of the strategies you can follow for achieving high availability in your cloud application. We emphasized the importance of both designing your application architecture for availability and using the AWS infrastructure services to get the best results.

Resources for Article:
Further resources on this subject:
Securing vCloud Using the vCloud Networking and Security App Firewall [article]
Introduction to Microsoft Azure Cloud Services [article]
AWS Global Infrastructure [article]


Setting up VPNaaS in OpenStack

Packt
11 Aug 2015
7 min read
In this article by Omar Khedher, author of the book Mastering OpenStack, we will create a VPN setup between two sites using the Neutron VPN driver plugin. VPN as a Service (VPNaaS) is a network functionality provided by Neutron, the OpenStack network service, that introduces the VPN feature set. It has been fully integrated into OpenStack since the Havana release. We will set up two private networks in two different OpenStack sites and create IPSec site-to-site connections between them. The instances located in each OpenStack private network should be able to connect to each other across the VPN tunnel. This article assumes that you have two OpenStack environments in different networks.

General settings

The following figure illustrates the VPN topology between two OpenStack sites, providing a secure connection between a database instance that runs in a data center in Amsterdam and a web server that runs in a data center in Tunis. Each site has a private and a public network, as well as subnets managed by a Neutron router.

Enabling the VPNaaS

The VPN driver plugin must be configured in /etc/neutron/neutron.conf. To enable it, add the following driver:

# nano /etc/neutron/neutron.conf
service_plugins = neutron.services.vpn.plugin.VPNDriverPlugin

Next, we'll add the VPNaaS module interface in /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.py, as follows:

'enable_VPNaaS': True,

Finally, restart the web server and the neutron-server service using the following commands:

# service httpd restart
# /etc/init.d/neutron-server restart

Site configuration

Let's assume that each OpenStack site has at least a router attached to an external gateway that can provide access to the private network attached to an instance, such as a database or a web server virtual machine. A simple network topology of the private OpenStack environment running in the Amsterdam site will look like this:

The private OpenStack environment running in the Tunis site will look like the following image:

VPN management in the CLI

Neutron offers several commands that can be used to create and manage VPN connections in OpenStack. The essential commands associated with VPN management include the following:

vpn-service-create
vpn-service-delete
vpn-service-list
vpn-ikepolicy-create
vpn-ikepolicy-list
vpn-ipsecpolicy-create
vpn-ipsecpolicy-delete
vpn-ipsecpolicy-list
vpn-ipsecpolicy-show
ipsec-site-connection-create
ipsec-site-connection-delete
ipsec-site-connection-list

Creating an IKE policy

In the first VPN phase, we can create an IKE policy in Horizon.
The following figure shows a simple IKE setup for the OpenStack environment located in the Amsterdam site:

The same settings can be applied via the Neutron CLI using the vpn-ikepolicy-create command, as follows:

Syntax:
neutron vpn-ikepolicy-create --description <description> --auth-algorithm <auth-algorithm> --encryption-algorithm <encryption-algorithm> --ike-version <ike-version> --lifetime units=UNITS,value=VALUE --pfs <pfs> --phase1-negotiation-mode <phase1-negotiation-mode> --name <NAME>

Creating an IPSec policy

An IPSec policy for the OpenStack environment located in the Amsterdam site can be created in Horizon in the following way:

The same settings can be applied via the Neutron CLI using the vpn-ipsecpolicy-create command, as follows:

Syntax:
neutron vpn-ipsecpolicy-create --description <description> --auth-algorithm <auth-algorithm> --encapsulation-mode <encapsulation-mode> --encryption-algorithm <encryption-algorithm> --lifetime units=UNITS,value=VALUE --pfs <pfs> --transform-protocol <transform-protocol> --name <NAME>

Creating a VPN service

To create a VPN service, you will need to specify the external-facing router and the private subnet to which the web server instance is attached in the Amsterdam site. The router will act as a VPN gateway. We can add a new VPN service from Horizon in the following way:

The same settings can be applied via the Neutron CLI using the vpn-service-create command, as follows:

Syntax:
neutron vpn-service-create --tenant-id <tenant-id> --description <description> --name <name> ROUTER SUBNET

Creating an IPSec connection

The next step requires identifying the peer gateway of the remote site located in Tunis. We will need to collect the IPv4 address of the external gateway interface of the remote site, as well as the remote private subnet. In addition, a pre-shared key (PSK) has to be defined and exchanged between both sites in order to bring the tunnel up after a successful PSK negotiation during the second phase of the VPN setup.

You can collect the router information either from Horizon or from the command line. To get subnet-related information on the OpenStack cloud located in the Tunis site, you can run the following command on the network node:

# neutron router-list

The preceding command yields the following result:

We can then list the ports of Router-TN and check the subnets attached to each port using the following command:

# neutron router-port-list Router-TN

The preceding command gives the following result:

Now, we can add a new IPSec site connection from the Amsterdam OpenStack cloud in the following way:

Similarly, we can perform the same configuration to create the IPSec site connection using the neutron ipsec-site-connection-create command, as follows:

Syntax:
neutron ipsec-site-connection-create --name <NAME> --description <description> --vpnservice-id <VPNSERVICE> --ikepolicy-id <IKEPOLICY> --ipsecpolicy-id <IPSECPOLICY> --peer-address <PEER-ADDRESS> --peer-id <PEER-ID> --psk <PRESHAREDKEY>

A consolidated example of these Amsterdam-side commands is sketched below.
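Putting the preceding syntax together, the Amsterdam-side commands might look like the following. The algorithm choices, names, peer address, subnet, and PSK are purely illustrative placeholders rather than values from the book's setup, and the remote private subnet is passed with --peer-cidr, an option the Neutron client accepts in addition to those listed above:

# neutron vpn-ikepolicy-create --auth-algorithm sha1 --encryption-algorithm aes-128 --ike-version v1 --pfs group5 --name ikepolicy-ams
# neutron vpn-ipsecpolicy-create --auth-algorithm sha1 --encryption-algorithm aes-128 --encapsulation-mode tunnel --transform-protocol esp --pfs group5 --name ipsecpolicy-ams
# neutron vpn-service-create --name vpn-ams Router-AMS private-subnet-ams
# neutron ipsec-site-connection-create --name ams-to-tn --vpnservice-id vpn-ams --ikepolicy-id ikepolicy-ams --ipsecpolicy-id ipsecpolicy-ams --peer-address 203.0.113.20 --peer-id 203.0.113.20 --peer-cidr 192.168.1.0/24 --psk SharedSecret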
Remote site settings

A complete VPN setup requires you to perform the same steps in the OpenStack cloud located in the Tunis site. For a successful VPN phase 1 configuration, you must set the same IKE and IPSec policy attributes and change only the naming convention for each IKE and IPSec setup. Creating a VPN service on the second OpenStack cloud will look like this:

The last tricky part involves gathering the same router information from the Amsterdam site. Using the command line on the network node on the Amsterdam cloud side, we can run the following Neutron command:

# neutron router-list

The preceding command yields the following result:

To get a list of the networks attached to Router-AMS, we can execute the following command:

# neutron router-port-list Router-AMS

The preceding command gives the following result:

The external and private subnets associated with Router-AMS can be checked from Horizon as well. Now that we have the peer router gateway and the remote private subnet, we will need to fill in the same PSK that was configured previously. The next figure illustrates the new IPSec site connection on the OpenStack cloud located in the Tunis site:

At the Amsterdam site, we can check the creation of the new IPSec site connection by means of the Neutron CLI, as follows:

# neutron ipsec-site-connection-list

The preceding command will give the following result:

The same can be done at the Tunis site, as follows:

# neutron ipsec-site-connection-list

The preceding command gives the following result:

Managing security groups

For VPNaaS to work when connecting the Amsterdam and Tunis subnets, you will need to create a few additional rules in each project's default security group: a general ICMP rule to allow pings, and a rule allowing SSH on port 22. Additionally, we can create a new security group called Application_PP to restrict traffic to port 53 (DNS), 80 (HTTP), and 443 (HTTPS), as follows:

# neutron security-group-create Application_PP --description "allow web traffic from the Internet"
# neutron security-group-rule-create --direction ingress --protocol tcp --port_range_min 80 --port_range_max 80 Application_PP --remote-ip-prefix 0.0.0.0/0
# neutron security-group-rule-create --direction ingress --protocol tcp --port_range_min 53 --port_range_max 53 Application_PP --remote-ip-prefix 0.0.0.0/0
# neutron security-group-rule-create --direction ingress --protocol tcp --port_range_min 443 --port_range_max 443 Application_PP --remote-ip-prefix 0.0.0.0/0

From Horizon, we will see the following security group rules added:

Summary

In this article, we created a VPN setup between two sites using the Neutron VPN driver plugin. This process included enabling VPNaaS, configuring each site, and creating an IKE policy, an IPSec policy, a VPN service, and an IPSec connection. To continue with Mastering OpenStack, take a look at what else you will learn in the book.


Erasure coding for cold storage

Packt
07 Jun 2017
20 min read
In this article by Nick Frisk, author of the book Mastering Ceph, we will get acquainted with erasure coding. Ceph's default replication level provides excellent protection against data loss by storing three copies of your data on different OSDs. The chance of losing all three disks that contain the same objects within the period it takes Ceph to rebuild from a failed disk is verging on the extreme edge of probability. However, storing three copies of data vastly increases both the purchase cost of the hardware and the associated operational costs such as power and cooling. Furthermore, storing copies also means that for every client write, the backend storage must write three times the amount of data. In some scenarios, either of these drawbacks may mean that Ceph is not a viable option.

Erasure codes are designed to offer a solution. Much as RAID 5 and 6 offer increased usable storage capacity over RAID 1, erasure coding allows Ceph to provide more usable storage from the same raw capacity. However, also like the parity-based RAID levels, erasure coding brings its own set of disadvantages.

In this article you will learn:

What erasure coding is and how it works
Details of Ceph's implementation of erasure coding
How to create and tune an erasure coded RADOS pool
A look at the future features of erasure coding in the Ceph Kraken release

What is erasure coding

Erasure coding allows Ceph to achieve either greater usable storage capacity or increased resilience to disk failure for the same number of disks, versus the standard replica method. Erasure coding achieves this by splitting the object into a number of parts, calculating a type of cyclic redundancy check (the erasure code), and then storing the results in one or more extra parts. Each part is then stored on a separate OSD. These parts are referred to as k and m chunks, where k refers to the number of data shards and m refers to the number of erasure code shards. As in RAID, these are often expressed in the form k+m, for example 4+2.

In the event of a failure of an OSD that contains an object's shard which is one of the calculated erasure codes, data is read from the remaining OSDs that store data with no impact. However, in the event of a failure of an OSD that contains the data shards of an object, Ceph can use the erasure codes to mathematically recreate the data from a combination of the remaining data and erasure code shards.

k+m

The more erasure code shards you have, the more OSD failures you can tolerate and still successfully read data. Likewise, the ratio of k to m shards each object is split into has a direct effect on the percentage of raw storage that is required for each object. A 3+1 configuration will give you 75% usable capacity but only allows for a single OSD failure, and so would not be recommended. In comparison, a three-way replica pool only gives you 33% usable capacity. A 4+2 configuration gives you 66% usable capacity and allows for two OSD failures; this is probably a good configuration for most people to use. At the other end of the scale, 18+2 gives you 90% usable capacity and still allows for two OSD failures. On the surface this sounds like an ideal option, but the greater total number of shards comes at a cost: a higher number of total shards has a negative impact on performance and also increases CPU demand.
The same 4 MB object that would be stored as a single whole object in a replicated pool is now split into 20 x 200 KB chunks, which have to be tracked and written to 20 different OSDs. Spinning disks exhibit faster bandwidth, measured in MB/s, with larger IO sizes, but bandwidth drastically tails off at smaller IO sizes. These smaller shards will generate a large amount of small IO and cause additional load on some clusters.

It is also important not to forget that these shards need to be spread across different hosts according to the CRUSH map rules: no shard belonging to an object can be stored on the same host as another shard from the same object. Some clusters may not have a sufficient number of hosts to satisfy this requirement.

Reading back from these high-shard-count pools is also a problem. Unlike in a replica pool, where Ceph can read just the requested data from any offset in an object, in an erasure pool all shards from all OSDs have to be read before the read request can be satisfied. In the 18+2 example this can massively amplify the amount of required disk read ops, and average latency will increase as a result. This behavior is a side effect that tends to only cause a performance impact with pools that use a large number of shards. A 4+2 configuration, in some instances, will see a performance gain compared to a replica pool as a result of splitting an object into shards: as the data is effectively striped over a number of OSDs, each OSD has to write less data, and there are no secondary and tertiary replicas to write.

How does erasure coding work in Ceph

As with replication, Ceph has a concept of a primary OSD, which also exists when using erasure coded pools. The primary OSD has the responsibility of communicating with the client, calculating the erasure shards, and sending them out to the remaining OSDs in the Placement Group (PG) set. This is illustrated in the diagram below:

If an OSD in the set is down, the primary OSD can use the remaining data and erasure shards to reconstruct the data before sending it back to the client. During read operations, the primary OSD requests all OSDs in the PG set to send their shards. The primary OSD uses the data shards to construct the requested data, and the erasure shards are discarded. There is a fast read option that can be enabled on erasure pools, which allows the primary OSD to reconstruct the data from the erasure shards if they return quicker than the data shards. This can help to lower average latency at the cost of slightly higher CPU usage. The diagram below shows how Ceph reads from an erasure coded pool:

The next diagram shows how Ceph reads from an erasure pool when one of the data shards is unavailable. Data is reconstructed by reversing the erasure algorithm, using the remaining data and erasure shards.

Algorithms and profiles

There are a number of different erasure plugins you can use to create your erasure coded pool.

Jerasure

The default erasure plugin in Ceph is the Jerasure plugin, which is a highly optimized open source erasure coding library. The library provides a number of different techniques that can be used to calculate the erasure codes. The default is Reed-Solomon, which provides good performance on modern processors that can accelerate the instructions the technique uses. Cauchy is another technique in the library; it is a good alternative to Reed-Solomon and tends to perform slightly better.
As always, benchmarks should be conducted before storing any production data on an erasure coded pool to identify which technique best suits your workload. There are also a number of other techniques that can be used, which all have a fixed number of m shards. If you intend to have only two m shards, they can be good candidates, as their fixed size means that optimizations are possible, leading to increased performance. In general, the Jerasure profile should be preferred in most cases unless another profile has a major advantage, as it offers well-balanced performance and is well tested.

ISA

The ISA library is designed to work with Intel processors and offers enhanced performance. It, too, supports both the Reed-Solomon and Cauchy techniques.

LRC

One of the disadvantages of using erasure coding in a distributed storage system is that recovery can be very intensive on networking between hosts. As each shard is stored on a separate host, recovery operations require multiple hosts to participate in the process. When the CRUSH topology spans multiple racks, this can put pressure on the inter-rack networking links. The LRC erasure plugin, where LRC stands for Local Recovery Codes, adds an additional parity shard which is local to each OSD node. This allows recovery operations to remain local to the node where an OSD has failed and removes the need for that node to receive data from all the other remaining shard-holding nodes. However, the addition of these local recovery codes does impact the amount of usable storage for a given number of disks. In the event of multiple disk failures, the LRC plugin has to resort to global recovery, as would happen with the Jerasure plugin.

SHingled Erasure Coding

The SHingled Erasure Coding (SHEC) profile is designed with similar goals to the LRC plugin, in that it reduces the networking requirements during recovery. However, instead of creating extra parity shards on each node, SHEC shingles the shards across OSDs in an overlapping fashion. The "shingled" part of the plugin name reflects the way the data distribution resembles shingled tiles on the roof of a house. By overlapping the parity shards across OSDs, the SHEC plugin reduces recovery resource requirements for both single and multiple disk failures.

Where can I use erasure coding

Since the Firefly release of Ceph in 2014, it has been possible to create a RADOS pool using erasure coding. There is one major thing that you should be aware of: the erasure coding support in RADOS does not allow an object to be partially updated. You can write to an object in an erasure pool, read it back, and even overwrite it in whole, but you cannot update a partial section of it. This means that erasure coded pools can't be used for RBD and CephFS workloads; they are limited to providing pure object storage, either via the RADOS Gateway or via applications written to use librados.

The solution at the time was to use the cache tiering ability, which was released around the same time, to act as a layer above an erasure coded pool so that RBD could be used. In theory this was a great idea; in practice, performance was extremely poor. Every time an object needed to be written to, the whole object first had to be promoted into the cache tier. This act of promotion probably also meant that another object somewhere in the cache pool was evicted. Finally, the object, now in the cache tier, could be written to.
This whole process of constantly reading and writing data between the two pools meant that performance was unacceptable unless a very high percentage of the data was idle. During the development cycle of the Kraken release, an initial implementation of support for direct overwrites on an erasure coded pool was introduced. As of the final Kraken release, support is marked as experimental and is expected to be marked as stable in the following release. Testing of this feature will be covered later in this article.

Creating an erasure coded pool

Let's bring our test cluster up again and switch into SU mode in Linux, so we don't have to keep prepending sudo to the front of our commands.

Erasure coded pools are controlled by the use of erasure profiles; these control how many shards each object is broken up into, including the split between data and erasure shards. The profiles also include configuration that determines which erasure code plugin is used to calculate the hashes. The following plugins are available to use: <list of plugins>

To see a list of the erasure profiles, run:

# ceph osd erasure-code-profile ls

You can see there is a default profile in a fresh installation of Ceph. Let's see what configuration options it contains:

# ceph osd erasure-code-profile get default

The default profile specifies that it will use the Jerasure plugin with the Reed-Solomon error correcting codes and will split objects into 2 data shards and 1 erasure shard. This is almost perfect for our test cluster; however, for the purpose of this exercise we will create a new profile:

# ceph osd erasure-code-profile set example_profile k=2 m=1 plugin=jerasure technique=reed_sol_van
# ceph osd erasure-code-profile ls

You can see our new example_profile has been created. Now let's create our erasure coded pool with this profile:

# ceph osd pool create ecpool 128 128 erasure example_profile

The above command instructs Ceph to create a new pool called ecpool with 128 PGs. It should be an erasure coded pool and should use the example_profile we previously created. Let's create an object with a small text string inside it and then prove the data has been stored by reading it back:

# echo "I am test data for a test object" | rados --pool ecpool put Test1 -
# rados --pool ecpool get Test1 -

That proves that the erasure coded pool is working, but it's hardly the most exciting of discoveries. Let's have a look to see what's happening at a lower level. First, find out which PG is holding the object we just created:

# ceph osd map ecpool Test1

The result of the above command tells us that the object is stored in PG 3.40 on OSDs 1, 2, and 0. In this example Ceph cluster that's pretty obvious, as we only have 3 OSDs, but in larger clusters that is a very useful piece of information. We can now look at the folder structure of the OSDs and see how the object has been split. The PGs will likely be different on your test cluster, so make sure the PG folder structure matches the output of the ceph osd map command above:

# ls -l /var/lib/ceph/osd/ceph-2/current/1.40s0_head/
# ls -l /var/lib/ceph/osd/ceph-1/current/1.40s1_head/
# ls -l /var/lib/ceph/osd/ceph-0/current/1.40s2_head/
total 4

Notice how the PG directory names have been appended with the shard number; replicated pools just have the PG number as their directory name. If you examine the contents of the object files, you will see the text string that we entered into the object when we created it.
However, due to the small size of the text string, Ceph has padded out the second shard with null characters, and the erasure shard will hence contain the same as the first. You can repeat this example with a new object containing larger amounts of text to see how Ceph splits the text into the shards and calculates the erasure code.

Overwrites on erasure code pools with Kraken

Introduced for the first time in the Kraken release of Ceph as an experimental feature was the ability to allow partial overwrites on erasure coded pools. Partial overwrite support allows RBD volumes to be created on erasure coded pools, making better use of the raw capacity of the Ceph cluster.

In parity RAID, where a write request doesn't span the entire stripe, a read-modify-write operation is required. This is needed because the modified data chunks will mean the parity chunk is now incorrect. The RAID controller has to read all the current chunks in the stripe, modify them in memory, calculate the new parity chunk, and finally write this back out to the disk.

Ceph is also required to perform this read-modify-write operation; however, the distributed model of Ceph increases the complexity of the operation. When the primary OSD for a PG receives a write request that will partially overwrite an existing object, it first works out which shards will not be fully modified by the request and contacts the relevant OSDs to request a copy of those shards. The primary OSD then combines the received shards with the new data and calculates the erasure shards. Finally, the modified shards are sent out to the respective OSDs to be committed. This entire operation needs to conform to the other consistency requirements Ceph enforces; this entails the use of temporary objects on the OSD, should a condition arise where Ceph needs to roll back a write operation.

This partial overwrite operation, as can be expected, has a performance impact. In general, the smaller the write IOs, the greater the apparent impact. The performance impact is a result of the IO path now being longer, requiring more disk IOs and extra network hops. However, it should be noted that due to the striping effect of erasure coded pools, in the scenario where full stripe writes occur, performance will normally exceed that of a replication-based pool. This is simply down to there being less write amplification due to the effect of striping. If the performance of an erasure pool is not suitable, consider placing it behind a cache tier made up of a replicated pool.

Despite partial overwrite support coming to erasure coded pools in Ceph, not every operation is supported. In order to store RBD data on an erasure coded pool, a replicated pool is still required to hold key metadata about the RBD. This configuration is enabled by using the --data-pool option with the rbd utility.

Partial overwrites are also not recommended for use with filestore. Filestore lacks several features that partial overwrites on erasure coded pools use; without these features, extremely poor performance is experienced.

Demonstration

This feature requires the Kraken release or newer of Ceph. If you have deployed your test cluster with Ansible and the configuration provided, you will be running the Ceph Jewel release. The following steps show how to use Ansible to perform a rolling upgrade of your cluster to the Kraken release. We will also enable experimental options such as BlueStore and support for partial overwrites on erasure coded pools.
Edit your group_vars/ceph variable file and change the release version from Jewel to Kraken. Also add:

ceph_conf_overrides:
  global:
    enable_experimental_unrecoverable_data_corrupting_features: "debug_white_box_testing_ec_overwrites bluestore"

And, to correct a small bug when using Ansible to deploy Ceph Kraken, add the following to the bottom of the file:

debian_ceph_packages:
  - ceph
  - ceph-common
  - ceph-fuse

Then run the following Ansible playbook:

ansible-playbook -K infrastructure-playbooks/rolling_update.yml

Ansible will prompt you to confirm that you want to carry out the upgrade; once you confirm by entering yes, the upgrade process will begin. Once Ansible has finished, all the stages should be successful, as shown below:

Your cluster has now been upgraded to Kraken, which can be confirmed by running ceph -v on one of your VMs running Ceph. As a result of enabling the experimental options in the configuration file, every time you now run a Ceph command you will be presented with a warning. This is designed as a safety measure to stop you from running these options in a live environment, as they may cause irreversible data loss. As we are doing this on a test cluster, it is fine to ignore, but it should be a stark warning not to run this anywhere near live data.

The next command enables the experimental flag which allows partial overwrites on erasure coded pools. DO NOT RUN THIS ON PRODUCTION CLUSTERS:

# ceph osd pool set ecpool debug_white_box_testing_ec_overwrites true

Double check that you still have your erasure pool called ecpool and the default RBD pool:

# ceph osd lspools
0 rbd,1 ecpool,

And now create the RBD image. Notice that the actual RBD header object still has to live on a replicated pool, but by providing an additional parameter we can tell Ceph to store the data for this RBD image on an erasure coded pool:

# rbd create Test_On_EC --data-pool=ecpool --size=1G

The command should return without error, and you now have an RBD image backed by an erasure coded pool. You should now be able to use this image with any librbd application.

Note: Partial overwrites on erasure pools require BlueStore to operate efficiently. While filestore will work, performance will be extremely poor.

Troubleshooting the 2147483647 error

An example of this error is shown below, from running the ceph health detail command. If you see 2147483647 listed as one of the OSDs for an erasure coded pool, it normally means that CRUSH was unable to find a sufficient number of OSDs to complete the PG peering process. This is normally due to the number of k+m shards being larger than the number of hosts in the CRUSH topology. However, in some cases this error can still occur even when the number of hosts is equal to or greater than the number of shards. In this scenario, it's important to understand how CRUSH picks OSDs as candidates for data placement.

When CRUSH is used to find a candidate OSD for a PG, it applies the CRUSH map to find an appropriate location in the CRUSH topology. If the result comes back the same as a previously selected OSD, Ceph will retry by passing slightly different values into the CRUSH algorithm to generate another mapping. In some cases, if the number of hosts is similar to the number of erasure shards, CRUSH may run out of attempts before it can find suitable OSD mappings for all the shards. Newer versions of Ceph have mostly fixed these problems by increasing the CRUSH tunable choose_total_tries.
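On clusters where this tunable still needs raising by hand, one hedged way to do it is to round-trip the CRUSH map through crushtool; the value of 100 below is purely illustrative:

# ceph osd getcrushmap -o crushmap.bin
# crushtool -d crushmap.bin -o crushmap.txt
(edit crushmap.txt and raise the "tunable choose_total_tries" line to, say, 100)
# crushtool -c crushmap.txt -o crushmap-new.bin
# ceph osd setcrushmap -i crushmap-new.bin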
Reproducing the problem In order to aid understanding of the problem in more detail, the following steps will demonstrate how to create an erasure coded profile that requires more shards than our 3-node cluster can support. Firstly, as earlier in the article, create a new erasure profile, but modify the k/m parameters to be k=3 m=1: $ ceph osd erasure-code-profile set broken_profile k=3 m=1 plugin=jerasure technique=reed_sol_van And now create a pool with it: $ ceph osd pool create broken_ecpool 128 128 erasure broken_profile If we look at the output from ceph -s, we will see that the PGs for this new pool are stuck in the creating state. The output of ceph health detail shows the reason why, and we see the 2147483647 error. If you encounter this error and it is a result of your erasure profile being larger than the number of hosts or racks (depending on how you have designed your CRUSH map), then the only real solution is to either drop the number of shards or increase the number of hosts. Summary In this article you have learnt what erasure coding is and how it is implemented in Ceph. You should also have an understanding of the different configuration options possible when creating erasure coded pools and their suitability for different types of scenarios and workloads. Resources for Article: Further resources on this subject: Ceph Instant Deployment [article] Working with Ceph Block Device [article] GNU Octave: Data Analysis Examples [article]

Packt
07 Oct 2015
6 min read

Monitoring OpenStack Networks

In this article by Chandan Dutta Chowdhury and Sriram Subramanian, authors of the book OpenStack Networking Cookbook, we will explore various means to monitor network resource utilization using Ceilometer. We will cover the following topics: Virtual Machine bandwidth monitoring L3 bandwidth monitoring (For more resources related to this topic, see here.) Introduction Due to the dynamic nature of virtual infrastructure and multiple users sharing the same cloud platform, the OpenStack administrator needs to track how the tenants use the resources. The data can also help in capacity planning by giving an estimate of the capacity of the physical devices and the trends of resource usage. The OpenStack Ceilometer project provides the telemetry service. It can measure the usage of resources by collecting statistics across the various OpenStack components. The usage data is collected over the message bus or by polling the various components. OpenStack Neutron provides Ceilometer with the statistics related to the virtual networks. The following figure shows you how Ceilometer interacts with the Neutron and Nova services: To implement these recipes, we will use an OpenStack setup as described in the following screenshot: This setup has two compute nodes and one node for the controller and networking services. Virtual Machine bandwidth monitoring OpenStack Ceilometer collects the resource utilization of virtual machines by running a Ceilometer compute agent on all the compute nodes. These agents collect the various metrics related to each virtual machine running on the compute node. The collected data is periodically sent to the Ceilometer collector over the message bus. In this recipe, we will learn how to use the Ceilometer client to check the bandwidth utilization of a virtual machine. Getting ready For this recipe, you will need the following information: The SSH login credentials for a node where the OpenStack client packages are installed A shell RC file that initializes the environment variables for the CLI How to do it… The following steps will show you how to determine the bandwidth utilization of a virtual machine: Using the appropriate credentials, SSH into the OpenStack node installed with the OpenStack client packages. Source the shell RC file to initialize the environment variables required for the CLI commands. Use the nova list command to find the ID of the virtual machine instance that is to be monitored: Use the ceilometer resource-list | grep <virtual-machine-id> command to find the resource ID of the network port associated with the virtual machine. Note down the resource ID for the virtual port associated with the virtual machine for use in the later commands. The virtual port resource ID is a combination of the virtual machine ID and the name of the tap interface for the virtual port. It's named in the form instance-<virtual-machine-id>-<tap-interface-name>: Use ceilometer meter-list -q resource=<virtual-port-resource-id> to find the meters associated with the network port on the virtual machine: Next, use ceilometer statistics -m <meter-name> -q resource=<virtual-port-resource-id> to view the network usage statistics. Use the meters that we discovered in the last step to view the associated data: Ceilometer stores the port bandwidth data for the incoming and outgoing packets and bytes and their rates.
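As a concrete illustration, the full command sequence for a single instance might look like the following minimal sketch. The instance ID and tap interface name are hypothetical placeholders, and the meter names assume the standard network meters shipped with Ceilometer, so substitute the values reported by your own cloud.
$ nova list
$ ceilometer resource-list | grep 4be0af2d
# note the port resource, for example instance-4be0af2d-tapa1b2c3d4
$ ceilometer meter-list -q resource=instance-4be0af2d-tapa1b2c3d4
# typical meters: network.incoming.bytes, network.outgoing.bytes,
# network.incoming.packets, network.outgoing.packets
$ ceilometer statistics -m network.outgoing.bytes -q resource=instance-4be0af2d-tapa1b2c3d4 -p 600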
How it works… The OpenStack Ceilometer compute agent collects the statistics related to the network port connected to the virtual machines and posts them on the message bus. These statistics are collected by the Ceilometer collector daemon. The Ceilometer client can be used to query a meter and filter the statistical data based on the resource ID. L3 bandwidth monitoring The OpenStack Neutron provides you with metering commands in order to enable the Layer 3 (L3) traffic monitoring. The metering commands create a label that can hold a list of the packet matching rules. Neutron counts and associates any L3 packet that matches these rules with the metering label. In this recipe, we will learn how to use the L3 traffic monitoring commands of Neutron to enable packet counting. Getting ready For this recipe, we will use a virtual machine that is connected to a network that, in turn, is connected to a router. The following figure describes the topology: We will use a network called private with CIDR of 10.10.10.0/24. For this recipe, you will need the following information: The SSH login credentials for a node where the OpenStack client packages are installed A shell RC file that initializes the environment variables for CLI The name of the L3 metering label The CIDR for which the traffic needs to be measured How to do it… The following steps will show you how to enable the monitoring traffic to or from any L3 network: Using the appropriate credentials, SSH into the OpenStack node installed with the OpenStack client packages. Source the shell RC file to initialize the environment variables required for the CLI commands. Use the Neutron meter-label-create command to create a metering label. Note the label ID as this will be used later with the Ceilometer commands: Use the Neutron meter-label-rule-create command to create a rule that associates a network address to the label that we created in the last step. In our case, we will count any packet that reaches the gateway from the CIDR 10.10.10.0/24 network to which the virtual machine is connected: Use the ceilometer meter-list command with the resource filter to find the meters associated with the label resource: Use the ceilometer statistics command to view the number of packets matching the metering label: The packet counting is now enabled and the bandwidth statistics can be viewed using Ceilometer. How it works… The Neutron monitoring agent implements the packet counting meter in the L3 router. It uses iptables to implement a packet counter. The Neutron agent collects the counter statistics periodically and posts them on the message bus, which is collected by the Ceilometer collector daemon. Summary In this article, we learned about ways to monitor the usage of virtual and physical networking resources. The resource utilization data can be used to bill the users of a public cloud and debug the infrastructure-related problems. Resources for Article: Further resources on this subject: Using the OpenStack Dash-board [Article] Installing Red Hat CloudForms on Red Hat OpenStack [Article] Securing OpenStack Networking [Article]

Packt
14 Dec 2016
9 min read

Provision IaaS with Terraform

In this article by Stephane Jourdan and Pierre Pomes, the authors of Infrastructure as Code (IAC) Cookbook, the following sections will be covered: Configuring the Terraform AWS provider Creating and using an SSH key pair to use on AWS Using AWS security groups with Terraform Creating an Ubuntu EC2 instance with Terraform (For more resources related to this topic, see here.) Introduction A modern infrastructure often uses multiple providers (AWS, OpenStack, Google Cloud, Digital Ocean, and many others), combined with multiple external services (DNS, mail, monitoring, and others). Many providers offer their own automation tool, but the power of Terraform is that it allows you to manage it all from one place, all using code. With it, you can dynamically create machines at two IaaS providers depending on the environment, register their names at another DNS provider, and enable monitoring at a third-party monitoring company, while configuring the company GitHub account and sending the application logs to an appropriate service. On top of that, it can delegate configuration to those who do it well (configuration management tools such as Chef, Puppet, and so on), all with the same tool. The state of your infrastructure is described, stored, versioned, and shared. In this article, we'll discover how to use Terraform to bootstrap a fully capable infrastructure on Amazon Web Services (AWS), deploying SSH key pairs and securing IAM access keys. Configuring the Terraform AWS provider We can use Terraform with many IaaS providers, such as Google Cloud or Digital Ocean. Here we'll configure Terraform to be used with AWS. For Terraform to interact with an IaaS, it needs to have a provider configured. Getting ready To step through this section, you will need the following: An AWS account with keys A working Terraform installation An empty directory to store your infrastructure code An Internet connection How to do it… To configure the AWS provider in Terraform, we'll need the following three files: A file declaring our variables, an optional description, and an optional default for each (variables.tf) A file setting the variables for the whole project (terraform.tfvars) A provider file (provider.tf) Let's declare our variables in the variables.tf file. We can start by declaring what's usually known as the AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY environment variables: variable "aws_access_key" { description = "AWS Access Key" } variable "aws_secret_key" { description = "AWS Secret Key" } variable "aws_region" { default = "eu-west-1" description = "AWS Region" } Set the two variables matching the AWS account in the terraform.tfvars file. It's not recommended to check this file into source control: it's better to use an example file instead (that is, terraform.tfvars.example). It's also recommended that you use a dedicated Terraform user for AWS, not the root account keys: aws_access_key = "< your AWS_ACCESS_KEY >" aws_secret_key = "< your AWS_SECRET_KEY >" Now, let's tie all this together into a single file, provider.tf: provider "aws" { access_key = "${var.aws_access_key}" secret_key = "${var.aws_secret_key}" region = "${var.aws_region}" } Apply the following Terraform command: $ terraform apply Apply complete! Resources: 0 added, 0 changed, 0 destroyed. This only means the code is valid, not that it can really authenticate with AWS (try it with a bad pair of keys). For this, we'll need to create a resource on AWS.
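Before creating that first resource, the same credentials can be sanity-checked outside Terraform. This is an optional aside that assumes the AWS CLI is installed and configured with the same keys; the account, user ID, and ARN shown here are placeholder values.
$ aws sts get-caller-identity
{
    "UserId": "AIDAEXAMPLEUSERID",
    "Account": "123456789012",
    "Arn": "arn:aws:iam::123456789012:user/terraform"
}
If the keys are wrong, this call fails immediately, which is a quicker signal than debugging a failed resource creation later. Back in the Terraform working directory: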
You now have a new file named terraform.tfstate that has been created at the root of your repository. This file is critical: it's the stored state of your infrastructure. Don't hesitate to look at it; it's a text file. How it works… This first encounter with HashiCorp Configuration Language (HCL), the language used by Terraform, looks pretty familiar: we've declared variables with an optional description for reference. We could have declared them simply with the following: variable "aws_access_key" { } All variables are referenced using the following structure: ${var.variable_name} If the variable has been declared with a default, as our aws_region has been declared with a default of eu-west-1, this value will be used if there's no override in the terraform.tfvars file. What would have happened if we didn't provide a safe default for our variable? Terraform would have asked us for a value when executed: $ terraform apply var.aws_region AWS Region Enter a value: There's more… We've used values directly inside the Terraform code to configure our AWS credentials. If you're already using AWS on the command line, chances are you already have a set of standard environment variables: $ echo ${AWS_ACCESS_KEY_ID} <your AWS_ACCESS_KEY_ID> $ echo ${AWS_SECRET_ACCESS_KEY} <your AWS_SECRET_ACCESS_KEY> $ echo ${AWS_DEFAULT_REGION} eu-west-1 If not, you can simply set them as follows: $ export AWS_ACCESS_KEY_ID="123" $ export AWS_SECRET_ACCESS_KEY="456" $ export AWS_DEFAULT_REGION="eu-west-1" Then Terraform can use them directly, and the only code you have to type would be to declare your provider! That's handy when working with different tools. The provider.tf file will then look as simple as this: provider "aws" { } Creating and using an SSH key pair to use on AWS Now that we have our AWS provider configured in Terraform, let's add an SSH key pair to use on a default account of the virtual machines we intend to launch soon. Getting ready To step through this section, you will need the following: A working Terraform installation An AWS provider configured in Terraform Generate a pair of SSH keys somewhere you remember. An example can be under the keys folder at the root of your repo: $ mkdir keys $ ssh-keygen -q -f keys/aws_terraform -C aws_terraform_ssh_key -N '' An Internet connection How to do it… The resource we want for this is named aws_key_pair. Let's use it inside a keys.tf file, and paste the public key content: resource "aws_key_pair" "admin_key" { key_name = "admin_key" public_key = "ssh-rsa AAAAB3[…]" } This will simply upload your public key to your AWS account under the name admin_key: $ terraform apply aws_key_pair.admin_key: Creating... fingerprint: "" => "<computed>" key_name: "" => "admin_key" public_key: "" => "ssh-rsa AAAAB3[…]" aws_key_pair.admin_key: Creation complete Apply complete! Resources: 1 added, 0 changed, 0 destroyed. If you manually navigate to your AWS account, under EC2 | Network & Security | Key Pairs, you'll now find your key: Another way to use our key with Terraform and AWS would be to read it directly from the file, and that would show us how to use file interpolation with Terraform.
To do this, let's declare a new empty variable to store our public key in variables.tf: variable "aws_ssh_admin_key_file" { } Initialize the variable to the path of the key in terraform.tfvars: aws_ssh_admin_key_file = "keys/aws_terraform" Now let's use it in place of our previous keys.tf code, using the file() interpolation: resource "aws_key_pair" "admin_key" { key_name = "admin_key" public_key = "${file("${var.aws_ssh_admin_key_file}.pub")}" } This is a much clearer and more concise way of accessing the content of the public key from the Terraform resource. It's also easier to maintain, as changing the key will only require replacing the file and nothing more. How it works… Our first resource, aws_key_pair, takes two arguments (a key name and the public key content). That's how all resources in Terraform work. We used our first file interpolation, using a variable, to show how to write more dynamic code for our infrastructure. There's more… Using Ansible, we can create a role to do the same job. Here's how we can manage our EC2 key pair using a variable, under the name admin_key. For simplification, we're using the three usual environment variables here: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION. Here's a typical Ansible file hierarchy: ├── keys │ ├── aws_terraform │ └── aws_terraform.pub ├── main.yml └── roles └── ec2_keys └── tasks └── main.yml In the main file (main.yml), let's declare that our host (localhost) will apply the role dedicated to managing our keys: --- - hosts: localhost roles: - ec2_keys In the ec2_keys main task file, create the EC2 key (roles/ec2_keys/tasks/main.yml): --- - name: ec2 admin key ec2_key: name: admin_key key_material: "{{ item }}" with_file: './keys/aws_terraform.pub' Execute the code with the following command: $ ansible-playbook -i localhost main.yml TASK [ec2_keys : ec2 admin key] ************************************************ ok: [localhost] => (item=ssh-rsa AAAA[…] aws_terraform_ssh) PLAY RECAP ********************************************************************* localhost : ok=2 changed=0 unreachable=0 failed=0 Using AWS security groups with Terraform Amazon's security groups are similar to traditional firewalls, with ingress and egress rules applied to EC2 instances. These rules can be updated on demand. We'll create an initial security group allowing ingress Secure Shell (SSH) traffic only for our own IP address, while allowing all outgoing traffic. Getting ready To step through this section, you will need the following: A working Terraform installation An AWS provider configured in Terraform An Internet connection How to do it… The resource we're using is called aws_security_group. Here's the basic structure: resource "aws_security_group" "base_security_group" { name = "base_security_group" description = "Base Security Group" ingress { } egress { } } We know we want to allow inbound TCP/22 for SSH only for our own IP (replace 1.2.3.4/32 with yours!), and allow everything outbound. Here's how it looks: ingress { from_port = 22 to_port = 22 protocol = "tcp" cidr_blocks = ["1.2.3.4/32"] } egress { from_port = 0 to_port = 0 protocol = "-1" cidr_blocks = ["0.0.0.0/0"] } You can add a Name tag for easier reference later: tags { Name = "base_security_group" } Apply this and you're good to go: $ terraform apply aws_security_group.base_security_group: Creating... […] aws_security_group.base_security_group: Creation complete Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
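Before switching to the AWS console, the result can also be checked from Terraform itself. The following is a minimal sketch, assuming a Terraform release recent enough to include the state subcommand; a second plan run immediately after apply should report no changes, which is a useful idempotency check.
# Preview drift: right after apply, this should propose no changes
$ terraform plan
# List and inspect the resources Terraform is now tracking in its state
$ terraform state list
$ terraform show
The same information is of course visible on the AWS side as well: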
You can see your newly created security group by logging into the AWS Console and navigating to EC2 Dashboard|Network & Security|Security Groups: Another way of accessing the same AWS console information is through the AWS command line: $ aws ec2 describe-security-groups --group-names base_security_group {...} There's more… We can achieve the same result using Ansible. Here's the equivalent of what we just did with Terraform in this section: --- - name: base security group ec2_group: name: base_security_group description: Base Security Group rules: - proto: tcp from_port: 22 to_port: 22 cidr_ip: 1.2.3.4/32 Summary In this article, you learnedhow to configure the Terraform AWS provider, create and use an SSH key pair to use on AWS, and use AWS security groups with Terraform. Resources for Article: Further resources on this subject: Deploying Highly Available OpenStack [article] Introduction to Microsoft Azure Cloud Services [article] Concepts for OpenStack [article]
Packt
17 Feb 2014
11 min read

Determining resource utilization requirements

(For more resources related to this topic, see here.) For those hoping to find a magical catch-all formula that will work in every scenario, you'll have to keep looking. Remember every environment is unique, and even where similarities may arise, the use case your organization has, will most likely be different from another organization. Beyond your specific VM resource requirements, the hosts you are installing ESXi on will also vary; the hardware available to you will affect your consolidation ratio (the number of virtual machines you can fit on a single host). For example, if you have 10 servers that you want to virtualize, and you have determined each requires 4 GB of RAM, you might easily virtualize those 10 servers on a host with 48 GB of memory. However, if your host only has 16 GB of memory, you may need two or three hosts in order to achieve the required performance. Another important aspect to consider is when to collect resource utilization statistics about your servers. Think about the requirements you have for a specific server; let's use your finance department as an example. You can certainly collect resource statistics over a period of time in the middle of the month, and that might work just fine; however, the people in your finance department are more likely to utilize the system heavily during the first few days of the month as they are working on their month end processes. If you collect resource statistics on the 15th, you might miss a huge increase in resource utilization requirements, which could lead to the system not working as expected, making unhappy users. One last thing before we jump into some example statistics; you should consider collecting these statistics over at least two periods for each server: First, during the normal business hours of your organization or the specific department, during a time when systems are likely to be heavily utilized The second round should include an entire day or week so you are aware of the impact of after hours tasks such as backups and anti-virus scans on your environment It's important to have a strong understanding of the use cases for all the systems you will be virtualizing. If you are running your test during the middle of the month, you might miss the increase of traffic for systems utilized heavily only at the end of the month, for example, accounting systems. The more information you collect, the better prepared you will be to determine your resource utilization requirements. There are quite a few commercial tools available to help determine the specific resource requirements for your environment. In fact, if you have an active project and/or budget, check with your server and storage vendor as they can most likely provide tools to assess your environment over a period of time to help you collect this information. If you work with a VMware Partner or the VMware Professional Services Organization (PSO), you could also work with them to run a tool called VMware Capacity Planner. This tool is only available to partners who have passed the corresponding partner exams. For purposes of this article, however, we will look at the statistics we can capture natively within an operating system, for example, using Performance Monitor on Windows and the sar command in Linux. If you are an OS X user, you might be wondering why we are not touching OS X. This is because while Apple allows virtualizing OS X 10.5 and later, it is only supported on the Apple hardware and is not likely an everyday use case. 
If your organization requires virtualizing OSX, ESXi 5.1 is supported on specific Mac Pro desktops with Intel Xeon 5600 series processors and 5.0 is supported on Xserve using Xeon 5500 series processors. The current Apple license agreement allows virtualizing OSX 10.5 and up; of course, you should check for the latest agreement to ensure you are adhering to the license agreement. Monitoring common resource statistics From a statistics perspective, there are four main types of resources you generally monitor: CPU, memory, disk, and network. Unless you have a very chatty application, network utilization is generally very low, but this doesn't mean we won't check on it; however, we probably won't dedicate as much time to it as we do for CPU, memory, and disk. As we think about the CPU and memory, we generally look at utilization in terms of percentages. When we look at example servers, you will see that having an accurate inventory of the physical server is important so we can properly gauge the virtual CPU and memory requirements when we virtualize. If a physical server has dual quad core CPUs and 16 GB of memory, it does not necessarily mean we want to provide the same amount of virtual resources. Disk performance is where many people spend the least amount of time, and those people generally have the most headaches after they have virtualized. Disk performance is probably the most critical aspect to think about when you are planning your virtualization project. Most people only think of storage in terms of storage capacity, generally gigabytes (GB) or terabytes (TB). However, from a server perspective, we are mostly concerned with the amount of input and output per second, otherwise known as IOPS and throughput. We break down IOPS in into reads and writes per second and then their ratio by comparing one with the other. Understanding your I/O patterns will help you design your storage architecture to properly support all your applications. Storage design and understanding is an art and science by itself. Sample workload Let's break this down into a practical example so we can see how we are applying these concepts. In this example, we will look at two different types of servers that are likely to have various resource requirements: Windows Active Directory Domain Controller and a CentOS Apache web server. In this scenario, let's assume that each of these server operating systems and applications are running on dedicated hardware, that is, they are not yet virtual machines. The first step you should take, if you do not have this already, is to document the physical systems, their components, and other relevant information such as computer or DNS name, IP address (es), location, and so on. For larger environments, you may also want to document installed software, user groups or departments, and so on. Collecting statistics on Windows On Windows servers, your first step would be to start performance monitoring. Perform the following steps to do so: Navigate to Start | Run and enter perfmon. Once the Performance Monitor window opens, expand Monitoring Tools and click on Performance Monitor. Here, you could start adding various counters; however, as of Windows 2008/Windows 7, Performance Monitor includes Data Collector Sets. Expand the Data Collector Sets folder and then the System folder; right-click on System Performance and select Start. Performance Monitor will start to collect key statistics about your system and its resource utilization. 
When you are satisfied that you have collected an appropriate amount of data, click on System Performance and select Stop. Your reports will be saved into the Reports folder; navigate to Reports| System, click on the System Performance folder, and finally double-click on the report to see the report. In the following screenshot for our domain controller, you can see we were using 10 percent of the total CPU resources available, 54 percent of the memory, a low 18 IOPS, and 0 percent of the available network resources (this is not really uncommon; I have had busy application servers that barely break 2 percent). Now let's compare what we are utilizing with the actual physical resources available to the server. This server has two dual core processors (four total cores) running at 2 GHz per core (8 GHz total available), 4 GB of memory, two 200 GB SAS drives configured in a RAID 1, and a 1 Gbps network card. Here, performance monitor shows averages, but you should also investigate peak usage. If you scroll down in the report, you will find a menu labeled CPU. Navigate to CPU | Process. Here you will see quite a bit of data, more than the space we have to review in this book; however, if you scroll down, you will see a section called Processor User Time by CPU. Here, your mean (that is, average) column should match fairly closely to the report overview provided for the total, but we also want to look at any spikes we may have encountered. As you can see, this CPU had one core that received a maximum of 35 percent utilization, slightly more than the average suggested. If we take the average CPU utilization at 10 percent of the total CPU, it means we will theoretically require only 800 MHz of CPU power, something a single physical core could easily support. The, memory is also using only half of what is available, so we can most likely reduce the amount of memory to 3 GB and still have room for various changes in operating conditions we might have not encountered during our collection window. Finally, having only 18 IOPS used means that we have plenty of performance left in the drives; even a SATA 7200 RPM drive can provide around 80 IOPS. Collecting statistics on Linux Now let's look at the Linux web server to see how we can collect this same set of information using sar in an additional package with sysstat that can monitor resource utilization over time. This is similar to what you might get from top or iotop. The sysstat package can easily be added to your system by running yum install sysstat, as it is a part of the base repository (yum install sysstat is command format). Once the sysstat package is installed, it will start collecting information about resource utilization every 10 minutes and keep this information for a period of seven days. To see the information, you just need to run the sar command; there are different options to display different sets of information , which we will look at next. Here, we can see that our system is idle right now by viewing the %idle column. A simple way to generate some load on your system is to run dd if=/dev/zero of=/dev/null, which will spike your CPU load to 100 percent, so, don't do this on production systems! Let's look at the output with some CPU load. In this example, you can see that the CPU was under load for about half of the 10-minute collection window. 
One problem here is that unless the CPU spike, in this case to 100 percent, was not consistent for at least 10 minutes, we would potentially miss these spikes using sar with a 10-minute window. This is easily changed by editing /etc/cron.d/sysstat, which tells the system to run this every 10 minutes. During a collection window, one or two minutes may provide more valuable detail. In this example, you can see I am now logging at a five-minute interval instead of 10, so I will have a better chance to find maximum CPU usage during my monitoring period. Now, we are not only concerned with the CPU, but we also want to see memory and disk utilization. To access those statistics, run sar with the following options: The sar –r command will show RAM (memory) statistics. At a basic level, the items we are concerned with here would be the percentage of memory used, which we could use to determine how much memory is actually being utilized. The sar –b command will show disk I/O. From a disk perspective, sar –b will tell us the total number of transactions per second (tps), read transactions per second (rtps), and write transactions per second (wtps). As you can see, you are able to natively collect quite a bit of relevant data about resource utilization on our systems. However, without the help of a vendor or VMware PSO who has access to VMware Capacity Planner, another commercial tool, or a good automation system, this can become difficult to do on a large scale (hundreds or thousands of servers), but certainly not impossible. Resources for Article: Further resources on this subject: Windows 8 with VMware View [Article] Troubleshooting Storage Contention [Article] Networking Performance Design [Article]

Packt
04 Jun 2015
10 min read

Installing OpenStack Swift

In this article by Amar Kapadia, Sreedhar Varma, and Kris Rajana, authors of the book OpenStack Object Storage (Swift) Essentials, we will see how IT administrators can install OpenStack Swift. The version discussed here is the Juno release of OpenStack. Installation of Swift has several steps and requires careful planning before beginning the process. A simple installation consists of installing all Swift components on a single node, and a complex installation consists of installing Swift on several proxy server nodes and storage server nodes. The number of storage nodes can be in the order of thousands across multiple zones and regions. Depending on your installation, you need to decide on the number of proxy server nodes and storage server nodes that you will configure. This article demonstrates a manual installation process; advanced users may want to use utilities such as Puppet or Chef to simplify the process. This article walks you through an OpenStack Swift cluster installation that contains one proxy server and five storage servers. (For more resources related to this topic, see here.) Hardware planning This section describes the various hardware components involved in the setup. Since Swift deals with object storage, disks are going to be a major part of hardware planning. The size and number of disks required should be calculated based on your requirements. Networking is also an important component, where factors such as a public or private network and a separate network for communication between storage servers need to be planned. Network throughput of at least 1 GB per second is suggested, while 10 GB per second is recommended. The servers we set up as proxy and storage servers are dual quad-core servers with 12 GB of RAM. In our setup, we have a total of 15 x 2 TB disks for Swift storage; this gives us a total size of 30 TB. However, with in-built replication (with a default replica count of 3), Swift maintains three copies of the same data. Therefore, the effective capacity for storing files and objects is approximately 10 TB, taking filesystem overhead into consideration. This is further reduced due to less than 100 percent utilization. The following figure depicts the nodes of our Swift cluster configuration: The storage servers have container, object, and account services running in them. Server setup and network configuration All the servers are installed with the Ubuntu server operating system (64-bit LTS version 14.04). You'll need to configure three networks, which are as follows: Public network: The proxy server connects to this network. This network provides public access to the API endpoints within the proxy server. Storage network: This is a private network and it is not accessible to the outside world. All the storage servers and the proxy server will connect to this network. Communication between the proxy server and the storage servers and communication between the storage servers take place within this network. In our configuration, the IP addresses assigned in this network are 172.168.10.0 and 172.168.10.99. Replication network: This is also a private network that is not accessible to the outside world. It is dedicated to replication traffic, and only storage servers connect to it. All replication-related communication between storage servers takes place within this network. In our configuration, the IP addresses assigned in this network are 172.168.9.0 and 172.168.9.99. 
This network is optional, and if it is set up, the traffic on it needs to be monitored closely. Pre-installation steps In order for the various servers to communicate easily, edit the /etc/hosts file and add the host names of each server to it. This has to be done on all the nodes. The following screenshot shows an example of the contents of the /etc/hosts file of the proxy server node: Install the Network Time Protocol (NTP) service on the proxy server node and the storage server nodes. This helps all the nodes synchronize their services effectively without any clock delays. The pre-installation steps to be performed are as follows: Run the following command to install the NTP service: # apt-get install ntp Configure the proxy server node to be the reference server for the storage server nodes to set their time from. Make sure that the following line is present in /etc/ntp.conf for the NTP configuration on the proxy server node: server ntp.ubuntu.com For the NTP configuration on the storage server nodes, add the following line to /etc/ntp.conf. Comment out the remaining lines with server addresses such as 0.ubuntu.pool.ntp.org, 1.ubuntu.pool.ntp.org, 2.ubuntu.pool.ntp.org, and 3.ubuntu.pool.ntp.org: # server 0.ubuntu.pool.ntp.org # server 1.ubuntu.pool.ntp.org # server 2.ubuntu.pool.ntp.org # server 3.ubuntu.pool.ntp.org server s-swift-proxy Restart the NTP service on each server with the following command: # service ntp restart Downloading and installing Swift The Ubuntu Cloud Archive is a special repository that provides users with the ability to install new releases of OpenStack. The steps required to download and install Swift are as follows: Enable the capability to install new releases of OpenStack, and install the latest version of Swift on each node using the following commands. The second command shown here creates a file named cloudarchive-juno.list in /etc/apt/sources.list.d, whose content is "deb http://ubuntu-cloud.archive.canonical.com/ubuntu": Now, update the OS using the following command: # apt-get update && apt-get dist-upgrade On all the Swift nodes, we will install the prerequisite software and services using this command: # apt-get install swift rsync memcached python-netifaces python-xattr python-memcache Next, we create a swift folder under /etc and give the swift user permission to access this folder, using the following commands: # mkdir -p /etc/swift/ # chown -R swift:swift /etc/swift Download the /etc/swift/swift.conf file from GitHub using this command: # curl -o /etc/swift/swift.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/swift.conf-sample Modify the /etc/swift/swift.conf file and add a variable called swift_hash_path_suffix in the swift-hash section. We then create a unique hash string using # python -c "from uuid import uuid4; print uuid4()" or # openssl rand -hex 10, and assign it to this variable, as shown in the following configuration option: We then add another variable called swift_hash_path_prefix to the swift-hash section, and assign to it another hash string created using the method described in the preceding step. These strings will be used in the hashing process to determine the mappings in the ring. The swift.conf file should be identical on all the nodes in the cluster. Setting up storage server nodes This section explains the additional steps required to set up the storage server nodes, which will contain the object, container, and account services.
Installing services The first step required to set up a storage server node is installing services. Let's look at the steps involved: On each storage server node, install the packages for the swift-account, swift-container, and swift-object services, along with xfsprogs (for the XFS filesystem), using this command: # apt-get install swift-account swift-container swift-object xfsprogs Download the account-server.conf, container-server.conf, and object-server.conf samples from GitHub using the following commands: # curl -o /etc/swift/account-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/account-server.conf-sample # curl -o /etc/swift/container-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/container-server.conf-sample # curl -o /etc/swift/object-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/object-server.conf-sample Edit the /etc/swift/account-server.conf file with the following section: Edit the /etc/swift/container-server.conf file with this section: Edit the /etc/swift/object-server.conf file with the following section: Formatting and mounting hard disks On each storage server node, we need to identify the hard disks that will be used to store the data. We will then format the hard disks and mount them on a directory, which Swift will then use to store data. We will not create any RAID levels or subpartitions on these hard disks because they are not necessary for Swift. They will be used as entire disks. The operating system will be installed on separate disks, which will be RAID configured. First, identify the hard disks that are going to be used for storage and format them. In our storage server, we have identified sdb, sdc, and sdd to be used for storage. We will perform the following operations on sdb. These four steps should be repeated for sdc and sdd as well: Carry out the partitioning for sdb and create the filesystem using these commands: # fdisk /dev/sdb # mkfs.xfs /dev/sdb1 Then let's create a directory at /srv/node/sdb1 that will be used to mount the filesystem, and give the swift user permission to access this directory. These operations can be performed using the following commands: # mkdir -p /srv/node/sdb1 # chown -R swift:swift /srv/node/sdb1 We set up an entry in fstab for the sdb1 partition on the sdb hard disk, as follows. This will automatically mount sdb1 on /srv/node/sdb1 upon every boot. Add the following line to the /etc/fstab file: /dev/sdb1 /srv/node/sdb1 xfs noatime,nodiratime,nobarrier,logbufs=8 0 2 Mount sdb1 on /srv/node/sdb1 using the following command: # mount /srv/node/sdb1 RSYNC and RSYNCD In order for Swift to perform the replication of data, we need to configure rsync by configuring rsyncd.conf. This is done by performing the following steps: Create the rsyncd.conf file in the /etc folder with the following content: # vi /etc/rsyncd.conf We are setting up synchronization within the replication network by including the following lines in the configuration file: 172.168.9.52 is the IP address that is on the replication network for this storage server. Use the appropriate replication network IP addresses for the corresponding storage servers.
We then have to edit the /etc/default/rsync file and set RSYNC_ENABLE to true using the following configuration option: RSYNC_ENABLE=true Next, we restart the rsync service using this command: # service rsync restart Then we create the swift cache and recon directories using the following commands, and then set their permissions: # mkdir -p /var/cache/swift # mkdir -p /var/swift/recon Setting the permissions is done using these commands: # chown -R swift:swift /var/cache/swift # chown -R swift:swift /var/swift/recon Repeat these steps on every storage server. Setting up the proxy server node This section explains the steps required to set up the proxy server node, which are as follows: Install the following services only on the proxy server node: # apt-get install python-swiftclient python-keystoneclient python-keystonemiddleware swift-proxy Swift itself doesn't support HTTPS. OpenSSL has already been installed as part of the operating system installation to support HTTPS. We are going to use the OpenStack Keystone service for authentication. In order to set up the proxy-server.conf file for this, we download the configuration file from the following link and edit it: https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/proxy-server.conf-sample # vi /etc/swift/proxy-server.conf The proxy-server.conf file should be edited to set the correct auth_host, admin_token, admin_tenant_name, admin_user, and admin_password values: admin_token = 01d8b673-9ebb-41d2-968a-d2a85daa1324 admin_tenant_name = admin admin_user = admin admin_password = changeme Next, we create a keystone-signing directory and give permissions to the swift user using the following commands: # mkdir -p /home/swift/keystone-signing # chown -R swift:swift /home/swift/keystone-signing Summary In this article, you learned how to install and set up the OpenStack Swift service to provide object storage, and how to configure the Keystone service to provide authentication for users to access the Swift object storage. Resources for Article: Further resources on this subject: Troubleshooting in OpenStack Cloud Computing [Article] Using OpenStack Swift [Article] Playing with Swift [Article]

Guest Contributor
14 Aug 2018
8 min read

Modern Cloud Native architectures: Microservices, Containers, and Serverless - Part 2

This whitepaper is written by Mina Andrawos, an experienced engineer who has developed deep experience in the Go language, and modern software architectures. He regularly writes articles and tutorials about the Go language, and also shares open source projects. Mina Andrawos has authored the book Cloud Native programming with Golang, which provides practical techniques, code examples, and architectural patterns required to build cloud native microservices in the Go language.He is also the author of the Mastering Go Programming, and the Modern Golang Programming video courses. We published Part 1 of this paper yesterday and here we come up with Part 2 which involves Containers and Serverless applications. Let us get started: Containers The technology of software containers is the next key technology that needs to be discussed to practically explain cloud native applications. A container is simply the idea of encapsulating some software inside an isolated user space or “container.” For example, a MySQL database can be isolated inside a container where the environmental variables, and the configurations that it needs will live. Software outside the container will not see the environmental variables or configuration contained inside the container by default. Multiple containers can exist on the same local virtual machine, cloud virtual machine, or hardware server. Containers provide the ability to run numerous isolated software services, with all their configurations, software dependencies, runtimes, tools, and accompanying files, on the same machine. In a cloud environment, this ability translates into saved costs and efforts, as the need for provisioning and buying server nodes for each microservices will diminish, since different microservices can be deployed on the same host without disrupting each other. Containers  combined with microservices architectures are powerful tools to build modern, portable, scalable, and cost efficient software. In a production environment, more than a single server node combined with numerous containers would be needed to achieve scalability and redundancy. Containers also add more benefits to cloud native applications beyond microservices isolation. With a container, you can move your microservices, with all the configuration, dependencies, and environmental variables that it needs, to fresh server nodes without the need to reconfigure the environment, achieving powerful portability. Due to the power and popularity of the software containers technology, some new operating systems like CoreOS, or Photon OS, are built from the ground up to function as hosts for containers. One of the most popular software container projects in the software industry is Docker. Major organizations such as Cisco, Google, and IBM utilize Docker containers in their infrastructure as well as in their products. Another notable project in the software containers world is Kubernetes. Kubernetes is a tool that allows the automation of deployment, management, and scaling of containers. It was built by Google to facilitate the management of their containers, which are counted by billions per week. Kubernetes provides some powerful features such as load balancing between containers, restart for failed containers, and orchestration of storage utilized by the containers. The project is part of the cloud native foundation along with Prometheus. 
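To make the isolation idea concrete, here is a minimal sketch of running the MySQL example mentioned above inside a Docker container. The container name, credentials, host volume path, and image tag are placeholder choices, not recommendations.
# Run MySQL in an isolated container; its environment variables and data
# live inside the container and its volume rather than on the host
$ docker run -d --name orders-db \
    -e MYSQL_ROOT_PASSWORD=changeme \
    -e MYSQL_DATABASE=orders \
    -v /srv/orders-db:/var/lib/mysql \
    -p 3306:3306 \
    mysql:5.7
# Processes outside the container do not see these variables
$ docker exec orders-db env | grep MYSQL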
Container complexities In case of containers, sometimes the task of managing them can get rather complex for the same reasons as managing expanding numbers of microservices. As containers or microservices grow in size, there needs to be a mechanism to identify where each container or microservices is deployed, what their purpose is, and what they need in resources to keep running. Serverless applications Serverless architecture is a new software architectural paradigm that was popularized with the AWS Lambda service. In order to fully understand serverless applications, we must first cover an important concept known as ‘Function As A service’, or FaaS for short. Function as a service or FaaS is the idea that a cloud provider such as Amazon or even a local piece of software such as Fission.io or funktion would provide a service, where a user can request a function to run remotely in order to perform a very specific task, and then after the function concludes, the function results return back to the user. No services or stateful data are maintained and the function code is provided by the user to the service that runs the function. The idea behind properly designed cloud native production applications that utilize the serverless architecture is that instead of building multiple microservices expected to run continuously in order to carry out individual tasks, build an application that has fewer microservices combined with FaaS, where FaaS covers tasks that don’t need services to run continuously. FaaS is a smaller construct than a microservice. For example, in case of the event booking application we covered earlier, there were multiple microservices covering different tasks. If we use a serverless applications model, some of those microservices would be replaced with a number of functions that serve their purpose. Here is a diagram that showcases the application utilizing a serverless architecture: In this diagram, the event handler microservices as well as the booking handler microservices were replaced with a number of functions that produce the same functionality. This eliminates the need to run and maintain the two existing microservices. Serverless architectures have the advantage that no virtual machines and/or containers need to be provisioned to build the part of the application that utilizes FaaS. The computing instances that run the functions cease to exist from the user point of view once their functions conclude. Furthermore, the number of microservices and/or containers that need to be monitored and maintained by the user decreases, saving cost, time, and effort. Serverless architectures provide yet another powerful software building tool in the hands of software engineers and architects to design flexible and scalable software. Known FaaS are AWS Lambda by Amazon, Azure Functions by Microsoft, Cloud Functions by Google, and many more. Another definition for serverless applications is the applications that utilize the BaaS or backend as a service paradigm. BaaS is the idea that developers only write the client code of their application, which then relies on several software pre-built services hosted in the cloud, accessible via APIs. BaaS is popular in mobile app programming, where developers would rely on a number of backend services to drive the majority of the functionality of the application. Examples of BaaS services are: Firebase, and Parse. 
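As a small illustration of the FaaS model described above, the following sketch registers and invokes a single function with the AWS CLI against AWS Lambda. The function name, IAM role ARN, handler, and payload are placeholder values, and depending on your CLI version the invoke call may need an extra flag for payload encoding; the point is simply that no server is provisioned or maintained by the user at any step.
# Package a single-purpose function and hand it to the FaaS provider
$ zip function.zip handler.py
$ aws lambda create-function \
    --function-name resolve-booking \
    --runtime python3.9 \
    --handler handler.lambda_handler \
    --role arn:aws:iam::123456789012:role/lambda-exec-role \
    --zip-file fileb://function.zip
# Invoke it on demand; compute exists only for the duration of the call
$ aws lambda invoke --function-name resolve-booking \
    --payload '{"booking_id": "42"}' response.json
$ cat response.json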
Disadvantages of serverless applications Similarly to microservices and cloud native applications, the serverless architecture is not suitable for all scenarios. The functions provided by FaaS don’t keep state by themselves which means special considerations need to be observed when writing the function code. This is unlike a full microservice, where the developer has full control over the state. One approach to keep state in case of FaaS, in spite of this limitation, is to propagate the state to a database or a memory cache like Redis. The startup times for the functions are not always fast since there is time allocated to sending the request to the FaaS service provider then the time needed to start a computing instance that runs the function in some cases. These delays have to be accounted for when designing serverless applications. FaaS do not run continuously like microservices, which makes them unsuitable for any task that requires continuous running of the software. Serverless applications have the same limitation as other cloud native applications where portability of the application from one cloud provider to another or from the cloud to a local environment becomes challenging because of vendor lock-in Conclusion Cloud computing architectures have opened avenues for developing efficient, scalable, and reliable software. This paper covered some significant concepts in the world of cloud computing such as microservices, cloud native applications, containers, and serverless applications. Microservices are the building blocks for most scalable cloud native applications; they decouple the application tasks into various efficient services. Containers are how microservices could be isolated and deployed safely to production environments without polluting them.  Serverless applications decouple application tasks into smaller constructs mostly called functions that can be consumed via APIs. Cloud native applications make use of all those architectural patterns to build scalable, reliable, and always available software. You read Part 2 of of Modern cloud native architectures, a white paper by Mina Andrawos. Also read Part 1 which includes Microservices and Cloud native applications with their advantages and disadvantages. If you are interested to learn more, check out Mina’s Cloud Native programming with Golang to explore practical techniques for building cloud-native apps that are scalable, reliable, and always available. About Author: Mina Andrawos Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go's open source projects. He has written numerous Go applications with varying degrees of complexity. Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled with the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management. Build Java EE containers using Docker [Tutorial] Are containers the end of virtual machines? Why containers are driving DevOps

Packt
26 Dec 2016
25 min read

Introduction to Ansible

In this article by Walter Bentley, the author of the book OpenStack Administration with Ansible 2 - Second Edition, we will take a high-level look at Ansible 2.0 and the components that make up this open source configuration management tool. We will cover the definition of the Ansible components and their typical use. Also, we will discuss how to define variables for roles and how to define/set facts about the hosts for the playbooks. Next, we will transition into how to set up your Ansible environment and the ways you can define the host inventory used to run your playbooks against. We will then cover some of the new components introduced in Ansible 2.0, named Blocks and Strategies, and also review the cloud integrations that are natively part of the Ansible framework. Finally, the article will finish up with a working example of a playbook that confirms the host connectivity required to use Ansible. The following topics are covered: Ansible 2.0 overview What are playbooks, roles, and modules? Setting up the environment Variables and facts Defining the inventory Blocks and Strategies Cloud integrations (For more resources related to this topic, see here.) Ansible 2.0 overview Ansible, in its simplest form, has been described as a Python-based open source IT automation tool that can be used to configure/manage systems, deploy software (or almost anything), and provide orchestration of a process. These are just a few of the many possible use cases for Ansible. In my previous life as a production support infrastructure engineer, I wish such a tool had existed. I would surely have had much more sleep and far fewer gray hairs. One thing that always stood out to me in regard to Ansible is that the developers' first and foremost goal was to create a tool that offers simplicity and maximum ease of use. In a world filled with complicated and intricate software, keeping it simple goes a long way for most IT professionals. Staying with the goal of keeping things simple, Ansible handles configuration/management of hosts solely through Secure Shell (SSH). Absolutely no daemon or agent is required. The server or workstation where you run the playbooks from only needs Python and a few other packages installed, most of which are likely already present. Honestly, it does not get simpler than that. The automation code used with Ansible is organized into playbooks and roles, which are written in the YAML markup format. Ansible follows the YAML formatting and structure within the playbooks/roles. Being familiar with YAML formatting helps in creating your playbooks/roles. If you are not familiar, do not worry, as it is very easy to pick up (it is all about the spaces and dashes). The playbooks and roles are in a non-compiled format, making the code very simple to read if you are familiar with standard Unix/Linux commands. There is also a suggested directory structure for creating playbooks. This is another of my favorite features of Ansible, enabling you to review and/or use playbooks written by anyone else with little to no direction needed. It is strongly suggested that you review the Ansible playbook best practices before getting started: http://docs.ansible.com/playbooks_best_practices.html. I also find the overall Ansible website very intuitive and filled with great examples at http://docs.ansible.com. My favorite excerpt from the Ansible playbook best practices is under the Content Organization section.
Having a clear understanding of how to organize your automation code proved very helpful to me. The suggested directory layout for playbooks is as follows:

group_vars/
   group1                # here we assign variables to particular groups
   group2                # ""
host_vars/
   hostname1             # if systems need specific variables, put them here
   hostname2             # ""
library/                 # if any custom modules, put them here (optional)
filter_plugins/          # if any custom filter plugins, put them here (optional)

site.yml                 # master playbook
webservers.yml           # playbook for webserver tier
dbservers.yml            # playbook for dbserver tier

roles/
    common/              # this hierarchy represents a "role"
        tasks/
            main.yml     # <-- tasks file can include smaller files if warranted
        handlers/
            main.yml     # <-- handlers file
        templates/       # <-- files for use with the template resource
            ntp.conf.j2  # <------- templates end in .j2
        files/
            bar.txt      # <-- files for use with the copy resource
            foo.sh       # <-- script files for use with the script resource
        vars/
            main.yml     # <-- variables associated with this role
        defaults/
            main.yml     # <-- default lower priority variables for this role
        meta/
            main.yml     # <-- role dependencies

It is now time to dig deeper into reviewing what playbooks, roles, and modules consist of. This is where we will break down each of these components' distinct purposes.

What are playbooks, roles, and modules?

The automation code you will create to be run by Ansible is broken down in hierarchical layers. Envision a pyramid with its multiple levels of elevation. We will start at the top and discuss playbooks first.

Playbooks

Imagine that a playbook is the very topmost triangle of the pyramid. A playbook takes on the role of executing all of the lower level code contained in a role. It can also be seen as a wrapper to the roles created. We will cover the roles in the next section. Playbooks also contain other high-level runtime parameters, such as the host(s) to run the playbook against, the root user to use, and/or if the playbook needs to be run as a sudo user. These are just a few of the many playbook parameters you can add. Below is an example of what the syntax of a playbook looks like:

---
# Sample playbooks structure/syntax.

- hosts: dbservers
  remote_user: root
  become: true

  roles:
    - mysql-install

In the preceding example, you will note that the playbook begins with ---. This is required as the heading (line 1) for each playbook and role. Also, please note the spacing structure at the beginning of each line. The easiest way to remember it is each main command starts with a dash (-). Then, every subcommand starts with two spaces and repeats the lower in the code hierarchy you go. As we walk through more examples, it will start to make more sense.

Let's step through the preceding example and break down the sections. The first step in the playbook was to define what hosts to run the playbook against; in this case, it was dbservers (which can be a single host or list of hosts). The next area sets the user to run the playbook as locally, remotely, and it enables executing the playbook as sudo. The last section of the syntax lists the roles to be executed.

The earlier example is similar to the formatting of the other playbooks. This format incorporates defining roles, which allows for scaling out playbooks and reusability (you will find the most advanced playbooks structured this way).
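Tying this back to the suggested directory layout above, the master site.yml can be nothing more than a thin wrapper that pulls in the tier playbooks. The following is a minimal hedged sketch; it assumes the webservers.yml and dbservers.yml playbooks from that layout exist and are valid playbooks in their own right:

---
# site.yml - master playbook (illustrative sketch)
# Each included file is itself a complete playbook for one tier.

- include: webservers.yml
- include: dbservers.yml

With this structure you could run ansible-playbook site.yml to configure every tier in one pass, or run a single tier playbook on its own when only that tier has changed.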
With Ansible's high level of flexibility, you can also create playbooks in a simpler consolidated format. An example of such kind is as follows:

---
# Sample simple playbooks structure/syntax

- name: Install MySQL Playbook
  hosts: dbservers
  remote_user: root
  become: true

  tasks:
    - name: Install MySQL
      apt: name={{item}} state=present
      with_items:
        - libselinux-python
        - mysql
        - mysql-server
        - MySQL-python

    - name: Copying my.cnf configuration file
      template: src=cust_my.cnf dest=/etc/my.cnf mode=0755

    - name: Prep MySQL db
      command: chdir=/usr/bin mysql_install_db

    - name: Enable MySQL to be started at boot
      service: name=mysqld enabled=yes state=restarted

    - name: Prep MySQL db
      command: chdir=/usr/bin mysqladmin -u root password 'passwd'

Now that we have reviewed what playbooks are, we will move on to reviewing roles and their benefits.

Roles

Moving down to the next level of the Ansible pyramid, we will discuss roles. The most effective way to describe roles is the breaking up of a playbook into multiple smaller files. So, instead of having one long playbook with multiple tasks defined, all handling separately related steps, you can break the playbook into individual specific roles. This format keeps your playbooks simple and leads to the ability to reuse roles between playbooks.

The best advice I personally received concerning creating roles is to keep them simple. Try to create a role to do a specific function, such as just installing a software package. You can then create a second role to just do configurations. In this format, you can reuse the initial installation role over and over without needing to make code changes for the next project.

The typical syntax of a role can be found here and would be placed into a file named main.yml within the roles/<name of role>/tasks directory:

---
- name: Install MySQL
  apt: name={{item}} state=present
  with_items:
    - libselinux-python
    - mysql
    - mysql-server
    - MySQL-python

- name: Copying my.cnf configuration file
  template: src=cust_my.cnf dest=/etc/my.cnf mode=0755

- name: Prep MySQL db
  command: chdir=/usr/bin mysql_install_db

- name: Enable MySQL to be started at boot
  service: name=mysqld enabled=yes state=restarted

- name: Prep MySQL db
  command: chdir=/usr/bin mysqladmin -u root password 'passwd'

The complete structure of a role is identified in the directory layout found in the Ansible 2.0 overview section of this article. We will review additional functions of roles as we step through the working examples. With having covered playbooks and roles, we are prepared to cover the last topic in this session, which is modules.

Modules

Another key feature of Ansible is that it comes with predefined code that can control system functions, named modules. The modules are executed directly against the remote host(s) or via playbooks. The execution of a module generally requires you to pass a set number of arguments. The Ansible website (http://docs.ansible.com/modules_by_category.html) does a great job of documenting every available module and the possible arguments to pass to that module. The documentation for each module can also be accessed via the command line by executing the command ansible-doc <module name>.

The use of modules will always be the recommended approach within Ansible as they are written to avoid making the requested change to the host unless the change needs to be made. This is very useful when re-executing a playbook against a host more than once. The modules are smart enough to know not to re-execute any steps that have already completed successfully, unless some argument or command is changed.
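This change-only behavior pairs nicely with handlers, which run only when a task actually reports a change. The following is a small hedged sketch, not taken from the book's roles; the role name and handler name are illustrative, and it reuses the handlers/main.yml location from the directory layout shown earlier:

---
# roles/mysql-install/tasks/main.yml (fragment, illustrative)
- name: Copying my.cnf configuration file
  template: src=cust_my.cnf dest=/etc/my.cnf mode=0755
  notify: restart mysql

# roles/mysql-install/handlers/main.yml (illustrative)
- name: restart mysql
  service: name=mysqld state=restarted

On the first run, the template task changes the file and the handler restarts MySQL; on later runs, an unchanged template reports ok and the restart is skipped.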
Another thing worth noting is that with every new release of Ansible, additional modules are introduced. Personally, the most exciting addition to Ansible 2.0 is the updated and extended set of modules intended to ease the management of your OpenStack cloud. Referring back to the role example shared earlier, you will note the use of various modules. The modules used are highlighted here again to provide further clarity:

---
- name: Install MySQL
  apt: name={{item}} state=present
  with_items:
    - libselinux-python
    - mysql
    - mysql-server
    - MySQL-python

- name: Copying my.cnf configuration file
  template: src=cust_my.cnf dest=/etc/my.cnf mode=0755

- name: Prep MySQL db
  command: chdir=/usr/bin mysql_install_db

- name: Enable MySQL to be started at boot
  service: name=mysqld enabled=yes state=restarted
...

Another feature worth mentioning is that you are able to not only use the current modules, but you can also write your very own modules. Although the core of Ansible is written in Python, your modules can be written in almost any language. Underneath it all, the modules technically return JSON format data, thus allowing for the language flexibility.

In this section, we were able to cover the top two sections of our Ansible pyramid, playbooks and roles. We also reviewed the use of modules, that is, the built-in power behind Ansible. Next, we transition into other key features of Ansible: variable substitution and gathering host facts.

Setting up the environment

Before you can start experimenting with Ansible, you must install it first. There is no need to duplicate the great installation documentation already available at http://docs.ansible.com/. I would encourage you to go to the following URL and choose an install method of your choice: http://docs.ansible.com/ansible/intro_installation.html.

If you are installing Ansible on Mac OS, I found using Homebrew was much simpler and more consistent. More details on using Homebrew can be found at http://brew.sh. The command to install Ansible with Homebrew is brew install ansible.

Upgrading to Ansible 2.0

It is very important to note that in order to use the new features that are part of Ansible version 2.0, you must update the version running on your OSA deployment node. The version currently running on the deployment node is either 1.9.4 or 1.9.5. The method that seemed to work well every time is outlined here. This part is a bit experimental, so please make a note of any warnings or errors incurred. From the deployment node, execute the following commands:

$ pip uninstall -y ansible
$ sed -i 's/^export ANSIBLE_GIT_RELEASE.*/export ANSIBLE_GIT_RELEASE=${ANSIBLE_GIT_RELEASE:-"v2.1.1.0-1"}/' /opt/openstack-ansible/scripts/bootstrap-ansible.sh
$ cd /opt/openstack-ansible
$ ./scripts/bootstrap-ansible.sh

New OpenStack client authentication

Alongside the introduction of the new python-openstackclient CLI was the unveiling of the os-client-config library. This library offers an additional way to provide/configure authentication credentials for your cloud. The new OpenStack modules that are part of Ansible 2.0 leverage this library through a package named shade. Through the use of os-client-config and shade, you can now manage multiple cloud credentials within a single file named clouds.yml. When deploying OSA, I discovered that shade will search for this file in the $HOME/.config/openstack/ directory wherever the playbook/role and CLI command is executed.
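Once the clouds.yml file (shown next) is in place, an OpenStack module can reference the cloud purely by name. Here is a minimal hedged sketch of such a task; the image, flavor, and network names are illustrative placeholders, not values from this environment:

---
# Illustrative only: boot an instance using the cloud entry named "default"
# from clouds.yml. Assumes shade is installed on the host running the play.
- hosts: localhost
  connection: local
  tasks:
    - name: Launch a test instance via the os_server module
      os_server:
        cloud: default
        name: ansible-test01
        image: cirros
        flavor: m1.tiny
        network: private
        state: present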
A working example of the clouds.yml file is shown as follows:

# Ansible managed: /etc/ansible/roles/openstack_openrc/templates/clouds.yaml.j2 modified on 2016-06-16 14:00:03 by root on 082108-allinone02
clouds:
  default:
    auth:
      auth_url: http://172.29.238.2:5000/v3
      project_name: admin
      tenant_name: admin
      username: admin
      password: passwd
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne
    interface: internal
    identity_api_version: "3"

Using this new authentication method drastically simplifies creating automation code to work on an OpenStack environment. Instead of passing a series of authentication parameters in line with the command, you can just pass a single parameter, --os-cloud=default. The Ansible OpenStack modules can also use this new authentication method. More details about os-client-config can be found at: http://docs.openstack.org/developer/os-client-config.

Installing shade is required to use the Ansible OpenStack modules in version 2.0. Shade will be required to be installed directly on the deployment node and the Utility container (if you decide to use this option). If you encounter problems installing shade, try the command pip install shade --isolated.

Variables and facts

Anyone who has ever attempted to create some sort of automation code, whether via bash or Perl scripts, knows that being able to define variables is an essential component. Although Ansible does not compare with the other programming languages mentioned, it does contain some core programming language features such as variable substitution.

Variables

To start, let's first define the meaning of variables and their use in the event this is a new concept.

Variable (computer science): a symbolic name associated with a value and whose associated value may be changed

Using variables allows you to set a symbolic placeholder in your automation code that you can substitute values for on each execution. Ansible accommodates defining variables within your playbooks and roles in various ways. When dealing with OpenStack and/or cloud technologies in general, being able to adjust your execution parameters on the fly is critical. We will step through a few ways how you can set variable placeholders in your playbooks, how to define variable values, and how you can register the result of a task as a variable.

Setting variable placeholders

In the event you wanted to set a variable placeholder within your playbooks, you would add the following syntax like this:

- name: Copying my.cnf configuration file
  template: src=cust_my.cnf dest={{ CONFIG_LOC }} mode=0755

In the preceding example, the variable CONFIG_LOC was added in the place of the configuration file location (/etc/my.cnf) designated in the earlier example. When setting the placeholder, the variable name must be encased within {{ }} as shown in the example.

Defining variable values

Now that you have added the variable to your playbook, you must define the variable value. This can be done easily by passing command-line values as follows:

$ ansible-playbook base.yml --extra-vars "CONFIG_LOC=/etc/my.cnf"

Or you can define the values directly in your playbook, within each role, or include them inside of global playbook variable files. Here are the examples of the three options. Define a variable value directly in your playbook by adding the vars section:

---
# Sample simple playbooks structure/syntax

- name: Install MySQL Playbook
  hosts: dbservers
  ...

  vars:
    CONFIG_LOC: /etc/my.cnf
...
Define a variable value within each role by creating a variable file named main.yml within the vars/ directory of the role, with the following contents:

---
CONFIG_LOC: /etc/my.cnf

To define the variable value inside of the global playbook, you would first create a host-specific variable file within the group_vars/ directory in the root of the playbook directory, with exactly the same contents as mentioned earlier. In this case, the variable file must be named to match the host or host group name defined within the hosts file. As in the earlier example, the host group name is dbservers; in turn, a file named dbservers would be created within the group_vars/ directory.

Registering variables

The situation at times arises when you want to capture the output of a task. Within the process of capturing the result, you are in essence registering a dynamic variable. This type of variable is slightly different from the standard variables we have covered so far. Here is an example of registering the result of a task to a variable:

- name: Check Keystone process
  shell: ps -ef | grep keystone
  register: keystone_check

The registered variable value data structure can be stored in a few formats. It will always follow a base JSON format, but the value can be stored under different attributes. Personally, I have found it difficult at times to blindly determine the format. The tip given here will save you hours of troubleshooting: to review the data structure of a registered variable when running a playbook, you can use the debug module, such as adding this to the previous example: - debug: var=keystone_check.

Facts

When Ansible runs a playbook, one of the first things it does on your behalf is gather facts about the host before executing tasks or roles. The information gathered about the host will range from base information such as operating system and IP addresses to detailed information such as the hardware type/resources. The details captured are then stored in a variable named facts. You can find a complete list of available facts on the Ansible website at: http://docs.ansible.com/playbooks_variables.html#information-discovered-from-systems-facts.

You have the option to disable the fact-gathering process by adding the following to your playbook: gather_facts: false. Facts about a host are captured by default unless the feature is disabled. A quick way of viewing all facts associated with a host is to manually execute the following via the command line:

$ ansible dbservers -m setup

There is plenty more you can do with facts, and I would encourage you to take some time reviewing them in the Ansible documentation.
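Before we leave variables and facts, here is a short hedged sketch that ties registered variables and gathered facts together; the conditional expression and the service being checked are illustrative, not taken from a real deployment:

---
# Illustrative: combine a gathered fact with a registered result.
- hosts: dbservers
  tasks:
    - name: Check Keystone process
      shell: ps -ef | grep keystone
      register: keystone_check
      ignore_errors: true

    - name: Report findings
      debug:
        msg: "{{ ansible_hostname }} runs {{ ansible_distribution }}; keystone found: {{ keystone_check.rc == 0 }}"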
Next, we will learn more about the base of our pyramid, the host inventory. Without an inventory of hosts to run the playbooks against, you would be creating the automation code for nothing. So to close out this article, we will dig deeper into how Ansible handles host inventory, whether it be in a static and/or dynamic format.

Defining the inventory

The process of defining a collection of hosts to Ansible is named the inventory. A host can be defined using its fully qualified domain name (FQDN), local hostname, and/or its IP address. Since Ansible uses SSH to connect to the hosts, you can provide any alias for the host that the machine where Ansible is installed can understand. Ansible expects the inventory file to be in an INI-like format and named hosts. By default, the inventory file is usually located in the /etc/ansible directory and will look as follows:

athena.example.com

[ocean]
aegaeon.example.com
ceto.example.com

[air]
aeolus.example.com
zeus.example.com
apollo.example.com

Personally, I have found the default inventory file to be located in different places depending on the operating system Ansible is installed on. With that point, I prefer to use the -i command-line option when executing a playbook. This allows me to designate the specific hosts file location. A working example would look like this: ansible-playbook -i hosts base.yml.

In the preceding example, there is a single host and a group of hosts defined. The hosts are grouped together into a group by defining a group name enclosed in [ ] inside the inventory file. Two groups are defined in the earlier-mentioned example—ocean and air.

In the event where you do not have any hosts within your inventory file (such as in the case of running a playbook locally only), you can add the following entry to define localhost like this:

[localhost]
localhost ansible_connection=local

The option exists to define variables for hosts and groups inside of your inventory file. More information on how to do this and additional inventory details can be found on the Ansible website at http://docs.ansible.com/intro_inventory.html.

Dynamic inventory

It seemed appropriate, since we are automating functions on a cloud platform, to review yet another great feature of Ansible: the ability to dynamically capture an inventory of hosts/instances. One of the primary principles of cloud is to be able to create instances on demand directly via an API, GUI, CLI, and/or through automation code, like Ansible. That basic principle makes relying on a static inventory file pretty much a useless choice. This is why you will need to rely heavily on dynamic inventory.

A dynamic inventory script can be created to pull information from your cloud at runtime and then, in turn, use that information for the playbooks' execution. Ansible provides the functionality to detect whether an inventory file is set as an executable and, if so, will execute the script to pull current inventory data.

Since creating an Ansible dynamic inventory script is considered more of an advanced activity, I am going to direct you to the Ansible website (http://docs.ansible.com/intro_dynamic_inventory.html), as they have a few working examples of dynamic inventory scripts there. Fortunately, in our case, we will be reviewing an OpenStack cloud built using the openstack-ansible (OSA) repository. OSA comes with a prebuilt dynamic inventory script that will work for your OpenStack cloud. That script is named dynamic_inventory.py and can be found within the playbooks/inventory directory located in the root OSA deployment folder.

First, execute the dynamic inventory script manually to become familiar with the data structure and group names defined (this example assumes that you are in the root OSA deployment directory):

$ cd playbooks/inventory
$ ./dynamic_inventory.py

This will print to the screen an output similar to this:

...
    },
    "compute_all": {
        "hosts": [
            "compute1_rsyslog_container-19482f86",
            "compute1",
            "compute2_rsyslog_container-dee00ea5",
            "compute2"
        ]
    },
    "utility_container": {
        "hosts": [
            "infra1_utility_container-c5589031"
        ]
    },
    "nova_spice_console": {
        "hosts": [
            "infra1_nova_spice_console_container-dd12200f"
        ],
        "children": []
    },
...
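Any of the group names in this output can be used directly in a play's hosts: line. A minimal hedged sketch, reusing the compute_all group shown above (the ping task is purely illustrative):

---
# Illustrative: target a group provided by the dynamic inventory script.
- hosts: compute_all
  tasks:
    - name: Verify connectivity to all compute-related hosts
      ping:

Running it with ansible-playbook -i inventory/dynamic_inventory.py would resolve compute_all at runtime from the cloud itself.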
Next, with this information, you now know that if you wanted to run a playbook against the utility container, all you would have to do is execute the playbook like this:

$ ansible-playbook -i inventory/dynamic_inventory.py playbooks/base.yml -l utility_container

Blocks and Strategies

In this section, we will cover two new features added to version 2.0 of Ansible. Both features add additional functionality to how tasks are grouped or executed within a playbook. So far, they seem to be really nice features when creating more complex automation code. We will now briefly review each of the two new features.

Blocks

The Block feature can simply be explained as a way of logically grouping tasks together, with the option of also applying customized error handling. It gives the option to group a set of tasks together, establishing specific conditions and privileges. An example of applying the block functionality to an earlier example can be found here:

---
# Sample simple playbooks structure/syntax

- name: Install MySQL Playbook
  hosts: dbservers
  tasks:
    - block:
        - apt: name={{item}} state=present
          with_items:
            - libselinux-python
            - mysql
            - mysql-server
            - MySQL-python

        - template: src=cust_my.cnf dest=/etc/my.cnf mode=0755

        - command: chdir=/usr/bin mysql_install_db

        - service: name=mysqld enabled=yes state=restarted

        - command: chdir=/usr/bin mysqladmin -u root password 'passwd'

      when: ansible_distribution == 'Ubuntu'
      remote_user: root
      become: true

Additional details on how to implement Blocks and any associated error handling can be found at http://docs.ansible.com/ansible/playbooks_blocks.html.

Strategies

The Strategy feature allows you to add control over how a play is executed by the hosts. Currently, the default behavior is described as the linear strategy, where all hosts will execute each task before any host moves on to the next task. As of today, the two other strategy types that exist are free and debug. Since Strategies are implemented as a new type of plugin to Ansible, more can easily be added by contributing code. Additional details on Strategies can be found at http://docs.ansible.com/ansible/playbooks_strategies.html. A simple example of implementing a strategy within a playbook is as follows:

---
# Sample simple playbooks structure/syntax

- name: Install MySQL Playbook
  hosts: dbservers
  strategy: free
  tasks:
  ...

The new debug strategy is extremely helpful when you need to step through your playbook/role to find something like a missing variable, determine what variable value to supply, or figure out why it may be sporadically failing. These are just a few of the possible use cases. I definitely encourage you to give this feature a try. Here is the URL to more details on the playbook debugger: http://docs.ansible.com/ansible/playbooks_debugger.html.

Cloud integrations

Since cloud automation is the main and most important theme of this article, it only makes sense that we highlight the many different cloud integrations Ansible 2.0 offers right out of the box. Again, this was one of the reasons why I immediately fell in love with Ansible. Yes, the other automation tools also have hooks into many of the cloud providers, but I found at times they did not work or were not mature enough to leverage. Ansible has gone above and beyond to not fall into that trap. I am not saying Ansible has all the bases covered, but it does feel like most are, and that is what matters most to me.
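As a small taste of those integrations, here is a hedged sketch using two of the OpenStack modules added in Ansible 2.0; the network and subnet names and the CIDR are illustrative placeholders, and the sketch assumes the clouds.yml entry named default from earlier:

---
# Illustrative: create a tenant network and subnet with the
# os_network and os_subnet modules, authenticating via clouds.yml.
- hosts: localhost
  connection: local
  tasks:
    - name: Create a private network
      os_network:
        cloud: default
        name: demo-net
        state: present

    - name: Add a subnet to the network
      os_subnet:
        cloud: default
        network_name: demo-net
        name: demo-subnet
        cidr: 192.168.100.0/24
        state: present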
If you have not checked out the cloud modules available for Ansible, take a moment now and take a look at http://docs.ansible.com/ansible/list_of_cloud_modules.html. From time to time, check back, as I am confident you will be surprised to find that more have been added. I am very proud of my Ansible family for keeping on top of these and making it much easier to write automation code against our clouds.

Specific to OpenStack, a bunch of new modules have been added to the Ansible library as of version 2.0. The extensive list can be found at http://docs.ansible.com/ansible/list_of_cloud_modules.html#openstack. You will note that the biggest change, from the first version of this book to this one, is a focus on using as many of the new OpenStack modules as possible.

Summary

Let's pause here on exploring the dynamic inventory script capabilities and continue to build upon them as we dissect the working examples. We will create our very first OpenStack administration playbook together. We will start off with a fairly simple task of creating users and tenants. This will also include reviewing a few automation considerations you will need to keep in mind when creating automation code for OpenStack. Ready? OK, let's get started!

Resources for Article:

Further resources on this subject:
AIO setup of OpenStack – preparing the infrastructure code environment [article]
RDO Installation [article]
Creating Multiple Users/Tenants [article]

VM, It Is Not What You Think!

Packt
10 Mar 2016
10 min read
In this article by Iwan 'e1' Rahabok, the author of the book VMware Performance and Capacity Management, Second Edition, we will look at why a seemingly simple technology, a virtualized x86 machine, has huge ramifications for the IT industry. In fact, it is turning a lot of things upside down and breaking down silos that have existed for decades in large IT organizations. We will cover the following topics: Why virtualization is not what we think it is Virtualization versus partitioning A comparison between a physical server and a virtual machine (For more resources related to this topic, see here.) Our journey into the virtual world A virtual machine, or simply, VM - who doesn't know what it is? Even a business user who has never seen one knows what it is. It is just a physical server, virtualized. Nothing more. Wise men say that small leaks sink the ship. This is a good way to explain why IT departments that manage physical servers well struggle when the same servers are virtualized. We can also use the Pareto principle (80/20 rule). 80 percent of a VM is identical to a physical server. But it's the 20 percent of difference that hits you. We will highlight some of this 20 percent portion, focusing on areas that impact data center management. The change caused by virtualization is much larger than the changes brought about by previous technologies. In the past two or more decades, we transitioned from mainframes to the client/server-based model and then to the web-based model. These are commonly agreed upon as the main evolutions in IT architecture. However, all of these are just technological changes. They changed the architecture, yes, but they did not change the operation in a fundamental way. Both the client-server and web shifts did not talk about the journey. There was no journey to the client-server based model. However, with virtualization, we talk about the journey. It is a journey because the changes are massive and involve a lot of people. In 2007, Gartner correctly predicted the impact of virtualization (http://www.gartner.com/newsroom/id/505040). More than 8 years later, we are still in the midst of the journey. Proving how pervasive the change is, here is the summary on the article from Gartner: Notice how Gartner talks about a change in culture. Virtualization has a cultural impact too. In fact, if your virtualization journey is not fast enough, look at your organization's structure and culture. Have you broken the silos? Do you empower your people to take risks and do things that have never been done before? Are you willing to flatten the organizational chart? The silos that have served you well are likely your number one barrier to a hybrid cloud. So why exactly is virtualization causing such a fundamental shift? To understand this, we need to go back to the basics, which is exactly what virtualization is. It's pretty common that chief information officers (CIOs) have a misconception about what it is. Take a look at the following comments. Have you seen them in your organization? VM is just a virtualized physical machine. Even VMware says that the guest OS is not aware it's virtualized and that it does not run differently. It is still about monitoring CPU, RAM, disk, network, and other resources. No difference. It is a technological change. Our management process does not have to change. All of these VMs must still feed into our main enterprise IT management system. This is how we have run our business for decades, and it works. 
If only life were that simple, we would all be 100-percent virtualized and have no headaches! Virtualization has been around for years, and yet, most organizations have not mastered it. The proof of mastering it is when you have completed the journey and have reached the highest level of the virtualization maturity model. Not all virtualizations are equal There are plenty of misconceptions about the topic of virtualization, especially among IT folks who are not familiar with virtualization. CIOs who have not felt the strategic impact of virtualization (be it a good or bad experience) tend to carry these misconceptions. Although virtualization looks similar to a physical system from the outside, it is completely re-architected under the hood. So, let's take a look at the first misconception: what exactly is virtualization? Because it is an industry trend, virtualization is often generalized to include other technologies that are not virtualized. This is a typical strategy by IT vendors who have similar technologies. A popular technology often branded under virtualization is hardware partitioning; since it is parked under the umbrella of virtualization, both should be managed in the same way. Since both are actually different, customers who try to manage both with a single piece of management software struggle to do well. Partitioning and virtualization are two different architectures in computer engineering, resulting in there being major differences between their functionalities. They are shown in the following screenshot: Virtualization versus partitioning With partitioning, there is no hypervisor that virtualizes the underlying hardware. There is no software layer separating the VM and the physical motherboard. There is, in fact, no VM. This is why some technical manuals for partitioning technology do not even use the term VM. The manuals use the term domain, partition, or container instead. There are two variants of partitioning technology, hardware-level and OS-level partitioning, which are covered in the following bullet points: In hardware-level partitioning, each partition runs directly on the hardware. It is not virtualized. This is why it is more scalable and has less of a performance hit. Because it is not virtualized, it has to have an awareness of the underlying hardware. As a result, it is not fully portable. You cannot move the partition from one hardware model to another. The hardware has to be built for a purpose to support that specific version of the partition. The partitioned OS still needs all the hardware drivers and will not work on other hardware if the compatibility matrix does not match. As a result, even the version of the OS matters, as it is just like a physical server. In OS-level partitioning, there is a parent OS that runs directly on the server motherboard. This OS then creates an "OS partition", where other OSes can run. We use double quotes as it is not exactly the full OS that runs inside that partition. The OS has to be modified and qualified to be able to run as a zone or container. Because of this, application compatibility is affected. This is different in a VM, where there is no application compatibility issue as the hypervisor is transparent to the guest OS. Hardware partitioning We covered the difference between virtualization and partitioning from an engineering point of view. However, does it translate into different data center architectures and operations? 
We will focus on hardware partitioning since there are fundamental differences between hardware partitioning and software partitioning. The use case for both is also different. Software partitioning is typically used in native cloud applications. With that, let's do a comparison between hardware partitioning and virtualization. We will start with availability. With virtualization, all VMs are protected by vSphere High Availability (vSphere HA), which provides 100 percent protection and that too without VM awareness. Nothing needs to be done at the VM layer. No shared or quorum disk and no heartbeat-network VM is required to protect a VM with basic HA. With hardware partitioning, the protection has to be configured manually, one by one for each logical partition (LPAR) or logical domain (LDOM). The underlying platform does not provide that. With virtualization, you can even go beyond five nines (99.999 percent) and move to 100 percent with vSphere Fault Tolerance. This is not possible in the partitioning approach as there is no hypervisor that replays CPU instructions. Also, because it is virtualized and transparent to the VM, you can turn the Fault Tolerance capability on and off on demand. Fault Tolerance is completely defined in the software. Another area of difference between partitioning and virtualization is disaster recovery (DR). With partitioning technology, the DR site requires another instance to protect the production instance. It is a different instance, with its own OS image, hostname, and IP address. Yes, we can perform a Storage Area Network (SAN) boot, but that means another Logical Unit Number (LUN) is required to manage, zone, replicate, and so on. Disaster recovery is not scalable to thousands of servers. To make it scalable, it has to be simpler. Compared to partitioning, virtualization takes a different approach. The entire VM fits inside a folder; it becomes like a document and we migrate the entire folder as if the folder is one object. This is what vSphere Replication or Site Recovery Manager do. They perform a replication per VM; there is no need to configure a SAN boot. The entire DR exercise, which can cover thousands of virtual servers, is completely automated and has audit logs automatically generated. Many large enterprises have automated their DR with virtualization. There is probably no company that has automated DR for their entire LPAR, LDOM, or container. In the previous paragraph, we're not implying LUN-based or hardware-based replication as inferior solutions. We're merely driving the point that virtualization enables you to do things differently. We're also not saying that hardware partitioning is an inferior technology. Every technology has its advantages and disadvantages and addresses different use cases. Before joining VMware, the author was a Sun Microsystems sales engineer for five years, so he is aware of the benefits of UNIX partitioning. This article is merely trying to dispel the misunderstanding that hardware partitioning equals virtualization. OS partitioning We've covered the differences between hardware partitioning and virtualization. Let's switch gear to software partitioning. In 2016, the adoption of Linux containers will continue its rapid rise. You can actually use both containers and virtualization, and they complement each other in some use cases. There are two main approaches to deploying containers: Run them directly on bare metal Run them inside a virtual machine As both technologies evolve, the gap gets wider. 
As a result, managing a software partition is different from managing a VM. Securing a container is different to securing a VM. Be careful when opting for a management solution that claims to manage both. You will probably end up with the most common denominator. This is one reason why VMware is working on vSphere Integrated Containers and the Photon platform. Now that's a separate topic by itself! Summary We hope you enjoyed the comparison and found it useful. We covered, to a great extent, the impact caused by virtualization and the changes it introduces. We started by clarifying that virtualization is a different technology compared to partitioning. We then explained that once a physical server is converted to a virtual machine, it takes on a different form and has radically different properties. Resources for Article: Further resources on this subject: Deploying New Hosts with vCenter [article] VMware vCenter Operations Manager Essentials - Introduction to vCenter Operations Manager [article] VMware vRealize Operations Performance and Capacity Management [article]

AWS Global Infrastructure

Packt
06 Jul 2015
5 min read
In this article by Uchit Vyas, who is the author of the Mastering AWS Development book, we will see how to use AWS services in detail. It is important to have a choice of placing applications as close as possible to your users or customers, in order to ensure the lowest possible latency and best user experience while deploying them. AWS offers a choice of nine regions located all over the world (for example, East Coast of the United States, West Coast of the United States, Europe, Tokyo, Singapore, Sydney, and Brazil), 26 redundant Availability Zones, and 53 Amazon CloudFront points of presence. (For more resources related to this topic, see here.) It is very crucial and important to have the option to put applications as close as possible to your customers and end users by ensuring the best possible lowest latency and user-expected features and experience, when you are creating and deploying apps for performance. For this, AWS provides worldwide means to the regions located all over the world. To be specific via name and location, they are as follows: US East (Northern Virginia) region US West (Oregon) region US West (Northern California) region EU (Ireland) Region Asia Pacific (Singapore) region Asia Pacific (Sydney) region Asia Pacific (Tokyo) region South America (Sao Paulo) region US GovCloud In addition to regions, AWS has 25 redundant Availability Zones and 51 Amazon CloudFront points of presence. Apart from these infrastructure-level highlights, they have plenty of managed services that can be the cream of AWS candy bar! The managed services bucket has the following listed services: Security: For every organization, security in each and every aspect is the vital element. For that, AWS has several remarkable security features that distinguishes AWS from other Cloud providers as follows : Certifications and accreditations Identity and Access Management Right now, I am just underlining the very important security features. Global infrastructure: AWS provides a fully-functional, flexible technology infrastructure platform worldwide, with managed services over the globe with certain characteristics, for example: Multiple global locations for deployment Low-latency CDN service Reliable, low-latency DNS service Compute: AWS offers a huge range of various cloud-based core computing services (including variety of compute instances that can be auto scaled to justify the needs of your users and application), a managed elastic load balancing service, and more of fully managed desktop resources on the pathway of cloud. Some of the common characteristics of computer services include the following: Broad choice of resizable compute instances Flexible pricing opportunities Great discounts for always on compute resources Lower hourly rates for elastic workloads Wide-ranging networking configuration selections A widespread choice of operating systems Virtual desktops Save further as you grow with tiered pricing model Storage: AWS offers low cost with high durability and availability with their storage services. With pay-as-you-go pricing model with no commitment, provides more flexibility and agility in services and processes for storage with a highly secured environment. AWS provides storage solutions and services for backup, archive, disaster recovery, and so on. They also support block, file, and object kind of storages with a highly available and flexible infrastructure. 
A few major characteristics for storage are as follows: Cost-effective, high-scale storage varieties Data protection and data management Storage gateway Choice of instance storage options Content delivery and networking: AWS offers a wide set of networking services that enables you to create a logical isolated network that the architect defines and to create a private network connection to the AWS infrastructure, with fault tolerant, scalable, and highly available DNS service. It also provides delivery services for content to your end users, by very low latency and high data transfer speed with AWS CDN service. A few major characteristics for content delivery and networking are as follows: Application and media files delivery Software and large file distribution Private content Databases: AWS offers fully managed, distributed relational and NoSQL type of database services. Moreover, database services are capable of in-memory caching, sharding, and scaling with/without data warehouse solutions. A few major characteristics for databases are as follows: RDS DynamoDB Redshift ElastiCache Application services: AWS provides a variety of managed application services with lower cost such as application streaming and queuing, transcoding, push notification, searching, and so on. A few major characteristics for databases are as follows: AppStream CloudSearch Elastic transcoder SWF, SES, SNS, SQS Deployment and management: AWS offers management of credentials to explore AWS services such as monitor services, application services, and updating stacks of AWS resources. They also have deployment and security services alongside with AWS API activity. A few major characteristics for deployment and management services are as follows: IAM CloudWatch Elastic Beanstalk CloudFormation Data Pipeline OpsWorks CloudHSM Cloud Trail Summary There are a few more additional important services from AWS, such as support, integration with existing infrastructure, Big Data, and ecosystem, which puts it on the top of other infrastructure providers. As a cloud architect, it is necessary to learn about cloud service offerings and their all-important functionalities. Resources for Article: Further resources on this subject: Amazon DynamoDB - Modelling relationships, Error handling [article] Managing Microsoft Cloud [article] Amazon Web Services [article]

Deploying Highly Available OpenStack

Packt
21 Sep 2015
17 min read
In this article by Arthur Berezin, the author of the book OpenStack Configuration Cookbook, we will cover the following topics:

Installing Pacemaker
Installing HAProxy
Configuring Galera cluster for MariaDB
Installing RabbitMQ with mirrored queues
Configuring highly available OpenStack services

(For more resources related to this topic, see here.)

Many organizations choose OpenStack for its distributed architecture and ability to deliver the Infrastructure as a Service (IaaS) platform for mission-critical applications. In such environments, it is crucial to configure all OpenStack services in a highly available configuration to provide as much uptime as possible for the control plane services of the cloud. Deploying a highly available control plane for OpenStack can be achieved in various configurations. Each of these configurations would serve a certain set of demands and introduce a growing set of prerequisites.

Pacemaker is used to create active-active clusters to guarantee services' resilience to possible faults. Pacemaker is also used to create a virtual IP address for each of the services. HAProxy serves as a load balancer for incoming calls to the services' APIs. This article discusses neither high availability of virtual machine instances nor the Nova-Compute service of the hypervisor.

Most of the OpenStack services are stateless; they store persistent data in a SQL database, which is potentially a single point of failure we should make highly available. In this article, we will deploy a highly available database using MariaDB and Galera, which implements multimaster replication. To ensure availability of the message bus, we will configure RabbitMQ with mirrored queues. This article discusses configuring each service separately on a three-controller layout that runs OpenStack controller services, including Neutron, the database, and the RabbitMQ message bus. All services can be configured on several controller nodes, or each service could be implemented on its own separate set of hosts.

Installing Pacemaker

All OpenStack services consist of system Linux services. The first step of ensuring services' availability is to configure Pacemaker clusters for each service, so Pacemaker monitors the services. In case of failure, Pacemaker restarts the failed service. In addition, we will use Pacemaker to create a virtual IP address for each of OpenStack's services to ensure services are accessible using the same IP address when failures occur and the actual service has relocated to another host. In this section, we will install Pacemaker and prepare it to configure highly available OpenStack services.

Getting ready

To ensure maximum availability, we will install and configure three hosts to serve as controller nodes. Prepare three controller hosts with identical hardware and network layout. We will base our configuration for most of the OpenStack services on the configuration used in a single-controller layout, and we will deploy Neutron network services on all three controller nodes.
How to do it…

Run the following steps on the three highly available controller nodes:

Install the Pacemaker packages:

[root@controller1 ~]# yum install -y pcs pacemaker corosync fence-agents-all resource-agents

Enable and start the pcsd service:

[root@controller1 ~]# systemctl enable pcsd
[root@controller1 ~]# systemctl start pcsd

Set a password for the hacluster user; the password should be identical on all the nodes:

[root@controller1 ~]# echo 'password' | passwd --stdin hacluster

We will use the hacluster password throughout the HAProxy configuration. Authenticate all controller nodes, using the -u and -p options to give the user and password on the command line, and provide the same password you have set in the previous step:

[root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force

At this point, you may run pcs commands from a single controller node instead of running commands on each node separately.

There's more...

You may find the complete Pacemaker documentation, which includes installation documentation, a complete configuration reference, and examples, on the Cluster Labs website at http://clusterlabs.org/doc/.
Run the following steps on all controller nodes: Install HAProxy package: # yum install -y haproxy Enable nonlocal binding Kernel parameter: # echo net.ipv4.ip_nonlocal_bind=1 >> /etc/sysctl.d/haproxy.conf # echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind Configure HAProxy load balancer settings for the GaleraDB, RabbitMQ, and Keystone service as shown in the following diagram: Edit /etc/haproxy/haproxy.cfg with the following configuration: global    daemon defaults    mode tcp    maxconn 10000    timeout connect 2s    timeout client 10s    timeout server 10s   frontend vip-db    bind 192.168.16.200:3306    timeout client 90s    default_backend db-vms-galera   backend db-vms-galera    option httpchk    stick-table type ip size 2    stick on dst    timeout server 90s    server rhos5-db1 192.168.16.58:3306 check inter 1s port 9200    server rhos5-db2 192.168.16.59:3306 check inter 1s port 9200    server rhos5-db3 192.168.16.60:3306 check inter 1s port 9200   frontend vip-rabbitmq    bind 192.168.16.213:5672    timeout client 900m    default_backend rabbitmq-vms   backend rabbitmq-vms    balance roundrobin    timeout server 900m    server rhos5-rabbitmq1 192.168.16.61:5672 check inter 1s    server rhos5-rabbitmq2 192.168.16.62:5672 check inter 1s    server rhos5-rabbitmq3 192.168.16.63:5672 check inter 1s   frontend vip-keystone-admin    bind 192.168.16.202:35357    default_backend keystone-admin-vms backend keystone-admin-vms    balance roundrobin    server rhos5-keystone1 192.168.16.64:35357 check inter 1s    server rhos5-keystone2 192.168.16.65:35357 check inter 1s    server rhos5-keystone3 192.168.16.66:35357 check inter 1s   frontend vip-keystone-public    bind 192.168.16.202:5000    default_backend keystone-public-vms backend keystone-public-vms    balance roundrobin    server rhos5-keystone1 192.168.16.64:5000 check inter 1s    server rhos5-keystone2 192.168.16.65:5000 check inter 1s    server rhos5-keystone3 192.168.16.66:5000 check inter 1s This configuration file is an example for configuring HAProxy with load balancer for the MariaDB, RabbitMQ, and Keystone service. We need to authenticate on all nodes before we are allowed to change the configuration to configure all nodes from one point. Use the previously configured hacluster user and password to do this. # pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force Create a Pacemaker cluster for HAProxy service as follows: Note that you can run pcs commands now from a single controller node. # pcs cluster setup --name ha-controller controller1 controller2 controller3 # pcs cluster enable --all # pcs cluster start --all Finally, using pcs resource create command, create a cloned systemd resource that will run a highly available active-active HAProxy service on all controller hosts: pcs resource create lb-haproxy systemd:haproxy op monitor start-delay=10s --clone Create the virtual IP address for each of the services: # pcs resource create vip-db IPaddr2 ip=192.168.16.200 # pcs resource create vip-rabbitmq IPaddr2 ip=192.168.16.213 # pcs resource create vip-keystone IPaddr2 ip=192.168.16.202 You may use pcs status command to verify whether all resources are successfully running: # pcs status Configuring Galera cluster for MariaDB Galera is a multimaster cluster for MariaDB, which is based on synchronous replication between all cluster nodes. Effectively, Galera treats a cluster of MariaDB nodes as one single master node that reads and writes to all nodes. 
Galera replication happens at transaction commit time, by broadcasting transaction write set to the cluster for application. Client connects directly to the DBMS and experiences close to the native DBMS behavior. wsrep API (write set replication API) defines the interface between Galera replication and the DBMS: Getting ready In this section, we will install Galera cluster packages for MariaDB on our three controller nodes, then we will configure Pacemaker to monitor all Galera services. Pacemaker can be stopped on all cluster nodes, as shown, if it is running from previous steps: # pcs cluster stop --all How to do it.. Perform the following steps on all controller nodes: Install galera packages for MariaDB: # yum install -y mariadb-galera-server xinetd resource-agents Edit /etc/sysconfig/clustercheck and add the following lines: MYSQL_USERNAME="clustercheck" MYSQL_PASSWORD="password" MYSQL_HOST="localhost" Edit Galera configuration file /etc/my.cnf.d/galera.cnf with the following lines: Make sure to enter host's IP address at the bind-address parameter. [mysqld] skip-name-resolve=1 binlog_format=ROW default-storage-engine=innodb innodb_autoinc_lock_mode=2 innodb_locks_unsafe_for_binlog=1 query_cache_size=0 query_cache_type=0 bind-address=[host-IP-address] wsrep_provider=/usr/lib64/galera/libgalera_smm.so wsrep_cluster_name="galera_cluster" wsrep_slave_threads=1 wsrep_certify_nonPK=1 wsrep_max_ws_rows=131072 wsrep_max_ws_size=1073741824 wsrep_debug=0 wsrep_convert_LOCK_to_trx=0 wsrep_retry_autocommit=1 wsrep_auto_increment_control=1 wsrep_drupal_282555_workaround=0 wsrep_causal_reads=0 wsrep_notify_cmd= wsrep_sst_method=rsync You can learn more on each of the Galera's default options on the documentation page at http://galeracluster.com/documentation-webpages/configuration.html. Add the following lines to the xinetd configuration file /etc/xinetd.d/galera-monitor: service galera-monitor {        port           = 9200        disable         = no        socket_type     = stream        protocol       = tcp        wait           = no        user           = root        group           = root        groups         = yes        server         = /usr/bin/clustercheck        type           = UNLISTED        per_source     = UNLIMITED        log_on_success =        log_on_failure = HOST        flags           = REUSE } Start and enable the xinetd service: # systemctl enable xinetd # systemctl start xinetd # systemctl enable pcsd # systemctl start pcsd Authenticate on all nodes. Use the previously configured hacluster user and password to do this as follows: # pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force Now commands can be run from a single controller node. Create a Pacemaker cluster for Galera service: # pcs cluster setup --name controller-db controller1 controller2 controller3 # pcs cluster enable --all # pcs cluster start --all Add the Galera service resource to the Galera Pacemaker cluster: # pcs resource create galera galera enable_creation=true wsrep_cluster_address="gcomm://controller1,controller2,controll er3" meta master-max=3 ordered=true op promote timeout=300s on- fail=block --master Create a user for CLusterCheck xinetd service: mysql -e "CREATE USER 'clustercheck'@'localhost' IDENTIFIED BY 'password';" See also You can find the complete Galera documentation, which includes installation documentation and complete configuration reference and examples in Galera cluster website at http://galeracluster.com/documentation-webpages/. 
Installing RabbitMQ with mirrored queues

RabbitMQ is used as a message bus for services to inter-communicate. The queues are located on a single node, which makes the RabbitMQ service a single point of failure. To avoid RabbitMQ being a single point of failure, we will configure RabbitMQ to use mirrored queues across multiple nodes. Each mirrored queue consists of one master and one or more slaves, with the oldest slave being promoted to the new master if the old master disappears for any reason. Messages published to the queue are replicated to all slaves.

Getting ready

In this section, we will install RabbitMQ packages on our three controller nodes and configure RabbitMQ to mirror its queues across all controller nodes; then we will configure Pacemaker to monitor all RabbitMQ services.

How to do it…

Perform the following steps on all controller nodes:

Install RabbitMQ packages on all controller nodes:

# yum -y install rabbitmq-server

Start and then stop the rabbitmq-server service:

# systemctl start rabbitmq-server
# systemctl stop rabbitmq-server

RabbitMQ cluster nodes use a cookie to determine whether they are allowed to communicate with each other; for nodes to be able to communicate, they must have the same cookie. Copy erlang.cookie from controller1 to controller2 and controller3:

[root@controller1 ~]# scp /var/lib/rabbitmq/.erlang.cookie root@controller2:/var/lib/rabbitmq/
[root@controller1 ~]# scp /var/lib/rabbitmq/.erlang.cookie root@controller3:/var/lib/rabbitmq/

Start and enable Pacemaker on all nodes:

# systemctl enable pcsd
# systemctl start pcsd

Since we already authenticated all nodes of the cluster in the previous section, we can now run the following commands on controller1. Create a new Pacemaker cluster for the RabbitMQ service as follows:

[root@controller1 ~]# pcs cluster setup --name rabbitmq controller1 controller2 controller3
[root@controller1 ~]# pcs cluster enable --all
[root@controller1 ~]# pcs cluster start --all

To the Pacemaker cluster, add a systemd resource for the RabbitMQ service:

[root@controller1 ~]# pcs resource create rabbitmq-server systemd:rabbitmq-server op monitor start-delay=20s --clone

Since all RabbitMQ nodes must join the cluster one at a time, stop RabbitMQ on controller2 and controller3:

[root@controller2 ~]# rabbitmqctl stop_app
[root@controller3 ~]# rabbitmqctl stop_app

Join controller2 to the cluster and start RabbitMQ on it:

[root@controller2 ~]# rabbitmqctl join_cluster rabbit@controller1
[root@controller2 ~]# rabbitmqctl start_app

Now join controller3 to the cluster as well and start RabbitMQ on it:

[root@controller3 ~]# rabbitmqctl join_cluster rabbit@controller1
[root@controller3 ~]# rabbitmqctl start_app

At this point, the cluster should be configured, and we need to set RabbitMQ's HA policy to mirror the queues to all RabbitMQ cluster nodes as follows:

[root@controller1 ~]# rabbitmqctl set_policy HA '^(?!amq.).*' '{"ha-mode": "all"}'

There's more...

The RabbitMQ cluster should now be configured with all the queues cloned to all controller nodes.
To verify the cluster's state, you can use the rabbitmqctl cluster_status and rabbitmqctl list_policies commands on each of the controller nodes as follows:

[root@controller1 ~]# rabbitmqctl cluster_status
[root@controller1 ~]# rabbitmqctl list_policies

To verify Pacemaker's cluster status, you may use the pcs status command as follows:

[root@controller1 ~]# pcs status

See also

For complete documentation on how RabbitMQ implements the mirrored queues feature, as well as additional configuration options, refer to the project's documentation pages at https://www.rabbitmq.com/clustering.html and https://www.rabbitmq.com/ha.html.

Configuring Highly Available OpenStack Services

Most OpenStack services are stateless web services that keep persistent data in a SQL database and use a message bus for inter-service communication. We will use Pacemaker and HAProxy to run OpenStack services in an active-active, highly available configuration, so that traffic for each of the services is load balanced across all controller nodes and the cloud can easily be scaled out to more controller nodes if needed. We will configure Pacemaker clusters for each of the services that will run on all controller nodes. We will also use Pacemaker to create a virtual IP address for each of OpenStack's services, so rather than addressing a specific node, services will be addressed by their corresponding virtual IP address. We will use HAProxy to load balance incoming requests to the services across all controller nodes (an illustrative sketch of such an HAProxy entry for Keystone is shown at the end of this recipe).

Getting ready

In this section, we will use the virtual IP addresses we created for the services with Pacemaker and HAProxy in previous sections. We will also configure the OpenStack services to use the highly available Galera-clustered database and RabbitMQ with mirrored queues. The following is an example for the Keystone service. Please refer to the Packt website URL here for the complete configuration of all OpenStack services.

How to do it...

Perform the following steps on all controller nodes:

Install the Keystone service on all controller nodes:

# yum install -y openstack-keystone openstack-utils openstack-selinux

Generate a Keystone service token on controller1 and copy it to controller2 and controller3 using scp:

[root@controller1 ~]# export SERVICE_TOKEN=$(openssl rand -hex 10)
[root@controller1 ~]# echo $SERVICE_TOKEN > ~/keystone_admin_token
[root@controller1 ~]# scp ~/keystone_admin_token root@controller2:~/keystone_admin_token
[root@controller1 ~]# scp ~/keystone_admin_token root@controller3:~/keystone_admin_token

Export the Keystone service token on controller2 and controller3 as well:

[root@controller2 ~]# export SERVICE_TOKEN=$(cat ~/keystone_admin_token)
[root@controller3 ~]# export SERVICE_TOKEN=$(cat ~/keystone_admin_token)

Note: Perform the following commands on all controller nodes.
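The openstack-config commands used in the next step simply write key/value pairs into sections of the target INI file. Purely as an illustrative sketch (not part of the original text, and with a placeholder where the real token value goes), the first two commands of the next step amount to the following fragment of /etc/keystone/keystone.conf:

# /etc/keystone/keystone.conf (illustrative fragment)
[DEFAULT]
# the value written here is the token stored in $SERVICE_TOKEN
admin_token = <value of SERVICE_TOKEN>
rabbit_host = vip-rabbitmq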
Configure the Keystone service on all controller nodes to use vip-rabbitmq:

# openstack-config --set /etc/keystone/keystone.conf DEFAULT admin_token $SERVICE_TOKEN
# openstack-config --set /etc/keystone/keystone.conf DEFAULT rabbit_host vip-rabbitmq

Configure the Keystone service endpoints to point to the Keystone virtual IP:

# openstack-config --set /etc/keystone/keystone.conf DEFAULT admin_endpoint 'http://vip-keystone:%(admin_port)s/'
# openstack-config --set /etc/keystone/keystone.conf DEFAULT public_endpoint 'http://vip-keystone:%(public_port)s/'

Configure Keystone to connect to the SQL database using the Galera cluster virtual IP:

# openstack-config --set /etc/keystone/keystone.conf database connection mysql://keystone:keystonetest@vip-mysql/keystone
# openstack-config --set /etc/keystone/keystone.conf database max_retries -1

On controller1, set up the Keystone PKI infrastructure and sync the database:

[root@controller1 ~]# keystone-manage pki_setup --keystone-user keystone --keystone-group keystone
[root@controller1 ~]# chown -R keystone:keystone /var/log/keystone /etc/keystone/ssl/
[root@controller1 ~]# su keystone -s /bin/sh -c "keystone-manage db_sync"

Using rsync, copy the Keystone SSL certificates from controller1 to controller2 and controller3:

[root@controller1 ~]# rsync -av /etc/keystone/ssl/ controller2:/etc/keystone/ssl/
[root@controller1 ~]# rsync -av /etc/keystone/ssl/ controller3:/etc/keystone/ssl/

Make sure that the keystone user owns the newly copied files on controller2 and controller3:

[root@controller2 ~]# chown -R keystone:keystone /etc/keystone/ssl/
[root@controller3 ~]# chown -R keystone:keystone /etc/keystone/ssl/

Create a systemd resource for the Keystone service, using --clone to ensure it runs in an active-active configuration:

[root@controller1 ~]# pcs resource create keystone systemd:openstack-keystone op monitor start-delay=10s --clone

Create the service, endpoint, and admin user account for Keystone using the Keystone VIP as given:

[root@controller1 ~]# export SERVICE_ENDPOINT="http://vip-keystone:35357/v2.0"
[root@controller1 ~]# keystone service-create --name=keystone --type=identity --description="Keystone Identity Service"
[root@controller1 ~]# keystone endpoint-create --service keystone --publicurl 'http://vip-keystone:5000/v2.0' --adminurl 'http://vip-keystone:35357/v2.0' --internalurl 'http://vip-keystone:5000/v2.0'
[root@controller1 ~]# keystone user-create --name admin --pass keystonetest
[root@controller1 ~]# keystone role-create --name admin
[root@controller1 ~]# keystone tenant-create --name admin
[root@controller1 ~]# keystone user-role-add --user admin --role admin --tenant admin

On all controller nodes, create a keystonerc_admin file with the OpenStack admin credentials, using the Keystone VIP:

cat > ~/keystonerc_admin << EOF
export OS_USERNAME=admin
export OS_TENANT_NAME=admin
export OS_PASSWORD=keystonetest
export OS_AUTH_URL=http://vip-keystone:35357/v2.0/
export PS1='[\u@\h \W(keystone_admin)]\$ '
EOF

Source the keystonerc_admin credentials file to be able to run authenticated OpenStack commands:

[root@controller1 ~]# source ~/keystonerc_admin

At this point, you should be able to execute Keystone commands and create the Services tenant:

[root@controller1 ~]# keystone tenant-create --name services --description "Services Tenant"

Summary

In this article, we have covered the installation of Pacemaker and HAProxy, the configuration of a Galera cluster for MariaDB, the installation of RabbitMQ with mirrored queues, and the configuration of highly available OpenStack services.
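As mentioned in the introduction to this recipe, HAProxy load balances incoming requests across the controller nodes; its setup belongs to an earlier part of the book and is not reproduced in this excerpt. Purely as an illustrative sketch (the listen section names and health-check parameters are assumptions, not the author's configuration, and it assumes Keystone listens on the node addresses while HAProxy binds the vip-keystone virtual IP), a Keystone entry in /etc/haproxy/haproxy.cfg could look like this:

# Keystone public API, load balanced across the three controllers (illustrative sketch)
listen keystone-public
    bind vip-keystone:5000
    balance roundrobin
    option httpchk
    server controller1 controller1:5000 check inter 2000 rise 2 fall 5
    server controller2 controller2:5000 check inter 2000 rise 2 fall 5
    server controller3 controller3:5000 check inter 2000 rise 2 fall 5

# Keystone admin API on the same virtual IP (illustrative sketch)
listen keystone-admin
    bind vip-keystone:35357
    balance roundrobin
    option httpchk
    server controller1 controller1:35357 check inter 2000 rise 2 fall 5
    server controller2 controller2:35357 check inter 2000 rise 2 fall 5
    server controller3 controller3:35357 check inter 2000 rise 2 fall 5

Each OpenStack API would get a similar pair of sections bound to its own virtual IP.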
Resources for Article:

Further resources on this subject:
Using the OpenStack Dashboard [article]
Installing OpenStack Swift [article]
Architecture and Component Overview [article]

Network Virtualization and vSphere

Packt
21 Oct 2013
(For more resources related to this topic, see here.)

Network virtualization is what makes vCloud Director such an awesome tool. When we talk about isolated networks, we are talking about vCloud Director making use of different methods of network layer 3 encapsulation (OSI/ISO model). Basically, it's the same concept that was introduced with VLANs: VLANs split up the communication in a network into different, totally isolated communication streams. vCloud makes use of these isolated networks to create networks in Organizations and vApps. vCloud Director has three different network items, listed as follows:

External Network: This is a network that exists outside vCloud, for example, a production network. It is basically a port group in vSphere that is used in vCloud to connect to the outside world. An External Network can be connected to multiple Organization Networks. External Networks are not virtualized and are based on existing port groups on a vSwitch or a Distributed Switch (also called a vNetwork Distributed Switch or vNDS).

Organization Network: This is a network that exists only inside one organization. You can have multiple Organization Networks in an organization. Organization Networks come in three different types:

Isolated: An isolated Organization Network exists only in this organization and is not connected to an External Network; however, it can be connected to vApp Networks or VMs. This network type uses network virtualization and its own network settings.

Routed Network (Edge Gateway): An Organization Network that connects to an existing Edge device. An Edge Gateway allows you to define firewall and NAT rules, DHCP services, static routes, VPN connections, and the load balancing functionality. Routed gateways connect External Networks to vApp Networks and/or VMs. This network type uses network virtualization and its own network settings.

Directly connected: This Organization Network is an extension of an External Network into the organization. It directly connects External Networks to vApp Networks or VMs. These networks do NOT use network virtualization, and they make use of the network settings of an External Network.

vApp Network: This is a virtualized network that only exists inside a vApp. You can have multiple vApp Networks inside one vApp. A vApp Network can connect to VMs and to Organization Networks, and it has its own network settings. When connecting a vApp Network to an Organization Network, you can create a router between the vApp and the Organization Network, which lets you define DHCP, firewall, NAT rules, and static routing.

To create isolated networks, vCloud Director uses Network Pools. Network Pools are a collection of VLANs, port groups, or isolated networks that use layer 2 in layer 3 encapsulation. The contents of these pools can be used by Organization and vApp Networks for network virtualization.

Network Pools

There are four kinds of Network Pools that can be created:

Virtual eXtensible LANs (VXLAN): VXLAN networks are layer 2 networks that are encapsulated in layer 3 packets. VMware calls this Software Defined Networking (SDN). VXLANs are automatically created by vCloud Director (vCD); however, they don't work out of the box and require some extra configuration in vCloud Network and Security (refer to the Making VXLANs work recipe).

Network isolation-backed: These follow basically the same concept as VXLANs; however, they work out of the box and use MAC-in-MAC encapsulation.
The difference is that VXLANs can transcend routers, whereas Network isolation-backed networks can't (refer to the Creating isolated networks without 1,000 VXLANs recipe).

vSphere port group-backed: vCD uses pre-created port groups to build the vApp or Organization Networks. You need to pre-provision one port group for every vApp/Organization Network you would like to use.

VLAN-backed: vCD uses a pool of VLAN numbers to automatically provision port groups on demand; however, you still need to configure the VLAN trunking. You will need to reserve one VLAN for every vApp/Organization Network you would like to use.

VXLANs and Network isolation-backed networks solve the problem of pre-provisioning and reserving a multitude of VLANs, which makes them extremely important. However, using port group or VLAN Network Pools can have additional benefits that we will explore later. So let's get started! Now let's have a closer look at what one can do with networks in vCloud, but before we dive into the recipes, let's make sure we are all on the same page.

Usage of different Network types

vCloud Director has three different network items. An External Network is basically a port group in vSphere that is imported into vCloud. An Organization Network is an isolated network that exists only in an organization. The same is true for vApp Networks, which exist only in vApps. In each example, you will also see a diagram of the specific network.

Isolated vApp Network

Isolated vApp Networks exist only inside vApps. They are useful if one needs to test how VMs behave in a network, or to test using an IP range that is already in use (for example, production). The downside is that they are isolated, meaning that it is hard to get information or software in and out. Have a look at the Forwarding an RDP (or SSH) session into an isolated vApp and Accessing a fully isolated vApp or Organization Network recipes in this article to find some answers to this problem.

VMs directly connected to an External Network

VMs inside a vApp are connected to a Direct Organization Network that is again directly connected to an External Network, meaning that they will use the IPs from the External Network pool. Typically, these VMs are used for production, making it possible for customers to choose vCloud for fast provisioning of preconfigured templates. As vCloud manages the IPs for a given IP range (Static Pool), it can be quite easy to fast-provision multiple VMs this way.

vApp Network connected via a vApp router to an External Network

VMs are connected to a vApp Network that has a vApp router defined as its gateway. The gateway connects to a Direct Organization Network and will automatically be given an IP from the External Network pool. The IPs of the VMs inside the vApp will be managed by the vApp Static Pool. These configurations come in handy to reduce the amount of physical networking that has to be provisioned. The vApp router can act as a router with defined firewall rules; it can do S-NAT and D-NAT as well as define static routing and DHCP services. So instead of using a physical VLAN or subnet, one can hide away applications this way. As an added benefit, these applications can be used as templates for fast deployment.

VMs directly connected to an isolated Organization Network

VMs are connected directly to an isolated Organization Network. Connecting VMs directly to an isolated Organization Network normally only makes sense if there is more than one vApp/VM connected to the same Organization Network.
These network constructs come in handy when we want to repeatedly test complex applications that require certain infrastructure services such as Active Directory, DHCP, DNS, database, and Exchange servers. Instead of deploying the needed infrastructure inside the testing vApp, we create a new vApp that contains only the infrastructure. By connecting the test vApp to the infrastructure vApp via an isolated Organization Network, the test vApp can now use the infrastructure. This makes it possible to reuse these infrastructure services not only for one vApp but for many vApps, reducing the amount of resources needed for testing. By using vApp sharing options, you can even hide away the infrastructure vApp from your users.

vApp connected via a vApp router to an isolated Organization Network

VMs are connected to a vApp Network that has a vApp router as its gateway. The vApp router gets its IP automatically from the Organization Network pool, and the VMs get their IPs from the vApp Network pool. Basically, it is a combination of the previous network examples: VMs directly connected to an isolated Organization Network, and a vApp Network connected via a vApp router to an External Network. A test vApp or an infrastructure vApp can be packaged this way and be made ready for fast deployment.

VMs connected directly to an Edge device

VMs are directly connected to the Edge Organization Network and get their IPs from the Organization Network pool. Their gateway is the Edge device that connects them to the External Networks through the Edge firewall. A typical example of this is the use of the Edge load balancing feature in order to load balance VMs inside the vApp. Another example is that organizations using the same External Network are secured against each other using the Edge firewall. This is mostly the case if the External Network is the Internet and each organization is an external customer.

A vApp connected to an Edge via a vApp router

VMs are connected to a vApp Network that has the vApp router as its gateway. The vApp router automatically gets an IP from the Organization Network, which again has the Edge as its gateway. The VMs get their IPs from the vApp Network pool. This is a more complicated variant of the previous example, allowing customers to package their VMs, secure them against other vApps or VMs, or subdivide their allocated networks.

IP management

Let's have a look at IP management with vCloud. vCloud has the following three different settings for IP management of VMs:

DHCP: You will need to provide a DHCP server, as vCloud doesn't create one automatically; however, a vApp router or an Edge can provide one.

Static-IP Pool: The IP for the VM comes from the Static IP Pool of the network it is connected to. In addition to the IP, the subnet mask, DNS, gateway, and domain suffix will be configured on the VM according to the network's IP settings.

Static-Manual: The IP is defined manually; it doesn't come from the pool. The IP you define must be part of the network segment that is defined by the gateway and the subnet mask. In addition to the IP, the subnet mask, DNS, gateway, and domain suffix will be configured on the VM according to the network's IP settings.

All these settings require Guest Customization to be effective. If Guest Customization is not selected, or if the VM doesn't have VMware Tools installed, they don't work, and whatever the VM template was configured with will be used.
Instead of wasting space and retyping what you need for each recipe every time, the following are some of the basic ingredients you will need to have ready for this article:

An organization in which at least one OvDC is present. The OvDC needs to be configured with at least three free isolated networks that have a Network Pool defined.

Some VM templates of an OS type you find easy to use (Linux or Windows).

An External Network that connects you to the outside world (as in outside vCloud), for example, your desktop, and that has at least five IPs in the Static IP Pool.

One thing that needs to be said about vApps is that they actually come in two completely different versions: the vSphere vApp and the vCloud vApp.

vSphere and vCloud vApps

The vSphere vApp concept was introduced in vSphere 4.0 as a container for VMs. In vSphere, a vApp is essentially a resource pool with some extras, such as the starting and stopping order and (if you configured it) network IP allocation methods. The idea is for the vApp to be an entity of VMs that build one unit. Such vApps can then be exported or imported using OVF (Open Virtualization Format). A very good example of a vApp is VMware Operations Manager. It comes as a vApp in an OVF and contains not only the VMs but also the startup sequence as well as setup scripts. When the vApp is deployed for the first time, additional information such as network settings is asked for and then implemented.

A vSphere vApp is a resource pool; it can be configured so that it will only demand the resources it is actually using. On the other hand, resource pool configuration is something that most people struggle with. A vSphere vApp is only a resource pool; it is not automatically represented as a folder within the VMs and Templates view of vSphere, but is shown there as a vApp.

The vCloud vApp is a very different concept. First of all, it is not a resource pool; the VMs of the vCloud vApp live in the OvDC resource pool. However, the vCloud vApp is automatically a folder in the VMs and Templates view of vSphere. It is a construct that is created by vCloud and consists of VMs, a start and stop sequence, and networks. The network part is one of the major differences (next to the resource pool). In vSphere, only basic network information (IP assignment, gateway, and DNS) is stored in the vApp. A vCloud vApp actually encapsulates the networks: vCloud vApp networks are full networks, meaning they contain the full information for a given network, including network settings and IP pools. This information is kept while importing and exporting vCloud vApps.

Whenever I refer to vApps in this article, I mean vCloud vApps. If vCenter vApps feature anywhere in this article, they will be written as vCenter vApp.

Summary

In this article, we learned about different VMware concepts that will help in improving productivity. We also went through recipes that deal with daily tasks and presented new ideas and concepts that you may not have thought of before.

Resources for Article:

Further resources on this subject:
Windows 8 with VMware View [Article]
Cloning and Snapshots in VMware Workstation [Article]
vCloud Networks [Article]

Microsoft becomes the world's most valuable public company, moves ahead of Apple

Sugandha Lahoti
03 Dec 2018
Last week, Microsoft moved ahead of Apple as the most valuable publicly traded U.S. company. On Friday, the company closed with a market value of $851 billion, with Apple a few steps behind at $847 billion.

The move from Windows to Cloud

Microsoft's success can be attributed to its leadership under CEO Satya Nadella and his decision to shift the company's focus away from the flagship Windows operating system and toward cloud computing services with long-term business contracts. The company's biggest growth has happened on its Azure cloud platform. Cloud computing now accounts for more than a quarter of Microsoft's revenue, rivaling Amazon, the other leading provider.

Microsoft has also kept building new products and features for Azure. Last month, it announced container support for Azure Cognitive Services to build intelligent applications. In October, it invested in Grab to jointly conquer the Southeast Asian on-demand services market with Azure's Intelligent Cloud. In September, at Ignite 2018, the company announced major changes and improvements to its cloud offering, including Azure Functions 2.0 with better workload support for serverless, general availability of Microsoft's immutable storage for Azure Storage Blobs, and Azure DevOps. In August, Microsoft added Azure support for NVIDIA GPU Cloud (NGC) and announced a new governance DApp for Azure.

Wedbush analyst Dan Ives commented, "Azure is still in its early days, meaning there's plenty of room for growth, especially considering the company's large customer base for Office and other products. While the tech carnage seen over the last month has been brutal, shares of (Microsoft) continue to hold up like the Rock of Gibraltar."

Focus on business and values

Microsoft has also prioritized business-oriented services such as Office and other workplace software, as well as newer additions such as LinkedIn and Skype. In 2016, Microsoft bought LinkedIn, the social network for professionals, for $26.2 billion. This year, Microsoft paid $7.5 billion for GitHub, an open software platform used by 28 million programmers.

Another reason Microsoft is flourishing is its focus on upholding its founding values without compromising on issues like internet censorship and surveillance. Daniel Morgan, senior portfolio manager for Synovus Trust, says, "Microsoft is outperforming its tech rivals in part because it doesn't face as much regulatory scrutiny as advertising-hungry Google and Facebook, which have attracted controversy over their data-harvesting practices. Unlike Netflix, it's not on a hunt for a diminishing number of international subscribers. And while Amazon also has a strong cloud business, it's still more dependent on online retail."

In a recent episode of Pivot with Kara Swisher and Scott Galloway, the two speakers also talked about why Microsoft is more valuable than Apple. Scott attributed Microsoft's success to Nadella's decision to diversify Microsoft's business into enough verticals that the company hasn't been as affected by the recent decline in tech stocks. He argues that Satya Nadella deserves the title of "tech CEO of the year."

Microsoft wins $480 million US Army contract for HoloLens.
Microsoft amplifies focus on conversational AI: Acquires XOXCO; shares guide to developing responsible bots.
Microsoft announces official support for Windows 10 to build 64-bit ARM apps