How-To Tutorials

27 Dec 2016

26 min read

Building Images

27 Dec 2016

0
0
2613

Packt

27 Dec 2016

5 min read

The prototype pattern

Packt

27 Dec 2016

5 min read

In this article by Kyle Mew, author of the book Android Design Patterns and Best Practices, we will discuss how the prototype design pattern performs similar tasks to other creational patterns, such as builders and factories, but it takes a very different approach. Rather than relying heavily on a number of hardcoded subclasses, the prototype, as its name suggests, makes copies of an original, vastly reducing the number of subclasses required and preventing any lengthy creation processes. (For more resources related to this topic, see here.) Setting up a prototype The prototype is most useful when the creation of an instance is expensive in some way. This could be the loading of a large file, a detailed cross-examination of a database, or some other computationally expensive operation. Furthermore, it allows us to decouple cloned objects from their originals, allowing us to make modifications without having to reinstantiate each time. In the following example, we will demonstrate this using functions that take considerable time to calculate when first created, the nth prime number and the nth Fibonacci number. Viewed diagrammatically, our prototype will look like this: We will not need the prototype pattern in our main app as there are very few expensive creations as such. However, it is vitally important in many situations and should not be neglected. Follow the given steps to apply a prototype pattern: We will start with the following abstract class: public abstract class Sequence implements Cloneable { protected long result; private String id; public long getResult() { return result; } public String getId() { return id; } public void setId(String id) { this.id = id; } public Object clone() { Object clone = null; try { clone = super.clone(); } catch (CloneNotSupportedException e) { e.printStackTrace(); } return clone; } } Next, add this cloneable concrete class, as shown: // Calculates the 10,000th prime number public class Prime extends Sequence { public Prime() { result = nthPrime(10000); } public static int nthPrime(int n) { int i, count; for (i = 2, count = 0; count < n; ++i) { if (isPrime(i)) { ++count; } } return i - 1; } // Test for prime number private static boolean isPrime(int n) { for (int i = 2; i < n; ++i) { if (n % i == 0) { return false; } } return true; } } Add another Sequence class, like so: for the Fibonacci numbers, like so: // Calculates the 100th Fibonacci number public class Fibonacci extends Sequence { public Fibonacci() { result = nthFib(100); } private static long nthFib(int n) { long f = 0; long g = 1; for (int i = 1; i <= n; i++) { f = f + g; g = f - g; } return f; } } Next, create the cache class, as follows: public class SequenceCache { private static Hashtable<String, Sequence> sequenceHashtable = new Hashtable<String, Sequence>(); public static Sequence getSequence(String sequenceId) { Sequence cachedSequence = sequenceHashtable.get(sequenceId); return (Sequence) cachedSequence.clone(); } public static void loadCache() { Prime prime = new Prime(); prime.setId("1"); sequenceHashtable.put(prime.getId(), prime); Fibonacci fib = new Fibonacci(); fib.setId("2"); sequenceHashtable.put(fib.getId(), fib); } } Add three TextViews to your layout, and then, add the code to your Main Activity's onCreate() method. Add the following lines to the client code: // Load the cache once only SequenceCache.loadCache(); // Lengthy calculation and display of prime result Sequence prime = (Sequence) SequenceCache.getSequence("1"); primeText.setText(new StringBuilder() .append(getString(R.string.prime_text)) .append(prime.getResult()) .toString()); // Lengthy calculation and display of Fibonacci result SSequence fib = (Sequence) SequenceCache.getSequence("2"); fibText.setText(new StringBuilder() .append(getString(R.string.fib_text)) .append(fib.getResult()) .toString()); As you can see, the preceding code creates the pattern but does not demonstrate it. Once loaded, the cache can create instant copies of our previously expensive output. Furthermore, we can modify the copy, making the prototype very useful when we have a complex object and want to modify just one or two properties. Applying the prototype Consider a detailed user profile similar to what you might find on a social media site. Users modify details such as images and text, but the overall structure is the same for all profiles, making it an ideal candidate for a prototype pattern. To put this principle into practice, include the following code in your client source code: // Create a clone of already constructed object Sequence clone = (Fibonacci) new Fibonacci().clone(); // Modify the result long result = clone.getResult() / 2; // Display the result quickly cloneText.setText(new StringBuilder() .append(getString(R.string.clone_text)) .append(result) .toString()); The prototype is a very useful pattern in many occasions, where we have expensive objects to create or when we face a proliferation of subclasses. Although this is not the only pattern that helps reduce excessive sub classing, it leads us on to another design pattern, decorator. Summary In this article we explored the vital pattern, the prototype pattern and saw how vital it can be whenever we have large files or slow processes to contend with. Resources for Article: Further resources on this subject: Why we need Design Patterns? [article] Understanding Material Design [article] Asynchronous Control Flow Patterns with ES2015 and beyond [article]

0
0
10717

How-To Tutorials

Packt

27 Dec 2016

14 min read

A Smarter Way of Working with WPF

Packt

27 Dec 2016

14 min read

In this article by Sheridan Yuen, author of the book Mastering Windows Presentation Foundation, we would learn that, when Windows Presentation Foundation (WPF) was first released as part of the .NET Framework version 3.0 in 2006, it was billed as the future of desktop application Graphical User Interface (GUI) languages and supporters claimed that it would put an end to the previous GUI technology, Windows Forms. However, as time passed, it has fallen far short of this claim. (For more resources related to this topic, see here.) There are three main reasons that WPF has not taken off as widely as previously expected. The first reason has nothing to do with WPF and stems from the recent push to host everything in the cloud and having web interfaces rather than desktop applications. The second reason relates to the very steep learning curve and the very different way of working that is required to master WPF. The last reason is that it is not a very efficient language and if a WPF application has lots of 'bells and whistles' in, then either the client computers will need to have additional RAM and/or graphics cards installed, or they could face a slow and stuttering user experience. This explains why many companies that make use of WPF today are in the finance industry, where they can afford to upgrade all users' computers to be able to run their applications optimally. This article will aim to make WPF more accessible to the rest of us by providing practical tips and tricks to help build our real-world applications more easily and more efficiently. One of the simplest changes with the biggest workflow improvements that we can make to improve the way we work with WPF is to follow the MVVM software architectural pattern. It describes how we can organize our classes to make our applications more maintainable, testable, and generally simpler to understand. In this article, we will take a closer look at this pattern and discover how it can help us to improve our applications. After discovering what MVVM is and what its benefits are, we'll learn several new ways to communicate between the various components in this new environment. We'll then focus on the physical structure of the code base in a typical MVVM application and investigate a variety of alternative arrangements. What is MVVM and how does it help? Model-View-View Model (MVVM) is a software architectural pattern that was famously introduced by John Gossman on his blog back in 2005 and is now commonly used when developing WPF applications. Its main purpose is to provide a Separation of Concerns between the business model, the User Interface (UI), and the business logic. It does this by dividing them into three distinct types of core components: Models, Views, and View Models. Let's take a look at how they are arranged and what each of these components represent. As we can see here, the View Models component sits between the Models and the Views and provides two-way access to each of them. It should be noted at this point that there should be no direct relationship between the Views and Models components and only loose connections between the other components. Let's now take a closer look at what each of these components represent. Models Unlike the other MVVM components, the Model constituent comprises of a number of elements. It encompasses the business data model along with its related validation logic and also the Data Access Layer (DAL), or data repositories, that provide the application with data access and persistence. The data model represents the classes that hold the data in the application. They typically mirror the columns in the database more or less, although it is common that they are hierarchical in form, and so may require joins to be performed in the data source in order to fully populate them. One alternative would be to design the data model classes to fit the requirements in the UI, but either way, the business logic or validation rules will typically reside in the same project as the data model. The code that is used to interface with whatever data persistence technology is used in our application is also included within the Models component of the pattern. Care should be taken when it comes to organizing this component in the code base, as there are a number of issues to take into consideration. We'll investigate this further in a while, but for now, let's continue to find out more about the components in this pattern. View Models The View Models can be explained easily; each View Model provides its associated View with all of the data and functionality that it requires. In some ways, they can be considered to be similar to the old Windows Forms code behind files, except that they have no direct relationship with the View that they are serving. A better analogy, if you're familiar with MVC, would be that they are similar to the Controllers in the Model-View-Controller (MVC) software architectural pattern. In fact, in his blog, John describes the MVVM pattern as being a variation of the MVC pattern. They have two-way connections with the Model component in order to access and update the data that the Views require, and often, they transform that data in some way to make it easier to display and interact with in the UI. They also have two-way connections with the Views through data binding and property change notification. In short, View Models form the bridge between the Model and the View, which otherwise have no connection to each other. However, it should be noted that the View Models are only loosely connected to the Views and Model components through their data binding and interfaces. The beauty of this pattern enables each element to be able to function independently from each other. To maintain the separation between the View Models and the View, we avoid declaring any properties of UI-related types in the View Model. We don't want any references to UI-related DLLs in our View Models project, and so we make use of custom IValueConverter implementations to convert them to primitive types. For example, we might convert Visibility objects from the UI to plain bool values or convert the selection of some colored Brush objects to an Enum instance that is safe to use in the View Model. Views The Views define the appearance and layout of the UI. They typically connect with a View Model through the use of their DataContext property and display the data that it supplies. They expose the functionality provided by the View Model by connecting its commands to the UI controls that the users interact with. In general, the basic rule of thumb is that each View has one associated View Model. This does not mean that a View cannot data bind to more than one data source or that we cannot reuse View Models. It simply means that, in general, if we have a class called SecurityView, it is more than likely that we'll also have an instance of a class named SecurityViewModel that will be set as the value of that View's DataContext property. Data binding One often overlooked aspect of the MVVM pattern is its requirement for data binding. We could not have the full Separation of Concerns without it, as there would be no easy way of communicating between the Views and View Models. The XAML markup, data binding classes, and ICommand and INotifyPropertyChanged interfaces are the main tools in WPF that provide this functionality. The ICommand interface is how commanding is implemented in the .NET Framework. It provides behavior that implements and even extends the ever useful Command pattern, in which an object encapsulates everything needed to perform an action. Most of the UI controls in WPF have Command properties that we can use to connect them to the functionality that the commands provide. The INotifyPropertyChanged interface is used to notify binding clients that property values have been changed. For example, if we had a User object and it had a Name property, then our User class would be responsible for raising the PropertyChanged event of the INotifyPropertyChanged interface, specifying the name of the property each time its value was changed. We'll look much deeper into all of this later, but now let's see how the arrangement of these components help us. So how does MVVM help? One major benefit of adopting MVVM is that it provides the crucial Separation of Concerns between the business model, the UI, and the business logic. This enables us to do several things. It frees the View Models from the Models, both the business model and the data persistence technology. This in turn enables us to reuse the business model in other applications and swap out the DAL and replace it with a mock data layer so that we can effectively test the functionality in our view models without requiring any kind of real data connection. It also disconnects the Views from the View logic that they require, as this is provided by the View Models. This allows us to run each component independently, which has the advantage of enabling one team to work on designing the Views, while another team works on the View Models. Having parallel work streams enables companies to benefit from vastly reduced production times. Furthermore, this separation also makes it easier for us to swap the Views for a different technology without needing to change our Model code. We may well need to change some aspects of the View Models, for example, the new technology used for the Views may not support the ICommand interface, but in principal, the amount of code that we would need to change would be fairly minimal. The simplicity of the MVVM pattern also makes WPF easier to comprehend. Knowing that each View has a View Model that provides it with all the data and functionality that it requires means that we always know where to look when we want to find where our data bound properties have been declared. Is there a downside? There are, however, a few drawbacks to using MVVM, and it will not help us in every situation. The main downside to implementing MVVM is that it adds a certain level of complexity to our applications. First, there's the data binding, which can take some time to master. Also, depending on your version of Visual Studio, data binding errors may only appear at runtime and can be very tricky to track down. Then, there are the different ways to communicate between the Views and View Models that we need to understand. Commanding and handling events in an unusual way takes a while to get used to. Having to discover the optimal arrangement of all the required components in the code base also takes time. So, there is a steep learning curve to climb before we can become competent at implementing MVVM for sure. However, even when we are well practiced at the pattern, there are still occasional situations when it wouldn't make sense to implement MVVM. One example would be if our application was going to be very small, it would be unlikely that we would want to have unit tests for it or swap out any of its components. It would, therefore, be impractical to go through the added complexity of implementing the pattern when the benefits of the Separation of Concerns that it provides were not required. Debunking the myth about code behind One of the great misconceptions about MVVM is that we should avoid putting any code into the code behind files of our Views. While there is some truth to this, it is certainly not true in all situations. If we think logically for a moment, we already know that the main reason to use MVVM is to take advantage of the Separation of Concerns that its architecture provides. Part of this separates the business functionality in the View Model from the user interface-related code in the Views. Therefore, the rule should really be we should avoid putting any business logic into the code behind files of our Views. Keeping this in mind, let's look at what code we might want to put into the code behind file of a View. The most likely suspects would be some UI-related code, maybe handling a particular event, or launching a child window of some kind. In these cases, using the code behind file would be absolutely fine. We have no business-related code here, and so we have no need to separate it from the other UI-related code. On the other hand, if we had written some business-related code in a View's code behind file, then how could we test it? In this case, we would have no way to separate this from the View, no longer have our Separation of Concerns and, therefore, would have broken our implementation of MVVM. So in cases like this, the myth is no longer a myth… it is good advice. However, even in cases like this where we want to call some business-related code from a View, it is possible to achieve without breaking any rules. As long as our business code resides in a View Model, it can be tested through that View Model, so it's not so important where it is called from during runtime. Understanding that we can always access the View Model that is data bound to a View's DataContext property, let's look at this simple example: private void Button_Click(object sender, RoutedEventArgs e) { UserViewModel viewModel = (UserViewModel)DataContext; viewModel.PerformSomeAction(); } Now, there are some who would balk at this code example, as they correctly believe that Views should not know anything about their related View Models. This code effectively ties this View Model to this View. If we wanted to change the UI layer in our application at some point or have designers work on the View, then this code would cause us a problem. However, we need to be realistic… what is the likelihood that we will really need do that? If it is likely, then we really shouldn't put this code into the code behind file and instead handle the event by wrapping it in an Attached Property, however, if it is not at all likely, then there is really no problem with putting it there. Let's follow rules when they make sense for us to follow them rather than blindly sticking to them because somebody in a different scenario said they were a good idea. One other situation when we can ignore this 'No code behind' rule is when writing self-contained controls based on the UserControl class. In these cases, the code behind files are often used for defining Dependency Properties and/or handling UI events and for implementing general UI functionality. Remember though, if these controls are implementing some business-related functionality, we should write that into a View Model and call it from the control so that it can still be tested. There is definitely perfect sense in the general idea of avoiding writing business-related code in the code behind files of our Views and we should always try to do so. However, we now hopefully understand the reasoning behind this idea and can use our logic to determine whether it is ok to do it or not in each particular case that may arise. Summary In this article, we have discovered what the MVVM architectural pattern is and the benefits of using it when developing WPF applications. We also learned about the various components of the MVVM architectural pattern and their functionalities. We're now in a better position to decide which applications to use it with and which not to. Resources for Article: Further resources on this subject: Features of Dynamics GP [article] Auditing and E-discovery [article] Installing Microsoft Forefront UAG [article]

0
0
1167

How-To Tutorials

article-image-planning-failure-and-success

Packt

27 Dec 2016

24 min read

Planning for Failure (and Success)

Packt

27 Dec 2016

24 min read

In this article by Michael Solberg and Ben Silverman, the author of the book Openstack for Architects, we will be walking through how to architect your cloud to avoid hardware and software failures. The OpenStack control plane is comprised of web services, application services, database services, and a message bus. Each of these tiers require different approaches to make them highly available and some organizations will already have defined architectures for each of the services. We've seen that customers either reuse those existing patterns or adopt new ones which are specific to the OpenStack platform. Both of these approaches make sense, depending on the scale of the deployment. Many successful deployments actually implement a blend of these. For example, if your organization already has a supported pattern for highly available MySQL databases, you might chose that pattern instead of the one outlined in this article. If your organization doesn't have a pattern for highly available MongoDB, you might have to architect a new one. (For more resources related to this topic, see here.) Building a highly available control plane Back in the Folsom and Grizzly days, coming up with an high availability (H/A) design for the OpenStack control plane was something of a black art. Many of the technologies recommended in the first iterations of the OpenStack High Availability Guide were specific to the Ubuntu distribution of Linux and were unavailable on the Red Hat Enterprise Linux-derived distributions. The now-standard cluster resource manager (Pacemaker) was unsupported by Red Hat at that time. As such, architects using Ubuntu might use one set of software, those using CentOS or RHEL might use another set of software, and those using a Rackspace or Mirantis distribution might use yet another set of software. However, these days, the technology stack has converged and the H/A pattern is largely consistent regardless of the distribution used. About failure and success When we design a highly available OpenStack control plane, we're looking to mitigate two different scenarios: The first is failure. When a physical piece of hardware dies, we want to make sure that we recover without human interaction and continue to provide service to our users The second and perhaps more important scenario is success Software systems always work as designed and tested until humans start using them. While our automated test suites will try to launch a reasonable number of virtual objects, humans are guaranteed to attempt to launch an unreasonable number. Also, many of the OpenStack projects we've worked on have grown far past their expected size and need to be expanded on the fly. There are a few different types of success scenarios that we need to plan for when architecting an OpenStack cloud. First, we need to plan for a growth in the number of instances. This is relatively straightforward. Each additional instance grows the size of the database, it grows the amount of metering data in Ceilometer, and, most importantly, it will grow the number of compute nodes. Adding compute nodes and reporting puts strain on the message bus, which is typically the limiting factor in the size of OpenStack regions or cells. We'll talk more about this when we talk about dividing up OpenStack clouds into regions, cells, and Availability Zones. The second type of growth we need to plan for is an increase in the number of API calls. Deployments which support Continuous Integration(CI) development environments might have (relatively) small compute requirements, but CI typically brings up and tears down environments rapidly. This will generate a large amount of API traffic, which in turn generates a large amount of database and message traffic. In hosting environments, end users might also manually generate a lot of API traffic as they bring up and down instances, or manually check the status of deployments they've already launched. While a service catalog might check the status of instances it has launched on a regular basis, humans tend to hit refresh on their browsers in an erratic fashion. Automated testing of the platform has a tendency to grossly underestimate this kind of behavior. With that in mind, any pattern that we adopt will need to provide for the following requirements: API services must continue to be available during a hardware failure in the control plane The systems which provide API services must be horizontally scalable (and ideally elastic) to respond to unanticipated demands The database services must be vertically or horizontally scalable to respond to unanticipated growth of the platform The message bus can either be vertically or horizontally scaled depending on the technology chosen Finally, every system has its limits. These limits should be defined in the architecture documentation so that capacity planning can account for them. At some point, the control plane has scaled as far as it can and a second control plane should be deployed to provide additional capacity. Although OpenStack is designed to be massively scalable, it isn't designed to be infinitely scalable. High availability patterns for the control plane There are three approaches commonly used in OpenStack deployments these days for achieving high availability of the control plane. The first is the simplest. Take the single-node cloud controller virtualize it, and then make the virtual machine highly available using either VMware clustering or Linux clustering. While this option is simple and it provides for failure scenarios, it scales vertically (not horizontally) and doesn't provide for success scenarios. As such, it should only be used in regions with a limited number of compute nodes and a limited number of API calls. In practice, this method isn't used frequently and we won't spend any more time on it here. The second pattern provides for H/A, but not horizontal scalability. This is the "Active/Passive" scenario described in the OpenStack High Availability Guide. At Red Hat, we used this a lot with our Folsom and Grizzly deployments, but moved away from it starting with Havana. It's similar to the virtualization solution described earlier but instead of relying on VMware clustering or Linux clustering to restart a failed virtual machine, it relies on Linux clustering to restart failed services on a second cloud controller node, also running the same subset of services. This pattern doesn't provide for success scenarios in the Web tier, but can still be used in the database and messaging tiers. Some networking services may still need to be provided as Active/Passive as well. The third H/A pattern available to OpenStack architectures is the Active/Active pattern. In this pattern, services are horizontally scaled out behind a load balancing service or appliance, which is Active/Passive. As a general rule, most OpenStack services should be enabled as Active/Active where possible to allow for success scenarios while mitigating failure scenarios. Ideally, Active/Active services can be scaled out elastically without service disruption by simply adding additional control plane nodes. Both of the Active/Passive and Active/Active designs require clustering software to determine the health of services and the hosts on which they run. In this article, we'll be using Pacemaker as the cluster manager. Some architects may choose to use Keepalived instead of Pacemaker. Active/Passive service configuration In the Active/Passive service configuration, the service is configured and deployed to two or more physical systems. The service is associated with a Virtual IP(VIP)address. A cluster resource manager (normally Pacemaker) is used to ensure that the service and its VIP are enabled on only one of the two systems at any point in time. The resource manager may be configured to favor one of the machines over the other. When the machine that the service is running on fails, the resource manager first ensures that the failed machine is no longer running and then it starts the service on the second machine. Ensuring that the failed machine is no longer running is accomplished through a process known as fencing. Fencing usually entails powering off the machine using the management interface on the BIOS. The fence agent may also talk to a power supply connected to the failed server to ensure that the system is down. Some services (such as the Glance image registry) require shared storage to operate. If the storage is network-based, such as NFS, the storage may be mounted on both the active and the passive nodes simultaneously. If the storage is block-based, such as iSCSI, the storage will only be mounted on the active node and the resource manager will ensure that the storage migrates with the service and the VIP. Active/Active service configuration Most of the OpenStack API services are designed to be run on more than one system simultaneously. This configuration, the Active/Active configuration, requires a load balancer to spread traffic across each of the active services. The load balancer manages the VIP for the service and ensures that the backend systems are listening before forwarding traffic to them. The cluster manager ensures that the VIP is only active on one node at a time. The backend services may or may not be managed by the cluster manager in the Active/Active configuration. Service or system failure is detected by the load balancer and failed services are brought out of rotation. There are a few different advantages to the Active/Active service configuration, which are as follows: The first advantage is that it allows for horizontal scalability. If additional capacity is needed for a given service, a new system can be brought up which is running the service and it can be added into rotation behind the load balancer without any downtime. The control plane may also be scaled down without downtime in the event that it was over-provisioned. The second advantage is that Active/Active services have a much shorter mean time to recovery. Fencing operations often take up to 2 minutes and fencing is required before the cluster resource manager will move a service from a failed system to a healthy one. Load balancers can immediately detect system failure and stop sending requests to unresponsive nodes while the cluster manager fences them in the background. Whenever possible, architects should employ the Active/Active pattern for the control plane services. OpenStack service specifics In this section, we'll walk through each of the OpenStack services and outline the H/A strategy for them. While most of the services can be configured as Active/Active behind a load balancer, some of them must be configured as Active/Passive and others may be configured as Active/Passive. Some of the configuration is dependent on a particular version of OpenStack as well, especially, Ceilometer, Heat, and Neutron. The following details are current as of the Liberty release of OpenStack. The OpenStack web services As a general rule, all of the web services and the Horizon dashboard may be run Active/Active. These include the API services for Keystone, Glance, Nova, Cinder, Neutron, Heat, and Ceilometer. The scheduling services for Nova, Cinder, Neutron, Heat, and Ceilometer may also be deployed Active/Active. These services do not require a load balancer, as they respond to requests on the message bus. The only web service which must be run Active/Passive is the Ceilometer Central agent. This service can be configured to split its workload among multiple instances, however, to support scaling horizontally. The database services All state for the OpenStack web services is stored in a central database—usually a MySQL database. MySQL is usually deployed in an Active/Passive configuration, but can be made Active/Active with the Galera replication extension. Galera is clustering software for MySQL (MariaDB in OpenStack) and this uses synchronous replication to achieve H/A. However, even with Galera, we still recommend directing writes to only one of the replicas—some queries used by the OpenStack services may deadlock when writing to more than one master. With Galera, a load balancer is typically deployed in front of the cluster and is configured to deliver traffic to only one replica at a time. This configuration reduces the mean time to recovery of the service while ensuring that the data is consistent. In practice, many organizations will defer to the database architects for their preference regarding highly available MySQL deployments. After all, it is typically the database administration team who is responsible for responding to failures of that component. Deployments which use the Ceilometer service also require a MongoDB database to store telemetry data. MongoDB is horizontally scalable by design and is typically deployed Active/Active with at least three replicas. The message bus All OpenStack services communicate through the message bus. Most OpenStack deployments these days use the RabbitMQ service as the message bus. RabbitMQ can be configured to be Active/Active through a facility known as "mirrored queues". The RabbitMQ service is not load balanced, each service is given a list of potential nodes and the client is responsible for determining which nodes are active and which ones have failed. Other messaging services used with OpenStack such as ZeroMQ, ActiveMQ, or Qpid may have different strategies and configurations for achieving H/A and horizontal scalability. For these services, refer to the documentation to determine the optimal architecture. Compute, storage, and network agents The compute, storage, and network components in OpenStack has a set of services which perform the work which is scheduled by the API services. These services register themselves with the schedulers on start up over the message bus. The schedulers are responsible for determining the health of the services and scheduling work to active services. The compute and storage services are all designed to be run Active/Active but the network services need some extra consideration. Each hypervisor in an OpenStack deployment runs the nova-compute service. When this service starts up, it registers itself with the nova-scheduler service. A list of currently available nova services is available via the nova service-list command. If a compute node is unavailable, its state is listed as down and the scheduler skips it when performing instance actions. When the node becomes available, the scheduler includes it in the list of available hosts. For KVM or Xen-based deployments, the nova-compute service runs once per hypervisor and is not made highly available. For VMware-based deployments though, a single nova-compute service is run for every vSphere cluster. As such, this service should be made highly available in an Active/Passive configuration. This is typically done by virtualizing the service within a vSphere cluster and configuring the virtual machine to be highly available. Cinder includes a service known as the volume service or cinder-volume. The volume service registers itself with the Cinder scheduler on startup and is responsible for creating, modifying, or deleting LUNs on block storage devices. For backends which support multiple writers, multiple copies of this service may be run in Active/Active configuration. The LVM backend (which is the reference backend) is not highly available, though, and may only have one cinder-volume service for each block device. This is because the LVM backend is responsible for providing iSCSI access to a locally attached storage device. For this reason, highly available deployments of OpenStack should avoid the LVM Cinder backend and instead use a backend that supports multiple cinder-volume services. Finally, the Neutron component of OpenStack has a number of agents, which all require some special consideration for highly available deployments. The DHCP agent can be configured as highly available, and the number of agents which will respond to DHCP requests for each subnet is governed by a parameter in the neutron.conf file, dhcp_agents_per_network. This is typically set to 2, regardless of the number of DHCP agents which are configured to run in a control plane. For most of the history of OpenStack, the L3 routing agent in Neutron has been a single point of failure. It could be made highly available in Active/Passive configuration, but its failover meant the interruption of network connections in the tenant space. Many of the third-party Neutron plugins have addressed this in different ways and the reference Open vSwitch plugin has a highly available L3 agent as of the Juno release. For details on implementing a solution to the single routing point of failure using OpenStack's Distributed Virtual Routers (DVR), refer to the OpenStack Foundation's Neutron documentation at http://docs.openstack.org/liberty/networking-guide/scenario-dvr-ovs.html. Regions, cells, and availability Zones As we mentioned before, OpenStack is designed to be scalable, but not infinitely scalable. There are three different techniques architects can use to segregate an OpenStack cloud—regions, cells, and Availability Zones. In this section, we'll walk through how each of these concepts maps to hypervisor topologies. Regions From an end user's perspective, OpenStack regions are equivalent to regions in Amazon Web Services. Regions live in separate data centers and are often named after their geographical location. If your organization has a data center in Phoenix and one in Raleigh (like ours does) you'll have at least a PHX and a RDU region. Users who want to geographically disperse their workloads will place some of them in PHX and some of them in RDU. Regions have separate API endpoints, and although the Horizon UI has some support for multiple regions, they essentially entirely separate deployments. From an architectural standpoint, there are two main design choices for implementing regions, which are as follows: The first is around authorization. Users will want to have the same credentials for accessing each of the OpenStack regions. There are a few ways to accomplish this. The simplest way is to use a common backing store (usually LDAP) for the Keystone service in each region. In this scenario, the user has to authenticate separately to each region to get a token, but the credentials are the same. In Juno and later, Keystone also supports federation across regions. In this scenario, a Keystone token granted by one region can be presented to another region to authenticate a user. While this currently isn't widely used, it is a major focus area for the OpenStack Foundation and will probably see broader adoption in the future. The second major consideration for regional architectures is whether or not to present a single set of Glance images to each region. While work is currently being done to replicate Glance images across federated clouds, most organizations are manually ensuring that the shared images are consistent. This typically involves building a workflow around image publishing and deprecation which is mindful of the regional layout. Another option for ensuring consistent images across regions is to implement a central image repository using Swift. This also requires shared Keystone and Glance services which span multiple data centers. Details on how to design multiple regions with shared services are in the OpenStack Architecture Design Guide. Cells The Nova compute service has a concept of cells, which can be used to segregate large pools of hypervisors within a single region. This technique is primarily used to mitigate the scalability limits of the OpenStack message bus. The deployment at CERN makes wide use of cells to achieve massive scalability within single regions. Support for cells varies from service to service and as such cells are infrequently used outside a few very large cloud deployments. The CERN deployment is well-documented and should be used as a reference for these types of deployments. In our experience, it's much simpler to deploy multiple regions within a single data center than to implement cells to achieve large scale. The added inconvenience of presenting your users with multiple API endpoints within a geographic location is typically outweighed by the benefits of having a more robust platform. If multiple control planes are available in a geographic region, the failure of a single control plane becomes less dramatic. The cells architecture has its own set of challenges with regard to networking and scheduling of instance placement. Some very large companies that support the OpenStack effort have been working for years to overcome these hurdles. However, many different OpenStack distributions are currently working on a new control plane design. These new designs would begin to split the OpenStack control plane into containers running the OpenStack services in a microservice type architecture. This way the services themselves can be placed anywhere and be scaled horizontally based on the load. One architecture that has garnered a lot of attention lately is the Kolla project that promotes Docker containers and Ansible playbooks to provide production-ready containers and deployment tools for operating OpenStack clouds. To see more, go to https://wiki.openstack.org/wiki/Kolla. Availability Zones Availability Zones are used to group hypervisors within a single OpenStack region. Availability Zones are exposed to the end user and should be used to provide the user with an indication of the underlying topology of the cloud. The most common use case for Availability Zones is to expose failure zones to the user. To ensure the H/A of a service deployed on OpenStack, a user will typically want to deploy the various components of their service onto hypervisors within different racks. This way, the failure of a top of rack switch or a PDU will only bring down a portion of the instances which provide the service. Racks form a natural boundary for Availability Zones for this reason. There are a few other interesting uses of Availability Zones apart from exposing failure zones to the end user. One financial services customer we work with had a requirement for the instances of each line of business to run on dedicated hardware. A combination of Availability Zones and the AggregateMultiTenancyIsolation Nova Scheduler filter were used to ensure that each tenant had access to dedicated compute nodes. Availability Zones can also be used to expose hardware classes to end users. For example, hosts with faster processors might be placed in one Availability Zone and hosts with slower processors might be placed in different Availability Zones. This allows end users to decide where to place their workloads based upon compute requirements. Updating the design document In this article, we walked through the different approaches and considerations for achieving H/A and scalability in OpenStack deployments. As Cloud Architects, we need to decide on the correct approach for our deployment and then document it thoroughly so that it can be evaluated by the larger team in our organization. Each of the major OpenStack vendors has a reference architecture for highly available deployments and those should be used as a starting point for the design. The design should then be integrated with existing Enterprise Architecture and modified to ensure that best practices established by the various stakeholders within an organization are followed. The system administrators within an organization may be more comfortable supporting Pacemaker than Keepalived. The design document presents the choices made for each of these key technologies and gives the stakeholders an opportunity to comment on them before the deployment. Planning the physical architecture The simplest way to achieve H/A is to add additional cloud controllers to the deployment and cluster them. Other deployments may choose to segregate services into different host classes, which can then be clustered. This may include separating the database services into database nodes, separating the messaging services into messaging nodes, and separating the memcached service into memcache nodes. Load balancing services might live on their own nodes as well. The primary considerations for mapping scalable services to physical (or virtual) hosts are the following: Does the service scale horizontally or vertically? Will vertically scaling the service impede the performance of other co-located services? Does the service have particular hardware or network requirements that other services don't have? For example, some OpenStack deployments which use the HAProxy load balancing service chose to separate out the load balancing nodes on a separate hardware. The VIPs which the load balancing nodes host must live on a public, routed network, while the internal IPs of services that they route to don't have that requirement. Putting the HAProxy service on separate hosts allows the rest of the control plane to only have private addressing. Grouping all of the API services on dedicated hosts may ease horizontal scalability. These services don't need to be managed by a cluster resource manager and can be scaled by adding additional nodes to the load balancers without having to update cluster definitions. Database services have high I/O requirements. Segregating these services onto machines which have access to high performance fiber channel may make sense. Finally, you should consider whether or not to virtualize the control plane. If the control plane will be virtualized, creating additional host groups to host dedicated services becomes very attractive. Having eight or nine virtual machines dedicated to the control plane is a very different proposition than having eight or nine physical machines dedicated to the control plane. Most highly available control planes require at least three nodes to ensure that quorum is easily determined by the cluster resource manager. While dedicating three physical nodes to the control function of a hundred node OpenStack deployment makes a lot of sense, dedicating nine physical nodes may not. Many of the organizations that we've worked with will already have a VMware-based cluster available for hosting management appliances and the control plane can be deployed within that existing footprint. Organizations which are deploying a KVM-only cloud may not want to incur the additional operational complexity of managing the additional virtual machines outside OpenStack. Updating the physical architecture design Once the mapping of services to physical (or virtual) machines has been determined, the design document should be updated to include definition of the host groups and their associated functions. A simple example is provided as follows: Load balancer: These systems provide the load balancing services in an Active/Passive configuration Cloud controller: These systems provide the API services, the scheduling services, and the Horizon dashboard services in an Active/Active configuration Database node: These systems provide the MySQL database services in an Active/Passive configuration Messaging node: These systems provide the RabbitMQ messaging services in an Active/Active configuration Compute node: These systems act as KVM hypervisors and run the nova-compute and openvswitch-agent services Deployments which will be using only the cloud controller host group might use the following definitions: Cloud controller: These systems provide the load balancing services in an Active/Passive configuration and the API services, MySQL database services, and RabbitMQ messaging services in an Active/Active configuration Compute node: These systems act as KVM hypervisors and run the nova-compute and openvswitch-agent services After defining the host groups, the physical architecture diagram should be updated to reflect the mapping of host groups to physical machines in the deployment. This should also include considerations for network connectivity. The following is an example architecture diagram for inclusion in the design document: Summary A complete guide to implementing H/A of the OpenStack services is probably worth a book to itself. In this article we started out by covering the main strategies for making OpenStack services highly available and which strategies apply well to each service. Then we covered how OpenStack deployments are typically segmented across physical regions. Finally, we updated our documentation and implemented a few of the technologies we discussed in the lab. While walking through the main considerations for highly available deployments in this article, we've tried to emphasize a few key points: Scalability is at least as important as H/A in cluster design. Ensure that your design is flexible in case of unexpected growth. OpenStack doesn't scale forever. Plan for multiple regions. Also, it's important to make sure that the strategy and architecture that you adopt for H/A is supportable by your organization. Consider reusing existing architectures for H/A in the message bus and database layers. Resources for Article: Further resources on this subject: Neutron API Basics [article] The OpenFlow Controllers [article] OpenStack Networking in a Nutshell [article]

0
0
10142

article-image-orchestration-docker-swarm

Packt

27 Dec 2016

28 min read

Orchestration with Docker Swarm

Packt

27 Dec 2016

28 min read

0
1
10212

Packt

26 Dec 2016

25 min read

Introduction to Ansible

Packt

26 Dec 2016

25 min read

In this article by Walter Bentley, the author of the book OpenStack Administration with Ansible 2 - Second Edition. This article will serve as a high-level overview of Ansible 2.0 and components that make up this open source configuration management tool. We will cover the definition of the Ansible components and their typical use. Also, we will discuss how to define variables for the roles and defining/setting facts about the hosts for the playbooks. Next, we will transition into how to set up your Ansible environment and the ways you can define the host inventory used to run your playbooks against. We will then cover some of the new components introduced in Ansible 2.0 named Blocks and Strategies. It will also review the cloud integrations natively part of the Ansible framework. Finally, the article will finish up with a working example of a playbook that will confirm the required host connectivity needed to use Ansible. The following topics are covered: Ansible 2.0 overview What are playbooks, roles, and modules? Setting up the environment Variables and facts Defining the inventory Blocks and Strategies Cloud integrations (For more resources related to this topic, see here.) Ansible 2.0 overview Ansible in its simplest form has been described as a python-based open source IT automation tool that can be used to configuremanage systems, deploy software (or almost anything), and provide orchestration to a process. These are just a few of the many possible use cases for Ansible. In my previous life as a production support infrastructure engineer, I wish such a tool would have existed. Would have surely had much more sleep and a lot less gray hairs. One thing that always stood out to me in regard to Ansible is that the developer's first and foremost goal was to create a tool that offers simplicity and maximum ease of use. In the world filled with complicated and intricate software, keeping it simple goes a long way for most IT professionals. Staying with the goal of keeping things simple, Ansible handles configuration/management of hosts solely through Secure Shell (SSH). Absolutely no daemon or agent is required. The server or workstation where you run the playbooks from only needs python and a few other packages, most likely already present, installed. Honestly, it does not get simpler than that. The automation code used with Ansible is organized into something named playbooks and roles, of which is written in YAML markup format. Ansible follows the YAML formatting and structure within the playbooks/roles. Being familiar with YAML formatting helps in creating your playbooks/roles. If you are not familiar do not worry, as it is very easy to pick up (it is all about the spaces and dashes). The playbooks and roles are in a noncomplied format making the code very simple to read if familiar with standard UnixLinux commands. There is also a suggested directory structure in order to create playbooks. This also is one of my favorite features of Ansible. Enabling the ability to review and/or use playbooks written by anyone else with little to no direction needed. It is strongly suggested that you review the Ansible playbook best practices before getting started: http://docs.ansible.com/playbooks_best_practices.html. I also find the overall Ansible website very intuitive and filled with great examples at http://docs.ansible.com. My favorite excerpt from the Ansible playbook best practices is under the Content Organization section. Having a clear understanding of how to organize your automation code proved very helpful to me. The suggested directory layout for playbooks is as follows: group_vars/ group1 # here we assign variables to particular groups group2 # "" host_vars/ hostname1 # if systems need specific variables, put them here hostname2 # "" library/ # if any custom modules, put them here (optional) filter_plugins/ # if any custom filter plugins, put them here (optional) site.yml # master playbook webservers.yml # playbook for webserver tier dbservers.yml # playbook for dbserver tier roles/ common/ # this hierarchy represents a "role" tasks/ # main.yml # <-- tasks file can include smaller files if warranted handlers/ # main.yml # <-- handlers file templates/ # <-- files for use with the template resource ntp.conf.j2 # <------- templates end in .j2 files/ # bar.txt # <-- files for use with the copy resource foo.sh # <-- script files for use with the script resource vars/ # main.yml # <-- variables associated with this role defaults/ # main.yml # <-- default lower priority variables for this role meta/ # main.yml # <-- role dependencies It is now time to dig deeper into reviewing what playbooks, roles, and modules consist of. This is where we will break down each of these component's distinct purposes. What are playbooks, roles, and modules? The automation code you will create to be run by Ansible is broken down in hierarchical layers. Envision a pyramid with its multiple levels of elevation. We will start at the top and discuss playbooks first. Playbooks Imagine that a playbook is the very topmost triangle of the pyramid. A playbook takes on the role of executing all of the lower level code contained in a role. It can also be seen as a wrapper to the roles created. We will cover the roles in the next section. Playbooks also contain other high-level runtime parameters, such as the host(s) to run the playbook against, the root user to use, and/or if the playbook needs to be run as a sudo user. These are just a few of the many playbook parameters you can add. Below is an example of what the syntax of a playbook looks like: --- # Sample playbooks structure/syntax. - hosts: dbservers remote_user: root become: true roles: - mysql-install In the preceding example, you will note that the playbook begins with ---. This is required as the heading (line 1) for each playbook and role. Also, please note the spacing structure at the beginning of each line. The easiest way to remember it is each main command starts with a dash (-). Then, every subcommand starts with two spaces and repeats the lower in the code hierarchy you go. As we walk through more examples, it will start to make more sense. Let's step through the preceding example and break down the sections. The first step in the playbook was to define what hosts to run the playbook against; in this case, it was dbservers (which can be a single host or list of hosts). The next area sets the user to run the playbook as locally, remotely, and it enables executing the playbook as sudo. The last section of the syntax lists the roles to be executed. The earlier example is similar to the formatting of the other playbooks. This format incorporates defining roles, which allows for scaling out playbooks and reusability (you will find the most advanced playbooks structured this way). With Ansible's high level of flexibility, you can also create playbooks in a simpler consolidated format. An example of such kind is as follows: --- # Sample simple playbooks structure/syntax - name: Install MySQL Playbook hosts: dbservers remote_user: root become: true tasks: - name: Install MySQL apt: name={{item}} state=present with_items: - libselinux-python - mysql - mysql-server - MySQL-python - name: Copying my.cnf configuration file template: src=cust_my.cnf dest=/etc/my.cnf mode=0755 - name: Prep MySQL db command: chdir=/usr/bin mysql_install_db - name: Enable MySQL to be started at boot service: name=mysqld enabled=yes state=restarted - name: Prep MySQL db command: chdir=/usr/bin mysqladmin -u root password 'passwd' Now that we have reviewed what playbooks are, we will move on to reviewing roles and their benefits. Roles Moving down to the next level of the Ansible pyramid, we will discuss roles. The most effective way to describe roles is the breaking up a playbook into multiple smaller files. So, instead of having one long playbook with multiple tasks defined, all handling separately related steps, you can break the playbook into individual specific roles. This format keeps your playbooks simple and leads to the ability to reuse roles between playbooks. The best advice I personally received concerning creating roles is to keep them simple. Try to create a role to do a specific function, such as just installing a software package. You can then create a second role to just do configurations. In this format, you can reuse the initial installation role over and over without needing to make code changes for the next project. The typical syntax of a role can be found here and would be placed into a file named main.yml within the roles/<name of role>/tasks directory: --- - name: Install MySQL apt: name={{item}} state=present with_items: - libselinux-python - mysql - mysql-server - MySQL-python - name: Copying my.cnf configuration file template: src=cust_my.cnf dest=/etc/my.cnf mode=0755 - name: Prep MySQL db command: chdir=/usr/bin mysql_install_db - name: Enable MySQL to be started at boot service: name=mysqld enabled=yes state=restarted - name: Prep MySQL db command: chdir=/usr/bin mysqladmin -u root password 'passwd' The complete structure of a role is identified in the directory layout found in the Ansible Overview section of this article. We will review additional functions of roles as we step through the working examples. With having covered playbooks and roles, we are prepared to cover the last topic in this session, which are modules. Modules Another key feature of Ansible is that it comes with predefined code that can control system functions, named modules. The modules are executed directly against the remote host(s) or via playbooks. The execution of a module generally requires you to pass a set number of arguments. The Ansible website (http://docs.ansible.com/modules_by_category.html) does a great job of documenting every available module and the possible arguments to pass to that module. The documentation for each module can also be accessed via the command line by executing the command ansible-doc <module name>. The use of modules will always be the recommended approach within Ansible as they are written to avoid making the requested change to the host unless the change needs to be made. This is very useful when re-executing a playbook against a host more than once. The modules are smart enough to know not to re-execute any steps that have already completed successfully, unless some argument or command is changed. Another thing worth noting is with every new release of Ansible additional modules are introduced. Personally, there was an exciting addition to Ansible 2.0, and these are the updated and extended set of modules set to ease the management of your OpenStack cloud. Referring back to the previous role example shared earlier, you will note the use of various modules. The modules used are highlighted here again to provide further clarity: --- - name: Install MySQL apt: name={{item}} state=present with_items: - libselinux-python - mysql - mysql-server - MySQL-python - name: Copying my.cnf configuration file template: src=cust_my.cnf dest=/etc/my.cnf mode=0755 - name: Prep MySQL db command: chdir=/usr/bin mysql_install_db - name: Enable MySQL to be started at boot service: name=mysqld enabled=yes state=restarted ... Another feature worth mentioning is that you are able to not only use the current modules, but you can also write your very own modules. Although the core of Ansible is written in python, your modules can be written in almost any language. Underneath it, all the modules technically return JSON format data, thus allowing for the language flexibility. In this section, we were able to cover the top two sections of our Ansible pyramid, playbooks, and roles. We also reviewed the use of modules, that is, the built-in power behind Ansible. Next, we transition into another key features of Ansible—variable substitution and gathering host facts. Setting up the environment Before you can start experimenting with Ansible, you must install it first. There was no need in duplicating all the great documentation to accomplish this already created on http://docs.ansible.com/ . I would encourage you to go to the following URL and choose an install method of your choice: http://docs.ansible.com/ansible/intro_installation.html. If you are installing Ansible on Mac OS, I found using Homebrew was much simpler and consistent. More details on using Homebrew can be found at http://brew.sh. The command to install Ansible with Homebrew is brew install ansible. Upgrading to Ansible 2.0 It is very important to note that in order to use the new features part of Ansible version 2.0, you must update the version running on your OSA deployment node. The version currently running on the deployment node is either 1.9.4 or 1.9.5. The method that seemed to work well every time is outlined here. This part is a bit experimental, so please make a note of any warnings or errors incurred. From the deployment node, execute the following commands: $ pip uninstall -y ansible $ sed -i 's/^export ANSIBLE_GIT_RELEASE.*/export ANSIBLE_GIT_RELEASE=${ANSIBLE_GIT_RELEASE:-"v2.1.1.0-1"}/' /opt/openstack-ansible/scripts/bootstrap-ansible.sh $ cd /opt/openstack-ansible $ ./scripts/bootstrap-ansible.sh New OpenStack client authenticate Alongside of the introduction of the new python-openstackclient, CLI was the unveiling of the os-client-config library. This library offers an additional way to provide/configure authentication credentials for your cloud. The new OpenStack modules part of Ansible 2.0 leverages this new library through a package named shade. Through the use of os-client-config and shade, you can now manage multiple cloud credentials within a single file named clouds.yml. When deploying OSA, I discovered that shade will search for this file in the $HOME/.config/openstack/ directory wherever the playbook/role and CLI command is executed. A working example of the clouds.yml file is shown as follows: # Ansible managed: /etc/ansible/roles/openstack_openrc/templates/clouds.yaml.j2 modified on 2016-06-16 14:00:03 by root on 082108-allinone02 clouds: default: auth: auth_url: http://172.29.238.2:5000/v3 project_name: admin tenant_name: admin username: admin password: passwd user_domain_name: Default project_domain_name: Default region_name: RegionOne interface: internal identity_api_version: "3" Using this new authentication method drastically simplifies creating automation code to work on an OpenStack environment. Instead of passing a series of authentication parameters in line with the command, you can just pass a single parameter, --os-cloud=default. The Ansible OpenStack modules can also use this new authentication method. More details about os-client-config can be found at: http://docs.openstack.org/developer/os-client-config. Installing shade is required to use the Ansible OpenStack modules in version 2.0. Shade will be required to be installed directly on the deployment node and the Utility container (if you decide to use this option). If you encounter problems installing shade, try the command—pip install shade—isolated. Variables and facts Anyone who has ever attempted to create some sort of automation code, whether be via bash or Perl scripts, knows that being able to define variables is an essential component. Although Ansible does not compare with other programming languages mentioned, it does contain some core programming language features such as variable substitution. Variables To start, let's first define the meaning of variables and use in the event this is a new concept. Variable (computer science), a symbolic name associated with a value and whose associated value may be changed Using variable allows you to set a symbolic placeholder in your automation code that you can substitute values for on each execution. Ansible accommodates defining variables within your playbooks and roles in various ways. When dealing with OpenStack and/or cloud technologies in general being able to adjust your execution parameters on the fly is critical. We will step through a few ways how you can set variable placeholders in your playbooks, how to define variable values, and how you can register the result of a task as a variable. Setting variable placeholders In the event you wanted to set a variable placeholder within your playbooks, you would add the following syntax like this: - name: Copying my.cnf configuration file template: src=cust_my.cnf dest={{ CONFIG_LOC }} mode=0755 In the preceding example, the variable CONFIG_LOC was added in the place of the configuration file location (/etc/my.cnf) designated in the earlier example. When setting the placeholder, the variable name must be encased within {{ }} as shown in the example. Defining variable values Now that you have added the variable to your playbook, you must define the variable value. This can be done easily by passing command-line values as follows: $ ansible-playbook base.yml --extra-vars "CONFIG_LOC=/etc/my.cnf" Or you can define the values directly in your playbook, within each role or include them inside of global playbook variable files. Here are the examples of the three options. Define a variable value directly in your playbook by adding the vars section: --- # Sample simple playbooks structure/syntax - name: Install MySQL Playbook hosts: dbservers ... vars: CONFIG_LOC: /etc/my.cnf ... Define a variable value within each role by creating a variable file named main.yml within the vars/ directory of the role with the following contents: --- CONFIG_LOC: /etc/my.cnf To define the variable value inside of the global playbook, you would first create a host-specific variable file within the group_vars/ directory in the root of the playbook directory with the exact same contents as mentioned earlier. In this case, the variable file must be named to match the host or host group name defined within the hosts file. As in the earlier example, the host group name is dbservers; in turn, a file named dbservers would be created within the group_vars/ directory. Registering variables The situation at times arises when you want to capture the output of a task. Within the process of capturing the result you are in essence registering a dynamic variable. This type of variable is slightly different from the standard variables we have covered so far. Here is an example of registering the result of a task to a variable: - name: Check Keystone process shell: ps -ef | grep keystone register: keystone_check The registered variable value data structure can be stored in a few formats. It will always follow a base JSON format, but the value can be stored under different attributes. Personally, I have found it difficult at times to blindly determine the format, The tip given here will save you hours of troubleshooting. To review and have the data structure of a registered variable returned when running a playbook, you can use the debug module, such as adding this to the previous example: - debug: var=keystone_check. Facts When Ansible runs a playbook, one of the first things it does on your behalf is gather facts about the host before executing tasks or roles. The information gathered about the host will range from the base information such as operating system and IP addresses to the detailed information such as the hardware type/resources. The details capture on then stored into a variable named facts. You can find a complete list of available facts on the Ansible website at: http://docs.ansible.com/playbooks_variables.html#information-discovered-from-systems-facts. You have the option to disable the facts gather process by adding the following to your playbook: gather_facts: false. Facts about a host are captured by default unless the feature is disabled. A quick way of viewing all facts associated with a host, you can manually execute the following via a command line: $ ansible dbservers –m setup There is plenty more you can do with facts, and I would encourage you to take some time reviewing them in the Ansible documentation. Next, we will learn more about the base of our pyramid, the host inventory. Without an inventory of hosts to run the playbooks against, you would be creating the automation code for nothing. So to close out this artticle, we will dig deeper into how Ansible handles host inventory whether it be in a static and/or dynamic format. Defining the inventory The process of defining a collection of hosts to Ansible is named the inventory. A host can be defined using its fully qualified domain name (FQDN), local hostname, and/or its IP address. Since Ansible uses SSH to connect to the hosts, you can provide any alias for the host that the machine where Ansible is installed can understand. Ansible expects the inventory file to be in an INI-like format and named hosts. By default, the inventory file is usually located in the /etc/ansible directory and will look as follows: athena.example.com [ocean] aegaeon.example.com ceto.example.com [air] aeolus.example.com zeus.example.com apollo.example.com Personally I have found the default inventory file to be located in different places depending on the operating system Ansible is installed on. With that point, I prefer to use the –i command-line option when executing a playbook. This allows me to designate the specific hosts file location. A working example would look like this: ansible-playbook -i hosts base.yml. In the preceding example, there is a single host and a group of hosts defined. The hosts are grouped together into a group by defining a group name enclosed in [ ] inside the inventory file. Two groups are defined in the earlier-mentioned example—ocean and air. In the event where you do not have any hosts within your inventory file (such as in the case of running a playbook locally only), you can add the following entry to define localhost like this: [localhost] localhost ansible_connection=local The option exists to define variable for hosts and a group inside of your inventory file. More information on how to do this and additional inventory details can be found on the Ansible website at http://docs.ansible.com/intro_inventory.html. Dynamic inventory It seemed appropriate since we are automating functions on a cloud platform to review yet another great feature of Ansible, which is the ability to dynamically capture an inventory of hosts/instances. One of the primary principles of cloud is to be able to create instances on demand directly via an API, GUI, CLI, and/or through automation code, like Ansible. That basic principle will make relying on a static inventory file pretty much a useless choice. This is why, you will need to rely heavily on dynamic inventory. A dynamic inventory script can be created to pull information from your cloud at runtime and then, in turn, use that information for the playbooks execution. Ansible provides the functionality to detect if an inventory file is set as an executable and if so will execute the script to pull current time inventory data. Since creating an Ansible dynamic inventory script is considered more of an advanced activity, I am going to direct you to the Ansible website, (http://docs.ansible.com/intro_dynamic_inventory.html), as they have a few working examples of dynamic inventory scripts there. Fortunately, in our case, we will be reviewing an OpenStack cloud built using openstack-ansible (OSA) repository. OSA comes with a prebuilt dynamic inventory script that will work for your OpenStack cloud. That script is named dynamic_inventory.py and can be found within the playbooks/inventory directory located in the root OSA deployment folder. First, execute the dynamic inventory script manually to become familiar with the data structure and group names defined (this example assumes that you are in the root OSA deployment directory): $ cd playbooks/inventory $ ./dynamic_inventory.py This will print to the screen an output similar to this: ... }, "compute_all": { "hosts": [ "compute1_rsyslog_container-19482f86", "compute1", "compute2_rsyslog_container-dee00ea5", "compute2" ] }, "utility_container": { "hosts": [ "infra1_utility_container-c5589031" ] }, "nova_spice_console": { "hosts": [ "infra1_nova_spice_console_container-dd12200f" ], "children": [] }, ... Next, with this information, you now know that if you wanted to run a playbook against the utility container, all you would have to do is execute the playbook like this: $ ansible-playbook -i inventory/dynamic_inventory.py playbooks/base.yml –l utility_container Blocks & Strategies In this section, we will cover two new features added to version 2.0 of Ansible. Both features add additional functionality to how tasks are grouped or executed within a playbook. So far, they seem to be really nice features when creating more complex automation code. We will now briefly review each of the two new features. Blocks The Block feature can simply be explained as a way of logically grouping tasks together with the option of also applying customized error handling. It gives the option to group a set of tasks together establishing specific conditions and privileges. An example of applying the block functionality to an earlier example can be found here: --- # Sample simple playbooks structure/syntax - name: Install MySQL Playbook hosts: dbservers tasks: - block: - apt: name={{item}} state=present with_items: - libselinux-python - mysql - mysql-server - MySQL-python - template: src=cust_my.cnf dest=/etc/my.cnf mode=0755 - command: chdir=/usr/bin mysql_install_db - service: name=mysqld enabled=yes state=restarted - command: chdir=/usr/bin mysqladmin -u root password 'passwd' when: ansible_distribution == 'Ubuntu' remote_user: root become: true Additional details on how to implement Blocks and any associated error handling can be found at http://docs.ansible.com/ansible/playbooks_blocks.html. Strategies The strategy feature allows you to add control on how a play is executed by the hosts. Currently, the default behavior is described as being the linear strategy, where all hosts will execute each task before any host moves on to the next task. As of today, the two other strategy types that exist are free and debug. Since Strategies are implemented as a new type of plugin to Ansible more can be easily added by contributing code. Additional details on Strategies can be found at http://docs.ansible.com/ansible/playbooks_strategies.html. A simple example of implementing a strategy within a playbook is as follows: --- # Sample simple playbooks structure/syntax - name: Install MySQL Playbook hosts: dbservers strategy: free tasks: ... The new debug strategy is extremely helpful when you need to step through your playbook/role to find something like a missing variable, determine what variable value to supply or figure out why it may be sporadically failing. These are just a few of the possible use cases. Definitely I encourage you to give this feature a try. Here is the URL to more details on the playbook debugger: http://docs.ansible.com/ansible/playbooks_debugger.html. Cloud integrations Since cloud automation is the main and most important theme of this article, it only makes sense that we highlight the many different cloud integrations Ansible 2.0 offers right out of the box. Again, this was one of the reasons why I immediately fell in love with Ansible. Yes, the other automation tools also have hooks into many of the cloud providers, but I found at times they did not work or were not mature enough to leverage. Ansible has gone above and beyond to not fall into that trap. Not saying Ansible has all the bases covered, it does feel like most are and that is what matters most to me. If you have not checked out the cloud modules available for Ansible, take a moment now and take a look at http://docs.ansible.com/ansible/list_of_cloud_modules.html. From time to time check back as I am confident, you will be surprised to find more have been added. I am very proud of my Ansible family for keeping on top of these and making it much easier to write automation code against our clouds. Specific to OpenStack, a bunch of new modules have been added to the Ansible library as of version 2.0. The extensive list can be found at http://docs.ansible.com/ansible/list_of_cloud_modules.html#openstack. You will note that the biggest changes, from the first version of this book to this one, will be focused on using as many of the new OpenStack modules when possible. Summary Let's pause here on exploring the dynamic inventory script capabilities and continue to build upon it as we dissect the working examples. We will create our very first OpenStack administration playbook together. We will start off with a fairly simple task of creating users and tenants. This will also include reviewing a few automation considerations you will need to keep in mind when creating automation code for OpenStack. Ready? Ok, let's get started! Resources for Article: Further resources on this subject: AIO setup of OpenStack – preparing the infrastructure code environment [article] RDO Installation [article] Creating Multiple Users/Tenants [article]

0
0
15420

article-image-start-treating-your-infrastructure-code

Packt

26 Dec 2016

18 min read

Start Treating your Infrastructure as Code

Packt

26 Dec 2016

18 min read

0
0
14052

Packt

26 Dec 2016

20 min read

Why Bother? – Basic

Packt

26 Dec 2016

20 min read

0
0
1462

article-image-installing-your-first-application

Packt

23 Dec 2016

9 min read

Installing Your First Application

Packt

23 Dec 2016

9 min read

In this article by Greg Moss, author of the book Working with Odoo 9 - Second Edition, we will learn about the various applications that Odoo has to offer and how you can install Odoo on your own system. Before the release of Odoo 8, most users were focused on ERP and financial-related applications. Now, Odoo has added several important applications that allow companies to use Odoo in much greater scope than ever before. For example, the website builder can be installed to quickly launch a simple website for your business. A task that typically would have been accomplished with a content management system such as Wordpress. Despite all the increasing options available in Odoo, the overall process is the same. We begin by looking at the overall business requirements and decide on the first set of applications that we wish to implement. After understanding our basic objectives, we will create an Odoo database and configure the required company information. Next, we begin exploring the Odoo interface for creating and viewing information. We will see just how easy Odoo is to use by completing an entire sales order workflow. In this article we will cover following topics: Gathering requirements Creating a new database in Odoo Knowing the basic Odoo interface (For more resources related to this topic, see here.) Gathering requirements Setting up an Odoo system is no easy task. Many companies get into trouble believing that they can just install the software and throw in some data. Inevitably, the scope of the project grows and what was supposed to be a simple system ends up a confusing mess. Fortunately, Odoo's modular design will allow you to take a systematic approach to implementing Odoo for your business. Implementing Odoo using a modular approach The bare bones installation of Odoo simply provides you a limited messaging system. To manage your Odoo implementation, you must begin with the planning of the modules with which you will work first. Odoo allows you to install just what you need now and then install additional Odoo modules as you better define your requirements. It can be valuable to take this approach when you are considering how you will implement Odoo for your own business. Don't try and install all the modules and get everything running all at once. Instead, break down the implementation into smaller phases. Introducing Silkworm – our real-world case study To best understand how to work with Odoo, we will build our exercises around a real-world case study. Silkworm is a mid-sized screen printer that manufactures and sells t-shirts as well as a variety of printing projects. Using Odoo's modular design, we will begin by implementing the Sales Order module to set up the selling of basic products. In this specific case, we will be selling t-shirts. As we proceed through this book, we will continue to expand the system by installing additional modules. When implementing Odoo for your organization, you will also want to create a basic requirements document. This information is important for configuration of the company settings in Odoo, and should be considered essential documentation when implementing an ERP system. Creating a new database in Odoo If you have installed Odoo on your own server you will need to first create a database. As you add additional applications to Odoo, the necessary tables and fields will be added to the database you specify. Odoo Online: If you are using Odoo Online, you will not have access to create a new database and instead will use Odoo's one click application installer to manage your Odoo installation. Skip to XXX if you are using an online Odoo installation. If you have just installed a fresh copy of Odoo, you will be prompted automatically to create a new Odoo database. In the preceding screenshot, you can see the Odoo form to Create Database. Odoo provides basic instructions for creating your database. Let us quickly review the fields and how they are used. Selecting a database name When selecting a database name, choose a name that describes the system and that will make clear the purpose of the database. There are a few rules for creating an Odoo database: Your database name cannot contain spaces and must start with a number or letter Also you will need to avoid commas, periods, and quotes Underscores and hyphens are allowed if they are not the first character in the name It can also be a good idea to specify in the name if the database is for development, testing, or production purposes. For the purposes of our real-world case study, we will use the database name: SILKWORM-DEV We have chosen the -DEV suffix as we will consider this a development database that will not be used for production or even for testing. Take the time to consider what you will name your databases. It can be useful to have standard prefixes or suffixes depending on the purpose of your database. For example, you may use -PROD for your production database or -TEST for the database that you are using for testing. Loading demonstration data Notice the box reading Check this box to evaluate Odoo. If you mark this checkbox when you create a database, Odoo will preload your tables with a host of sample data for each module that is installed. This may include fake customers, suppliers, sales orders, invoices, inbox messages, stock moves, and products. The purpose of the demonstration data is to allow you to run modules through their paces without having to key in a ton of test data. For the purposes of our real-world case study in this book, do not load demonstration data. Specifying our default language Odoo offers a variety of language translation features with support for more than twenty languages. All of the examples in this book will use the English (US) language option. Be aware that depending on the language you select in Odoo, you may need to have that language also installed in your base operating system. Choosing a password Each Odoo database is created with an administrator account named admin. This is also known as the superuser account. The password you choose during the creation of the database will be the password for the admin account. Choose any password you wish and click on Create Database to create the SILKWORM-DEV database. Managing databases in Odoo The database management interface allows you to perform basic database management tasks such as backing up or restoring a database. Often with Odoo, it is possible to manage your databases without ever having to go directly into the Postgre database server. It is also possible to set up multiple databases under the same installation of Odoo. For instance, you may want in the future to install another database that does load demonstration data and may be used to install modules simply for testing purposes. If you have trouble getting to the interface to manage databases you can access the database management interface directly by going to the /web/database/manager path. Installing the Sales Management module After clicking on Create Database, it can take a little time depending on your system before you are shown a page that lists the available applications. This screen lets you select from a list of the most common Odoo modules to install. There is very little you can do with just an Odoo database with no modules installed. Now we will install the Sales Management module so we can begin setting up our business selling t-shirts. Click on the Install button to install the Sales Management module. During installation of modules and other long operations, you will often see a Loading icon at the top center of your screen. Unlike previous versions of Odoo that prompted for accounting and other setup information, Odoo now completes the installation unattended. Knowing the basic Odoo interface After the installation of the sales order application, Odoo takes you directly to the Sales dashboard. As we have just installed the application, there is very little to see in the dashboard, but we can see the available menu options along the left edge of the interface. The menus along the top allow you to change between the major applications and settings within Odoo, while the menus down the left side outline all your available choices. In the following screenshot, we are in the main Sales menu. Let's look at one of the main master files that we will be using in many Odoo applications, the Customers. Click the Customers menu on the left. Let's take a moment to look at the screen elements that will appear consistently throughout Odoo. In the top left of the main form, you can clearly see that we are in the Customers section. Using the search box In the top right corner of our form we have a search box: The search box allows you to quickly search for records in the Odoo application. If you are in the Customers section, naturally the search will be looking for customer records. Likewise, if you are looking at the product view, the search box will allow you to search the product records that you have entered into the system. Picking different views Odoo also offers a standard interface to switch between a list view, form view, or other views such as Kanban or graph views. You can see the icon selections under the search box in the right corner of the form: The currently selected view is highlighted in dark. If you move the mouse over the icon you will get a tool-tip that shows you the description of the view. As we have no records in our system currently, let us add a record so we can further explore the Odoo interface. Summary In this article, we have started by creating an Odoo database. We then installed the Sales Order Management module and learned our basic Odoo interface. Resources for Article: Further resources on this subject: Web Server Development [article] Getting Started with Odoo Development [article] Introduction to Odoo [article]

0
0
1577

How-To Tutorials

Dylan Frankland

23 Dec 2016

12 min read

Web Scraping with Electron

Dylan Frankland

23 Dec 2016

12 min read

Web scraping is a necessary evil that takes unordered information from the Web and transforms it into something that can be used programmatically. The first big use of this was indexing websites for use in search engines like Google. Whether you like it or not, there will likely be a time in your life, career, academia, or personal projects where you will need to pull information that has no API or easy way to digest programatically. The simplest way to accomplish this is to do something along the lines of a curl request and then RegEx for the necessary information. That has been what the standard procedure for much of the history of web scraping until the single-page application came along. With the advent of Ajax, JavaScript became the mainstay of the Web and prevented much of it from being scraped with traditional methods such as curl that could only get static server rendered content. This is where Electron truly comes into play because it is a full-fledged browser but can be run programmatically. Electron's Advantages Electron goes beyond the abilities of other programmatic browser solutions: PhantomJS, SlimerJS, or Selenium. This is because Electron is more than just a headless browser or automation framework. PhantomJS (and by extension SlimerJS), run headlessly only preventing most debugging and because they have their own JavaScript engines, they are not 100% compatible with node and npm. Selenium, used for automating other browsers, can be a bit of a pain to set up because it requires setting up a central server and separately installing a browser you would like to scrape with. On the other hand, Electron can do all the above and more which makes it much easier to setup, run, and debug as you start to create your scraper. Applications of Electron Considering the flexibility of Electron and all the features it provides, it is poised to be used in a variety of development, testing, and data collection situations. As far as development goes, Electron provides all the features that Chrome does with its DevTools, making it easy to inspect, run arbitrary JavaScript, and put debugger statements. Testing can easily be done on websites that may have issues with bugs in browsers, but work perfectly in a different environment. These tests can easily be hooked up to a continuous integration server too. Data collection, the real reason we are all here, can be done on any type of website, including those with logins, using a familiar API with DevTools for inspection, and optionally run headlessly for automation. What more could one want? How to Scrape with Electron Because of Electron's ability to integrate with Node, there are many different npm modules at your disposal. This takes doing most of the leg work of automating the web scraping process and makes development a breeze. My personal favorite, nightmare, has never let me down. I have used it for tons of different projects from collecting info and automating OAuth processes to grabbing console errors, and also taking screenshots for visual inspection of entire websites and services. Packt's blog actually gives a perfect situation to use a web scraper. On the blog page, you will find a small single-page application that dynamically loads different blog posts using JavaScript. Using Electron, let's collect the latest blog posts so that we can use that information elsewhere. Full disclosure, Packt's Blog server-side renders the first page and could possibly be scraped using traditional methods. This won't be the case for all single-page applications. Prerequisites and Assumptions Following this guide, I assume that you already have Node (version 6+) and npm (version 3+) installed and that you are using some form of a Unix shell. I also assume that you are already inside a directory to do some coding in. Setup First of all, you will need to install some npm modules, so create a package.json like this: { "scripts": { "start": "DEBUG=nightmare babel-node ./main.js" }, "devDependencies": { "babel-cli": "6.18.0", "babel-preset-modern-node": "3.2.0", "babel-preset-stage-0": "6.16.0" }, "dependencies": { "nightmare": "2.8.1" }, "babel": { "presets": [ [ "modern-node", { "version": "6.0" } ], "stage-0" ] } } Using babel we can make use of some of the newer JavaScript utilities like async/await, making our code much more readable. nightmare has electron as a dependency, so there is no need to list that in our package.json (so easy). After this file is made, run npm install as usual. Investgating the Content to be Scraped First, we will need to identify the information we want to collect and how to get it: We go to here and we can see a page of different blog posts, but we are not sure of the order in which they are. We only want the latest and greatest, so let's change the sort order to "Date of Publication (New to Older)". Now we can see that we have only the blog posts that we want. Changing the sort order Next, to make it easier to collect more blog posts, change the number of blog posts shown from 12 to 48. Changing post view Last, but not least, are the blog posts themselves. Packt blog posts Next, we will figure the selectors that we will be using with nightmare to change the page and collect data: Inspecting the sort order selector, we find something like the following HTML: <div class="solr-pager-sort-selector"> <span>Sort By:  </span> <select class="solr-page-sort-selector-select selectBox" style="display: none;"> <option value="" selected="">Relevance</option> <option value="ds_blog_release_date asc">Date of Publication (Older - Newer)</option> <option value="ds_blog_release_date desc">Date of Publication (Newer - Older)</option> <option value="sort_title asc">Title (A - Z)</option> <option value="sort_title desc">Title (Z - A)</option> </select> <a class="selectBox solr-page-sort-selector-select selectBox-dropdown" title="" tabindex="0" style="width: 243px; display: inline-block;"> <span class="selectBox-label" style="width: 220.2px;">Relevance</span> <span class="selectBox-arrow">▼</span> </a> </div> Checking out the "View" options, we see the following HTML: <div class="solr-pager-rows-selector"> <span>View:  </span> <strong>12</strong>   <a class="solr-page-rows-selector-option" data-rows="24">24</a>   <a class="solr-page-rows-selector-option" data-rows="48">48</a>   </div> Finally, the info we need from the post should look similar to this: <div class="packt-article-line-view float-left"> <div class="packt-article-line-view-image"> <a href="/books/content/mapt-v030-release-notes"> <noscript> <img src="//d1ldz4te4covpm.cloudfront.net/sites/default/files/imagecache/ppv4_article_thumb/Mapt FB PPC3_0.png" alt="" title="" class="bookimage imagecache imagecache-ppv4_article_thumb"/> </noscript> <img src="//d1ldz4te4covpm.cloudfront.net/sites/default/files/imagecache/ppv4_article_thumb/Mapt FB PPC3_0.png" alt="" title="" data-original="//d1ldz4te4covpm.cloudfront.net/sites/default/files/imagecache/ppv4_article_thumb/Mapt FB PPC3_0.png" class="bookimage imagecache imagecache-ppv4_article_thumb" style="opacity: 1;"> <div class="packt-article-line-view-bg"></div> <div class="packt-article-line-view-title">Mapt v.0.3.0 Release Notes</div> </a> </div> <a href="/books/content/mapt-v030-release-notes"> <div class="packt-article-line-view-text ellipsis"> <div> <p>New features and fixes for the Mapt platform</p> </div> </div> </a> </div> Great! Now we have all the information necessary to start collecting data from the page! Creating the Scraper First things first, create a main.js file in the same directory as the package.json. Now let's put some initial code to get the ball rolling on the first step: import Nightmare from 'nightmare'; const URL_BLOG = 'https://www.packtpub.com/books/content/blogs'; const SELECTOR_SORT = '.selectBox.solr-page-sort-selector-select.selectBox-dropdown'; // Viewport must have a width at least 1040 for the desktop version of Packt's blog const nightmare = new Nightmare({ show: true }).viewport(1041, 800); (async() => { await nightmare .goto(URL_BLOG) .wait(SELECTOR_SORT) // Always good to wait before performing an action on an element .click(SELECTOR_SORT); })(); If you try to run this code, you will notice a window pop up with Packt's blog, then resize the window, and then nothing. This is because the sort selector only listens for mousedown events, so we will need to modify our code a little, by changing click to mousedown: await nightmare .goto(BLOG_URL) .wait(SELECTOR_SORT) // Always good to wait before performing an action on an element .mousedown(SELECTOR_SORT); Running the code again, after the mousedown, we get a small HTML dropdown, which we can inspect and see the following: <ul class="selectBox-dropdown-menu selectBox-options solr-page-sort-selector-select-selectBox-dropdown-menu selectBox-options-bottom" style="width: 274px; top: 354.109px; left: 746.188px; display: block;"> <li class="selectBox-selected"><a rel="">Relevance</a></li> <li class=""><a rel="ds_blog_release_date asc">Date of Publication (Older - Newer)</a></li> <li class=""><a rel="ds_blog_release_date desc">Date of Publication (Newer - Older)</a></li> <li class=""><a rel="sort_title asc">Title (A - Z)</a></li> <li class=""><a rel="sort_title desc">Title (Z - A)</a></li> </ul> So, to select the option to sort the blog posts by date, we will need to alter our code a little further (again, it does not work with click, so use mousedown plus mouseup): import Nightmare from 'nightmare'; const URL_BLOG = 'https://www.packtpub.com/books/content/blogs'; const SELECTOR_SORT = '.selectBox.solr-page-sort-selector-select.selectBox-dropdown'; const SELECTOR_SORT_OPTION = '.solr-page-sort-selector-select-selectBox-dropdown-menu > li > a[rel="ds_blog_release_date desc"]'; // Viewport must have a width at least 1040 for the desktop version of Packt's blog const nightmare = new Nightmare({ show: true }).viewport(1041, 800); (async() => { await nightmare .goto(URL_BLOG) .wait(SELECTOR_SORT) // Always good to wait before performing an action on an element .mousedown(SELECTOR_SORT) .wait(SELECTOR_SORT_OPTION) .mousedown(SELECTOR_SORT_OPTION) // Drop down menu doesn't listen for `click` events like normal... .mouseup(SELECTOR_SORT_OPTION); })(); Awesome! We can clearly see the page reload, with posts from newest to oldest. After we have sorted everything, we need to change the number of blog posts shown (finally we can use click; lol!): import Nightmare from 'nightmare'; const URL_BLOG = 'https://www.packtpub.com/books/content/blogs'; const SELECTOR_SORT = '.selectBox.solr-page-sort-selector-select.selectBox-dropdown'; const SELECTOR_SORT_OPTION = '.solr-page-sort-selector-select-selectBox-dropdown-menu > li > a[rel="ds_blog_release_date desc"]'; const SELECTOR_VIEW = '.solr-page-rows-selector-option[data-rows="48"]'; // Viewport must have a width at least 1040 for the desktop version of Packt's blog const nightmare = new Nightmare({ show: true }).viewport(1041, 800); (async() => { await nightmare .goto(URL_BLOG) .wait(SELECTOR_SORT) // Always good to wait before performing an action on an element .mousedown(SELECTOR_SORT) .wait(SELECTOR_SORT_OPTION) .mousedown(SELECTOR_SORT_OPTION) // Drop down menu doesn't listen for `click` events like normal... .mouseup(SELECTOR_SORT_OPTION) .wait(SELECTOR_VIEW) .click(SELECTOR_VIEW); })(); Collecting the data is next; all we need to do is evaluate parts of the HTML and return the formatted information. The way the evaluate function works is by stringifying the function provided to it and injecting it into Electron's event loop. Because the function gets stringified, only a function that does not rely on outside variables or functions will execute properly. The value that is returned by this function must also be able to be stringified, so only JSON, strings, numbers, and booleans are applicable. import Nightmare from 'nightmare'; const URL_BLOG = 'https://www.packtpub.com/books/content/blogs'; const SELECTOR_SORT = '.selectBox.solr-page-sort-selector-select.selectBox-dropdown'; const SELECTOR_SORT_OPTION = '.solr-page-sort-selector-select-selectBox-dropdown-menu > li > a[rel="ds_blog_release_date desc"]'; const SELECTOR_VIEW = '.solr-page-rows-selector-option[data-rows="48"]'; // Viewport must have a width at least 1040 for the desktop version of Packt's blog const nightmare = new Nightmare({ show: true }).viewport(1041, 800); (async() => { await nightmare .goto(URL_BLOG) .wait(SELECTOR_SORT) // Always good to wait before performing an action on an element .mousedown(SELECTOR_SORT) .wait(SELECTOR_SORT_OPTION) .mousedown(SELECTOR_SORT_OPTION) // Drop down menu doesn't listen for `click` events like normal... .mouseup(SELECTOR_SORT_OPTION) .wait(SELECTOR_VIEW) .click(SELECTOR_VIEW) .evaluate( () => { let posts = []; document.querySelectorAll('.packt-article-line-view').forEach( post => { posts = posts.concat({ image: post.querySelector('img').src, title: post.querySelector('.packt-article-line-view-title').textContent.trim(), link: post.querySelector('a').href, description: post.querySelector('p').textContent.trim(), }); } ); return posts; } ) .end() .then( result => console.log(result) ); })(); Once you run your function again, you should get a very long array of objects containing information about each post. There you have it! You have successfully scraped a page using Electron! There’s a bunch more of possibilities and tricks with this, but hopefully this small example gives you a good idea of what can be done. About the author Dylan Frankland is a frontend engineer at Narvar. He is an agile web developer, with over 4 years of experience developing and designing for start-ups and medium-sized businesses to create a functional, fast, and beautiful web experiences.

0
0
31612

How-To Tutorials

article-image-implementing-rethinkdb-query-language

Packt

23 Dec 2016

5 min read

Implementing RethinkDB Query Language

Packt

23 Dec 2016

5 min read

0
0
1982

article-image-object-oriented-javascript

Packt

21 Dec 2016

9 min read

Object-Oriented JavaScript

Packt

21 Dec 2016

9 min read

In this article by Ved Antani, author of the book Object Oriented JavaScript - Third Edition, we will learn that you need to know about object-oriented JavaScript. In this article, we will cover the following topics: ECMAScript 5 ECMAScript 6D Object-oriented programming (For more resources related to this topic, see here.) ECMAScript 5 One of the most important milestone in ECMAScript revisions was ECMAScript5 (ES5), officially accepted in December 2009. ECMAScript5 standard is implemented and supported on all major browsers and server-side technologies. ES5 was a major revision because apart from several important syntactic changes and additions to the standard libraries, ES5 also introduced several new constructs in the language. For instance, ES5 introduced some new objects and properties, and also the so-called strict mode. Strict mode is a subset of the language that excludes deprecated features. The strict mode is opt-in and not required, meaning that if you want your code to run in the strict mode, you will declare your intention using (once per function, or once for the whole program) the following string: "use strict"; This is just a JavaScript string, and it's ok to have strings floating around unassigned to any variable. As a result, older browsers that don't speak ES5 will simply ignore it, so this strict mode is backwards compatible and won't break older browsers. For backwards compatibility, all the examples in this book work in ES3, but at the same time, all the code in the book is written so that it will run without warnings in ES5's strict mode. Strict mode in ES6 While strict mode is optional in ES5, all ES6 modules and classes are strict by default. As you will see soon, most of the code we write in ES6 resides in a module; hence, strict mode is enforced by default. However, it is important to understand that all other constructs do not have implicit strict mode enforced. There were efforts to make newer constructs, such as arrow and generator functions, to also enforce strict mode, but it was later decided that doing so will result in very fragment language rules and code. ECMAScript 6 ECMAScript6 revision took a long time to finish and was finally accepted on 17th June, 2015. ES6 features are slowly becoming part of major browsers and server technologies. It is possible to use transpilers to compile ES6 to ES5 and use the code on environments that do not yet support ES6 completely. ES6 substantially upgrades JavaScript as a language and brings in very exciting syntactical changes and language constructs. Broadly, there are two kinds of fundamental changes in this revision of ECMAScript, which are as follows: Improved syntax for existing features and editions to the standard library; for example, classes and promises New language features; for example, generators ES6 allows you to think differently about your code. New syntax changes can let you write code that is cleaner, easier to maintain, and does not require special tricks. The language itself now supports several constructs that required third-party modules earlier. Language changes introduced in ES6 needs serious rethink in the way we have been coding in JavaScript. A note on the nomenclature—ECMAScript6, ES6, and ECMAScript 2015 are the same, but used interchangeably. Browser support for ES6 Majority of the browsers and server frameworks are on their way toward implementing ES6 features. You can check out the what is supported and what is not by clicking: http://kangax.github.io/compat-table/es6/ Though ES6 is not fully supported on all the browsers and server frameworks, we can start using almost all features of ES6 with the help of transpilers. Transpilers are source-to-source compilers. ES6 transpilers allow you to write code in ES6 syntax and compile/transform them into equivalent ES5 syntax, which can then be run on browsers that do not support the entire range of ES6 features.The defacto ES6 transpiler at the moment is Babel Object-oriented programming Let's take a moment to review what people mean when they say object-oriented, and what the main features of this programming style are. Here's a list of some concepts that are most often used when talking about object-oriented programming (OOP): Object, method, and property Class Encapsulation Inheritance Polymorphism Let's take a closer look into each one of these concepts. If you're new to the object-oriented programming lingo, these concepts might sound too theoretical, and you might have trouble grasping or remembering them from one reading. Don't worry, it does take a few tries, and the subject can be a little dry at a conceptual level. Objects As the name object-oriented suggests, objects are important. An object is a representation of a thing (someone or something), and this representation is expressed with the help of a programming language. The thing can be anything—a real-life object, or a more convoluted concept. Taking a common object, a cat, for example, you can see that it has certain characteristics—color, name, weight, and so on—and can perform some actions—meow, sleep, hide, escape, and so on. The characteristics of the object are called properties in OOP-speak, and the actions are called methods. Classes In real life, similar objects can be grouped based on some criteria. A hummingbird and an eagle are both birds, so they can be classified as belonging to some made up Birds class. In OOP, a class is a blueprint or a recipe for an object. Another name for object is instance, so we can say that the eagle is one concrete instance of the general Birds class. You can create different objects using the same class because a class is just a template, while the objects are concrete instances based on the template. There's a difference between JavaScript and the classic OO languages such as C++ and Java. You should be aware right from the start that in JavaScript, there are no classes; everything is based on objects. JavaScript has the notion of prototypes, which are also objects. In a classic OO language, you'd say something like—create a new object for me called Bob, which is of class Person. In a prototypal OO language, you'd say—I'm going to take this object called Bob's dad that I have lying around (on the couch in front of the TV?) and reuse it as a prototype for a new object that I'll call Bob. Encapsulation Encapsulation is another OOP related concept, which illustrates the fact that an object contains (encapsulates) the following: Data (stored in properties) The means to do something with the data (using methods) One other term that goes together with encapsulation is information hiding. This is a rather broad term and can mean different things, but let's see what people usually mean when they use it in the context of OOP. Imagine an object, say, an MP3 player. You, as the user of the object, are given some interface to work with, such as buttons, display, and so on. You use the interface in order to get the object to do something useful for you, like play a song. How exactly the device is working on the inside, you don't know, and, most often, don't care. In other words, the implementation of the interface is hidden from you. The same thing happens in OOP when your code uses an object by calling its methods. It doesn't matter if you coded the object yourself or it came from some third-party library; your code doesn't need to know how the methods work internally. In compiled languages, you can't actually read the code that makes an object work. In JavaScript, because it's an interpreted language, you can see the source code, but the concept is still the same—you work with the object's interface without worrying about its implementation. Another aspect of information hiding is the visibility of methods and properties. In some languages, objects can have public, private, and protected methods and properties. This categorization defines the level of access the users of the object have. For example, only the methods of the same object have access to the private methods, while anyone has access to the public ones. In JavaScript, all methods and properties are public, but we'll see that there are ways to protect the data inside an object and achieve privacy. Inheritance Inheritance is an elegant way to reuse existing code. For example, you can have a generic object, Person, which has properties such as name and date_of_birth, and which also implements the walk, talk, sleep, and eat functionality. Then, you figure out that you need another object called Programmer. You can reimplement all the methods and properties that a Person object has, but it will be smarter to just say that the Programmer object inherits a Person object, and save yourself some work. The Programmer object only needs to implement more specific functionality, such as the writeCode method, while reusing all of the Person object's functionality. In classical OOP, classes inherit from other classes, but in JavaScript, as there are no classes, objects inherit from other objects. When an object inherits from another object, it usually adds new methods to the inherited ones, thus extending the old object. Often, the following phrases can be used interchangeably—B inherits from A and B extends A. Also, the object that inherits can pick one or more methods and redefine them, customizing them for its own needs. This way, the interface stays the same and the method name is the same, but when called on the new object, the method behaves differently. This way of redefining how an inherited method works is known as overriding. Polymorphism In the preceding example, a Programmer object inherited all of the methods of the parent Person object. This means that both objects provide a talk method, among others. Now imagine that somewhere in your code, there's a variable called Bob, and it just so happens that you don't know if Bob is a Person object or a Programmer object. You can still call the talk method on the Bob object and the code will work. This ability to call the same method on different objects, and have each of them respond in their own way, is called polymorphism. Summary In this article, you learned about how JavaScript came to be and where it is today. You were also introduced to ECMAScript 5 and ECMAScript 6, We also discussed about some of the object-oriented programming concepts. Resources for Article: Further resources on this subject: Diving into OOP Principles [article] Prototyping JavaScript [article] Developing Wiki Seek Widget Using Javascript [article]

0
0
12147

How-To Tutorials

Packt

21 Dec 2016

9 min read

Say Hi to Tableau

Packt

21 Dec 2016

9 min read

0
0
3225

article-image-finishing-attack-report-and-withdraw

Packt

21 Dec 2016

11 min read

Finishing the Attack: Report and Withdraw

Packt

21 Dec 2016

11 min read

In this article by Michael McPhee and Jason Beltrame, the author of the book Penetration Testing with Raspberry Pi - Second Edition we will look at the final stage of the Penetration Testing Kill Chain, which is Reporting and Withdrawing. Some may argue the validity and importance of this step, since much of the hard-hitting effort and impact. But, without properly cleaning up and covering our tracks, we can leave little breadcrumbs with can notify others to where we have been and also what we have done. This article covers the following topics: Covering our tracks Masking our network footprint (For more resources related to this topic, see here.) Covering our tracks One of the key tasks in which penetration testers as well as criminals tend to fail is cleaning up after they breach a system. Forensic evidence can be anything from the digital network footprint (the IP address, type of network traffic seen on the wire, and so on) to the logs on a compromised endpoint. There is also evidence on the used tools, such as those used when using a Raspberry Pi to do something malicious. An example is running more ~/.bash_history on a Raspberry Pi to see the entire history of the commands that were used. The good news for Raspberry Pi hackers is that they don't have to worry about storage elements such as ROM since the only storage to consider is the microSD card. This means attackers just need to re-flash the microSD card to erase evidence that the Raspberry Pi was used. Before doing that, let's work our way through the clean up process starting from the compromised system to the last step of reimaging our Raspberry Pi. Wiping logs The first step we should perform to cover our tracks is clean any event logs from the compromised system that we accessed. For Windows systems, we can use a tool within Metasploit called Clearev that does this for us in an automated fashion. Clearev is designed to access a Windows system and wipe the logs. An overzealous administrator might notice the changes when we clean the logs. However, most administrators will never notice the changes. Also, since the logs are wiped, the worst that could happen is that an administrator might identify that their systems have been breached, but the logs containing our access information would have been removed. Clearev comes with the Metasploit arsenal. To use clearev once we have breached a Windows system with a Meterpreter, type meterpreter > clearev. There is no further configuration, which means clearev just wipes the logs upon execution. The following screenshot shows what that will look like: Here is an example of the logs before they are wiped on a Windows system: Another way to wipe off logs from a compromised Windows system is by installing a Windows log cleaning program. There are many options available to download, such as ClearLogs found at http://ntsecurity.nu/toolbox/clearlogs/. Programs such as these are simple to use, we can just install and run it on a target once we are finished with our penetration test. We can also just delete the logs manually using the C: del %WINDR%* .log /a/s/q/f command. This command directs all logs using /a including subfolders /s, disables any queries so we don't get prompted, and /f forces this action. Whichever program you use, make sure to delete the executable file once the log files are removed so that the file isn't identified during a future forensic investigation. For Linux systems, we need to get access to the /var/log folder to find the log files. Once we have access to the log files, we can simply open them and remove all entries. The following screenshot shows an example of our Raspberry Pi's log folder: We can just delete the files using the remove command, rm, such as rm FILE.txt or delete the entire folder; however, this wouldn't be as stealthy as wiping existing files clean of your footprint. Another option is in Bash. We can simply type > /path/to/file to empty the contents of a file, without removing it necessarily. This approach has some stealth benefits. Kali Linux does not have a GUI-based text editor, so one easy-to-use tool that we can install is gedit. We'll use apt-get install gedit to download it. Once installed, we can find gedit under the application dropdown or just type gedit in the terminal window. As we can see from the following screenshot, it looks like many common text file editors. Let's click on File and select files from the /var/log folder to modify them. We also need to erase the command history since the Bash shell saves the last 500 commands. This forensic evidence can be accessed by typing the more ~/.bash_history command. The following screenshot shows the first of the hundreds of commands we recently ran on my Raspberry Pi: To verify the number of stored commands in the history file, we can type the echo $HISTSIZE command. To erase this history, let's type export HISTSIZE=0. From this point, the shell will not store any command history, that is, if we press the up arrow key, it will not show the last command. These commands can also be placed in a .bashrc file on Linux hosts. The following screenshot shows that we have verified if our last 500 commands are stored. It also shows what happens after we erase them: It is a best practice to set this command prior to using any commands on a compromised system, so that nothing is stored upfront. You could log out and log back in once the export HISTSIZE=0 command is set to clear your history as well. You should also do this on your C&C server once you conclude your penetration test if you have any concerns of being investigated. A more aggressive and quicker way to remove our history file on a Linux system is to shred it with the shred –zu /root/.bash_history command. This command overwrites the history file with zeros and then deletes the log files. We can verify this using the less /root/.bash_history command to see if there is anything left in your history file, as shown in the following screenshot: Masking our network footprint Anonymity is a key ingredient when performing our attacks, unless we don't mind someone being able to trace us back to our location and giving up our position. Because of this, we need a way to hide or mask where we are coming from. This approach is perfect for a proxy or groups of proxies if we really want to make sure we don't leave a trail of breadcrumbs. When using a proxy, the source of an attack will look as though it is coming from the proxy instead of the real source. Layering multiple proxies can help provide an onion effect, in which each layer hides the other, and makes it very difficult to determine the real source during any forensic investigation. Proxies come in various types and flavors. There are websites devoted for hiding our source online, and with a quick Google search, we can see some of the most popular like hide.me, Hidestar, NewIPNow, ProxySite and even AnonyMouse. Following is a screenshot from NewIPNow website. Administrators of proxies can see all traffic as well as identify both the target and the victims that communicate through their proxy. It is highly recommended that you research any proxy prior to using it as some might use information captured without your permission. This includes providing forensic evidence to authorities or selling your sensitive information. Using ProxyChains Now, if web based proxies are not what we are looking for, we can use our Raspberry Pi as a proxy server utilizing the ProxyChains application. ProxyChains is very easy application to setup and start using. First, we need to install the application. This can be accomplished this by running the following command from the CLI: root@kali:~# apt-get install proxychains Once installed, we just need to edit the ProxyChains configuration located at /etc/proxychains.conf, and put in the proxy servers we would like to use: There are lots of options out there for finding public proxies. We should certainly use with some caution, as some proxies will use our data without our permission, so we'll be sure to do our research prior to using one. Once we have one picked out and have updated our proxychains.conf file, we can test it out. To use ProxyChains, we just need to follow the following syntax: proxychains <command you want tunneled and proxied> <opt args> Based on that syntax, to run a nmap scan, we would use the following command: root@kali:~# proxychains nmap 192.168.245.0/24 ProxyChains-3.1 (http://proxychains.sf.net) Starting Nmap 7.25BETA1 ( https://nmap.org ) Clearing the data off the Raspberry Pi Now that we have covered our tracks on the network side, as well as on the endpoint, all we have left is any of the equipment that we have left behind. This includes our Raspberry Pi. To reset our Raspberry Pi back to factory defaults, we can refer back to installing Kali Linux. For re-installing Kali or the NOOBS software. This will allow us to have clean image running once again. If we had cloned your golden image we could just re-image our Raspberry Pi with that image. If we don't have the option to re-image or reinstall your Raspberry Pi, we do have the option to just destroy the hardware. The most important piece to destroy would be the microSD card (see image following), as it contains everything that we have done on the Pi. But, we may want to consider destroying any of the interfaces that you may have used (USB WiFi, Ethernet or Bluetooth adapters), as any of those physical MAC addresses may have been recorded on the target network, and therefore could prove that device was there. If we had used our onboard interfaces, we may even need to destroy the Raspberry Pi itself. If the Raspberry Pi is in a location that we cannot get to reclaim it or to destroy it, our only option is to remotely corrupt it so that we can remove any clues of our attack on the target. To do this, we can use the rm command within Kali. The rm command is used to remove files and such from the operating systems. As a cool bonus, rm has some interesting flags that we can use to our advantage. These flags include the –r and the –f flag. The –r flag indicates to perform the operation recursively, so everything in that directory and preceding will be removed while the –f flag is to force the deletion without asking. So running the command rm –fr * from any directory will remove all contents within that directory and anything preceding that. Where this command gets interesting is if we run it from / a.k.a. the top of the directory structure. Since the command will remove everything in that directory and preceding, running it from the top level will remove all files and hence render that box unusable. As any data forensics person will tell us, that data is still there, just not being used by the operation system. So, we really need to overwrite that data. We can do this by using the dd command. We used dd back when we were setting up the Raspberry Pi. We could simply use the following to get the job done: dd if=/dev/urandom of=/dev/sda1 (where sda1 is your microSD card) In this command we are basically writing random characters to the microSD card. Alternatively, we could always just reformat the whole microSD card using the mkfs.ext4 command: mkfs.ext4 /dev/sda1 ( where sda1 is your microSD card ) That is all helpful, but what happens if we don't want to destroy the device until we absolutely need to – as if we want the ability to send over a remote destroy signal? Kali Linux now includes a LUKS Nuke patch with its install. LUKS allows for a unified key to get into the container, and when combined with Logical Volume Manager (LVM), can created an encrypted container that needs a password in order to start the boot process. With the Nuke option, if we specify the Nuke password on boot up instead of the normal passphrase, all the keys on the system are deleted and therefore rendering the data inaccessible. Here are some great links to how and do this, as well as some more details on how it works: https://www.kali.org/tutorials/nuke-kali-linux-luks/ http://www.zdnet.com/article/developers-mull-adding-data-nuke-to-kali-linux/ Summary In this article we sawreports themselves are what our customer sees as our product. It should come as no surprise that we should then take great care to ensure they are well organized, informative, accurate and most importantly, meet the customer's objectives. Resources for Article: Further resources on this subject: Penetration Testing [article] Wireless and Mobile Hacks [article] Building Your Application [article]

0
0
22518

How-To Tutorials

article-image-introducing-powershell-remoting

Packt

21 Dec 2016

9 min read

Introducing PowerShell Remoting

Packt

21 Dec 2016

9 min read

0
0
71804

Building Images

The prototype pattern

A Smarter Way of Working with WPF

Planning for Failure (and Success)

Orchestration with Docker Swarm

Introduction to Ansible

Start Treating your Infrastructure as Code

Why Bother? – Basic

Installing Your First Application

Web Scraping with Electron

Trending Topics

Implementing RethinkDB Query Language

Object-Oriented JavaScript

Say Hi to Tableau

Finishing the Attack: Report and Withdraw

Introducing PowerShell Remoting

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access