
How-To Tutorials

7009 Articles

How to do Machine Learning with Python

Packt
12 Aug 2015
5 min read
In this article, Sunila Gollapudi, author of Practical Machine Learning, introduces the key aspects of machine learning semantics and the various toolkit options in Python. Machine learning has been around for many years, and all of us, at some point, have been consumers of machine learning technology. One of the most common examples is facial recognition software, which can identify whether a digital photograph includes a particular person. Today, Facebook users see automatic suggestions to tag their friends in uploaded photos, and some cameras and software such as iPhoto have this capability too.

What is learning?

Let's spend some time understanding what the "learning" in machine learning means: learning from some kind of observation or data in order to automatically carry out further actions. An intelligent system cannot be built without using learning to get there. The following are some questions you'll need to answer to define your learning problem:

- What do you want to learn?
- What is the required data and where does it come from?
- Is the complete data available in one shot?
- What is the goal of learning, or why should there be learning at all?

Before we plunge into the internals of each learning type, let's quickly walk through a simple predictive analytics process for building and validating models that solve a problem with maximum accuracy:

1. Check that the raw dataset is validated or cleansed and is broken into training, testing, and evaluation datasets.
2. Pick a model that best suits the problem and has an error function that will be minimized over the training set.
3. Make sure this model works on the testing set.
4. Iterate this process with other machine learning algorithms and/or attributes until there is reasonable performance on the test set.

The resulting model can then be applied to new inputs to predict the output. The following diagram depicts how learning can be applied to predict behavior:

Key aspects of machine learning semantics

The following concept map shows the key aspects of machine learning semantics:

Python

Python is one of the most widely adopted programming or scripting languages in the field of machine learning and data science. Python is known for its ease of learning, implementation, and maintenance. It is highly portable and can run on Unix, Windows, and Mac platforms. With the availability of libraries such as Pydoop and SciPy, its relevance in the world of big data analytics has increased tremendously. Some of the key reasons for the popularity of Python in solving machine learning problems are as follows:

- Python is well suited for data analysis
- It is a versatile scripting language that can be used to write basic, quick-and-dirty scripts to test some functionality, or to build real-time applications using its full-featured toolkits
- Python comes with mature machine learning packages that can be used in a plug-and-play manner

Toolkit options in Python

Before we go deeper into the toolkit options available in Python, let's first understand the trade-offs to consider before choosing one:

- What are my performance priorities? Do I need offline or real-time processing implementations?
- How transparent are the toolkits? Can I customize the library myself?
- What is the community status? How fast are bugs fixed, and how good are the community support and expert communication availability?

There are three options in Python:
- Python external bindings: These are interfaces to popular packages in the market such as Matlab, R, Octave, and so on. This option works well if you already have existing implementations in those frameworks.
- Python-based toolkits: There are a number of toolkits written in Python that come with a bunch of algorithms.
- Write your own logic/toolkit.

Python has two core toolkits that are more like building blocks; almost all of the specialized toolkits listed next use these core ones:

- NumPy: Fast and efficient arrays built in Python
- SciPy: A bunch of algorithms for standard operations, built on NumPy

There are also C/C++-based implementations such as LIBLINEAR, LIBSVM, OpenCV, and others. Some of the most popular Python toolkits are as follows:

- nltk: The natural language toolkit, focused on natural language processing (NLP).
- mlpy: A machine learning algorithms toolkit with support for some key machine learning algorithms such as classification, regression, and clustering, among others.
- PyML: This toolkit focuses on support vector machines (SVM).
- PyBrain: This toolkit focuses on neural networks and related functions.
- mdp-toolkit: The focus of this toolkit is data processing, and it supports scheduling and parallelizing the processing.
- scikit-learn: One of the most popular toolkits, highly adopted by data scientists in the recent past. It has support for supervised and unsupervised learning, with some special support for feature selection and visualization. There is a large team actively building this toolkit, and it is known for its excellent documentation.
- PyDoop: Python integration with the Hadoop platform. PyDoop and SciPy are heavily deployed in big data analytics.

Find out how to apply Python machine learning to your working environment with our Practical Machine Learning book.
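To make the process described above concrete, here is a minimal, self-contained sketch using scikit-learn, one of the toolkits listed. The dataset (Iris), the model (logistic regression), and the split ratio are illustrative choices of ours, not the author's.

# A minimal sketch of the train/validate workflow described above.
# Dataset and model choices are illustrative assumptions, not from the article.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Split the cleansed data into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Pick a model and minimize its error function over the training set.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Make sure the model also works on the held-out test set.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

# The fitted model can now be applied to new inputs to predict the output.
print("prediction:", model.predict([[5.1, 3.5, 1.4, 0.2]]))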


What we can learn from attacks on the WEP Protocol

Packt
12 Aug 2015
4 min read
Over the years, many types of attacks on the WEP protocol have been undertaken. Successfully carrying out such an attack is an important milestone for anyone who wants to undertake penetration tests of wireless networks. In this article by Marco Alamanni, the author of Kali Linux Wireless Penetration Testing Essentials, we will take a look at the basics of the WEP protocol and the most common attacks against it.

What is the WEP protocol?

The WEP protocol was introduced with the original 802.11 standard as a means to provide authentication and encryption to wireless LAN implementations. It is based on the Rivest Cipher 4 (RC4) stream cipher with a Pre-shared Secret Key (PSK) of 40 or 104 bits, depending on the implementation. A 24-bit pseudorandom Initialization Vector (IV) is concatenated with the pre-shared key to form the per-packet key used by RC4 to generate the keystream for the actual encryption and decryption process. Thus, the resulting per-packet key is 64 or 128 bits long. In the encryption phase, the keystream is XOR-ed with the plaintext data to obtain the encrypted data; in the decryption phase, the encrypted data is XOR-ed with the same keystream to recover the plaintext. The encryption process is shown in the following diagram:

Attacks against WEP and why they occur

WEP is an insecure protocol and has been deprecated by the Wi-Fi Alliance. It suffers from various vulnerabilities related to the generation of the keystreams, to the use of IVs (initialization vectors), and to the length of the keys. The IV is used to add randomness to the keystream, with the intent of avoiding the reuse of the same keystream to encrypt different packets. This purpose was not accomplished in the design of WEP, because the IV is only 24 bits long (with 2^24 = 16,777,216 possible values) and it is transmitted in clear text within each frame. Thus, after a certain period of time (depending on the network traffic), the same IV, and consequently the same keystream, will be reused, allowing the attacker to collect the corresponding ciphertexts and perform statistical attacks to recover plaintexts and the key.

FMS attacks on WEP

The first well-known attack against WEP was the Fluhrer, Mantin, and Shamir (FMS) attack, back in 2001. The FMS attack relies on the way WEP generates the keystreams and on the fact that it uses weak IVs that produce weak keystreams, making it possible for an attacker to collect a sufficient number of packets encrypted with these keys, analyze them, and recover the key. The number of IVs that must be collected to complete the FMS attack is about 250,000 for 40-bit keys and 1,500,000 for 104-bit keys. The FMS attack was later enhanced by Korek, improving its performance. Andreas Klein found more correlations between the RC4 keystream and the key than the ones discovered by Fluhrer, Mantin, and Shamir, which can also be used to crack the WEP key.

PTW attacks on WEP

In 2007, Pyshkin, Tews, and Weinmann (PTW) extended Andreas Klein's research and improved the FMS attack, significantly reducing the number of IVs needed to successfully recover the WEP key. Indeed, the PTW attack does not rely on weak IVs as the FMS attack does, and it is very fast and effective. It is able to recover a 104-bit WEP key with a success probability of 50% using fewer than 40,000 frames, and with a probability of 95% using 85,000 frames. The PTW attack is the default method used by Aircrack-ng to crack WEP keys.
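To make the keystream mechanics concrete, here is a short, illustrative Python sketch (ours, not from the book) of WEP-style per-packet encryption: the public 24-bit IV is prepended to the pre-shared key, RC4 expands that per-packet key into a keystream, and the keystream is XOR-ed with the plaintext. The key, IV, and message are made-up values, and the real protocol also appends a CRC-32 integrity check value (ICV), which is omitted here.

# Illustrative sketch only: WEP-style per-packet encryption with RC4.
def rc4_keystream(key, length):
    # Key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm (PRGA)
    i = j = 0
    out = []
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def wep_encrypt(psk, iv, data):
    per_packet_key = iv + psk              # 24-bit IV || 40- or 104-bit PSK
    keystream = rc4_keystream(per_packet_key, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))

psk = bytes.fromhex("0123456789")          # example 40-bit pre-shared key
iv = bytes.fromhex("a1b2c3")               # example 24-bit IV, sent in clear text
ct = wep_encrypt(psk, iv, b"hello wireless world")
pt = wep_encrypt(psk, iv, ct)              # decryption is the same XOR
print(ct.hex(), pt)

This also makes the IV problem easy to quantify. As a rough back-of-the-envelope estimate (again ours, not the author's): if an implementation picks IVs at random from the pool of 2^24 = 16,777,216 values, the birthday bound puts the probability of at least one repeated IV among k captured frames at roughly 1 - e^(-k^2 / 2^25), which crosses 50% at about k ≈ 4,800 frames. Implementations that simply increment the IV avoid early collisions but are guaranteed to wrap around after 2^24 frames, so keystream reuse is unavoidable on a busy network either way.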
ARP Request Replay attacks on WEP

Both the FMS and PTW attacks need to collect quite a large number of frames to succeed, and both can be conducted passively, sniffing the wireless traffic on the same channel as the target AP and capturing frames. The problem is that, under normal conditions, we would have to spend quite a long time passively collecting all the packets necessary for the attacks, especially with the FMS attack. To accelerate the process, the idea is to re-inject frames into the network to generate traffic in response, so that we can collect the necessary IVs more quickly. A type of frame that is suitable for this purpose is the ARP request, because the AP rebroadcasts it each time with a new IV. As we are not associated with the AP, if we send frames to it directly they are discarded and a de-authentication frame is sent. Instead, we can capture ARP requests from associated clients and retransmit them to the AP. This technique is called the ARP Request Replay attack and is also adopted by Aircrack-ng for the implementation of the PTW attack. Find out more and become a master penetration tester by reading Kali Linux Wireless Penetration Testing Essentials.


Benchmarking and Optimizing Go Code, Part 2

Alex Browne
12 Aug 2015
9 min read
In this two-part post, you'll learn the basics of how to benchmark and optimize Go code. You'll do this by trying out a few different implementations for calculating a specific element in Pascal's Triangle (a.k.a. the binomial coefficient). In Part 1 we set up Go, covered the structure of our example, created an interface, and conducted some testing. Here in Part 2 we will benchmark our example for performance, and look at a recursive and a built-in implementation. All of the code for these two posts is available on GitHub.

Benchmarking for Performance

Since we've verified that our implementation works in Part 1, let's benchmark it to see how fast it is. Benchmarking one implementation by itself is not incredibly useful, but soon we will benchmark other implementations too and compare the results. In Go, benchmarks are simply functions that start with the word "Benchmark" and accept *testing.B as an argument. Like testing functions, benchmark functions also need to live in a file whose name ends in "_test.go". As we did for our tests, we'll write a utility function that we can use to benchmark any implementation. Add the following to test/benchmark_test.go:

package test

import (
    "github.com/albrow/go-benchmark-example/common"
    "github.com/albrow/go-benchmark-example/implementations"
    "testing"
)

func BenchmarkNaive(b *testing.B) {
    benchmarkPascaler(b, implementations.Naive)
}

func benchmarkPascaler(b *testing.B, p common.Pascaler) {
    for i := 0; i < b.N; i++ {
        p.Pascal(32, 16)
    }
}

We'll use the parameters n=32 and m=16 for our benchmarks. You could actually choose anything you want, but if n gets too big, you'll overflow any int values (which are stored in 64 bits on 64-bit operating systems). If that happens, any implementation that relies on storing numbers as ints will return incorrect results (and may even panic, depending on what math you do). When I tested the limit, overflow first started happening around n=67, m=33. You can run the benchmark by adding the -bench flag to the test command: go test ./test -bench . If it works, your output should look like this:

?       github.com/albrow/go-benchmark-example/common          [no test files]
?       github.com/albrow/go-benchmark-example/implementations [no test files]
PASS
BenchmarkNaive    100000    15246 ns/op
ok      github.com/albrow/go-benchmark-example/test    3.970s

The important bit is the line that starts with BenchmarkNaive. The first number tells us that this particular benchmark was run 100,000 times. Go will automatically run the benchmark as many times as it needs to until it gets a consistent result. Typically, faster benchmarks will be run more times. The second number tells us that it took an average of 15,246 ns per operation. The exact number can vary widely from machine to machine, which is why it doesn't mean much on its own. Let's write another implementation so we have something to compare it to.

Recursive Implementation

You may have noticed that our naive implementation was doing more work than it needed to. We don't actually need to construct the entire triangle to figure out a specific element. We only need to calculate the elements directly above the one we want (along with the elements directly above those, and so on). What if, instead of iterating from top to bottom, we started at the bottom and iterated upwards, calculating only the elements we needed? It should perform better because we're doing fewer calculations, right? Let's write an implementation that does exactly that and benchmark it to find out for sure.
The idea of moving from bottom to top leads pretty naturally to a recursive implementation. For the base case, we know that the top element is 1. We also know that any element on the edges, where m == 0 or m == n, is 1. For all other cases, the element is equal to the sum of the two elements directly above it. We'll write the implementation in implementations/recursive.go, and it looks like this:

package implementations

type recursiveType struct{}

var Recursive = &recursiveType{}

func (p *recursiveType) Pascal(n, m int) int {
    if n == 0 || m == 0 || n == m {
        return 1
    }
    return p.Pascal(n-1, m-1) + p.Pascal(n-1, m)
}

func (p *recursiveType) String() string {
    return "Recursive Implementation"
}

Personally, I think this implementation is much cleaner and easier to understand. Let's test it for correctness by adding a few lines to test/pascal_test.go:

func TestRecursive(t *testing.T) {
    testPascaler(t, implementations.Recursive)
}

Then run the tests again with go test ./... and, if it works, you should see the same output as before. Now, let's benchmark this implementation to see how fast it is compared to our first one. Add the following lines to test/benchmark_test.go:

func BenchmarkRecursive(b *testing.B) {
    benchmarkPascaler(b, implementations.Recursive)
}

Then run the benchmarks with go test ./test -bench . You might be surprised to find that with the parameters n=32, m=16, the recursive implementation is much, much slower! In fact, if you increase n much more, the benchmark will time out because the recursive implementation takes too long. These were the results on my machine:

BenchmarkNaive        100000      15246 ns/op
BenchmarkRecursive         1 4005199128 ns/op

This shows the recursive implementation took just over 4 seconds, whereas the "naive" implementation took a tiny fraction of a millisecond! The main reason this happens is that the recursive implementation recomputes the same sub-elements over and over again: without memoization, the number of recursive calls grows roughly exponentially with n, and every one of those calls also pays function call and stack frame overhead, which the Go compiler currently does not optimize away (though it might in the future). In languages and compilers that heavily optimize recursive calls and cache intermediate results, a recursive formulation can be competitive, but not this one in Go. Judging only by how clean the code looks, you might incorrectly predict that the recursive implementation would be faster. This example illustrates why benchmarking is important whenever you care about performance!

Built-in Implementation

There are a lot of techniques you could use to optimize the function further. You could try writing an implementation that moves from bottom to top without using recursion. You could take advantage of the symmetry of Pascal's Triangle and cut the number of calculations in half. You could even take advantage of memoization or use a lookup table to prevent repeated calculations. But for the sake of time, in this two-part series we're just going to try one more implementation. If you are familiar with combinatorics or statistics, you might recognize that what we've been trying to calculate with our Pascal function is the binomial coefficient. There are formulas for calculating the binomial coefficient directly, without building up any intermediate data structure. The most common formula involves factorials and is as follows:

binomial(n, k) = n! / (k! * (n-k)!)

One unexpected problem with this formula is that the intermediate values in the numerator and denominator can get too large and overflow 64-bit integers.
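To see how quickly the factorial formula blows up (the numbers below are ours, not from the original post), compare the intermediate factorials against the largest signed 64-bit integer:

20! = 2,432,902,008,176,640,000   <   2^63 - 1 = 9,223,372,036,854,775,807
21! = 51,090,942,171,709,440,000  >   2^63 - 1

So a 64-bit int can hold 20! but not 21!, even though binomial(21, 10) is only 352,716.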
This can happen even when the final result can be represented perfectly in 64 bits. Luckily, there is a package in the Go standard library, math/big, for dealing with numbers that are too big to fit in a 64-bit int. That package also has a function for calculating the binomial coefficient, which is exactly what we're looking for. Here's how we can use the built-in function to create our own implementation in implementations/builtin.go:

package implementations

import (
    "math/big"
)

type builtinType struct{}

var Builtin = builtinType{}

func (p builtinType) Pascal(n, m int) int {
    z := big.NewInt(0)
    z.Binomial(int64(n), int64(m))
    return int(z.Int64())
}

func (p builtinType) String() string {
    return "Builtin Implementation"
}

Test it for correctness by adding the following lines to test/pascal_test.go and running the tests again:

func TestBuiltin(t *testing.T) {
    testPascaler(t, implementations.Builtin)
}

Then, we can compare the performance by adding the following lines to test/benchmark_test.go and running the benchmarks:

func BenchmarkBuiltin(b *testing.B) {
    benchmarkPascaler(b, implementations.Builtin)
}

On my machine, the builtin implementation was only slightly faster than the naive one for the parameters n=32 and m=16. Let's see what happens if we increase n and m. You will probably need to skip the recursive implementation by adding the following line inside the BenchmarkRecursive function:

b.Skip("Skipping slower recursive implementation")

Here are the results on my machine for n=64, m=32:

BenchmarkNaive     50000    40234 ns/op
BenchmarkBuiltin   50000    25233 ns/op

As you can see, when we increase the difficulty, the builtin implementation really starts to out-perform the others. We can go even higher, but keep in mind that doing so will overflow 64-bit ints. Since the builtin implementation uses the big package, we could work around this restriction. However, the benchmark results for the naive implementation wouldn't mean much, since it can't return the correct values. To demonstrate how the builtin implementation performs with higher values of n, here are the results for n=10,000 and m=5,000 on my machine (skipping the other implementations):

BenchmarkBuiltin   300   4978329 ns/op

That's only about 5 ms to compute a really large coefficient. It appears that for high values of n and m, the builtin implementation is by far the best, not to mention that it is the only one of the three that returns correct results for those higher values of n and m.

Conclusion

In this two-part series, we tested and benchmarked three different implementations of the Pascal function, we learned how to write benchmarking functions, and we found that sometimes the fastest implementation is not the one you would expect. Benchmarking is critical whenever you are writing performance-sensitive code. Looking for a challenge? See if you can come up with an implementation to beat the builtin one. Some ideas are using parallelization, memoization, GPU acceleration, or binding to an optimized C library with cgo. Don't forget to test each implementation for correctness. Feel free to send a pull request here if you are able to beat it. Good luck!

About the Author

Alex Browne is a programmer and entrepreneur with about 4 years of product development experience and 3 years of experience working on small startups. He has worked with a wide variety of technologies and has single-handedly built many production-grade applications.
He is currently working with two co-founders on an early-stage startup exploring ways of applying machine learning and computer vision to the manufacturing industry. His favorite programming language is Go.


Tracking Objects in Videos

Packt
12 Aug 2015
13 min read
In this article by Salil Kapur and Nisarg Thakkar, authors of the book Mastering OpenCV Android Application Programming, we will look at the broader aspects of object tracking in videos. Object tracking is one of the most important applications of computer vision. It can be used for many applications, some of which are as follows:

- Human–computer interaction: We might want to track the position of a person's finger and use its motion to control the cursor on our machines
- Surveillance: Street cameras can capture pedestrians' motions, which can be tracked to detect suspicious activities
- Video stabilization and compression
- Statistics in sports: By tracking a player's movement in a game of football, we can provide statistics such as distance travelled, heat maps, and so on

In this article, you will learn the following topics:

- Optical flow
- Image pyramids

(For more resources related to this topic, see here.)

Optical flow

Optical flow is an algorithm that detects the pattern of the motion of objects, or edges, between consecutive frames in a video. This motion may be caused by the motion of the object or the motion of the camera. Optical flow is a vector that depicts the motion of a point from the first frame to the second. The optical flow algorithm works under two basic assumptions:

- The pixel intensities are almost constant between consecutive frames
- The neighboring pixels have the same motion as the anchor pixel

We can represent the intensity of a pixel in any frame by f(x,y,t). Here, the parameter t represents the frame in a video. Let's assume that, in the next dt time, the pixel moves by (dx,dy). Since we have assumed that the intensity doesn't change between consecutive frames, we can say:

f(x,y,t) = f(x + dx, y + dy, t + dt)

Now we take the Taylor series expansion of the right-hand side of the preceding equation:

f(x + dx, y + dy, t + dt) ≈ f(x,y,t) + (∂f/∂x)dx + (∂f/∂y)dy + (∂f/∂t)dt

Cancelling the common term, we get:

f_x·dx + f_y·dy + f_t·dt = 0

where f_x = ∂f/∂x, f_y = ∂f/∂y, and f_t = ∂f/∂t. Dividing both sides of the equation by dt, we get:

f_x·u + f_y·v + f_t = 0

where u = dx/dt and v = dy/dt. This equation is called the optical flow equation. Rearranging the equation, we get:

f_x·u + f_y·v = -f_t

We can see that this represents the equation of a line in the (u,v) plane. However, with only one equation available and two unknowns, this problem is under-constrained at the moment.

The Horn and Schunck method

Taking our assumptions into account, we minimize an energy of the form:

E = ∬ [ (f_x·u + f_y·v + f_t)^2 + α^2·(|∇u|^2 + |∇v|^2) ] dx dy

We can say that the first term will be small due to our assumption that the brightness is constant between consecutive frames, so the square of this term will be even smaller. The second term corresponds to the assumption that the neighboring pixels have motion similar to the anchor pixel. We need to minimize the preceding equation. For this, we differentiate it with respect to u and v, and we get the following equations:

f_x·(f_x·u + f_y·v + f_t) = α^2·Δu
f_y·(f_x·u + f_y·v + f_t) = α^2·Δv

Here, Δu and Δv are the Laplacians of u and v, respectively.

The Lucas and Kanade method

We start off with the optical flow equation that we derived earlier and note that it is under-constrained, as it has one equation and two variables:

f_x·u + f_y·v = -f_t

To overcome this problem, we make use of the assumption that pixels in a 3x3 neighborhood have the same optical flow. For the nine pixels q1, ..., q9 of the neighborhood:

f_x(q_i)·u + f_y(q_i)·v = -f_t(q_i),   i = 1, ..., 9

We can rewrite these equations in matrix form as A·U = b, where the i-th row of A is [ f_x(q_i)  f_y(q_i) ], U = [u, v]^T, and the i-th entry of b is -f_t(q_i). As we can see, A is a 9x2 matrix, U is a 2x1 matrix, and b is a 9x1 matrix. Ideally, to solve for U, we would just multiply both sides of the equation by A^-1. However, this is not possible, as we can only take the inverse of square matrices.
Thus, we try to transform A into a square matrix by first multiplying both sides of the equation by the transpose A^T:

A^T·A·U = A^T·b

Now A^T·A is a square matrix of dimension 2x2. Hence, we can take its inverse, and on solving this equation, we get:

U = (A^T·A)^-1·A^T·b

This method of multiplying by the transpose and then taking an inverse is called the pseudo-inverse. The same equation can also be obtained by finding the minimum of the following sum of squared errors:

E(u,v) = Σ [ f_x(q_i)·u + f_y(q_i)·v + f_t(q_i) ]^2

According to the optical flow equation and our assumptions, this value should be equal to zero. Since the neighborhood pixels do not have exactly the same motion as the anchor pixel, this value is very small. This method is called Least Square Error. To solve for the minimum, we differentiate this equation with respect to u and v and equate the result to zero. We get the following equations:

Σ f_x(q_i)·[ f_x(q_i)·u + f_y(q_i)·v + f_t(q_i) ] = 0
Σ f_y(q_i)·[ f_x(q_i)·u + f_y(q_i)·v + f_t(q_i) ] = 0

Now we have two equations and two variables, so this system of equations can be solved. We rewrite the preceding equations as follows:

(Σ f_x^2)·u + (Σ f_x·f_y)·v = -Σ f_x·f_t
(Σ f_x·f_y)·u + (Σ f_y^2)·v = -Σ f_y·f_t

By arranging these equations in the form of a matrix, we get the same equation as obtained earlier, A^T·A·U = A^T·b. Since the matrix A^T·A is a 2x2 matrix, it is possible to take its inverse, which again gives U = (A^T·A)^-1·A^T·b. Solving for u and v, we get:

u = [ -(Σ f_y^2)·(Σ f_x·f_t) + (Σ f_x·f_y)·(Σ f_y·f_t) ] / [ (Σ f_x^2)·(Σ f_y^2) - (Σ f_x·f_y)^2 ]
v = [  (Σ f_x·f_y)·(Σ f_x·f_t) - (Σ f_x^2)·(Σ f_y·f_t) ] / [ (Σ f_x^2)·(Σ f_y^2) - (Σ f_x·f_y)^2 ]

Now we have the values of all the f_x, f_y, and f_t terms, so we can find the values of u and v for each pixel. When we implement this algorithm, it is observed that the optical flow is not very smooth near the edges of objects. This is due to the brightness constraint not being satisfied. To overcome this situation, we use image pyramids.

Checking out the optical flow on Android

To see the optical flow in action on Android, we will create a grid of points over a video feed from the camera, and then lines will be drawn for each point, depicting the motion of that point on the video, superimposed on the grid point. Before we begin, we will set up our project to use OpenCV and obtain the feed from the camera. We will then process the frames to calculate the optical flow.

First, create a new project in Android Studio. We will set the activity name to MainActivity.java and the XML resource file to activity_main.xml. Second, we will give the app permission to access the camera. In the AndroidManifest.xml file, add the following line to the manifest tag:

<uses-permission android:name="android.permission.CAMERA" />

Make sure that your activity tag for MainActivity contains the following line as an attribute:

android:screenOrientation="landscape"

Our activity_main.xml file will contain a simple JavaCameraView. This is a custom OpenCV-defined layout that enables us to access the camera frames and process them as normal Mat objects. The XML code is shown here:

<LinearLayout
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="horizontal">

    <org.opencv.android.JavaCameraView
        android:layout_width="fill_parent"
        android:layout_height="fill_parent"
        android:id="@+id/main_activity_surface_view" />

</LinearLayout>

Now, let's work on some Java code.
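As a quick aside before the Java (this snippet is ours, not from the book): the least-squares solve at the heart of the Lucas and Kanade method can be checked in a few lines of NumPy. The gradient values for the 3x3 neighborhood below are made-up numbers, purely to make the algebra concrete.

# Aside: a tiny NumPy check of the Lucas-Kanade pseudo-inverse solve.
# The gradient values are fabricated for illustration only.
import numpy as np

fx = np.array([0.9, 1.1, 1.0, 0.8, 1.0, 1.2, 0.9, 1.0, 1.1])   # f_x(q_i)
fy = np.array([0.4, 0.5, 0.6, 0.5, 0.4, 0.5, 0.6, 0.5, 0.4])   # f_y(q_i)
ft = np.array([-1.3, -1.6, -1.6, -1.3, -1.4, -1.7, -1.5, -1.5, -1.5])  # f_t(q_i)

A = np.column_stack([fx, fy])   # the 9x2 matrix A
b = -ft                         # the 9x1 vector b

# Pseudo-inverse solve: U = (A^T A)^-1 A^T b
U = np.linalg.inv(A.T @ A) @ (A.T @ b)
print("u, v =", U)

# A generic least-squares solver gives the same answer.
print(np.linalg.lstsq(A, b, rcond=None)[0])

Both print statements produce the same (u, v), confirming that the pseudo-inverse formula and the least-squares minimization agree.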
First, we'll define some global variables that we will use later in the code: private static final String   TAG = "com.packtpub.masteringopencvandroid.chapter5.MainActivity";      private static final int       VIEW_MODE_KLT_TRACKER = 0;    private static final int       VIEW_MODE_OPTICAL_FLOW = 1;      private int                   mViewMode;    private Mat                   mRgba;    private Mat                   mIntermediateMat;    private Mat                   mGray;    private Mat                   mPrevGray;      MatOfPoint2f prevFeatures, nextFeatures;    MatOfPoint features;      MatOfByte status;    MatOfFloat err;      private MenuItem               mItemPreviewOpticalFlow, mItemPreviewKLT;      private CameraBridgeViewBase   mOpenCvCameraView; We will need to create a callback function for OpenCV, like we did earlier. In addition to the code we used earlier, we will also enable CameraView to capture frames for processing: private BaseLoaderCallback mLoaderCallback = new BaseLoaderCallback(this) {        @Override        public void onManagerConnected(int status) {            switch (status) {                case LoaderCallbackInterface.SUCCESS:                {                    Log.i(TAG, "OpenCV loaded successfully");                      mOpenCvCameraView.enableView();                } break;                default:                {                    super.onManagerConnected(status);                } break;            }        }    }; We will now check whether the OpenCV manager is installed on the phone, which contains the required libraries. In the onResume function, add the following line of code: OpenCVLoader.initAsync(OpenCVLoader.OPENCV_VERSION_2_4_10,   this, mLoaderCallback); In the onCreate() function, add the following line before calling setContentView to prevent the screen from turning off, while using the app: getWindow().addFlags(WindowManager.LayoutParams. FLAG_KEEP_SCREEN_ON); We will now initialize our JavaCameraView object. Add the following lines after setContentView has been called: mOpenCvCameraView = (CameraBridgeViewBase)   findViewById(R.id.main_activity_surface_view); mOpenCvCameraView.setCvCameraViewListener(this); Notice that we called setCvCameraViewListener with the this parameter. For this, we need to make our activity implement the CvCameraViewListener2 interface. So, your class definition for the MainActivity class should look like the following code: public class MainActivity extends Activity   implements CvCameraViewListener2 We will add a menu to this activity to toggle between different examples in this article. Add the following lines to the onCreateOptionsMenu function: mItemPreviewKLT = menu.add("KLT Tracker"); mItemPreviewOpticalFlow = menu.add("Optical Flow"); We will now add some actions to the menu items. In the onOptionsItemSelected function, add the following lines: if (item == mItemPreviewOpticalFlow) {            mViewMode = VIEW_MODE_OPTICAL_FLOW;            resetVars();        } else if (item == mItemPreviewKLT){            mViewMode = VIEW_MODE_KLT_TRACKER;            resetVars();        }          return true; We used a resetVars function to reset all the Mat objects. 
It has been defined as follows: private void resetVars(){        mPrevGray = new Mat(mGray.rows(), mGray.cols(), CvType.CV_8UC1);        features = new MatOfPoint();        prevFeatures = new MatOfPoint2f();        nextFeatures = new MatOfPoint2f();        status = new MatOfByte();        err = new MatOfFloat();    } We will also add the code to make sure that the camera is released for use by other applications, whenever our application is suspended or killed. So, add the following snippet of code to the onPause and onDestroy functions: if (mOpenCvCameraView != null)            mOpenCvCameraView.disableView(); After the OpenCV camera has been started, the onCameraViewStarted function is called, which is where we will add all our object initializations: public void onCameraViewStarted(int width, int height) {        mRgba = new Mat(height, width, CvType.CV_8UC4);        mIntermediateMat = new Mat(height, width, CvType.CV_8UC4);        mGray = new Mat(height, width, CvType.CV_8UC1);        resetVars();    } Similarly, the onCameraViewStopped function is called when we stop capturing frames. Here we will release all the objects we created when the view was started: public void onCameraViewStopped() {        mRgba.release();        mGray.release();        mIntermediateMat.release();    } Now we will add the implementation to process each frame of the feed that we captured from the camera. OpenCV calls the onCameraFrame method for each frame, with the frame as a parameter. We will use this to process each frame. We will use the viewMode variable to distinguish between the optical flow and the KLT tracker, and have different case constructs for the two: public Mat onCameraFrame(CvCameraViewFrame inputFrame) {        final int viewMode = mViewMode;        switch (viewMode) {            case VIEW_MODE_OPTICAL_FLOW: We will use the gray()function to obtain the Mat object that contains the captured frame in a grayscale format. OpenCV also provides a similar function called rgba() to obtain a colored frame. Then we will check whether this is the first run. If this is the first run, we will create and fill up a features array that stores the position of all the points in a grid, where we will compute the optical flow:                mGray = inputFrame.gray();                if(features.toArray().length==0){                   int rowStep = 50, colStep = 100;                    int nRows = mGray.rows()/rowStep, nCols = mGray.cols()/colStep;                      Point points[] = new Point[nRows*nCols];                    for(int i=0; i<nRows; i++){                        for(int j=0; j<nCols; j++){                            points[i*nCols+j]=new Point(j*colStep, i*rowStep);                        }                    }                      features.fromArray(points);                      prevFeatures.fromList(features.toList());                    mPrevGray = mGray.clone();                    break;                } The mPrevGray object refers to the previous frame in a grayscale format. We copied the points to a prevFeatures object that we will use to calculate the optical flow and store the corresponding points in the next frame in nextFeatures. All of the computation is carried out in the calcOpticalFlowPyrLK OpenCV defined function. 
This function takes in the grayscale version of the previous frame, the current grayscale frame, an object that contains the feature points whose optical flow needs to be calculated, and an object that will store the position of the corresponding points in the current frame:                nextFeatures.fromArray(prevFeatures.toArray());                Video.calcOpticalFlowPyrLK(mPrevGray, mGray,                    prevFeatures, nextFeatures, status, err); Now, we have the position of the grid of points and their position in the next frame as well. So, we will now draw a line that depicts the motion of each point on the grid:                List<Point> prevList=features.toList(), nextList=nextFeatures.toList();                Scalar color = new Scalar(255);                  for(int i = 0; i<prevList.size(); i++){                    Core.line(mGray, prevList.get(i), nextList.get(i), color);                } Before the loop ends, we have to copy the current frame to mPrevGray so that we can calculate the optical flow in the subsequent frames:                mPrevGray = mGray.clone();                break; default: mViewMode = VIEW_MODE_OPTICAL_FLOW; After we end the switch case construct, we will return a Mat object. This is the image that will be displayed as an output to the user of the application. Here, since all our operations and processing were performed on the grayscale image, we will return this image: return mGray; So, this is all about optical flow. The result can be seen in the following image: Optical flow at various points in the camera feed Image pyramids Pyramids are multiple copies of the same images that differ in their sizes. They are represented as layers, as shown in the following figure. Each level in the pyramid is obtained by reducing the rows and columns by half. Thus, effectively, we make the image's size one quarter of its original size: Relative sizes of pyramids Pyramids intrinsically define reduce and expand as their two operations. Reduce refers to a reduction in the image's size, whereas expand refers to an increase in its size. We will use a convention that lower levels in a pyramid mean downsized images and higher levels mean upsized images. Gaussian pyramids In the reduce operation, the equation that we use to successively find levels in pyramids, while using a 5x5 sliding window, has been written as follows. Notice that the size of the image reduces to a quarter of its original size: The elements of the weight kernel, w, should add up to 1. We use a 5x5 Gaussian kernel for this task. This operation is similar to convolution with the exception that the resulting image doesn't have the same size as the original image. The following image shows you the reduce operation: The reduce operation The expand operation is the reverse process of reduce. We try to generate images of a higher size from images that belong to lower layers. Thus, the resulting image is blurred and is of a lower resolution. The equation we use to perform expansion is as follows: The weight kernel in this case, w, is the same as the one used to perform the reduce operation. The following image shows you the expand operation: The expand operation The weights are calculated using the Gaussian function to perform Gaussian blur. Summary In this article, we have seen how to detect a local and global motion in a video, and how we can track objects. We have also learned about Gaussian pyramids, and how they can be used to improve the performance of some computer vision tasks. 
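If you would like to experiment with the same ideas outside of Android, the sketch below uses OpenCV's Python bindings rather than the Java API covered in the book. The video file name and grid spacing are illustrative assumptions of ours; the flow is computed with calcOpticalFlowPyrLK, which internally uses the image pyramids described above.

# Hedged sketch: grid-based pyramidal Lucas-Kanade flow with OpenCV in Python.
# "video.mp4" and the 50/100-pixel grid spacing are illustrative choices.
import cv2
import numpy as np

cap = cv2.VideoCapture("video.mp4")
ok1, prev = cap.read()
ok2, curr = cap.read()
if not (ok1 and ok2):
    raise SystemExit("need at least two frames")

prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
curr_gray = cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY)

# Build a regular grid of points, like the rowStep/colStep grid in the Java code.
h, w = prev_gray.shape
ys, xs = np.mgrid[0:h:50, 0:w:100]
pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32).reshape(-1, 1, 2)

# Pyramidal Lucas-Kanade: returns the matching point positions in the next frame.
next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)

# Draw a line for every grid point that was tracked successfully.
vis = curr.copy()
for p0, p1, st in zip(pts.reshape(-1, 2), next_pts.reshape(-1, 2), status.ravel()):
    if st:
        cv2.line(vis, tuple(int(v) for v in p0), tuple(int(v) for v in p1),
                 (255, 255, 255), 1)
cv2.imwrite("flow.png", vis)

# One "reduce" step of a Gaussian pyramid, as described in the pyramids section.
smaller = cv2.pyrDown(curr_gray)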


Tappy Defender – Building the home screen

Packt
12 Aug 2015
11 min read
In this article by John Horton, the author of Android Game Programming by Example, we will look at developing the home screen UI for our game. (For more resources related to this topic, see here.) Creating the project Fire up Android Studio and create a new project by following these steps. On the welcome page of Android Studio, click on Start a new Android Studio project. In the Create New Project window shown next, we need to enter some basic information about our app. These bits of information will be used by Android Studio to determine the package name. In the following image, you can see the Edit link where you can customize the package name if required. If you will be copy/pasting the supplied code into your project, then use C1 Tappy Defender for the Application name field and gamecodeschool.com in the Company Domain field as shown in the following screenshot: Click on the Next button when you're ready. When asked to select the form factors your app will run on, we can accept the default settings (Phone and Tablet). So click on Next again. On the Add an activity to mobile dialog, just click on Blank Activity followed by the Next button. On the Choose options for your new file dialog, again we can accept the default settings because MainActivity seems like a good name for our main Activity. So click on the Finish button. What we did Android Studio has built the project and created a number of files, most of which you will see and edit during the course of building this game. As mentioned earlier, even if you are just copying and pasting the code, you need to go through this step because Android Studio is doing things behind the scenes to make your project work. Building the home screen UI The first and simplest part of your Tappy Defender game is the home screen. All you need is a neat picture with a scene about the game, a high score, and a button to start the game. The finished home screen will look a bit like this: When you built the project, Android Studio opens up two files ready for you to edit. You can see them as tabs in the following Android Studio UI designer. The files (and tabs) are MainActivity.java and activity_main.xml: The MainActivity.java file is the entry point to your game, and you will see this in more detail soon. The activity_main.xml file is the UI layout that your home screen will use. Now, you can go ahead and edit the activity_main.xml file, so it actually looks like your home screen should. First of all, your game will be played with the Android device in landscape mode. If you change your UI preview window to landscape, you will see your progress more accurately. Look for the button shown in the next image. It is just preceding the UI preview: Click on the button shown in the preceding screenshot, and your UI preview will switch to landscape like this: Make sure activity_main.xml is open by clicking on its tab. Now, you will set in a background image. You can use your own. Add your chosen image to the drawable folder of the project in Android Studio. In the Properties window of the UI designer, find and click on the background property as shown in the next image: Also, in the previous image the button labelled ... is outlined. It is just to the right of the background property. Click on that ... button and browse to and select the background image file that you will be using. Next, you need a TextView widget that you will use to display the high score. Note that there is already a TextView widget on the layout. It says Hello World. 
You will modify this and use it for your high score. Left-click on and drag the TextView to where you want it. You can copy me if you intend to use the supplied background, or put it where it looks best with your own background. Next, in the Properties window, find and click on the id property. Enter textHighScore. You can also edit the text property to say High Score: 99999 or similar so that the TextView looks the part. However, this isn't necessary, because your Java code will take care of this later. Now, you will drag a button from the widget palette as shown in the following screenshot: Drag it to where it looks good on your background. You can copy me if using the supplied background or put it where it looks best with your background.

What we did

You now have a cool background with neatly arranged widgets (a TextView and a Button) for your home screen. You can add functionality via Java code to the Button widget next. Revisit the TextView for the player's high score. The important point is that both widgets have been assigned a unique ID that you can use to reference and manipulate them in your Java code.

Coding the functionality

Now that you have a simple layout for your game's home screen, you need to add the functionality that will allow the player to click on the Play button to start the game. Click on the tab for the MainActivity.java file. The code that was automatically generated for us is not exactly what we need. Therefore, we will start again, as it is simpler and quicker than tinkering with what is already there. Delete the entire contents of the MainActivity.java file except the package name and enter the following code in it. Of course, your package name may be different.

package com.gamecodeschool.c1tappydefender;

import android.app.Activity;
import android.os.Bundle;

public class MainActivity extends Activity{

    // This is the entry point to our game
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        //Here we set our UI layout as the view
        setContentView(R.layout.activity_main);
    }
}

The code above is the current contents of your main MainActivity class and the entry point of your game, the onCreate method. The line of code that begins with setContentView... is the line that loads our UI layout from activity_main.xml to the player's screen. We can run the game now and see our home screen. Now, let's handle the Play button on our home screen. Add the two highlighted lines of the following code into the onCreate method, just after the call to setContentView(). The first new line creates a new Button object and gets a reference to the Button in our UI layout. The second line is the code to listen for clicks on the button.

//Here we set our UI layout as the view
setContentView(R.layout.activity_main);

// Get a reference to the button in our layout
final Button buttonPlay =
  (Button)findViewById(R.id.buttonPlay);

// Listen for clicks
buttonPlay.setOnClickListener(this);

Note that you have a few errors in your code. You can resolve these errors by holding down the Alt keyboard key and then pressing Enter. This will add an import directive for the Button class. You still have one error. You need to implement an interface so that your code listens to the button clicks. Modify the MainActivity class declaration as highlighted:

public class MainActivity extends Activity
        implements View.OnClickListener{

When you implement the onClickListener interface, you must also implement the onClick method.
This is where you will handle what happens when a button is clicked. You can automatically generate the onClick method by right-clicking somewhere after the onCreate method, but within the MainActivity class, and navigating to Generate | Implement methods | onClick(v:View):void. You also need to have Android Studio add another import directive, this time for android.view.View. Use the Alt | Enter keyboard combination again. You can now scroll to near the bottom of the MainActivity class and see that Android Studio has implemented an empty onClick method for you. You should have no errors in your code at this point. Here is the onClick method:

@Override
public void onClick(View v) {
  //Our code goes here
}

As you only have one Button object and one listener, you can safely assume that any click on your home screen is the player pressing your Play button. Android uses the Intent class to switch between activities. As you need to go to a new activity when the Play button is clicked, you will create a new Intent object and pass in the name of your future Activity class, GameActivity, to its constructor. You can then use the Intent object to switch activities. Add the following code to the body of the onClick method:

// must be the Play button.
// Create a new Intent object
Intent i = new Intent(this, GameActivity.class);
// Start our GameActivity class via the Intent
startActivity(i);
// Now shut this activity down
finish();

Once again, you have errors in your code because you need to generate a new import directive, this time for the Intent class, so use the Alt | Enter keyboard combination again. You still have one error in your code. This is because your GameActivity class does not exist yet. You will now solve this problem.

Creating GameActivity

You have seen that when the player clicks on the Play button, the main activity will close and the game activity will begin. Therefore, you need to create a new activity called GameActivity, which will be where your game actually executes. From the main menu, navigate to File | New | Activity | Blank Activity. In the Choose options for your new file dialog, change the Activity name field to GameActivity. You can accept all the other default settings from this dialog, so click on Finish. As you did with your MainActivity class, you will code this class from scratch. Therefore, delete the entire code content from GameActivity.java.

What we did

Android Studio has generated two more files for you and done some work behind the scenes that you will investigate soon. The new files are GameActivity.java and activity_game.xml. They are both automatically opened for you in two new tabs, in the same place as the other tabs above the UI designer. You will never need activity_game.xml, because you will build a dynamically generated game view, not a static UI. Feel free to close that now or just ignore it. You will come back to the GameActivity.java file when you start to code your game for real.

Configuring the AndroidManifest.xml file

We briefly mentioned that when we create a new project or a new activity, Android Studio does more than just create two files for us. This is why we create new projects/activities the way we do. One of the things going on behind the scenes is the creation and modification of the AndroidManifest.xml file in the manifests directory. This file is required for your app to work. Also, it needs to be edited to make your app work the way you want it to.
Android Studio has automatically configured the basics for you, but you will now do two more things to this file. By editing the AndroidManifest.xml file, you will force both of your activities to run with a full screen, and you will also lock them to a landscape layout. Let's make these changes here: Open the manifests folder now, and double click on the AndroidManifest.xml file to open it in the code editor. In the AndroidManifest.xml file, find the following line of code:android:name=".MainActivity" Immediately following it, type or copy and paste these two lines to make MainActivity run full screen and lock it in the landscape orientation:android:theme="@android:style/Theme.NoTitleBar.Fullscreen"android:screenOrientation="landscape" In the AndroidManifest.xml file, find the following line of code:android:name=".GameActivity" Immediately following it, type or copy and paste these two lines to make GameActivity run full screen and lock it in the landscape orientation: android:theme="@android:style/Theme.NoTitleBar.Fullscreen"android:screenOrientation="landscape" What you did You have now configured both activities from your game to be full screen. This gives a much more pleasing appearance for your player. In addition, you have disabled the player's ability to affect your game by rotating their Android device. Continue building on what you've learnt so far with Android Game Programming by Example! Learn to implement the game rules, game mechanics, and game objects such as guns, life, money; and of course, the enemy. You even get to control a spaceship!


Task Execution with Asio

Packt
11 Aug 2015
20 min read
In this article by Arindam Mukherjee, the author of Learning Boost C++ Libraries, we learn how to execute a task using Boost Asio (pronounced ay-see-oh), a portable library for performing efficient network I/O using a consistent programming model. At its core, Boost Asio provides a task execution framework that you can use to perform operations of any kind. You create your tasks as function objects and post them to a task queue maintained by Boost Asio. You enlist one or more threads to pick up these tasks (function objects) and invoke them. The threads keep picking up tasks, one after the other, till the task queues are empty, at which point the threads do not block but exit. (For more resources related to this topic, see here.)

IO Service, queues, and handlers

At the heart of Asio is the type boost::asio::io_service. A program uses the io_service interface to perform network I/O and manage tasks. Any program that wants to use the Asio library creates at least one instance of io_service, and sometimes more than one. In this section, we will explore the task management capabilities of io_service. Here is the IO Service in action, using the obligatory "hello world" example:

Listing 11.1: Asio Hello World

 1 #include <boost/asio.hpp>
 2 #include <iostream>
 3 namespace asio = boost::asio;
 4
 5 int main() {
 6   asio::io_service service;
 7
 8   service.post(
 9     [] {
10       std::cout << "Hello, world!" << '\n';
11     });
12
13   std::cout << "Greetings: \n";
14   service.run();
15 }

We include the convenience header boost/asio.hpp, which includes most of the Asio library that we need for the examples in this article (line 1). All parts of the Asio library are under the namespace boost::asio, so we use a shorter alias for this (line 3). The program itself just prints Hello, world! on the console, but it does so through a task. The program first creates an instance of io_service (line 6) and posts a function object to it, using the post member function of io_service. The function object, in this case defined using a lambda expression, is referred to as a handler. The call to post adds the handler to a queue inside io_service; some thread (including the one that posted the handler) must dispatch them, that is, remove them from the queue and call them. The call to the run member function of io_service (line 14) does precisely this. It loops through the handlers in the queue inside io_service, removing and calling each handler. In fact, we can post more handlers to the io_service before calling run, and it would call all the posted handlers. If we did not call run, none of the handlers would be dispatched. The run function blocks until all the handlers in the queue have been dispatched, and returns only when the queue is empty. By itself, a handler may be thought of as an independent, packaged task, and Boost Asio provides a great mechanism for dispatching arbitrary tasks as handlers. Note that handlers must be nullary function objects, that is, they should take no arguments. Asio is a header-only library by default, but programs using Asio need to link at least with boost_system. On Linux, we can use the following command line to build this example:

$ g++ -g listing11_1.cpp -o listing11_1 -lboost_system -std=c++11

Running this program prints the following:

Greetings:
Hello, world!

Note that Greetings: is printed from the main function (line 13) before the call to run (line 14). The call to run ends up dispatching the sole handler in the queue, which prints Hello, world!.
It is also possible for multiple threads to call run on the same I/O object and dispatch handlers concurrently. We will see how this can be useful in the next section.

Handler states – run_one, poll, and poll_one

While the run function blocks until there are no more handlers in the queue, there are other member functions of io_service that let you process handlers with greater flexibility. But before we look at these functions, we need to distinguish between pending and ready handlers. The handlers we posted to the io_service were all ready to run immediately and were invoked as soon as their turn came on the queue. In general, handlers are associated with background tasks that run in the underlying OS, for example, network I/O tasks. Such handlers are meant to be invoked only once the associated task is completed, which is why, in such contexts, they are called completion handlers. These handlers are said to be pending while the associated task is awaiting completion, and once the associated task completes, they are said to be ready. The poll member function, unlike run, dispatches all the ready handlers but does not wait for any pending handler to become ready. Thus, it returns immediately if there are no ready handlers, even if there are pending handlers. The poll_one member function dispatches exactly one ready handler if there is one, but does not block waiting for pending handlers to get ready. The run_one member function blocks on a nonempty queue waiting for a handler to become ready. It returns when called on an empty queue, and otherwise, as soon as it finds and dispatches a ready handler.

post versus dispatch

A call to the post member function adds a handler to the task queue and returns immediately. A later call to run is responsible for dispatching the handler. There is another member function called dispatch that can be used to request the io_service to invoke a handler immediately if possible. If dispatch is invoked in a thread that has already called one of run, poll, run_one, or poll_one, then the handler will be invoked immediately. If no such thread is available, dispatch adds the handler to the queue and returns, just like post would. In the following example, we invoke dispatch from the main function and from within another handler:

Listing 11.2: post versus dispatch

 1 #include <boost/asio.hpp>
 2 #include <iostream>
 3 namespace asio = boost::asio;
 4
 5 int main() {
 6   asio::io_service service;
 7   // Hello Handler – dispatch behaves like post
 8   service.dispatch([]() { std::cout << "Hello\n"; });
 9
10   service.post(
11     [&service] { // English Handler
12       std::cout << "Hello, world!\n";
13       service.dispatch([] { // Spanish Handler, immediate
14                         std::cout << "Hola, mundo!\n";
15                       });
16     });
17   // German Handler
18   service.post([&service] {std::cout << "Hallo, Welt!\n"; });
19   service.run();
20 }

Running this code produces the following output:

Hello
Hello, world!
Hola, mundo!
Hallo, Welt!

The first call to dispatch (line 8) adds a handler to the queue without invoking it, because run was yet to be called on the io_service. We call this the Hello Handler, as it prints Hello. This is followed by the two calls to post (lines 10, 18), which add two more handlers. The first of these two handlers prints Hello, world! (line 12), and in turn calls dispatch (line 13) to add another handler that prints the Spanish greeting, Hola, mundo! (line 14).
The second of these handlers prints the German greeting, Hallo, Welt! (line 18). For our convenience, let's just call them the English, Spanish, and German handlers. This creates the following entries in the queue:

Hello Handler
English Handler
German Handler

Now, when we call run on the io_service (line 19), the Hello Handler is dispatched first and prints Hello. This is followed by the English Handler, which prints Hello, world! and calls dispatch on the io_service, passing the Spanish Handler. Since this executes in the context of a thread that has already called run, the call to dispatch invokes the Spanish Handler, which prints Hola, mundo!. Following this, the German Handler is dispatched, printing Hallo, Welt! before run returns.

What if the English Handler called post instead of dispatch (line 13)? In that case, the Spanish Handler would not be invoked immediately but would queue up after the German Handler. The German greeting Hallo, Welt! would precede the Spanish greeting Hola, mundo!. The output would look like this:

Hello
Hello, world!
Hallo, Welt!
Hola, mundo!

Concurrent execution via thread pools

The io_service object is thread-safe and multiple threads can call run on it concurrently. If there are multiple handlers in the queue, they can be processed concurrently by such threads. In effect, the set of threads that call run on a given io_service form a thread pool. Successive handlers can be processed by different threads in the pool. Which thread dispatches a given handler is indeterminate, so the handler code should not make any such assumptions. In the following example, we post a bunch of handlers to the io_service and then start four threads, which all call run on it:

Listing 11.3: Simple thread pools

 1 #include <boost/asio.hpp>
 2 #include <boost/thread.hpp>
 3 #include <boost/date_time.hpp>
 4 #include <iostream>
 5 namespace asio = boost::asio;
 6
 7 #define PRINT_ARGS(msg) do { \
 8   boost::lock_guard<boost::mutex> lg(mtx); \
 9   std::cout << '[' << boost::this_thread::get_id() \
10             << "] " << msg << std::endl; \
11 } while (0)
12
13 int main() {
14   asio::io_service service;
15   boost::mutex mtx;
16
17   for (int i = 0; i < 20; ++i) {
18     service.post([i, &mtx]() {
19                         PRINT_ARGS("Handler[" << i << "]");
20                         boost::this_thread::sleep(
21                               boost::posix_time::seconds(1));
22                       });
23   }
24
25   boost::thread_group pool;
26   for (int i = 0; i < 4; ++i) {
27     pool.create_thread([&service]() { service.run(); });
28   }
29
30   pool.join_all();
31 }

We post twenty handlers in a loop (line 18). Each handler prints its identifier (line 19), and then sleeps for a second (lines 20-21). To run the handlers, we create a group of four threads, each of which calls run on the io_service (line 27), and wait for all the threads to finish (line 30). We define the macro PRINT_ARGS, which writes output to the console in a thread-safe way, tagged with the current thread ID (lines 7-11). We will use this macro in other examples too.
To build this example, you must also link against libboost_thread, libboost_date_time, and in POSIX environments, with libpthread too:

$ g++ -g listing11_3.cpp -o listing11_3 -lboost_system -lboost_thread -lboost_date_time -pthread -std=c++11

One particular run of this program on my laptop produced the following output (with some lines snipped):

[b5c15b40] Handler[0]
[b6416b40] Handler[1]
[b6c17b40] Handler[2]
[b7418b40] Handler[3]
[b5c15b40] Handler[4]
[b6416b40] Handler[5]
…
[b6c17b40] Handler[13]
[b7418b40] Handler[14]
[b6416b40] Handler[15]
[b5c15b40] Handler[16]
[b6c17b40] Handler[17]
[b7418b40] Handler[18]
[b6416b40] Handler[19]

You can see that the different handlers are executed by different threads (each thread ID marked differently). If any of the handlers threw an exception, it would be propagated across the call to the run function on the thread that was executing the handler.

io_service::work

Sometimes, it is useful to keep the thread pool started, even when there are no handlers to dispatch. Neither run nor run_one blocks on an empty queue. So in order for them to block waiting for a task, we have to indicate, in some way, that there is outstanding work to be performed. We do this by creating an instance of io_service::work, as shown in the following example:

Listing 11.4: Using io_service::work to keep threads engaged

 1 #include <boost/asio.hpp>
 2 #include <memory>
 3 #include <boost/thread.hpp>
 4 #include <iostream>
 5 namespace asio = boost::asio;
 6
 7 typedef std::unique_ptr<asio::io_service::work> work_ptr;
 8
 9 #define PRINT_ARGS(msg) do { …
...
14
15 int main() {
16   asio::io_service service;
17   // keep the workers occupied
18   work_ptr work(new asio::io_service::work(service));
19   boost::mutex mtx;
20
21   // set up the worker threads in a thread group
22   boost::thread_group workers;
23   for (int i = 0; i < 3; ++i) {
24     workers.create_thread([&service, &mtx]() {
25                         PRINT_ARGS("Starting worker thread ");
26                         service.run();
27                         PRINT_ARGS("Worker thread done");
28                       });
29   }
30
31   // Post work
32   for (int i = 0; i < 20; ++i) {
33     service.post(
34       [&service, &mtx]() {
35         PRINT_ARGS("Hello, world!");
36         service.post([&mtx]() {
37                           PRINT_ARGS("Hola, mundo!");
38                         });
39       });
40   }
41
42   work.reset(); // destroy work object: signals end of work
43   workers.join_all(); // wait for all worker threads to finish
44 }

In this example, we create an object of io_service::work wrapped in a unique_ptr (line 18). We associate it with an io_service object by passing to the work constructor a reference to the io_service object. Note that, unlike listing 11.3, we create the worker threads first (lines 24-28) and then post the handlers (lines 33-39). However, the worker threads stay put waiting for handlers, because their calls to run block (line 26). This happens because of the io_service::work object we created, which indicates that there is outstanding work in the io_service queue. As a result, even after all handlers are dispatched, the threads do not exit. By calling reset on the unique_ptr wrapping the work object, the work object's destructor is called, which notifies the io_service that all outstanding work is complete (line 42). The calls to run in the threads return and the program exits once all the threads are joined (line 43).
We wrapped the work object in a unique_ptr to destroy it in an exception-safe way at a suitable point in the program. We omitted the definition of PRINT_ARGS here; refer to listing 11.3.

Serialized and ordered execution via strands

Thread pools allow handlers to be run concurrently. This means that handlers that access shared resources need to synchronize access to these resources. We already saw examples of this in listings 11.3 and 11.4, when we synchronized access to std::cout, which is a global object. As an alternative to writing synchronization code in handlers, which can make the handler code more complex, we can use strands. Think of a strand as a subsequence of the task queue with the constraint that no two handlers from the same strand ever run concurrently. The scheduling of other handlers in the queue, which are not in the strand, is not affected by the strand in any way. Let us look at an example of using strands:

Listing 11.5: Using strands

 1 #include <boost/asio.hpp>
 2 #include <boost/thread.hpp>
 3 #include <boost/date_time.hpp>
 4 #include <cstdlib>
 5 #include <iostream>
 6 #include <ctime>
 7 namespace asio = boost::asio;
 8 #define PRINT_ARGS(msg) do { ...
13
14 int main() {
15   std::srand(std::time(0));
16   asio::io_service service;
17   asio::io_service::strand strand(service);
18   boost::mutex mtx;
19   size_t regular = 0, on_strand = 0;
20
21   auto workFuncStrand = [&mtx, &on_strand] {
22           ++on_strand;
23           PRINT_ARGS(on_strand << ". Hello, from strand!");
24           boost::this_thread::sleep(
25                       boost::posix_time::seconds(2));
26         };
27
28   auto workFunc = [&mtx, &regular] {
29                   PRINT_ARGS(++regular << ". Hello, world!");
30                   boost::this_thread::sleep(
31                         boost::posix_time::seconds(2));
32                 };
33   // Post work
34   for (int i = 0; i < 15; ++i) {
35     if (rand() % 2 == 0) {
36       service.post(strand.wrap(workFuncStrand));
37     } else {
38       service.post(workFunc);
39     }
40   }
41
42   // set up the worker threads in a thread group
43   boost::thread_group workers;
44   for (int i = 0; i < 3; ++i) {
45     workers.create_thread([&service, &mtx]() {
46                      PRINT_ARGS("Starting worker thread ");
47                       service.run();
48                       PRINT_ARGS("Worker thread done");
49                     });
50   }
51
52   workers.join_all(); // wait for all worker threads to finish
53 }

In this example, we create two handler functions: workFuncStrand (line 21) and workFunc (line 28). The lambda workFuncStrand captures a counter on_strand, increments it, and prints the message Hello, from strand!, prefixed with the value of the counter. The function workFunc captures another counter regular, increments it, and prints Hello, world!, prefixed with the counter. Both pause for 2 seconds before returning.

To define and use a strand, we first create an object of io_service::strand associated with the io_service instance (line 17). Thereafter, we post all handlers that we want to be part of that strand by wrapping them using the wrap member function of the strand (line 36). Alternatively, we can post the handlers to the strand directly by using either the post or the dispatch member function of the strand, as shown in the following snippet:

34   for (int i = 0; i < 15; ++i) {
35     if (rand() % 2 == 0) {
36       strand.post(workFuncStrand);
37     } else {
...
The wrap member function of strand returns a function object, which in turn calls dispatch on the strand to invoke the original handler. Initially, it is this function object rather than our original handler that is added to the queue. When duly dispatched, this invokes the original handler. There are no constraints on the order in which these wrapper handlers are dispatched, and therefore, the actual order in which the original handlers are invoked can be different from the order in which they were wrapped and posted.

On the other hand, calling post or dispatch directly on the strand avoids an intermediate handler. Directly posting to a strand also guarantees that the handlers will be dispatched in the same order that they were posted, achieving a deterministic ordering of the handlers in the strand. The dispatch member of strand blocks until the handler is dispatched. The post member simply adds it to the strand and returns.

Note that workFuncStrand increments on_strand without synchronization (line 22), while workFunc increments the counter regular within the PRINT_ARGS macro (line 29), which ensures that the increment happens in a critical section. The workFuncStrand handlers are posted to a strand and therefore are guaranteed to be serialized; hence there is no need for explicit synchronization. On the flip side, entire functions are serialized via strands, and synchronizing smaller blocks of code is not possible. There is no serialization between the handlers running on the strand and other handlers; therefore, the access to global objects, like std::cout, must still be synchronized. The following is a sample output of running the preceding code:

[b73b6b40] Starting worker thread
[b73b6b40] 0. Hello, from strand!
[b6bb5b40] Starting worker thread
[b6bb5b40] 1. Hello, world!
[b63b4b40] Starting worker thread
[b63b4b40] 2. Hello, world!
[b73b6b40] 3. Hello, from strand!
[b6bb5b40] 5. Hello, world!
[b63b4b40] 6. Hello, world!
…
[b6bb5b40] 14. Hello, world!
[b63b4b40] 4. Hello, from strand!
[b63b4b40] 8. Hello, from strand!
[b63b4b40] 10. Hello, from strand!
[b63b4b40] 13. Hello, from strand!
[b6bb5b40] Worker thread done
[b73b6b40] Worker thread done
[b63b4b40] Worker thread done

There were three distinct threads in the pool, and the handlers from the strand were picked up by two of these three threads: initially, by thread ID b73b6b40, and later on, by thread ID b63b4b40. This also dispels a frequent misunderstanding that all handlers in a strand are dispatched by the same thread, which is clearly not the case. Different handlers in the same strand may be dispatched by different threads but will never run concurrently.

Summary

Asio is a well-designed library that can be used to write fast, nimble network servers that utilize the most optimal mechanisms for asynchronous I/O available on a system. It is an evolving library and is the basis for a Technical Specification that proposes to add a networking library to a future revision of the C++ Standard. In this article, we learned how to use the Boost Asio library as a task queue manager and leverage Asio's TCP and UDP interfaces to write programs that communicate over the network.

Ext JS 5 – an Introduction

Packt
11 Aug 2015
17 min read
In this article by Carlos A. Méndez, the author of the book Learning Ext JS - Fourth Edition, we will see some of the important features in Ext JS. When learning a new technology such as Ext JS, some developers have a hard time getting started, so this article covers certain important points that have been included in the recent version of Ext JS. We will be referencing online documentation, blogs, and forums, looking for answers and trying to figure out how the library and all the components work together. Even though there are tutorials in the official learning center, it is great to have a guide to learn the library from the basics to a more advanced level.

Ext JS is a state-of-the-art framework to create Rich Internet Applications (RIAs). The framework allows us to create cross-browser applications with a powerful set of components and widgets. The idea behind the framework is to create user-friendly applications in rapid development cycles, facilitate teamwork (MVC or MVVM), and support long-term maintainability. Ext JS is not just a library of widgets anymore; the brand new version is a framework full of new exciting features for us to play with. Some of these features are the new class system, the loader, the new application package, which defines a standard way to code our applications, and much more awesome stuff. The company behind the Ext JS library is Sencha Inc. They work on great products that are based on web standards. Some of Sencha's other well-known products are Sencha Touch and Sencha Architect. In this article, we will cover some of the basic concepts of version 5 of the framework and take a look at some of its new features.

Considering Ext JS for your next project

Ext JS is a great library to create RIAs that require a lot of interactivity with the user. If you need complex components to manage your information, then Ext is your best option because it contains a lot of widgets such as the grid, forms, trees, and panels, along with a great data package and class system. Ext JS is best suited for enterprise or intranet applications; it's a great tool to develop an entire CRM or ERP software solution. One of the more appealing examples is the Desktop sample (http://dev.sencha.com/ext/5.1.0/examples/desktop/index.html). It really looks and feels like a native application running in the browser. In some cases, this is an advantage because the users already know how to interact with the components and we can improve the user experience.

Ext JS 5 came out with a great tool to create themes and templates in a very simple way. The framework for creating themes is built on top of Compass and Sass, so we can modify some variables and properties and in a few minutes we can have a custom template for our Ext JS applications. If we want something more complex or unique, we can modify the original template to suit our needs. This might be more time-consuming, depending on our experience with Compass and Sass.

Compass and Sass are extensions for CSS. We can use expressions, conditions, variables, mixins, and many more awesome things to generate well-formatted CSS. You can learn more about Compass on their website at http://compass-style.org/.

The new class system allows us to define classes incredibly easily. We can develop our application using the object-oriented programming paradigm and take advantage of single and multiple inheritance, as the short sketch after this paragraph illustrates.
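As a brief illustration of the class system, the following is a minimal sketch of what class definitions might look like; the class names, configs, and method bodies are illustrative assumptions, not code from the book:

// A hypothetical base class defined with the Ext JS class system.
Ext.define('MyApp.Animal', {
    config: {
        name: 'Unknown'
    },
    constructor: function (cfg) {
        this.initConfig(cfg);   // applies the config block and generates getName/setName
    },
    speak: function () {
        console.log(this.getName() + ' makes a sound');
    }
});

// Single inheritance via extend, plus a mixin to illustrate "multiple inheritance".
Ext.define('MyApp.Dog', {
    extend: 'MyApp.Animal',
    mixins: ['Ext.mixin.Observable'],
    speak: function () {
        console.log(this.getName() + ' barks');
    }
});

var dog = Ext.create('MyApp.Dog', { name: 'Rex' });
dog.speak(); // logs "Rex barks"

Classes declared in this way can also be picked up by the loader and by Sencha Cmd, which is what the rest of the article relies on.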
This is a great advantage because we can implement any of the available patterns such as MVC, MVVM, Observable, or any other. This will allow us to have a good code structure, which in turn makes maintenance easier. Another thing to keep in mind is the growing community around the library; there are lots of people around the world working with Ext JS right now. You can even join the groups that hold local meetups frequently to share knowledge and experiences; I recommend looking for a group in your city or creating one.

The new loader system is a great way to load our modules or classes on demand. We can load only the modules and applications that the user needs, just in time. This functionality allows us to bootstrap our application faster by loading only the minimal code for our application to work. One more thing to keep in mind is the ability to prepare our code for deployment. We can compress and obfuscate our code for a production environment using Sencha CMD, a tool that we can run in our terminal to automatically analyze all the dependencies of our code and create packages.

Documentation is very important and Ext JS has great documentation, which is very descriptive, with a lot of examples, videos, and sample code so that we can see it in action right on the documentation pages, and we can also read the comments from the community.

What's new in Ext JS 5

Ext JS 5 introduces a great number of new features, and we'll briefly cover a few of the significant additions in version 5 as follows:

Tablet support and new themes: This introduces the ability to create apps compatible with touch-screen devices (touch-screen laptops, PCs, and tablets). The Crisp theme is introduced and is based on the Neptune theme. Also, there are new themes for tablet support, which are Neptune Touch and Crisp Touch.

New application architecture – MVVM: Sencha added a new alternative to MVC called MVVM (which stands for Model-View-ViewModel). This new architecture has data binding and two-way data binding, allowing us to remove much of the extra code that some of us were writing in past versions. This new architecture introduces:
Data binding
View controllers
View models

Routing: Routing provides deep linking of application functionality and allows us to perform certain actions or methods in our application by translating the URL. This gives us the ability to control the application state, which means that we can go to a specific part of, or create a direct link to, our application. Also, it can handle multiple actions in the URL.

Responsive configurations: Now we have the ability to set the responsiveConfig property (a new property) on some components. It is a configuration object whose keys are rules (conditions and criteria) and whose values are the configurations applied when the rule is met. As an example:

responsiveConfig: {
  'width > 800': {
    region: 'west'
  },
  'width <= 800': {
    region: 'north'
  }
}

Data package improvements: Some good changes came in version 5 relating to data handling and data manipulation. These changes give developers an easier journey in their projects, and some of the new things are:
Common Data (the Ext JS data classes under Ext.data are now part of the core package)
Many-to-many associations
Chained stores
Custom field types

Event system: The event logic was changed, and is now a single listener attached at the very top of the DOM hierarchy.
So this means that when a DOM element fires an event, it bubbles to the top of the hierarchy before it's handled. Ext JS intercepts it there and checks the relevant listeners you added to the component or store. This reduces the number of interactions with the DOM and also gives us the ability to enable gestures.

Sencha Charts: Charts can work on both Ext JS and Sencha Touch, and have enhanced performance on tablet devices. Legacy Ext JS 4 charts were converted into a separate package to minimize the conversion/upgrade effort. In version 5, charts have new features such as:
Candlestick and OHLC series
Pan, zoom, and crosshair interactions
Floating axes
Multiple axes
SVG and HTML Canvas support
Better performance
Greater customization
Chart themes

Tab Panels: Tab panels have more options to control configurations such as icon alignment and text rotation. Thanks to new flexible Sass mixins, we can easily control presentation options.

Grids: This component, which has been present since version 2.x, is one of the most popular components, and we may call it one of the cornerstones of this framework. In version 5, it got some awesome new features:
Components in cells
Buffered updates
Cell updaters
Grid filters (the popular "UX" (user extension) has been rewritten and integrated into the framework; filters can also be saved in the component state)
Rendering optimizations

Widgets: This is a lightweight component, which is a middle ground between Ext.Component and the cell renderer.

Breadcrumb bars: This new component displays the data of a store (a specific data store for the tree component) in toolbar form. This new control can be a space saver on small screens or tablets.

Form package improvements: Ext JS 5 introduces some new controls and significant changes to others:
Tagfield: This is a new control to select multiple values.
Segmented buttons: These are buttons with presentation such as multiple selection on mobile interfaces.
Goodbye to TriggerField: In version 5, TriggerField is deprecated, and the way to create triggers now is to use the Text field and implement the triggers in the TextField configuration. (TriggerField in version 4 is a text field with one or more buttons configured on its right side.)
Field and Form layouts: Layouts were refactored using HTML and CSS, so performance is now better.

New Sass mixins (http://sass-lang.com/): Several components that could not be custom-themed before can now be styled in multiple ways in a single theme or application. These components are:
Ext.menu.Menu
Ext.form.Labelable
Ext.form.FieldSet
Ext.form.CheckboxGroup
Ext.form.field.Text
Ext.form.field.Spinner
Ext.form.field.Display
Ext.form.field.Checkbox

The Sencha Core package: The core package contains code shared between Ext JS and Sencha Touch, and in the future, this core will be part of the next major release of Sencha Touch. The Core includes:
Class system
Data
Events
Element
Utilities
Feature/environment detection

Preparing for deployment

So far, we have seen a few features that help to architect JavaScript code; but we need to prepare our application for a production environment. Initially, when an application is in the development environment, we need to make sure that Ext JS classes (as well as our own classes) are dynamically loaded when the application requires them. In this environment, it's really helpful to load each class in a separate file. This will allow us to debug the code easily, and find and fix bugs; a short sketch of declaring such dependencies follows.
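As a brief illustration of on-demand loading during development, a class can declare the framework and application classes it depends on, and the loader then fetches each one from its own file. The class, store, and file names below are illustrative assumptions, not code from the book:

// A view that declares its dependencies; in development mode the loader
// resolves each entry in requires to a separate file and loads it on demand.
Ext.define('MyApp.view.UserGrid', {
    extend: 'Ext.grid.Panel',
    xtype: 'usergrid',

    requires: [
        'Ext.grid.column.Date',   // framework class, loaded from the ext folder
        'MyApp.store.Users'       // application class, e.g. app/store/Users.js
    ],

    title: 'Users',
    store: 'Users',
    columns: [
        { text: 'Name', dataIndex: 'name', flex: 1 },
        { text: 'Joined', dataIndex: 'joined', xtype: 'datecolumn' }
    ]
});

In a production build, Sencha Cmd resolves these same requires declarations to concatenate only the needed classes into app.js, which is what the rest of this section walks through.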
Now, before the application is compiled, we must know the three basic parts of an application, as listed here:

app.json: This file contains specific details about our application. Sencha CMD processes this file first.
build.xml: This file contains a minimal initial Ant script, and imports a task file located at .sencha/app/build-impl.xml.
.sencha: This folder contains many files related to, and used by, the build process.

The app.json file

As we said before, the app.json file contains the information about the settings of the application. Open the file and take a look. We can make changes to this file, such as the theme that our application is going to use. For example, we can use the following line of code:

"theme": "my-custom-theme-touch",

Alternatively, we can use a normal theme:

"theme": "my-custom-theme",

We can also use the following for using charts:

"requires": [
    "sencha-charts"
],

This specifies that we are going to use the charts or draw classes in our application (the chart package for Ext JS 5). Now, at the end of the file, there is an ID for the application:

"id": "7833ee81-4d14-47e6-8293-0cb8120281ab"

After this ID, we can add other properties. As an example, suppose the application will be generated for Central and South America. Then we need to include the locale (ES or PT), so we can add the following:

,"locales":["es"]

We can also add multiple languages:

,"locales":["es","pt","en"]

This will cause the compilation process to include the corresponding locale files located at ext/packages/ext-locale/build. However, this article can't cover each property in the file, so it's recommended that you take a deep look into the Sencha CMD documentation at http://docs-origin.sencha.com/cmd/5.x/microloader.html to learn more about the app.json file.

The Sencha command

To create our production build, we need to use Sencha CMD. This tool will help us here. If you are running Sencha CMD on Windows 7 or Windows 8, it's recommended that you run the tool with administrator privileges. So let's type this in our console tool:

[path of my app]\sencha app build

In my case (Windows 7; 64-bit), I typed:

K:\x_extjsdev\app_test\myapp>sencha app build

After the command runs, you will see the build output in your console tool. So, let's check out the build folder inside our application folder. We may have the following list of files; notice that the build process has created these:

resources: This folder will contain a copy of our resources folder, plus one or more CSS files starting with myApp-all
app.js: This file contains all of the necessary JS (Ext JS core classes, components, and our custom application classes)
app.json: This is a small, compressed manifest file
index.html: This file is similar to our index file in development mode, except for the line:

<script id="microloader" type="text/javascript" src="bootstrap.js"></script>

This was replaced by some compressed JavaScript code, which acts in a similar way to the microloader. Notice that the serverside folder, where we use some JSON files (in other cases it could be PHP, ASP, and so on), does not exist in the production folder. The reason is that that folder is not part of what Sencha CMD and the build files consider. Normally, many developers will say, "Hey, let's copy the folder and let's move on."
However, the good news is that we can include that folder with an Apache Ant task.

Customizing the build.xml file

We can add custom code (Apache Ant style) to perform new tasks and things we need in order to make our application build even better. Let's open the build.xml file. You will see something like this:

<?xml version="1.0" encoding="utf-8"?>
<project name="myApp" default=".help">
<!-- comments... -->
<import file="${basedir}/.sencha/app/build-impl.xml"/>
<!-- comments... -->
</project>

So, let's place the following code before </project>:

<target name="-after-build" depends="init">
  <copy todir="${build.out.base.path}/serverside" overwrite="false">
    <fileset dir="${app.dir}/serverside" includes="**/*"/>
  </copy>
</target>
</project>

This new code inside the build.xml file establishes that, after the whole build process, if there is no error during the init process, then it will copy the ${app.dir}/serverside folder to the ${build.out.base.path}/serverside output path. So now, let's type the command for building the application again:

sencha app build -c

In this case, we added -c to first clean the build/production folder and create a new set of files. After the process completes, take a look at the folder contents, and you will notice that the serverside folder has now been copied to the production build folder, thanks to the custom code we placed in the build.xml file.

Compressing the code

After building our application, let's open the app.js file. By default, the build process uses the YUI compressor to compact the JS code (http://yui.github.io/yuicompressor/). Inside the .sencha folder there are many files and, depending on the type of build we are creating, there are files such as the base file, where the properties are defined in defaults.properties. This file must not be changed whatsoever; for that, we have other files that can override the values defined in it. As an example, for the production build, we have the following files:

production.defaults.properties: This file contains some properties/variables that will be used for the production build.
production.properties: This file has only comments. The idea behind this file is that developers place the variables they want in order to customize the production build.

By default, in the production.defaults.properties file, you will see something like the following code:

# Comments ......
# more comments......
build.options.logger=no
build.options.debug=false
# enable the full class system optimizer
app.output.js.optimize=true
build.optimize=${build.optimize.enable}
enable.cache.manifest=true
enable.resource.compression=true
build.embedded.microloader.compressor=-closure

Now, as an example of compression, let's make a change and place some variables inside the production.properties file. The code we place here will override the properties set in defaults.properties and production.defaults.properties. So, let's write the following code after the comments:

build.embedded.microloader.compressor=-closure
build.compression.yui=0
build.compression.closure=1
build.compression=-closure

With this code, we are setting up the build process to use Closure as the JavaScript compressor, and also for the microloader. Now save the file and use the Sencha CMD tool once again:

sencha app build

Wait for the process to end and take a look at app.js. You will notice that the code is quite different.
This is because the Closure compiler performed the compression. Run the app and you will notice no change in the behavior and use of the application. As we have used the production.properties file in this example, notice that in the .sencha folder we have some other files for different environments:

Testing: testing.defaults.properties and testing.properties
Development: development.defaults.properties and development.properties
Production: production.defaults.properties and production.properties

It's not recommended that you change the *.defaults.properties files. That's the purpose of the *.properties files: you can set your own variables there, and doing so will override the settings in the defaults files.

Packaging and deploying

Finally, after we have built our application, we have our production build/package ready to be deployed. Now we have all the files required to make our application work on a public server. We don't need to upload anything from the Ext JS folder because we have all that we need in app.js (all of the Ext JS code and our code). Also, the resources folder contains the images and CSS (the theme used in the app), and of course we have our serverside folder. So now, we need to upload all of the content to the server, and we are ready to test the production build on a public server.

Summary

In this article, you learned the reasons to consider using Ext JS 5 for developing projects. We briefly mentioned some of the significant additional features in version 5 that are instrumental in developing applications. Later, we talked about compiling and preparing an application for a production environment. Using Sencha CMD and configuring JSON or XML files to build a project can sometimes be overwhelming, but don't panic! Check out the documentation of Sencha and Apache. Do remember that there's no reason to be afraid of testing and playing with the configurations. It's all part of learning and knowing how to use Sencha Ext JS.

A Typical JavaScript Project

Packt
11 Aug 2015
29 min read
In this article by Phillip Fehre, author of the book JavaScript Domain-Driven Design, we will explore a practical approach to developing software with advanced business logic. There are many strategies to keep development flowing and the code and thoughts organized: there are frameworks building on conventions, different software paradigms such as object orientation and functional programming, and methodologies such as test-driven development. All these pieces solve problems and are like tools in a toolbox to help manage growing complexity in software, but they also mean that today, when starting something new, there are loads of decisions to make even before we get started at all. Do we want to develop a single-page application? Do we want to follow the standards of a framework closely, or do we want to set our own? These kinds of decisions are important, but they also largely depend on the context of the application, and in most cases the best answer to the questions is: it depends.

So, how do we really start? Do we even know what our problem is, and, if we understand it, does this understanding match that of others? Developers are very seldom the domain experts on a given topic. Therefore, the development process needs outside input from experts of the business domain when it comes to specifying the behavior a system should have. Of course, this is not only true for a completely new project developed from the ground up, but also applies to any new feature added during the development of an application or product. So, even if your project is well on its way already, there will come a time when a new feature just seems to bog the whole thing down and, at this stage, you may want to think about alternative ways of approaching this new piece of functionality. Domain-driven design gives us another useful piece to play with, especially to address the need to interact with other developers, business experts, and product owners.

In the modern era, JavaScript has become a more and more persuasive choice to build projects in and, in many cases, such as browser-based web applications, it actually is the only viable choice. Today, the need to design software with JavaScript is more pressing than ever. In the past, the issues of a more involved software design were focused on either backend or client application development; with the rise of JavaScript as a language to develop complete systems in, this has changed. The development of a JavaScript client in the browser is a complex part of developing the application as a whole, and so is the development of server-side JavaScript applications with the rise of Node.js. In modern development, JavaScript plays a major role and therefore needs to receive the same amount of attention in development practices and processes as other languages and frameworks have in the past. A browser-based client-side application often holds the same amount of logic as the backend, or even more. With this change, a lot of new problems and solutions have arisen, the first being the movement toward better encapsulation and modularization of JavaScript projects. New frameworks have arisen and established themselves as the bases for many projects. Last but not least, JavaScript made the jump from being the language in the browser to the server side, by means of Node.js, and it is also the query language of choice in some NoSQL databases.
Let me take you on a tour of developing a piece of software, taking you through the stages of creating an application from start to finish using the concepts domain-driven design introduces and how they can be interpreted and applied. In this article, you will cover:

The core idea of domain-driven design
Our business scenario—managing an orc dungeon
Tracking the business logic
Understanding the core problem and selecting the right solution
Learning what domain-driven design is

The core idea of domain-driven design

There are many software development methodologies around, all with pros and cons, but all also have a core idea, which is to be applied and understood to get the methodology right. For domain-driven design, the core lies in the realization that, since we are not the experts in the domain the software is placed in, we need to gather input from other people who are experts. This realization means that we need to optimize our development process to gather and incorporate this input.

So, what does this mean for JavaScript? When thinking about a browser application to expose a certain functionality to a consumer, we need to think about many things, for example:

How does the user expect the application to behave in the browser?
How does the business workflow work?
What does the user know about the workflow?

These three questions already involve three different types of experts: a person skilled in user experience can help with the first query, a business domain expert can address the second query, and a third person can research the target audience and provide input on the last query. Bringing all of this together is the goal we are trying to achieve. While the different types of people matter, the core idea is that the process of getting them involved is always the same. We provide a common way to talk about the process and establish a quick feedback loop for them to review. In JavaScript, this can be easier than in most other languages because it runs in a browser, readily available to be modified and prototyped with; an advantage Java enterprise applications can only dream of. We can work closely with the user experience designer, adjusting the expected interface, and at the same time change the workflow dynamically to suit our business needs, first on the frontend in the browser and later moving the knowledge out of the prototype to the backend, if necessary.

Managing an orc dungeon

Domain-driven design is often discussed in the context of having complex business logic to deal with. In fact, most software development practices are not really useful when dealing with a very small, cut-out problem. Like with every tool, you need to be clear about when it is the right time to use it. So, what really falls into the realm of complex business logic? It means that the software has to describe a real-world scenario, which normally involves human thinking and interaction. Writing software that deals with decisions which go a certain way 90 per cent of the time and some other way the remaining ten per cent is notoriously hard, especially when explaining it to people not familiar with software. These kinds of decisions are the core of many business problems, but even though this is an interesting problem to solve, following how the next accounting software is developed does not make an interesting read. With this in mind, I would like to introduce you to the problem we are trying to solve, that is, managing a dungeon.
Inside the dungeon

Running an orc dungeon seems pretty simple from the outside, but managing it without getting killed is actually rather complicated. For this reason, we are contacted by an orc master who struggles with keeping his dungeon running smoothly. When we arrive at the dungeon, he explains to us how it actually works and what factors come into play.

Even greenfield projects often have some status quo that works. This is important to keep in mind, since it means that we don't have to come up with the feature set, but match the feature set of the current reality.

Many outside factors play a role, and the dungeon is not as independent as it would like to be. After all, it is part of the orc kingdom, and the king demands that his dungeons make him money. However, money is just part of the deal. How does it actually make money? The prisoners need to mine gold, and to do that there needs to be a certain number of prisoners kept in the dungeon. The way an orc kingdom is run also results in the constant arrival of new prisoners: new captures from war, those who couldn't afford their taxes, and so on. There always needs to be room for new prisoners. The good thing is that the dungeons are interconnected, and to achieve its goals a dungeon can rely on the others by requesting a prisoner transfer to either fill up free cells or get rid of overflowing prisoners in its cells. These options allow the dungeon masters to keep a close watch on the prisoners being kept and the amount of cell space available. Sending prisoners off to other dungeons as needed, and requesting new ones from other dungeons in case there is too much free cell space available, keeps the mining workforce at an optimal level for maximizing profit, while at the same time being ready to accommodate the eventual arrival of a high-value inmate sent directly to the dungeon. So far, the explanation is sound, but let's dig a little deeper and see what is going on.

Managing incoming prisoners

Prisoners can arrive for a couple of reasons, for example when another dungeon is overflowing and decides to transfer some of its inmates to a dungeon with free cells; unless they flee on the way, they will eventually arrive at our dungeon sooner or later. Another source of prisoners is the ever-expanding orc kingdom itself. The orcs will constantly enslave new folk, and telling our king, "Sorry, we don't have room", is not a valid option; it might actually result in us being one of the new prisoners. Looking at this, our dungeon will fill up eventually, but we need to make sure this doesn't happen. The way to handle this is by transferring inmates early enough to make room. This is obviously going to be the most complicated thing; we need to weigh several factors to decide when and how many prisoners to transfer. The reason we can't simply solve this via thresholds is that, looking at the dungeon structure, transfers are not the only way we can lose inmates. After all, people are not always happy with being gold-mining slaves and may decide the risk of dying in a prison is as high as dying while fleeing, so some of them will try to escape. The same is true while prisoners are on the move between different dungeons, and escapes in transit are not unlikely. So even though we have a hard limit of physical cells, we need to deal with the soft numbers of incoming and outgoing prisoners. This is a classical problem in business software. Matching these numbers against each other and optimizing for a certain outcome is basically what computer data analysis is all about.
The current state of the art

With all this in mind, it becomes clear that the orc master's current system of keeping track via a badly written note on a napkin is not perfect. In fact, it has almost got him killed multiple times already. To give you an example of what can happen, he tells the story of how one time the king captured four clan leaders and wanted to make them miners just to humiliate them. However, when arriving at the dungeon, he realized that there was no room and had to travel to the next dungeon to drop them off, all while having them laugh at him because he obviously didn't know how to run a kingdom. This was due to our orc master having forgotten about the arrival of eight transfers just the day before. Another time, the orc master was not able to deliver any gold when the king's sheriff arrived, because he didn't know he only had one-third of the prisoners required to actually mine anything. This time it was due to having multiple people count the inmates and, instead of recording them cell by cell, they actually tried to do it in their heads. Being orcs, this is a setup for failure. All this comes down to bad organization, and having your system for managing dungeon inmates drawn on the back of a napkin certainly qualifies as such.

Digital dungeon management

Guided by the recent failures, the orc master has finally realized it is time to move to modern times, and he wants to revolutionize the way he manages his dungeon by making everything digital. He strives to have a system that basically takes the busywork out of managing by automatically calculating the necessary transfers according to the current number of cells filled. He would like to just sit back, relax, and let the computer do all the work for him.

A common pattern when talking with a business expert about software is that they are not aware of what can be done. Always remember that we, as developers, are the software experts and therefore are the only ones who are able to manage these expectations.

It is now time for us to think about what we need to know about the details and how to deal with the different scenarios. The orc master is not really familiar with the concepts of software development, so we need to make sure we talk in a language he can follow and understand, while making sure we get all the answers we need. We are hired for our expertise in software development, so we need to make sure to manage the expectations as well as the feature set and development flow. The development itself is of course going to be an iterative process, since we can't expect to get a list of everything needed right in one go. It also means that we will need to keep possible changes in mind. This is an essential part of structuring complex business software. Software containing more complex business logic is prone to change rapidly as the business adapts and the users leverage the functionality the software provides. Therefore, it is essential to keep a common language between the people who understand the business and the developers who understand the software.

Incorporate the business terms wherever possible; it will ease communication between the business domain experts and you as a developer and therefore prevent misunderstandings early on.

Specification

A good way to understand what a piece of software needs to do, at least to be useful in the best way, is to get an understanding of what the future users were doing before your software existed.
Therefore, we sit down with the orc master as he is managing his incoming and outgoing prisoners, and let him walk us through what he is doing on a day-to-day basis. The dungeon comprises 100 cells that are either occupied by a prisoner or empty at the moment. When managing these cells, we can identify distinct tasks by watching the orc do his job. Drawing out what we see, there are a couple of organizationally important events and states to be tracked:

Currently available or empty cells
Outgoing transfer states
Incoming transfer states

Each transfer can be in multiple states that the master has to know about to make further decisions on what to do next. Keeping a view of the world like this is not easy, especially accounting for the amount of concurrent updates happening. Tracking the state of everything results in further tasks for our master to do:

Update the tracking
Start outgoing transfers when too many cells are occupied
Respond to incoming transfers by starting to track them
Ask for incoming transfers if the number of occupied cells is too low

So, what does each of them involve?

Tracking available cells

The current state of the dungeon is reflected by the state of its cells, so the first task is to get this knowledge. In its basic form, this is easily achievable by simply counting every occupied and every empty cell and writing down the values. Right now, our orc master tours the dungeon in the morning, noting each free cell and assuming that every other cell must be occupied. To make sure he does not get into trouble, he no longer trusts his subordinates to do that! The problem is that there is only one central sheet to keep track of everything, so his keepers may overwrite each other's information accidentally if there is more than one person counting and writing down cells. Still, this is a good start and is sufficient as it is right now, although it misses some information that would be interesting to have, for example, the number of inmates fleeing the dungeon and an understanding of the expected free cells based on this rate. For us, this means that we need to be able to track this information inside the application, since ultimately we want to project the expected number of free cells so that we can effectively create recommendations or warnings based on the dungeon state.

Starting outgoing transfers

The second part is to actually handle getting rid of prisoners in case the dungeon fills up. In this concrete case, this means that if the number of free cells drops beneath 10, it is time to move prisoners out, since there may be new prisoners coming at any time. This strategy works pretty reliably since, from experience, it has been established that there are hardly any larger transports, so the recommendation is to stick with it in the beginning. However, we can already see some optimizations which are currently too complex to implement.

Drawing from the experience of the business is important, as encoding such knowledge reduces mistakes, but be mindful, since encoding detailed experience is probably one of the most complex things to do.

In the future, we want to optimize this based on the rate of inmates fleeing the dungeon and of new prisoners arriving due to being captured, as well as the projection of new arrivals from transfers.
All this is impossible right now, since it would just overwhelm the current tracking system, but it actually comes down to capturing as much data as possible and analyzing it, which is something modern computer systems are good at. After all, it could save the orc master's head!

Tracking the state of incoming transfers

On some days, a raven will arrive bringing news that some prisoners have been sent on their way to be transferred to our dungeon. There really is nothing we can do about it, but the protocol is to send the raven out five days prior to the prisoners actually arriving, to give the dungeon a chance to prepare. Should prisoners flee along the way, another raven will be sent informing the dungeon of this embarrassing situation. These messages have to be sifted through every day, to make sure there actually is room available for those arriving. This is a big part of projecting the number of filled cells, and also the most variable part, we are told. It is important to note that every message should only be processed once, but it can arrive at any time during the day. Right now, the messages are all dealt with by one orc, who throws them out immediately after noting what their content amounts to. One problem with the current system is that, since other dungeons are managed the same way ours currently is, they react with quick and large transfers when they get in trouble, which makes this quite unpredictable.

Initiating incoming transfers

Besides keeping the prisoners where they belong, mining gold is the second major goal of the dungeon. To do this, there needs to be a certain number of prisoners available to man the machines, otherwise production will essentially halt. This means that whenever too many cells become abandoned, it is time to fill them, so the orc master sends a raven to request new prisoners. This again takes five days and, unless they flee along the way, works reliably. In the past, it has still been a major problem for the dungeon due to the long delay. If the number of filled cells drops below 50, the dungeon will no longer produce any gold, and not making money is a reason to replace the current dungeon master. If all the orc master does is react to the situation, there will probably be about five days in which no gold is mined. This is one of the major pain points in the current system, because projecting the number of filled cells five days out seems rather impossible, so all the orcs can do right now is react.

All in all, this gives us a rough idea of what the dungeon master is looking for and which tasks need to be accomplished to replace the current system. Of course, this does not have to happen in one go, but can be done gradually so everybody adjusts. Right now, it is time for us to identify where to start.

From greenfield to application

We are JavaScript developers, so it seems obvious for us to build a web application to implement this. As the problem is described, starting out simply and growing the application as we further analyze the situation is clearly the way to go. Right now, we don't really have a clear understanding of how some parts should be handled, since the business process has not evolved to this level yet. Also, it is possible that new features will arise or things will start being handled differently as our software begins to get used. The steps described leave room for optimization based on collected data, so we first need the data to see how predictions can work.
This means that we need to start by tracking as many events as possible in the dungeon. Running down the list, the first step is always to get a view of which state we are in; this means tracking the available cells and providing an interface for this. To start out, this can be done via a counter, but this can't be our final solution. So, we then need to grow toward tracking events and summing those up to be able to make predictions for the future.

The first route and model

Of course there are many ways to get started, but what it boils down to in most cases is that it is now time to choose the base to build on. By this I mean deciding on a framework or set of libraries to build upon. This happens alongside the decision on what database is used to back our application and many other small decisions, which are influenced by those choices around the framework and libraries. A clear understanding of how the frontend should be built is important as well, since a single-page application, which implements a large amount of logic in the frontend and is backed by an API layer, differs a lot from an application which implements most logic on the server side.

Don't worry if you are unfamiliar with express or any other technology used in the following. You don't need to understand every single detail, but you will get the idea of how developing an application with a framework is achieved.

Since we don't have a clear understanding yet of which way the application will ultimately take, we try to push as many decisions as possible out, but decide on the stuff we immediately need. As we are developing in JavaScript, the application is going to be developed in Node.js, and express is going to be our framework of choice. To make our life easier, we first decide that we are going to implement the frontend in plain HTML using EJS embedded JavaScript templates, since this will keep the logic in one place. This seems sensible, since spreading the logic of a complex application across multiple layers will complicate things even further. Also, avoiding the errors that can creep in during transport will ease our way toward a solid application in the beginning. We can push the decision about the database out and work with simple objects stored in RAM for our first prototype; this is, of course, no long-term solution, but we can at least validate some structure before we need to decide on another major piece of software, which brings along a lot of expectations as well. With all this in mind, we set up the application.

In the following section and throughout the book, we are using Node.js to build a small backend. At the time of writing, the currently active version was Node.js 0.10.33. Node.js can be obtained from http://nodejs.org/ and is available for Windows, Mac OS X, and Linux. The foundation for our web application is provided by express, available via the Node Package Manager (NPM), at the time of writing in version 3.0.3:

$ npm install -g express
$ express --ejs inmatr

For the sake of brevity, the glue code in the following is omitted but, like all other code presented in the book, it is available on the GitHub repository https://github.com/sideshowcoder/ddd-js-sample-code.

Creating the model

The most basic parts of the application are set up now; a rough sketch of the generated skeleton is shown below.
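For orientation only, the app.js generated for an express 3.x application looks roughly like the following trimmed-down sketch; the exact contents depend on the generator version, so treat this as an approximation rather than the file the book produces:

// app.js (abridged) – wires up the express 3.x application
var express = require('express');
var http = require('http');
var path = require('path');

var app = express();

app.set('port', process.env.PORT || 3000);
app.set('views', path.join(__dirname, 'views'));
app.set('view engine', 'ejs');          // the --ejs flag selects EJS templates
app.use(express.bodyParser());          // parses the POST body of our form
app.use(app.router);                    // dispatches requests to the routes we define
app.use(express.static(path.join(__dirname, 'public')));

http.createServer(app).listen(app.get('port'), function () {
  console.log('Express server listening on port ' + app.get('port'));
});

The dungeon-loading middleware created below would typically be registered with app.use before app.router, so that it runs for every request ahead of the route handlers.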
We can move on to creating our dungeon model in models/dungeon.js and add the following code to it to keep a model and its loading and saving logic:

var Dungeon = function(cells) {
  this.cells = cells
  this.bookedCells = 0
}

Keeping in mind that this will eventually be stored in a database, we also need to be able to find a dungeon in some way, so a find method seems reasonable. This method should already adhere to the Node.js callback style to make our lives easier when switching to a real database. Even though we pushed this decision out, the assumption is clear: even if we decide against a database, the dungeon reference will be stored and requested from outside the process in the future. The following shows an example with the find method:

var dungeons = {}
Dungeon.find = function(id, callback) {
  if(!dungeons[id]) {
    dungeons[id] = new Dungeon(100)
  }
  callback(null, dungeons[id])
}

The first route and loading the dungeon

Now that we have this in place, we can move on to actually reacting to requests. In express, this is done by defining the needed routes. Since we need to make sure we have our current dungeon available, we also use middleware to load it when a request comes in. Using the methods we just created, we can add a middleware to the express stack to load the dungeon whenever a request arrives. A middleware is a piece of code that gets executed whenever a request reaches its level of the stack; for example, the router used to dispatch requests to defined functions is implemented as a middleware, as is logging and so on. This is a common pattern for many other kinds of interactions as well, such as user login. Assuming for now that we only manage one dungeon, we can create our dungeon loading middleware by adding a file in middleware/load_context.js with the following code:

function(req, res, next) {
  req.context = req.context || {}
  Dungeon.find('main', function(err, dungeon) {
    req.context.dungeon = dungeon
    next()
  })
}

Displaying the page

With this, we are now able to simply display information about the dungeon and track any changes made to it inside the request. Creating a view to render the state, as well as a form to modify it, are the essential parts of our GUI. Since we decided to implement the logic server side, they are rather barebones. Creating a view under views/index.ejs allows us to render everything to the browser via express later. The following example is the HTML code for the frontend:

<h1>Inmatr</h1>
<p>You currently have <%= dungeon.free %> of <%= dungeon.cells %> cells available.</p>

<form action="/cells/book" method="post">
  <select name="cells">
    <% for(var i = 1; i < 11; i++) { %>
    <option value="<%= i %>"><%= i %></option>
    <% } %>
  </select>
  <button type="submit" name="book" value="book">Book cells</button>
  <button type="submit" name="free" value="free">Free cells</button>
</form>

Gluing the application together via express

Now we are almost done: we have a display for the state, a model to track what is changing, and a middleware to load this model as needed. To glue it all together, we will use express to register our routes and call the necessary functions. We mainly need two routes: one to display the page and one to accept and process the form input. Displaying the page is done when a user hits the index page, so we need to bind to the root path. Accepting the form input is already declared in the form itself as /cells/book, so we can just create a route for it.
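One detail is worth noting before wiring up the routes: the view above reads dungeon.free, and the handlers that follow call dungeon.book and dungeon.unbook, none of which appear in the model shown earlier. A minimal implementation could look like the following; treat it as an assumption that stays consistent with the cells and bookedCells fields rather than as the book's exact code:

Dungeon.prototype.book = function(cells) {
  // reserve the requested number of cells, capped at the dungeon's capacity
  this.bookedCells = Math.min(this.cells, this.bookedCells + cells)
}

Dungeon.prototype.unbook = function(cells) {
  // release previously booked cells, never dropping below zero
  this.bookedCells = Math.max(0, this.bookedCells - cells)
}

// the index view reads dungeon.free, so expose the remaining capacity
Object.defineProperty(Dungeon.prototype, 'free', {
  get: function() { return this.cells - this.bookedCells }
})

With helpers along these lines in place, the view and the route handlers share a single notion of how many cells are still available.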
In express, we define routes in relation to the main app object and according to the HTTP verbs, as follows:

app.get('/', routes.index)
app.post('/cells/book', routes.cells.book)

Adding this to the main app.js file allows express to wire things up. The routes themselves are implemented as follows in the routes/index.js file:

var routes = {
  index: function(req, res) {
    res.render('index', req.context)
  },

  cells: {
    book: function(req, res) {
      var dungeon = req.context.dungeon
      var cells = parseInt(req.body.cells)
      if (req.body.book) {
        dungeon.book(cells)
      } else {
        dungeon.unbook(cells)
      }
      res.redirect('/')
    }
  }
}

With this done, we have a working application to track free and used cells.

Moving the application forward

This is only the first step toward the application that will hopefully automate what is currently done by hand. With the first start in place, it is now time to make sure we can move the application along. We have to think about what this application is supposed to do and identify the next steps. After presenting the current state back to the business, the next request is most likely to be to integrate some kind of login, since it should not be possible to modify the state of the dungeon unless you are authorized to do so. Since this is a web application, most people are familiar with web applications having a login. This moves us into a complicated space in which we need to start specifying the roles in the application along with their access patterns, so it is not clear whether this is the way to go.

Another route to take is to start moving the application towards tracking events instead of the pure number of free cells. From a developer's point of view, this is probably the most interesting route, but the immediate business value might be hard to justify, since without the login it seems unusable. We would need to create an endpoint to record events such as a fleeing prisoner, and then modify the state of the dungeon according to those tracked events. This is based on the assumption that the highest value of the application will lie in the prediction of prisoner movement. When we want to track free cells in such a way, we will need to modify the way the first version of our application works. The logic for deciding which events need to be created will have to move somewhere, most logically the frontend, and the dungeon will no longer be the single source of truth for the dungeon state. Rather, it will be an aggregator of the state, which is modified by the generation of events.

Thinking about the application in such a way makes some things clear. We are not completely sure what the value proposition of the application will ultimately be. This leads us down a dangerous path, since the design decisions we make now will impact how we build new features into the application. This is also a problem in case our assumption about the main value proposition turns out to be wrong. In that case, we may have built quite a complex event tracking system that does not really solve the problem but complicates things. Every state modification needs to be transformed into a series of events where a simple state update on an object may have been enough. Not only does this design not solve the real problem, explaining it to the orc master is also tough. There are certain abstractions missing, and the communication is not following a pattern established as the business language.
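To make the weight of that trade-off concrete, here is a rough sketch of what even a small event-tracking version of the dungeon could look like. The event names and methods are hypothetical, chosen only to illustrate the aggregation idea discussed above, and are not code from the book:

// hypothetical sketch: the dungeon as an aggregator over tracked events
var DungeonEvents = function(cells) {
  this.cells = cells
  this.events = [] // e.g. { type: 'prisoner captured' } or { type: 'prisoner fled' }
}

DungeonEvents.prototype.record = function(event) {
  this.events.push(event)
}

DungeonEvents.prototype.occupiedCells = function() {
  // the current state is no longer stored directly, it is summed up from the events
  return this.events.reduce(function(occupied, event) {
    if (event.type === 'prisoner captured') return occupied + 1
    if (event.type === 'prisoner fled') return occupied - 1
    return occupied
  }, 0)
}

Every form submission would now have to be translated into events like these before the state can even be read back, which is exactly the kind of ceremony that is hard to justify while the value proposition is still unclear.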
We need an alternative approach to keep the business more involved. Also, we need to keep development simple using abstraction on the business logic and not on the technologies, which are provided by the frameworks that are used. Summary In this article you were introduced to a typical business application and how it is developed. It showed how domain-driven design can help steer clear of common issues during the development to create a more problem-tailored application. Resources for Article: Further resources on this subject: An Introduction to Mastering JavaScript Promises and Its Implementation in Angular.js [article] Developing a JavaFX Application for iOS [article] Object-Oriented JavaScript with Backbone Classes [article]

Getting Started with Java Driver for MongoDB

Packt
11 Aug 2015
8 min read
In this article by Francesco Marchioni, author of the book MongoDB for Java Developers, you will be able to perform all the create/read/update/delete (CRUD) operations that we have so far accomplished using the mongo shell. (For more resources related to this topic, see here.) Querying data We will now see how to use the Java API to query for your documents. Querying for documents with MongoDB resembles JDBC queries; the main difference is that the returned object is a com.mongodb.DBCursor class, which is an iterator over the database result. In the following example, we are iterating over the javastuff collection that should contain the documents inserted so far: package com.packtpub.mongo.chapter2;   import com.mongodb.DB; import com.mongodb.DBCollection; import com.mongodb.DBCursor; import com.mongodb.DBObject; import com.mongodb.MongoClient;   public class SampleQuery{   private final static String HOST = "localhost"; private final static int PORT = 27017;   public static void main( String args[] ){    try{      MongoClient mongoClient = new MongoClient( HOST,PORT );           DB db = mongoClient.getDB( "sampledb" );           DBCollection coll = db.getCollection("javastuff");      DBCursor cursor = coll.find();      try {        while(cursor.hasNext()) {          DBObject object = cursor.next();          System.out.println(object);        }      }      finally {        cursor.close();      }       }    catch(Exception e) {      System.err.println( e.getClass().getName() + ": " +       e.getMessage() );    } } } Depending on the documents you have inserted, the output could be something like this: { "_id" : { "$oid" : "5513f8836c0df1301685315b"} , "name" : "john" , "age" : 35 , "kids" : [ { "name" : "mike"} , { "name" : "faye"}] , "info" : { "email" : "john@mail.com" , "phone" : "876-134-667"}} . . . . Restricting the search to the first document The find operator executed without any parameter returns the full cursor of a collection; pretty much like the SELECT * query in relational DB terms. If you are interested in reading just the first document in the collection, you could use the findOne() operation to get the first document in the collection. This method returns a single document (instead of the DBCursor that the find() operation returns). As you can see, the findOne() operator directly returns a DBObject instead of a com.mongodb.DBCursor class: DBObject myDoc = coll.findOne(); System.out.println(myDoc); Querying the number of documents in a collection Another typical construct that you probably know from the SQL is the SELECT count(*) query that is useful to retrieve the number of records in a table. In MongoDB terms, you can get this value simply by invoking the getCount against a DBCollection class: DBCollection coll = db.getCollection("javastuff"); System.out.println(coll.getCount()); As an alternative, you could execute the count() method over the DBCursor object: DBCursor cursor = coll.find(); System.out.println(cursor.count()); Eager fetching of data using DBCursor When find is executed and a DBCursor is executed you have a pointer to a database document. This means that the documents are fetched in the memory as you call next() method on the DBCursor. On the other hand, you can eagerly load all the data into the memory by executing the toArray() method, which returns a java.util.List structure: List list = collection.find( query ).toArray(); The problem with this approach is that you could potentially fill up the memory with lots of documents, which are eagerly loaded. 
You are therefore advised to include operators such as skip() and limit() to control the amount of data loaded into memory:

List list = collection.find( query ).skip( 100 ).limit( 10 ).toArray();

Just like you learned from the mongo shell, the skip operator can be used as an initial offset into your cursor, whilst the limit construct loads only the first n occurrences from the cursor.

Filtering through the records

Typically, you will not need to fetch the whole set of documents in a collection. So, just like SQL uses WHERE conditions to filter records, in MongoDB you can restrict searches by creating a BasicDBObject and passing it to the find function as an argument. See the following example:

DBCollection coll = db.getCollection("javastuff");

DBObject query = new BasicDBObject("name", "owen");

DBCursor cursor = coll.find(query);

try {
  while(cursor.hasNext()) {
    System.out.println(cursor.next());
  }
} finally {
  cursor.close();
}

In the preceding example, we retrieve the documents in the javastuff collection whose name key equals owen. That's the equivalent of an SQL query like this:

SELECT * FROM javastuff WHERE name='owen'

Building more complex searches

As your collections keep growing, you will need to be more selective with your searches. For example, you could include multiple keys in your BasicDBObject that will eventually be passed to find. We can then apply the same functions in our queries. For example, here is how to find documents whose name does not equal ($ne) frank and whose age is greater than 10:

DBCollection coll = db.getCollection("javastuff");

DBObject query = new BasicDBObject("name", new BasicDBObject("$ne", "frank")).append("age", new BasicDBObject("$gt", 10));

DBCursor cursor = coll.find(query);

Updating documents

Having learned about create and read, we are halfway through our CRUD track. The next operation you will learn is update. The DBCollection class contains an update method that can be used for this purpose. Let's say we have the following document:

> db.javastuff.find({"name":"frank"}).pretty()
{
  "_id" : ObjectId("55142c27627b27560bd365b1"),
  "name" : "frank",
  "age" : 31,
  "info" : {
    "email" : "frank@mail.com",
    "phone" : "222-111-444"
  }
}

Now we want to change the age value for this document by setting it to 23:

DBCollection coll = db.getCollection("javastuff");

DBObject newDocument = new BasicDBObject();
newDocument.put("age", 23);

DBObject searchQuery = new BasicDBObject().append("name", "frank");

coll.update(searchQuery, newDocument);

You might think that would do the trick, but wait! Let's have a look at our document using the mongo shell:

> db.javastuff.find({"age":23}).pretty()

{ "_id" : ObjectId("55142c27627b27560bd365b1"), "age" : 23 }

As you can see, the update statement has replaced the original document with another one, containing only the keys and values we passed to the update. In most cases, this is not what we want to achieve. If we want to update a particular value, we have to use the $set update modifier.
DBCollection coll = db.getCollection("javastuff");

BasicDBObject newDocument = new BasicDBObject();
newDocument.append("$set", new BasicDBObject().append("age", 23));

BasicDBObject searchQuery = new BasicDBObject().append("name", "frank");

coll.update(searchQuery, newDocument);

So, supposing we restored the initial document with all its fields, this is the outcome of the update using the $set update modifier:

> db.javastuff.find({"age":23}).pretty()
{
  "_id" : ObjectId("5514326e627b383428c2ccd8"),
  "name" : "frank",
  "age" : 23,
  "info" : {
    "email" : "frank@mail.com",
    "phone" : "222-111-444"
  }
}

Please note that the DBCollection class also overloads the update method as update(DBObject q, DBObject o, boolean upsert, boolean multi). The upsert parameter determines whether the database should create the element if it does not exist. The multi parameter causes the update to be applied to all matching objects.

Deleting documents

Documents are deleted with the remove method, which comes in several variants. In its simplest form, when executed over a single returned document, it will remove it:

MongoClient mongoClient = new MongoClient("localhost", 27017);

DB db = mongoClient.getDB("sampledb");
DBCollection coll = db.getCollection("javastuff");
DBObject doc = coll.findOne();

coll.remove(doc);

Most of the time you will need to filter the documents to be deleted. Here is how to delete the document whose name key equals frank:

DBObject document = new BasicDBObject();
document.put("name", "frank");

coll.remove(document);

Deleting a set of documents

Bulk deletion of documents can be achieved by including the keys in a List and building an $in modifier expression that uses this list. Let's see, for example, how to delete all records whose age ranges from 0 to 49:

BasicDBObject deleteQuery = new BasicDBObject();
List<Integer> list = new ArrayList<Integer>();

for (int i=0;i<50;i++) list.add(i);

deleteQuery.put("age", new BasicDBObject("$in", list));
coll.remove(deleteQuery);

Summary

In this article, we covered how to perform the same operations that are available in the mongo shell.

Resources for Article:

Further resources on this subject: MongoDB data modeling [article] Apache Solr and Big Data – integration with MongoDB [article] About MongoDB [article]

Storage Scalability

Packt
11 Aug 2015
17 min read
In this article by Victor Wu and Eagle Huang, authors of the book, Mastering VMware vSphere Storage, we will learn that, SAN storage is a key component of a VMware vSphere environment. We can choose different vendors and types of SAN storage to deploy on a VMware Sphere environment. The advanced settings of each storage can affect the performance of the virtual machine, for example, FC or iSCSI SAN storage. It has a different configuration in a VMware vSphere environment. Host connectivity of Fibre Channel storage is accessed by Host Bus Adapter (HBA). Host connectivity of iSCSI storage is accessed by the TCP/IP networking protocol. We first need to know the concept of storage. Then we can optimize the performance of storage in a VMware vSphere environment. In this article, you will learn these topics: What the vSphere storage APIs for Array Integration (VAAI) and Storage Awareness (VASA) are The virtual machine storage profile VMware vSphere Storage DRS and VMware vSphere Storage I/O Control (For more resources related to this topic, see here.) vSphere storage APIs for array integration and storage awareness VMware vMotion is a key feature in vSphere hosts. An ESXi host cannot provide the vMotion feature if it is without shared SAN storage. SAN storage is a key component in a VMware vSphere environment. In large-scale virtualization environments, there are many virtual machines stored in SAN storage. When a VMware administrator executes virtual machine cloning or migrates a virtual machine to another ESXi host by vMotion, this operation allocates the resource on that ESXi host and SAN storage. In vSphere 4.1 and later versions, it can support VAAI. The vSphere storage API is used by a storage vendor who provides hardware acceleration or offloads vSphere I/O between storage devices. These APIs can reduce the resource overhead on ESXi hosts and improve performance for ESXi host operations, for example, vMotion, virtual machine cloning, creating a virtual machine, and so on. VAAI has two APIs: the hardware acceleration API and the array thin provisioning API. The hardware acceleration API is used to integrate with VMware vSphere to offload storage operations to the array and reduce the CPU overload on the ESXi host. The following table lists the features of the hardware acceleration API for block and NAS: Array integration Features Description Block Fully copy This blocks clone or copy offloading. Block zeroing This is also called write same. When you provision an eagerzeroedthick VMDK, the SCSI command is issued to write zeroes to disks. Atomic Test & Set (ATS) This is a lock mechanism that prevents the other ESXi host from updating the same VMFS metadata. NAS Full file clone This is similar to Extended Copy (XCOPY) hardware acceleration. Extended statistics This feature is enabled in space usage in the NAS data store. Reserved space The allocated space of virtual disk in thick format. The array thin provisioning API is used to monitor the ESXi data store space on the storage arrays. It helps prevent the disk from running out of space and reclaims disk space. For example, if the storage is assigned as 1 x 3 TB LUN in the ESXi host, but the storage can only provide 2 TB of data storage space, it is considered to be 3 TB in the ESXi host. Streamline its monitoring LUN configuration space in order to avoid running out of physical space. When vSphere administrators delete or remove files from the data store that is provisioned LUN, the storage can reclaim free space in the block level. 
In vSphere 4.1 or later, it can support VAAI features. In vSphere 5.5, you can reclaim the space on thin provisioned LUN using esxcli. VMware VASA is a piece of software that allows the storage vendor to provide information about their storage array to VMware vCenter Server. The information includes storage capability, the state of physical storage devices, and so on. vCenter Server collects this information from the storage array using a software component called VASA provider, which is provided by the storage array vendor. A VMware administrator can view the information in VMware vSphere Client / VMware vSphere Web Client. The following diagram shows the architecture of VASA with vCenter Server. For example, the VMware administrator requests to create a 1 x data store in VMware ESXi Server. It has three main components: the storage array, the storage provider and VMware vCenter Server. The following is the procedure to add the storage provider to vCenter Server: Log in to vCenter by vSphere Client. Go to Home | Storage Providers. Click on the Add button. Input information about the storage vendor name, URL, and credentials. Virtual machine storage profile The storage provider can help the vSphere administrator know the state of the physical storage devices and the capabilities on which their virtual machines are located. It also helps choose the correct storage in terms of performance and space by using virtual machine storage policies. A virtual machine storage policy helps you ensure that a virtual machine guarantees a specified level of performance or capacity of storage, for example, the SSD/SAS/NL-SAS data store, spindle I/O, and redundancy. Before you define a storage policy, you need to specify the storage requirement for your application that runs on the virtual machine. It has two types of storage requirement, which is storage-vendor-specific storage capability and user-defined storage capability. Storage-vendor-specific storage capability comes from the storage array. The storage vendor provider informs vCenter Server that it can guarantee the use of storage features by using storage-vendor-specific storage capability. vCenter Server assigns vendor-specific storage capability to each ESXi data store. User-defined storage capability is the one that you can define and assign storage profile to each ESXi datastore. In vSphere 5.1/5.5, the name of the storage policy is VM storage profile. Virtual machine storage policies can include one or more storage capabilities and assign to one or more VM. The virtual machine can be checked for storage compliance if it is placed on compliant storage. When you migrate, create, or clone a virtual machine, you can select the storage policy and apply it to that machine. The following procedure shows how to create a storage policy and apply it to a virtual machine in vSphere 5.1 using user-defined storage capability: The vSphere ESXi host requires the license edition of Enterprise Plus to enable the VM storage profile feature. The following procedure is adding the storage profile into vCenter Server: Log in to vCenter Server using vSphere Client. Click on the Home button in the top bar, and choose the VM Storage Profiles button under Management. Click on the Manage Storage Capabilities button to create user-defined storage capability. Click on the Add button to create the name of the storage capacity, for example, SSD Storage, SAS Storage, or NL-SAS Storage. Then click on the Close button. 
Click on the Create VM Storage Profile button to create the storage policy. Input the name of the VM storage profile, as shown in the following screenshot, and then click on the Next button to select the user-defined storage capability, which is defined in step 4. Click on the Finish button. Assign the user-defined storage capability to your specified ESXi data store. Right-click on the data store that you plan to assign the user-defined storage capability to. This capability is defined in step 4. After creating the VM storage profile, click on the Enable VM Storage Profiles button. Then click on the Enable button to enable the profiles. The following screenshot shows Enable VM Storage Profiles: After enabling the VM storage profile, you can see VM Storage Profile Status as Enabled and Licensing Status as Licensed, as shown in this screenshot: We have successfully created the VM storage profile. Now we have to associate the VM storage profile with a virtual machine. Right-click on a virtual machine that you plan to apply to the VM storage profile, choose VM Storage Profile, and then choose Manage Profiles. From the drop-down menu of VM Storage Profile select your profile. Then you can click on the Propagate to disks button to associate all virtual disks or decide which virtual disks you want to associate with that profile by setting manually. Click on OK. Finally, you need to check the compliance of VM Storage Profile on this virtual machine. Click on the Home button in the top bar. Then choose the VM Storage Profiles button under Management. Go to Virtual Machines and click on the Check Compliance Now button. The Compliance Status will display Compliant after compliance checking, as follows: Pluggable Storage Architecture (PSA) exists in the SCSI middle layer of the VMkernel storage stack. PSA is used to allow thirty-party storage vendors to use their failover and load balancing techniques for their specific storage array. A VMware ESXi host uses its multipathing plugin to control the ownership of the device path and LUN. The VMware default Multipathing Plugin (MPP) is called VMware Native Multipathing Plugin (NMP), which includes two subplugins as components: Storage Array Type Plugin (SATP) and Path Selection Plugin (PSP). SATP is used to handle path failover for a storage array, and PSP is used to issue an I/O request to a storage array. The following diagram shows the architecture of PSA: This table lists the operation tasks of PSA and NMP in the ESXi host:   PSA NMP Operation tasks Discovers the physical paths Manages the physical path Handles I/O requests to the physical HBA adapter and logical devices Creates, registers, and deregisters logical devices Uses predefined claim rules to control storage devices Selects an optimal physical path for the request The following is an example of operation of PSA in a VMkernel storage stack: The virtual machine sends out an I/O request to a logical device that is managed by the VMware NMP. The NMP requests the PSP to assign to this logical device. The PSP selects a suitable physical path to send the I/O request. When the I/O operation is completed successfully, the NMP reports that the I/O operation is complete. If the I/O operation reports an error, the NMP calls the SATP. The SATP fails over to the new active path. The PSP selects a new active path from all available paths and continues the I/O operation. The following diagram shows the operation of PSA: VMware vSphere provides three options for the path selection policy. 
These are Most Recently Used (MRU), Fixed, and Round Robin (RR). The following table lists the advantages and disadvantages of each path: Path selection Description Advantage Disadvantage MRU The ESXi host selects the first preferred path at system boot time. If this path becomes unavailable, the ESXi host changes to the other active path. You can select your preferred path manually in the ESXi host. The ESXi host does not revert to the original path when that l path becomes available again. Fixed You can select the preferred path manually. The ESXi host can revert to the original path when the preferred path becomes available again. If the ESXi host cannot select the preferred path, it selects an available preferred path randomly. RR The ESXi host uses automatic path selection. The storage I/O across all available paths and enable load balancing across all paths. The storage is required to support ALUA mode. You cannot know which path is preferred because the storage I/O across all available paths. The following is the procedure of changing the path selection policy in an ESXi host: Log in to vCenter Server using vSphere Client. Go to the configuration of your selected ESXi host, choose the data store that you want to configure, and click on the Properties… button. Click on the Manage Paths… button. Select the drop-down menu and click on the Change button. If you plan to deploy a third-party MPP on your ESXi host, you need to follow up the storage vendor's instructions for the installation, for example, EMC PowerPath/VE for VMware that it is a piece of path management software for VMware's vSphere server and Microsoft's Hyper-V server. It also can provide load balancing and path failover features. VMware vSphere Storage DRS VMware vSphere Storage DRS (SDRS) is the placement of virtual machines in an ESX's data store cluster. According to storage capacity and I/O latency, it is used by VMware storage vMotion to migrate the virtual machine to keep the ESX's data store in a balanced status that is used to aggregate storage resources, and enable the placement of the virtual disk (VMDK) of virtual machine and load balancing of existing workloads. What is a data store cluster? It is a collection of ESXi's data stores grouped together. The data store cluster is enabled for vSphere SDRS. SDRS can work in two modes: manual mode and fully automated mode. If you enable SDRS in your environment, when the vSphere administrator creates or migrates a virtual machine, SDRS places all the files (VMDK) of this virtual machine in the same data store or different a data store in the cluster, according to the SDRS affinity rules or anti-affinity rules. The VMware ESXi host cluster has two key features: VMware vSphere High Availability (HA) and VMware vSphere Distributed Resource Scheduler (DRS). SDRS is different from the host cluster DRS. The latter is used to balance the virtual machine across the ESXi host based on the memory and CPU usage. SDRS is used to balance the virtual machine across the SAN storage (ESX's data store) based on the storage capacity and IOPS. The following table lists the difference between SDRS affinity rules and anti-affinity rules: Name of SDRS rules Description VMDK affinity rules This is the default SDRS rule for all virtual machines. It keeps each virtual machine's VMDKs together on the same ESXi data store. VMDK anti-affinity rules Keep each virtual machine's VMDKs on different ESXi data stores. 
You can apply this rule into all virtual machine's VMDKs or to dedicated virtual machine's VMDKs. VM anti-affinity rules Keep the virtual machine on different ESXi data stores. This rule is similar to the ESX DRS anti-affinity rules. The following is the procedure to create a storage DRS in vSphere 5: Log in to vCenter Server using vSphere Client. Go to home and click on the Datastores and Datastore Clusters button. Right-click on the data center and choose New Datastore Cluster. Input the name of the SDRS and then click on the Next button. Choose Storage DRS mode, Manual Mode and Fully Automated Mode. Manual Mode: According to the placement and migration recommendation, the placement and migration of the virtual machine are executed manually by the user.Fully Automated Mode: Based on the runtime rules, the placement of the virtual machine is executed automatically. Set up SDRS Runtime Rules. Then click on the Next button. Enable I/O metric for SDRS recommendations is used to enable I/O load balancing. Utilized Space is the percentage of consumed space allowed before the storage DRS executes an action. I/O Latency is the percentage of consumed latency allowed before the storage DRS executes an action. This setting can execute only if the Enable I/O metric for SDRS recommendations checkbox is selected. No recommendations until utilization difference between source and destination is is used to configure the space utilization difference threshold. I/O imbalance threshold is used to define the aggressive of IOPs load balancing. This setting can execute only if the Enable I/O metric for SDRS recommendations checkbox is selected. Select the ESXi host that is required to create SDRS. Then click on the Next button. Select the data store that is required to join the data store cluster, and click on the Next button to complete. After creating SDRS, go to the vSphere Storage DRS panel on the Summary tab of the data store cluster. You can see that Storage DRS is Enabled. On the Storage DRS tab on the data store cluster, it displays the recommendation, placement, and reasons. Click on the Apply Recommendations button if you want to apply the recommendations. Click on the Run Storage DRS button if you want to refresh the recommendations. VMware vSphere Storage I/O Control What is VMware vSphere Storage I/O Control? It is used to control in order to share and limit the storage of I/O resources, for example, the IOPS. You can control the number of storage IOPs allocated to the virtual machine. If a certain virtual machine is required to get more storage I/O resources, vSphere Storage I/O Control can ensure that that virtual machine can get more storage I/O than other virtual machines. The following table shows example of the difference between vSphere Storage I/O Control enabled and without vSphere Storage I/O Control: In this diagram, the VMware ESXi Host Cluster does not have vSphere Storage I/O Control. VM 2 and VM 5 need to get more IOPs, but they can allocate only a small amount of I/O resources. On the contrary, VM 1 and VM 3 can allocate a large amount of I/O resources. Actually, both VMs are required to allocate a small amount of IOPs. In this case, it wastes and overprovisions the storage resources. In the diagram to the left, vSphere Storage I/O Control is enabled in the ESXi Host Cluster. VM 2 and VM 5 are required to get more IOPs. They can allocate a large amount of I/O resources after storage I/O control is enabled. 
VM 1, VM 3, and VM 4 are required to get a small amount of I/O resources, and now these three VMs allocate a small amount of IOPs. After enabling storage I/O control, it helps reduce waste and overprovisioning of the storage resources. When you enable VMware vSphere Storage DRS, vSphere Storage I/O Control is automatically enabled on the data stores in the data store cluster. The following is the procedure to be carried out to enable vSphere Storage I/O control on an ESXi data store, and set up storage I/O shares and limits using vSphere Client 5: Log in to vCenter Server using vSphere Client. Go to the Configuration tab of the ESXi host, select the data store, and then click on the Properties… button. Select Enabled under Storage I/O Control, and click on the Close button. After Storage I/O Control is enabled, you can set up the storage I/O shares and limits on the virtual machine. Right-click on the virtual machine and select Edit Settings. Click on the Resources tab in the virtual machine properties box, and select Disk. You can individually set each virtual disk's Shares and Limit field. By default, all virtual machine shares are set to Normal and with Unlimited IOPs. Summary In this article, you learned what VAAI and VASA are. In a vSphere environment, the vSphere administrator learned how to configure the storage profile in vCenter Server and assign to the ESXi data store. We covered the benefits of vSphere Storage I/O Control and vSphere Storage DRS. When you found that it has a storage performance problem in the vSphere host, we saw how to troubleshoot the performance problem, and found out the root cause. Resources for Article: Further resources on this subject: Essentials of VMware vSphere [Article] Introduction to vSphere Distributed switches [Article] Network Virtualization and vSphere [Article]

Achieving High-Availability on AWS Cloud

Packt
11 Aug 2015
18 min read
In this article, by Aurobindo Sarkar and Amit Shah, author of the book Learning AWS, we will introduce some key design principles and approaches to achieving high availability in your applications deployed on the AWS cloud. As a good practice, you want to ensure that your mission-critical applications are always available to serve your customers. The approaches in this article will address availability across the layers of your application architecture including availability aspects of key infrastructural components, ensuring there are no single points of failure. In order to address availability requirements, we will use the AWS infrastructure (Availability Zones and Regions), AWS Foundation Services (EC2 instances, Storage, Security and Access Control, Networking), and the AWS PaaS services (DynamoDB, RDS, CloudFormation, and so on). (For more resources related to this topic, see here.) Defining availability objectives Achieving high availability can be costly. Therefore, it is important to ensure that you align your application availability requirements with your business objectives. There are several options to achieve the level of availability that is right for your application. Hence, it is essential to start with a clearly defined set of availability objectives and then make the most prudent design choices to achieve those objectives at a reasonable cost. Typically, all systems and services do not need to achieve the highest levels of availability possible; at the same time ensure you do not introduce a single point of failure in your architecture through dependencies between your components. For example, a mobile taxi ordering service needs its ordering-related service to be highly available; however, a specific customer's travel history need not be addressed at the same level of availability. The best way to approach high availability design is to assume that anything can fail, at any time, and then consciously design against it. "Everything fails, all the time." - Werner Vogels, CTO, Amazon.com In other words, think in terms of availability for each and every component in your application and its environment because any given component can turn into a single point of failure for your entire application. Availability is something you should consider early on in your application design process, as it can be hard to retrofit it later. Key among these would be your database and application architecture (for example, RESTful architecture). In addition, it is important to understand that availability objectives can influence and/or impact your design, development, test, and running your system on the cloud. Finally, ensure you proactively test all your design assumptions and reduce uncertainty by injecting or forcing failures instead of waiting for random failures to occur. The nature of failures There are many types of failures that can happen at any time. These could be a result of disk failures, power outages, natural disasters, software errors, and human errors. In addition, there are several points of failure in any given cloud application. These would include DNS or domain services, load balancers, web and application servers, database servers, application services-related failures, and data center-related failures. You will need to ensure you have a mitigation strategy for each of these types and points of failure. It is highly recommended that you automate and implement detailed audit trails for your recovery strategy, and thoroughly test as many of these processes as possible. 
In the next few sections, we will discuss various strategies to achieve high availability for your application. Specifically, we will discuss the use of AWS features and services such as: VPC Amazon Route 53 Elastic Load Balancing, auto-scaling Redundancy Multi-AZ and multi-region deployments Setting up VPC for high availability Before setting up your VPC, you will need to carefully select your primary site and a disaster recovery (DR) site. Leverage AWS's global presence to select the best regions and availability zones to match your business objectives. The choice of a primary site is usually the closest region to the location of a majority of your customers and the DR site could be in the next closest region or in a different country depending on your specific requirements. Next, we need to set up the network topology, which essentially includes setting up the VPC and the appropriate subnets. The public facing servers are configured in a public subnet; whereas the database servers and other application servers hosting services such as the directory services will usually reside in the private subnets. Ensure you chose different sets of IP addresses across the different regions for the multi-region deployment, for example 10.0.0.0/16 for the primary region and 192.168.0.0/16 for the secondary region to avoid any IP addressing conflicts when these regions are connected via a VPN tunnel. Appropriate routing tables and ACLs will also need to be defined to ensure traffic can traverse between them. Cross-VPC connectivity is required so that data transfer can happen between the VPCs (say, from the private subnets in one region over to the other region). The secure VPN tunnels are basically IPSec tunnels powered by VPN appliances—a primary and a secondary tunnel should be defined (in case the primary IPSec tunnel fails). It is imperative you consult with your network specialists through all of these tasks. An ELB is configured in the primary region to route traffic across multiple availability zones; however, you need not necessarily commission the ELB for your secondary site at this time. This will help you avoid costs for the ELB in your DR or secondary site. However, always weigh these costs against the total cost/time required for recovery. It might be worthwhile to just commission the extra ELB and keep it running. Gateway servers and NAT will need to be configured as they act as gatekeepers for all inbound and outbound Internet access. Gateway servers are defined in the public subnet with appropriate licenses and keys to access your servers in the private subnet for server administration purposes. NAT is required for servers located in the private subnet to access the Internet and is typically used for automatic patch updates. Again, consult your network specialists for these tasks. Elastic load balancing and Amazon Route 53 are critical infrastructure components for scalable and highly available applications; we discuss these services in the next section. Using ELB and Route 53 for high availability In this section, we describe different levels of availability and the role ELBs and Route 53 play from an availability perspective. Instance availability The simplest guideline here is to never run a single instance in a production environment. The simplest approach to improving greatly from a single server scenario is to spin up multiple EC2 instances and stick an ELB in front of them. The incoming request load is shared by all the instances behind the load balancer. 
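As a rough sketch of this baseline, the following AWS CLI commands create a classic Elastic Load Balancer spanning two availability zones, register two instances behind it, and attach a simple HTTP health check. The balancer name, zones, instance IDs, and health check path are placeholders, so treat this as an illustration of the shape of the setup rather than a prescribed configuration:

aws elb create-load-balancer \
    --load-balancer-name web-frontend \
    --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80" \
    --availability-zones us-east-1a us-east-1b

aws elb register-instances-with-load-balancer \
    --load-balancer-name web-frontend \
    --instances i-0123abcd i-0456efgh

aws elb configure-health-check \
    --load-balancer-name web-frontend \
    --health-check "Target=HTTP:80/health,Interval=30,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=2"

The same result can be achieved from the console or through CloudFormation; the essential point is simply that more than one instance, in more than one zone, sits behind the balancer.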
ELB uses the least outstanding requests routing algorithm to spread HTTP/HTTPS requests across healthy instances. This algorithm favors instances with the fewest outstanding requests. Even though it is not recommended to have different instance sizes between or within the AZs, the ELB will adjust for the number of requests it sends to smaller or larger instances based on response times. In addition, ELBs use cross-zone load balancing to distribute traffic across all healthy instances regardless of AZs. Hence, ELBs help balance the request load even if there are unequal number of instances in different AZs at any given time (perhaps due to a failed instance in one of the AZs). There is no bandwidth charge for cross-zone traffic (if you are using an ELB). Instances that fail can be seamlessly replaced using auto scaling while other instances continue to operate. Though auto-replacement of instances works really well, storing application state or caching locally on your instances can be hard to detect problems. Instance failure is detected and the traffic is shifted to healthy instances, which then carries the additional load. Health checks are used to determine the health of the instances and the application. TCP and/or HTTP-based heartbeats can be created for this purpose. It is worthwhile implementing health checks iteratively to arrive at the right set that meets your goals. In addition, you can customize the frequency and the failure thresholds as well. Finally, if all your instances are down, then AWS will return a 503. Zonal availability or availability zone redundancy Availability zones are distinct geographical locations engineered to be insulated from failures in other zones. It is critically important to run your application stack in more than one zone to achieve high availability. However, be mindful of component level dependencies across zones and cross-zone service calls leading to substantial latencies in your application or application failures during availability zone failures. For sites with very high request loads, a 3-zone configuration might be the preferred configuration to handle zone-level failures. In this situation, if one zone goes down, then other two AZs can ensure continuing high availability and better customer experience. In the event of a zone failure, there are several challenges in a Multi-AZ configuration, resulting from the rapidly shifting traffic to the other AZs. In such situations, the load balancers need to expire connections quickly and lingering connections to caches must be addressed. In addition, careful configuration is required for smooth failover by ensuring all clusters are appropriately auto scaled, avoiding cross-zone calls in your services, and avoiding mismatched timeouts across your architecture. ELBs can be used to balance across multiple availability zones. Each load balancer will contain one or more DNS records. The DNS record will contain multiple IP addresses and DNS round-robin can be used to balance traffic between the availability zones. You can expect the DNS records to change over time. Using multiple AZs can result in traffic imbalances between AZs due to clients caching DNS records. However, ELBs can help reduce the impact of this caching. Regional availability or regional redundancy ELB and Amazon Route 53 have been integrated to support a single application across multiple regions. Route 53 is AWS's highly available and scalable DNS and health checking service. 
Route 53 supports high availability architectures by health checking load balancer nodes and rerouting traffic to avoid the failed nodes, and by supporting implementation of multi-region architectures. In addition, Route 53 uses Latency Based Routing (LBR) to route your customers to the endpoint that has the least latency. If multiple primary sites are implemented with appropriate health checks configured, then in cases of failure, traffic shifts away from that site to the next closest region. Region failures can present several challenges as a result of rapidly shifting traffic (similar to the case of zone failures). These can include auto scaling, time required for instance startup, and the cache fill time (as we might need to default to our data sources, initially). Another difficulty usually arises from the lack of information or clarity on what constitutes the minimal or critical stack required to keep the site functioning as normally as possible. For example, any or all services will need to be considered as critical in these circumstances. The health checks are essentially automated requests sent over the Internet to your application to verify that your application is reachable, available, and functional. This can include both your EC2 instances and your application. As answers are returned only for the resources that are healthy and reachable from the outside world, the end users can be routed away from a failed application. Amazon Route 53 health checks are conducted from within each AWS region to check whether your application is reachable from that location. The DNS failover is designed to be entirely automatic. After you have set up your DNS records and health checks, no manual intervention is required for failover. Ensure you create appropriate alerts to be notified when this happens. Typically, it takes about 2 to 3 minutes from the time of the failure to the point where traffic is routed to an alternate location. Compare this to the traditional process where an operator receives an alarm, manually configures the DNS update, and waits for the DNS changes to propagate. The failover happens entirely within the Amazon Route 53 data plane. Depending on your availability objectives, there is an additional strategy (using Route 53) that you might want to consider for your application. For example, you can create a backup static site to maintain a presence for your end customers while your primary dynamic site is down. In the normal course, Route 53 will point to your dynamic site and maintain health checks for it. Furthermore, you will need to configure Route 53 to point to the S3 storage, where your static site resides. If your primary site goes down, then traffic can be diverted to the static site (while you work to restore your primary site). You can also combine this static backup site strategy with a multiple region deployment. Setting up high availability for application and data layers In this section, we will discuss approaches for implementing high availability in the application and data layers of your application architecture. The auto healing feature of AWS OpsWorks provides a good recovery mechanism from instance failures. All OpsWorks instances have an agent installed. If an agent does not communicate with the service for a short duration, then OpsWorks considers the instance to have failed. If auto healing is enabled at the layer and an instance becomes unhealthy, then OpsWorks first terminates the instance and starts a new one as per the layer configuration. 
In the application layer, we can also do cold starts from preconfigured images or a warm start from scaled down instances for your web servers and application servers in a secondary region. By leveraging auto scaling, we can quickly ramp up these servers to handle full production loads. In this configuration, you would deploy the web servers and application servers across multiple AZs in your primary region while the standby servers need not be launched in your secondary region until you actually need them. However, keep the preconfigured AMIs for these servers ready to launch in your secondary region. The data layer can comprise of SQL databases, NoSQL databases, caches, and so on. These can be AWS managed services such as RDS, DynamoDB, and S3, or your own SQL and NoSQL databases such as Oracle, SQL Server, or MongoDB running on EC2 instances. AWS services come with HA built-in, while using database products running on EC2 instances offers a do-it-yourself option. It can be advantageous to use AWS services if you want to avoid taking on database administration responsibilities. For example, with the increasing sizes of your databases, you might choose to share your databases, which is easy to do. However, resharding your databases while taking in live traffic can be a very complex undertaking and present availability risks. Choosing to use the AWS DynamoDB service in such a situation offloads this work to AWS, thereby resulting in higher availability out of the box. AWS provides many different data replication options and we will discuss a few of those in the following several paragraphs. DynamoDB automatically replicates your data across several AZs to provide higher levels of data durability and availability. In addition, you can use data pipelines to copy your data from one region to another. DynamoDB streams functionality that can be leveraged to replicate to another DynamoDB in a different region. For very high volumes, low latency Kinesis services can also be used for this replication across multiple regions distributed all over the world. You can also enable the Multi-AZ setting for the AWS RDS service to ensure AWS replicates your data to a different AZ within the same region. In the case of Amazon S3, the S3 bucket contents can be copied to a different bucket and the failover can be managed on the client side. Depending on the volume of data, always think in terms of multiple machines, multiple threads and multiple parts to significantly reduce the time it takes to upload data to S3 buckets. While using your own database (running on EC2 instances), use your database-specific high availability features for within and cross-region database deployments. For example, if you are using SQL Server, you can leverage the SQL Server Always-on feature for synchronous and asynchronous replication across the nodes. If the volume of data is high, then you can also use the SQL Server log shipping to first upload your data to Amazon S3 and then restore into your SQL Server instance on AWS. A similar approach in case of Oracle databases uses OSB Cloud Module and RMAN. You can also replicate your non-RDS databases (on-premise or on AWS) to AWS RDS databases. You will typically define two nodes in the primary region with synchronous replication and a third node in the secondary region with asynchronous replication. NoSQL databases such as MongoDB and Cassandra have their own asynchronous replication features that can be leveraged for replication to a different region. 
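For the RDS Multi-AZ option mentioned above, the setting is a single flag at creation or modification time. The following AWS CLI sketch uses placeholder identifiers, and the engine and instance class are only examples; adapt them to your own environment:

aws rds create-db-instance \
    --db-instance-identifier orders-db \
    --db-instance-class db.m3.medium \
    --engine mysql \
    --allocated-storage 100 \
    --master-username admin \
    --master-user-password '<choose-a-strong-password>' \
    --multi-az

aws rds modify-db-instance \
    --db-instance-identifier orders-db \
    --multi-az \
    --apply-immediately

An existing single-AZ instance can be converted in place with the second command.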
In addition, you can create Read Replicas for your databases in other AZs and regions. In this case, if your master database fails followed by a failure of your secondary database, then one of the read replicas can be promoted to being the master. In hybrid architectures, where you need to replicate between on-premise and AWS data sources, you can do so through a VPN connection between your data center and AWS. In case of any connectivity issues, you can also temporarily store pending data updates in SQS, and process them when the connectivity is restored. Usually, data is actively replicated to the secondary region while all other servers like the web servers and application servers are maintained in a cold state to control costs. However, in cases of high availability for web scale or mission critical applications, you can also choose to deploy your servers in active configuration across multiple regions. Implementing high availability in the application In this section, we will discuss a few design principles to use in your application from a high availability perspective. We will briefly discuss using highly available AWS services to implement common features in mobile and Internet of Things (IoT) applications. Finally, we also cover running packaged applications on the AWS cloud. Designing your application services to be stateless and following a micro services-oriented architecture approach can help the overall availability of your application. In such architectures, if a service fails then that failure is contained or isolated to that particular service while the rest of your application services continue to serve your customers. This approach can lead to an acceptable degraded experience rather than outright failures or worse. You should also store user or session information in a central location such as the AWS ElastiCache and then spread information across multiple AZs for high availability. Another design principle is to rigorously implement exception handling in your application code, and in each of your services to ensure graceful exit in case of failures. Most mobile applications share common features including user authentication and authorization, data synchronization across devices; user behavior analytics; retention tracking, storing, sharing, and delivering media globally; sending push notifications; store shared data; stream real-time data; and so on. There are a host of highly available AWS services that can be used for implementing such mobile application functionality. For example, you can use Amazon Cognito to authenticate users, Amazon Mobile Analytics for analyzing user behavior and tracking retention, Amazon SNS for push notifications and Amazon Kinesis for streaming real-time data. In addition, other AWS services such as S3, DynamoDB, IAM, and so on can also be effectively used to complete most mobile application scenarios. For mobile applications, you need to be especially sensitive about latency issues; hence, it is important to leverage AWS regions to get as close to your customers as possible. Similar to mobile applications, for IoT applications you can use the same highly available AWS services to implement common functionality such as device analytics and device messaging/notifications. You can also leverage Amazon Kinesis to ingest data from hundreds of thousands of sensors that are continuously generating massive quantities of data. Aside from your own custom applications, you can also run packaged applications such as SAP on AWS. 
Such packaged application deployments would typically include replicated standby systems, Multi-AZ and multi-region deployments, hybrid architectures spanning your own data center and the AWS cloud (connected via a VPN or the AWS Direct Connect service), and so on. For more details, refer to the specific package guides for achieving high availability on the AWS cloud. Summary In this article, we reviewed some of the strategies you can follow for achieving high availability in your cloud application. We emphasized the importance of both designing your application architecture for availability and using AWS infrastructure services to get the best results. Resources for Article: Further resources on this subject: Securing vCloud Using the vCloud Networking and Security App Firewall [article] Introduction to Microsoft Azure Cloud Services [article] AWS Global Infrastructure [article]

article-image-analyzing-network-reconnaissance-attempts
Packt
11 Aug 2015
8 min read

Analyzing network reconnaissance attempts

In this article by Piyush Verma, author of the book Wireshark Network Security, you will be introduced to using Wireshark to detect network reconnaissance activities performed by an insider. A dictionary definition of reconnaissance is "military observation of a region to locate an enemy or ascertain strategic features." A good analogy for reconnaissance will be a thief studying the neighborhood to observe which houses are empty and which are occupied, the number of family members who live at the occupied houses, their entrance points, the time during which these occupied houses are empty, and so on, before he/she can even think about stealing anything from that neighborhood. Network reconnaissance relates to the act of gathering information about the target’s network infrastructure, the devices that reside on the network, the platform used by such devices, and the ports opened on them, to ultimately come up with a brief network diagram of devices and then plan the attack accordingly. The tools required to perform network scanning are readily available and can be downloaded easily over the Internet. One such popular tool is Nmap, short for Network Mapper. It is written by Gordon “Fyodor” Lyon and is a popular tool of choice to perform network-based reconnaissance. Network scanning activities can be as follows: Scanning for live machines Port scans Detecting the presence of a firewall or additional IP protocols (For more resources related to this topic, see here.) Detect the scanning activity for live machines An attacker would want to map out the live machines on the network rather than performing any activity with an assumption that all the machines are live. The following are the two popular techniques, which can be used, and the ways to detect them using Wireshark. Ping sweep This technique makes use of a simple technique to ping an IP address in order to identify whether it is alive or not. Almost all modern networks block the ICMP protocol; hence, this technique is not very successful. However, in case your network supports ICMP-based traffic, you can detect this attack by looking for large number of ping requests going to a range of IP addresses on your network. A helpful filter in this case will be: icmp.type == 8 || icmp.type == 0 ICMP Type 8 = ECHO Request ICMP Type 0 = ECHO Reply ARP sweep ARP responses cannot be disabled on the network; hence, this technique works very well while trying to identify live machines on a local network. Using this technique, an attacker can discover hosts that may be hidden from other discovery methods, such as ping sweeps, by a firewall. To perform this, an attacker sends ARP broadcasts (destination MAC address: FF:FF:FF:FF:FF:FF) for all possible IP addresses on a given subnet, and the machines responding to these requests are noted as alive or active. To detect ARP sweep attempts, we need to look for a massive amount of ARP broadcasts from a client machine on the network. Another thing to note will be the duration in which these broadcasts are sent. These are highlighted in the following screenshot: Shows an ARP sweep in action A point to note is the source of these ARP requests to avoid false positives because such requests may also be made by legitimate services, such as SNMP. Identifying port scanning attempts Now, we will look at different port scanning techniques used by attackers and how to detect them using Wireshark. 
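Before moving on to the port scans, here is a rough Python sketch (using Scapy rather than Wireshark itself) that applies the same ping-sweep and ARP-sweep heuristics described above to a saved capture. The capture file name and the alert threshold are placeholder values.

from collections import Counter
from scapy.all import rdpcap, IP, ICMP, ARP

packets = rdpcap('capture.pcap')  # hypothetical capture exported from Wireshark

echo_requests = Counter()   # ICMP Type 8 (ECHO Request) per source
arp_requests = Counter()    # ARP who-has (opcode 1) per source

for p in packets:
    if ICMP in p and IP in p and p[ICMP].type == 8:
        echo_requests[p[IP].src] += 1
    elif ARP in p and p[ARP].op == 1:
        arp_requests[p[ARP].psrc] += 1

THRESHOLD = 50  # arbitrary example value; tune for your network
for src, count in echo_requests.items():
    if count > THRESHOLD:
        print(f'Possible ping sweep from {src}: {count} echo requests')
for src, count in arp_requests.items():
    if count > THRESHOLD:
        print(f'Possible ARP sweep from {src}: {count} ARP broadcasts')

Remember the false-positive note above: legitimate services can also generate bursts of ARP requests, so the source of the traffic matters as much as the volume. The port scanning techniques that follow can be profiled in a similar way.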
TCP Connect scan In a TCP Connect scan, a client/attacker sends a SYN packet to the server/victim on a range of port numbers. For the ports that respond to SYN/ACK, the client completes the 3-way handshake by sending an ACK and then terminates the connection by sending an RST to the server/victim, while the ports that are closed reply with RST/ACK packets to the SYN sent by the client/attacker. Hence, in order to identify this type of scan, we will need to look for a significantly large number of RST (Expert Info) or SYN/ACK packets. Generally, when a connection is established, some form of data is transferred; however, in scanning attempts, no data is sent across indicating a suspicious activity (Conversations | TCP). Another indication is the short period of time under which these packets are sent (Statistics | Flow Graph). Wireshark’s Flow Graph While observing the TCP flow in the Flow Graph, we noted a sequence of SYN, SYN/ACK, and ACKs along with SYN and RST/ACKs. Another indication is the fraction of seconds (displayed on left-hand side) under which these packets are sent. Shows a complete 3-way handshake with open-ports and how quickly the packets were sent under the ‘Time’ column Wireshark’s Expert Info Even the Expert Info window indicates a significant number of connection resets. Shows Warning Tab under Expert Info Wireshark’s Conversations We can look at the TCP Conversations, to observe which type of scan is underway and the number of Bytes associated with each conversation. Shows the number of packets and Bytes transferred for each Conversation The number 4 in the Packets column indicates a SYN, SYN/ACK, ACK, and RST packets, and the number 2 indicates the SYN sent by Nmap and RST/ACK received for a closed port. Stealth scan A stealth scan is different from the TCP Connect scan explained earlier and is never detected by the application layer as the complete TCP 3-way handshake is never established during this scan and hence also known as half-open scan. During this scan, a client/attacker sends a SYN packet to the server/victim on a range of port numbers. If Nmap receives a SYN/ACK to the SYN request, it means that the port is open, and Nmap then sends an RST to close the connection without ever completing the 3-way handshake, while the ports that are closed reply with RST/ACK packets to the SYN requests. The way to detect this attack is similar to the previous scan, where you will note a lot of RST (Expert Info) or SYN/ACK packets without data transfers (Conversations | TCP) on the network. Another indication is the short period of time under which these packets are sent (Statistics | Flow Graph). Now, we will look at the Flow Graph, Expert Info, and Conversations in Wireshark for Stealth scan. Wireshark’s Flow Graph While observing the TCP flow in the Flow Graph, we noted a sequence of SYN, SYN/ACK, and RSTs (indicating a half-open connection) along with SYN and RST/ACKs. Another indication is the fraction of seconds (displayed on the left-hand side) under which these packets are sent. Shows the half-open scan underway and how quickly the packets were sent under the "Time" column Wireshark’s Expert Info The huge number of connection resets is another indication of a scan underway. Shows Warning Tab under Expert Info Wireshark’s Conversations TCP Conversations also provide an insight to indicate that a half-open scan is underway and the number of Bytes associated with each attempt. 
Shows the number of packets and Bytes transferred for each Conversation The number 3 in the Packets column indicates a SYN, SYN/ACK, and RST packets, and the number 2 indicates the SYN sent by Nmap and RST/ACK received for a closed port. NULL scan During a null scan, unusual TCP packets are sent with no flags set. If the resultant of this is no response, it means that the port is either open or filtered; while the RST/ACK response means that the port is closed. A quick way to detect, whether such a scan is underway, is to filter on tcp.flags == 0x00. UDP scan The last three techniques were related to TCP-based scans. Many common protocols work over UDP as well (DNS, SNMP, TFTP, and so on), and scans are conducted to detect whether such ports are open or not. No response to a UDP port scan indicates that the port is either open or firewalled, and a response of ICMP Destination Unreachable/Port Unreachable means that the port is closed. Detect UDP scans by filtering on (icmp.type == 3) && (icmp.code == 3). ICMP Type 3 = Destination Unreachable ICMP Code 3 = Port Unreachable Other scanning attempts The following scanning techniques go beyond the traditional port scanning techniques and help the attacker in further enumeration of the network. ACK scan ACK Flag scan never locates an open port; rather, it only provides the result in form of "filtered" or "unfiltered" and is generally used to detect the presence of a firewall. No response means port is filtered, and RST response indicates that the port is unfiltered. Shows Flow Graph (TCP) of an ACK Flag Scan IP Protocol scan IP Protocol scan is conducted by attackers to determine the presence of additional IP protocols in use by the victim. For example, if a router is scanned using this technique, it might reveal the use of other protocols as EGP, IGP, EIGRP, and so on. No response indicates that a protocol is present or the response is filtered, while an ICMP Destination Unreachable/Protocol Unreachable indicates that the protocol is not supported by the device. To detect this scan using Wireshark, we can filter the traffic based on (icmp.type == 3) && (icmp.code == 2). ICMP Type 3 = Destination Unreachable ICMP Code 2 = Protocol Unreachable Summary In this article, we used Wireshark and the set of robust features it has to offer, to analyze the network scanning attempts performed by attackers. Resources for Article: Further resources on this subject: Wireshark [article] Wireshark: Working with Packet Streams [article] UDP/TCP Analysis [article]

article-image-managing-recipients
Packt
11 Aug 2015
17 min read

Managing Recipients

In this article by Jonas Andersson and Mike Pfeiffer, authors of the book Microsoft Exchange Server PowerShell Cookbook - Third Edition, we will see how the mailboxes are managed in the Exchange Management Shell. (For more resources related to this topic, see here.) Adding, modifying, and removing mailboxes One of the most common tasks performed within the Exchange Management Shell is mailbox management. In this recipe, we'll take a look at the command syntax required to create, update, and remove mailboxes from your Exchange organization. The concepts outlined in this recipe can be used to perform basic day-to-day tasks and will be useful for more advanced scenarios, such as creating mailboxes in bulk. How to do it... Let's see how to add, modify, and delete mailboxes using the following steps: Let's start off by creating a mailbox-enabled Active Directory user account. To do this, we can use the New-Mailbox cmdlet, as shown in the following example: $password = ConvertTo-SecureString -AsPlainText P@ssw0rd `-ForceNew-Mailbox -UserPrincipalName dave@contoso.com `-Alias dave `-Database DAGDB1 `-Name 'Dave Jones' `-OrganizationalUnit Sales `-Password $password `-FirstName Dave `-LastName Jones `-DisplayName 'Dave Jones' Once the mailbox has been created, we can modify it using the Set-Mailbox cmdlet: Set-Mailbox -Identity dave `-UseDatabaseQuotaDefaults $false `-ProhibitSendReceiveQuota 5GB `-IssueWarningQuota 4GB To remove the Exchange attributes from the Active Directory user account and mark the mailbox in the database for removal, use the Disable-Mailbox cmdlet: Disable-Mailbox -Identity dave -Confirm:$false How it works... When running the New-Mailbox cmdlet, the -Password parameter is required, and you need to provide a value for it, using a secure string object. As you can see from the code, we used the ConvertTo-SecureString cmdlet to create a $password variable that stores a specified value as an encrypted string. This $password variable is then assigned to the -Password parameter when running the cmdlet. It's not required to first store this object in a variable; we could have done it inline, as shown next: New-Mailbox -UserPrincipalName dave@contoso.com `-Alias dave `-Database DAGDB1 `-Name 'Dave Jones' `-OrganizationalUnit Sales `-Password (ConvertTo-SecureString -AsPlainText P@ssw0rd -Force) `-FirstName Dave `-LastName Jones `-DisplayName 'Dave Jones' Keep in mind that the password used here needs to comply with your Active Directory password policies, which may enforce a minimum password length and have requirements for complexity. Only a few parameters are actually required when running New-Mailbox, but the cmdlet itself supports several useful parameters that can be used to set certain properties when creating the mailbox. You can run Get-Help New-Mailbox -Detailed to determine which additional parameters are supported. The New-Mailbox cmdlet creates a new Active Directory user and then mailbox-enables that account. We can also create mailboxes for existing users with the Enable-Mailbox cmdlet, using a syntax similar to the following: Enable-Mailbox steve -Database DAGDB1 The only requirement when running the Enable-Mailbox cmdlet is that you need to provide the identity of the Active Directory user that should be mailbox-enabled. In the previous example, we specified the database in which the mailbox should be created, but this is optional. The Enable-Mailbox cmdlet supports a number of other parameters that you can use to control the initial settings for the mailbox. 
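For example, a sketch of Enable-Mailbox with a couple of those optional parameters is shown next; the user names are placeholders, and you can confirm the full parameter list for your Exchange version with Get-Help Enable-Mailbox -Detailed.

# Mailbox-enable an existing AD user, choosing the database and alias up front
Enable-Mailbox jsmith -Database DAGDB1 -Alias jsmith

# Add an In-Place Archive to a mailbox that already exists
Enable-Mailbox steve -Archive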
You can use a simple one-liner to create mailboxes in bulk for existing Active Directory users: Get-User -RecipientTypeDetails User | Enable-Mailbox -Database DAGDB1 Notice that we run the Get-User cmdlet, specifying User as the value for the -RecipientTypeDetails parameter. This will retrieve only the accounts in Active Directory that have not been mailbox-enabled. We then pipe those objects down to the Enable-Mailbox cmdlet and mailboxes will be created for each of those users in one simple operation. Once the mailboxes have been created, they can be modified with the Set-Mailbox cmdlet. As you may recall from our original example, we used the Set-Mailbox cmdlet to configure the custom storage quota settings after creating a mailbox for Dave Jones. Keep in mind that the Set-Mailbox cmdlet supports over 100 parameters, so anything that can be done to modify a mailbox can be scripted. Bulk modifications to mailboxes can be done easily by taking advantage of the pipeline and the Set-Mailbox cmdlet. Instead of configuring storage quotas on a single mailbox, we can do it for multiple users at once: Get-Mailbox -OrganizationalUnit contoso.com/sales |Set-Mailbox -UseDatabaseQuotaDefaults $false `-ProhibitSendReceiveQuota 5GB `-IssueWarningQuota 4GB Here, we are simply retrieving every mailbox in the sales OU using the Get-Mailbox cmdlet. The objects returned from this command are piped down to Set-Mailbox, which then modifies the quota settings for each mailbox in one shot. The Disable-Mailbox cmdlet will strip the Exchange attributes from an Active Directory user and will disconnect the associated mailbox. By default, disconnected mailboxes are retained for 30 days. You can modify this setting on the database that holds the mailbox. In addition to this, you can also use the Remove-Mailbox cmdlet to delete both the Active Directory account and the mailbox at once: Remove-Mailbox -Identity dave -Confirm:$false After running this command, the mailbox will be purged once it exceeds the deleted mailbox retention setting on the database. One common mistake is when administrators use the Remove-Mailbox cmdlet when the Disable-Mailbox cmdlet should have been used. It's important to remember that the Remove-Mailbox cmdlet will delete the Active Directory user account and mailbox, while the Disable-Mailbox cmdlet only removes the mailbox, but the Active Directory user account still remains. There's more... When we ran the New-Mailbox cmdlet in the previous examples, we assigned a secure string object to the –Password parameter using the ConvertTo-SecureString cmdlet. This is a great technique to use when your scripts need complete automation, but you can also allow an operator to enter this information interactively. For example, you might build a script that prompts an operator for a password when creating one or more mailboxes. There are a couple of ways you can do this. First, you can use the Read-Host cmdlet to prompt the user running the script to enter a password: $pass = Read-Host "Enter Password" -AsSecureString Once a value has been entered in the shell, your script can assign the $pass variable to the -Password parameter of the New-Mailbox cmdlet. 
Alternatively, you can supply a value for the -Password parameter using the Get-Credential cmdlet: New-Mailbox -Name Dave -UserPrincipalName dave@contoso.com `-Password (Get-Credential).password You can see that the value we are assigning to the -Password parameter in this example is actually the password property of the object returned by the Get-Credential cmdlet. Executing this command will first launch a Windows authentication dialog box, where the caller can enter a username and password. Once the credential object has been created, the New-Mailbox cmdlet will run. Even though a username and password must be entered in the authentication dialog box, only the password value will be used when the command executes. Setting the Active Directory attributes Some of the Active Directory attributes that you may want to set when creating a mailbox might not be available when using the New-Mailbox cmdlet. Good examples of this are a user's city, state, company, and department attributes. In order to set these attributes, you'll need to call the Set-User cmdlet after the mailbox has been created: Set-User –Identity dave –Office IT –City Seattle –State Washington You can run Get-Help Set-User -Detailed to view all of the available parameters supported by this cmdlet. Notice that the Set-User cmdlet is an Active Directory cmdlet and not an Exchange cmdlet. In the examples using the New-Mailbox cmdlet to create new mailboxes, it is not required to use all these parameters from the example. The only required parameters are UserPrincipalName, Name, and Password. Working with contacts Once you've started managing mailboxes using the Exchange Management Shell, you'll probably notice that the concepts and command syntax used to manage contacts are very similar. The difference, of course, is that we need to use a different set of cmdlets. In addition, we also have two types of contacts to deal with in Exchange. We'll take a look at how you can manage both of them in this recipe. How to do it... Let's see how to work with contacts using the following steps: To create a mail-enabled contact, use the New-MailContact cmdlet: New-MailContact -Alias rjones `-Name "Rob Jones" `-ExternalEmailAddress rob@fabrikam.com `-OrganizationalUnit sales Mail-enabled users can be created with the New-MailUser cmdlet: New-MailUser -Name 'John Davis' `-Alias jdavis `-UserPrincipalName jdavis@contoso.com `-FirstName John `-LastName Davis `-Password (ConvertTo-SecureString -AsPlainText P@ssw0rd `-Force) `-ResetPasswordOnNextLogon $false `-ExternalEmailAddress jdavis@fabrikam.com How it works... Mail contacts are useful when you have external e-mail recipients that need to show up in your global address list. When you use the New-MailContact cmdlet, an Active Directory contact object is created and mail-enabled with the external e-mail address assigned. You can mail-enable an existing Active Directory contact using the Enable-MailContact cmdlet. Mail users are similar to mail contacts in the way that they have an associated external e-mail address. The difference is that these objects are mail-enabled Active Directory users and this explains why we need to assign a password when creating the object. You might use a mail user for a contractor who works onsite in your organization and needs to be able to log on to your domain. 
When users in your organization need to e-mail this person, they can select them from the global address list and messages sent to these recipients will be delivered to the external address configured for the account. When dealing with mailboxes, there are a couple of things that should be taken into consideration when it comes to removing contacts and mail users. You can remove the Exchange attributes from a contact using the Disable-MailContact cmdlet. The Remove-MailContact cmdlet will remove the contact object from the Active Directory and Exchange. Similarly, the Disable-MailUser and Remove-MailUser cmdlets work in the same fashion. There's more... Like mailboxes, mail contacts and mail-enabled user accounts have several Active Directory attributes that can be set, such as the job title, company, department, and so on. To update these attributes, you can use the Set-* cmdlets that are available for each respective type. For example, to update our mail contact, we can use the Set-Contact cmdlet with the following syntax: Set-Contact -Identity rjones `-Title 'Sales Contractor' `-Company Fabrikam `-Department Sales To modify the same settings for a mail-enabled user, use the Set-User cmdlet: Set-User -Identity jdavis `-Title 'Sales Contractor' `-Company Fabrikam `-Department Sales Both cmdlets can be used to modify a number of different settings. Use the help system to view all of the available parameters. Managing distribution groups In many Exchange environments, distribution groups are heavily relied upon and require frequent changes. This recipe will cover the creation of distribution groups and how to add members to groups, which might be useful when performing these tasks interactively in the shell or through automated scripts. How to do it... Let's see how to create distribution groups using the following steps: To create a distribution group, use the New-DistributionGroup cmdlet: New-DistributionGroup -Name Sales Once the group has been created, adding multiple members can be done easily using a one-liner, as follows: Get-Mailbox -OrganizationalUnit Sales |Add-DistributionGroupMember -Identity Sales We can also create distribution groups whose memberships are set dynamically: New-DynamicDistributionGroup -Name Accounting `-Alias Accounting `-IncludedRecipients MailboxUsers,MailContacts `-OrganizationalUnit Accounting `-ConditionalDepartment accounting,finance `-RecipientContainer contoso.com How it works... There are two types of distribution groups that can be created with Exchange. Firstly, there are regular distribution groups, which contain a distinct list of users. Secondly, there are dynamic distribution groups, whose members are determined at the time a message is sent based on a number of conditions or filters that have been defined. Both types have a set of cmdlets that can be used to add, remove, update, enable, or disable these groups. By default, when creating a standard distribution group, the group scope will be set to Universal. You can create a mail-enabled security group using the New-DistributionGroup cmdlet by setting the -Type parameter to Security. If you do not provide a value for the -Type parameter, the group will be created using the Distribution group type. You can mail-enable an existing Active Directory universal distribution group using the Enable-DistributionGroup cmdlet. After creating the Sales distribution group in our previous example, we added all of the mailboxes in the Sales OU to the group using the Add-DistributionGroupMember cmdlet. 
You can do this in bulk, or for one user at a time, using the –Member parameter: Add-DistributionGroupMember -Identity Sales -Member administrator Dynamic distribution groups determine their membership based on a defined set of filters and conditions. When we created the Accounting distribution group, we used the -IncludedRecipients parameter to specify that only the MailboxUsers and MailContacts object types would be included in the group. This eliminates resource mailboxes, groups, or mail users from being included as members. The group will be created in the Accounting OU based on the value used with the -OrganizationalUnit parameter. Using the –ConditionalDepartment parameter, the group will only include users that have a department setting of either Accounting or Finance. Finally, since the -RecipientContainer parameter is set to the FQDN of the domain, any user located in the Active Directory can potentially be included in the group. You can create more complex filters for dynamic distribution groups using a recipient filter. You can modify both group types using the Set-DistributionGroup and Set-DynamicDistributionGroup cmdlets. There's more... When dealing with other recipient types, there are a couple of things that should be taken into consideration when it comes to removing distribution groups. You can remove the Exchange attributes from a group using the Disable-DistributionGroup cmdlet. The Remove-DistributionGroup cmdlet will remove the group object from the Active Directory and Exchange. Managing resource mailboxes In addition to mailboxes, groups, and external contacts, recipients can also include specific rooms or pieces of equipment. Locations such as a conference room or a classroom can be given a mailbox so that they can be reserved for meetings. Equipment mailboxes can be assigned to physical, non-location specific resources, such as laptops or projectors, and can then be checked out to individual users or groups by booking a time with the mailbox. In this recipe, we'll take a look at how you can manage resource mailboxes using the Exchange Management Shell. How to do it... When creating a resource mailbox from within the shell, the syntax is similar to creating a mailbox for a regular user. For example, you still use the New-Mailbox cmdlet when creating a resource mailbox: New-Mailbox -Name "CR23" -DisplayName "Conference Room 23" `-UserPrincipalName CR23@contoso.com -Room How it works... There are two main differences when it comes to creating a resource mailbox, as opposed to a standard user mailbox. First, you need to use either the -Room switch parameter or the -Equipment switch parameter to define the type of resource mailbox that will be created. Second, you do not need to provide a password value for the user account. When using either of these resource mailbox switch parameters to create a mailbox, the New-Mailbox cmdlet will create a disabled Active Directory user account that will be associated with the mailbox. The entire concept of room and equipment mailboxes revolves around the calendars used by these resources. If you want to reserve a room or a piece of equipment, you book a time through Outlook or OWA with these resources for the duration that you'll need them. The requests sent to these resources need to be accepted, either by a delegate or automatically using the Resource Booking Attendant. 
To configure the room mailbox created in the previous example, to automatically accept new meeting requests, we can use the Set-CalendarProcessing cmdlet to set the Resource Booking Attendant for that mailbox to AutoAccept: Set-CalendarProcessing CR23 -AutomateProcessing AutoAccept When the Resource Booking Attendant is set to AutoAccept, the request will be immediately accepted as long as there is no conflict with another meeting. If there is a conflict, an e-mail message will be returned to the requestor, which explains that the request was declined due to scheduling conflicts. You can allow conflicts by adding the –AllowConflicts switch parameter to the previous command. When working with resource mailboxes with AutomateProcessing set to AutoAccept, you'll get an automated e-mail response from the resource after booking a time. This e-mail message will explain whether the request was accepted or declined, depending on your settings. You can add an additional text to the response message that the meeting organizer will receive using the following syntax: Set-CalendarProcessing -Identity CR23 `-AddAdditionalResponse $true `-AdditionalResponse 'For Assistance Contact Support at Ext. #3376' This example uses the Set-CalendarProcessing cmdlet to customize the response messages sent from the CR23 room mailbox. You can see here that we added a message that tells the user the help desk number to call if assistance is required. Keep in mind that you can only add an additional response text, when the AutomateProcessing property is set to AutoAccept. If you do not want to automate the calendar processing for a resource mailbox, then you'll need to add delegates that can accept or deny meetings for that resource. Again, we can turn to the Set-CalendarProcessing cmdlet to accomplish this: Set-CalendarProcessing -Identity CR23 `-ResourceDelegates "joe@contoso.com","steve@contoso.com" `-AutomateProcessing None In this example, we added two delegates to the resource mailbox and have turned off automated processing. When a request comes into the CR23 mailbox, both Steve and Joe will be notified and can accept or deny the request on behalf of the resource mailbox. There's more... When it comes to working with resource mailboxes, another useful feature is the ability to assign custom resource properties to rooms and equipment resources. For example, you may have a total of 5, 10, or 15 conference rooms, but maybe only four of these have whiteboards. It might be useful for your users to know this information when booking a resource for a meeting, where they will be conducting a training session. Using the shell, we can add custom resource properties to the Exchange organization by modifying the resource schema. Once these custom resource properties have been added, they can then be assigned to specific resource mailboxes. You can use the following code to add a whiteboard resource property to the Exchange organization's resource schema: Set-ResourceConfig -ResourcePropertySchema 'Room/Whiteboard' Now that the whiteboard resource property is available within the Exchange organization, we can add this to our Conference Room 23 mailbox using the following command: Set-Mailbox -Identity CR23 -ResourceCustom Whiteboard When users access the Select Rooms or Add Rooms dialog box in Outlook 2007, 2010, or 2013, they will see that Conference Room 23 has a whiteboard available. Converting mailboxes If you've moved from an old version, you may have a number of mailboxes that were used as resource mailboxes. 
Once these mailboxes have been moved they will be identified as Shared mailboxes. You can convert them to different types using the Set-Mailbox cmdlet so that they'll have all of the properties of a resource mailbox: Get-Mailbox conf* | Set-Mailbox -Type Room You can run the Set-Mailbox cmdlet against each mailbox one at a time and convert them to Room mailboxes, using the -Type parameter. Or, if you use a common naming convention, you may be able to do them in bulk by retrieving a list of mailboxes using a wildcard and piping them to Set-Mailbox, as shown previously. Summary In this article we have discussed how the various types of mailboxes are managed, how to set up the Active Directory attributes, how to work with contacts and distribution groups. Resources for Article: Further resources on this subject: Unleashing Your Development Skills with PowerShell [Article] Inventorying Servers with PowerShell [Article] Exchange Server 2010 Windows PowerShell: Working with Address Lists [Article]
Packt
11 Aug 2015
17 min read

Divide and Conquer – Classification Using Decision Trees and Rules

In this article by Brett Lantz, author of the book Machine Learning with R, Second Edition, we will get a basic understanding about decision trees and rule learners, including the C5.0 decision tree algorithm. This algorithm will cover mechanisms such as choosing the best split and pruning the decision tree. While deciding between several job offers with various levels of pay and benefits, many people begin by making lists of pros and cons, and eliminate options based on simple rules. For instance, ''if I have to commute for more than an hour, I will be unhappy.'' Or, ''if I make less than $50k, I won't be able to support my family.'' In this way, the complex and difficult decision of predicting one's future happiness can be reduced to a series of simple decisions. This article covers decision trees and rule learners—two machine learning methods that also make complex decisions from sets of simple choices. These methods then present their knowledge in the form of logical structures that can be understood with no statistical knowledge. This aspect makes these models particularly useful for business strategy and process improvement. By the end of this article, you will learn: How trees and rules "greedily" partition data into interesting segments The most common decision tree and classification rule learners, including the C5.0, 1R, and RIPPER algorithms We will begin by examining decision trees, followed by a look at classification rules. (For more resources related to this topic, see here.) Understanding decision trees Decision tree learners are powerful classifiers, which utilize a tree structure to model the relationships among the features and the potential outcomes. As illustrated in the following figure, this structure earned its name due to the fact that it mirrors how a literal tree begins at a wide trunk, which if followed upward, splits into narrower and narrower branches. In much the same way, a decision tree classifier uses a structure of branching decisions, which channel examples into a final predicted class value. To better understand how this works in practice, let's consider the following tree, which predicts whether a job offer should be accepted. A job offer to be considered begins at the root node, where it is then passed through decision nodes that require choices to be made based on the attributes of the job. These choices split the data across branches that indicate potential outcomes of a decision, depicted here as yes or no outcomes, though in some cases there may be more than two possibilities. In the case a final decision can be made, the tree is terminated by leaf nodes (also known as terminal nodes) that denote the action to be taken as the result of the series of decisions. In the case of a predictive model, the leaf nodes provide the expected result given the series of events in the tree. A great benefit of decision tree algorithms is that the flowchart-like tree structure is not necessarily exclusively for the learner's internal use. After the model is created, many decision tree algorithms output the resulting structure in a human-readable format. This provides tremendous insight into how and why the model works or doesn't work well for a particular task. This also makes decision trees particularly appropriate for applications in which the classification mechanism needs to be transparent for legal reasons, or in case the results need to be shared with others in order to inform future business practices. 
With this in mind, some potential uses include: Credit scoring models in which the criteria that causes an applicant to be rejected need to be clearly documented and free from bias Marketing studies of customer behavior such as satisfaction or churn, which will be shared with management or advertising agencies Diagnosis of medical conditions based on laboratory measurements, symptoms, or the rate of disease progression Although the previous applications illustrate the value of trees in informing decision processes, this is not to suggest that their utility ends here. In fact, decision trees are perhaps the single most widely used machine learning technique, and can be applied to model almost any type of data—often with excellent out-of-the-box applications. This said, in spite of their wide applicability, it is worth noting some scenarios where trees may not be an ideal fit. One such case might be a task where the data has a large number of nominal features with many levels or it has a large number of numeric features. These cases may result in a very large number of decisions and an overly complex tree. They may also contribute to the tendency of decision trees to overfit data, though as we will soon see, even this weakness can be overcome by adjusting some simple parameters. Divide and conquer Decision trees are built using a heuristic called recursive partitioning. This approach is also commonly known as divide and conquer because it splits the data into subsets, which are then split repeatedly into even smaller subsets, and so on and so forth until the process stops when the algorithm determines the data within the subsets are sufficiently homogenous, or another stopping criterion has been met. To see how splitting a dataset can create a decision tree, imagine a bare root node that will grow into a mature tree. At first, the root node represents the entire dataset, since no splitting has transpired. Next, the decision tree algorithm must choose a feature to split upon; ideally, it chooses the feature most predictive of the target class. The examples are then partitioned into groups according to the distinct values of this feature, and the first set of tree branches are formed. Working down each branch, the algorithm continues to divide and conquer the data, choosing the best candidate feature each time to create another decision node, until a stopping criterion is reached. Divide and conquer might stop at a node in a case that: All (or nearly all) of the examples at the node have the same class There are no remaining features to distinguish among the examples The tree has grown to a predefined size limit To illustrate the tree building process, let's consider a simple example. Imagine that you work for a Hollywood studio, where your role is to decide whether the studio should move forward with producing the screenplays pitched by promising new authors. After returning from a vacation, your desk is piled high with proposals. Without the time to read each proposal cover-to-cover, you decide to develop a decision tree algorithm to predict whether a potential movie would fall into one of three categories: Critical Success, Mainstream Hit, or Box Office Bust. To build the decision tree, you turn to the studio archives to examine the factors leading to the success and failure of the company's 30 most recent releases. You quickly notice a relationship between the film's estimated shooting budget, the number of A-list celebrities lined up for starring roles, and the level of success. 
Excited about this finding, you produce a scatterplot to illustrate the pattern: Using the divide and conquer strategy, we can build a simple decision tree from this data. First, to create the tree's root node, we split the feature indicating the number of celebrities, partitioning the movies into groups with and without a significant number of A-list stars: Next, among the group of movies with a larger number of celebrities, we can make another split between movies with and without a high budget: At this point, we have partitioned the data into three groups. The group at the top-left corner of the diagram is composed entirely of critically acclaimed films. This group is distinguished by a high number of celebrities and a relatively low budget. At the top-right corner, majority of movies are box office hits with high budgets and a large number of celebrities. The final group, which has little star power but budgets ranging from small to large, contains the flops. If we wanted, we could continue to divide and conquer the data by splitting it based on the increasingly specific ranges of budget and celebrity count, until each of the currently misclassified values resides in its own tiny partition, and is correctly classified. However, it is not advisable to overfit a decision tree in this way. Though there is nothing to stop us from splitting the data indefinitely, overly specific decisions do not always generalize more broadly. We'll avoid the problem of overfitting by stopping the algorithm here, since more than 80 percent of the examples in each group are from a single class. This forms the basis of our stopping criterion. You might have noticed that diagonal lines might have split the data even more cleanly. This is one limitation of the decision tree's knowledge representation, which uses axis-parallel splits. The fact that each split considers one feature at a time prevents the decision tree from forming more complex decision boundaries. For example, a diagonal line could be created by a decision that asks, "is the number of celebrities is greater than the estimated budget?" If so, then "it will be a critical success." Our model for predicting the future success of movies can be represented in a simple tree, as shown in the following diagram. To evaluate a script, follow the branches through each decision until the script's success or failure has been predicted. In no time, you will be able to identify the most promising options among the backlog of scripts and get back to more important work, such as writing an Academy Awards acceptance speech. Since real-world data contains more than two features, decision trees quickly become far more complex than this, with many more nodes, branches, and leaves. In the next section, you will learn about a popular algorithm to build decision tree models automatically. The C5.0 decision tree algorithm There are numerous implementations of decision trees, but one of the most well-known implementations is the C5.0 algorithm. This algorithm was developed by computer scientist J. Ross Quinlan as an improved version of his prior algorithm, C4.5, which itself is an improvement over his Iterative Dichotomiser 3 (ID3) algorithm. Although Quinlan markets C5.0 to commercial clients (see http://www.rulequest.com/ for details), the source code for a single-threaded version of the algorithm was made publically available, and it has therefore been incorporated into programs such as R. 
To further confuse matters, a popular Java-based open source alternative to C4.5, titled J48, is included in R's RWeka package. Because the differences among C5.0, C4.5, and J48 are minor, the principles in this article will apply to any of these three methods, and the algorithms should be considered synonymous. The C5.0 algorithm has become the industry standard to produce decision trees, because it does well for most types of problems directly out of the box. Compared to other advanced machine learning models, the decision trees built by C5.0 generally perform nearly as well, but are much easier to understand and deploy. Additionally, as shown in the following table, the algorithm's weaknesses are relatively minor and can be largely avoided: Strengths Weaknesses An all-purpose classifier that does well on most problems Highly automatic learning process, which can handle numeric or nominal features, as well as missing data Excludes unimportant features Can be used on both small and large datasets Results in a model that can be interpreted without a mathematical background (for relatively small trees) More efficient than other complex models Decision tree models are often biased toward splits on features having a large number of levels It is easy to overfit or underfit the model Can have trouble modeling some relationships due to reliance on axis-parallel splits Small changes in the training data can result in large changes to decision logic Large trees can be difficult to interpret and the decisions they make may seem counterintuitive To keep things simple, our earlier decision tree example ignored the mathematics involved in how a machine would employ a divide and conquer strategy. Let's explore this in more detail to examine how this heuristic works in practice. Choosing the best split The first challenge that a decision tree will face is to identify which feature to split upon. In the previous example, we looked for a way to split the data such that the resulting partitions contained examples primarily of a single class. The degree to which a subset of examples contains only a single class is known as purity, and any subset composed of only a single class is called pure. There are various measurements of purity that can be used to identify the best decision tree splitting candidate. C5.0 uses entropy, a concept borrowed from information theory that quantifies the randomness, or disorder, within a set of class values. Sets with high entropy are very diverse and provide little information about other items that may also belong in the set, as there is no apparent commonality. The decision tree hopes to find splits that reduce entropy, ultimately increasing homogeneity within the groups. Typically, entropy is measured in bits. If there are only two possible classes, entropy values can range from 0 to 1. For n classes, entropy ranges from 0 to log2(n). In each case, the minimum value indicates that the sample is completely homogenous, while the maximum value indicates that the data are as diverse as possible, and no group has even a small plurality. In the mathematical notion, entropy is specified as follows: In this formula, for a given segment of data (S), the term c refers to the number of class levels and pi refers to the proportion of values falling into class level i. For example, suppose we have a partition of data with two classes: red (60 percent) and white (40 percent). 
We can calculate the entropy as follows: > -0.60 * log2(0.60) - 0.40 * log2(0.40) [1] 0.9709506 We can examine the entropy for all the possible two-class arrangements. If we know that the proportion of examples in one class is x, then the proportion in the other class is (1 – x). Using the curve() function, we can then plot the entropy for all the possible values of x: > curve(-x * log2(x) - (1 - x) * log2(1 - x),        col = "red", xlab = "x", ylab = "Entropy", lwd = 4) This results in the following figure: As illustrated by the peak in entropy at x = 0.50, a 50-50 split results in maximum entropy. As one class increasingly dominates the other, the entropy reduces to zero. To use entropy to determine the optimal feature to split upon, the algorithm calculates the change in homogeneity that would result from a split on each possible feature, which is a measure known as information gain. The information gain for a feature F is calculated as the difference between the entropy in the segment before the split (S1) and the partitions resulting from the split (S2): One complication is that after a split, the data is divided into more than one partition. Therefore, the function to calculate Entropy(S2) needs to consider the total entropy across all of the partitions. It does this by weighing each partition's entropy by the proportion of records falling into the partition. This can be stated in a formula as: In simple terms, the total entropy resulting from a split is the sum of the entropy of each of the n partitions weighted by the proportion of examples falling in the partition (wi). The higher the information gain, the better a feature is at creating homogeneous groups after a split on this feature. If the information gain is zero, there is no reduction in entropy for splitting on this feature. On the other hand, the maximum information gain is equal to the entropy prior to the split. This would imply that the entropy after the split is zero, which means that the split results in completely homogeneous groups. The previous formulae assume nominal features, but decision trees use information gain for splitting on numeric features as well. To do so, a common practice is to test various splits that divide the values into groups greater than or less than a numeric threshold. This reduces the numeric feature into a two-level categorical feature that allows information gain to be calculated as usual. The numeric cut point yielding the largest information gain is chosen for the split. Though it is used by C5.0, information gain is not the only splitting criterion that can be used to build decision trees. Other commonly used criteria are Gini index, Chi-Squared statistic, and gain ratio. For a review of these (and many more) criteria, refer to Mingers J. An Empirical Comparison of Selection Measures for Decision-Tree Induction. Machine Learning. 1989; 3:319-342. Pruning the decision tree A decision tree can continue to grow indefinitely, choosing splitting features and dividing the data into smaller and smaller partitions until each example is perfectly classified or the algorithm runs out of features to split on. However, if the tree grows overly large, many of the decisions it makes will be overly specific and the model will be overfitted to the training data. The process of pruning a decision tree involves reducing its size such that it generalizes better to unseen data. 
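The entropy and information gain formulas referenced above appear as figures in the original article and are not reproduced in this text. Before continuing with pruning, here is a small R sketch of the same calculations; the entropy() and info_gain() helper functions and the example split proportions are illustrative, not code from the book.

# Entropy of a set with class proportions p: -sum(p * log2(p))
entropy <- function(p) {
  p <- p[p > 0]             # avoid log2(0)
  -sum(p * log2(p))
}

# Two-class example from above: 60 percent red, 40 percent white
entropy(c(0.60, 0.40))      # ~0.971, matching the earlier calculation

# Information gain for a candidate split on feature F:
# InfoGain(F) = Entropy(S1) - sum(w_i * Entropy(partition_i))
info_gain <- function(parent_p, partitions, weights) {
  children <- sum(weights * sapply(partitions, entropy))
  entropy(parent_p) - children
}

# Hypothetical split of the 60/40 parent into two equally sized partitions
info_gain(c(0.60, 0.40),
          list(c(0.90, 0.10), c(0.30, 0.70)),
          c(0.5, 0.5))      # about 0.30 bits of information gain

With split selection covered, we can return to the question of when the tree should stop growing.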
One solution to this problem is to stop the tree from growing once it reaches a certain number of decisions or when the decision nodes contain only a small number of examples. This is called early stopping or pre-pruning the decision tree. As the tree avoids doing needless work, this is an appealing strategy. However, one downside to this approach is that there is no way to know whether the tree will miss subtle, but important patterns that it would have learned had it grown to a larger size. An alternative, called post-pruning, involves growing a tree that is intentionally too large and pruning leaf nodes to reduce the size of the tree to a more appropriate level. This is often a more effective approach than pre-pruning, because it is quite difficult to determine the optimal depth of a decision tree without growing it first. Pruning the tree later on allows the algorithm to be certain that all the important data structures were discovered. The implementation details of pruning operations are very technical and beyond the scope of this article. For a comparison of some of the available methods, see Esposito F, Malerba D, Semeraro G. A Comparative Analysis of Methods for Pruning Decision Trees. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1997;19: 476-491. One of the benefits of the C5.0 algorithm is that it is opinionated about pruning—it takes care of many decisions automatically using fairly reasonable defaults. Its overall strategy is to post-prune the tree. It first grows a large tree that overfits the training data. Later, the nodes and branches that have little effect on the classification errors are removed. In some cases, entire branches are moved further up the tree or replaced by simpler decisions. These processes of grafting branches are known as subtree raising and subtree replacement, respectively. Balancing overfitting and underfitting a decision tree is a bit of an art, but if model accuracy is vital, it may be worth investing some time with various pruning options to see if it improves the performance on test data. As you will soon see, one of the strengths of the C5.0 algorithm is that it is very easy to adjust the training options. Summary This article covered two classification methods that use so-called "greedy" algorithms to partition the data according to feature values. Decision trees use a divide and conquer strategy to create flowchart-like structures, while rule learners separate and conquer data to identify logical if-else rules. Both methods produce models that can be interpreted without a statistical background. One popular and highly configurable decision tree algorithm is C5.0. We used the C5.0 algorithm to create a tree to predict whether a loan applicant will default. This article merely scratched the surface of how trees and rules can be used. Resources for Article: Further resources on this subject: Introduction to S4 Classes [article] First steps with R [article] Supervised learning [article]

article-image-scaling-influencers
Packt
11 Aug 2015
27 min read

Scaling influencers

In this article written by Adam Boduch, author of the book JavaScript at Scale, goes on to say how we don't scale our software systems just because we can. While it's common to tout scalability, these claims need to be put into practice. In order to do so, there has to be a reason for scalable software. If there's no need to scale, then it's much easier, not to mention cost-effective, to simply build a system that doesn't scale. Putting something that was built to handle a wide variety of scaling issues into a context where scale isn't warranted just feels clunky. Especially to the end user. So we, as JavaScript developers and architects, need to acknowledge and understand the influences that necessitate scalability. While it's true that not all JavaScript applications need to scale, it may not always be the case. For example, it's difficult to say that we know this system isn't going to need to scale in any meaningful way, so let's not invest the time and effort to make it scalable. Unless we're developing a throw-away system, there's always going to be expectations of growth and success. At the opposite end of the spectrum, JavaScript applications aren't born as mature scalable systems. They grow up, accumulating scalable properties along the way. Scaling influencers are an effective tool for those of us working on JavaScript projects. We don't want to over-engineer something straight from inception, and we don't want to build something that's tied-down by early decisions, limiting its ability to scale. (For more resources related to this topic, see here.) The need for scale Scaling software is a reactive event. Thinking about scaling influencers helps us proactively prepare for these scaling events. In other systems, such as web application backends, these scaling events may be brief spikes, and are generally handled automatically. For example, there's an increased load due to more users issuing more requests. The load balancer kicks in and distributes the load evenly across backend servers. In the extreme case, the system may automatically provision new backend resources when needed, and destroy them when they're no longer of use. Scaling events in the frontend aren't like that. Rather, the scaling events that take place generally happen over longer periods of time, and are more complex. The unique aspect of JavaScript applications is that the only hardware resources available to them are those available to the browser in which they run. They get their data from the backend, and this may scale up perfectly fine, but that's not what we're concerned with. As our software grows, a necessary side-effect of doing something successfully, is that we need to pay attention to the influencers of scale. The preceding figure shows us a top-down flow chart of scaling influencers, starting with users, who require that our software implements features. Depending on various aspects of the features, such as their size and how they relate to other features, this influences the team of developers working on features. As we move down through the scaling influencers, this grows. Growing user base We're not building an application for just one user. If we were, there would be no need to scale our efforts. While what we build might be based on the requirements of one user representative, our software serves the needs of many users. We need to anticipate a growing user base as our application evolves. 
There's no exact target user count, although, depending on the nature of our application, we may set goals for the number of active users, possibly by benchmarking similar applications using a tool such as http://www.alexa.com/. For example, if our application is exposed on the public internet, we want lots of registered users. On the other hand, we might target private installations, and there, the number of users joining the system is a little slower. But even in the latter case, we still want the number of deployments to go up, increasing the total number of people using our software. The number of users interacting with our frontend is the largest influencer of scale. With each user added, along with the various architectural perspectives, growth happens exponentially. If you look at it from a top-down point of view, users call the shots. At the end of the day, our application exists to serve them. The better we're able to scale our JavaScript code, the more users we'll please. Building new features Perhaps the most obvious side-effect of successful software with a strong user base is the features necessary to keep those users happy. The feature set grows along with the users of the system. This is often overlooked by projects, despite the obviousness of new features. We know they're coming, yet, little thought goes into how the endless stream of features going into our code impedes our ability to scale up our efforts. This is especially tricky when the software is in its infancy. The organization developing the software will bend over backwards to reel in new users. And there's little consequence of doing so in the beginning because the side-effects are limited. There's not a lot of mature features, there's not a huge development team, and there's less chance of annoying existing users by breaking something that they've come to rely on. When these factors aren't there, it's easier for us to nimbly crank out the features and dazzle existing/prospective users. But how do we force ourselves to be mindful of these early design decisions? How do we make sure that we don't unnecessarily limit our ability to scale the software up, in terms of supporting more features? New feature development, as well as enhancing existing features, is an ongoing issue with scalable JavaScript architecture. It's not just the number of features listed in the marketing literature of our software that we need to be concerned about . There's also the complexity of a given feature, how common our features are with one another, and how many moving parts each of these features has. If the user is the first level when looking at JavaScript architecture from a top-down perspective, each feature is the next level, and from there, it expands out into enormous complexity. It's not just the individual users who make a given feature complex. Instead, it's a group of users that all need the same feature in order to use our software effectively. And from there, we have to start thinking about personas, or roles, and which features are available for which roles. The need for this type of organizational structure isn't made apparent till much later on in the game; after we've made decisions that make it difficult to introduce role-based feature delivery. And depending on how our software is deployed, we may have to support a variety of unique use cases. 
Hiring more developers

Making these features a reality requires solid JavaScript developers who know what they're doing, and if we're lucky, we'll be able to hire a team of them. The team part doesn't happen automatically. There's a level of trust and respect that needs to be established before the team members begin to actively rely on one another to crank out some awesome code. Once that starts happening, we're in good shape.

Turning once again to the top-down perspective of our scaling influencers, the features we deliver can directly impact the health of our team. There's a balance that's essentially impossible to maintain, but we can at least get close. Too many features and not enough developers lead to a sense of perpetual inadequacy among team members. When there's no chance of delivering what's expected, there's not much sense in trying. On the other hand, if we have too many developers for a limited number of features, the communication overhead grows and it's tough to define responsibilities. When there's no shared understanding of responsibilities, things start to break down.

It's actually easier to deal with not having enough developers for the features we're trying to develop than with having too many developers. When there's a large burden of feature development, it's a good opportunity to step back and think: what would we do differently if we had more developers? This question usually gets skipped. We go hire more developers, and when they arrive, it's to everyone's surprise that there's no immediate improvement in feature throughput. This is why it's best to have an open development culture where there are no stupid questions, and where responsibilities are defined.

There's no one correct team structure or development methodology. The development team needs to apply itself to the issues faced by the software we're trying to deliver. The biggest hurdle is, without question, the number, size, and complexity of features. That's something we need to consider when forming our team initially, as well as when growing the team. The latter point is especially true because the team structure we used back when the software was new isn't going to fit what we face when the features scale up.

Architectural perspectives

The preceding section was a sampling of the factors that influence scale in JavaScript applications. Starting from the top, each of these influencers affects the influencer below it. The number and nature of our users is the first and foremost influencer, and this has a direct impact on the number and nature of the features we develop. Furthermore, the size of the development team, and the structure of that team, are influenced by these features.

Our job is to take these influencers of scale and translate them into factors to consider from an architectural perspective: scaling influences the perspectives of our architecture, and our architecture, in turn, determines responses to scaling influencers. The process is iterative and never-ending throughout the lifetime of our software.

The browser is a unique environment

Scaling up in the traditional sense doesn't really work in a browser environment.
When backend services are overwhelmed by demand, it's common to "throw more hardware" at the problem. Easier said than done, of course, but it's a lot easier to scale up our data services these days than it was 20 years ago. Today's software systems are designed with scalability in mind. It's helpful to our frontend application if the backend services are always available and always responsive, but that's just a small portion of the issues we face. We can't throw more hardware at the web browsers running our code; given that, the time and space complexities of our algorithms are important.

Desktop applications generally have a set of system requirements for running the software, such as OS version, minimum memory, minimum CPU, and so on. If we were to advertise requirements such as these in our JavaScript applications, our user base would shrink dramatically, and possibly generate some hate mail. The expectation that browser-based web applications be lean and fast is an emergent phenomenon. Perhaps that's due in part to the competition we face. There are a lot of bloated applications out there, and whether they're used in the browser or natively on the desktop, users know what bloat feels like, and generally run the other way:

JavaScript applications require many resources, all of different types; these are all fetched by the browser, on the application's behalf.

Adding to our trouble is the fact that we're using a platform that was designed as a means to download and display hypertext, to click on a link, and repeat. Now we're doing the same thing, except with full-sized applications. Multi-page applications are slowly being set aside in favor of single-page applications. That being said, the application is still treated as though it were a web page. Despite all that, we're in the midst of big changes. The browser is a fully viable web platform, the JavaScript language is maturing, and there are numerous W3C specifications in progress; they assist with treating our JavaScript more like an application and less like a document. Take a look at the following diagram:

A sampling of the technologies found in the growing web platform

We use architectural perspectives to assess any architectural design we come up with. It's a powerful technique to examine our design through a different lens. JavaScript architecture is no different, especially for architectures that scale. The difference between JavaScript architecture and architecture for other environments is that ours has unique perspectives. The browser environment requires that we think differently about how we design, build, and deploy applications. Anything that runs in the browser is transient by nature, and this changes software design practices that we've taken for granted over the years. Additionally, we spend more time coding our architectures than diagramming them. By the time we sketch anything out, it's been superseded by another specification or another tool.

Component design

At an architectural level, components are the main building blocks we work with. These may be very high-level components with several levels of abstraction. Or, they could be something exposed by a framework we're using, as many of these tools provide their own idea of "components". When we first set out to build a JavaScript application with scale in mind, the composition of our components begins to take shape. How our components are composed is a huge limiting factor in how we scale, because they set the standard. Components implement patterns for the sake of consistency, and it's important to get those patterns right:

Components have an internal structure. The complexity of this composition depends on the type of component under consideration.
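As a rough illustration of what a consistent component pattern can look like, here's a minimal sketch. The factory name, the state/template split, and the element ID are assumptions made for the example, not a structure mandated by any framework:

```javascript
// A minimal sketch of a consistent component pattern (assumed names).
// Every component exposes the same shape: state, setState, render, destroy,
// so composing and replacing components stays uniform across the application.
function createComponent({ element, state = {}, template }) {
  const component = {
    element,
    state,
    // Update state, then re-render so the DOM reflects the change.
    setState(changes) {
      Object.assign(component.state, changes);
      component.render();
    },
    render() {
      element.innerHTML = template(component.state);
    },
    destroy() {
      element.innerHTML = '';
    }
  };
  return component;
}

// Usage: a tiny counter built from the same pattern.
// Assumes an element with id="counter" exists on the page.
const counter = createComponent({
  element: document.getElementById('counter'),
  state: { count: 0 },
  template: (state) => `<span>Count: ${state.count}</span>`
});

counter.render();
counter.setState({ count: counter.state.count + 1 });
```

The specifics matter far less than the fact that every component follows the same composition; that shared shape is the standard referred to above.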
As we'll see, the design of our various components is closely tied to the trade-offs we make in other perspectives. And that's a good thing, because it means that if we're paying attention to the scalable qualities we're after, we can go back and adjust the design of our components in order to meet those qualities.

Component communication

Components don't sit in the browser on their own. Components communicate with one another all the time. There's a wide variety of communication techniques at our disposal here. Component communication could be as simple as method invocation, or as complex as an asynchronous publish-subscribe event system. The approach we take with our architecture depends on our more specific goals. The challenge with components is that we often don't know what the ideal communication mechanism will be until after we've started implementing our application. We have to make sure that we can adjust the chosen communication path:

The component communication mechanism decouples components, enabling scalable structures.

Seldom will we implement our own communication mechanism for our components. Not when so many tools exist that solve at least part of the problem for us. Most likely, we'll end up with a concoction of an existing communication tool and our own implementation specifics. What's important is that the component communication mechanism is its own perspective, which can be designed independently of the components themselves.

Load time

JavaScript applications are always loading something. The biggest challenge is the application itself, loading all the static resources it needs to run, before the user is allowed to do anything. Then there's the application data. This needs to be loaded at some point, often on demand, and contributes to the overall latency experienced by the user. Load time is an important perspective, because it hugely contributes to the overall perception of our product quality.

The initial load is the user's first impression and this is where most components are initialized; it's tough to get the initial load to be fast without sacrificing performance in other areas.

There's lots we can do here to offset the negative user experience of waiting for things to load. This includes utilizing web specifications that allow us to treat applications and the services they use as installable components in the web browser platform. Of course, these are all nascent ideas, but worth considering as they mature alongside our application.

Responsiveness

The second part of the performance perspective of our architecture is concerned with responsiveness. That is, after everything has loaded, how long does it take for us to respond to user input? Although this is a separate problem from that of loading resources from the backend, they're still closely related. Often, user actions trigger API requests, and the techniques we employ to handle these workflows impact user-perceived responsiveness.

User-perceived responsiveness is affected by the time taken by our components to respond to DOM events; a lot can happen in between the initial DOM event and when we finally notify the user by updating the DOM.

Because of this necessary API interaction, user-perceived responsiveness is important. While we can't make the API go any faster, we can take steps to ensure that the user always has feedback from the UI and that feedback is immediate.
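For example, a component can acknowledge the user's action right away and reconcile the display once the request settles. The following is a minimal sketch of that idea; the endpoint, element IDs, and payload are assumptions made for illustration:

```javascript
// A minimal sketch of immediate UI feedback around an API call (assumed names).
// Assumes elements with id="save" and id="status" exist on the page.
const saveButton = document.getElementById('save');
const statusLabel = document.getElementById('status');

saveButton.addEventListener('click', async () => {
  // Give feedback immediately, before the network round trip completes.
  saveButton.disabled = true;
  statusLabel.textContent = 'Saving...';

  try {
    const response = await fetch('/api/preferences', {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ theme: 'dark' })
    });
    statusLabel.textContent = response.ok ? 'Saved' : 'Save failed';
  } catch (error) {
    statusLabel.textContent = 'Save failed';
  } finally {
    saveButton.disabled = false;
  }
});
```

The network is no faster than before, but the user is never left wondering whether their click registered.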
Then, there's the responsiveness of simply navigating around the UI, using cached data that's already been loaded, for example. Every other architectural perspective is closely tied to the performance of our JavaScript code, and ultimately, to the user-perceived responsiveness. This perspective is a subtle sanity check for the design of our components and their chosen communication paths.

Addressability

Just because we're building a single-page application doesn't mean we no longer care about addressable URIs. This is perhaps the crowning achievement of the web: unique identifiers that point to the resource we want. We paste them into our browser address bar and watch the magic happen. Our application most certainly has addressable resources, we just point to them differently. Instead of a URI that's parsed by the backend web server, where the page is constructed and sent back to the browser, it's our local JavaScript code that understands the URI:

Components listen to routers for route events and respond accordingly. A changing browser URI triggers these events.

Typically, these URIs will map to an API resource. When the user hits one of these URIs in our application, we'll translate the URI into another URI that's used to request backend data. The component we use to manage these application URIs is called a router, and there are lots of frameworks and libraries with a base implementation of a router. We'll likely use one of these.

The addressability perspective plays a major role in our architecture, because ensuring that the various aspects of our application have an addressable URI complicates our design. However, it can also make things easier if we're clever about it. We can have our components utilize the URIs in the same way a user utilizes links.

Configurability

Rarely does software do what you need it to straight out of the box. Highly configurable software systems are touted as being good software systems. Configuration in the frontend is a challenge because there are several dimensions of configuration, not to mention the issue of where we store these configuration options. Default values for configurable components are problematic too: where do they come from? For example, is there a default language setting that's set until the user changes it? As is often the case, different deployments of our frontend will require different default values for these settings:

Component configuration values can come from the backend server, or from the web browser. Defaults must reside somewhere.

Every configurable aspect of our software complicates its design. Not to mention the performance overhead and potential bugs. So, configurability is a large issue, and it's worth the time spent up-front discussing with various stakeholders what they value in terms of configurability. Depending on the nature of our deployment, users may value portability with their configuration. This means that their values need to be stored in the backend, under their account settings. Obviously, decisions like these have backend design implications, and sometimes it's better to get away with approaches that don't require a modified backend service.
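One common way to keep this manageable is to merge defaults with whatever the deployment and the user provide, so the defaults live in exactly one place. Here's a minimal sketch; the option names, the storage key, and the idea of reading user overrides from localStorage are assumptions made for illustration:

```javascript
// A minimal sketch of layered component configuration (assumed names).
// Defaults live in one place; deployment settings and user settings
// override them, in that order.
const defaults = {
  language: 'en',
  pageSize: 25,
  theme: 'light'
};

function loadUserOverrides() {
  // User-level overrides might live in the browser, or be fetched from the
  // backend under the user's account settings for portability.
  try {
    return JSON.parse(localStorage.getItem('settings')) || {};
  } catch (error) {
    return {};
  }
}

function buildConfig(deploymentOverrides = {}) {
  return Object.assign({}, defaults, deploymentOverrides, loadUserOverrides());
}

// Usage: a private installation that defaults everyone to German.
const config = buildConfig({ language: 'de' });
console.log(config.language, config.pageSize);
```

Whether the overrides ultimately come from the browser or the backend, the components only ever see the merged result, which keeps the question of "where do defaults come from?" answered in one place.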
Making architectural trade-offs

There's a lot to consider from the various perspectives of our architecture if we're going to build something that scales. We'll never get everything that we need out of every perspective simultaneously. This is why we make architectural trade-offs: we trade one aspect of our design for another, more desirable aspect.

Defining your constants

Before we start making trade-offs, it's important to state explicitly what cannot be traded. What aspects of our design are so crucial to achieving scale that they must remain constant? For instance, a constant might be the number of entities rendered on a given page, or a maximum level of function call indirection. There shouldn't be a ton of these architectural constants, but they do exist. It's best if we keep them narrow in scope and limited in number. If we have too many strict design principles that cannot be violated or otherwise changed to fit our needs, we won't be able to easily adapt to changing influencers of scale.

Does it make sense to have constant design principles that never change, given the unpredictability of scaling influencers? It does, but only once they emerge and are obvious. So this may not be an up-front principle, though we'll often have at least one or two up-front principles to follow. The discovery of these principles may result from the early refactoring of code or the later success of our software. In any case, the constants we use going forward must be made explicit and be agreed upon by all those involved.

Performance for ease of development

Performance bottlenecks need to be fixed, or avoided in the first place where possible. Some performance bottlenecks are obvious and have an observable impact on the user experience. These need to be fixed immediately, because they mean our code isn't scaling for some reason, and they might even point to a larger design issue.

Other performance issues are relatively small. These are generally noticed by developers running benchmarks against code, trying by all means necessary to improve the performance. This doesn't scale well, because these smaller performance bottlenecks that aren't observable by the end user are time-consuming to fix. If our application is of a reasonable size, with more than a few developers working on it, we're not going to be able to keep up with feature development if everyone's fixing minor performance problems. These micro-optimizations introduce specialized solutions into our code, and they're not exactly easy reading for other developers. On the other hand, if we let these minor inefficiencies go, we will manage to keep our code cleaner and thus easier to work with. Where possible, trade off optimized performance for better code quality. This improves our ability to scale from a number of perspectives.

Configurability for performance

It's nice to have generic components where nearly every aspect is configurable. However, this approach to component design comes at a performance cost. It's not noticeable at first, when there are few components, but as our software scales in feature count, the number of components grows, and so does the number of configuration options. Depending on the size of each component (its complexity, number of configuration options, and so forth), the potential for performance degradation increases exponentially. Take a look at the following diagram:

The component on the left has twice as many configuration options as the component on the right. It's also twice as difficult to use and maintain.

We can keep our configuration options around as long as there are no performance issues affecting our users.
Just keep in mind that we may have to remove certain options in an effort to remove performance bottlenecks. It's unlikely that configurability is going to be our main source of performance issues. It's also easy to get carried away as we scale and add features. We'll find, retrospectively, that we created configuration options at design time that we thought would be helpful, but turned out to be nothing but overhead. Trade off configurability for performance when there's no tangible benefit to having the configuration option.

Performance for substitutability

A related problem to that of configurability is substitutability. Our user interface performs well, but as our user base grows and more features are added, we discover that certain components cannot be easily substituted with another. This can be a developmental problem, where we want to design a new component to replace something pre-existing. Or perhaps we need to substitute components at runtime.

Our ability to substitute components lies mostly with the component communication model. If the new component is able to send and receive messages and events in the same way as the existing component, then it's a fairly straightforward substitution. However, not all aspects of our software are substitutable. In the interest of performance, there may not even be a component to replace. As we scale, we may need to refactor larger components into smaller components that are replaceable. By doing so, we're introducing a new level of indirection, and a performance hit. Trade off minor performance penalties to gain substitutability that aids in other aspects of scaling our architecture.

Ease of development for addressability

Assigning addressable URIs to resources in our application certainly makes implementing features more difficult. Do we actually need URIs for every resource exposed by our application? Probably not. For the sake of consistency though, it would make sense to have URIs for almost every resource. If we don't have a router and URI generation scheme that's consistent and easy to follow, we're more likely to skip implementing URIs for certain resources.

It's almost always better to have the added burden of assigning URIs to every resource in our application than to skip out on URIs, or worse still, not supporting addressable resources at all. URIs make our application behave like the rest of the Web, the training ground for all our users. For example, perhaps URI generation and routes are a constant for anything in our application, a trade-off that cannot happen. Trade off ease of development for addressability in almost every case. The ease of development problem with regard to URIs can be tackled in more depth as the software matures.

Maintainability for performance

The ease with which features are developed in our software boils down to the development team and its scaling influencers. For example, we could face pressure to hire entry-level developers for budgetary reasons. How well this approach scales depends on our code. When we're concerned with performance, we're likely to introduce all kinds of intimidating code that relatively inexperienced developers will have trouble swallowing. Obviously, this impedes the ease of developing new features, and if it's difficult, it takes longer. That does not scale with respect to customer demand.

Developers don't always have to struggle with understanding the unorthodox approaches we've taken to tackle performance bottlenecks in specific areas of the code. We can certainly help the situation by writing quality code that's understandable, and maybe even documentation. But we won't get all of this for free; if we're to support the team as a whole as it scales, we need to pay the productivity penalty in the short term for having to coach and mentor. Trade off ease of development for performance in critical code paths that are heavily utilized and not modified often. We can't always escape the ugliness required for performance purposes, but if it's well hidden, we'll gain the benefit of the more common code being comprehensible and self-explanatory. For example, low-level JavaScript libraries perform well and have a cohesive API that's easy to use. But if you look at some of the underlying code, it isn't pretty. That's our gain: having someone else maintain code that's ugly for performance reasons.

Our components on the left follow coding styles that are consistent and easy to read; they all utilize the high-performance library on the right, giving our application performance while isolating optimized code that's difficult to read and understand.
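The same isolation can happen inside our own code. The following is a minimal sketch of the idea, with assumed names and a deliberately simple example: the calling code sees a readable API, while the loop-heavy, allocation-free internals stay in one well-hidden place:

```javascript
// A minimal sketch of hiding performance-sensitive code behind a clean API
// (assumed names and data shapes, for illustration only).
function createNearestPointIndex(points) {
  // Pack coordinates into a flat typed array once, up front.
  const coords = new Float64Array(points.length * 2);
  points.forEach((point, i) => {
    coords[i * 2] = point.x;
    coords[i * 2 + 1] = point.y;
  });

  return {
    // The "ugly" hot path: a plain indexed loop, no intermediate objects.
    nearest(x, y) {
      let bestIndex = 0;
      let bestDistance = Infinity;
      for (let i = 0; i < coords.length; i += 2) {
        const dx = coords[i] - x;
        const dy = coords[i + 1] - y;
        const distance = dx * dx + dy * dy;
        if (distance < bestDistance) {
          bestDistance = distance;
          bestIndex = i / 2;
        }
      }
      return points[bestIndex];
    }
  };
}

// Calling code stays readable and never sees the optimization details.
const index = createNearestPointIndex([{ x: 0, y: 0 }, { x: 3, y: 4 }]);
console.log(index.nearest(2.5, 3.5)); // { x: 3, y: 4 }
```

Most of the team reads and modifies only the calling code; the optimized internals are visited rarely, and only by those prepared for them.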
Less features for maintainability

When all else fails, we need to take a step back and look holistically at the feature set of our application. Can our architecture support them all? Is there a better alternative? Scrapping an architecture that we've sunk many hours into almost never makes sense, but it does happen. The majority of the time, however, we'll be asked to introduce a challenging set of features that violate one or more of our architectural constants. When that happens, we're disrupting stable features that already exist, or we're introducing something of poor quality into the application. Neither case is good, and it's worth the time, the headache, and the cursing to work with the stakeholders to figure out what has to go.

If we've taken the time to figure out our architecture by making trade-offs, we should have a sound argument for why our software can't support hundreds of features. When an architecture is full, we can't continue to scale. The key is understanding where that breaking threshold lies, so that we can better understand and communicate it to stakeholders.

Leveraging frameworks

Frameworks exist to help us implement our architecture using a cohesive set of patterns. There's a lot of variety out there, and choosing a framework is a combination of personal taste and fitness based on our design. For example, one JavaScript application framework will do a lot for us out of the box, while another has even more features, many of which we don't need.

JavaScript application frameworks vary in size and sophistication. Some come with batteries included, and some tend toward mechanism over policy. None of these frameworks was specifically designed for our application. Any purported ability of a framework needs to be taken with a grain of salt. The features advertised by frameworks are applied to a general case, and a simple one at that. Applying them in the context of our architecture is something else entirely.

That being said, we can certainly use a given framework of our liking as input to the design process. If we really like the tool, and our team has experience using it, we can let it influence our design decisions, just as long as we understand that the framework does not automatically respond to scaling influencers; that part is up to us. It's worth the time investigating the framework to use for our project, because choosing the wrong framework is a costly mistake.
The realization that we should have gone with something else usually comes after we've implemented lots of functionality. The end result is lots of rewriting, re-planning, re-training, and re-documenting, not to mention the time lost on the first implementation. Choose your frameworks wisely, and be cautious about framework coupling.

Summary

Scaling a JavaScript application isn't the same as scaling other types of applications. Although we can use JavaScript to create large-scale backend services, our concern is with scaling the applications our users interact with in the browser. And there are a number of influencers that guide our decision-making process on producing an architecture that scales.

We reviewed some of these influencers, and how they flow in a top-down fashion, creating challenges unique to frontend JavaScript development. We examined the effect of more users, more features, and more developers; we can see that there's a lot to think about. While the browser is becoming a powerful platform, onto which we're delivering our applications, it still has constraints not found on other platforms.

Designing and implementing a scalable JavaScript application requires having an architecture. What the software must ultimately do is just one input to that design. The scaling influencers are key as well. From there, we address different perspectives of the architecture under consideration. Things such as component composition and responsiveness come into play when we talk about scale. These are observable aspects of our architecture that are impacted by influencers of scale. As these scaling factors change over time, we use architectural perspectives as tools to modify our design, or the product, to align with scaling challenges.