Navigation Mesh Generation

Packt
19 Dec 2014
9 min read
In this article by Curtis Bennett and Dan Violet Sagmiller, authors of the book Unity AI Programming Essentials, we will learn about navigation meshes in Unity. Navigation mesh generation controls how AI characters are able to travel around a game level and is one of the most important topics in game AI. In this article, we will provide an overview of navigation meshes and look at the algorithm for generating them. Then, we'll look at different options for customizing our navigation meshes. To do this, we will be using RAIN 2.1.5, a popular AI plugin for Unity by Rival Theory, available for free at http://rivaltheory.com/rain/download/.

In this article, you will learn about:

- How navigation mesh generation works and the algorithm behind it
- Advanced options for customizing navigation meshes
- Creating advanced navigation meshes with RAIN

(For more resources related to this topic, see here.)

An overview of a navigation mesh

To use navigation meshes, also referred to as NavMeshes, effectively, the first thing we need to know is what exactly navigation meshes are and how they are created. A navigation mesh is a definition of the area an AI character can travel to in a level. It is a mesh, but it is not intended to be rendered or seen by the player; instead, it is used by the AI system. A NavMesh usually does not cover all the area in a level (if it did, we wouldn't need one), since it covers just the area a character can walk. The mesh is also almost always a simplified version of the level geometry. For instance, a cave floor in a game could have thousands of polygons along the bottom showing different details in the rock, but in the navigation mesh the same area would be just a handful of very large polys giving a simplified view of the level. The purpose of a navigation mesh is to provide the rest of the AI system with this simplified representation, which it uses to find a path between two points on a level for a character. That is its purpose; now let's discuss how navigation meshes are created.
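Once the mesh exists, pathfinding treats its polygons as nodes in a graph. The following is a minimal sketch of that idea, written in Python for brevity rather than in Unity's API: the adjacency map is a hand-built stand-in for a real NavMesh, and a production system would use A* rather than plain breadth-first search.

```python
from collections import deque

def find_path(adjacency, start, goal):
    """Breadth-first search over a polygon adjacency graph.

    adjacency maps a polygon id to the ids of polygons sharing an edge
    with it; the returned list is the sequence of polygons to cross.
    """
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        current = frontier.popleft()
        if current == goal:
            # Walk backwards from the goal to recover the path.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        for neighbor in adjacency[current]:
            if neighbor not in came_from:
                came_from[neighbor] = current
                frontier.append(neighbor)
    return None  # goal unreachable

# A toy NavMesh: polygons A-D, where D is only reachable through C.
navmesh = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}
print(find_path(navmesh, "A", "D"))  # ['A', 'C', 'D']
```

The simplification matters here: the fewer polygons in the NavMesh, the smaller this graph, and the cheaper every path query at runtime.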
It used to be common practice in the games industry to create navigation meshes manually. A designer or artist would take the completed level geometry, create one using standard polygon mesh modelling tools, and save it out. As you might imagine, this allowed for nice, custom, efficient meshes, but it was also a big time sink, since every time the level changed the navigation mesh had to be manually edited and updated. In recent years, there has been more research into automatic navigation mesh generation. There are many approaches to automatic navigation mesh generation, but the most popular is Recast, originally developed and designed by Mikko Mononen. Recast takes in level geometry and a set of parameters defining the character, such as the size of the character and how big a step it can take, and then uses a multipass approach to filter and create the final NavMesh. The most important phase of this is voxelizing the level based on an inputted cell size. This means the level geometry is divided into voxels (cubes), creating a version of the level geometry where everything is partitioned into different boxes called cells. The geometry in each of these cells is then analyzed and simplified based on its intersection with the sides of the boxes, and culled based on things such as the slope of the geometry or how big the step height is between pieces of geometry. This simplified geometry is then merged and triangulated to make a final navigation mesh that can be used by the AI system. The source code and more information on the original C++ implementation of Recast are available at https://github.com/memononen/recastnavigation.

Advanced NavMesh parameters

Now that we understand how navigation mesh generation works, let's look at the different parameters you can set to generate navigation meshes in more detail. We'll look at how to do this with RAIN:

Open Unity, create a new scene, and add a floor and some blocks for walls.
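The voxelize-and-cull idea can be illustrated with a deliberately simplified sketch. This is not Recast's actual code (Recast works on 3D triangle geometry and height spans); it is a Python toy that samples a hypothetical height function on a grid of cells and culls cells that are too steep to stand on:

```python
import math

def walkable_cells(height_at, size, cell_size, max_slope_deg):
    """Crude illustration of Recast's voxel pass: sample the ground on a
    grid of cell_size squares and keep cells whose slope towards each
    neighbour stays under max_slope_deg."""
    n = round(size / cell_size)
    heights = [[height_at(x * cell_size, z * cell_size) for z in range(n)]
               for x in range(n)]
    max_slope = math.tan(math.radians(max_slope_deg))
    walkable = set()
    for x in range(n):
        for z in range(n):
            ok = True
            for dx, dz in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, nz = x + dx, z + dz
                if 0 <= nx < n and 0 <= nz < n:
                    rise = abs(heights[nx][nz] - heights[x][z])
                    if rise / cell_size > max_slope:
                        ok = False  # too steep to walk onto
            if ok:
                walkable.add((x, z))
    return walkable

def ground(x, z):
    """A flat level with a 4-unit-high ridge in the strip 5 <= x < 6."""
    return 4.0 if 5.0 <= x < 6.0 else 0.0

cells = walkable_cells(ground, size=10, cell_size=1.0, max_slope_deg=45)
print((0, 0) in cells, (5, 0) in cells)  # the ridge cells are culled
```

The real algorithm then simplifies and triangulates the surviving cells into large polygons; the sketch stops at the culling step, which is where the cell size and slope parameters discussed below come into play.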
Download RAIN from http://rivaltheory.com/rain/download/ and import it into your scene. Then go to RAIN | Create Navigation Mesh. Also, right-click on the RAIN menu and choose Show Advanced Settings. The setup should look something like the following screenshot.

Now let's look at some of the important parameters:

Size: This is the overall size of the navigation mesh. You'll want the navigation mesh to cover your entire level; use this parameter instead of trying to scale up the navigation mesh through the Scale transform in the Inspector window. For our demo here, set the Size parameter to 20.

Walkable Radius: This is an important parameter that defines the character size for the mesh. Remember, each mesh is matched to the size of a particular character, and this is the radius of that character. You can visualize the radius for a character by adding a Unity Sphere Collider to your object (by going to Component | Physics | Sphere Collider) and adjusting the radius of the collider.

Cell Size: This is also a very important parameter. During the voxel step of the Recast algorithm, this sets the size of the cubes used to inspect the geometry. The smaller the size, the more detailed and finer the mesh, but the longer the processing time for Recast. A large cell size makes computation fast but loses detail. For example, here is a NavMesh from our demo with a cell size of 0.01; you can see the finer detail. Here is the navigation mesh generated with a cell size of 0.1. Note the difference between the two screenshots: in the former, walking through the two walls lower down in our picture is possible, but in the latter, with a larger cell size, there is no path even though the character radius is the same. Problems like this become greater with larger cell sizes. The following is a navigation mesh with a cell size of 1; as you can see, the detail becomes jumbled and the mesh itself becomes unusable.
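To see why a small Cell Size inflates generation time, consider the arithmetic. This is a back-of-the-envelope Python sketch, counting only the square footprint of the level; Recast actually voxelizes in 3D, so the real growth is worse:

```python
def cell_count(level_size, cell_size):
    """Cells in the square footprint of the level: halving the cell
    size quadruples the number of cells the voxel pass must inspect."""
    per_side = round(level_size / cell_size)
    return per_side ** 2

for cs in (1.0, 0.1, 0.01):
    print(cs, cell_count(20, cs))
# 1.0  ->       400 cells
# 0.1  ->    40,000 cells
# 0.01 -> 4,000,000 cells
```

For our 20-unit demo level, moving from a cell size of 0.1 to 0.01 multiplies the work a hundredfold, which matches the processing-time warning below.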
With such differing results, the big question is: how large should the cell size be for a level? The answer depends on the required result. However, one important consideration is that the processing to generate a mesh happens during development, not at runtime, so even if it takes several minutes to generate a good mesh, it can be worth it to get a good result in the game. Setting a small cell size on a large level can cause mesh processing to take a significant amount of time and consume a lot of memory. It is good practice to save the scene before attempting to generate a complex navigation mesh.

The Size, Walkable Radius, and Cell Size parameters are the most important parameters when generating the navigation mesh, but there are more that are used to customize the mesh further:

Max Slope: This is the largest slope that a character can walk on, that is, how far a piece of geometry can be tilted and still be walked on. If you take the wall and rotate it, you can see it is walkable. The preceding is a screenshot of a walkable object with slope.

Step Height: This is how high a character can step from one object to another. For example, if you have steps between two blocks, as shown in the following screenshot, this defines how far apart in height the blocks can be while the area is still considered walkable. This is a screenshot of the navigation mesh with the step height set to connect adjacent blocks.

Walkable Height: This is the vertical height needed for the character to walk. For example, in the previous illustration, the second block is not walkable underneath because of the walkable height. If you raise it at least one unit off the ground and set the walkable height to 1, the area underneath becomes walkable. You can see a screenshot of the navigation mesh with the walkable height set to allow going under the higher block.

These are the most important parameters.
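The three parameters above are simple geometric tests. The Python predicates below are not RAIN's implementation; they are hypothetical helpers, with made-up default limits, showing the check each parameter performs:

```python
def can_walk_on(surface_angle_deg, max_slope_deg=45.0):
    """Max Slope: a surface tilted past the limit is not walkable."""
    return surface_angle_deg <= max_slope_deg

def can_step_between(height_a, height_b, step_height=0.5):
    """Step Height: adjacent surfaces connect only if the rise between
    them is within the character's step."""
    return abs(height_a - height_b) <= step_height

def can_fit_under(clearance, walkable_height=1.0):
    """Walkable Height: the character needs vertical room to pass."""
    return clearance >= walkable_height

# A 30-degree ramp is walkable; blocks 0.4 units apart connect; a
# 0.8-unit gap is too low for a character needing 1 unit of clearance.
print(can_walk_on(30))             # True
print(can_step_between(1.0, 1.4))  # True
print(can_fit_under(0.8))          # False
```

Tuning any of these limits and regenerating the mesh simply changes which surfaces pass the corresponding test.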
There are some other parameters related to visualization and to culling objects. We will look at culling more in the next section.

Culling areas

Being able to set up areas as walkable or not is an important part of creating a level. To demo this, let's divide the level into two parts and create a bridge between them. Take our demo, duplicate the floor, and pull it down. Then transform one of the walls into a bridge. Then, add two other pieces of geometry to mark areas that are dangerous to walk on, like lava. Here is an example setup: a basic scene with a bridge to cross. If you recreate the navigation mesh now, all of the geometry will be covered and the bridge won't be recognized. To fix this, you can create a new tag called Lava and tag the geometry under the bridge with it. Then, in the navigation mesh's RAIN component, add Lava to the unwalkable tags. If you then regenerate the mesh, only the bridge is walkable. This is a screenshot of the navigation mesh with the areas under the bridge culled. Using layers and the walkable tags, you can customize navigation meshes.

Summary

Navigation meshes are an important part of game AI. In this article, we looked at the different parameters used to customize navigation meshes. We looked at things such as setting the character size and walkable slopes, and discussed the importance of the cell size parameter. We then saw how to customize our mesh by tagging different areas as not walkable. This should be a good start for designing navigation meshes for your games.

Resources for Article:

Further resources on this subject:

- Components in Unity [article]
- Enemy and Friendly AIs [article]
- Introduction to AI [article]

Supervised learning

Packt
19 Dec 2014
50 min read
In this article by Dan Toomey, author of the book R for Data Science, we will learn about supervised learning, which involves the use of a target variable and a number of predictor variables that are put into a model to enable the system to predict the target. This is also known as predictive modeling.

(For more resources related to this topic, see here.)

As mentioned, in supervised learning we have a target variable and a number of possible predictor variables. The objective is to associate the predictor variables in such a way as to accurately predict the target variable. We use some portion of the observed data to learn how our model behaves and then test that model on the remaining observations for accuracy. We will go over the following supervised learning techniques:

- Decision trees
- Regression
- Neural networks
- Instance-based learning (k-NN)
- Ensemble learning
- Support vector machines
- Bayesian learning
- Bayesian inference
- Random forests

Decision tree

For decision tree machine learning, we develop a logic tree that can be used to predict our target value based on a number of predictor variables. The tree has logical points, such as: if the month is December, follow the tree logic to the left; otherwise, follow the tree logic to the right. The last leaf of the tree has a predicted value.

For this example, we will use the weather data in the rattle package. We will develop a decision tree to determine whether it will rain tomorrow or not based on several variables. Let's load the rattle package as follows:

> library(rattle)

We can see a summary of the weather data. This shows that we have some real data over a year from Australia:

> summary(weather)
      Date                     Location      MinTemp
 Min.   :2007-11-01   Canberra     :366   Min.   :-5.300
 1st Qu.:2008-01-31   Adelaide     :  0   1st Qu.: 2.300
 Median :2008-05-01   Albany       :  0   Median : 7.450
 Mean   :2008-05-01   Albury       :  0   Mean   : 7.266
 3rd Qu.:2008-07-31   AliceSprings :  0   3rd Qu.:12.500
 Max.   :2008-10-31   BadgerysCreek:  0   Max.   :20.900
                      (Other)      :  0
    MaxTemp         Rainfall       Evaporation       Sunshine
 Min.   : 7.60   Min.   : 0.000   Min.   : 0.200   Min.   : 0.000
 1st Qu.:15.03   1st Qu.: 0.000   1st Qu.: 2.200   1st Qu.: 5.950
 Median :19.65   Median : 0.000   Median : 4.200   Median : 8.600
 Mean   :20.55   Mean   : 1.428   Mean   : 4.522   Mean   : 7.909
 3rd Qu.:25.50   3rd Qu.: 0.200   3rd Qu.: 6.400   3rd Qu.:10.500
 Max.   :35.80   Max.   :39.800   Max.   :13.800   Max.   :13.600
                                                   NA's   :3
  WindGustDir   WindGustSpeed    WindDir9am    WindDir3pm
 NW     : 73   Min.   :13.00   SE     : 47   WNW    : 61
 NNW    : 44   1st Qu.:31.00   SSE    : 40   NW     : 61
 E      : 37   Median :39.00   NNW    : 36   NNW    : 47
 WNW    : 35   Mean   :39.84   N      : 31   N      : 30
 ENE    : 30   3rd Qu.:46.00   NW     : 30   ESE    : 27
 (Other):144   Max.   :98.00   (Other):151   (Other):139
 NA's   :  3   NA's   :2       NA's   : 31   NA's   :  1
  WindSpeed9am     WindSpeed3pm    Humidity9am     Humidity3pm
 Min.   : 0.000   Min.   : 0.00   Min.   :36.00   Min.   :13.00
 1st Qu.: 6.000   1st Qu.:11.00   1st Qu.:64.00   1st Qu.:32.25
 Median : 7.000   Median :17.00   Median :72.00   Median :43.00
 Mean   : 9.652   Mean   :17.99   Mean   :72.04   Mean   :44.52
 3rd Qu.:13.000   3rd Qu.:24.00   3rd Qu.:81.00   3rd Qu.:55.00
 Max.   :41.000   Max.   :52.00   Max.   :99.00   Max.   :96.00
 NA's   :7
  Pressure9am      Pressure3pm       Cloud9am        Cloud3pm
 Min.   : 996.5   Min.   : 996.8   Min.   :0.000   Min.   :0.000
 1st Qu.:1015.4   1st Qu.:1012.8   1st Qu.:1.000   1st Qu.:1.000
 Median :1020.1   Median :1017.4   Median :3.500   Median :4.000
 Mean   :1019.7   Mean   :1016.8   Mean   :3.891   Mean   :4.025
 3rd Qu.:1024.5   3rd Qu.:1021.5   3rd Qu.:7.000   3rd Qu.:7.000
 Max.   :1035.7   Max.   :1033.2   Max.   :8.000   Max.   :8.000
    Temp9am         Temp3pm       RainToday    RISK_MM
 Min.   : 0.100   Min.   : 5.10   No :300   Min.   : 0.000
 1st Qu.: 7.625   1st Qu.:14.15   Yes: 66   1st Qu.: 0.000
 Median :12.550   Median :18.55             Median : 0.000
 Mean   :12.358   Mean   :19.23             Mean   : 1.428
 3rd Qu.:17.000   3rd Qu.:24.00             3rd Qu.: 0.200
 Max.   :24.700   Max.   :34.50             Max.   :39.800
 RainTomorrow
 No :300
 Yes: 66

We will be using the rpart function to develop a decision tree. The rpart function looks like this:

rpart(formula, data, weights, subset, na.action = na.rpart, method,
      model = FALSE, x = FALSE, y = TRUE, parms, control, cost, ...)

The various parameters of the rpart function are described in the following table:

Parameter   Description
formula     This is the formula used for the prediction.
data        This is the data matrix.
weights     These are the optional weights to be applied.
subset      This is the optional subset of rows of data to be used.
na.action   This specifies the action to be taken when y, the target value, is missing.
method      This is the method used to interpret the data. It should be one of anova, poisson, class, or exp. If not specified, the algorithm decides based on the layout of the data.
...         These are additional parameters used to control the behavior of the algorithm.
Let's create a subset as follows:

> weather2 <- subset(weather, select=-c(RISK_MM))
> install.packages("rpart")
> library(rpart)
> model <- rpart(formula=RainTomorrow ~ ., data=weather2, method="class")
> summary(model)
Call:
rpart(formula = RainTomorrow ~ ., data = weather2, method = "class")
  n= 366

          CP nsplit rel error    xerror      xstd
1 0.19696970      0 1.0000000 1.0000000 0.1114418
2 0.09090909      1 0.8030303 0.9696970 0.1101055
3 0.01515152      2 0.7121212 1.0151515 0.1120956
4 0.01000000      7 0.6363636 0.9090909 0.1073129

Variable importance
  Humidity3pm WindGustSpeed      Sunshine  WindSpeed3pm       Temp3pm
           24            14            12             8             6
  Pressure3pm       MaxTemp       MinTemp   Pressure9am       Temp9am
            6             5             4             4             4
  Evaporation          Date   Humidity9am      Cloud3pm      Cloud9am
            3             3             2             2             1
     Rainfall
            1

Node number 1: 366 observations,   complexity param=0.1969697
  predicted class=No   expected loss=0.1803279  P(node) =1
    class counts:   300    66
   probabilities: 0.820 0.180
  left son=2 (339 obs) right son=3 (27 obs)
  Primary splits:
      Humidity3pm < 71.5    to the left,  improve=18.31013, (0 missing)
      Pressure3pm < 1011.9  to the right, improve=17.35280, (0 missing)
      Cloud3pm    < 6.5     to the left,  improve=16.14203, (0 missing)
      Sunshine    < 6.45    to the right, improve=15.36364, (3 missing)
      Pressure9am < 1016.35 to the right, improve=12.69048, (0 missing)
  Surrogate splits:
      Sunshine < 0.45 to the right, agree=0.945, adj=0.259, (0 split)
(many more)…

As you can tell, the model is complicated. The summary shows the progression of the model development, using more and more of the data to fine-tune the tree.
We will be using the rpart.plot package to display the decision tree in a readable manner as follows:

> library(rpart.plot)
> fancyRpartPlot(model, main="Rain Tomorrow", sub="Chapter 12")

This is the output of the fancyRpartPlot function. Now we can follow the logic of the decision tree easily. For example, if the humidity is over 72, we are predicting it will rain.

Regression

We can use a regression to predict our target value by producing a regression model from our predictor variables. We will be using the forest fire data from http://archive.ics.uci.edu. We will load the data and get the following summary:

> forestfires <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/forest-fires/forestfires.csv")
> summary(forestfires)
       X               Y           month      day         FFMC
 Min.   :1.000   Min.   :2.0   aug    :184   fri:85   Min.   :18.70
 1st Qu.:3.000   1st Qu.:4.0   sep    :172   mon:74   1st Qu.:90.20
 Median :4.000   Median :4.0   mar    : 54   sat:84   Median :91.60
 Mean   :4.669   Mean   :4.3   jul    : 32   sun:95   Mean   :90.64
 3rd Qu.:7.000   3rd Qu.:5.0   feb    : 20   thu:61   3rd Qu.:92.90
 Max.   :9.000   Max.   :9.0   jun    : 17   tue:64   Max.   :96.20
                               (Other): 38   wed:54
      DMC              DC             ISI              temp
 Min.   :  1.1   Min.   :  7.9   Min.   : 0.000   Min.   : 2.20
 1st Qu.: 68.6   1st Qu.:437.7   1st Qu.: 6.500   1st Qu.:15.50
 Median :108.3   Median :664.2   Median : 8.400   Median :19.30
 Mean   :110.9   Mean   :547.9   Mean   : 9.022   Mean   :18.89
 3rd Qu.:142.4   3rd Qu.:713.9   3rd Qu.:10.800   3rd Qu.:22.80
 Max.   :291.3   Max.   :860.6   Max.   :56.100   Max.   :33.30
       RH             wind           rain              area
 Min.   : 15.00   Min.   :0.400   Min.   :0.00000   Min.   :   0.00
 1st Qu.: 33.00   1st Qu.:2.700   1st Qu.:0.00000   1st Qu.:   0.00
 Median : 42.00   Median :4.000   Median :0.00000   Median :   0.52
 Mean   : 44.29   Mean   :4.018   Mean   :0.02166   Mean   :  12.85
 3rd Qu.: 53.00   3rd Qu.:4.900   3rd Qu.:0.00000   3rd Qu.:   6.57
 Max.   :100.00   Max.   :9.400   Max.   :6.40000   Max.   :1090.84

I will just use the month, temperature, wind, and rain data to come up with a model of the area (size) of the fires using the lm function. The lm function looks like this:

lm(formula, data, subset, weights, na.action,
   method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
   singular.ok = TRUE, contrasts = NULL, offset, ...)

The various parameters of the lm function are described in the following table:

Parameter   Description
formula     This is the formula to be used for the model
data        This is the dataset
subset      This is the subset of the dataset to be used
weights     These are the weights to apply to factors
...         These are additional parameters to be added to the function

Let's build the model as follows:

> model <- lm(formula = area ~ month + temp + wind + rain, data=forestfires)

Looking at the generated model, we see the following output:

> summary(model)

Call:
lm(formula = area ~ month + temp + wind + rain, data = forestfires)

Residuals:
    Min      1Q  Median      3Q     Max
 -33.20  -14.93   -9.10   -1.66 1063.59

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -17.390     24.532  -0.709   0.4787
monthaug    -10.342     22.761  -0.454   0.6498
monthdec     11.534     30.896   0.373   0.7091
monthfeb      2.607     25.796   0.101   0.9196
monthjan      5.988     50.493   0.119   0.9056
monthjul     -8.822     25.068  -0.352   0.7251
monthjun    -15.469     26.974  -0.573   0.5666
monthmar     -6.630     23.057  -0.288   0.7738
monthmay      6.603     50.053   0.132   0.8951
monthnov     -8.244     67.451  -0.122   0.9028
monthoct     -8.268     27.237  -0.304   0.7616
monthsep     -1.070     22.488  -0.048   0.9621
temp           1.569      0.673   2.332   0.0201 *
wind           1.581      1.711   0.924   0.3557
rain          -3.179      9.595  -0.331   0.7406
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 63.99 on 502 degrees of freedom
Multiple R-squared: 0.01692, Adjusted R-squared: -0.0105
F-statistic: 0.617 on 14 and 502 DF, p-value: 0.8518

Surprisingly, temperature is the only predictor with a significant effect on the size of the fires (p ≈ 0.02); none of the month terms makes a discernible difference, even though I would have guessed that whether or not a fire occurred in August or a similar month would matter. Note also that the model is using the month data as categorical. If we redevelop the model without temperature, the fit gets even worse (notice the multiple R-squared value drops to 0.006 from 0.017), as shown here:

> model <- lm(formula = area ~ month + wind + rain, data=forestfires)
> summary(model)

Call:
lm(formula = area ~ month + wind + rain, data = forestfires)

Residuals:
    Min      1Q  Median      3Q     Max
 -22.17  -14.39  -10.46   -3.87 1072.43

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.0126    22.8496   0.176    0.861
monthaug      4.3132    21.9724   0.196    0.844
monthdec      1.3259    30.7188   0.043    0.966
monthfeb     -1.6631    25.8441  -0.064    0.949
monthjan     -6.1034    50.4475  -0.121    0.904
monthjul      6.4648    24.3021   0.266    0.790
monthjun     -2.4944    26.5099  -0.094    0.925
monthmar     -4.8431    23.1458  -0.209    0.834
monthmay     10.5754    50.2441   0.210    0.833
monthnov     -8.7169    67.7479  -0.129    0.898
monthoct     -0.9917    27.1767  -0.036    0.971
monthsep     10.2110    22.0579   0.463    0.644
wind          1.0454     1.7026   0.614    0.540
rain         -1.8504     9.6207  -0.192    0.848

Residual standard error: 64.27 on 503 degrees of freedom
Multiple R-squared: 0.006269, Adjusted R-squared: -0.01941
F-statistic: 0.2441 on 13 and 503 DF, p-value: 0.9971

From the results, we can see an R-squared close to 0 and a p-value of almost 1; this model explains practically none of the variance in fire area, so these predictors fit poorly. If you plot the model, you will get a series of graphs. The plot of the residuals versus fitted values is the most revealing: the handful of very large fires stand out as extreme residuals, as shown in the following graph:

> plot(model)

Neural network

In a neural network, it is assumed that there is a complex relationship between the predictor variables and the target variable. The network allows us to express each of these relationships. For this model, we will use the liver disorder data from http://archive.ics.uci.edu. The data has a few hundred observations from patients with liver disorders. The variables are various measures of blood for each patient, as shown here:

> bupa <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/liver-disorders/bupa.data")
> colnames(bupa) <- c("mcv","alkphos","alamine","aspartate","glutamyl","drinks","selector")
> summary(bupa)
      mcv            alkphos          alamine
 Min.   : 65.00   Min.   : 23.00   Min.   :  4.00
 1st Qu.: 87.00   1st Qu.: 57.00   1st Qu.: 19.00
 Median : 90.00   Median : 67.00   Median : 26.00
 Mean   : 90.17   Mean   : 69.81   Mean   : 30.36
 3rd Qu.: 93.00   3rd Qu.: 80.00   3rd Qu.: 34.00
 Max.   :103.00   Max.   :138.00   Max.   :155.00
   aspartate        glutamyl         drinks
 Min.   : 5.00   Min.   :  5.00   Min.   : 0.000
 1st Qu.:19.00   1st Qu.: 15.00   1st Qu.: 0.500
 Median :23.00   Median : 24.50   Median : 3.000
 Mean   :24.64   Mean   : 38.31   Mean   : 3.465
 3rd Qu.:27.00   3rd Qu.: 46.25   3rd Qu.: 6.000
 Max.   :82.00   Max.   :297.00   Max.   :20.000
    selector
 Min.   :1.000
 1st Qu.:1.000
 Median :2.000
 Mean   :1.581
 3rd Qu.:2.000
 Max.   :2.000

We generate a neural network using the neuralnet function. The neuralnet function looks like this:

neuralnet(formula, data, hidden = 1, threshold = 0.01,
          stepmax = 1e+05, rep = 1, startweights = NULL,
          learningrate.limit = NULL,
          learningrate.factor = list(minus = 0.5, plus = 1.2),
          learningrate = NULL, lifesign = "none",
          lifesign.step = 1000, algorithm = "rprop+",
          err.fct = "sse", act.fct = "logistic",
          linear.output = TRUE, exclude = NULL,
          constant.weights = NULL, likelihood = FALSE)

The various parameters of the neuralnet function are described in the following table:

Parameter   Description
formula     This is the formula to converge.
data        This is the data matrix of predictor values.
hidden      This is the number of hidden neurons in each layer.
stepmax     This is the maximum number of steps in each repetition. Default is 1e+05.
rep         This is the number of repetitions.
Let's generate the neural network as follows:

> nn <- neuralnet(selector~mcv+alkphos+alamine+aspartate+glutamyl+drinks, data=bupa, linear.output=FALSE, hidden=2)

We can see how the model was developed via the result.matrix variable in the following output:

> nn$result.matrix
                                     1
error                 100.005904355153
reached.threshold       0.005904330743
steps                  43.000000000000
Intercept.to.1layhid1   0.880621509705
mcv.to.1layhid1        -0.496298308044
alkphos.to.1layhid1     2.294158313786
alamine.to.1layhid1     1.593035613921
aspartate.to.1layhid1  -0.407602506759
glutamyl.to.1layhid1   -0.257862634340
drinks.to.1layhid1     -0.421390527261
Intercept.to.1layhid2   0.806928998059
mcv.to.1layhid2        -0.531926150470
alkphos.to.1layhid2     0.554627946150
alamine.to.1layhid2     1.589755874579
aspartate.to.1layhid2  -0.182482440722
glutamyl.to.1layhid2    1.806513419058
drinks.to.1layhid2      0.215346602241
Intercept.to.selector   4.485455617018
1layhid.1.to.selector   3.328527160621
1layhid.2.to.selector   2.616395644587

The process took 43 steps to come up with the neural network, once the threshold was under 0.01 (0.005 in this case). You can see the relationships between the predictor values. Looking at the network developed, we can see the hidden layers of relationships among the predictor variables. For example, mcv sometimes combines at one ratio and at other times at another, depending on its value. Let's plot the neural network as follows:

> plot(nn)

Instance-based learning

R programming has a nearest neighbor algorithm (k-NN). The k-NN algorithm takes the predictor values and organizes them; a new observation is then applied to the organization developed, and the algorithm selects the result (prediction) that is most applicable based on the nearness of the predictor values in the new observation. The nearest neighbor function is knn.
The knn function call looks like this:

knn(train, test, cl, k = 1, l = 0, prob = FALSE, use.all = TRUE)

The various parameters of the knn function are described in the following table:

Parameter   Description
train       This is the training data.
test        This is the test data.
cl          This is the factor of true classifications.
k           This is the number of neighbors to consider.
l           This is the minimum vote for a decision.
prob        This is a Boolean flag to return the proportion of winning votes.
use.all     This is a Boolean variable for tie handling. TRUE means use all votes of max distance.

I am using the auto MPG dataset in this example of using knn. First, we load the dataset:

> data <- read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data", na.string="?")
> colnames(data) <- c("mpg","cylinders","displacement","horsepower","weight","acceleration","model.year","origin","car.name")
> summary(data)
      mpg          cylinders      displacement     horsepower
 Min.   : 9.00   Min.   :3.000   Min.   : 68.0   150    : 22
 1st Qu.:17.50   1st Qu.:4.000   1st Qu.:104.2   90     : 20
 Median :23.00   Median :4.000   Median :148.5   88     : 19
 Mean   :23.51   Mean   :5.455   Mean   :193.4   110    : 18
 3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:262.0   100    : 17
 Max.   :46.60   Max.   :8.000   Max.   :455.0   75     : 14
                                                 (Other):288
     weight      acceleration     model.year        origin
 Min.   :1613   Min.   : 8.00   Min.   :70.00   Min.   :1.000
 1st Qu.:2224   1st Qu.:13.82   1st Qu.:73.00   1st Qu.:1.000
 Median :2804   Median :15.50   Median :76.00   Median :1.000
 Mean   :2970   Mean   :15.57   Mean   :76.01   Mean   :1.573
 3rd Qu.:3608   3rd Qu.:17.18   3rd Qu.:79.00   3rd Qu.:2.000
 Max.   :5140   Max.   :24.80   Max.   :82.00   Max.   :3.000
          car.name
 ford pinto    :  6
 amc matador   :  5
 ford maverick :  5
 toyota corolla:  5
 amc gremlin   :  4
 amc hornet    :  4
 (Other)       :369

There are close to 400 observations in the dataset. We need to split the data into a training set and a test set; we will use 75 percent for training. We use the createDataPartition function in the caret package to select the training rows. Then, we create a test dataset and a training dataset using the partitions as follows:

> library(caret)
> training <- createDataPartition(data$mpg, p=0.75, list=FALSE)
> trainingData <- data[training,]
> testData <- data[-training,]
> model <- knn(train=trainingData, test=testData, cl=trainingData$mpg)
NAs introduced by coercion

The warning message means that some numbers in the dataset have a bad format; the bad numbers were automatically converted to NA values. The inclusion of the NA values then caused the function to fail, as NA values are not expected in this function call. First, there are some missing items in the dataset as loaded. We need to eliminate those NA values as follows:

> completedata <- data[complete.cases(data),]

After looking over the data several times, I guessed that the car name fields were being parsed as numerical data when there was a number in the name, such as Buick Skylark 320. I removed the car name column from the test, and we end up with the following valid results:

> drops <- c("car.name")
> completeData2 <- completedata[,!(names(completedata) %in% drops)]
> training <- createDataPartition(completeData2$mpg, p=0.75, list=FALSE)
> trainingData <- completeData2[training,]
> testData <- completeData2[-training,]
> model <- knn(train=trainingData, test=testData, cl=trainingData$mpg)

We can see the results of the model by plotting with the following command. However, the graph doesn't give us much information to work on.
> plot(model) We can use a different kknn function to compare our model with the test data. I like this version a little better as you can plainly specify the formula for the model. Let's use the kknn function as follows: > library(kknn) > model <- kknn(formula = formula(mpg~.), train = trainingData, test = testData, k = 3, distance = 1) > fit <- fitted(model) > plot(testData$mpg, fit) > abline(a=0, b=1, col=3) I added a simple slope to highlight how well the model fits the training data. It looks like as we progress to higher MPG values, our model has a higher degree of variance. I think that means we are missing predictor variables, especially for the later model, high MPG series of cars. That would make sense as government mandate and consumer demand for high efficiency vehicles changed the mpg for vehicles. Here is the graph generated by the previous code: Ensemble learning Ensemble learning is the process of using multiple learning methods to obtain better predictions. For example, we could use a regression and k-NN, combine the results, and end up with a better prediction. We could average the results of both or provide heavier weight towards one or another of the algorithms, whichever appears to be a better predictor. Support vector machines We covered support vector machines (SVM), but I will run through an example here. As a reminder, SVM is concerned with binary data. We will use the spam dataset from Hewlett Packard (part of the kernlab package). First, let's load the data as follows: > library(kernlab) > data("spam") > summary(spam)      make           address           all             num3d         Min.   :0.0000   Min.   : 0.000   Min.   :0.0000   Min.   : 0.00000 1st Qu.:0.0000   1st Qu.: 0.000   1st Qu.:0.0000   1st Qu.: 0.00000 Median :0.0000   Median : 0.000   Median :0.0000   Median : 0.00000 Mean   :0.1046   Mean   : 0.213   Mean   :0.2807   Mean   : 0.06542 3rd Qu.:0.0000   3rd Qu.: 0.000   3rd Qu.:0.4200   3rd Qu.: 0.00000 Max.   
:4.5400   Max.   :14.280   Max.   :5.1000   Max.   :42.81000 … There are 58 variables with close to 5000 observations, as shown here: > table(spam$type) nonspam   spam    2788   1813 Now, we break up the data into a training set and a test set as follows: > index <- 1:nrow(spam) > testindex <- sample(index, trunc(length(index)/3)) > testset <- spam[testindex,] > trainingset <- spam[-testindex,] Now, we can produce our SVM model using the svm function. The svm function looks like this: svm(formula, data = NULL, ..., subset, na.action = na.omit, scale = TRUE) The various parameters of the svm function are described in the following table: Parameter Description formula This is the model formula data This is the dataset subset This is the subset of the dataset to be used na.action This specifies the action to take with NA values scale This determines whether to scale the data Let's use the svm function to produce an SVM model as follows: > library(e1071) > model <- svm(type ~ ., data = trainingset, method = "C-classification", kernel = "radial", cost = 10, gamma = 0.1) > summary(model) Call: svm(formula = type ~ ., data = trainingset, method = "C-classification",    kernel = "radial", cost = 10, gamma = 0.1) Parameters:    SVM-Type: C-classification SVM-Kernel: radial        cost: 10      gamma: 0.1 Number of Support Vectors: 1555 ( 645 910 ) Number of Classes: 2 Levels: nonspam spam We can test the model against our test dataset and look at the results as follows: > pred <- predict(model, testset) > table(pred, testset$type) pred     nonspam spam nonspam     891 104 spam         38 500 Note that, at the time of writing, the e1071 package was not compatible with the current version of R; given its usefulness, I would expect the package to be updated to support its user base. So, using SVM, we have a 90 percent ((891+500) / (891+104+38+500)) accuracy rate of prediction. Bayesian learning With Bayesian learning, we have an initial premise in a model that is adjusted with new information.
We can use the MCMCregress method in the MCMCpack package to fit a Bayesian regression on training data and apply the model against test data. Let's load the MCMCpack package as follows: > install.packages("MCMCpack") > library(MCMCpack) We are going to use the Stanford transplant data available at http://lib.stat.cmu.edu/datasets/stanford. (The dataset on the site is part of the web page, so I copied it into a local CSV file.) The data shows the expected transplant success factor, the actual success factor, and the number of transplants over a time period, so there is a good progression over time as to the success of the program. We can read the dataset as follows: > transplants <- read.csv("transplant.csv") > summary(transplants)    expected         actual       transplants   Min.   : 0.057   Min.   : 0.000   Min.   : 1.00 1st Qu.: 0.722   1st Qu.: 0.500   1st Qu.: 9.00 Median : 1.654   Median : 2.000   Median : 18.00 Mean   : 2.379   Mean   : 2.382   Mean   : 27.83 3rd Qu.: 3.402   3rd Qu.: 3.000   3rd Qu.: 40.00 Max.   :12.131   Max.   :18.000   Max.   :152.00 We use Bayesian regression against the data using the MCMCregress function; note that the model is adjusted as we progress with new information. The MCMCregress function looks like this: MCMCregress(formula, data = NULL, burnin = 1000, mcmc = 10000,    thin = 1, verbose = 0, seed = NA, beta.start = NA,    b0 = 0, B0 = 0, c0 = 0.001, d0 = 0.001, sigma.mu = NA, sigma.var = NA,    marginal.likelihood = c("none", "Laplace", "Chib95"), ...)
The various parameters of the MCMCregress function are described in the following table: Parameter Description formula This is the model formula data This is the dataset to be used for the model … These are the additional parameters for the function Let's use Bayesian regression against the data as follows: > model <- MCMCregress(expected ~ actual + transplants, data=transplants) > summary(model) Iterations = 1001:11000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 10000 1. Empirical mean and standard deviation for each variable,    plus standard error of the mean:                Mean     SD Naive SE Time-series SE (Intercept) 0.00484 0.08394 0.0008394     0.0008388 actual     0.03413 0.03214 0.0003214     0.0003214 transplants 0.08238 0.00336 0.0000336     0.0000336 sigma2     0.44583 0.05698 0.0005698     0.0005857 2. Quantiles for each variable:                2.5%     25%     50%     75%   97.5% (Intercept) -0.15666 -0.05216 0.004786 0.06092 0.16939 actual     -0.02841 0.01257 0.034432 0.05541 0.09706 transplants 0.07574 0.08012 0.082393 0.08464 0.08890 sigma2       0.34777 0.40543 0.441132 0.48005 0.57228 The plot of the model shows the range of results, as shown in the following graph. Look at this in contrast to a simple regression, which produces a single result. > plot(model) Random forests Random forests is an algorithm that constructs a multitude of decision trees on samples of the data and aggregates their results into the final prediction. We can use the randomForest function in the randomForest package for this.
The randomForest function looks like this: randomForest(formula, data=NULL, ..., subset, na.action=na.fail) The various parameters of the randomForest function are described in the following table: Parameter Description formula This is the model formula data This is the dataset to be used subset This is the subset of the dataset to be used na.action This is the action to take with NA values For an example of random forests, we will use the spam data, as in the Support vector machines section. First, let's install and load the package as follows: > install.packages("randomForest") > library(randomForest) Now, we will generate the model with the following command (this may take a while): > fit <- randomForest(type ~ ., data=spam) Let's look at the results to see how it went: > fit Call: randomForest(formula = type ~ ., data = spam)                Type of random forest: classification                      Number of trees: 500 No. of variables tried at each split: 7        OOB estimate of error rate: 4.48% Confusion matrix:         nonspam spam class.error nonspam   2713   75 0.02690100 spam       131 1682 0.07225593 We can look at the relative importance of the data variables in the final model, as shown here: > head(importance(fit))        MeanDecreaseGini make           7.967392 address       12.654775 all           25.116662 num3d           1.729008 our           67.365754 over           17.579765 Ordering the importance values shows that a few factors are critical to the classification. For example, the presence of the exclamation character in the e-mail is shown as a dominant indicator of spam mail: charExclamation   256.584207 charDollar       200.3655348 remove           168.7962949 free              142.8084662 capitalAve       137.1152451 capitalLong       120.1520829 your             116.6134519 Unsupervised learning With unsupervised learning, we do not have a target variable.
We have a number of predictor variables that we look into to determine if there is a pattern. We will go over the following unsupervised learning techniques: Cluster analysis Density estimation Expectation-maximization algorithm Hidden Markov models Blind signal separation Cluster analysis Cluster analysis is the process of organizing data into groups (clusters) that are similar to each other. For our example, we will use the wheat seed data available from the UCI Machine Learning Repository, as shown here: > wheat <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt", sep="\t") Let's look at the raw data: > head(wheat) X15.26 X14.84 X0.871 X5.763 X3.312 X2.221 X5.22 X1 1 14.88 14.57 0.8811 5.554 3.333 1.018 4.956 1 2 14.29 14.09 0.9050 5.291 3.337 2.699 4.825 1 3 13.84 13.94 0.8955 5.324 3.379 2.259 4.805 1 4 16.14 14.99 0.9034 5.658 3.562 1.355 5.175 1 5 14.38 14.21 0.8951 5.386 3.312 2.462 4.956 1 6 14.69 14.49 0.8799 5.563 3.259 3.586 5.219 1 We need to apply column names so we can see the data better: > colnames(wheat) <- c("area", "perimeter", "compactness", "length", "width", "asymmetry", "groove", "undefined") > head(wheat)    area perimeter compactness length width asymmetry groove undefined 1 14.88     14.57     0.8811 5.554 3.333     1.018 4.956         1 2 14.29     14.09     0.9050 5.291 3.337     2.699 4.825         1 3 13.84     13.94     0.8955 5.324 3.379     2.259 4.805         1 4 16.14     14.99     0.9034 5.658 3.562     1.355 5.175         1 5 14.38     14.21     0.8951 5.386 3.312     2.462 4.956         1 6 14.69     14.49     0.8799 5.563 3.259     3.586 5.219         1 The last column is not defined in the data description, so I am removing it: > wheat <- subset(wheat, select = -c(undefined) ) > head(wheat)    area perimeter compactness length width asymmetry groove 1 14.88     14.57     0.8811 5.554 3.333     1.018 4.956 2 14.29     14.09     0.9050 5.291 3.337     2.699 4.825 3 13.84     13.94     0.8955 5.324 3.379  
   2.259 4.805 4 16.14     14.99     0.9034 5.658 3.562     1.355 5.175 5 14.38     14.21     0.8951 5.386 3.312     2.462 4.956 6 14.69    14.49     0.8799 5.563 3.259     3.586 5.219 Now, we can finally produce the cluster using the kmeans function. The kmeans function looks like this: kmeans(x, centers, iter.max = 10, nstart = 1,        algorithm = c("Hartigan-Wong", "Lloyd", "Forgy",                      "MacQueen"), trace=FALSE) The various parameters of the kmeans function are described in the following table: Parameter Description x This is the dataset centers This is the number of centers to coerce data towards … These are the additional parameters of the function Let's produce the cluster using the kmeans function: > fit <- kmeans(wheat, 5) Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1) Unfortunately, there are some rows with missing data, so let's fix this using the following command: > wheat <- wheat[complete.cases(wheat),] Let's look at the data to get some idea of the factors using the following command: > plot(wheat) If we try looking at five clusters, we end up with a fairly good set of clusters with an 85 percent fit, as shown here: > fit <- kmeans(wheat, 5) > fit K-means clustering with 5 clusters of sizes 29, 33, 56, 69, 15 Cluster means:      area perimeter compactness   length   width asymmetry   groove 1 16.45345 15.35310   0.8768000 5.882655 3.462517 3.913207 5.707655 2 18.95455 16.38879   0.8868000 6.247485 3.744697 2.723545 6.119455 3 14.10536 14.20143   0.8777750 5.480214 3.210554 2.368075 5.070000 4 11.94870 13.27000   0.8516652 5.229304 2.870101 4.910145 5.093333 5 19.58333 16.64600   0.8877267 6.315867 3.835067 5.081533 6.144400 Clustering vector: ... Within cluster sum of squares by cluster: [1] 48.36785 30.16164 121.63840 160.96148 25.81297 (between_SS / total_SS = 85.4 %) If we push to 10 clusters, the performance increases to 92 percent. 
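The jump from an 85 percent fit at five clusters to 92 percent at ten raises the obvious question of how many clusters to use. One simple way to explore this, assuming the cleaned wheat data frame built above (incomplete rows removed), is to compute the between/total sum-of-squares ratio reported by kmeans over a range of cluster counts and look for an elbow:

```r
# Compare cluster counts by the between_SS / total_SS ratio from kmeans.
# Assumes wheat has already had incomplete rows removed with complete.cases.
set.seed(42)                          # kmeans uses random starting centers
ks <- 2:10
fit.ratio <- sapply(ks, function(k) {
  fit <- kmeans(wheat, centers = k, nstart = 10)
  fit$betweenss / fit$totss           # the "(between_SS / total_SS)" figure
})
# Look for the "elbow" where adding clusters stops improving the fit much
plot(ks, fit.ratio, type = "b", xlab = "Number of clusters",
     ylab = "between_SS / total_SS")
```

The ratio always increases as clusters are added, so the useful signal is where the curve flattens out, not where it peaks.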
Density estimation Density estimation is used to provide an estimate of the probability density function of a random variable. For this example, we will use sunspot data from Vincent Arel-Bundock's Rdatasets site. It is not clear whether sunspot activity is truly random. Let's load our data as follows: > sunspots <- read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/datasets/sunspot.month.csv") > summary(sunspots)        X             time     sunspot.month   Min.   :   1   Min.   :1749   Min.   : 0.00 1st Qu.: 795   1st Qu.:1815   1st Qu.: 15.70 Median :1589   Median :1881   Median : 42.00 Mean   :1589   Mean   :1881   Mean   : 51.96 3rd Qu.:2383   3rd Qu.:1948   3rd Qu.: 76.40 Max.   :3177   Max.   :2014   Max.   :253.80 > head(sunspots) X     time sunspot.month 1 1 1749.000         58.0 2 2 1749.083         62.6 3 3 1749.167         70.0 4 4 1749.250         55.7 5 5 1749.333         85.0 6 6 1749.417         83.5 We will now estimate the density using the following command: > d <- density(sunspots$sunspot.month) > d Call: density.default(x = sunspots$sunspot.month) Data: sunspots$sunspot.month (3177 obs.); Bandwidth 'bw' = 7.916        x               y           Min.   :-23.75   Min.   :1.810e-07 1st Qu.: 51.58   1st Qu.:1.586e-04 Median :126.90   Median :1.635e-03 Mean   :126.90   Mean   :3.316e-03 3rd Qu.:202.22   3rd Qu.:5.714e-03 Max.   :277.55   Max.   :1.248e-02 A plot is very useful for this function, so let's generate one using the following command: > plot(d) It is interesting to see such a wide variation; maybe the data is pretty random after all. We can use the density to estimate additional periods as follows: > N <- 1000 > sunspots.new <- rnorm(N, sample(sunspots$sunspot.month, size=N, replace=TRUE)) > lines(density(sunspots.new), col="blue") It looks like our density estimate is very accurate. Expectation-maximization Expectation-maximization (EM) is an unsupervised clustering approach that iteratively adjusts parameter estimates toward the values that best explain the data.
When using EM, we have to have some preconception of the shape of the data/model that will be targeted. This example reiterates the example on the Wikipedia page, with comments; the mstep, em, and unmap functions used here come from the mclust package, so load it first with library(mclust). The example tries to model the iris species from the other data points. Let's load the data as shown here: > iris <- read.csv("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data") > colnames(iris) <- c("SepalLength","SepalWidth","PetalLength","PetalWidth","Species") > modelName = "EEE" Each observation has sepal length, width, petal length, width, and species, as shown here: > head(iris) SepalLength SepalWidth PetalLength PetalWidth     Species 1         5.1       3.5         1.4       0.2 Iris-setosa 2         4.9       3.0         1.4       0.2 Iris-setosa 3         4.7       3.2         1.3       0.2 Iris-setosa 4         4.6       3.1         1.5       0.2 Iris-setosa 5         5.0       3.6         1.4       0.2 Iris-setosa 6         5.4       3.9         1.7       0.4 Iris-setosa We are estimating the species from the other points, so let's separate the data as follows: > data = iris[,-5] > z = unmap(iris[,5]) Let's run the M-step of EM, given the data, the category indicators (z) for each data point, and our model type name: > msEst <- mstep(modelName, data, z) We use the parameters estimated in the M-step to produce our model, as shown here: > em(modelName, data, msEst$parameters) $z                [,1]         [,2]         [,3] [1,] 1.000000e+00 4.304299e-22 1.699870e-42 … [150,] 8.611281e-34 9.361398e-03 9.906386e-01 $parameters$pro [1] 0.3333333 0.3294048 0.3372619 $parameters$mean              [,1]     [,2]     [,3] SepalLength 5.006 5.941844 6.574697 SepalWidth 3.418 2.761270 2.980150 PetalLength 1.464 4.257977 5.538926 PetalWidth 0.244 1.319109 2.024576 $parameters$variance$d [1] 4 $parameters$variance$G [1] 3 $parameters$variance$sigma , , 1            SepalLength SepalWidth PetalLength PetalWidth SepalLength 0.26381739 0.09030470 0.16940062 0.03937152 
SepalWidth   0.09030470 0.11251902 0.05133876 0.03082280 PetalLength 0.16940062 0.05133876 0.18624355 0.04183377 PetalWidth   0.03937152 0.03082280 0.04183377 0.03990165 , , 2 , , 3 … (there was little difference in the three sigma slices) Covariance: $parameters$variance$Sigma            SepalLength SepalWidth PetalLength PetalWidth SepalLength 0.26381739 0.09030470 0.16940062 0.03937152 SepalWidth   0.09030470 0.11251902 0.05133876 0.03082280 PetalLength 0.16940062 0.05133876 0.18624355 0.04183377 PetalWidth   0.03937152 0.03082280 0.04183377 0.03990165 $parameters$variance$cholSigma             SepalLength SepalWidth PetalLength PetalWidth SepalLength -0.5136316 -0.1758161 -0.32980960 -0.07665323 SepalWidth   0.0000000 0.2856706 -0.02326832 0.06072001 PetalLength   0.0000000 0.0000000 -0.27735855 -0.06477412 PetalWidth   0.0000000 0.0000000 0.00000000 0.16168899 attr(,"info") iterations       error 4.000000e+00 1.525131e-06 There is quite a lot of output from the em function. The highlights for me were that the three sigma slices were essentially the same and that the error from the function was very small. So, I think we have a very good estimation of species using just the four data points. Hidden Markov models A hidden Markov model (HMM) assumes that the observed data has been produced by a Markov model whose states are hidden; the problem is to discover what that model is. I am using the Python example on Wikipedia for HMM. For an HMM, we need states (assumed to be hidden from the observer), symbols, a transition matrix between states, emission (output) probabilities, and starting probabilities for the states.
The Python information presented is as follows: states = ('Rainy', 'Sunny') observations = ('walk', 'shop', 'clean') start_probability = {'Rainy': 0.6, 'Sunny': 0.4} transition_probability = {    'Rainy' : {'Rainy': 0.7, 'Sunny': 0.3},    'Sunny' : {'Rainy': 0.4, 'Sunny': 0.6},    } emission_probability = {    'Rainy' : {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},    'Sunny' : {'walk': 0.6, 'shop': 0.3, 'clean': 0.1},    } We convert these for use in R with the initHMM function (from the HMM package) by using the following command: > hmm <- initHMM(c("Rainy","Sunny"), c('walk', 'shop', 'clean'), c(.6,.4), matrix(c(.7,.3,.4,.6),2), matrix(c(.1,.4,.5,.6,.3,.1),2)) > hmm $States [1] "Rainy" "Sunny" $Symbols [1] "walk" "shop" "clean" $startProbs Rainy Sunny 0.6   0.4 $transProbs        to from   Rainy Sunny Rainy   0.7   0.4 Sunny   0.3   0.6 $emissionProbs        symbols states walk shop clean Rainy 0.1 0.5   0.3 Sunny 0.4 0.6   0.1 The model is really a placeholder for all of the setup information needed for HMM. We can then use the model to predict based on observations, as follows: > future <- forward(hmm, c("walk","shop","clean")) > future        index states         1         2         3 Rainy -2.813411 -3.101093 -4.139551 Sunny -1.832581 -2.631089 -5.096193 The result is a matrix of log probabilities. For example, after observing walk, Sunny is more likely than Rainy (-1.83 versus -2.81). Blind signal separation Blind signal separation is the process of identifying sources of signals from a mixed signal. Principal component analysis is one method of doing this. An example is a cocktail party where you are trying to listen to one speaker. For this example, I am using the decathlon dataset in the FactoMineR package, as shown here: > library(FactoMineR) > data(decathlon) Let's look at the data to get some idea of what is available: > summary(decathlon) 100m           Long.jump     Shot.put       High.jump Min.   
:10.44   Min.   :6.61   Min.   :12.68   Min.   :1.850 1st Qu.:10.85   1st Qu.:7.03   1st Qu.:13.88   1st Qu.:1.920 Median :10.98   Median :7.30   Median :14.57   Median :1.950 Mean   :11.00   Mean   :7.26   Mean   :14.48   Mean   :1.977 3rd Qu.:11.14   3rd Qu.:7.48   3rd Qu.:14.97   3rd Qu.:2.040 Max.   :11.64   Max.   :7.96   Max.   :16.36   Max.   :2.150 400m           110m.hurdle       Discus       Pole.vault   Min.   :46.81   Min.   :13.97   Min.   :37.92   Min.   :4.200 1st Qu.:48.93   1st Qu.:14.21   1st Qu.:41.90   1st Qu.:4.500 Median :49.40   Median :14.48   Median :44.41   Median :4.800 Mean   :49.62   Mean   :14.61 Mean   :44.33   Mean   :4.762 3rd Qu.:50.30   3rd Qu.:14.98   3rd Qu.:46.07   3rd Qu.:4.920 Max.   :53.20   Max.   :15.67   Max.   :51.65   Max.   :5.400 Javeline       1500m           Rank           Points   Min.   :50.31   Min.   :262.1   Min.   : 1.00   Min.   :7313 1st Qu.:55.27   1st Qu.:271.0   1st Qu.: 6.00   1st Qu.:7802 Median :58.36   Median :278.1   Median :11.00   Median :8021 Mean   :58.32   Mean   :279.0   Mean   :12.12   Mean   :8005 3rd Qu.:60.89   3rd Qu.:285.1   3rd Qu.:18.00   3rd Qu.:8122 Max.   :70.52   Max.   :317.0   Max.   :28.00   Max.   
:8893    Competition Decastar:13 OlympicG:28 The output looks like performance data from a series of events at a track meet: > head(decathlon)        100m   Long.jump Shot.put High.jump 400m 110m.hurdle Discus SEBRLE 11.04     7.58   14.83     2.07 49.81       14.69 43.75 CLAY   10.76     7.40   14.26     1.86 49.37       14.05 50.72 KARPOV 11.02     7.30   14.77     2.04 48.37       14.09 48.95 BERNARD 11.02     7.23   14.25     1.92 48.93       14.99 40.87 YURKOV 11.34     7.09   15.19     2.10 50.42       15.31 46.26 WARNERS 11.11     7.60   14.31     1.98 48.68       14.23 41.10        Pole.vault Javeline 1500m Rank Points Competition SEBRLE       5.02   63.19 291.7   1   8217   Decastar CLAY         4.92   60.15 301.5   2   8122   Decastar KARPOV       4.92   50.31 300.2   3   8099   Decastar BERNARD       5.32   62.77 280.1   4   8067   Decastar YURKOV       4.72   63.44 276.4   5   8036   Decastar WARNERS       4.92   51.77 278.1   6   8030   Decastar Further, this is the performance of specific individuals in track meets. We run the PCA function by passing in the dataset to use, whether to scale the data or not, and the type of graphs: > res.pca = PCA(decathlon[,1:10], scale.unit=TRUE, ncp=5, graph=T) This produces two graphs: Individual factors map Variables factor map The individual factors map lays out the performance of the individuals. For example, we see Karpov, who is high in both dimensions, versus Bourguignon, who is performing badly (on the left in the following chart): The variables factor map shows the correlation of performance between events. For example, the 400 meters time is negatively correlated with long jump performance; since a lower 400 meters time is better, athletes who did well in one event typically did well in the other as well. Here is the variables factor map of our data: Questions Factual Which supervised learning technique(s) do you lean towards as your "go to" solution? Why are the density plots for the Bayesian results off-center? When, how, and why? 
How would you decide on the number of clusters to use? Find a good rule of thumb to decide the number of hidden layers in a neural net. Challenges Investigate other blind signal separation techniques, such as ICA. Use other methods, such as poisson, in the rpart function (especially if you have a naturally occurring dataset). Summary In this article, we looked into various methods of machine learning, including both supervised and unsupervised learning. With supervised learning, we have a target variable we are trying to estimate. With unsupervised learning, we only have a possible set of predictor variables and are looking for patterns. In supervised learning, we looked into using a number of methods, including decision trees, regression, neural networks, support vector machines, and Bayesian learning. In unsupervised learning, we used cluster analysis, density estimation, hidden Markov models, and blind signal separation. Resources for Article: Further resources on this subject: Machine Learning in Bioinformatics [article] Data visualization [article] Introduction to S4 Classes [article]
Securing your Twilio App

Packt
18 Dec 2014
11 min read
In this article by Tim Rogers, author of Twilio Best Practices, we'll see how to keep our Twilio account and applications (and ultimately our credit) secure by: Enabling two-factor authentication on our Twilio account Verifying that requests to our application are really coming from Twilio Setting up a circuit breaker for our account and any subaccounts (For more resources related to this topic, see here.) Enabling two-factor authentication Twilio offers two-factor authentication functionality that we can enable on our account. This will give you much greater security if someone tries to break into your account, following the something-you-know-and-something-you-have model. Apart from your password, Twilio will send you an SMS or call you when a login attempt is made, requiring you to enter a one-time password. Not only will this largely prevent malicious access to your account, but you'll also know that someone is attempting to access your account, and what's more, that they have your password. It's worth noting that, unsurprisingly, you can quite easily roll your own two-factor authentication functionality for your application using Twilio's call and SMS functionality. Check out https://www.twilio.com/docs/howto/two-factor-authentication for help with getting started. There are two steps to enable two-factor authentication: First, you'll need to add a phone number. You can do this from your Twilio dashboard by clicking on the dropdown in the top-right corner, and then clicking on the first entry with your name and e-mail address. Next, click on Add phone number, and then enter your phone number. Twilio will send you an SMS (or alternatively, call you if you'd like), thereby ensuring that you own the phone number that you've provided. Once you've added and verified your phone number on your user profile, you'll need to set up two-factor authentication on your account(s). To get to the right place, click on Account Settings in the dropdown.
If your login has been given access to another user's Twilio account (that is, an account you access from the Switch Accounts menu option), an administrator on that account will need to repeat this process. From this page, you'll be able to choose between two different two-factor options (you can also disable the feature here): Once Per Computer: This will effectively make the device you're using trusted, which means that subsequent logins for the next 30 days won't require you to use your phone. Every log-in: Every time you try to log in to Twilio, you'll have to provide a one-time password from your phone. Once you're done, click on the Save Settings button at the bottom of the page to set up two-factor authentication. There are a couple of other features you might want to check out on the Account Settings page. You can reset your API credentials if you accidentally reveal them, you can disable parts of Twilio's Request Inspector, which might potentially store sensitive information from your application, and you can require passwords in order to access recordings and media you send in MMS messages. Verifying that requests are from Twilio If parties other than Twilio are able to make requests to your application, they can potentially change and corrupt data or access sensitive information. Without authentication measures, if an attacker was able to guess the URLs of the endpoints on your application that Twilio hits with its webhooks, they could wreak havoc. For instance, they could spoof fake SMS messages so that they appear to come from users, or they could access the private phone numbers of users they should only be able to call through a public line you provide.
There are two routes you can take to prevent this, ensuring with a reasonable degree of certainty that a request genuinely comes from Twilio: Set up HTTP Basic Authentication Verify the signature of requests to ensure they're signed by Twilio HTTP Basic Authentication HTTP Basic Authentication simply allows you to require a username and password to access your web server's resources. If you're working with PHP, you'll want to set this up at the web server level. This is possible in most servers, including: Apache Nginx IIS If you're not using one of these, you can be virtually certain anyway that this option will be available to you; simply have a look at your server's documentation or search the web. Alternatively, you can implement Basic Authentication in your PHP code using code along these lines. We'll store the username and password in environment variables for security: <?php if (!isset($_SERVER['PHP_AUTH_USER'])) { // The user didn't even try to authenticate, so send 401 Unauthorized header('WWW-Authenticate: Basic realm="Twilio only!"'); header('HTTP/1.0 401 Unauthorized'); exit; } elseif ($_SERVER['PHP_AUTH_USER'] == $_ENV["TWILIO_USERNAME"] && $_SERVER['PHP_AUTH_PW'] == $_ENV["TWILIO_PASSWORD"]) { // The user authenticated successfully, so perform actions and output TwiML } else { // The user tried to authenticate, but didn't have the right credentials header('WWW-Authenticate: Basic realm="Twilio only!"'); header('HTTP/1.0 401 Unauthorized'); exit; } ?> Let's go through this bit by bit: If $_SERVER['PHP_AUTH_USER'] isn't set, then no username and password have been provided, so we respond with a 401 Unauthorized status and a WWW-Authenticate header; the header makes browsers display a login dialog (with the realm Twilio only!) so that the user can supply a username and password.
If the provided username and password do match what is stored in the TWILIO_USERNAME and TWILIO_PASSWORD environment variables respectively, then we perform the actions that the request requires and respond with TwiML. If a username and password were provided, but didn't match those we expected, then we send our 401 error and associated headers again. When we're providing a URL to Twilio (for instance, when initiating a call via the REST API, or setting it for incoming calls or SMS messages from our Dashboard), we can set the username and password in the URL itself, in this format: https://username:password@example.com/path/to/endpoint.php Verifying the signature Alternatively, instead of using a username and password, we can verify the cryptographic signature Twilio generates with its requests based upon our auth token, which is sent in the X-Twilio-Signature header. The scheme for doing this is somewhat complicated (you can find it in full at https://www.twilio.com/docs/security#validating-requests) but fortunately, Twilio provides validation functionality in their API libraries alongside code samples. For this method of verification to be available, you'll need to serve your application over HTTPS with Transport Layer Security (TLS) enabled. In fact, you should always do this with your Twilio application, as a good security practice. Following the SSLv3 vulnerability known as POODLE, discovered in October 2014, you'll want to double-check the security of any SSL configuration. See https://www.digitalocean.com/community/tutorials/how-to-protect-your-server-against-the-poodle-sslv3-vulnerability for details.
In PHP, we'd execute the following: <?php // Load auth token from the TWILIO_AUTH_TOKEN environment variable $authToken = $_ENV['TWILIO_AUTH_TOKEN']; // You'll need to make sure the Twilio library is included, either by requiring // it manually or loading Composer's autoload.php $validator = new Services_Twilio_RequestValidator($authToken); $url = $_SERVER["SCRIPT_URI"]; $vars = $_GET; $signature = $_SERVER["HTTP_X_TWILIO_SIGNATURE"]; if ($validator->validate($signature, $url, $vars)) { // This request definitely came from Twilio, so continue onwards... } else { // Watch out - this is not a real request from Twilio. header('HTTP/1.0 401 Unauthorized'); } ?> Here, we instantiate a Services_Twilio_RequestValidator object from the API library with our auth token before passing in the requested URL, the request body ($_GET in this case, but for a POST request, this would be $_POST), and the signature. We then call the validator's validate method with these pieces of data, allowing it to generate the signature itself, and comparing it against what we received in the X-Twilio-Signature header. If it matches, the request is genuine, but if not, the request is spoofed and is not from Twilio. Building a circuit breaker Using Twilio's Usage Triggers allows us to build a circuit breaker. In short, this will let us know when one of our subaccounts passes certain amounts of usage, which will help us detect possible abuse of our account, as well as mistakes in our code. It can even help detect abuse if we were running a multitenant app (that is, offering Twilio-based services to our users). When our specified usage threshold is surpassed, Twilio will send a webhook to a URL of our choice. From this URL, we can perform a range of actions, whether that is sending ourselves an e-mail or even suspending the account in question. Here, we'll just run through a quick example of suspending an account if it spends more than $50 in one day. 
We'll set up our Usage Trigger using the Twilio dashboard. To do this, first log in, and then switch to the appropriate subaccount you'd like to set up the trigger for by clicking on your name in the top-right corner. Next, click on Subaccounts, and then click on the desired account. Next, click on Usage in the navigation bar, then click on Triggers underneath, and then click on the Create Usage Trigger button. Fill out the fields as shown in the following image. First, you'll need to click on the Trigger a webhook link on the right-hand side of the page (where Send an email appears in the screenshot) to set up a webhook, replacing the URL with one that would be accessible from a domain of your own, of course. We might also want to automate this process of setting up a usage trigger using the REST API. For example, we might want to automatically suspend a subaccount we've just created for a customer if their usage goes beyond reasonable limits. In order to do so, we can do the same thing as previously, in PHP, like this: <?php $accountSid = $_ENV['TWILIO_ACCOUNT_SID']; $authToken = $_ENV['TWILIO_AUTH_TOKEN']; $subaccountSid = '<the SID of the subaccount>'; // You'll need to make sure the Twilio library is included, either by requiring // it manually or loading Composer's autoload.php $client = new Services_Twilio($accountSid, $authToken); $account = $client->accounts->get($subaccountSid); $account->usage_triggers->create( 'totalprice', '+50', 'https://twiliobestpractices.com/path/to/trigger_action.php', array(    'Recurring' => 'daily',    'TriggerBy' => 'price',    'FriendlyName' => 'Suspend if uses more than $50 per day' ) ); ?> Both options do exactly the same thing. They create a trigger that will send a webhook to the URL we provided when more than $50 is spent by our subaccount on any one day. It's completely up to us what we do from the endpoint that receives the webhook. Anything we can program is possible, from suspending the account to calling an engineer to look into it.
Here's some example code to go with the usage trigger we just set up that will automatically suspend the account once it goes over $50 of spend in a day. In this code sample, we also verify the authenticity of the request, as we saw previously, making sure that this is a genuine Usage Trigger webhook from Twilio:

<?php
// Before starting, you'll need to require the Twilio PHP library

// We'll load the SID of the subaccount that the trigger relates to and its
// auth token from environment variables, but in reality, you're likely to be
// loading them from a database of your users' details based on a passed-in ID,
// or something along those lines.
$subaccountSid = $_ENV['TWILIO_SUBACCOUNT_SID'];
$subaccountAuthToken = $_ENV['TWILIO_SUBACCOUNT_AUTH_TOKEN'];

$url = $_SERVER['SCRIPT_URI'];
$signature = $_SERVER['HTTP_X_TWILIO_SIGNATURE'];

$validator = new Services_Twilio_RequestValidator($subaccountAuthToken);

if ($validator->validate($signature, $url, $_POST)) {
    $client = new Services_Twilio($subaccountSid, $subaccountAuthToken);
    $client->account->update(array('Status' => 'suspended'));
} else {
    header('HTTP/1.0 401 Unauthorized');
}
?>

We've shown you examples of doing all of this using the PHP library, but you can do it with any of Twilio's libraries. You can also do this directly with the REST API using a tool such as Postman.

Summary

In this article, we saw three helpful tips to keep your Twilio account and application secure. First, we enabled Two-Factor Authentication to keep our account secure, even if someone finds out our password. Next, we learned how to make sure that the requests your app receives genuinely come from Twilio, by either using HTTP Basic Authentication or by verifying the cryptographic request signature. Finally, we set up alerts to inform us and take appropriate action when certain usage thresholds are reached using Twilio's Usage Triggers, helping protect your app from abuse and coding errors.
Resources for Article: Further resources on this subject: Make phone calls, send SMS from your website using Twilio [article] Execution of Test Plans [article] Configuring Your Operating System [article]

Packt
17 Dec 2014
29 min read

Enemy and Friendly AIs

In this article by Kyle D'Aoust, author of the book Unity Game Development Scripting, we will see how to create enemy and friendly AIs. (For more resources related to this topic, see here.)

Artificial Intelligence, also known as AI, is something that you'll see in every video game that you play. First-person shooters, real-time strategy, simulation, role-playing games, sports, puzzles, and so on all have various forms of AI, in both large and small systems. In this article, we'll go over several topics that involve creating AI, including techniques, actions, pathfinding, animations, and the AI manager. Then, finally, we'll put it all together to create an AI package of our own. In this article, you will learn:

- What a finite state machine is
- What a behavior tree is
- How to combine two AI techniques for complex AI
- How to deal with internal and external actions
- How to handle outside actions that affect the AI
- How to play character animations
- What pathfinding is
- How to use a waypoint system
- How to use Unity's NavMesh pathfinding system
- How to combine waypoints and NavMesh for complete pathfinding

AI techniques

There are two very common techniques used to create AI: the finite state machine and the behavior tree. Depending on the game that you are making and the complexity of the AI that you want, the technique you use will vary. In this article, we'll utilize both techniques in our AI script to maximize the potential of our AI.

Finite state machines

Finite state machines are one of the most common AI systems used throughout computer programming. To define the term, a finite state machine is a system that controls an object that has a limited number of states to exist in. Some real-world examples of finite state machines are traffic lights, televisions, and computers. Let's look at an example of a computer finite state machine to get a better understanding. A computer can be in various states.
To keep it simple, we will list three main states. These states are On, Off, and Active. The Off state is when the computer does not have power running it, the On state is when it does, and the Active state is when someone is using it. Let's take a further look into our computer finite state machine and explore the functions of each of its states:

State    Functions
On       Can be used by anyone; can turn off the computer
Off      Can turn on the computer; computer parts can be operated on
Active   Can access the Internet and various programs; can communicate with other devices; can turn off the computer

Each state has its own functions. Some of the functions of each state affect each other, while some do not. The functions that do affect each other are the functions that control what state the finite state machine is in. If you press the power button on your computer, it will turn on and change the state of your computer to On. While the state of your computer is On, you can use the Internet and possibly some other programs, or communicate with other devices such as a router or printer. Doing so will change the state of your computer to Active. When you are using the computer, you can also turn it off through its software or by pressing the power button, therefore changing the state to Off.

In video games, you can use a finite state machine to create AI with simple logic. You can also combine finite state machines with other types of AI systems to create a unique, and perhaps more complex, AI system. In this article, we will be using finite state machines as well as what is known as a behavior tree.

The behavior tree form of the AI system

A behavior tree is another kind of AI system that works in a very similar way to finite state machines. In fact, behavior trees are made up of finite state machines that work in a hierarchical system.
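To make the idea concrete before we build it in Unity, here is a minimal, purely illustrative Python sketch of such a two-level hierarchy, using the behavior names this article works with: an outer machine dispatches on the current behavior, and each behavior runs its own inner state logic. (The article's real implementation is the C# class built below; this sketch only demonstrates the structure.)

```python
def guard(ai):
    # Inner state machine: a suspicious guard searches, otherwise it patrols
    return "searching" if ai["suspicious"] else "patrolling"

def combat(ai):
    # Inner state machine: the choice of attack depends on the AI's preference
    return "ranged attack" if ai["fights_ranged"] else "melee attack"

BEHAVIORS = {
    "Idle": lambda ai: "idling",
    "Guard": guard,
    "Combat": combat,
    "Flee": lambda ai: "fleeing",
}

def run_behavior(ai):
    # Outer machine: dispatch on the current behavior; each behavior is
    # oblivious to the others, exactly as described above
    return BEHAVIORS[ai["behavior"]](ai)

ai = {"behavior": "Guard", "suspicious": False, "fights_ranged": True}
run_behavior(ai)            # "patrolling"
ai["behavior"] = "Combat"   # some action changed the behavior
run_behavior(ai)            # "ranged attack"
```

Swapping the inner function attached to a behavior changes what that behavior does without touching the outer machine, which is the control the hierarchy buys us.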
This system of hierarchy gives us great control over an individual, and perhaps many finite state systems within the behavior tree, allowing us to have a complex AI system. Taking a look back at the table explaining a finite state machine, a behavior tree works the same way. Instead of states, you have behaviors, and in place of the state functions, you have various finite state machines that determine what is done while the AI is in a specific behavior. Let's take a look at the behavior tree that we will be using in this article to create our AI: On the left-hand side, we have four behaviors: Idle, Guard, Combat, and Flee. To the right are the finite state machines that make up each of the behaviors. Idle and Flee only have one finite state machine, while Guard and Combat have multiple. Within the Combat behavior, two of its finite state machines even have a couple of their own finite state machines. As you can see, this hierarchy-based system of finite state machines allows us to use a basic form of logic to create an even more complex AI system. At the same time, we are also getting a lot of control by separating our AI into various behaviors. Each behavior will run its own silo of code, oblivious to the other behaviors. The only time we want a behavior to notice another behavior is either when an internal or external action occurs that forces the behavior of our AI to change. Combining the techniques In this article, we will take both of the AI techniques and combine them to create a great AI package. Our behavior tree will utilize finite state machines to run the individual behaviors, creating a unique and complex AI system. This AI package can be used for an enemy AI as well as a friendly AI. Let's start scripting! Now, let's begin scripting our AI! To start off, create a new C# file and name it AI_Agent. Upon opening it, delete any functions within the main class, leaving it empty. 
Just after the using statements, add this enum to the script:

public enum Behaviors {Idle, Guard, Combat, Flee};

This enum will be used throughout our script to determine what behavior our AI is in. Now let's add it to our class. It is time to declare our first variable:

public Behaviors aiBehaviors = Behaviors.Idle;

This variable, aiBehaviors, will be the deciding factor of what our AI does. Its main purpose is to have its value checked and changed when needed. Let's create our first function, which will utilize one of this variable's purposes:

void RunBehaviors()
{
    switch(aiBehaviors)
    {
        case Behaviors.Idle:
            RunIdleNode();
            break;
        case Behaviors.Guard:
            RunGuardNode();
            break;
        case Behaviors.Combat:
            RunCombatNode();
            break;
        case Behaviors.Flee:
            RunFleeNode();
            break;
    }
}

What this function does is check the value of our aiBehaviors variable in a switch statement. Depending on what the value is, it then calls a function to be used within that behavior. That function is actually going to be a finite state machine, which will decide what the behavior does at that point. Now, let's add another function to our script, which will allow us to change the behavior of our AI:

void ChangeBehavior(Behaviors newBehavior)
{
    aiBehaviors = newBehavior;
    RunBehaviors();
}

As you can see, this function works very similarly to the RunBehaviors function. When this function is called, it takes a new Behaviors value and assigns it to aiBehaviors. By doing this, we change the behavior of our AI. Now let's add the final step to running our behaviors; for now, these will be empty functions that act as placeholders for our internal and external actions. Add these functions to the script:

void RunIdleNode()
{
}

void RunGuardNode()
{
}

void RunCombatNode()
{
}

void RunFleeNode()
{
}

Each of these functions will run the finite state machines that make up the behaviors. These functions are essentially a middleman between the behavior and the behavior's action.
Using these functions is the beginning of having more control over our behaviors, something that can't be done with a simple finite state machine.

Internal and external actions

The actions of a finite state machine can be broken up into internal and external actions. Separating the actions into the two categories makes it easier to define what our AI does in any given situation. The separation is helpful in the planning phase of creating AI, but it can also help in the scripting part as well, since you will know what will and will not be called by other classes and GameObjects. Another way this separation is beneficial is that it eases the work of multiple programmers working on the same AI; each programmer can work on separate parts of the AI with fewer conflicts.

External actions

External actions are functions and activities that are activated when objects outside of the AI object act upon it. Some examples of external actions include being hit by a player, having a spell cast upon the player, falling from heights, losing the game by an external condition, communicating with external objects, and so on. The external actions that we will be using for our AI are:

- Changing its health
- Raising a stat
- Lowering a stat
- Killing the AI

Internal actions

Internal actions are the functions and activities that the AI runs within itself. Examples of these are patrolling a set path, attacking a player, running away from the player, using items, and so on. These are all actions that the AI will choose to do depending on a number of conditions. The internal actions that we will be using for our AI are:

- Patrolling a path
- Attacking a player
- Fleeing from a player
- Searching for a player

Scripting the actions

It's time to add some internal and external actions to the script.
First, be sure to add the using statement to the top of your script with the other using statements:

using System.Collections.Generic;

Now, let's add some variables that will allow us to use the actions:

public bool isSuspicious = false;
public bool isInRange = false;
public bool FightsRanged = false;
public List<KeyValuePair<string, int>> Stats = new List<KeyValuePair<string, int>>();
public GameObject Projectile;

The first three of our new variables are conditions to be used in finite state machines to determine what function should be called. Next, we have a list of KeyValuePair entries, which will hold the stats of our AI GameObject. The last variable is a GameObject, which is what we will use as a projectile for ranged attacks. Remember the empty middleman functions that we previously created? Now, with these new variables, we will add some code to each of them. Add this code so that the empty functions are now filled:

void RunIdleNode()
{
    Idle();
}

void RunGuardNode()
{
    Guard();
}

void RunCombatNode()
{
    if(FightsRanged)
        RangedAttack();
    else
        MeleeAttack();
}

void RunFleeNode()
{
    Flee();
}

Two of the three boolean variables we just created are being used as conditionals to call different functions, effectively creating finite state machines. Next, we will add the rest of our actions; these are what is being called by the middleman functions. Some of these functions will be empty placeholders, but will be filled later on in the article:

void Idle()
{
}

void Guard()
{
    if(isSuspicious)
    {
        SearchForTarget();
    }
    else
    {
        Patrol();
    }
}

void Combat()
{
    if(isInRange)
    {
        if(FightsRanged)
        {
            RangedAttack();
        }
        else
        {
            MeleeAttack();
        }
    }
    else
    {
        SearchForTarget();
    }
}

void Flee()
{
}

void SearchForTarget()
{
}

void Patrol()
{
}

void RangedAttack()
{
    GameObject newProjectile;
    newProjectile = Instantiate(Projectile, transform.position, Quaternion.identity) as GameObject;
}

void MeleeAttack()
{
}

In the Guard function, we check to see whether the AI notices the player or not.
If it does, then it will proceed to search for the player; if not, then it will continue to patrol along its path. In the Combat function, we first check whether the player is within attacking range; if not, the AI searches again. If the player is within attacking range, we check whether the AI prefers attacking up close or from far away. For ranged attacks, we first create a new, temporary GameObject variable. Then, we set it to an instantiated clone of our Projectile GameObject. From here, the projectile will run its own scripts to determine what it does. This is how we allow our AI to attack the player from a distance. To finish off our actions, we have two more functions to add. The first one will change the health of the AI:

void ChangeHealth(int Amount)
{
    if(Amount < 0)
    {
        if(!isSuspicious)
        {
            isSuspicious = true;
            ChangeBehavior(Behaviors.Guard);
        }
    }
    for(int i = 0; i < Stats.Count; i++)
    {
        if(Stats[i].Key == "Health")
        {
            int tempValue = Stats[i].Value;
            Stats[i] = new KeyValuePair<string, int>(Stats[i].Key, tempValue += Amount);
            if(Stats[i].Value <= 0)
            {
                Destroy(gameObject);
            }
            else if(Stats[i].Value < 25)
            {
                isSuspicious = false;
                ChangeBehavior(Behaviors.Flee);
            }
            break;
        }
    }
}

This function takes an int parameter, which is the amount by which we want to change the health of the AI. The first thing we do is check whether the amount is negative; if it is, we make our AI suspicious and change the behavior accordingly. Next, we search for the health stat in our list and set its value to a new value affected by the Amount parameter. We then check whether the AI's health is at or below zero, in which case we kill it; if not, we also check whether its health is below 25. If the health is that low, we make our AI flee from the player. To finish off our actions, we have one last function to add. It will allow us to affect a specific stat of the AI.
These modifications will either add to or subtract from a stat. The modifications can be permanent or restored at any time. In the following instance, the modifications are permanent:

void ModifyStat(string Stat, int Amount)
{
    for(int i = 0; i < Stats.Count; i++)
    {
        if(Stats[i].Key == Stat)
        {
            int tempValue = Stats[i].Value;
            Stats[i] = new KeyValuePair<string, int>(Stats[i].Key, tempValue += Amount);
            break;
        }
    }
    if(Amount < 0)
    {
        if(!isSuspicious)
        {
            isSuspicious = true;
            ChangeBehavior(Behaviors.Guard);
        }
    }
}

This function takes a string and an integer. The string is used to search for the specific stat that we want to affect, and the integer is how much we want to affect that stat by. It works in a very similar way to the ChangeHealth function, except that we first search for a specific stat. We also check whether the amount is negative. This time, if it is negative, we change our AI's behavior to Guard. This seems an appropriate response for the AI after being hit by something that negated one of its stats!

Pathfinding

Pathfinding is how the AI will maneuver around the level. For our AI package, we will be using two different kinds of pathfinding: NavMesh and waypoints. The waypoint system is a common approach to creating paths for AI to move around the game level. To allow our AI to move through our level in an intelligent manner, we will also use Unity's NavMesh component.

Creating paths using the waypoint system

Using waypoints to create paths is a common practice in game design, and it's simple too. To sum it up, you place objects or set locations around the game world; these are your waypoints. In the code, you will place all of the waypoints that you created in a container of some kind, such as a list or an array. Then, starting at the first waypoint, you tell the AI to move to the next waypoint.
Once that waypoint has been reached, you send the AI off to another one, ultimately creating a system that iterates through all of the waypoints, allowing the AI to move around the game world through the set paths. Although using the waypoint system will grant our AI movement in the world, at this point, it doesn't know how to avoid obstacles that it may come across. That is when you need to implement some sort of mesh navigation system so that the AI won't get stuck anywhere. Unity's NavMesh system The next step in creating AI pathfinding is to create a way for our AI to navigate through the game world intelligently, meaning that it does not get stuck anywhere. In just about every game out there that has a 3D-based AI, the world it inhabits has all sorts of obstacles. These obstacles could be plants, stairs, ramps, boxes, holes, and so on. To get our AI to avoid these obstacles, we will use Unity's NavMesh system, which is built into Unity itself. Setting up the environment Before we can start creating our pathfinding system, we need to create a level for our AI to move around in. To do this, I am just using Unity primitive models such as cubes and capsules. For the floor, create a cube, stretch it out, and squish it to make a rectangle. From there, clone it several times so that you have a large floor made up of cubes. Next, delete a bunch of the cubes and move some others around. This will create holes in our floor, which will be used and tested when we implement the NavMesh system. To make the floor easy to see, I've created a material in green and assigned it to the floor cubes. After this, create a few more cubes, make one really long and one shorter than the previous one but thicker, and the last one will be used as a ramp. I've created an intersection of the really long cube and the thick cube. Then, place the ramp towards the end of the thick cube, giving access to the top of the cubes. 
Our final step in creating our test environment is to add a few waypoints for our AI. For testing purposes, create five waypoints in this manner. Place one in each corner of the level and one in the middle. For the actual waypoints, use the capsule primitive. For each waypoint, add a rigid body component. Name the waypoints as Waypoint1, Waypoint2, Waypoint3, and so on. The name is not all that important for our code; it just makes it easier to distinguish between waypoints in the inspector. Here's what I made for my level:  Creating the NavMesh Now, we will create the navigation mesh for our scene. The first thing we will do is select all of the floor cubes. In the menu tab in Unity, click on the Window option, and then click on the Navigation option at the bottom of the dropdown; this will open up the Navigation window. This is what you should be seeing right now:  By default, the OffMeshLink Generation option is not checked; be sure to check it. What this does is create links at the edges of the mesh allowing it to communicate with any other OffMeshLink nearby, creating a singular mesh. This is a handy tool since game levels typically use more than one mesh as a floor. The Scene filter will just show specific objects within the hierarchy view list. Selecting all the objects will show all of your GameObjects. Selecting mesh renderers will only show GameObjects that have the mesh renderer component. Then, finally, if you select terrains, only terrains will be shown in the Hierarchy view list. The Navigation Layer dropdown will allow you to set the area as either walkable, not walkable, or jump accessible. Walkable areas typically refer to floors, ramps, and so on. Non-walkable areas refer to walls, rocks, and other various obstacles. Next, click on the Bake tab next to the Object tab. You should see information that looks like this: For this article, I am leaving all the values at their defaults. 
The Radius property is used to determine how close to the walls the navigation mesh will exist. Height determines how much vertical space is needed for the AI agent to be able to walk on the navigation mesh. Max Slope is the maximum angle that the AI is allowed to travel on for ramps, hills, and so on. The Step Height property is used to determine how high the AI can step up onto surfaces higher than the ground level. For Generated Off Mesh Links, the properties are very similar to each other. The Drop Height value is the maximum distance the AI can intelligently drop down to another part of the navigation mesh. Jump Distance is the counterpart of Drop Height; it determines how far the AI can jump across a gap to another part of the navigation mesh. The Advanced options are to be used when you have a better understanding of the NavMesh component and want a little more out of it. Here, you can further tweak the accuracy of the NavMesh as well as create a Height Mesh to coincide with the navigation mesh.

Now that you know all the basics of the Unity NavMesh, let's go ahead and create our navigation mesh. At the bottom-right corner of the Navigation tab in the Inspector window, you should see two buttons: one that says Clear and the other that says Bake. Click on the Bake button now to create your new navigation mesh.

Select the ramp and the thick cube that we created earlier. In the Navigation window, make sure that the OffMeshLink Generation option is not checked, and that Navigation Layer is set to Default. Once the ramp and the thick cube are set up, reselect the floor cubes so that you have the floors, ramp, and thick wall selected. Bake the navigation mesh again to create a new one.

This is what my scene looks like now with the navigation mesh: You should be able to see the newly generated navigation mesh overlaying the underlying mesh. This is what was created using the default Bake properties.
Changing the Bake properties will give you different results, which will come down to what kind of navigation mesh you want the AI to use. Now that we have a navigation mesh, let's create the code for our AI to utilize. First, we will code the waypoint system, and then we will code what is needed for the NavMesh system.

Adding our variables

To start our navigation system, we will need to add a few variables first. Place these with the rest of our variables:

public Transform[] Waypoints;
public int curWaypoint = 0;
bool ReversePath = false;
NavMeshAgent navAgent;
Vector3 Destination;
float Distance;

The first variable is an array of Transforms; this is what we will use to hold our waypoints. Next, we have an integer that is used to iterate through our Transform array. We have a bool variable, which will decide how we should navigate through the waypoints. The next three variables are more oriented towards the navigation mesh that we created earlier. The NavMeshAgent object is what we will reference when we want to interact with the navigation mesh. Destination is the location that we want the AI to move towards. Distance is what we will use to check how far away we are from that location.

Scripting the navigation functions

Previously, we created many empty functions; some of these are dependent on pathfinding. Let's start with the Flee function. Add this code to replace the empty function:

void Flee()
{
    for(int fleePoint = 0; fleePoint < Waypoints.Length; fleePoint++)
    {
        Distance = Vector3.Distance(gameObject.transform.position, Waypoints[fleePoint].position);
        if(Distance > 10.00f)
        {
            Destination = Waypoints[fleePoint].position;
            navAgent.SetDestination(Destination);
            break;
        }
        else if(Distance < 2.00f)
        {
            ChangeBehavior(Behaviors.Idle);
        }
    }
}

What this for loop does is look for a waypoint that is more than 10 units away. When it finds one, we set the Destination value to that waypoint and move the AI accordingly.
If the distance from the checked waypoint is less than 2, we change the behavior to Idle. The next function that we will adjust is the SearchForTarget function. Add the following code to it, replacing its previous emptiness:

void SearchForTarget()
{
    Destination = GameObject.FindGameObjectWithTag("Player").transform.position;
    navAgent.SetDestination(Destination);
    Distance = Vector3.Distance(gameObject.transform.position, Destination);
    if(Distance < 10)
        ChangeBehavior(Behaviors.Combat);
}

This function will now be able to search for a target, the Player target to be more specific. We set Destination to the player's current position, and then move the AI towards the player. When Distance is less than 10, we set the AI behavior to Combat. Now that our AI can run from the player as well as chase them down, let's utilize the waypoints and create paths for the AI. Add this code to the empty Patrol function:

void Patrol()
{
    Distance = Vector3.Distance(gameObject.transform.position, Waypoints[curWaypoint].position);
    if(Distance > 2.00f)
    {
        Destination = Waypoints[curWaypoint].position;
        navAgent.SetDestination(Destination);
    }
    else
    {
        if(ReversePath)
        {
            if(curWaypoint <= 0)
            {
                ReversePath = false;
            }
            else
            {
                curWaypoint--;
                Destination = Waypoints[curWaypoint].position;
            }
        }
        else
        {
            if(curWaypoint >= Waypoints.Length - 1)
            {
                ReversePath = true;
            }
            else
            {
                curWaypoint++;
                Destination = Waypoints[curWaypoint].position;
            }
        }
    }
}

What Patrol will now do is check the Distance variable. If the AI is far from the current waypoint, we set that waypoint as the new destination of our AI. If the current waypoint is close to the AI, we check the ReversePath boolean. When ReversePath is true, we tell the AI to go to the previous waypoint, walking the path in reverse order. When ReversePath is false, the AI will go on to the next waypoint in the list of waypoints.
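The reverse-path bookkeeping is easy to get wrong at the ends of the list, so it is worth checking in isolation. Here is a short Python sketch of the same ping-pong iteration (a hypothetical helper for illustration, not part of the Unity script); note how, like Patrol, it pauses for one step at each end while the direction flips:

```python
def next_waypoint(cur, reverse, count):
    # Mirrors Patrol's logic: walk indices 0..count-1 forwards,
    # then back down again, repeating forever
    if reverse:
        if cur <= 0:
            return cur, False      # reached the start; turn around
        return cur - 1, True
    if cur >= count - 1:
        return cur, True           # reached the end; turn around
    return cur + 1, False

cur, rev = 0, False
visited = []
for _ in range(8):
    cur, rev = next_waypoint(cur, rev, 5)
    visited.append(cur)
# visited == [1, 2, 3, 4, 4, 3, 2, 1]
```

The repeated 4 in the output corresponds to the frame in which Patrol only flips ReversePath without advancing, exactly as the C# code above behaves.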
With all of this completed, you now have an AI with pathfinding abilities. The AI can also patrol a path set by waypoints and reverse the path when the end has been reached. We have also added abilities for the AI to search for the player as well as flee from the player. Character animations Animations are what bring the characters to life visually in the game. From basic animations to super realistic movements, all the animations are important and really represent what scripters do to the player. Before we add animations to our AI, we first need to get a model mesh for it! Importing the model mesh For this article, I am using a model mesh that I got from the Unity Asset Store. To use the same model mesh that I am using, go to the Unity Asset Store and search for Skeletons Pack. It is a package of four skeleton model meshes that are fully textured, propped, and animated. The asset itself is free and great to use. When you import the package into Unity, it will come with all four models as well as their textures, and an example scene named ShowCase. Open that scene and you should see the four skeletons. If you run the scene, you will see all the skeletons playing their idle animations. Choose the skeleton you want to use for your AI; I chose skeletonDark for mine. Click on the drop-down list of your skeleton in the Hierarchy window, and then on the Bip01 drop-down list. Then, select the magicParticle object. For our AI, we will not need it, so delete it from the Hierarchy window. Create a new prefab in the Project window and name it Skeleton. Now select the skeleton that you want to use from the Hierarchy window and drag it onto the newly created prefab. This will now be the model that you will use for this article. In your AI test scene, drag and drop Skeleton Prefab onto the scene. I have placed mine towards the center of the level, near the waypoint in the middle. 
In the Inspector window, you will be able to see the Animation component full of animations for the model. Now, we will need to add a few components to our skeleton. Go to the Components menu on the top of the Unity window, select Navigation, and then select NavMesh Agent. Doing this will allow the skeleton to utilize the NavMesh we created earlier. Next, go into the Components menu again and click on Capsule Collider as well as Rigidbody. Your Inspector window should now look like this after adding the components: Your model now has all the necessary components needed to work with our AI script. Scripting the animations To script our animations, we will take a simple approach to it. There won't be a lot of code to deal with, but we will spread it out in various areas of our script where we need to play the animations. In the Idle function, add this line of code: animation.Play("idle"); This simple line of code will play the idle animation. We use animation to access the model's animation component, and then use the Play function of that component to play the animation. The Play function can take the name of the animation to call the correct animation to be played; for this one, we call the idle animation. In the SearchForTarget function, add this line of code to the script: animation.Play("run"); We access the same function of the animation component and call the run animation to play here. Add the same line of code to the Patrol function as well, since we will want to use that animation for that function too. In the RangedAttack and MeleeAttack functions, add this code: animation.Play("attack"); Here, we call the attack animation. If we had a separate animation for ranged attacks, we would use that instead, but since we don't, we will utilize the same animation for both attack types. With this, we finished coding the animations into our AI. It will now play those animations when they are called during gameplay. 
Putting it all together

To wrap up our AI package, we will now finish up the script and add it to the skeleton.

Final coding touches

At the beginning of our AI script, we created some variables that we have yet to properly assign. We will do that in the Start function. We will also add the Update function to run our AI code. Add these functions to the bottom of the class:

void Start()
{
    navAgent = GetComponent<NavMeshAgent>();

    Stats.Add(new KeyValuePair<string, int>("Health", 100));
    Stats.Add(new KeyValuePair<string, int>("Speed", 10));
    Stats.Add(new KeyValuePair<string, int>("Damage", 25));
    Stats.Add(new KeyValuePair<string, int>("Agility", 25));
    Stats.Add(new KeyValuePair<string, int>("Accuracy", 60));
}

void Update()
{
    RunBehaviors();
}

In the Start function, we first assign the navAgent variable by getting the NavMeshAgent component from the GameObject. Next, we add new KeyValuePair entries to the Stats list, which is now filled with a few stats that we created. The Update function calls the RunBehaviors function. This is what will keep the AI running; it will run the correct behavior as long as the AI is active.

Filling out the inspector

To complete the AI package, we will need to add the script to the skeleton, so drag the script onto the skeleton in the Hierarchy window. In the Size property of the Waypoints array, type the number 5 and open up the drop-down list. Starting with Element 0, drag each of the waypoints into the empty slots. For the projectile, create a sphere GameObject and make it a prefab. Now, drag it onto the empty slot next to Projectile. Finally, set the AI behavior to Guard. This will make it so that when you start the scene, your AI will be patrolling. The Inspector window of the skeleton should look something like this:

Your AI is now ready for gameplay! To make sure everything works, we will need to do some playtesting.

Playtesting

A great way to playtest the AI is to play the scene in every behavior.
Start off with Guard, then run it in Idle, Combat, and Flee. For different outputs, try adjusting some of the variables in the NavMesh Agent component, such as Speed, Angular Speed, and Stopping Distance. Try mixing your waypoints around so the path is different.

Summary

In this article, you learned how to create an AI package. We explored a couple of techniques to handle AI, such as finite state machines and behavior trees. Then, we dived into AI actions, both internal and external. From there, we figured out how to implement pathfinding with both a waypoint system and Unity's NavMesh system. Finally, we topped the AI package off with animations and put everything together, creating our finalized AI.

Resources for Article:

Further resources on this subject:
Getting Started – An Introduction to GML [article]
Animations in Cocos2d-x [article]
Components in Unity [article]
Packt
17 Dec 2014
8 min read

The Arduino Mobile Robot

In this article by Marco Schwartz and Stefan Buttigieg, the authors of Arduino Android Blueprints, we are going to use most of the concepts we have learned to control a mobile robot via an Android app. The robot will have two motors that we can control, and also an ultrasonic sensor at the front so that it can detect obstacles. The robot will also have a BLE chip so that it can receive commands from the Android app. The following will be the major takeaways:

Building a mobile robot based on the Arduino platform
Connecting a BLE module to the Arduino robot

(For more resources related to this topic, see here.)

Configuring the hardware

We are first going to assemble the robot itself, and then see how to connect the Bluetooth module and the ultrasonic sensor. To give you an idea of what you should end up with, the following is a front-view image of the robot when fully assembled:

The following image shows the back of the robot when fully assembled:

The first step is to assemble the robot chassis. To do so, you can watch the DFRobot assembly guide at https://www.youtube.com/watch?v=tKakeyL_8Fg. Then, you need to attach the different Arduino boards and shields to the robot. Use the spacers found in the robot chassis kit to mount the Arduino Uno board first. Then put the Arduino motor shield on top of that. At this point, use the screw header terminals to connect the two DC motors to the motor shield. This is how it should look at this point:

Finally, mount the prototyping shield on top of the motor shield. We are now going to connect the BLE module and the ultrasonic sensor to the Arduino prototyping shield. The following is a schematic diagram showing the connections between the Arduino Uno board (done via the prototyping shield in our case) and the components:

Now perform the following steps. First, we are going to connect the BLE module: place the module on the prototyping shield.
Connect the power supply of the module as follows: GND goes to the prototyping shield's GND pin, and VIN goes to the prototyping shield's +5V. After that, you need to connect the different wires responsible for the SPI interface: SCK to Arduino pin 13, MISO to Arduino pin 12, and MOSI to Arduino pin 11. Then connect the REQ pin to Arduino pin 10. Finally, connect the RDY pin to Arduino pin 2 and the RST pin to Arduino pin 9.

For the URM37 module, connect the VCC pin of the module to Arduino +5V, GND to GND, and the PWM pin to the Arduino A3 pin. To review the pin order on the URM37 module, you can check the official DFRobot documentation at http://www.dfrobot.com/wiki/index.php?title=URM37_V3.2_Ultrasonic_Sensor_(SKU:SEN0001).

The following is a close-up image of the prototyping shield with the BLE module connected:

Finally, connect the 7.4 V battery to the Arduino Uno board power jack. The battery is simply placed below the Arduino Uno board.

Testing the robot

We are now going to write a sketch to test the different functionalities of the robot, first without using Bluetooth. As the sketch is quite long, we will look at the code piece by piece. Before you proceed, make sure that the battery is always plugged into the robot.
Now perform the following steps. The sketch starts by including the aREST library that we will use to control the robot via serial commands:

#include <aREST.h>

Now we declare which pins the motors are connected to:

int speed_motor1 = 6;
int speed_motor2 = 5;
int direction_motor1 = 7;
int direction_motor2 = 4;

We also declare which pin the ultrasonic sensor is connected to:

int distance_sensor = A3;

Then, we create an instance of the aREST library:

aREST rest = aREST();

To store the distance data measured by the ultrasonic sensor, we declare a distance variable:

int distance;

In the setup() function of the sketch, we first initialize serial communications that we will use to communicate with the robot for this test:

Serial.begin(115200);

We also expose the distance variable to the REST API, so we can access it easily:

rest.variable("distance",&distance);

To control the robot, we are going to declare a whole set of functions that will perform the basic operations: going forward, going backward, turning on itself (left or right), and stopping. We will see the details of these functions in a moment; for now, we just need to expose them to the API:

rest.function("forward",forward);
rest.function("backward",backward);
rest.function("left",left);
rest.function("right",right);
rest.function("stop",stop);

We also give the robot an ID and a name:

rest.set_id("001");
rest.set_name("mobile_robot");

In the loop() function of the sketch, we first measure the distance from the sensor:

distance = measure_distance(distance_sensor);

We then handle the requests using the aREST library:

rest.handle(Serial);

Now, we will look at the functions for controlling the motors.
They are all based on a function to control a single motor, where we need to set the motor pins, the speed, and the direction of the motor:

void send_motor_command(int speed_pin, int direction_pin, int pwm, boolean dir)
{
  analogWrite(speed_pin, pwm); // Set PWM control, 0 for stop, and 255 for maximum speed
  digitalWrite(direction_pin, dir); // dir sets the rotation direction of the motor (true or false means forward or reverse)
}

Based on this function, we can now define the different functions to move the robot, such as forward:

int forward(String command) {
  send_motor_command(speed_motor1,direction_motor1,100,1);
  send_motor_command(speed_motor2,direction_motor2,100,1);
  return 1;
}

We also define a backward function, simply inverting the direction of both motors:

int backward(String command) {
  send_motor_command(speed_motor1,direction_motor1,100,0);
  send_motor_command(speed_motor2,direction_motor2,100,0);
  return 1;
}

To make the robot turn left, we simply make the motors rotate in opposite directions:

int left(String command) {
  send_motor_command(speed_motor1,direction_motor1,75,0);
  send_motor_command(speed_motor2,direction_motor2,75,1);
  return 1;
}

We also have a function to stop the robot:

int stop(String command) {
  send_motor_command(speed_motor1,direction_motor1,0,1);
  send_motor_command(speed_motor2,direction_motor2,0,1);
  return 1;
}

There is also a function to make the robot turn right, which is not detailed here.

We are now going to test the robot. Before you do anything, ensure that the battery is always plugged into the robot. This will ensure that the motors are not trying to get power from your computer's USB port, which could damage it. Also place some small support at the bottom of the robot so that the wheels don't touch the ground. This will ensure that you can test all the commands of the robot without the robot moving too far from your computer, as it is still attached via the USB cable. Now you can upload the sketch to your Arduino Uno board.
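Since each movement function above reduces to a per-motor (PWM, direction) pair, the command logic is easy to model outside the Arduino environment. The following Python sketch is purely illustrative (the command names and the 0–255 PWM values mirror the sketch above; the function and dictionary layout are our own):

```python
# Illustrative model of the robot's motor commands: each movement maps to a
# (pwm, direction) pair per motor, mirroring send_motor_command() in the sketch.

def motor_commands(command, pwm_forward=100, pwm_turn=75):
    """Return {motor: (pwm, direction)} for a movement command."""
    table = {
        "forward":  {"motor1": (pwm_forward, 1), "motor2": (pwm_forward, 1)},
        "backward": {"motor1": (pwm_forward, 0), "motor2": (pwm_forward, 0)},
        # Turning spins the motors in opposite directions.
        "left":     {"motor1": (pwm_turn, 0),    "motor2": (pwm_turn, 1)},
        "right":    {"motor1": (pwm_turn, 1),    "motor2": (pwm_turn, 0)},
        "stop":     {"motor1": (0, 1),           "motor2": (0, 1)},
    }
    return table[command]

print(motor_commands("left"))
```

Modeling the mapping this way makes it obvious why the left and right functions differ only in which motor reverses.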
Open the serial monitor and type the following:

/forward

This should make both wheels of the robot turn in the same direction. You can also try the other commands to move the robot to make sure they all work properly. Then, test the ultrasonic distance sensor by typing the following:

/distance

You should get back the distance (in centimeters) in front of the sensor:

{"distance": 24, "id": "001", "name": "mobile_robot", "connected": true}

Try changing the distance by putting your hand in front of the sensor and typing the command again.

Writing the Arduino sketch

Now that we have made sure that the robot is working properly, we can write the final sketch that will receive the commands via Bluetooth. As the sketch shares many similarities with the test sketch, we are only going to look at what has been added. We first need to include more libraries:

#include <SPI.h>
#include "Adafruit_BLE_UART.h"
#include <aREST.h>

We also define which pins the BLE module is connected to:

#define ADAFRUITBLE_REQ 10
#define ADAFRUITBLE_RDY 2 // This should be an interrupt pin; on the Uno, that's pin 2 or 3
#define ADAFRUITBLE_RST 9

We have to create an instance of the BLE module:

Adafruit_BLE_UART BTLEserial = Adafruit_BLE_UART(ADAFRUITBLE_REQ, ADAFRUITBLE_RDY, ADAFRUITBLE_RST);

In the setup() function of the sketch, we initialize the BLE chip:

BTLEserial.begin();

In the loop() function, we check the status of the BLE chip and store it in a variable:

BTLEserial.pollACI();
aci_evt_opcode_t status = BTLEserial.getState();

If we detect that a device is connected to the chip, we handle the incoming request with the aREST library, which allows us to use the same commands as before to control the robot:

if (status == ACI_EVT_CONNECTED) {
  rest.handle(BTLEserial);
}

You can now upload the code to your Arduino board, again making sure that the battery is connected to the Arduino Uno board via the power jack.
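The aREST reply shown above is plain JSON, so any client (the Android app, or a quick test script on your computer) can parse it directly. A minimal Python sketch, reusing the example reply from the text:

```python
import json

# Example aREST reply from the /distance command (taken from the text above).
reply = '{"distance": 24, "id": "001", "name": "mobile_robot", "connected": true}'

data = json.loads(reply)
print(data["distance"], data["name"])
```

This is all a client needs to do to extract the distance reading for obstacle-avoidance logic.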
You can now move on to the development of the Android application to control the robot.

Summary

In this article, we created our very own mobile robot together with a companion Android application that we can use to control it. We achieved this step by step by setting up an Arduino-enabled robot and coding the companion Android application, which uses the BLE software and hardware of an Android physical device running Android 4.3 or higher.

Resources for Article:

Further resources on this subject:
Our First Project – A Basic Thermometer [article]
Hardware configuration [article]
Avoiding Obstacles Using Sensors [article]
Packt
17 Dec 2014
24 min read

Mastering Splunk: Lookups

In this article, by James Miller, author of the book Mastering Splunk, we will discuss Splunk lookups and workflows. The topics that will be covered in this article are as follows:

The value of a lookup
Design lookups
File lookups
Script lookups

(For more resources related to this topic, see here.)

Lookups

Machines constantly generate data, usually in a raw form that is most efficient for processing by machines, but not easily understood by "human" data consumers. Splunk has the ability to identify unique identifiers and/or result or status codes within the data. This gives you the ability to enhance the readability of the data by adding descriptions or names as new search result fields. These fields contain information from an external source such as a static table (a CSV file) or the dynamic result of a Python command or a Python-based script. Splunk's lookups can use information within returned events or time information to determine how to add other fields from your previously defined external data sources.

To illustrate, here is an example of a Splunk static lookup that:

Uses the Business Unit value in an event
Matches this value with the organization's business unit name in a CSV file
Adds the definition to the event (as the Business Unit Name field)

So, if you have an event where the Business Unit value is equal to 999999, the lookup will add the Business Unit Name value as Corporate Office to that event.

More sophisticated lookups can:

Populate a static lookup table from the results of a report.
Use a Python script (rather than a lookup table) to define a field. For example, a lookup can use a script to return a server name when given an IP address.
Perform a time-based lookup if your lookup table includes a field value that represents time.
Let's take a look at an example of a search pipeline that creates a table based on IBM Cognos TM1 file extractions:

sourcetype=csv 2014 "Current Forecast" "Direct" "513500"
| rename May as "Month" Actual as "Version" "FY 2012" as Year 650693NLR001 as "Business Unit" 100000 as "FCST" "09997_Eliminations Co 2" as "Account" "451200" as "Activity"
| eval RFCST= round(FCST)
| Table Month, "Business Unit", RFCST

The following table shows the results generated:

Now, add the lookup command to our search pipeline to have Splunk convert Business Unit into Business Unit Name:

sourcetype=csv 2014 "Current Forecast" "Direct" "513500"
| rename May as "Month" Actual as "Version" "FY 2012" as Year 650693NLR001 as "Business Unit" 100000 as "FCST" "09997_Eliminations Co 2" as "Account" "451200" as "Activity"
| eval RFCST= round(FCST)
| lookup BUtoBUName BU as "Business Unit" OUTPUT BUName as "Business Unit Name"
| Table Month, "Business Unit", "Business Unit Name", RFCST

The lookup command in our Splunk search pipeline will now add Business Unit Name to the results table:

Configuring a simple field lookup

In this section, we will configure a simple Splunk lookup.

Defining lookups in Splunk Web

You can set up a lookup using the Lookups page (in Splunk Web) or by configuring stanzas in the props.conf and transforms.conf files. Let's take the easier approach first and use the Splunk Web interface. Before we begin, we need to establish our lookup table, which will be an industry-standard comma-separated (CSV) file. Our example is one that converts business unit codes to more user-friendly business unit names. For example, we have the following information:

Business unit code | Business unit name
999999 | Corporate office
VA0133SPS001 | South-western
VA0133NLR001 | North-east
685470NLR001 | Mid-west

In the events data, only business unit codes are included. In an effort to make our Splunk search results more readable, we want to add the business unit name to our results table.
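Under the hood, a static lookup behaves like a left join of an event field against the CSV table. As a rough illustration (not Splunk code, just a Python model of the BUtoBUName example above):

```python
import csv, io

# The lookup table from the example above, as CSV text.
lookup_csv = """BU,BUName
999999,Corporate office
VA0133SPS001,South-western
VA0133NLR001,North-east
685470NLR001,Mid-west
"""

# Build a BU -> BUName mapping, then enrich events the way
# "| lookup BUtoBUName BU as ... OUTPUT BUName as ..." does.
table = {row["BU"]: row["BUName"] for row in csv.DictReader(io.StringIO(lookup_csv))}

events = [{"Business Unit": "999999"}, {"Business Unit": "685470NLR001"}]
for e in events:
    e["Business Unit Name"] = table.get(e["Business Unit"])

print(events)
```

Thinking of the lookup as a dictionary join also explains why duplicate keys in the table matter: only the matching entries Splunk selects contribute output fields.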
To do this, we've converted our information (shown in the preceding table) to a CSV file (named BUtoBUName.csv).

For this example, we've kept our lookup table simple, but lookup tables (files) can be as complex as you need them to be. They can have numerous fields (columns) in them. A Splunk lookup table has a few requirements, as follows:

A table must contain a minimum of two columns
Each of the columns in the table can have duplicate values
You should use (plain) ASCII text and not non-UTF-8 characters

Now, from Splunk Web, we can click on Settings and then select Lookups. From the Lookups page, we can select Lookup table files. From the Lookup table files page, we can add our new lookup file (BUtoBUName.csv). By clicking on the New button, we see the Add new page, where we can set up our file by doing the following:

Select a Destination app (this is a drop-down list; select Search).
Enter (or browse to) our file under Upload a lookup file.
Provide a Destination filename.
Then, click on Save.

Once you click on Save, you should receive the Successfully saved "BUtoBUName" in search message.

In the previous screenshot, the lookup file is saved by default as private. You will need to adjust permissions to allow other Splunk users to use it.
Going back to the Lookups page, we can select Lookup definitions to see the Lookup definitions page. In the Lookup definitions page, we can click on New to visit the Add new page (shown in the following screenshot) and set up our definition as follows:

Destination app: The lookup will be part of the Splunk search app
Name: Our file is BUtoBUName
Type: Here, we will select File-based
Lookup file: The filename is ButoBUName.csv, which we uploaded, without the .csv suffix

Again, we should see the Successfully saved "BUtoBUName" in search message. Now, our lookup is ready to be used.

Automatic lookups

Rather than having to code for a lookup in each of your Splunk searches, you have the ability to configure automatic lookups for a particular source type. To do this from Splunk Web, we can click on Settings and then select Lookups. From the Lookups page, click on Automatic lookups. In the Automatic lookups page, click on New. In the Add New page, we will fill in the required information to set up our lookup:

Destination app: For this field, the options are framework, launcher, learned, search, and splunk_datapreview (for our example, select search).
Name: This provides a user-friendly name that describes this automatic lookup.
Lookup table: This is the name of the lookup table you defined with a CSV file (discussed earlier in this article).
Apply to: This is the type that you want this automatic lookup to apply to. The options are sourcetype, source, or host (I've picked sourcetype).
Named: This is the name of the type you picked under Apply to. I want my automatic search to apply to all searches with the sourcetype of csv.
Lookup input fields: This is simple in my example. In my lookup table, the field to be searched on will be BU, and the = field value will be the field in the event results that I am converting; in my case, it was the field 650693NLR001.
Lookup output fields: This will be the field in the lookup table that I am converting to, which in my example is BUName; I want to call it Business Unit Name, so this becomes the = field value.
Overwrite field values: This is a checkbox where you can tell Splunk to overwrite existing values in your output fields (I checked it).

The Add new page

The Splunk Add new page (shown in the following screenshot) is where you enter the lookup information (detailed in the previous section).

Once you have entered your automatic lookup information, you can click on Save, and you will receive the Successfully saved "Business Unit to Business Unit Name" in search message.

Now, we can use the lookup in a search. For example, you can run a search with sourcetype=csv, as follows:

sourcetype=csv 2014 "Current Forecast" "Direct" "513500"
| rename May as "Month" Actual as "Version" "FY 2012" as Year 650693NLR001 as "Business Unit" 100000 as "FCST" "09997_Eliminations Co 2" as "Account" "451200" as "Activity"
| eval RFCST= round(FCST)
| Table "Business Unit", "Business Unit Name", Month, RFCST

Notice in the following screenshot that Business Unit Name is converted to the user-friendly values from our lookup table, and we didn't have to add the lookup command to our search pipeline.

Configuration files

In addition to using the Splunk web interface, you can define and configure lookups using the following files:

props.conf
transforms.conf

To set up a lookup with these files (rather than using Splunk Web), we can perform the following steps:

Edit transforms.conf to define the lookup table. The first step is to edit the transforms.conf configuration file to add the new lookup reference. Although the file exists in the Splunk default folder ($SPLUNK_HOME/etc/system/default), you should edit the file in $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/ (if the file doesn't exist here, create it).
Whenever you edit a Splunk .conf file, always edit a local version, keeping the original (system directory version) intact.

In the current version of Splunk, there are two types of lookup tables: static and external. Static lookups use CSV files, and external (dynamic) lookups use Python scripting. You have to decide whether your lookup will be static (in a file) or dynamic (using script commands). If you are using a file, you'll use filename; if you are going to use a script, you'll use external_cmd (both are set in the transforms.conf file). You can also limit the number of matching entries to apply to an event by setting the max_matches option (this tells Splunk to use the first <integer> entries, in file order). I've decided to leave the default for max_matches, so my transforms.conf file looks like the following:

[butobugroup]
filename = butobugroup.csv

This step is optional. Edit props.conf to apply your lookup table automatically. For both static and external lookups, you stipulate the fields you want to match in the configuration file and the output from the lookup table that you defined in your transforms.conf file. It is okay to have multiple field lookups defined in one source lookup definition, but each lookup should have its own unique lookup name; for example, if you have multiple tables, you can name them LOOKUP-table01, LOOKUP-table02, and so on, or something perhaps more easily understood. If you add a lookup to your props.conf file, this lookup is automatically applied to all events from searches that have matching source types (again, as mentioned earlier, if your automatic lookup is very slow, it will also impact the speed of your searches).

Restart Splunk to see your changes.

Implementing a lookup using configuration files – an example

To illustrate the use of configuration files in order to implement an automatic lookup, let's use a simple example.
Once again, we want to convert a field from a unique identification code for an organization's business unit to a more user-friendly descriptive name called BU Group. What we will do is match the field bu in the lookup table butobugroup.csv with a field in our events, and then add the bugroup (description) to the returned events. The following shows the contents of the butobugroup.csv file:

bu, bugroup
999999, leadership-group
VA0133SPS001, executive-group
650914FAC002, technology-group

You can put this file into $SPLUNK_HOME/etc/apps/<app_name>/lookups/ and carry out the following steps:

Put the butobugroup.csv file into $SPLUNK_HOME/etc/apps/search/lookups/, since we are using the search app.

As we mentioned earlier, edit the transforms.conf file located at either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/. Add the following two lines:

[butobugroup]
filename = butobugroup.csv

Next, as mentioned earlier in this article, edit the props.conf file located at either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/. Here, add the following two lines:

[csv]
LOOKUP-check = butobugroup bu AS 650693NLR001 OUTPUT bugroup

Restart the Splunk server. You can (assuming you are logged in as an admin or have admin privileges) restart the Splunk server through the web interface by going to Settings, then selecting System, and finally Server controls.
Now, you can run a search for sourcetype=csv (as shown here):

sourcetype=csv 2014 "Current Forecast" "Direct" "513500"
| rename May as "Month", 650693NLR001 as "Business Unit" 100000 as "FCST"
| eval RFCST= round(FCST)
| Table "Business Unit", "Business Unit Name", bugroup, Month, RFCST

You will see that the field bugroup can be returned as part of your event results.

Populating lookup tables

Of course, you can create CSV files from external systems (or perhaps even manually), but from time to time, you might have the opportunity to create lookup CSV files (tables) from event data using Splunk. A handy command to accomplish this is outputcsv (which is covered in detail later in this article). The following is a simple example of creating a CSV file from Splunk event data that can be used for a lookup table:

sourcetype=csv "Current Forecast" "Direct"
| rename 650693NLR001 as "Business Unit"
| Table "Business Unit", "Business Unit Name", bugroup
| outputcsv splunk_master

The results are shown in the following screenshot:

Of course, the output table isn't quite usable, since the results have duplicates. Therefore, we can rewrite the Splunk search pipeline, introducing the dedup command (as shown here):

sourcetype=csv "Current Forecast" "Direct"
| rename 650693NLR001 as "Business Unit"
| dedup "Business Unit"
| Table "Business Unit", "Business Unit Name", bugroup
| outputcsv splunk_master

Then, we can examine the results (now with more desirable results):

Handling duplicates with dedup

This command allows us to set the number of duplicate events to be kept based on the values of a field (in other words, we can use this command to drop duplicates from our event results for a selected field).
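The dedup behavior used here, keeping only the first event for each value of a field, can be sketched in a few lines of Python (an illustrative model of the command's default behavior, not Splunk internals):

```python
def dedup(events, field):
    """Keep only the first event for each value of `field`, preserving order
    (mirroring Splunk's dedup with no count argument)."""
    seen, out = set(), []
    for e in events:
        if e[field] not in seen:
            seen.add(e[field])
            out.append(e)
    return out

# Invented sample rows with a duplicated business unit code.
rows = [{"bu": "999999"}, {"bu": "999999"}, {"bu": "VA0133SPS001"}]
print(dedup(rows, "bu"))
```

Because only the first occurrence survives, the order of events entering dedup determines which row ends up in your generated lookup file.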
The event returned for the dedup field will be the first event found (if you provide a number directly after the dedup command, it is interpreted as the number of duplicate events to keep; if you don't specify a number, dedup keeps only the first occurring event and removes all consecutive duplicates). The dedup command also lets you sort by a field or list of fields. This removes all the duplicates and then sorts the results based on the specified sort-by field. Adding a sort in conjunction with the dedup command can affect performance, as Splunk performs the dedup operation and then sorts the results as a final step. Here is a search command using dedup:

sourcetype=csv "Current Forecast" "Direct"
| rename 650693NLR001 as "Business Unit"
| dedup "Business Unit" sortby bugroup
| Table "Business Unit", "Business Unit Name", bugroup
| outputcsv splunk_master

The result of the preceding command is shown in the following screenshot:

Now, we have our CSV lookup file (outputcsv splunk_master) generated and ready to be used:

Look for your generated output file in $SPLUNK_HOME/var/run/splunk.

Dynamic lookups

With a Splunk static lookup, your search reads through a file (a table) that was created or updated prior to executing the search. With dynamic lookups, the file is created at the time the search executes. This is possible because Splunk has the ability to execute an external command or script as part of your Splunk search. At the time of writing this book, Splunk only directly supports Python scripts for external lookups. If you are not familiar with Python, its implementation began in 1989 and it is a widely used general-purpose, high-level programming language, often used as a scripting language (but also used in a wide range of non-scripting contexts). Keep in mind that any external resources (such as a file) or scripts that you want to use with your lookup will need to be copied to a location where Splunk can find them.
These locations are:

$SPLUNK_HOME/etc/apps/<app_name>/bin
$SPLUNK_HOME/etc/searchscripts

The following sections describe the process of using the dynamic lookup example script that ships with Splunk (external_lookup.py).

Using Splunk Web

Just like with static lookups, Splunk makes it easy to define a dynamic or external lookup using the Splunk web interface. First, click on Settings and then select Lookups. On the Lookups page, we can select Lookup table files to define a CSV file that contains the input file for our Python script. In the Add new page, we enter the following information:

Destination app: For this field, select Search
Upload a lookup file: Here, you can browse to the filename (my filename is dnsLookup.csv)
Destination filename: Here, enter dnslookup

Now, click on Save. The lookup file (shown in the following screenshot) is a text CSV file that needs to (at a minimum) contain the two field names that the Python (py) script accepts as arguments, in this case, host and ip. As mentioned earlier, this file needs to be copied to $SPLUNK_HOME/etc/apps/<app_name>/bin.

Next, from the Lookups page, select Lookup definitions and then click on New. This is where you define your external lookup. Enter the following information:

Type: For this, select External (as this lookup will run an external script)
Command: For this, enter external_lookup.py host ip (this is the name of the py script and its two arguments)
Supported fields: For this, enter host, ip (this indicates the two script input field names)

Now, click on Save.
Using configuration files instead of Splunk Web

Again, just like with static lookups in Splunk, dynamic lookups can also be configured in the Splunk transforms.conf file:

[myLookup]
external_cmd = external_lookup.py host ip
external_type = python
fields_list = host, ip
max_matches = 200

Let's learn more about the terms here:

[myLookup]: This is the report stanza.
external_cmd: This is the actual runtime command definition. Here, it executes the Python (py) script external_lookup, which requires two arguments (or parameters), host and ip.
external_type (optional): This indicates that this is a Python script. Although this is an optional entry in the transforms.conf file, it's a good habit to include it for readability and support.
fields_list: This lists all the fields supported by the external command or script, delimited by a comma and space.

The next step is to modify the props.conf file, as follows:

[mylookup]
LOOKUP-rdns = dnslookup host ip OUTPUT ip

After updating the Splunk configuration files, you will need to restart Splunk.

External lookups

The external lookup example given uses a Python (py) script named external_lookup.py, which is a DNS lookup script that can return an IP address for a given host name or a host name for a provided IP address.

Explanation

The lookup table field in this example is named ip, so Splunk will mine all of the IP addresses found in the indexed logs' events and add the values of ip from the lookup table into the ip field in the search events.
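An external lookup script communicates with Splunk over CSV: Splunk writes partially filled rows to the script's stdin and reads completed rows back from stdout. The following Python sketch models that contract with a stubbed resolver in place of real DNS (the shipped external_lookup.py uses socket.gethostbyname_ex; the host/IP pairs below are invented for illustration):

```python
import csv, io

# Stub resolver standing in for real DNS resolution; these pairs are made up.
RESOLVER = {"tm1server": "10.0.0.12", "webserver": "10.0.0.7"}

def run_lookup(stdin_text):
    """Read Splunk's CSV request, fill in the missing 'ip' column, return CSV."""
    reader = csv.DictReader(io.StringIO(stdin_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if not row.get("ip"):
            row["ip"] = RESOLVER.get(row["host"], "")
        writer.writerow(row)
    return out.getvalue()

result = run_lookup("host,ip\ntm1server,\nwebserver,\n")
print(result)
```

A real script would read sys.stdin and write to sys.stdout, but the fill-in-the-blank CSV shape shown here is the essence of the protocol.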
We can notice the following:

If you look at the py script, you will notice that the example uses an MS Windows supported socket.gethostbyname_ex(host) function
The host field has the same name in the lookup table and the events, so you don't need to do anything else

Consider the following search command:

sourcetype=tm1* | lookup dnslookup host | table host, ip

When you run this command, Splunk uses the lookup table to pass the values for the host field as a CSV file (the text CSV file we looked at earlier) into the external command script. The py script then outputs the results (with both the host and ip fields populated) and returns them to Splunk, which populates the ip field in a result table:

Output of the py script with both the host and ip fields populated

Time-based lookups

If your lookup table has a field value that represents time, you can use the time field to set up a Splunk fields lookup. As mentioned earlier, the Splunk transforms.conf file can be modified to add a lookup stanza. For example, the following screenshot shows a file named MasteringDCHP.csv. You can add the following code to the transforms.conf file:

[MasteringDCHP]
filename = MasteringDCHP.csv
time_field = TimeStamp
time_format = %d/%m/%y %H:%M:%S %p
max_offset_secs = <integer>
min_offset_secs = <integer>

The file parameters are defined as follows:

[MasteringDCHP]: This is the report stanza
filename: This is the name of the CSV file to be used as the lookup table
time_field: This is the field in the file that contains the time information and is to be used as the timestamp
time_format: This indicates what format the time field is in
max_offset_secs and min_offset_secs: These indicate the maximum/minimum amount of offset time for an event to occur after a lookup entry

Be careful with the preceding values; the offset relates to the timestamp in your lookup (CSV) file. Setting a tight (small) offset range might reduce the effectiveness of your lookup results!

The last step will be to restart Splunk.
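The offset parameters simply bound how far after a lookup entry's timestamp a matching event may occur. A small Python model of that window check (the epoch values and the 300-second window are invented for illustration):

```python
def matches(event_time, entry_time, min_offset_secs=0, max_offset_secs=300):
    """True if the event occurs within [min, max] seconds after the lookup entry."""
    offset = event_time - entry_time
    return min_offset_secs <= offset <= max_offset_secs

entry = 1_400_000_000                # lookup-table timestamp (epoch seconds, invented)
print(matches(entry + 120, entry))   # within the window
print(matches(entry + 900, entry))   # too late to match
```

This is why a very tight offset range can silently drop matches: events just outside the window get no lookup fields at all.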
An easier way to create a time-based lookup

Again, it's a lot easier to use the Splunk Web interface to set up our lookup. Here is the step-by-step process:

From Settings, select Lookups, and then Lookup table files.
In the Lookup table files page, click on New, configure our lookup file, and then click on Save. You should receive the Successfully saved "MasterDHCP" in search message.
Next, select Lookup definitions, and from this page, click on New.
In the Add new page, we define our lookup table with the following information:

Destination app: For this, select search from the drop-down list
Name: For this, enter MasterDHCP (this is the name you'll use in your lookup)
Type: For this, select File-based (as this lookup table definition is a CSV file)
Lookup file: For this, select the name of the file to be used from the drop-down list (ours is MasteringDCHP)
Configure time-based lookup: Check this checkbox
Name of time field: For this, enter TimeStamp (this is the field name in our file that contains the time information)
Time format: For this, enter the string that describes to Splunk the format of our time field (our field uses this format: %d%m%y %H%M%S)

You can leave the rest blank and click on Save. You should receive the Successfully saved "MasterDHCP" in search message. Now, we are ready to try our search:

sourcetype=dh* | lookup MasterDHCP IP as "IP" | table DHCPTimeStamp, IP, UserId | sort UserId

The following screenshot shows the output:

Seeing double?

Lookup table definitions are indicated with the attribute LOOKUP-<class> in the Splunk configuration file, props.conf, or in the web interface under Settings | Lookups | Lookup definitions.
If you use the Splunk Web interface (which we've demonstrated throughout this article) to set up or define your lookup table definitions, Splunk will prevent you from creating duplicate table names, as shown in the following screenshot:

However, if you define your lookups using the configuration settings, it is important to try and keep your table definition names unique. If you do give the same name to multiple lookups, the following rules apply:

- If you have defined lookups with the same stanza (that is, using the same host, source, or source type), the first defined lookup in the configuration file wins and overrides all others.
- If lookups have different stanzas but overlapping events, the following logic is used by Splunk:
  - Events that match the host get the host lookup
  - Events that match the sourcetype get the sourcetype lookup
  - Events that match both only get the host lookup

It is a proven practice recommendation to make sure that all of your lookup stanzas have unique names.

Command roundup

This section lists several important Splunk commands you will use when working with lookups.

The lookup command

The Splunk lookup command is used to manually invoke field lookups using a Splunk lookup table that has been previously defined. You can use Splunk Web (or the transforms.conf file) to define your lookups. If you do not specify OUTPUT or OUTPUTNEW, all fields in the lookup table (excluding the lookup match field) will be used by Splunk as output fields. Conversely, if OUTPUT is specified, the output lookup fields will overwrite existing fields, and if OUTPUTNEW is specified, the lookup will not be performed for events in which the output fields already exist.
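The overlap rules above boil down to a simple precedence check: a host-based lookup beats a sourcetype-based one whenever both match an event. A small Python illustration of that decision logic (the function and names are ours, purely for illustration):

```python
def winning_lookup(matches):
    # matches: the set of stanza types whose pattern matched the event.
    # Host-based lookups take precedence over sourcetype-based ones.
    if "host" in matches:
        return "host"
    if "sourcetype" in matches:
        return "sourcetype"
    return None

print(winning_lookup({"host", "sourcetype"}))  # host
```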
For example, suppose you have a lookup table specified as iptousername with (at least) two fields, IP and UserId. For each event, Splunk will look up the value of the field IP in the table and, for any entries that match, the value of the UserId field in the lookup table will be written to the field user_name in the event. The query is as follows:

... | lookup iptousername IP as "IP" OUTPUT UserId as user_name

Always strive to perform lookups after any reporting commands in your search pipeline, so that the lookup only needs to match the results of the reporting command and not every individual event.

The inputlookup and outputlookup commands

The inputlookup command allows you to load search results from a specified static lookup table. It reads in a specified CSV filename (or a table name as specified by the stanza name in transforms.conf). If the append=t (that is, true) command is added, the data from the lookup file is appended to the current set of results (instead of replacing it). The outputlookup command then lets us write the results' events to a specified static lookup table (as long as this output lookup table is defined). So, here is an example of reading in the MasterDHCP lookup table (as specified in transforms.conf) and writing these event results to the lookup table definition NewMasterDHCP:

| inputlookup MasterDHCP | outputlookup NewMasterDHCP

After running the preceding command, we can see the following output:

Note that we can add the append=t command to the search in the following fashion:

| inputlookup MasterDHCP.csv | inputlookup NewMasterDHCP.csv append=t

The inputcsv and outputcsv commands

The inputcsv command is similar to the inputlookup command in that it loads search results, but this command loads from a specified CSV file. The filename must refer to a relative path in $SPLUNK_HOME/var/run/splunk, and if the specified file does not exist and the filename did not have an extension, then a filename with a .csv extension is assumed.
The outputcsv command lets us write our result events to a CSV file. Here is an example where we read in a CSV file named splunk_master.csv, search for the text phrase FPM, and then write any matching events to a CSV file named FPMBU.csv:

| inputcsv splunk_master.csv | search "Business Unit Name"="FPM" | outputcsv FPMBU.csv

The following screenshot shows the results from the preceding search command:

The following screenshot shows the resulting file generated as a result of the preceding command:

Here is another example where we read in the same CSV file (splunk_master.csv) and write out up to 500 events, starting with the 51st:

| inputcsv splunk_master start=50 max=500

Events are numbered starting with zero as the first entry (rather than 1), so start=50 skips the first 50 events.

Summary

In this article, we defined Splunk lookups and discussed their value. We also went through the two types of lookups, static and dynamic, and saw detailed, working examples of each. Various Splunk commands typically used with the lookup functionality were also presented.

Resources for Article:

Further resources on this subject:
Working with Apps in Splunk [article]
Processing Tweets with Apache Hive [article]
Indexes [article]
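The start/max semantics map directly onto zero-based slicing, which makes the off-by-one easy to reason about. A Python analogy (the event numbers are invented, purely illustrative):

```python
# Events numbered 1..1000 for readability; Splunk indexes them from 0.
events = list(range(1, 1001))

start, max_rows = 50, 500
returned = events[start:start + max_rows]

print(returned[0], returned[-1], len(returned))  # 51 550 500
```

Because indexing starts at zero, start=50 lands on the 51st event, and max=500 caps how many rows come back.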
Ridge Regression

Packt
16 Dec 2014
9 min read
In this article by Patrick R. Nicolas, the author of the book Scala for Machine Learning, we will cover the basics of ridge regression. The purpose of regression is to minimize a loss function, the residual sum of squares (RSS) being the one commonly used. The problem of overfitting can be addressed by adding a penalty term to the loss function. The penalty term is an element of the larger concept of regularization.

(For more resources related to this topic, see here.)

Ln roughness penalty

Regularization consists of adding a penalty function J(w) to the loss function (or RSS in the case of a regressive classifier) in order to prevent the model parameters (or weights) from reaching high values. A model that fits a training set very well tends to have many feature variables with relatively large weights. This process is known as shrinkage. Practically, shrinkage consists of adding a function with the model parameters as an argument to the loss function:

loss(w) = RSS(w) + λ J(w)

The penalty function is completely independent from the training set {x,y}. The penalty term is usually expressed as a power function of the norm of the model parameters (or weights) wd. For a model of D dimensions, the generic Lp-norm is defined as follows:

||w||p = (Σ j=1..D |wj|^p)^(1/p)

Notation

Regularization applies to the parameters or weights associated to an observation. In order to be consistent with our notation, w0 being the intercept value, the regularization applies to the parameters w1 … wd.

The two most commonly used penalty functions for regularization are L1 and L2.

Regularization in machine learning

The regularization technique is not specific to linear or logistic regression. Any algorithm that minimizes the residual sum of squares, such as a support vector machine or a feed-forward neural network, can be regularized by adding a roughness penalty function to the RSS. The L1 regularization applied to linear regression is known as the Lasso regularization.
The Ridge regression is a linear regression that uses the L2 regularization penalty. You may wonder which regularization makes sense for a given training set. In a nutshell, L2 and L1 regularizations differ in terms of computation efficiency, estimation, and features selection (refer to the 13.3 L1 regularization: basics section in the book Machine Learning: A Probabilistic Perspective, and the Feature selection, L1 vs. L2 regularization, and rotational invariance paper available at http://www.machinelearning.org/proceedings/icml2004/papers/354.pdf).

The various differences between the two regularizations are as follows:

- Model estimation: L1 generates a sparser estimation of the regression parameters than L2. For large non-sparse datasets, L2 has a smaller estimation error than L1.
- Feature selection: L1 is more effective in reducing the regression weights for features with high value than L2. Therefore, L1 is a reliable features selection tool.
- Overfitting: Both L1 and L2 reduce the impact of overfitting. However, L1 has a significant advantage in overcoming overfitting (or excessive complexity of a model) for the same reason it is more appropriate for selecting features.
- Computation: L2 is conducive to a more efficient computation model. The summation of the loss function and the L2 penalty ||w||2^2 is a continuous and differentiable function for which the first and second derivatives can be computed (convex minimization). The L1 term is the summation of |wi|, and is therefore not differentiable.

Terminology

The ridge regression is sometimes called the penalized least squares regression. The L2 regularization is also known as weight decay.

Let's implement the ridge regression, and then evaluate the impact of the L2-norm penalty factor.
Ridge regression

The ridge regression is a multivariate linear regression with an L2-norm penalty term, and its loss function can be written as follows:

loss(w) = Σi (yi - w·xi)^2 + λ ||w||2^2

The computation of the ridge regression parameters requires the resolution of a system of linear equations similar to the linear regression. The matrix representation of the ridge regression closed form is as follows:

w = (X^T X + λI)^-1 X^T y

I is the identity matrix, and the system is solved using the QR decomposition, as shown here:

X^T X + λI = QR  =>  w = R^-1 (Q^T X^T y)

Implementation

The implementation of the ridge regression adds the L2 regularization term to the multiple linear regression computation of the Apache Commons Math library. The methods of RidgeRegression have the same signature as its ordinary least squares counterpart. However, the class has to inherit the abstract base class AbstractMultipleLinearRegression in the Apache Commons Math library and override the generation of the QR decomposition to include the penalty term, as shown in the following code:

class RidgeRegression[T <% Double](val xt: XTSeries[Array[T]],
                                   val y: DblVector,
                                   val lambda: Double)
    extends AbstractMultipleLinearRegression
    with PipeOperator[Array[T], Double] {

  private var qr: QRDecomposition = null
  private[this] val model: Option[RegressionModel] = …
  …
}

Besides the input time series xt and the labels y, the ridge regression requires the lambda factor of the L2 penalty term. The instantiation of the class trains the model.
The steps to create the ridge regression model are as follows:

1. Extract the Q and R matrices for the input values, newXSampleData (line 1).
2. Compute the weights using calculateBeta defined in the base class (line 2).
3. Return the tuple of the regression weights calculateBeta and the residuals calculateResiduals.

private val model: Option[RegressionModel] = {
  this.newXSampleData(xt.toDblMatrix) //1
  newYSampleData(y)
  val _rss = calculateResiduals.toArray.map(x => x*x).sum
  val wRss = (calculateBeta.toArray, _rss) //2
  Some(RegressionModel(wRss._1, wRss._2))
}

The QR decomposition in the AbstractMultipleLinearRegression base class does not include the penalty term (line 3); the identity matrix with the lambda factor in the diagonal has to be added to the matrix to be decomposed (line 4):

override protected def newXSampleData(x: DblMatrix): Unit = {
  super.newXSampleData(x)   //3
  val xtx: RealMatrix = getX
  val nFeatures = xt(0).size
  Range(0, nFeatures).foreach(i =>
    xtx.setEntry(i, i, xtx.getEntry(i,i) + lambda)) //4
  qr = new QRDecomposition(xtx)
}

The regression weights are computed by resolving the system of linear equations using substitution on the QR matrices. It overrides the calculateBeta function from the base class:

override protected def calculateBeta: RealVector =
  qr.getSolver().solve(getY())

Test case

The objective of the test case is to identify the impact of the L2 penalization on the RSS value, and then compare the predicted values with the original values. Let's consider the first test case related to the regression on the daily price variation of the Copper ETF (symbol: CU) using the stock daily volatility and volume as features.
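The closed form is easy to verify numerically on a toy problem. A single-feature Python sketch — not the book's Scala implementation, and with made-up data points — shows the regression weight shrinking toward zero as lambda grows:

```python
def ridge_weight_1d(xs, ys, lam):
    # Closed form for one feature and no intercept:
    # w = sum(x*y) / (sum(x^2) + lambda)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.0]

w0 = ridge_weight_1d(xs, ys, 0.0)    # lambda = 0: ordinary least squares
w1 = ridge_weight_1d(xs, ys, 1.0)
w10 = ridge_weight_1d(xs, ys, 10.0)
assert w0 > w1 > w10 > 0             # larger lambda, smaller weight
```

The λ = 0 case reduces to least squares, matching the observation below that λ = 0 on the RSS graph corresponds to the unpenalized regression.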
The implementation of the extraction of observations is identical to that of the least squares regression:

val src = DataSource(path, true, true, 1)
val price = src |> YahooFinancials.adjClose
val volatility = src |> YahooFinancials.volatility
val volume = src |> YahooFinancials.volume //1

val _price = price.get.toArray
val deltaPrice = XTSeries[Double](_price
                               .drop(1)
                               .zip(_price.take(_price.size - 1))
                               .map(z => z._1 - z._2)) //2
val data = volatility.get
                     .zip(volume.get)
                     .map(z => Array[Double](z._1, z._2)) //3
val features = XTSeries[DblVector](data.take(data.size - 1))
val regression = new RidgeRegression[Double](features, deltaPrice, lambda) //4

regression.rss match {
  case Some(rss) => Display.show(rss, logger) //5
….

The observed data, the ETF daily price, and the features (volatility and volume) are extracted from the source src (line 1). The daily price change, deltaPrice, is computed using a combination of the Scala take and drop methods (line 2). The features vector is created by zipping volatility and volume (line 3). The model is created by instantiating the RidgeRegression class (line 4). The RSS value, rss, is finally displayed (line 5).

The RSS value, rss, is plotted for different values of lambda <= 1.0 in the following graph:

Graph of RSS versus Lambda for Copper ETF

The residual sum of squares decreases as λ increases. The curve seems to be reaching a minimum around λ=1. The case of λ = 0 corresponds to the least squares regression.

Next, let's plot the RSS value for λ varying between 1 and 100:

Graph of RSS versus large values of Lambda for Copper ETF

This time around, RSS increases with λ before reaching a maximum for λ > 60. This behavior is consistent with other findings (refer to Lecture 5: Model selection and assessment, a lecture by H. Bravo and R. Irizarry from the Department of Computer Science, University of Maryland, in 2010, available at http://www.cbcb.umd.edu/~hcorrada/PracticalML/pdf/lectures/selection.pdf). As λ increases, the overfitting gets more expensive, and therefore, the RSS value increases.

The regression weights can be simply output as follows:

regression.weights.get

Let's plot the predicted price variation of the Copper ETF using the ridge regression with different values of lambda (λ):

Graph of ridge regression on Copper ETF price variation with variable Lambda

The original price variation of the Copper ETF, Δ = price(t+1) - price(t), is plotted for λ = 0. The predicted values for λ = 0.8 follow the pattern of the original data closely, with the large variations (peaks and troughs) reduced. The predicted values for λ = 5 correspond to a smoothed dataset; the pattern of the original data is preserved, but the magnitude of the price variation is significantly reduced. The reader is invited to apply the more elaborate K-fold validation routine and compute precision, recall, and F1 measure to confirm the findings.

Summary

The ridge regression is a powerful alternative to the more common least squares regression because it reduces the risk of overfitting. Contrary to the Naïve Bayes classifiers, it does not require conditional independence of the model features.

Resources for Article:

Further resources on this subject:
Differences in style between Java and Scala code [Article]
Dependency Management in SBT [Article]
Introduction to MapReduce [Article]

Adding Graded Activities

Packt
16 Dec 2014
9 min read
This article by Rebecca Barrington, author of Moodle Gradebook Second Edition, teaches you how to add assignments and set up how they will be graded, including how to use our custom scales and add outcomes for grading.

(For more resources related to this topic, see here.)

As with all content within Moodle, we need to select Turn editing on within the course in order to be able to add resources and activities. All graded activities are added through the Add an activity or resource text available within each section of a Moodle course. This text can be found in the bottom right of each section after editing has been turned on.

There are a number of items that can be graded and will appear within the Gradebook. Assignments are the most feature-rich of all the graded activities and have many options available in order to customize how assessments can be graded. They can be used to provide assessment information for students, store grades, and provide feedback. When setting up the assignment, we can choose for students to submit their work electronically, either through file submission or online text, or we can review the assessment offline and use only the grade and feedback features of the assignment.

Adding assignments

There are many options within the assignments, and throughout this article we will set up a number of different assignments and you'll learn about some of their most useful features and options. Let's have a go at creating a range of assignments that are ready for grading.

Creating an assignment with a scale

The first assignment that we will add will make use of the PMD scale:

1. Click on the Turn editing on button.
2. Click on Add an activity or resource.
3. Click on Assignment and then click on Add.
4. In the Assignment name box, type in the name of the assignment (such as Task 1).
5. In the Description box, provide some assignment details.
6. In the Availability section, we need to disable the date options.
We will not make use of these options, but they can be very useful. To disable the options, click on the tick next to the Enable text. However, details of these options have been provided for future reference:

- The Allow submissions from section is mostly relevant when the assignment will be submitted electronically, as students won't be able to submit their work until the date and time indicated here.
- The Due date section can be used to indicate when the assignment needs to be submitted by. If students electronically submit their assignment after the date and time indicated here, the submission date and time will be shown in red in order to notify the teacher that it was submitted past the due date.
- The Cut off date section enables teachers to set an extension period after the due date where late submissions will continue to be accepted.

In the Submission types section, ensure that the File submissions checkbox is enabled by adding a tick there. This will enable students to submit their assignment electronically. There are additional options that we can choose as well. With Maximum number of uploaded files, we can indicate how many files a student can upload. Keep this as 1. We can also determine the Maximum submission size option for each file using the drop-down list shown in the following screenshot:

Within the Feedback types section, ensure that all options under the Feedback types section are selected:

- Feedback comments enables us to provide written feedback along with the grade.
- Feedback files enables us to upload a file in order to provide feedback to a student.
- Offline grading worksheet will provide us with the option to download a .csv file that contains core information about the assignment, and this can be used to add grades and feedback while working offline. This completed .csv file can be uploaded and the grades will be added to the assignments within the Gradebook.
In the Submission settings section, we have options related to how students will submit their assignment and how they will reattempt submission if required:

- If Require students click submit button is left as No, students will upload their assignment and it will be available to the teacher for grading. If this option is changed to Yes, students can upload their assignment, but the teacher will see that it is in the draft form. Students will click on Submit to indicate that it is ready to be graded.
- Require that students accept the submission statement will provide students with a statement that they need to agree to when they submit their assignment. The default statement is This assignment is my own work, except where I have acknowledged the use of works of other people. The submission statement can be changed by a site administrator by navigating to Site administration | Plugins | Activity modules | Assignment settings.
- The Attempts reopened drop-down list provides options for the status of the assignment after it has been graded. Students will only be able to resubmit their work when it is open; therefore, this setting controls when and if students are able to submit another version of their assignment. The options available to us are:
  - Never: This option should be selected if students will not be able to submit another piece of work.
  - Manually: This will enable anyone who has the role of a teacher to choose to reopen a submission, which enables a student to submit their work again.
  - Automatically until pass: This option works when a pass grade is set within the Gradebook. After grading, if the student is awarded the minimum pass grade or higher, the submission will remain closed in order to prevent any changes to the submission.
However, if the assignment is graded lower than the assigned pass grade, the submission will automatically reopen in order to enable the student to submit the assignment again.

- Maximum attempts: The maximum attempts allowed for this assignment will limit the number of times an assignment is reopened. For example, if this option is set to 3, then a student will only be able to submit their assignment three times. After they have submitted their assignment for a third time, they will not be allowed to submit it again. The default is unlimited, but it can be changed by clicking on the drop-down list.

Continuing with our assignment setup:

1. In the Submission settings section, ensure that the options for Require students click on submit button and Require that students accept the submission statement are set to Yes. Also, change Attempts reopened to Automatically until passed.
2. Within the Grade section, navigate to Grade | Type | Scale and choose the PMD scale.
3. Select Use marking workflow by changing the drop-down list to Yes. Use marking workflow is a new feature of Moodle 2.6 that enables the grading process to go through a range of stages in order to indicate that the marking is in progress or is complete, is being reviewed, or is ready for release to students.
4. Click on Save and return to course.

Creating an online assignment with a number grade

The next assignment that we will create will have an online text option and a maximum grade of 20. The following steps show you how to create an online assignment with a number grade:

1. Enable editing by clicking on Turn editing on.
2. Click on Add an activity or resource.
3. Click on Assignment and then click on Add.
4. In the Assignment name box, type in the name of the assignment (such as Task 2).
5. In the Description box, provide the assignment details.
6. In the Submission types section, ensure that Online text has a tick next to it. This will enable students to type directly into Moodle.
When choosing this option, we can also set a maximum word limit by clicking on the tick box next to the Enable text. After enabling this option, we can add a number to the textbox. For this assignment, enable a word limit of 200 words.

Continue with the following steps:

1. When using online text submission, we have an additional feedback option within the Feedback types section. Under the Comment inline text, click on No and switch to Yes to enable yourself to add written feedback for students within the written text submitted by students.
2. In the Submission settings section, ensure that the options for Require students click submit button and Require that students accept the submission statement are set to Yes. Also, change Attempts reopened to Automatically until passed.
3. Within the Grades section, navigate to Grade | Type | Point and ensure that Maximum points is set to 20.
4. Click on Save and return to course.

Creating an assignment including outcomes

The next assignment that we will create will add some of the outcomes:

1. Enable editing by clicking on Turn editing on.
2. Click on Add an activity or resource.
3. Click on Assignment and then click on Add.
4. In the Assignment name box, type in the name of the assignment (such as Task 3).
5. In the Description box, provide the assignment details.
6. In the Submission types box, ensure that Online text and File submissions are selected. Set Maximum number of uploaded files to 2.
7. In the Submission settings section, ensure that the options for Require students to click submit button and Require that students accept the submission statement are amended to Yes. Change Attempts reopened to Manually.
8. Within the Grades section, navigate to Grade | Type | Point and ensure that Maximum points is set to 100.
9. In the Outcomes section, choose the outcomes as Evidence provided and Criteria 1 met.
10. Scroll to the bottom of the screen and click on Save and return to course.
Summary In this article, we added a range of assignments that made use of number and scale grades as well as added outcomes to an assignment. Resources for Article: Further resources on this subject: Moodle for Online Communities [article] What's New in Moodle 2.0 [article] Moodle 2.0: What's New in Add a Resource [article]

Building an Information Radiator - Part 1

Andrew Fisher
16 Dec 2014
9 min read
Download the code files for this project here.

I love lights; specifically, I love LEDs - which have been described to me as "catnip for geeks". LEDs are low powered but bright, which means they can be embedded into all sorts of interesting places and, when coupled with a network, can be used for all sorts of ambient display purposes. In this post, I'll show you how to build an "information radiator" with a bit of Python and some LEDs, which you can then use to make your own for your own personal needs.

// An information radiator light showing the forecast temperature in Melbourne.

An information radiator is so called as it radiates information outwards from (often) a fixed point so that it can be interpreted by an observer. More complex information can be encoded through the use of color, brightness, or frequency of lighting. I'm going to show you how to build an ambient display that scrapes some data from a weather service and then displays it using colored light to indicate the forecasted temperature. This is quite a simple example, but by the end of this two-part post series, you will be able to change your information radiator to consider rain or multiple elements, or even point to something that is important to you.

Bill of materials

- Ethernet Arduino: The Freetronics EtherTen is excellent, but an Arduino Uno with an Ethernet shield works too. ($60)
- RGB LED: The light discs from DFRobot are great as they produce a lot of light. ($10)
- Computer: This is needed to run the Python script to check the weather.
- Wire: Red, green, blue, and white is ideal, but anything you have available is fine. ($2)
- Light fitting: Anything that diffuses light will be interesting. ($1+)

Tools required

These common tools will come in handy:

- Soldering iron
- Wire strippers

Design

You don't want the light attached to the computer all the time - what's the point of a light if you can just look up the weather on Google?
The device will connect to the network and exist somewhere visible, and then the processing can run on a mini server somewhere (such as a Raspberry Pi) and just send the device messages when needed. So, the system design looks like this: the microcontroller looks after the LED and exposes a network interface, and a Python script runs periodically on the server to check the weather forecast, get the data, and then send a message to the Arduino.

Building the light

The build of the light is quite straightforward. Cut four pieces of wire about 6 inches long (personal preference) and solder them to the four connections on the light disk.

// Light disc with wires soldered on.

Strip 5mm of wire from the other end and wire the light disk to the Arduino in the following way:

- R to pin 5
- G to pin 6
- B to pin 9
- Depending on the version of the light disc you have, wire GND to GND or 5V to 5V. The specifics are labelled on the disc itself; the newer discs are GND.

// Light disc wired into Arduino.

That's it! You're all done electronics-wise. Plug in an Ethernet cable and ensure you have 7-20V power supplied from a power pack to the Arduino.

Programming the Arduino

If you have never programmed an Arduino before, I suggest this tutorial as an excellent starting point. I'm going to assume you have got the Arduino IDE installed on your computer and you can upload sketches.

First, you need to test your wiring. The following Arduino code will cycle through combinations of colors for about 1 second each.
It will print the color to the serial console as well, so you can observe it with the serial monitor:

#define RED 5
#define GREEN 6
#define BLUE 9
#define MAX_COLOURS 8
#define GND true // change this to false if 5V type

char* colours[] = {"Off", "red", "green", "yellow", "blue", "magenta", "cyan", "white"};
uint8_t current_colour = 0;

void setup() {
  Serial.begin(9600);
  Serial.println("Testing lights");
  pinMode(RED, OUTPUT);
  pinMode(GREEN, OUTPUT);
  pinMode(BLUE, OUTPUT);
  if (GND) {
    digitalWrite(RED, LOW);
    digitalWrite(GREEN, LOW);
    digitalWrite(BLUE, LOW);
  } else {
    digitalWrite(RED, HIGH);
    digitalWrite(GREEN, HIGH);
    digitalWrite(BLUE, HIGH);
  }
}

void loop() {
  Serial.print("Current colour: ");
  Serial.println(colours[current_colour]);
  if (GND) {
    digitalWrite(RED, current_colour & 1);
    digitalWrite(GREEN, current_colour & 2);
    digitalWrite(BLUE, current_colour & 4);
  } else {
    digitalWrite(RED, !(bool)(current_colour & 1));
    digitalWrite(GREEN, !(bool)(current_colour & 2));
    digitalWrite(BLUE, !(bool)(current_colour & 4));
  }
  if ((++current_colour) >= MAX_COLOURS) current_colour = 0;
  delay(1000);
}

Notably, there is a flag to flip (#define GND true | false) depending on whether your light disc uses GND or 5V. All this does is reverse the output logic (on the GND disc, the light goes on when the pin goes HIGH, but on the 5V disc, the light goes on when the pin goes LOW). If the colors are muddled, you have probably just connected a wire to the wrong pin; just flip them over and it should be fine. If you aren't seeing any light, check your connections and ensure you are getting power to the light disk.

The next thing to do is write the sketch that will take messages from the network and update the light. To do this, we need to establish a protocol. There are many ways to define this, but for simplicity, a text protocol like JSON works sufficiently well.
Each message will look like this:

{r:val, g:val, b:val}

In each case, val is an unsigned byte, so it will be in the range 0-255:

// Adapted from the generic web server example that ships with the IDE,
// created by David Mellis and Tom Igoe.
#include "Arduino.h"
#include <Ethernet.h>
#include <SPI.h>
#include <string.h>
#include <stdlib.h>

#define DEBUG false // <1>

// Enter a MAC address and IP address for your controller below.
// The IP address will be dependent on your local network:
byte mac[] = { 0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xAE };
byte ip[] = { <PUT YOUR IP HERE AS COMMA BYTES> }; // eg 192,168,0,100
byte gateway[] = { <PUT YOUR GW HERE AS COMMA BYTES> }; // eg 192,168,0,1
byte subnet[] = { <PUT YOUR SUBNET HERE> }; // eg 255,255,255,0

// Initialize the Ethernet server library
// with the IP address and port you want to use (in this case telnet)
EthernetServer server(23);

#define BUFFERLENGTH 255

// these are the pins you wire each LED to.
#define RED 5
#define GREEN 6
#define BLUE 9
#define GND true // change this to false if 5V type, true if GND type light disc

void setup() {
  Ethernet.begin(mac, ip, gateway, subnet);
  server.begin();
#if DEBUG
  Serial.begin(9600);
  Serial.println("Awaiting connection");
#endif
}

void loop() {
  char buffer[BUFFERLENGTH];
  int index = 0;

  // Listen
  EthernetClient client = server.available();
  if (client) {
#if DEBUG
    Serial.println("Got a client");
#endif
    // reset the input buffer
    index = 0;

    while (client.connected()) {
      if (client.available()) {
        char c = client.read();

        // if it's not a new line, then add it to the buffer <2>
        if (c != '\n' && c != '\r') {
          buffer[index] = c;
          index++;
          if (index >= BUFFERLENGTH) index = BUFFERLENGTH - 1;
          continue;
        } else {
          buffer[index] = '\0';
        }

        // get the message string for processing
        String msgstr = String(buffer);

        // get just the bits we want between the {}
        msgstr = msgstr.substring(msgstr.lastIndexOf('{') + 1,
            msgstr.indexOf('}', msgstr.lastIndexOf('{')));
        msgstr.replace(" ", "");
        msgstr.replace("'", "");

#if DEBUG
        Serial.println("Message:");
        Serial.println(msgstr);
#endif

        // rebuild the buffer with just the message
        msgstr.toCharArray(buffer, BUFFERLENGTH);

        // iterate over the tokens of the message - assumed flat. <3>
        char *p = buffer;
        char *str;
        while ((str = strtok_r(p, ",", &p)) != NULL) {
#if DEBUG
          Serial.println(str);
#endif
          char *tp = str;
          char *key;
          char *val;

          // get the key and the value
          key = strtok_r(tp, ":", &tp);
          val = strtok_r(NULL, ":", &tp);

#if DEBUG
          Serial.print("Key: ");
          Serial.println(key);
          Serial.print("val: ");
          Serial.println(val);
#endif

          // <4>
          if (GND) {
            if (*key == 'r') analogWrite(RED, atoi(val));
            if (*key == 'g') analogWrite(GREEN, atoi(val));
            if (*key == 'b') analogWrite(BLUE, atoi(val));
          } else {
            if (*key == 'r') analogWrite(RED, 255 - atoi(val));
            if (*key == 'g') analogWrite(GREEN, 255 - atoi(val));
            if (*key == 'b') analogWrite(BLUE, 255 - atoi(val));
          }
        }
        break;
      }
    }
    delay(10); // give client time to send any data back
    client.stop();
  }
}

The most notable parts of the code are as follows:

<1> Add your own network settings here.
<2> This text parser just adds text to a buffer until a \n arrives.
<3> As this protocol is simple, a string tokenizer breaks the message up into its constituent key-value pairs.
<4> Use the RGB values to set the appropriate level on the PWM pins (noting the polarity reversal for GND versus 5V light discs).

To test the code, upload the sketch, ensure your Ethernet cable is plugged in, and attempt to connect to the device:

telnet <ip> 23

This should return something like the following:

Trying 10.0.1.91...
Connected to 10.0.1.91.
Escape character is '^]'.

Now, enter:

{r:200,g:0,b:0} <enter>

If the light changes to red, then everything is working - time to get some data. If not, check your code and make sure the messages are being interpreted properly (plug your computer in and use the serial debugger to watch the messages). Play around with changing the colors of your light by sending different values to the device.
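If you would rather script the test than type into telnet by hand, a few lines of Python can send the same message over TCP. This is an illustrative sketch, not part of the original article: colour_message and send_colour are names invented here, and the host is whatever IP your device was assigned.

```python
import socket

def colour_message(r, g, b):
    """Format an RGB triple (0-255 each) in the sketch's simple text protocol."""
    for v in (r, g, b):
        if not 0 <= v <= 255:
            raise ValueError("channel values must be 0-255")
    return "{r:%d,g:%d,b:%d}\n" % (r, g, b)

def send_colour(host, r, g, b, port=23):
    """Open a TCP connection to the Arduino and send one colour message."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(colour_message(r, g, b).encode("ascii"))

# Example (uncomment and substitute your device's IP):
# send_colour("10.0.1.91", 200, 0, 0)  # same as typing {r:200,g:0,b:0} in telnet
```

The newline at the end of the message matters: it is what tells the sketch's buffer-based parser that a complete message has arrived.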
In the Part 2 post, I'll explain how to scrape the weather data we want and use that to update the light periodically.

About the Author

Andrew Fisher is a creator (and destroyer) of things that combine mobile web, ubicomp, and lots of data. He is a programmer, interaction researcher, and CTO at JBA, a data consultancy in Melbourne, Australia. He can be found on Twitter at @ajfisher.
Role of AngularJS

Packt
16 Dec 2014
7 min read
This article by Sandeep Kumar Patel, author of Responsive Web Design with AngularJS, explores the role of AngularJS in responsive web development. Before going into AngularJS, you will learn about responsive web development in general. Responsive web development can be performed in two ways:

- Using the browser sniffing approach
- Using the CSS3 media queries approach

(For more resources related to this topic, see here.)

Using the browser sniffing approach

When we view web pages through our browser, the browser sends a user agent string to the server. This string provides information such as browser and device details. By reading these details, the browser can be redirected to the appropriate view. This method of reading client details is known as browser sniffing. The browser string carries a lot of different information about the source from which the request was generated. The following diagram shows the information shared by the user agent string.

Details of the parameters present in the user agent string are as follows:

- Browser name: This represents the actual name of the browser from which the request originated, for example, Mozilla or Opera.
- Browser version: This represents the browser release version from the vendor; for example, Firefox's latest version is 31.
- Browser platform: This represents the underlying engine on which the browser is running, for example, Trident or WebKit.
- Device OS: This represents the operating system running on the device from which the request originated, for example, Linux or Windows.
- Device processor: This represents the processor type on which the operating system is running, for example, 32 or 64 bit.

A different browser string is generated based on the combination of the device and the type of browser used while accessing a web page.
The following table shows examples of browser strings:

Browser | Device | User agent string
Firefox | Windows desktop | Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0
Chrome | OS X 10 desktop | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36
Opera | Windows desktop | Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14
Safari | OS X 10 desktop | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.13+ (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2
Internet Explorer | Windows desktop | Mozilla/5.0 (compatible; MSIE 10.6; Windows NT 6.1; Trident/5.0; InfoPath.2; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727) 3gpp-gba UNTRUSTED/1.0

AngularJS has features like providers or services which can be most useful for this browser user-agent sniffing and redirection approach. An AngularJS provider can be created and used in the configuration of the routing module. This provider can have reusable properties and reusable methods that can be used to identify the device and route the specific request to the appropriate template view. To discover more about user agent strings on various browser and device combinations, visit http://www.useragentstring.com/pages/Browserlist/.
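The sniff-and-redirect decision itself is simple, whatever framework implements it. The following Python sketch is a toy illustration of the server-side check, not the AngularJS provider described above; the keyword list and template names are assumptions made here for the example:

```python
import re

# A deliberately small set of user-agent keywords that suggest a mobile device.
MOBILE_PATTERN = re.compile(r"Mobile|Android|iPhone|iPad|Opera Mini", re.IGNORECASE)

def choose_view(user_agent):
    """Pick which template a request should be routed to, based on its UA string."""
    if MOBILE_PATTERN.search(user_agent):
        return "mobile.html"
    return "desktop.html"

desktop_ua = "Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0"
print(choose_view(desktop_ua))  # desktop.html
```

Real user-agent sniffing needs a much larger pattern database, which is one reason the media queries approach described next is usually preferred.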
These media types are as follows:

- all: This is used for all media type devices
- aural: This is used for speech and sound synthesizers
- braille: This is used for braille tactile feedback devices
- embossed: This is used for paged braille printers
- handheld: This is used for small or handheld devices, for example, mobile
- print: This is used for printers, for example, an A4 size paper document
- projection: This is used for projection-based devices, such as a projector screen with a slide
- screen: This is used for computer screens, for example, desktop and laptop screens
- tty: This is used for media using a fixed-pitch character grid, such as teletypes and terminals
- tv: This is used for television-type devices, for example, webOS or Android-based television

A media rule can be declared using the @media keyword with the specific type for the targeted media. The following code shows an example of the media rule usage, where the body background color is black and the text is white for the screen media type, and the body background color is white and the text is black for the print media type:

@media screen {
  body {
    background: black;
    color: white;
  }
}

@media print {
  body {
    background: white;
    color: black;
  }
}

An external style sheet can be downloaded and applied to the current page based on the media type with the HTML link tag. The following code uses the link tag in conjunction with the media type:

<link rel='stylesheet' media='screen' href='<fileName.css>' />

To learn more about different media types, visit https://developer.mozilla.org/en-US/docs/Web/CSS/@media#Media_types.
- aspect-ratio: Based on the aspect ratio of the display area, styles can be applied to a page.
- device-aspect-ratio: Based on the device aspect ratio, styles can be applied to a page.
- device-height: Based on the device height, styles can be applied to a page. This includes the entire screen.
- device-width: Based on the device width, styles can be applied to a page. This includes the entire screen.
- grid: Based on the device type, bitmap or grid, styles can be applied to a page.
- height: Based on the device rendering area height, styles can be applied to a page.
- monochrome: Based on the monochrome type, styles can be applied. This represents the number of bits used by the device in the grey scale.
- orientation: Based on the viewport mode, landscape or portrait, styles can be applied to a page.
- resolution: Based on the pixel density, styles can be applied to a page.
- scan: Based on the scanning type used by the device for rendering, styles can be applied to a page.
- width: Based on the device screen width, specific styles can be applied.

The following code shows some examples of CSS3 media queries using different device features for conditional styles:

/* for screen devices with a minimum aspect ratio of 0.5 */
@media screen and (min-aspect-ratio: 1/2) {
  img {
    height: 70px;
    width: 70px;
  }
}

/* for all devices in portrait viewport */
@media all and (orientation: portrait) {
  img {
    height: 100px;
    width: 200px;
  }
}

/* for printer devices with a minimum resolution of 300dpi pixel density */
@media print and (min-resolution: 300dpi) {
  img {
    height: 600px;
    width: 400px;
  }
}

To learn more about different media features, visit https://developer.mozilla.org/en-US/docs/Web/CSS/@media#Media_features.

Summary

In this chapter, you learned about responsive design and the SPA architecture. You now understand the role of the AngularJS library when developing a responsive application. We quickly went through all the important features of AngularJS with the coded syntax.
In the next chapter, you will set up your AngularJS application and learn to create dynamic routing based on devices.

Resources for Article:

Further resources on this subject:
- Best Practices for Modern Web Applications [article]
- Important Aspect of AngularJS UI Development [article]
- A look into responsive design frameworks [article]
Overview of Chips

Packt
16 Dec 2014
7 min read
In this article by Olliver M. Schinagl, author of Getting Started with Cubieboard, we will look at various development boards and compare a few popular ones to help you choose a board tailored to your requirements. In the last few years, ARM-based Systems on Chips (SoCs) have become immensely popular. Compared to regular x86 Intel-based or AMD-based CPUs, they are much more energy efficient and still perform adequately. They also incorporate a lot of peripherals, such as a Graphics Processor Unit (GPU), a Video Accelerator (VPU), an audio controller, various storage controllers, and various buses (I2C and SPI), to name a few. This immensely reduces the number of components required on a board. With this reduction in required components, there are a few obvious advantages, such as lower cost and, consequently, much easier board design. Thus, many companies with electronic engineers are able to design and manufacture these boards cheaply. (For more resources related to this topic, see here.) So, there are many boards; does that mean there are also many SoCs? Quite a few actually, but to keep the following list short, only the most popular ones are listed:

- Allwinner's A-series
- Broadcom's BCM-series
- Freescale's i.MX-series
- MediaTek's MT-series
- Rockchip's RK-series
- Samsung's Exynos-series
- NVIDIA's Tegra-series
- Texas Instruments' AM-series and OMAP-series
- Qualcomm's APQ-series and MSM-series

While many of the potential chips are interesting, Allwinner's A-series of SoCs will be the focus of this book. Due to their low price and decent availability, quite a few companies design development boards around these chips and sell them at a low cost. Additionally, the A-series is, presently, the most open source friendly series of chips available. There is a fully open source bootloader, and nearly all the hardware is supported by open source drivers. Among the A-series of chips, there are a few choices.
The following is the list of the most common and most interesting devices:

- A10: This is the first chip of the A-series and the best supported one, as it has been around the longest. It is able to communicate with the outside world over I2C, SPI, MMC, NAND, digital and analog video out, analog audio out, SPDIF, I2S, Ethernet MAC, USB, SATA, and HDMI. This chip initially targeted everything, such as phones, tablets, set-top boxes, and mini PC sticks. For its GPU, it features the Mali-400.
- A10S: This chip followed the A10; it focused mainly on the PC stick market and left out several parts, such as SATA and analog video in/out, and it has no LCD interface. These parts were left out to reduce the cost of the chip, making it interesting for cheap TV sticks.
- A13: This chip was introduced more or less simultaneously with the A10S for primary use in tablets. It lacked SATA, Ethernet MAC, and also HDMI, which reduced the chip's cost even more.
- A20: This chip was introduced well after the others and is pin-compatible with the A10, which it is intended to replace. As the name hints, the A20 is a dual-core variant of the A10. The ARM cores are slightly different; the A20 uses Cortex-A7 cores instead of the A10's Cortex-A8.
- A23: This chip was introduced after the A31 and A31S and is reasonably similar to the A31 in its design. It features a dual-core Cortex-A7 design and is intended to replace the A13. It is mainly intended to be used in tablets.
- A31: This chip features four Cortex-A7 cores and generally has all the connections that the A10 has. It is, however, not popular within the community because it features a PowerVR GPU that, until now, has seen no community support at all. Additionally, there are no development boards commonly available for this chip.
- A31S: This chip was released slightly after the A31 to solve some issues with the A31. There are no common development boards available.
Choosing the right development board

Allwinner's A-series of SoCs was produced and sold so cheaply that many companies used these chips in their products, such as tablets, set-top boxes, and eventually, development boards. Before the availability of development boards, people worked on and with tablets and set-top boxes. The most common and popular boards are from Cubietech and Olimex, in part because both companies handed out development boards to community developers for free.

Olimex

Olimex has released a fair number of different development boards and peripherals. A lot of its boards are open source hardware with schematics and layout files available, and Olimex is also very open source friendly. You can see the Olimex board in the following image: Olimex offers the A10-OLinuXino-LIME, an A10-based micro board that is marketed to compete with the famous Raspberry Pi price-wise. Due to its small size, it uses less standard, 1.27 mm pitch headers for the pins, but it has nearly all of these pins exposed for use. You can see the A10-OLinuXino-LIME board in the following image: The Olimex OLinuXino series of boards is available in the A10, A13, and A20 flavors and has more standard, 2.54 mm pitch headers that are compatible with the old IDE and serial connectors. Olimex has various sensors, displays, and other peripherals that are also compatible with these headers.

Cubietech

Cubietech was formed by previous Allwinner employees and offered one of the first development boards available using the Allwinner SoC. While it is not open source hardware, it does offer the schematics for download. Cubietech released three boards: the Cubieboard1, the Cubieboard2, and the Cubieboard3, also known as the Cubietruck. Interfacing with these boards can be quite tricky, as they use 2 mm pitch headers that might be hard to find in Europe or America.
You can see the Cubietech board in the following image: Cubieboard1 and Cubieboard2 use identical boards; the only difference is that an A20 is used instead of an A10 in Cubieboard2. These boards only have a subset of the pins exposed. You can see the Cubietruck board in the following image: Cubietruck is quite different but a well-designed A20 board. It features everything that the previous boards offer, along with Gigabit Ethernet, VGA, Bluetooth, Wi-Fi, and an optical audio out. This comes at the cost of fewer exposed pins, to keep the size reasonably small. Compared to Raspberry Pi or LIME, it is almost double the size.

Lemaker

Lemaker made a smart design choice when releasing its Banana Pi board. It is an Allwinner A20-based board but uses the same board size and connector placement as Raspberry Pi, hence the name Banana Pi. Because of this, many Raspberry Pi cases fit the Banana Pi, and even shields will fit. Software-wise, it is quite different and does not work with Raspberry Pi image files. Nevertheless, it features composite video out, stereo audio out, HDMI out, Gigabit Ethernet, two USB ports, one USB OTG port, CSI out and LVDS out, and a handful of pins. Also available are a LiPo battery connector, a SATA connector, and two buttons, but those might not be accessible in a lot of standard cases. See the following image for the topside of the Banana Pi:

Itead and Olimex

Itead and Olimex both offer an interesting board, which is worth mentioning separately. The Iteaduino Plus and the Olimex A20-SoM are quite interesting concepts: a computing module (a plugin board carrying the SoC, memory, and flash) paired with a separate baseboard. Both of them sell a very complete baseboard as open source hardware, but anybody can design their own baseboard and buy the computing module.
You can see the following board by Itead: Refer to the following board by Olimex:

Additional hardware

While a development board is a key ingredient, there are several other items that are also required. A power supply, for example, is not always supplied and does have some considerations. Also, additional hardware is required for the initial communication and to debug.

Summary

In this article, you looked at the additional hardware and a few extra peripherals that will help you understand the stuff you require for your projects.

Resources for Article:

Further resources on this subject:
- Home Security by BeagleBone [article]
- Mobile Devices [article]
- Making the Unit Very Mobile – Controlling the Movement of a Robot with Legs [article]
How to Make a Game Using Only Free Tools

Ellison Leao
16 Dec 2014
3 min read
Being an independent game developer is never easy. Many developers do not have enough money or sponsorship to buy great game development tools to continue creating great games. If you are starting a career in game development, or if you are already a game developer who doesn't want to spend money on licenses, in this post we will examine how to create great games using only free tools. There are some game engines with free licenses available, which are limited but very powerful. Here are some of them:

GameMaker

GameMaker, by YoYo Games, is a very powerful tool that doesn't require a lot of programming skills, since it uses its own scripting language called GML, which has a very small learning curve. The free version only exports to Windows. To learn more about it, go to their website: https://www.yoyogames.com/studio.

Unity

Unity3D is the most famous game development tool these days. With Unity you can create games from 2D platformers to 3D FPS with a few drag-and-drop elements and some scripting. And speaking about scripting, you can create Unity scripts using C#, UnityScript (a Unity version of JavaScript), or Boo. You can use most of the engine features, but if you want to make money with your games, you can only use the free version if your game's annual income does not exceed $100,000. If so, you will need to buy a Pro license.

Construct 2

Similar to the GameMaker tool, Construct 2 by Scirra provides some abstraction layers for non-programmers and makes it easy and quick to deliver a game. The free version can export to the Windows Store, the Chrome Web Store, and Facebook. But if you are a good programmer who likes a bit of a challenge, there are several tools for you. Here are some frameworks to get you started:

Phaser

Phaser is an open source HTML5 framework, which provides some nice features, such as WebGL and canvas renderers; physics, particle, animation, and camera systems; and more. It also supports TypeScript for development.
Monogame

Monogame is the open source version of the XNA 4 framework. The main goal is to let Monogame users build their games for many platforms, such as iOS, Android, Mac OS X, Linux, and Windows 8.

FlashPunk

FlashPunk is a framework written in ActionScript 3 that helps you build 2D Flash games. It has everything a game needs, from collision support and audio and graphics systems to a live debugger that helps you fix bugs. These tools are just a sample of the many other tools that exist out there. We've created a GitHub repo, gathering information from many places and listing a lot of game development resources that you can use to enhance your game development experience. It covers many types of tools, like 3D tools to render terrains, audio tools to make sound effects and soundtracks, free asset websites, and more. You can find the list by clicking here.

About the Author

Ellison Leão (@ellisonleao) is a passionate software engineer with more than 6 years of experience in web projects. He is a contributor to the MelonJS framework and other open source projects. When he is not writing games, he loves to play drums.
How to Auto-Scale Your Cloud with SaltStack

Nicole Thomas
15 Dec 2014
10 min read
What is SaltStack?

SaltStack is an extremely fast, scalable, and powerful remote execution engine and configuration management tool created to control distributed infrastructure, code, and data efficiently. At the heart of SaltStack, or "Salt", is its remote execution engine, which is a bi-directional, secure communication system administered through the use of a Salt Master daemon. This daemon is used to control Salt Minion daemons, which receive commands from the remote Salt Master. A major component of Salt's approach to configuration management is Salt Cloud, which was made to manage Salt Minions in cloud environments. The main purpose of Salt Cloud is to spin up instances on cloud providers, install a Salt Minion on the new instance using Salt's Bootstrap Script, and configure the new minion so it can immediately get to work. Salt Cloud makes it easy to get an infrastructure up and running quickly and supports an array of cloud providers such as OpenStack, Digital Ocean, Joyent, Linode, Rackspace, Amazon EC2, and Google Compute Engine, to name a few. Here is a full list of cloud providers supported by SaltStack and the automation features supported for each.

What is cloud auto scaling?

One of the most formidable benefits of cloud application hosting and data storage is the cloud infrastructure's capacity to scale as demand fluctuates. Many cloud providers offer auto scaling features that automatically increase or decrease the number of instances that are up and running in a user's cloud at any given time. These components generate new instances as needed to ensure optimal performance as activity escalates, while during idle periods, instances are destroyed to reduce costs. To harness the power of cloud auto-scaling technologies, SaltStack provides two reactor formulas that integrate Salt's configuration management and remote execution capabilities for either Amazon EC2 Auto Scaling or Rackspace Auto Scale.
The Salt Cloud Reactor

Salt Formulas can be very helpful in the rapid build out of management frameworks for cloud infrastructures. Formulas are pre-written Salt States that can be used to configure services, install packages, or handle any other common configuration management tasks. The Salt Cloud Reactor is a formula that allows Salt to interact with supported Salt Cloud providers who provide cloud auto scaling features. (Note: at the time this article was written, the only supported Salt Cloud providers with cloud auto scaling capabilities were Rackspace Auto Scale and Amazon EC2 Auto Scaling. The Salt Cloud Reactor can also be used directly with EC2 Auto Scaling, but it is recommended that the EC2 Autoscale Reactor be used instead, as discussed in the following section.) The Salt Cloud Reactor allows SaltStack to know when instances are spawned or destroyed by the cloud provider. When a new instance comes online, a Salt Minion is automatically installed and the minion's key is accepted by the Salt Master. If the configuration for the minion contains the appropriate startup state, it will configure itself and start working on its tasks. Accordingly, when an instance is deleted by the cloud provider, the minion's key is removed from the Salt Master. In order to use the Salt Cloud Reactor, the Salt Master must be configured appropriately. In addition to applying all necessary settings on the Salt Master, a Salt Cloud query must be executed on a regular basis. The query polls data from the cloud provider to collect changes in the auto scaling sequence, as cloud providers using the Salt Cloud Reactor do not directly trigger notifications to Salt upon instance creation and deletion. The cloud query must be issued via a scheduling system such as cron or the Salt Scheduler.
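Whichever scheduler you choose, the job boils down to running the same salt-cloud query on a fixed interval. The following Python sketch is only an illustrative stand-in for such a scheduler (it is not part of Salt; in production you would use cron or the Salt Scheduler, and the 300-second interval mirrors the recommendation later in this article):

```python
import subprocess
import time

QUERY_INTERVAL_SECONDS = 300  # poll every five minutes, like the cron approach

def build_query_command():
    # The reactor requires a periodic `salt-cloud --full-query` to detect
    # instances the provider has created or destroyed.
    return ["salt-cloud", "--full-query"]

def poll_forever():
    """Run the cloud query on a fixed interval (a stand-in for cron)."""
    while True:
        subprocess.run(build_query_command(), check=False)
        time.sleep(QUERY_INTERVAL_SECONDS)
```

The equivalent cron entry would simply invoke the same command every five minutes; the point is only that some external clock must drive the query, because the provider never pushes events to Salt in this model.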
Once the Salt Master has been configured and query scheduling has been implemented, the reactor will manage itself and allow the Salt Master to interact with any Salt Minions created or destroyed by the auto scaling system.

The EC2 Autoscale Reactor

Salt's EC2 Autoscale Reactor enables Salt to collaborate with Amazon EC2 Auto Scaling. Similarly to the Salt Cloud Reactor, the EC2 Autoscale Reactor will bootstrap a Salt Minion on any newly created instances and the Salt Master will automatically accept the new minion's key. Additionally, when an EC2 instance is destroyed, the Salt Minion's key will be automatically removed from the Salt Master. However, the EC2 Autoscale Reactor formula differs from the Salt Cloud Reactor formula in one major way: Amazon EC2 provides notifications directly to the reactor when the EC2 cloud is scaled up or down, making it easy for Salt to immediately bootstrap new instances with a Salt Minion, or to delete old Salt Minion keys from the master. This behavior, therefore, does not require any kind of scheduled query to poll EC2 for changes in scale like the Salt Cloud Reactor demands. Changes to the EC2 cloud can be acted upon by the Salt Master immediately, whereas changes in clouds using the Salt Cloud Reactor may experience a delay between an instance being created and the Salt Master bootstrapping it with a new minion.

Configuring the EC2 Autoscale Reactor

Both of the cloud auto scaling reactors were only recently added to the SaltStack arsenal, and as such, the Salt develop branch is required to set up any auto scaling capabilities. To get started, clone the Salt repository from GitHub onto the machine serving as the Salt Master:

git clone https://github.com/saltstack/salt

Depending on the operating system you are using, there are a few dependencies that also need to be installed to run SaltStack from the develop branch. Check out the Installing Salt for Development documentation for OS-specific instructions.
Once Salt has been installed for development, the Salt Master needs to be configured. First, create the default salt directory in /etc:

mkdir /etc/salt

The default Salt Master configuration file resides in salt/conf/master. Copy this file into the new salt directory:

cp path/to/salt/conf/master /etc/salt/master

The Salt Master configuration file is completely commented out, as the default configuration for the master will work on most systems. However, some additional settings must be configured to enable the EC2 Autoscale Reactor to work with the Salt Master. Under the external_auth section of the master configuration file, replace the commented out lines with the following:

external_auth:
  pam:
    myuser:
      - .*
      - '@runner'
      - '@wheel'

rest_cherrypy:
  port: 8080
  host: 0.0.0.0
  webhook_url: /hook
  webhook_disable_auth: True

reactor:
  - 'salt/netapi/hook/ec2/autoscale':
    - '/srv/reactor/ec2-autoscale.sls'

ec2.autoscale:
  provider: my-ec2-config
  ssh_username: ec2-user

These settings allow the Salt API web hook system to interact with EC2. When a web request is received from EC2, the Salt API will execute an event for the reactor system to respond to. The final ec2.autoscale setting points the reactor to the corresponding Salt Cloud provider configuration file. If authenticity problems with the reactor's web hook occur, an email notification from Amazon will be sent to the user. To configure the Salt Master to connect to a mail server, see the example SMTP settings in the EC2 Autoscale Reactor documentation. Next, the Salt Cloud provider configuration file must be created.
First, create the cloud provider configuration directory:

mkdir /etc/salt/cloud.providers.d

In /etc/salt/cloud.providers.d, create a file named ec2.conf, and set the following configurations according to your Amazon EC2 account:

my-ec2-config:
  id: <my aws id>
  key: <my aws key>
  keyname: <my aws key name>
  securitygroup: <my aws security group>
  private_key: </path/to/my/private_key.pem>
  location: us-east-1
  provider: ec2
  minion:
    master: saltmaster.example.com

The last line, master: saltmaster.example.com, represents the location of the Salt Master so the Salt Minions know where to connect once they are up and running. To set up the actual reactor, create a new reactor directory, download the ec2-autoscale-reactor formula, and copy the reactor formula into the new directory, like so:

mkdir /srv/reactor
cp path/to/downloaded/package/ec2-autoscale.sls /srv/reactor/ec2-autoscale.sls

The last major configuration step is to configure all of the appropriate settings on your EC2 account. First, log in to your AWS account and set up SNS HTTP(S) notifications by selecting SNS (Push Notification Service) from the AWS Console. Click Create New Topic, enter a topic name and a display name, and click the Create Topic button. Then, inside the Topic Details area, click Create Subscription. Choose HTTP or HTTPS as needed and enter the web hook for the Salt API. Assuming your Salt Master is set up at https://saltmaster.example.com, the final web hook endpoint will be: https://saltmaster.example.com/hook/ec2/autoscale. Finally, click Subscribe. Next, set up the launch configurations by choosing EC2 (Virtual Servers in the Cloud) from the AWS Console. Then, select Launch Configurations on the left-hand side. Click Create Launch Configuration and follow the prompts to define the appropriate settings for your cloud. Finally, on the review screen, click Create Launch Configuration to save your settings.
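Before pointing SNS at the master, it can be worth confirming that the Salt API webhook endpoint is reachable at all. The Python sketch below is an illustration with invented helper names, using the example host name from the SNS setup above; a real SNS notification carries its own Amazon-defined JSON body, so treat this strictly as a connectivity check, not a simulated scaling event:

```python
import json
import urllib.request

def build_webhook_url(host, scheme="https"):
    """Build the Salt API webhook endpoint the reactor listens on."""
    return "%s://%s/hook/ec2/autoscale" % (scheme, host)

def post_test_event(host, payload):
    # Post a JSON body to the webhook and return the HTTP status code.
    req = urllib.request.Request(
        build_webhook_url(host),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status

# Example (uncomment once your master and the Salt API are running):
# post_test_event("saltmaster.example.com", {"test": True})
```

If the request fails to connect, check that the rest_cherrypy settings from the master configuration are in place and that port 8080 is open before debugging the SNS side.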
Once the launch configuration is set up, click Auto Scaling Groups from the left-hand navigation menu to create auto scaling variables such as the minimum and maximum number of instances your cloud should contain. Click Create Auto Scaling Group, choose Create an Auto Scaling group from an existing launch configuration, select the appropriate configuration, and then click Next Step. From there, follow the prompts until you reach the Configure Notifications screen. Click Add Notification and choose the notification setting that was configured during the SNS configuration step. Finally, complete the rest of the prompts. Congratulations! At this point, you should have successfully configured SaltStack to work with EC2 Auto Scaling!

Salt Scheduler

As mentioned in the Salt Cloud Reactor section, some type of scheduling system must be implemented when using the Salt Cloud Reactor formula. SaltStack provides its own scheduler, which can be used by adding the following state to the Salt Master's configuration file:

schedule:
  job1:
    function: cloud.full_query
    seconds: 300

Here, the seconds setting ensures that the Salt Master will perform a salt-cloud --full-query command every 5 minutes. A minimum value of 300 seconds or greater is recommended; however, the value can be changed as necessary.

Salting instances from the web interface

Another exciting quality of Salt's auto-scale reactor formulas is that once a reactor is configured, the respective cloud provider web interface can be used to spin up new instances that are automatically "Salted". Since the reactor integrates with the web interface to automatically install a Salt Minion on any new instances, it will perform the same operations when instances are created manually via the web interface. The same functionality is true for manually deleting instances: if an instance is manually destroyed via the web interface, the corresponding minion's key will be removed from the Salt Master.
More resources

For troubleshooting, more configuration options, or SaltStack specifics, SaltStack provides many helpful resources, such as the SaltStack, Salt Cloud, Salt Cloud Reactor, and EC2 Autoscale Reactor documentation. SaltStack also has a thriving, active, and friendly open source community.

About the author

Nicole Thomas is a QA Engineer at SaltStack, Inc. Before coming to SaltStack, she wore many hats, from web and Android developer to contributing editor to working in Environmental Education. Nicole recently graduated summa cum laude from Westminster College with a degree in Computer Science. She also has a degree in Environmental Studies from the University of Utah.
How to Build a Koa Web Application - Part 1

Christoffer Hallas
15 Dec 2014
8 min read
You may be a seasoned or novice web developer, but no matter your level of experience, you must always be able to set up a basic MVC application. This two-part series will briefly show you how to use Koa, a bleeding edge Node.js web application framework, to create a web application using MongoDB as its database. Koa has a low footprint and tries to be as unbiased as possible. For this series, we will also use Jade and Mongel, two Node.js libraries that provide HTML template rendering and MongoDB model interfacing, respectively. Note that this series requires you to use Node.js version 0.11+. At the end of the series, we will have a small and basic app where you can create pages with a title and content, list your pages, and view them. Let's get going!

Using NPM and Node.js

If you do not already have Node.js installed, you can download installation packages at the official Node.js website, http://nodejs.org. I strongly suggest that you install Node.js in order to code along with the article. Once installed, Node.js will add two new programs to your computer that you can access from your terminal: node and npm. The first program is the main Node.js program and is used to run Node.js applications, and the second is the Node Package Manager, which is used to install Node.js packages. For this application, we start out in an empty folder by using npm to install four libraries:

    $ npm install koa jade mongel co-body

Once this is done, open your favorite text editor and create an index.js file in the folder, in which we will now start creating our application. We start by using the require function to load the four libraries we just installed:

    var koa = require('koa');
    var jade = require('jade');
    var mongel = require('mongel');
    var parse = require('co-body');

This simply loads the functionality of the libraries into the respective variables.
This lets us create our Page model and our Koa app variables:

    var Page = mongel('pages', 'mongodb://localhost/app');
    var app = koa();

As you can see, we now use the variables mongel and koa that we previously loaded into our program using require. To create a model with mongel, all we have to do is give the name of our MongoDB collection and a MongoDB connection URI that represents the network location of the database; in this case, we're using a local installation of MongoDB and a database called app. It's simple to create a basic Koa application, and as seen in the code above, all we do is create a new variable called app that is the result of calling the Koa library function.

Middleware, generators, and JavaScript

Koa uses a new feature in JavaScript called generators. Generators are not yet widely available in browsers, except in some versions of Google Chrome, but since Node.js is built on the same JavaScript engine as Google Chrome, it can use them. A generator function is much like a regular JavaScript function, but it has the special ability to yield several values along the way, in addition to the normal ability of returning a single value. Some expert JavaScript programmers used this to create a new and improved way of writing asynchronous code in JavaScript, which is required when building a networked application such as a web application. Generators are a complex subject and we won't cover them in detail; we'll just show you how to use them in our small and basic app.

In Koa, generators are used as something called middleware, a concept that may be familiar to you from other languages such as Ruby and Python. Think of middleware as a stack of functions through which an HTTP request must travel in order to create an appropriate response. Middleware should be created so that the functionality of a given middleware is encapsulated together. In our case, this means we'll be creating two pieces of middleware: one to create pages, and one to list pages or show a page.
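Before wiring generators into middleware, it helps to see one in isolation. This short sketch is not Koa-specific; it simply shows a generator yielding several values before returning a final one (run it with node --harmony on Node.js 0.11, or plain node on later versions):

```javascript
// A generator can pause at each yield and hand a value back to its caller.
function* counter() {
  yield 1;   // first call to next() stops here
  yield 2;   // second call stops here
  return 3;  // third call finishes the generator
}

var it = counter();      // calling the generator returns an iterator
var first = it.next();   // { value: 1, done: false }
var second = it.next();  // { value: 2, done: false }
var last = it.next();    // { value: 3, done: true }
```

Koa's runtime drives middleware generators in exactly this step-by-step fashion, which is what lets it pause your code at every yield while asynchronous work completes.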
Let's create our first middleware:

    app.use(function* (next) {
      ...
    });

As you can see, we start by calling the app.use function, which takes a generator as its argument; this effectively pushes the generator onto the stack. To create a generator, we use a special function syntax where an asterisk is added, as seen in the previous code snippet. We let our generator take a single argument called next, which represents the next middleware in the stack, if any. From here on, it is simply a matter of checking and responding to the parameters of the HTTP request, which are accessible to us in the Koa context. This is also the function context, which in JavaScript is the keyword this, similar to the keyword self in other languages:

    if (this.path != '/create') {
      yield next;
      return;
    }

Since we're creating some middleware that helps us create pages, we make sure that this request is for the right path, in our case /create; if not, we use the yield keyword and the next argument to pass control of the program to the next middleware. Please note the return keyword that we also use; this is very important here, as the middleware would otherwise continue running while also passing control to the next middleware. This is not something you want to happen unless your middleware will not modify the Koa context or HTTP response, because subsequent middleware will always expect that they're now in control.

Now that we have checked that the path is correct, we still have to check the method to see if we're just showing the form to create a page, or if we should actually create a page in the database:

    if (this.method == 'POST') {
      var body = yield parse.form(this);
      var page = yield Page.createOne({
        title: body.title,
        contents: body.contents
      });
      this.redirect('/' + page._id);
      return;
    } else if (this.method != 'GET') {
      this.status = 405;
      this.body = 'Method Not Allowed';
      return;
    }

To check the method, we use the Koa context again and the method attribute.
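To build intuition for what yield next accomplishes, here is a hand-driven sketch of a two-middleware stack. This is an illustration of the control flow only, not Koa's actual implementation; Koa's runtime (the co library) performs this orchestration for you when you write yield next:

```javascript
var order = [];  // records the order in which the middleware run

function* inner() {
  order.push('inner');          // downstream middleware does its work
}

function* outer(next) {
  order.push('outer: before');  // runs on the way "down" the stack
  yield* next;                  // hand control to the next middleware
  order.push('outer: after');   // runs on the way back "up"
}

// Drive the stack to completion by hand; Koa normally does this part.
var it = outer(inner());
while (!it.next().done) {}
// order is now ['outer: before', 'inner', 'outer: after']
```

This downstream-then-upstream flow is why forgetting return after yield next matters: any code after the yield runs once control comes back up the stack.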
If we're handling a POST request, we now know how to create a page, but this also means that we must extract extra information from the request. Koa does not process the body of a request, only the headers, so we use the co-body library that we downloaded earlier and loaded in as the parse variable. Notice how we yield on the parse.form function; this is because it is an asynchronous function, and we have to wait until it is done before we continue the program. Then we proceed to use our mongel model Page to create a page using the data we found in the body of the request; again, this is an asynchronous function, and we use yield to wait before we finally redirect the request using the page's database id.

If it turns out the method was not POST, we still want to use this middleware to show the form that is actually used to issue the request. That means we have to make sure that the method is GET, so we added an else if statement to the original check, and if the request is neither POST nor GET, we respond with HTTP status 405 and the message Method Not Allowed, which is the appropriate response for this case. Notice how we don't yield next; this is because the middleware was able to determine a satisfying response for the request, and it requires no further processing.

Finally, if the method was actually GET, we use the Jade library that we also installed using npm to render a create.jade template as HTML:

    var html = jade.renderFile('create.jade');
    this.body = html;

Notice how we set the Koa context's body attribute to the rendered HTML from Jade; all this does is tell Koa that we want to send it back to the browser that sent the request.

Wrapping up

You are well on your way to creating your Koa app. In Part 2, we will implement Jade templates and list and view pages. Ready for the next step? Read Part 2 here. Explore all of our top Node.js content in one place - visit our Node.js page today!
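The contents of create.jade are not shown here (Part 2 covers Jade templates properly), but a hypothetical minimal template matching the title and contents fields that the middleware reads from the request body might look like this:

```jade
//- Hypothetical create.jade sketch: a form posting the title and
//- contents fields that the middleware parses with co-body.
doctype html
html
  body
    h1 Create a page
    form(method='POST', action='/create')
      input(name='title', placeholder='Title')
      textarea(name='contents', placeholder='Contents')
      button(type='submit') Create
```

Submitting this form issues the POST request to /create that the middleware's POST branch handles.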
About the author

Christoffer Hallas is a software developer and entrepreneur from Copenhagen, Denmark. He is a computer polyglot and contributes to and maintains a number of open source projects. When not contemplating his next grand idea (which remains an idea), he enjoys music, sports, and design of all kinds. Christoffer can be found on GitHub as hallas and on Twitter as @hamderhallas.
How to Run Hadoop on Google Cloud – Part 1

Robi Sen
15 Dec 2014
4 min read
Setting up and working with Hadoop can sometimes be difficult. Furthermore, most people with limited resources develop on Hadoop instances on virtual machines locally or on minimal hardware. The problem with this is that Hadoop is really designed to run on many machines in order to realize its full capabilities. In this two-part series of posts, we will show you how you can get started with Hadoop in the cloud with Google services quickly and relatively easily.

Getting Started

The first thing you need in order to follow along is a Google account. If you don't have a Google account, you can sign up here: https://accounts.google.com/SignUp. Next, you need to create a Google Compute and Google Cloud Storage enabled project via the Google Developers Console. Let's walk through that right now.

First, go to the Developers Console and log in using your Google account. You will need your credit card as part of this process; however, to complete this two-part post series, you will not need to spend any money. Once you have logged in, you should see something like what is shown in Figure 1.

Figure 1: Example view of the Google Developers Console

Now select Create Project. This will pop up the create new project window, as shown in Figure 2. In the project name field, go ahead and name your project HadoopTutorial. For the Project ID, Google will assign you a random project ID, or you can try to select your own. Whatever your project ID is, just make note of it, since we will be using it later. If, however, you forget your project ID, you can just come back to the Google console to look it up. You do not need to select the first checkbox shown in Figure 2, but go ahead and check the second checkbox, which is the terms of service. Now select Create.

Figure 2: New Project window

When you select Create, be prepared for a small delay as Google builds your project. When it is done, you should see a screen like the one shown in Figure 3.
Figure 3: Project Dashboard

Now click on Enable an API. You should now see the APIs screen. Make sure you check whether the Google Cloud Storage and Google Cloud Storage JSON API options are enabled, that is, showing a green ON button. Now scroll down, find the Google Compute Engine, and select the OFF button to enable it, like the one shown in Figure 4. If you don't have a payment account set up on Google, you will be asked to do that now and put in a valid credit card. Once that is done, you can go back and enable the Google Compute Engine.

Figure 4: Setting up your Google APIs

You should now have your Google developer account up and running. In the next post, I will walk you through the installation of the Google Cloud SDK and setting up Hadoop via Windows and Cygwin. Read part 2 here. Want more Hadoop content? Check out our dynamic Hadoop page, updated with our latest titles and most popular content.

About the author

Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus year career in technology, engineering, and research has led him to work on cutting edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as Under Armour, Sony, CISCO, IBM, and many others to help build out new products and services. Robi specializes in bringing his unique vision and thought process to difficult and complex problems, allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.