How-To Tutorials

7009 Articles

Data Analysis Using R

Packt
04 Jun 2015
17 min read
In this article by Viswa Viswanathan and Shanthi Viswanathan, the authors of the book R Data Analysis Cookbook, we discover how R can be used in various ways such as comparison, classification, applying different functions, and so on. We will cover the following recipes: Creating charts that facilitate comparisons Building, plotting, and evaluating – classification trees Using time series objects Applying functions to subsets of a vector (For more resources related to this topic, see here.) Creating charts that facilitate comparisons In large datasets, we often gain good insights by examining how different segments behave. The similarities and differences can reveal interesting patterns. This recipe shows how to create graphs that enable such comparisons. Getting ready If you have not already done so, download the code files and save the daily-bike-rentals.csv file in your R working directory. Read the data into R using the following command: > bike <- read.csv("daily-bike-rentals.csv") > bike$season <- factor(bike$season, levels = c(1,2,3,4),   labels = c("Spring", "Summer", "Fall", "Winter")) > attach(bike) How to do it... We base this recipe on the task of generating histograms to facilitate the comparison of bike rentals by season. Using base plotting system We first look at how to generate histograms of the count of daily bike rentals by season using R's base plotting system: Set up a 2 X 2 grid for plotting histograms for the four seasons: > par(mfrow = c(2,2)) Extract data for the seasons: > spring <- subset(bike, season == "Spring")$cnt > summer <- subset(bike, season == "Summer")$cnt > fall <- subset(bike, season == "Fall")$cnt > winter <- subset(bike, season == "Winter")$cnt Plot the histogram and density for each season: > hist(spring, prob=TRUE,   xlab = "Spring daily rentals", main = "") > lines(density(spring)) >  > hist(summer, prob=TRUE,   xlab = "Summer daily rentals", main = "") > lines(density(summer)) >  > hist(fall, prob=TRUE,   xlab = "Fall daily rentals", main = "") > lines(density(fall)) >  > hist(winter, prob=TRUE,   xlab = "Winter daily rentals", main = "") > lines(density(winter)) You get the following output that facilitates comparisons across the seasons: Using ggplot2 We can achieve much of the preceding results in a single command: > qplot(cnt, data = bike) + facet_wrap(~ season, nrow=2) +   geom_histogram(fill = "blue") You can also combine all four into a single histogram and show the seasonal differences through coloring: > qplot(cnt, data = bike, fill = season) How it works... When you plot a single variable with qplot, you get a histogram by default. Adding facet enables you to generate one histogram per level of the chosen facet. By default, the four histograms will be arranged in a single row. Use facet_wrap to change this. There's more... You can use ggplot2 to generate comparative boxplots as well. Creating boxplots with ggplot2 Instead of the default histogram, you can get a boxplot with either of the following two approaches: > qplot(season, cnt, data = bike, geom = c("boxplot"), fill = season) >  > ggplot(bike, aes(x = season, y = cnt)) + geom_boxplot() The preceding code produces the following output: The second line of the preceding code produces the following plot: Building, plotting, and evaluating – classification trees You can use a couple of R packages to build classification trees. Under the hood, they all do the same thing. Getting ready If you do not already have the rpart, rpart.plot, and caret packages, install them now. 
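If you are unsure which of these packages are already present, the short check below installs only the missing ones. This helper snippet is our own addition rather than part of the original recipe:
> needed <- c("rpart", "rpart.plot", "caret")
> # Install only the packages that are not yet available on this machine
> missing <- setdiff(needed, rownames(installed.packages()))
> if (length(missing) > 0) install.packages(missing)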
Download the data files and place the banknote-authentication.csv file in your R working directory. How to do it... This recipe shows you how you can use the rpart package to build classification trees and the rpart.plot package to generate nice-looking tree diagrams: Load the rpart, rpart.plot, and caret packages: > library(rpart) > library(rpart.plot) > library(caret) Read the data: > bn <- read.csv("banknote-authentication.csv") Create data partitions. We need two partitions—training and validation. Rather than copying the data into the partitions, we will just keep the indices of the cases that represent the training cases and subset as and when needed: > set.seed(1000) > train.idx <- createDataPartition(bn$class, p = 0.7, list = FALSE) Build the tree: > mod <- rpart(class ~ ., data = bn[train.idx, ], method = "class", control = rpart.control(minsplit = 20, cp = 0.01)) View the text output (your result could differ if you did not set the random seed as in step 3): > mod n= 961   node), split, n, loss, yval, (yprob)      * denotes terminal node   1) root 961 423 0 (0.55983351 0.44016649)    2) variance>=0.321235 511 52 0 (0.89823875 0.10176125)      4) curtosis>=-4.3856 482 29 0 (0.93983402 0.06016598)        8) variance>=0.92009 413 10 0 (0.97578692 0.02421308) *        9) variance< 0.92009 69 19 0 (0.72463768 0.27536232)        18) entropy< -0.167685 52   6 0 (0.88461538 0.11538462) *        19) entropy>=-0.167685 17   4 1 (0.23529412 0.76470588) *      5) curtosis< -4.3856 29   6 1 (0.20689655 0.79310345)      10) variance>=2.3098 7   1 0 (0.85714286 0.14285714) *      11) variance< 2.3098 22   0 1 (0.00000000 1.00000000) *    3) variance< 0.321235 450 79 1 (0.17555556 0.82444444)      6) skew>=6.83375 76 18 0 (0.76315789 0.23684211)      12) variance>=-3.4449 57   0 0 (1.00000000 0.00000000) *      13) variance< -3.4449 19   1 1 (0.05263158 0.94736842) *      7) skew< 6.83375 374 21 1 (0.05614973 0.94385027)      14) curtosis>=6.21865 106 16 1 (0.15094340 0.84905660)        28) skew>=-3.16705 16   0 0 (1.00000000 0.00000000) *       29) skew< -3.16705 90   0 1 (0.00000000 1.00000000) *      15) curtosis< 6.21865 268   5 1 (0.01865672 0.98134328) * Generate a diagram of the tree (your tree might differ if you did not set the random seed as in step 3): > prp(mod, type = 2, extra = 104, nn = TRUE, fallen.leaves = TRUE, faclen = 4, varlen = 8, shadow.col = "gray") The following output is obtained as a result of the preceding command: Prune the tree: > # First see the cptable > # !!Note!!: Your table can be different because of the > # random aspect in cross-validation > mod$cptable            CP nsplit rel error   xerror       xstd 1 0.69030733     0 1.00000000 1.0000000 0.03637971 2 0.09456265     1 0.30969267 0.3262411 0.02570025 3 0.04018913     2 0.21513002 0.2387707 0.02247542 4 0.01891253     4 0.13475177 0.1607565 0.01879222 5 0.01182033     6 0.09692671 0.1347518 0.01731090 6 0.01063830     7 0.08510638 0.1323877 0.01716786 7 0.01000000     9 0.06382979 0.1276596 0.01687712   > # Choose CP value as the highest value whose > # xerror is not greater than minimum xerror + xstd > # With the above data that happens to be > # the fifth one, 0.01182033 > # Your values could be different because of random > # sampling > mod.pruned = prune(mod, mod$cptable[5, "CP"]) View the pruned tree (your tree will look different): > prp(mod.pruned, type = 2, extra = 104, nn = TRUE, fallen.leaves = TRUE, faclen = 4, varlen = 8, shadow.col = "gray") Use the pruned model to predict for a validation 
partition (note the minus sign before train.idx to consider only the cases in the validation partition). Since we want predictions from the pruned tree, we pass mod.pruned to predict():

> pred.pruned <- predict(mod.pruned, bn[-train.idx,], type = "class")

Generate the error/classification-confusion matrix:

> table(bn[-train.idx,]$class, pred.pruned, dnn = c("Actual", "Predicted"))
      Predicted
Actual   0   1
     0 213  11
     1  11 176

How it works...

Steps 1 to 3 load the packages, read the data, and identify the cases in the training partition, respectively. In step 3, we set the random seed so that your results should match those that we display.

Step 4 builds the classification tree model:

> mod <- rpart(class ~ ., data = bn[train.idx, ], method = "class", control = rpart.control(minsplit = 20, cp = 0.01))

The rpart() function builds the tree model based on the following:
  A formula specifying the dependent and independent variables
  The dataset to use
  A specification through method="class" that we want to build a classification tree (as opposed to a regression tree)
  Control parameters specified through the control = rpart.control() setting; here, we have indicated that the tree should only consider nodes with at least 20 cases for splitting and use a complexity parameter value of 0.01. These two values are the defaults, and we have included them just for illustration.

Step 5 produces a textual display of the results. Step 6 uses the prp() function of the rpart.plot package to produce a nice-looking plot of the tree:

> prp(mod, type = 2, extra = 104, nn = TRUE, fallen.leaves = TRUE, faclen = 4, varlen = 8, shadow.col = "gray")

  Use type=2 to get a plot with every node labeled and with the split label below the node
  Use extra=4 to display the probability of each class in the node (conditioned on the node and hence summing to 1); add 100 (hence extra=104) to display the number of cases in the node as a percentage of the total number of cases
  Use nn = TRUE to display the node numbers; the root node is node number 1 and node n has child nodes numbered 2n and 2n+1
  Use fallen.leaves=TRUE to display all leaf nodes at the bottom of the graph
  Use faclen to abbreviate class names in the nodes to a specific maximum length
  Use varlen to abbreviate variable names
  Use shadow.col to specify the color of the shadow that each node casts

Step 7 prunes the tree to reduce the chance that the model fits the training data too closely, that is, to reduce overfitting. Within this step, we first look at the complexity table generated through cross-validation. We then use the table to choose the cutoff complexity level: the largest CP value whose xerror (cross-validation error) is not greater than one standard deviation above the minimum cross-validation error.

Steps 8 through 10 display the pruned tree, use the pruned tree to predict the class for the validation partition, and then generate the error matrix for the validation partition.

There's more...

We discuss in the following an important variation on predictions using classification trees.

Computing raw probabilities

We can generate probabilities in place of classifications by specifying type="prob":

> pred.pruned <- predict(mod.pruned, bn[-train.idx,], type = "prob")

Create the ROC Chart

Using the preceding raw probabilities and the class labels, we can generate a ROC chart. The prediction() and performance() functions come from the ROCR package, so load it first:

> library(ROCR)
> pred <- prediction(pred.pruned[,2], bn[-train.idx,"class"])
> perf <- performance(pred, "tpr", "fpr")
> plot(perf)

Using time series objects

In this recipe, we look at various features to create and plot time-series objects.
We will consider data with both a single and multiple time series. Getting ready If you have not already downloaded the data files, do it now and ensure that the files are in your R working directory. How to do it... Read the data. The file has 100 rows and a single column named sales: > s <- read.csv("ts-example.csv") Convert the data to a simplistic time series object without any explicit notion of time: > s.ts <- ts(s) > class(s.ts) [1] "ts" Plot the time series: > plot(s.ts) Create a proper time series object with proper time points: > s.ts.a <- ts(s, start = 2002) > s.ts.a Time Series: Start = 2002 End = 2101 Frequency = 1        sales [1,]   51 [2,]   56 [3,]   37 [4,]   101 [5,]   66 (output truncated) > plot(s.ts.a) > # results show that R treated this as an annual > # time series with 2002 as the starting year The result of the preceding commands is seen in the following graph: To create a monthly time series run the following command: > # Create a monthly time series > s.ts.m <- ts(s, start = c(2002,1), frequency = 12) > s.ts.m        Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 2002 51 56 37 101 66 63 45 68 70 107 86 102 2003 90 102 79 95 95 101 128 109 139 119 124 116 2004 106 100 114 133 119 114 125 167 149 165 135 152 2005 155 167 169 192 170 180 175 207 164 204 180 203 2006 215 222 205 202 203 209 200 199 218 221 225 212 2007 250 219 242 241 267 249 253 242 251 279 298 260 2008 269 257 279 273 275 314 288 286 290 288 304 291 2009 314 290 312 319 334 307 315 321 339 348 323 342 2010 340 348 354 291 > plot(s.ts.m) # note x axis on plot The following plot can be seen as a result of the preceding commands: > # Specify frequency = 4 for quarterly data > s.ts.q <- ts(s, start = 2002, frequency = 4) > s.ts.q        Qtr1 Qtr2 Qtr3 Qtr4 2002   51   56   37 101 2003   66   63   45   68 2004   70 107   86 102 2005   90 102   79   95 2006   95 101 128 109 (output truncated) > plot(s.ts.q) Query time series objects (we use the s.ts.m object we created in the previous step): > # When does the series start? > start(s.ts.m) [1] 2002   1 > # When does it end? > end(s.ts.m) [1] 2010   4 > # What is the frequency? > frequency(s.ts.m) [1] 12 Create a time series object with multiple time series. This data file contains US monthly consumer prices for white flour and unleaded gas for the years 1980 through 2014 (downloaded from the website of the US Bureau of Labor Statistics): > prices <- read.csv("prices.csv") > prices.ts <- ts(prices, start=c(1980,1), frequency = 12) Plot a time series object with multiple time series: > plot(prices.ts) The plot in two separate panels appears as follows: > # Plot both series in one panel with suitable legend > plot(prices.ts, plot.type = "single", col = 1:2) > legend("topleft", colnames(prices.ts), col = 1:2, lty = 1) Two series plotted in one panel appear as follow: How it works... Step 1 reads the data. Step 2 uses the ts function to generate a time series object based on the raw data. Step 3 uses the plot function to generate a line plot of the time series. We see that the time axis does not provide much information. Time series objects can represent time in more friendly terms. Step 4 shows how to create time series objects with a better notion of time. It shows how we can treat a data series as an annual, monthly, or quarterly time series. The start and frequency parameters help us to control these data series. 
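The time attributes also make it easy to subset a series by calendar period. As a small illustration of ours (not part of the original recipe), the window function can pull out just the 2005 observations from the monthly object created earlier:
> # Extract the twelve monthly observations for 2005
> window(s.ts.m, start = c(2005, 1), end = c(2005, 12))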
Although the time series we provide is just a list of sequential values, in reality our data can have an implicit notion of time attached to it. For example, the data can be annual numbers, monthly numbers, or quarterly ones (or something else, such as 10-second observations of something). Given just the raw numbers (as in our data file, ts-example.csv), the ts function cannot figure out the time aspect and by default assumes no secondary time interval at all. We can use the frequency parameter to tell ts how to interpret the time aspect of the data. The frequency parameter controls how many secondary time intervals there are in one major time interval. If we do not explicitly specify it, by default frequency takes on a value of 1. Thus, the following code treats the data as an annual sequence, starting in 2002: > s.ts.a <- ts(s, start = 2002) The following code, on the other hand, treats the data as a monthly time series, starting in January 2002. If we specify the start parameter as a number, then R treats it as starting at the first subperiod, if any, of the specified start period. When we specify frequency as different from 1, then the start parameter can be a vector such as c(2002,1) to specify the series, the major period, and the subperiod where the series starts. c(2002,1) represent January 2002: > s.ts.m <- ts(s, start = c(2002,1), frequency = 12) Similarly, the following code treats the data as a quarterly sequence, starting in the first quarter of 2002: > s.ts.q <- ts(s, start = 2002, frequency = 4) The frequency values of 12 and 4 have a special meaning—they represent monthly and quarterly time sequences. We can supply start and end, just one of them, or none. If we do not specify either, then R treats the start as 1 and figures out end based on the number of data points. If we supply one, then R figures out the other based on the number of data points. While start and end do not play a role in computations, frequency plays a big role in determining seasonality, which captures periodic fluctuations. If we have some other specialized time series, we can specify the frequency parameter appropriately. Here are two examples:   With measurements taken every 10 minutes and seasonality pegged to the hour, we should specify frequency as 6   With measurements taken every 10 minutes and seasonality pegged to the day, use frequency = 24*6 (6 measurements per hour times 24 hours per day) Step 5 shows the use of the functions start, end, and frequency to query time series objects. Steps 6 and 7 show that R can handle data files that contain multiple time series. Applying functions to subsets of a vector The tapply function applies a function to each partition of the dataset. Hence, when we need to evaluate a function over subsets of a vector defined by a factor, tapply comes in handy. Getting ready Download the files and store the auto-mpg.csv file in your R working directory. Read the data and create factors for the cylinders variable: > auto <- read.csv("auto-mpg.csv", stringsAsFactors=FALSE) > auto$cylinders <- factor(auto$cylinders, levels = c(3,4,5,6,8),   labels = c("3cyl", "4cyl", "5cyl", "6cyl", "8cyl")) How to do it... To apply functions to subsets of a vector, follow these steps: Calculate mean mpg for each cylinder type: > tapply(auto$mpg,auto$cylinders,mean)      3cyl     4cyl     5cyl     6cyl     8cyl 20.55000 29.28676 27.36667 19.98571 14.96311 We can even specify multiple factors as a list. 
The following example shows only one factor, since our data has only one grouping factor, but it serves as a template that you can adapt:

> tapply(auto$mpg, list(cyl=auto$cylinders), mean)
  cyl
    3cyl     4cyl     5cyl     6cyl     8cyl
20.55000 29.28676 27.36667 19.98571 14.96311

How it works...

In step 1, the mean function is applied to the auto$mpg vector, grouped according to the auto$cylinders vector. The grouping factor should be of the same length as the input vector so that each element of the first vector can be associated with a group. The tapply function creates groups of the first argument based on each element's group affiliation, as defined by the second argument, and passes each group to the user-specified function.

Step 2 shows that we can actually group by several factors specified as a list. In this case, tapply applies the function to each unique combination of the specified factors.

There's more...

The by function is similar to tapply, but it applies the function to groups of rows of a data frame, passing each group in as a data frame. The following example clarifies this.

Applying a function on groups from a data frame

In the following example, we find the correlation between mpg and weight for each cylinder type:

> by(auto, auto$cylinders, function(x) cor(x$mpg, x$weight))
auto$cylinders: 3cyl
[1] 0.6191685
---------------------------------------------------
auto$cylinders: 4cyl
[1] -0.5430774
---------------------------------------------------
auto$cylinders: 5cyl
[1] -0.04750808
---------------------------------------------------
auto$cylinders: 6cyl
[1] -0.4634435
---------------------------------------------------
auto$cylinders: 8cyl
[1] -0.5569099

Summary

R is an extensible system, and its functionality is divided across numerous packages, each exposing a large number of functions. Even experienced users cannot expect to remember all the details off the top of their head. In this article, we went through a few techniques with which R helps analyze data and visualize the results.

Resources for Article:

Further resources on this subject:
Combining Vector and Raster Datasets [article]
Factor variables in R [article]
Big Data Analysis (R and Hadoop) [article]

Working with a Liferay User / User Group / Organization

Packt
04 Jun 2015
23 min read
In this article by Piotr Filipowicz and Katarzyna Ziółkowska, authors of the book Liferay 6.x Portal Enterprise Intranets Cookbook, we will cover the basic functionalities that will allow us to manage the structure and users of the intranet. In this article, we will cover the following topics: Managing an organization structure Creating a new user group Adding a new user Assigning users to organizations Assigning users to user groups Exporting users (For more resources related to this topic, see here.) The first step in creating an intranet, beyond answering the question of who the users will be, is to determine its structure. The structure of the intranet is often a derivative of the organizational structure of the company or institution. Liferay Portal CMS provides several tools that allow mapping of a company's structure in the system. The hierarchy is built by organizations that match functional or localization departments of the company. Each organization represents one department or localization and assembles users who represent employees of these departments. However, sometimes, there are other groups of employees in the company. These groups exist beyond the company's organizational structure, and can be reflected in the system by the User Groups functionality. Managing an organization structure Building an organizational structure in Liferay resembles the process of managing folders on a computer drive. An organization may have its suborganizations and—except the first level organization—at the same time, it can be a suborganization of another one. This folder-similar mechanism allows you to create a tree structure of organizations. Let's imagine that we are obliged to create an intranet for a software development company. The company's headquarter is located in London. There are also two other offices in Liverpool and Glasgow. The company is divided into finance, marketing, sales, IT, human resources, and legal departments. Employees from Glasgow and Liverpool belong to the IT department. How to do it… In order to create a structure described previously, these are the steps: Log in as an administrator and go to Admin | Control Panel | Users | Users and Organizations. Click on the Add button. Choose the type of organization you want to create (in our example, it will be a regular organization called software development company, but it is also possible to choose a location). Provide a name for the top-level organization. Choose the parent organization (if a top-level organization is created, this must be skipped). Click on the Save button: Click on the Change button and upload a file, with a graphic representation of your company (for example, logo). Use the right column menu to navigate to data sections you want to fill in with the information. Click on the Save button. Go back to the Users and Organizations list by clicking on the back icon (the left-arrow icon next to the Edit Software Development Company header). Click on the Actions button, located near the name of the newly created organization. Choose the Add Regular Organization option. Provide a name for the organization (in our example, it is IT). Click on the Save button. Go back to the Users and Organizations list by clicking on the back icon (left-arrow icon next to Edit IT header). Click on the Actions button, located near the name of the newly created organization (in our case, it is IT). Choose the Add Location option. Provide a name for the organization (for instance, IT Liverpool). Provide a country. 
Provide a region (if available). Click on the Save button. How it works… Let's take a look at what we did throughout the previous recipe. In steps 1 through 6, we created a new top-level organization called software development company. With steps 7 through 9, we defined a set of attributes of the newly created organization. Starting from step 11, we created suborganizations: standard organization (IT) and its location (IT Liverpool). Creating an organization There are two types of organizations: regular organizations and locations. The regular organization provides the possibility to create a multilevel structure, each unit of which can have parent organizations and suborganizations (there is one exception: the top-level organization cannot have any parent organizations). The localization is a special kind of organization that allows us to provide some additional data, such as country and region. However, it does not enable us to create suborganizations. When creating the tree of organizations, it is possible to combine regular organizations and locations, where, for instance, the top-level organization will be the regular organization and, both locations and regular organizations will be used as child organizations. When creating a new organization, it is very important to choose the organization type wisely, because it is the only organization parameter, which cannot be modified further. As was described previously, organizations can be arranged in a tree structure. The position of the organization in a tree is determined by the parent organization parameter, which is set by creating a new organization or by editing an existing one. If the parent organization is not set, a top-level organization is always created. There are two ways of creating a suborganization. It is possible to add a new organization by using the Add button and choosing a parent organization manually. The other way is to go to a specific organization's action menu and choose the Add Regular Organization action. While creating a new organization using this option, the parent organization parameter will be set automatically. Setting attributes Similarly, just like its counterpart in reality, every organization in Liferay has a set of attributes that are grouped and can be modified through the organization profile form. This form is available after clicking on the Edit button from the organization's action list (see the There's more… section). All the available attributes are divided into the following groups: The ORGANIZATION INFORMATION group, which contains the following sections: The Details section, which allows us to change the organization name, parent organization, country, or region (available for locations only). The name of the organization is the only required organization parameter. It is used by the search mechanism to search for organizations. It is also a part of an URL address of the organization's sites. The Organization Sites section, which allows us to enable the private and public pages of the organization's website. The Categorization section, which provides tags and categories. They can be assigned to an organization. IDENTIFICATION, which groups the Addresses, Phone Numbers, Additional Email Addresses, Websites, and Services sections. 
MISCELLANEOUS, which consists of: The Comments section, which allows us to manage an organization's comments The Reminder Queries section, in which reminder queries for different languages can be set The Custom Fields section, which provides a tool to manage values of custom attributes defined for the organization Customizing an organization functionalities Liferay provides the possibility to customize an organization's functionality. In the portal.properties file located in the portal-impl/src folder, there is a section called Organizations. All these settings can be overridden in the portal-ext.properties file. We mentioned that top-level organization cannot have any parent organizations. If we look deeper into portal settings, we can dig out the following properties: organizations.rootable[regular-Organization]=true organizations.rootable[location]=false These properties determine which type of organization can be created as a root organization. In many cases, users want to add a new organization's type. To achieve this goal, it is necessary to set a few properties that describe a new type: organizations.types=regular-Organization,location,my-Organization organizations.rootable[my-organization]=false organizations.children.types[my-organization]=location organizations.country.enabled[my-organization]=false organizations.country.required[my-organization]=false The first property defines a list of available types. The second one denies the possibility to create an organization as a root. The next one specifies a list of types that we can create as children. In our case, this is only the location type. The last two properties turn off the country list in the creation process. This option is useful when the location is not important. Another interesting feature is the ability to customize an organization's profile form. It is possible to indicate which sections are available on the creation form and which are available on the modification form. The following properties aggregate this feature: organizations.form.add.main=details,organization-site organizations.form.add.identification= organizations.form.add.miscellaneous=   organizations.form.update.main=details,organization-site,categorization organizations.form.update.identification=addresses,phone-numbers,additional-email-addresses,websites,services organizations.form.update.miscellaneous=comments,reminder-queries,custom-fields There's more… It is also possible to modify an existing organization and its attributes and to manage its members using actions available in the organization Actions menu. There are several possible actions that can be performed on an organization: The Edit action allows us to modify the attributes of an organization. The Manage Site action redirects the user to the Site Settings section in Control Panel and allows us to manage the organization's public and private sites (if the organization site has been already created). The Assign Organization Roles action allows us to set organization roles to members of an organization. The Assign Users action allows us to assign users already existing in the Liferay database to the specific organization. The Add User action allows us to create a new user, who will be automatically assigned to this specific organization. The Add Regular Organization action enables us to create a new child regular organization (the current organization will be automatically set as a parent organization of a new one). 
The Add Location action enables us to create a new location (the current organization will be automatically set as a parent organization of a new one). The Delete action allows us to remove an organization. While removing an organization, all pages with portlets and content are also removed. An organization cannot be removed if there are suborganizations or users assigned to it. In order to edit an organization, assign or add users, create a new suborganization (regular organization or location) or delete an organization. Perform the following steps: Log in as an administrator and go to Admin | Control panel | Users | Users and Organizations. Click on the Actions button, located near the name of the organization you want to modify. Click on the name of the chosen action. Creating a new user group Sometimes, in addition to the hierarchy, within the company, there are other groups of people linked by common interests or occupations, such as people working on a specific project, people occupying the same post, and so on. Such groups in Liferay are represented by user groups. This functionality is similar to the LDAP users group where it is possible to set group permissions. One user can be assigned into many user groups. How to do it… In order to create a new user group, follow these steps: Log in as an administrator and go to Admin | Control panel | Users | User Groups. Click on the Add button. Provide Name (required) and Description of the user group. Leave the default values in the User Group Site section. Click on the Save button. How it works… The user groups functionality allows us to create a collection of users and provide them with a public and/or private site, which contain a bunch of tools for collaboration. Unlike the organization, the user group cannot be used to produce a multilevel structure. It enables us to create non-hierarchical groups of users, which can be used by other functionalities. For example, a user group can be used as an additional information targeting tool for the announcements portlet, which presents short messages sent by authorized users (the announcements portlet allows us to direct a message to all users from a specific organization or user group). It is also possible to set permissions to a user group and decide which actions can be performed by which roles within this particular user group. It is worth noting that user groups can assemble users who are already members of organizations. This mechanism is often used when, aside from the company organizational structure, there exist other groups of people who need a common place to store data or for information exchange. There's more… It is also possible to modify an existing user group and its attributes and to manage its members using actions available in the user group Actions menu. There are several possible actions that can be performed on a user group. 
They are as follows: The Edit action allows us to modify attributes of a user group The Permissions action allows us to decide which roles can assign members of this user group, delete the user group, manage announcements, set permissions, and update or view the user group The Manage Site Pages action redirects the user to the site settings section in Control Panel and allows us to manage the user group's public and private sites The Go to the Site's Public Pages action opens the user group's public pages in a new window (if any public pages of User Group Site has been created) The Go to the Site's Private Pages action opens the user group's private pages in a new window (if any public pages of User Group Site has been created) The Assign Members action allows us to assign users already existing in the Liferay database to this specific user group The Delete action allows us to delete a user group A user group cannot be removed if there are users assigned to it. In order to edit a user group, set permissions, assign members, manage site pages, or delete a user group, perform these steps: Go to Admin | Control panel | Users | User Groups. Click on the Actions button, located near the name of the user group you want to modify: Click on the name of the chosen action. Adding a new user Each system is created for users. Liferay Portal CMS provides a few different ways of adding users to the system that can be enabled or disabled depending on the requirements. The first way is to enable users by creating their own accounts via the Create Account form. This functionality allows all users who can enter the site containing the form to register and gain access to the designated content of the website. In this case, the system automatically assigns the default user account parameters, which indicate the range of activities that may be carried by them in the system. The second solution (which we presented in this recipe) is to reserve the users' account creation to the administrators, who will decide what parameters should be assigned to each account. How to do it… To add a new user, you need to follow these steps: Log in as an administrator and go to Admin | Control panel | Users | Users and Organizations. Click on the Add button. Choose the User option. Fill in the form by providing the user's details in the Email Address (Required), Title, First Name (Required), Middle Name, Last Name, Suffix, Birthday, and Job Title fields (if the Autogenerated User Screen Names option in the Portal Settings | Users section is disabled, the screen name field will be available): Click on the Save button: Using the right column menu, navigate to the data sections you want to fill in with the information. Click on the Save button. How it works… In steps 1 through 5, we created a new user. With steps 6 and 7, we defined a set of attributes of the newly created user. This user is active and can already perform activities according to their memberships and roles. To understand all the mechanisms that influence the user's possible behavior in the system, we have to take a deeper look at these attributes. User as a member of organizations, user groups, and sites The first and most important thing to know about users is that they can be members of organizations, user groups, and sites. The range of activities performed by users within each organization, user group, or site they belong to is determined by the roles assigned to them. All the roles must be assigned for each user of an organization and site individually. 
This means it is possible, for instance, to make a user the administrator of one organization and only a power user of another. User attributes Each user in Liferay has a set of attributes that are grouped and can be modified through the user profile form. This form is available after clicking on the Edit button from the user's actions list (see, the There's more… section). All the available attributes are divided into the following groups: USER INFORMATION, which contains the following sections: The Details section enables us to provide basic user information, such as Screen Name, Email Address, Title, First Name, Middle Name, Last Name, Suffix, Birthday, Job Title, and Avatar The Password section allows us to set a new password or force a user to change their current password The Organizations section enables us to choose the organizations of which the user is a member The Sites section enables us to choose the sites of which the user is a member The User Groups section enables us to choose user groups of which the user is a member The Roles tab allows us to assign user roles The Personal Site section helps direct the public and private sites to the user The Categorization section provides tags and categories, which can be assigned to a user IDENTIFICATION allows us to to set additional user information, such as Addresses, Phone Numbers, Additional Email Addresses, Websites, Instant Messenger, Social Network, SMS, and OpenID MISCELLANEOUS, which contains the following sections: The Announcements section allows us to set the delivery options for alerts and announcements The Display Settings section covers the Language, Time Zone, and Greeting text options The Comments section allows us to manage the user's comments The Custom Fields section provides a tool to manage values of custom attributes defined for the user User site As it was mentioned earlier, each user in Liferay may have access to different kinds of sites: organization sites, user group sites, and standalone sites. In addition to these, however, users may also have their own public and private sites, which can be managed by them. The user's public and private sites can be reached from the user's menu located on the dockbar (the My Profile and My Dashboard links). It is also possible to enter these sites using their addresses, which are /web/username/home and /user/username/home, respectively. Customizing users Liferay gives us a whole bunch of settings in portal.properties under the Users section. If you want to override some of the properties, put them into the portal-ext.properties file. It is possible to deny deleting a user by setting the following property: users.delete=false As in the case of organizations, there is a functionality that lets us customize sections on the creation or modification form: users.form.add.main=details,Organizations,personal-site users.form.add.identification= users.form.add.miscellaneous=   users.form.update.main=details,password,Organizations,sites,user-groups,roles,personal-site,categorization users.form.update.identification=addresses,phone-numbers,additional-email-addresses,websites,instant-messenger,social-network,sms,open-id users.form.update.miscellaneous=announcements,display-settings,comments,custom-fields There are many other properties, but we will not discuss all of them. In portal.properties, located in the portal-impl/src folder, under the Users section, it is possible to find all the settings, and every line is documented by comment. 
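To make this concrete, here is a small, hypothetical portal-ext.properties override that combines two of the keys quoted above. The values are our own example, not from the book, and simply illustrate the override mechanism: they prevent users from being deleted and add the Addresses and Phone Numbers sections to the user creation form:
users.delete=false
users.form.add.identification=addresses,phone-numbers
After changing portal-ext.properties, the portal has to be restarted for the new values to take effect.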
There's more…

Each user in the system can be active or inactive. An active user can log in to their user account and use all the resources available to them within their roles and memberships. An inactive user cannot log in to their account or access the places and perform the activities that are reserved for authorized and authenticated users only. It is worth noticing that active users cannot be deleted. In order to remove a user from Liferay, you need to deactivate them first.

To deactivate a user, follow these steps: Log in as an administrator and go to Admin | Control panel | Users | Users and Organizations. Go to the All Users tab. Find the active user you want to deactivate. Click on the Actions button located near the name of the user. Click on the Deactivate button. Confirm this action by clicking on the Ok button.

To activate a user, follow these steps: Log in as an administrator and go to Admin | Control panel | Users | Users and Organizations. Go to the All Users tab. Find the inactive user you want to activate. Click on the Actions button located near the name of the user. Click on the Activate button.

Sometimes, when using the system, users report some irregularities or get a little confused and require assistance, and you need to look at a page through the user's eyes. Liferay provides a very useful functionality that allows authorized users to impersonate another user. In order to use this functionality, perform these steps: Log in as an administrator and go to Control Panel | Users | Users and Organizations. Click on the Actions button located near the name of the user. Click on the Impersonate user button.

See also

For more information on managing users, refer to the Exporting users recipe from this article

Assigning users to organizations

There are several ways a user can be assigned to an organization. It can be done by editing a user account that has already been created (see the User attributes section in the Adding a new user recipe) or by using the Assign Users action from the organization actions menu. In this recipe, we will show you how to assign a user to an organization using the option available in the organization actions menu.

Getting ready

To go through this recipe, you will need an organization and a user (refer to the Managing an organization structure and Adding a new user recipes from this article).

How to do it…

In order to assign a user to an organization from the organization menu, follow these steps: Log in as an administrator and go to Admin | Control panel | Users | Users and Organizations. Click on the Actions button located near the name of the organization to which you want to assign the user. Choose the Assign Users option. Click on the Available tab. Mark the user or group of users you want to assign. Click on the Update Associations button.

How it works…

Each user in Liferay can be assigned to as many regular organizations as required and to exactly one location. When a user is assigned to an organization, they appear on the list of users of that organization. They become a member of the organization and gain access to the organization's public and private pages according to the assigned roles and permissions. As was shown in the previous recipe, while editing the list of assigned users in the organization menu, it is possible to assign multiple users at once. It is worth noting that an administrator can assign only those users of the organizations and suborganizations that she or he can manage.
To allow any administrator of an organization to be able to assign any user to that organization, set the following property in the portal-ext.properties file: Organizations.assignment.strict=true In many cases, when our organizations have a tree structure, it is not necessary that a member of a child organization has access to the ancestral ones. To disable this structure set the following property: Organizations.membership.strict=true See also For information on how to create user accounts, refer to the Adding a new user recipe from this article For information on assigning users to user groups, refer to the Assigning users to a user group recipe from this article Assigning users to a user group In addition to being a member of the organization, each user can be a member of one or more user groups. As a member of a user group, a user can profit by getting access to the user group's sites or other information directed exclusively to its members, for instance, messages sent by the Announcements portlet. A user becomes a member of the group when they are assigned to it. This assignment can be done by editing the user account that has already been created (see the User attributes description in Adding a new user recipe) or using the Assign Members action from the User Groups actions menu. In this recipe, we will show you how to assign a user to a user group using the option available in the User Groups actions menu. Getting ready To step through this recipe, first, you have to create a user group and a user (see the Creating a new user group and Adding a new user recipes). How to do it… In order to assign a user to a user group from the User Groups menu, perform these steps: Log in as an administrator and go to Admin | Control panel | Users | User Groups. Click on the Actions button located near the name of the user group to which you want to assign the user. Click on the Assign Members button. Click on the Available tab. Mark a user or group of users you want to assign. Click on the Update Associations button. How it works… As was shown in this recipe, one or more users can be assigned to a user group by editing the list of assigned users in the user group menu. Each user assigned to a user group becomes a member of this group and gains access to the user group's public and private pages according to assigned roles and permissions. See also For information on how to create user accounts, refer to the Adding a new user recipe from this article For information about assigning users to organization, refer to the Assigning users to organizations recipe from this article Exporting users Liferay Portal CMS provides a simple export mechanism, which allows us to export a list of all the users stored in the database or a list of all the users from a specific organization to a file. How to do it… In order to export the list of all users from the database to a file, follow these steps: Log in as an administrator and go to Admin | Control Panel | Users | Users and Organizations. Click on the Export Users button. In order to export the list of all users from the specific organization to a file, follow these steps: Log in as an administrator and go to Admin | Control Panel | Users | Users and Organizations. Click on the All Organizations tab. Click on the name of an organization to which the users are supposed to be exported. Click on the Export Users button. How it works… As mentioned previously, Liferay allows us to export users from a particular organization to a .csv file. 
The .csv file contains a list of user names and corresponding e-mail addresses. It is also possible to export all the users by clicking on the Export Users button located on the All Users tab. You will find this tab by going to Admin | Control panel | Users | Users and Organizations. See also For information on how to create user accounts, refer to the Adding a new user recipe from this article For information on how to assign users to organizations, refer to the Assigning users to organizations recipe from this article Summary In this article, you have learnt how to manage an organization structure by creating users and assigning them to organizations and user groups. You have also learnt how to export users using Liferay's export mechanism. Resources for Article: Further resources on this subject: Cache replication [article] Portlet [article] Liferay, its Installation and setup [article]

Introduction to Microsoft Azure Cloud Services

Packt
04 Jun 2015
10 min read
In this article by Gethyn Ellis, author of the book Microsoft Azure IaaS Essentials, we will understand cloud computing and the various services offered by it. (For more resources related to this topic, see here.) Understanding cloud computing What do we mean when we talk about cloud from an information technology perspective? People mention cloud services; where do we get the services from? What services are offered? The Wikipedia definition of cloud computing is as follows: "Cloud computing is a computing term or metaphor that evolved in the late 1990s, based on utility and consumption of computer resources. Cloud computing involves application systems which are executed within the cloud and operated through internet enabled devices. Purely cloud computing does not rely on the use of cloud storage as it will be removed upon users download action. Clouds can be classified as public, private and [hybrid cloud|hybrid]." If you have worked with virtualization, then the concept of cloud is not completely alien to you. With virtualization, you can group a bunch of powerful hardware together, using a hypervisor. A hypervisor is a kind of software, operating system, or firmware that allows you to run virtual machines. Some of the popular Hypervisors on the market are VMware ESX or Microsoft's Hyper-V. Then, you can use this powerful hardware to run a set of virtual servers or guests. The guests share the resources of the host in order to execute and provide the services and computing resources of your IT department. The IT department takes care of everything from maintaining the hypervisor hosts to managing and maintaining the virtual servers and guests. The internal IT department does all the work. This is sometimes termed as a private cloud. Third-party suppliers, such as Microsoft, VMware, and Amazon, have a public cloud offering. With a public cloud, some computing services are provided to you on the Internet, and you can pay for what you use, which is like a utility bill. For example, let's take the utilities you use at home. This model can be really useful for start-up business that might not have an accurate demand forecast for their services, or the demand may change very quickly. Cloud computing can also be very useful for established businesses, who would like to make use of the elastic billing model. The more services you consume, the more you pay when you get billed at the end of the month. There are various types of public cloud offerings and services from a number of different providers. The TechNet top ten cloud providers are as follows: VMware Microsoft Bluelock Citrix Joyent Terrmark Salesforce.com Century Link RackSpace Amazon Web Services It is interesting to read that in 2013, Microsoft was only listed ninth in the list. With a new CEO, Microsoft has taken a new direction and put its Azure cloud offering at the heart of the business model. To quote one TechNet 2014 attendee: "TechNet this year was all about Azure, even the on premises stuff was built on the Azure model" With a different direction, it seems pretty clear that Microsoft is investing heavily in its cloud offering, and this will be further enhanced with further investment. This will allow a hybrid cloud environment, a combination of on-premises and public cloud, to be combined to offer organizations that ultimate flexibility when it comes to consuming IT resources. Services offered The term cloud is used to describe a variety of service offerings from multiple providers. 
You could argue, in fact, that the term cloud doesn't actually mean anything specific in terms of the service that you're consuming. It is, in fact, just a term that means you are consuming an IT service from a provider. Be it an internal IT department in the form of a private cloud or a public offering from some cloud provider, a public cloud, or it could be some combination of both in the form of a hybrid cloud. So, then what are the services that cloud providers offer? Virtualization and on-premises technology Most business even in today's cloudy environment has some on-premises technology. Until virtualization became popular and widely deployed several years ago, it was very common to have a one-to-one relationship between a physical hardware server with its own physical resources, such as CPU, RAM, storage, and the operating system installed on the physical server. It became clear that in this type of environment, you would need a lot of physical servers in your data center. An expanding and sometimes, a sprawling environment brings its own set of problems. The servers need cooling and heat management as well as a power source, and all the hardware and software needs to be maintained. Also, in terms of utilization, this model left lots of resources under-utilized: Virtualization changed this to some extent. With virtualization, you can create several guests or virtual servers that are configured to share the resources of the underlying host, each with their own operating system installed. It is possible to run both a Windows and Linux guest on the same physical host using virtualization. This allows you to maximize the resource utilization and allows your business to get a better return on investment on its hardware infrastructure: Virtualization is very much a precursor to cloud; many virtualized environments are sometimes called private clouds, so having an understanding of virtualization and how it works will give you a good grounding in some of the concepts of a cloud-based infrastructure. Software as a service (SaaS) SaaS is a subscription where you need to pay to use the software for the time that you're using it. You don't own any of the infrastructures, and you don't have to manage any of the servers or operating systems, you simply consume the software that you will be using. You can think of SaaS as like taking a taxi ride. When you take a taxi ride, you don't own the car, you don't need to maintain the car, and you don't even drive the car. You simply tell the taxi driver or his company when and where you want to travel somewhere, and they will take care of getting you there. The longer the trip, that is, the longer you use the taxi, the more you pay. An example of Microsoft's Software as a service would be the Azure SQL Database. The following diagram shows the cloud-based SQL database: Microsoft offers customers a SQL database that is fully hosted and maintained in Microsoft data centers, and the customer simply has to make use of the service and the database. So, we can compare this to having an on-premises database. To have an on-premises database, you need a Windows Server machine (physical or virtual) with the appropriate version of SQL Server installed. 
The server would need enough CPU, RAM, and storage to fulfill the needs of your database, and you need to manage and maintain the environment, applying various patches to the operating systems as they become available, installing, and testing various SQL Server service packs as they become available, and all the while, your application makes use of the database platform. With the SQL Azure database, you have no overhead, you simply need to connect to the Microsoft Azure portal and request a SQL database by following the wizard: Simply, give the database a name. In this case, it's called Helpdesk, select the service tier you want. In this example, I have chosen the Basic service tier. The service tier will define things, such as the resources available to your database, and impose limits, in terms of database size. With the Basic tier, you have a database size limit of 2 GB. You can specify the server that you want to create your database with, accept the defaults on the other settings, click on the check button, and the database gets created: It's really that simple. You will then pay for what you use in terms of database size and data access. In a later section, you will see how to set up a Microsoft Azure account. Platform as a service (PaaS) With PaaS, you rent the hardware, operating system, storage, and network from the public cloud service provider. PaaS is an offshoot of SaaS. Initially, SaaS didn't take off quickly, possibly because of the lack of control that IT departments and business thought they were going to suffer as a result of using the SaaS cloud offering. Going back to the transport analogy, you can compare PaaS to car rentals. When you rent a car, you don't need to make the car, you don't need to own the car, and you have no responsibility to maintain the car. You do, however, need to drive the car if you are going to get to your required destination. In PaaS terms, the developer and the system administrator have slightly more control over how the environment is set up and configured but still much of the work is taken care of by the cloud service provider. So, the hardware, operating system, and all the other components that run your application are managed and taken care of by the cloud provider, but you get a little more control over how things are configured. A geographically dispersed website would be a good example of an application offered on a PaaS offering. Infrastructure as a service (IaaS) With IaaS, you have much more control over the environment, and everything is customizable. Going with the transport analogy again, you can compare it to buying a car. The service provides you with the car upfront, and you are then responsible for using the car to ensure that it gets you from A to B. You are also responsible to fix the car if something goes wrong, and also ensure that the car is maintained by servicing it regularly, adding fuel, checking the tyre pressure, and so on. You have more control, but you also have more to do in terms of maintenance. Microsoft Azure has an offering. You can deploy a virtual machine, you can specify what OS you want, how much RAM you want the virtual machine to have, you can specify where the server will sit in terms of Microsoft data centers, and you can set up and configure recoverability and high availability for your Azure virtual machine: Hybrid environments With a hybrid environment, you get a combination of on-premises infrastructure and cloud services. 
It allows you to flexibly add resilience and high availability to your existing infrastructure; it's perfectly possible for the cloud to act as a disaster recovery site for your existing infrastructure.

Microsoft Azure
In order to work with the examples in this article, you need to sign up for a Microsoft account. Visit http://azure.microsoft.com/ and create an account yourself by completing the necessary form:

Here, you simply enter your details; you can use your e-mail address as your username. Enter the credentials specified. Return to the Azure website and, if you want to make use of the free trial, click on the free trial link. Currently, you get $125 worth of free Azure services. Once you have clicked on the free trial link, you will have to verify your details. You will also need to enter a credit card number and its details; Microsoft assures that you won't be charged during the free trial. Enter the appropriate details and click on Sign Up:

Summary
In this article, we looked at some of the terminology around the cloud. From the services offered to some of the specific features available in Microsoft Azure, you should now be able to differentiate between a public and a private cloud, and between some of the public cloud offerings.

Resources for Article:
Further resources on this subject:
Windows Azure Service Bus: Key Features [article]
Digging into Windows Azure Diagnostics [article]
Using the Windows Azure Platform PowerShell Cmdlets [article]
Installing OpenStack Swift

Packt
04 Jun 2015
10 min read
In this article by Amar Kapadia, Sreedhar Varma, and Kris Rajana, authors of the book OpenStack Object Storage (Swift) Essentials, we will see how IT administrators can install OpenStack Swift. The version discussed here is the Juno release of OpenStack. Installation of Swift has several steps and requires careful planning before beginning the process. A simple installation consists of installing all Swift components on a single node, and a complex installation consists of installing Swift on several proxy server nodes and storage server nodes. The number of storage nodes can be in the order of thousands across multiple zones and regions. Depending on your installation, you need to decide on the number of proxy server nodes and storage server nodes that you will configure. This article demonstrates a manual installation process; advanced users may want to use utilities such as Puppet or Chef to simplify the process. This article walks you through an OpenStack Swift cluster installation that contains one proxy server and five storage servers.

(For more resources related to this topic, see here.)

Hardware planning
This section describes the various hardware components involved in the setup. Since Swift deals with object storage, disks are going to be a major part of hardware planning. The size and number of disks required should be calculated based on your requirements. Networking is also an important component, where factors such as a public or private network and a separate network for communication between storage servers need to be planned. Network throughput of at least 1 Gbps is suggested, while 10 Gbps is recommended. The servers we set up as proxy and storage servers are dual quad-core servers with 12 GB of RAM.

In our setup, we have a total of 15 x 2 TB disks for Swift storage; this gives us a total size of 30 TB. However, with in-built replication (with a default replica count of 3), Swift maintains three copies of the same data. Therefore, the effective capacity for storing files and objects is approximately 10 TB, taking filesystem overhead into consideration. This is further reduced due to less than 100 percent utilization. The following figure depicts the nodes of our Swift cluster configuration:

The storage servers have container, object, and account services running in them.

Server setup and network configuration
All the servers are installed with the Ubuntu server operating system (64-bit LTS version 14.04). You'll need to configure three networks, which are as follows:

Public network: The proxy server connects to this network. This network provides public access to the API endpoints within the proxy server.

Storage network: This is a private network that is not accessible to the outside world. All the storage servers and the proxy server connect to this network. Communication between the proxy server and the storage servers, and communication between the storage servers, takes place within this network. In our configuration, the IP addresses assigned in this network are in the range 172.168.10.0 to 172.168.10.99.

Replication network: This is also a private network that is not accessible to the outside world. It is dedicated to replication traffic, and only storage servers connect to it. All replication-related communication between storage servers takes place within this network. In our configuration, the IP addresses assigned in this network are in the range 172.168.9.0 to 172.168.9.99.
This network is optional, and if it is set up, the traffic on it needs to be monitored closely.

Pre-installation steps
In order for the various servers to communicate easily, edit the /etc/hosts file and add the host names of each server to it. This has to be done on all the nodes. The following screenshot shows an example of the contents of the /etc/hosts file of the proxy server node:

Install the Network Time Protocol (NTP) service on the proxy server node and the storage server nodes. This helps all the nodes synchronize their services effectively without any clock delays. The pre-installation steps to be performed are as follows:

Run the following command to install the NTP service:

# apt-get install ntp

Configure the proxy server node to be the reference server for the storage server nodes to set their time from. Make sure that the following line is present in /etc/ntp.conf for the NTP configuration on the proxy server node:

server ntp.ubuntu.com

For the NTP configuration on the storage server nodes, add the following line to /etc/ntp.conf. Comment out the remaining lines with server addresses such as 0.ubuntu.pool.ntp.org, 1.ubuntu.pool.ntp.org, 2.ubuntu.pool.ntp.org, and 3.ubuntu.pool.ntp.org:

# server 0.ubuntu.pool.ntp.org
# server 1.ubuntu.pool.ntp.org
# server 2.ubuntu.pool.ntp.org
# server 3.ubuntu.pool.ntp.org
server s-swift-proxy

Restart the NTP service on each server with the following command:

# service ntp restart

Downloading and installing Swift
The Ubuntu Cloud Archive is a special repository that provides users with the ability to install new releases of OpenStack. The steps required to download and install Swift are as follows:

Enable the capability to install new releases of OpenStack, and install the latest version of Swift on each node using the following commands. The second command shown here creates a file named cloudarchive-juno.list in /etc/apt/sources.list.d, whose content is "deb http://ubuntu-cloud.archive.canonical.com/ubuntu":

Now, update the OS using the following command:

# apt-get update && apt-get dist-upgrade

On all the Swift nodes, install the prerequisite software and services using this command:

# apt-get install swift rsync memcached python-netifaces python-xattr python-memcache

Next, create a Swift folder under /etc and give the swift user permission to access this folder, using the following commands:

# mkdir -p /etc/swift/
# chown -R swift:swift /etc/swift

Download the /etc/swift/swift.conf file from GitHub using this command:

# curl -o /etc/swift/swift.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/swift.conf-sample

Modify the /etc/swift/swift.conf file and add a variable called swift_hash_path_suffix in the swift-hash section. Create a unique hash string using # python -c "from uuid import uuid4; print uuid4()" or # openssl rand -hex 10, and assign it to this variable. Then add another variable called swift_hash_path_prefix to the swift-hash section, and assign to it another hash string created using the same method. These strings will be used in the hashing process to determine the mappings in the ring. The swift.conf file should be identical on all the nodes in the cluster.
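The sample configuration that originally accompanied this step is not reproduced here, so the following is a minimal sketch of what the swift-hash section of /etc/swift/swift.conf might look like once both variables are set. The hash values shown are placeholders; use the unique strings you generated:

[swift-hash]
# Random, unique strings used when hashing object paths onto ring partitions.
# Keep them secret and identical on every node in the cluster.
swift_hash_path_suffix = 0a1b2c3d4e5f67890abc
swift_hash_path_prefix = f9e8d7c6b5a443322110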
Setting up storage server nodes
This section explains the additional steps needed to set up the storage server nodes, which will contain the object, container, and account services.

Installing services
The first step required to set up a storage server node is installing the services. Let's look at the steps involved:

On each storage server node, install the packages for the swift-account, swift-container, and swift-object services, as well as xfsprogs (for the XFS filesystem), using this command:

# apt-get install swift-account swift-container swift-object xfsprogs

Download the account-server.conf, container-server.conf, and object-server.conf samples from GitHub, using the following commands:

# curl -o /etc/swift/account-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/account-server.conf-sample
# curl -o /etc/swift/container-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/container-server.conf-sample
# curl -o /etc/swift/object-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/object-server.conf-sample

Edit the /etc/swift/account-server.conf, /etc/swift/container-server.conf, and /etc/swift/object-server.conf files with the appropriate sections for your environment.

Formatting and mounting hard disks
On each storage server node, we need to identify the hard disks that will be used to store the data. We will then format the hard disks and mount them on a directory, which Swift will use to store data. We will not create any RAID levels or subpartitions on these hard disks because they are not necessary for Swift; they will be used as entire disks. The operating system will be installed on separate disks, which will be RAID configured.

First, identify the hard disks that are going to be used for storage and format them. In our storage server, we have identified sdb, sdc, and sdd to be used for storage. We will perform the following operations on sdb; these four steps should be repeated for sdc and sdd as well:

Carry out the partitioning for sdb and create the filesystem using these commands:

# fdisk /dev/sdb
# mkfs.xfs /dev/sdb1

Then create a directory at /srv/node/sdb1 that will be used to mount the filesystem, and give the swift user permission to access this directory. These operations can be performed using the following commands:

# mkdir -p /srv/node/sdb1
# chown -R swift:swift /srv/node/sdb1

We set up an entry in fstab for the sdb1 partition on the sdb hard disk, as follows. This will automatically mount sdb1 on /srv/node/sdb1 upon every boot. Add the following line to the /etc/fstab file:

/dev/sdb1 /srv/node/sdb1 xfs noatime,nodiratime,nobarrier,logbufs=8 0 2

Mount sdb1 on /srv/node/sdb1 using the following command:

# mount /srv/node/sdb1

RSYNC and RSYNCD
In order for Swift to perform the replication of data, we need to configure rsync through rsyncd.conf. This is done by performing the following steps:

Create the rsyncd.conf file in the /etc folder:

# vi /etc/rsyncd.conf

We set up synchronization within the replication network by including the appropriate module definitions in the configuration file; a representative example is sketched below. 172.168.9.52 is the IP address on the replication network for this storage server. Use the appropriate replication network IP addresses for the corresponding storage servers.
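The original listing for rsyncd.conf is not reproduced here; the following is a representative sketch based on the standard Swift installation guides, and the address and paths are assumptions to adjust for your environment:

uid = swift
gid = swift
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
address = 172.168.9.52

[account]
max connections = 2
path = /srv/node/
read only = false
lock file = /var/lock/account.lock

[container]
max connections = 2
path = /srv/node/
read only = false
lock file = /var/lock/container.lock

[object]
max connections = 2
path = /srv/node/
read only = false
lock file = /var/lock/object.lock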
We then edit the /etc/default/rsync file and set RSYNC_ENABLE to true using the following configuration option:

RSYNC_ENABLE=true

Next, restart the rsync service using this command:

# service rsync restart

Then create the swift recon and cache directories using the following commands:

# mkdir -p /var/cache/swift
# mkdir -p /var/swift/recon

Setting permissions is done using these commands:

# chown -R swift:swift /var/cache/swift
# chown -R swift:swift /var/swift/recon

Repeat these steps on every storage server.

Setting up the proxy server node
This section explains the steps required to set up the proxy server node, which are as follows:

Install the following services only on the proxy server node:

# apt-get install python-swiftclient python-keystoneclient python-keystonemiddleware swift-proxy

Swift doesn't support HTTPS. OpenSSL has already been installed as part of the operating system installation to support HTTPS. We are going to use the OpenStack Keystone service for authentication. In order to set up the proxy-server.conf file for this, we download the configuration file from the following link and edit it:

https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/proxy-server.conf-sample

# vi /etc/swift/proxy-server.conf

The proxy-server.conf file should be edited to set the correct auth_host, admin_token, admin_tenant_name, admin_user, and admin_password values:

admin_token = 01d8b673-9ebb-41d2-968a-d2a85daa1324
admin_tenant_name = admin
admin_user = admin
admin_password = changeme

Next, we create a keystone-signing directory and give the swift user permissions on it, using the following commands:

# mkdir -p /home/swift/keystone-signing
# chown -R swift:swift /home/swift/keystone-signing

Summary
In this article, you learned how to install and set up the OpenStack Swift service to provide object storage, and how to set up the Keystone service to provide authentication for users accessing the Swift object storage.

Resources for Article:
Further resources on this subject:
Troubleshooting in OpenStack Cloud Computing [Article]
Using OpenStack Swift [Article]
Playing with Swift [Article]
Events, Notifications, and Reporting

Packt
04 Jun 2015
55 min read
In this article by Martin Wood, the author of the book Mastering ServiceNow, we look at communication, which is a key part of any business application. Not only does the boss need an updated report by Monday, but your customers and users also want to be kept informed. ServiceNow helps users who want to know what's going on. In this article, we'll explore the functionality available. The platform can notify and provide information to people in a variety of ways:

Registering events and creating Scheduled Jobs to automate functionality
Sending out informational e-mails when something happens
Live dashboards and homepages showing the latest reports and statistics
Scheduled reports that help with handover between shifts
Capturing information with metrics
Presenting a single set of consolidated data with database views

(For more resources related to this topic, see here.)

Dealing with events
Firing an event is a way to tell the platform that something happened. Since ServiceNow is a data-driven system, in many cases this means that a record has been updated in some way. For instance, maybe a guest has been made a VIP, or has stayed for 20 nights. Several parts of the system may be listening for an event to happen. When it does, they perform an action. One of these actions may be sending an e-mail to thank our guest for their continued business. These days, e-mail notifications don't need to be triggered by events; however, it is an excellent example. When you fire an event, you pass through a GlideRecord object and up to two string parameters. The item receiving this data can then use it as necessary, so if we wanted to send an e-mail confirming a hotel booking, we have those details to hand during processing.

Registering events
Before an event can be fired, it must be known to the system. We do this by adding it to the Event Registry [sysevent_register], which can be accessed by navigating to System Policy > Events > Registry. It's a good idea to check whether there is already one you can use before you add a new one. An event registration record consists of several fields, most importantly a string name. An event can be called anything, but by convention it is in a dotted, namespace-style format. Often, it is prefixed by the application or table name and then the activity that occurred. Since a GlideRecord object accompanies an event, the table that the record will come from should also be selected. It is also a good idea to describe your event and what will cause it in the Description and Fired by fields. Finally, there is a field that is often left empty, called Queue. This gives us the ability to categorize events and process them in a specific order or frequency.

Firing an event
Most often, a script in a Business Rule will notice that something has happened and will add an event to the Event [sysevent] queue. This table stores all of the events that have been fired, whether each has been processed, and what page the user was on when it happened. As the events come in, the platform deals with them in first in, first out order by default. It finds everything that is listening for the specific event and executes it. That may be an e-mail notification or a script. By navigating to System Policy > Events > Event Log, you can view the state of an event, when it was added to the queue, and when it was processed. To add an event to the queue, use the eventQueue function of GlideSystem.
It accepts four parameters: the name of the event, a GlideRecord object, and two run time parameters. These can be any text strings, but most often are related to the user that caused the event.

Sending an e-mail for new reservations
Let's create an event that will fire when a Maintenance task has been assigned to one of our teams. Navigate to System Policy > Events > Registry. Click on New and set the following fields:

    Event name: maintenance.assigned
    Table: Maintenance [u_maintenance]

Next, we need to add the event to the Event Queue. This is easily done with a simple Business Rule:

    Name: Maintenance assignment events
    Table: Maintenance [u_maintenance]
    Advanced: <ticked>
    When: after

Make sure to always fire events after the record has been written to the database. This stops the possibility of firing an event even though another script has aborted the action.

    Insert: <ticked>
    Update: <ticked>
    Filter Conditions:
        Assignment group - changes
        Assignment group - is not empty
        Assigned to - is empty

This filter represents when a task is sent to a new group but someone hasn't yet been identified to own the work.

    Script:
    gs.eventQueue('maintenance.assigned', current, gs.getUserID(), gs.getUserName());

This script follows the standard convention when firing events: passing the event name; current, which contains the GlideRecord object the Business Rule is working with; and some details about the user who is logged in. We'll pick this event up later and send an e-mail whenever it is fired.

There are several events, such as <table_name>.view, that are fired automatically. A very useful one is the login event. Take a look at the Event Log to see what is happening.

Scheduling jobs
You may be wondering how the platform processes the event queue. What picks them up? How often are they processed? In order to make things happen automatically, ServiceNow has a System Scheduler. Processing the event queue is one job that is done on a repeated basis. ServiceNow can provide extra worker nodes that only process events. These shift the processing of things such as e-mails onto another system, enabling the other application nodes to better service user interactions.

To see what is going on, navigate to System Scheduler > Scheduled Jobs > Today's Scheduled Jobs. This is a link to the Schedule Item [sys_trigger] table, a list of everything the system is doing in the background. You will see a job that collects database statistics, another that upgrades the instance (if appropriate), and others that send and receive e-mails or SMS messages. You should also spot one called events process, which deals with the event queue.

A Schedule Item has a Next action date and time field. This is when the platform will next run the job. Exactly what will happen is specified through the Job ID field. This is a reference to the Java class in the platform that will actually do the work. The majority of the time, this is RunScriptJob, which will execute some JavaScript code. The Trigger type field specifies how often the job will repeat. Most jobs are run repetitively, with events process set to run every 30 seconds. Others run when the instance is started, perhaps to preload the cache.

Another job that is run on a periodic basis is SMTP Sender. Once an e-mail has been generated and placed in the sys_email table, the SMTP Sender job performs the same function as many desktop e-mail clients: it connects to an e-mail server and asks it to deliver the message. It runs every minute by default.
This schedule has a direct impact on how quickly our e-mail will be sent out. There may be a delay of up to 30 seconds in generating the e-mail from an event, and a further delay of up to a minute before the e-mail is actually sent. Other jobs may process a particular event queue differently; events placed into the metric queue are worked on after 5 seconds.

Adding your own jobs
The sys_trigger table is a backend data store. It is possible to add your own jobs and edit what is already there, but I don't recommend it. Instead, there is a more appropriate frontend: the Scheduled Job [sysauto] table. The sysauto table is designed to be extended. There are many things that can be automated in ServiceNow, including data imports, sending reports, and creating records, and they each have a table extended from sysauto. Once you create an entry in the sysauto table, the platform creates the appropriate record in the sys_trigger table. This is done through a call in the automation synchronizer Business Rule. Each table extended from sysauto contains fields that are relevant to its automation. For example, a Scheduled Email of Report [sysauto_report] requires e-mail addresses and reports to be specified.

Creating events every day
Navigate to System Definition > Scheduled Jobs. Unfortunately, the sys_trigger and sysauto tables have very similar module names, so be sure to pick the right one. When you click on New, an interceptor will fire, asking you to choose what you want to automate. Let's write a simple script that will create a maintenance task at the end of a hotel stay, so choose Automatically run a script of your choosing. Our aim is to fire an event for each room that needs cleaning. We'll schedule this for midday to give our guests plenty of time to check out. Set the following fields:

    Name: Clean on end of reservation
    Time: 12:00:00
    Run this script:

var res = new GlideRecord('u_reservation');
res.addQuery('u_departure', gs.now());
res.addNotNullQuery('u_room');
res.query();
while (res.next()) {
    gs.eventQueue('room.reservation_end', res.u_room.getRefRecord());
}

Remember to enclose scripts in a function if they could cause other scripts to run. Most often, this is when records are updated, but that is not the case here.

Our reliable friend, GlideRecord, is employed to get reservation records. The first filter ensures that only reservations ending today are returned, while the second filter ignores reservations that don't have a room. Once the database has been queried, the records are looped round. For each one, the eventQueue function of GlideSystem adds an event to the event queue. The record being passed into the event queue is actually the Room record: the getRefRecord function of GlideElement dot-walks through a reference field and returns a newly initialized GlideRecord object rather than more GlideElement objects.

Once the Scheduled Job has been saved, it'll generate the events at midday. But for testing, there is a handy Execute Now UI action. Ensure there is test data that fits the code and click on the button. Navigate to System Policy > Events > Event Log to see the entries.

There is a Conditional checkbox with a separate Condition script field. However, I don't often use this; instead, I provide any conditions inline in the script that I'm writing, just like we did here. For anything more than a few lines, a Script Include should be used for modularity and efficiency, as sketched below.
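As an illustration of that advice, here is a minimal sketch of how the logic above could live in a Script Include; the name ReservationCleanupUtil is hypothetical and not part of the baseline system:

var ReservationCleanupUtil = Class.create();
ReservationCleanupUtil.prototype = {
    initialize: function() {
    },

    // Fire a room.reservation_end event for every reservation ending today
    fireEndOfStayEvents: function() {
        var res = new GlideRecord('u_reservation');
        res.addQuery('u_departure', gs.now());
        res.addNotNullQuery('u_room');
        res.query();
        while (res.next()) {
            gs.eventQueue('room.reservation_end', res.u_room.getRefRecord());
        }
    },

    type: 'ReservationCleanupUtil'
};

The Run this script field of the Scheduled Job then shrinks to a single call:

new ReservationCleanupUtil().fireEndOfStayEvents();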
Running scripts on events
The ServiceNow platform has several items that listen for events. Email Notifications are one, which we'll explore soon. Another is Script Actions. A Script Action is server-side code that is associated with a table and runs against a record, just like a Business Rule. But instead of being triggered by a database action, a Script Action is started by an event.

There are many similarities between a Script Action and an asynchronous Business Rule; they both run server-side, asynchronous code. Unless there is a particular reason, stick to Business Rules for ease and familiarity.

Just like in a Business Rule, the GlideRecord variable called current is available. This is the same record that was passed as the second parameter when gs.eventQueue was called. Additionally, another GlideRecord variable called event is provided. It is initialized against the appropriate Event record on the sysevent table. This gives you access to the other parameters (event.param1 and event.param2) as well as who created the event, when, and more.

Creating tasks automatically
When creating a Script Action, the first step is to register or identify the event it will be associated with. Create another entry in Event Registry:

    Event name: room.reservation_end
    Table: Room [u_room]

In order to make the functionality more data driven, let's create another template. Either navigate to System Definition > Templates or create a new Maintenance task and use the Save as Template option in the context menu. Regardless, set the following fields:

    Name: End of reservation room cleaning
    Table: Maintenance [u_maintenance]
    Template:
        Assignment group: Housekeeping
        Short description: End of reservation room cleaning
        Description: Please perform the standard cleaning for the room listed above.

To create the Script Action, go to System Policy > Events > Script Actions and use the following details:

    Name: Produce maintenance tasks
    Event name: room.reservation_end
    Active: <ticked>
    Script:

var tsk = new GlideRecord('u_maintenance');
tsk.newRecord();
tsk.u_room = current.sys_id;
tsk.applyTemplate('End of reservation room cleaning');
tsk.insert();

This script is quite straightforward. It creates a new GlideRecord object that represents a record in the Maintenance table. The fields are initialized through newRecord, and the Room field is populated with the sys_id of current, which is the Room record that the event is associated with. The applyTemplate function is given the name of the template. It would be better to use a property here instead of hardcoding a template name; a sketch of this follows.
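Here is a minimal sketch of that property-driven approach. The property name hotel.cleaning.template is hypothetical; it would be created under System Properties with the template name as its value, and the second argument to gs.getProperty acts as a fallback default:

// Read the template name from a system property, falling back to a default
var templateName = gs.getProperty('hotel.cleaning.template', 'End of reservation room cleaning');
tsk.applyTemplate(templateName);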
Now, the following items should occur every day:

At midday, a Scheduled Job looks for any reservations that are ending today
For each one, the room.reservation_end event is fired
A Script Action is called, which creates a new Maintenance task
The Maintenance task is assigned, through a template, to the Housekeeping group

But how does Housekeeping know that this task has been created? Let's send them an e-mail!

Sending e-mail notifications
E-mail is ubiquitous. It is often the primary form of communication in business, so it is important that ServiceNow has good support. It is easy to configure ServiceNow to send out communications to whoever needs to know. There are a few general use cases for e-mail notifications:

Action: Asking the receiver to do some work
Informational: Giving the receiver an update or some data
Approval: Asking for a decision. While this is similar enough to an action e-mail, it is a common enough scenario to treat it independently.

We'll work through these scenarios in order to understand how ServiceNow can help. There are obviously many more ways you can use e-mail. One of them is machine-to-machine integration, such as e-bonding. It is possible to do this in ServiceNow, but it is not the best solution.

Setting e-mail properties
A ServiceNow instance uses standard protocols to send and receive e-mail. E-mails are sent by connecting to an SMTP server with a username and password, just like Outlook or any other e-mail client. When an instance is provisioned, it also gets an e-mail account. If your instance is available at instance.service-now.com through the Web, it has an e-mail address of instance@service-now.com. This e-mail account is not unusual: it is accessible via POP to receive mail, and uses SMTP to send it. Indeed, any standard e-mail account can be used with an instance.

Navigate to System Properties > Email to investigate the settings. The properties are laid out in two columns, for sending and receiving over the SMTP and POP connections. When you reach the page, the settings will be tested, so you can immediately see whether the platform is capable of sending or receiving e-mails. Before you spend time configuring Email Notifications, make sure the basics work! ServiceNow will only use one e-mail account to send out e-mails, and by default will only check for new e-mails in one account too.

Tracking sent e-mails in the Activity Log
One important feature of Email Notifications is that they can show up in the Activity Log if configured. This means that all e-mails associated with a ticket are kept together, which is useful when tracking correspondence with a Requester. To configure the Activity Log, navigate to a Maintenance record. Right-click on the field and choose Personalize Activities. At the bottom of the Available list is Sent/Received Emails. Add it to the Selected list and click on Save. Once an e-mail has been sent out, check back on the Activity Formatter to see the results.

Assigning work
Our Housekeeping team is equipped with the most modern technology. Not only are they users of ServiceNow, but they have mobile phones that can send and receive e-mails. They have better things to do than constantly refresh the web interface, so let's ensure that ServiceNow comes to them. One of the most common e-mail notifications is for ServiceNow to inform people when they have been assigned a task. It usually gives an overview and a link to view more details. This e-mail tells them that something needs to happen and that ServiceNow should be updated with the result.

Sending an e-mail notification on assignment
When our Maintenance tasks have the Assignment group field populated, we need the appropriate team members to be aware. We are going to achieve this by sending an e-mail to everyone in that group. At Gardiner Hotels, we empower our staff: they know that one member of the team should pick the task up, own it by setting the Assigned to field to themselves, and then get it done. Navigate to System Policy > Email > Notifications. You will see several examples that are useful for understanding the basic configuration, but we'll create our own. Click on New.
The Email Notifications form is split into three main sections: When to send, Who will receive, and What it will contain. Some options are hidden in a different view, so click on Advanced view to see them all. Start off by giving the basic details:

    Name: Group assignment
    Table: Maintenance [u_maintenance]

Now, let's look at each section of the Email Notifications form in detail.

When to send
This section gives you a choice: either use an event to determine which record should be worked with, or have the e-mail notification system monitor the table directly. Either way, Conditions and Advanced conditions let you provide a filter or a script to ensure you only send e-mails at the right time. If you are using an event, the event must be fired and the condition fields satisfied for the e-mail to be sent.

The Weight field is often overlooked. A single event or record update may satisfy the condition of multiple Email Notifications. For example, a common scenario is to send an e-mail to the Assignment group when it is populated and to send an e-mail to the Assigned to person when that is populated. But what if they both happen at the same time? You probably don't want the Assignment group being told to pick up a task if it has already been assigned. One way is to give the Assignment group e-mail a higher weight: if two e-mails are being generated, only the one with the lower weight will be sent. The other will be marked as skipped. Another way to achieve this scenario is through conditions: only send the Assignment group e-mail if the Assigned to field is empty.

Since we've already created an event, let's use it. And because of the careful use of conditions in the Business Rule, the event is only sent out in the appropriate circumstances. That means no condition is necessary in this Email Notification.

    Send when: Event is fired
    Event name: maintenance.assigned

Who will receive
Once we've determined when an e-mail should be sent, we need to know who it will go to. The majority of the time, it'll be driven by data on the record. This scenario is exactly that: the people who will receive the e-mail are those in the Assignment group field on the Maintenance task. Of course, it is possible to hardcode recipients, and the system can also deliver e-mails to Users and Groups that have been sent as parameters when creating the event.

    Users/Groups in fields: Assignment group

You can also use scripts to specify the From, To, CC, and BCC of an e-mail. The wiki contains more information: http://wiki.servicenow.com/?title=Scripting_for_Email_Notifications

Send to event creator
When someone comes to me and says, "Martin, I've set up the e-mail notification, but it isn't working. Do you know why?", I like to put money on the reason. I very often win, and you can too. Just answer: "Ensure Send to event creator is ticked and try again". The Send to event creator field is only visible on the Advanced view, but it is the cause of this problem. So tick Send to event creator.

Make sure this field is ticked, at least for now. If you do not, when you test your e-mail notifications, you will not receive your e-mail. Why? By default, the system will not send confirmation e-mails. If you were the person to update a record and that causes e-mails to be sent, and it turns out that you are one of the recipients, the e-mail will go to everyone other than you. The reasoning is straightforward: you carried out the action, so why do you need to be informed that it happened?
This cuts down on unnecessary e-mails, and so is a good thing, but it confuses everyone who first comes across it. If there is one tip I can give you in this article, it is this: tick the Send to event creator field when testing e-mails. Better still, test realistically!

What it will contain
The last section is probably the simplest to understand, but the one that takes the most time: deciding what to send. The standard view contains just a few fields: a space to enter your message, a subject line, and an SMS alternate field that is used for text messages. Additionally, there is an Email template field that isn't often used but is useful if you want to deliver the same content in multiple e-mail messages. View the templates by navigating to System Policy > Email > Templates.

These fields all support variable substitution. This is a special syntax that instructs the instance to insert data from the record that the e-mail is triggered for. This Maintenance e-mail can easily contain data from the Maintenance record, letting you create data-driven e-mails. I like to compare it to a mail-merge system: you have some fixed text, some placeholders, and some data, and the platform puts them all together to produce a personalized e-mail. By default, the message will be delivered as HTML. This means you can style your messages using image tags and font controls, among other options.

Using variable substitution
The format for substitution is ${variable}. All of the fields on the record are available as variables, so to include the Short description field in an e-mail, use ${short_description}. Additionally, you can dot-walk: by having ${assigned_to.email} in the message, you insert the e-mail address of the user that the task is assigned to. Populate the fields with the following information and save:

    Subject: Maintenance task assigned to your group
    Message HTML:
    Hello ${assignment_group}.
    Maintenance task ${number} has been assigned to your group, for room: ${u_room}.
    Description: ${description}
    Please assign to a team member here: ${URI}
    Thanks!

To make this easier, there is a Select variables section on the Message HTML and SMS alternate fields that will create the syntax in a single click. But don't forget that variable substitution is available for the Subject field too. In addition to inserting the value of fields, variable substitution also makes it easy to add HTML links:

${<reference field>.URI} will create an HTML link to the reference field, with the text LINK
${<reference field>.URI_REF} will create an HTML link, but with the display value of the record as the text
Linking to CMS sites is possible through ${CMS_URI+<site>/<page>}

Running scripts in e-mail messages
If the variables aren't giving you enough control, then, like everywhere else in ServiceNow, you can add a script. To do so, create a new entry in the Email Scripts [sys_script_email] table, which is available under System Policy > Email > Notification Email Scripts. Typical server-side capability is present, including the current GlideRecord variable. To output text, use the print function of the template object. For example:

template.print('Hello, world!');

Like a Script Include, the Name field is important. Call the script by placing ${mail_script:<name>} in the Message HTML field of the e-mail. An object called email is also available. This gives much more control over the resulting e-mail, providing functions such as setImportance, addAddress, and setReplyTo.
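To make this concrete, here is a minimal sketch of an Email Script for our Maintenance notification. The script name maintenance_details is hypothetical; the script simply prints a few extra fields using the template and current objects described above:

// Print some extra details about the Maintenance task into the message body
template.print('Room: ' + current.u_room.getDisplayValue() + '<br />');
template.print('Assignment group: ' + current.assignment_group.getDisplayValue() + '<br />');

// Only include the description if one has been entered
if (!current.description.nil()) {
    template.print('Description: ' + current.description + '<br />');
}

It would then be called from the notification by placing ${mail_script:maintenance_details} in the Message HTML field.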
The wiki has more details: http://wiki.servicenow.com/?title=Scripting_for_Email_Notifications.

Controlling the watermark
Every outbound mail contains a reference number embedded in the body of the message, in the format Ref:MSG0000100. This is very important for the inbound processing of e-mails, as discussed in a later section. Some options are available to hide or remove the watermark, but this may affect how the platform treats a reply. Navigating to System Mailboxes > Administration > Watermarks shows a full list of every watermark and the associated record and e-mail.

Including attachments and other options
There are several other options to control how an e-mail is processed:

Include Attachments: This copies any attachments from the record into the e-mail. There is no selection available: it simply duplicates each one every time. You probably wouldn't want this option ticked on many e-mails, since you will quickly fill up the recipients' inboxes! The attach_links Email Script is a good alternative: it gives HTML links that let an interested recipient download the file from the instance.
Importance: This allows a Low or High priority flag to be set on an e-mail.
From and Reply-To fields: These let you configure who the e-mail purports to be from, on a per-e-mail basis. It is important to realize that this is e-mail spoofing: while the e-mail protocols accept this, it is often used by spam to forge a false address.

Sending informational updates
Many people rely on e-mails to know what is going on. In addition to telling users when they need to do work, ServiceNow can keep everyone informed as to the current situation. This often takes the form of one of these scenarios:

Automatic e-mails, often based on a change of the State field
Completely freeform text, with or without a template
A combination of the preceding two: a textual update given by a person, but in a structured template

Sending a custom e-mail
Sometimes, you need to send an e-mail that doesn't fit into a template. Perhaps you need to attach a file, copy in additional people, or want more control over formatting. In many cases, you would turn to the e-mail client on your desktop, such as Outlook or perhaps even Lotus Notes. But the big disadvantage is that the association between the e-mail and the record is lost. Of course, you could save the e-mail and upload it as an attachment, but that isn't as good as it being part of the audit history. ServiceNow comes with a basic e-mail client built in. In fact, it just shortcuts the process: when you use the e-mail client, you are doing exactly what the Email Notifications engine would do, by generating an entry in the sys_email table.

Enabling the e-mail client
The Email Client is accessed through a little icon in the form header of a record. In order to show it, a property must be set in the Dictionary Entry of the table. Navigate to System Definition > Dictionary and find the entry for the u_maintenance table that does not have an entry in the Column name field. The value for the filter is Table - is - u_maintenance and Column name - is - empty. Click on Advanced view. Ensure the Attributes field contains email_client. Navigate to an existing Maintenance record; next to the attachments icon is an envelope icon. Click on it to open the e-mail client window. The Email Client is a simple window, and the fields should be obvious. Simply fill them out and click on Send to deliver the mail. You may have noticed that some of the fields were prepopulated.
You can control what each field initially contains by creating an Email Client Template. Navigate to System Policy > Email > Client Templates, click on New, and save a template for the appropriate table. You can use the variable substitution syntax to place the contents of fields in the e-mail. There is a Conditions field you can add to the form to have the right template used. Quick Messages are a way to let the e-mail user populate the Message Text, similar to a record template. Navigate to System Policy > Email > Quick Messages and define some text. These are then available in a dropdown selection field at the top of the e-mail client.

The e-mail client is often seized upon by customers who send a lot of e-mail. However, it is a simple solution and does not have the host of functionality that is often expected. I've found that this gap can be frustrating. For example, there isn't an easy way to include attachments from the parent record. Instead, a more automated way to send custom text is often useful.

Sending e-mails with Additional comments and Work notes
The journal fields on the task table are useful enough, allowing you to record results that are then displayed on the Activity log in a who, what, when fashion. But sending out the contents via e-mail makes them especially helpful. This lets you combine two actions in one: documenting information against the ticket and also giving an update to interested parties. The Task table has two fields that let you specify who those people are: the Watch list and the Work notes list. An e-mail notification can then use this information in a structured manner to send out the work note. It can include the contents of the work notes as well as images, styled text, and background information.

Sending out Work notes
The Work notes field should already be on the Maintenance form. Use Form Design to include the Work notes list field too, placing it somewhere appropriate, such as underneath the Assignment group field. Both the Watch list and the Work notes list are List fields (often referred to as Glide Lists). These are reference fields that can contain more than one sys_id from the sys_user table. This makes it easy to add a requester or fulfiller who is interested in updates to the ticket.

What is special about List fields is that although they point towards the sys_user table and store sys_id references, they can also store e-mail addresses in the same database field. The e-mail notification system knows all about this. It runs through the following logic:

If the entry is a sys_id, the user record is looked up and the e-mail address in the user record is used.
If it is an e-mail address, the user record is searched for. If one is found, any notification settings they have are respected. A user may turn off e-mails, for example, by setting the Notification field to Disabled in their user record.
If a user record is not found, the e-mail is sent directly to the e-mail address.

Now create a new Email Notification and fill out the following fields:

    Name: Work notes update
    Table: Maintenance [u_maintenance]
    Inserted: <ticked>
    Updated: <ticked>
    Conditions: Work notes - changes
    Users/Groups in fields: Work notes list
    Subject: New work notes update on ${number}
    Send to event creator: <ticked>
    Message:
    ${number} - ${short_description} has a new work note added.
    ${work_notes}

This simple message would normally be expanded and made to fit the corporate style guidelines, using appropriate colors and styles.
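If you need finer control over which journal entries are sent than the ${work_notes} substitution gives (its default behavior is described next), a Notification Email Script can fetch them explicitly. A minimal sketch, using the hypothetical name last_work_note, which would be referenced in the message as ${mail_script:last_work_note}:

// Print only the most recent Work notes entry, rather than the default number
var lastNote = current.work_notes.getJournalEntry(1);
template.print('<p>Latest work note:</p>');
template.print('<p>' + lastNote + '</p>');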
As noted, by default the last three entries in the Work notes field are included. If that isn't appropriate, the global property can be updated, or a mail script can use getJournalEntry(1) to grab just the last one, as sketched above. Refer to this wiki article for more information: http://wiki.servicenow.com/?title=Using_Journal_Fields#Restrict_the_Number_of_Entries_Sent_in_a_Notification.

To test, add an e-mail address or a user to the Work notes list, enter something into the Work notes field, and save. Don't forget about Send to event creator! This is a typical example of how, normally, the person performing the action wouldn't need to receive the e-mail update, since they were the one doing it. But set it so that it will work with your own updates.

Approving via e-mail
Graphical Workflow generates records that someone will need to evaluate and make a decision on. Most often, approvers will want to receive an e-mail notification to alert them to the situation. There are two approaches to sending out an e-mail when an approval is needed. An e-mail is associated with a particular record, and with approvals, there are two records to choose from:

The Approval record, asking for your decision. The response will be processed by the Graphical Workflow. The system will send out one e-mail to each person who is requested to approve it.
The Task record that generated the Approval request. The system will send out one e-mail in total.

Attaching notifications to the task is sometimes helpful, since it gives you access to all the fields on the record without dot-walking. This section deals with how the Approval record itself uses e-mail notifications.

Using the Approval table
An e-mail that is sent out from the Approval table often contains the same elements:

Some text describing what needs approving: perhaps the Short description or Priority. This is often achieved by dot-walking to the data through the Approval for reference field.
A link to view the task that needs approval.
A link to the approval record.
Two mailto links that allow the user to approve or reject through e-mail.

This style is captured in the Email Template named change.itil.approve.role and is used in an Email Notification called Approval Request against the Approval [sysapproval_approver] table. The mailto links are generated through a special syntax: ${mailto:mailto.approval} and ${mailto:mailto.rejection}. These actually refer to Email Templates themselves (navigate to System Policy > Email > Templates and find the template called mailto.approval). Altogether, these generate HTML code in the e-mail message that looks something like this:

<a href="mailto:<instance>@service-now.com?subject=Re:MAI0001001 - approve&body=Ref:MSG0000001">Click here to approve MAI0001001</a>

Normally, this URL would be encoded, but I've removed those characters for clarity. When this link is clicked in the receiver's e-mail client, it creates a new e-mail message addressed to the instance, with Re:MAI0001001 - approve in the subject line and Ref:MSG0000001 in the body. If this e-mail is sent, the instance will process it and approve the approval record. A later section, on processing inbound e-mails, shows in detail how this happens.

Testing the default approval e-mail
In the baseline system, there is an Email Notification called Approval Request. It is sent when an approval event is fired, which happens in a Business Rule on the Approval table.
It uses the e-mail template mentioned earlier, giving the recipient information and an opportunity to approve either in their web browser or using their e-mail client. If Howard Johnson is set as the manager of the Maintenance group, he will receive any approval requests generated when the Send to External button is clicked. Try changing the e-mail address in Howard's user account to your own, but ensure the Notification field is set to Enable. Then try creating some approval requests.

Specifying Notification Preferences
Every user that has access to the standard web interface can configure their own e-mail preferences through the Subscription Notification functionality. Navigate to Self-Service > My profile and click on Notification Preferences to explore what is available. It presents the Notification Messages [cmn_notif_message] table in a straightforward user interface. The Notification Preferences screen shows all the notifications that the user has received, such as the Approval Request and Work notes update configured earlier. They are organized by device; by default, every user has a primary e-mail device. To never receive a notification again, just choose the Off selection and save. This is useful if you are bombarded by e-mails and would rather use the web interface to see updates! If you want to ensure a user cannot unsubscribe, check the Mandatory field in the Email Notification definition record. You may need to add it to the form. This disables the choice, as per the Work notes update notification in the screenshot.

Subscribing to Email Notifications
The Email Notifications table has a field labeled Subscribable. If this is checked, then users can choose to receive a message every time the Email Notification record's conditions are met. This offers a different way of working: someone can decide whether they want more information, rather than the administrator deciding. Edit the Work notes update Email Notification. Switch to the Advanced view and, using Form Design, add the Subscribable field to the Who will receive section of the form. Now make the following changes. Once done, use Insert and Stay to make a copy.

    Name: Work notes update (Subscribable)
    Users/Groups in fields: <blank>
    Subscribable: <ticked>

Go to Notification Preferences and click on To subscribe to a new notification click here. The new notification can be selected from the list. Now, every time a Work note is added to any Maintenance record, a notification will be sent to the subscriber. It is important to clear Users/Groups in fields if Subscribable is ticked. Otherwise, everyone in the Work notes list will become subscribed and receive every single subsequent notification for every record!

The user can also choose to receive only a subset of the messages. The Schedule field lets them choose when to receive notifications: perhaps only during working hours. The filter lets you define conditions, such as only receiving notifications for important issues. In this instance, a Notification Filter could be created for the Maintenance table based upon the Priority field. Then, only Work notes for high-priority Maintenance tasks would be sent out.

Creating a new device
The Notification Devices [cmn_notif_device] table stores e-mail addresses for users. It allows every user to have multiple e-mail addresses, or even to register mobile phones for text messages. When a User record is created, a Business Rule named Create primary email device inserts a record in the Notification Devices table.
The value in the Email field on the User table is simply copied to this table by another Business Rule, named Update Email Devices. A new device can be added from the Notification Preferences page, or a Related List can be added to the User form. Navigate to User Administration > Users and create a new user. Once saved, you should receive a message saying Primary email device created for user (the username is displayed in place of user). Then add the Notification Device > User Related List to the form, where the e-mail address record should be visible. Click on New.

The Notification Device form allows you to enter the details of your e-mail- or SMS-capable device. Service provider is a reference field to the Notification Service Provider table, which specifies how an SMS message should be sent. If you have an account with one of the providers listed, enter your details. There are many hundreds of inactive providers in the Notification Service Provider [cmn_notif_service_provider] table. You may want to try enabling some, though many do not work, for the reasons discussed soon.

Once a device has been added, it can be set up to receive messages through Notification Preferences. For example, a user can choose to receive approval requests via a text message by adding the Approval Request Notification Message and associating their SMS device. Alternatively, they could have two e-mail addresses, with one for an assistant. If a Notification is sent to an SMS device, the contents of the SMS alternate field are used. Remember that a text message can be 160 characters at maximum. The Notification Device table has a field called Primary Email. This determines which device is used for a notification that has not been sent to this user before. Despite the name, Primary Email can be ticked for an SMS device.

Sending text messages
Many mobile phone networks in the US supply e-mail-to-SMS gateways. AT&T gives every subscriber an e-mail address in the form of 5551234567@txt.att.net. This allows the ServiceNow instance to send an e-mail and have the gateway convert it into an SMS. The Notification Service Provider form gives several options to construct the appropriate e-mail address. In this scheme, the recipient pays for the text message, so sending text messages is free. Many European providers do not offer such functionality, since the sender is responsible for paying. Therefore, it is more common to use the Web to deliver the message to the gateway, perhaps using REST or SOAP. This gives an authenticated method of communication, which allows charging.

The Notification Service Provider table also provides an Advanced notification checkbox that enables a script field. The code is run whenever the instance needs to send out an e-mail. This is a great place to call a Script Include that does the actual work, providing it with the appropriate parameters. Some global variables are present: email.SMSText contains the SMS alternate text, and device is the GlideRecord of the Notification Device. This means device.phone_number and device.user are very useful values to access.

Delivering an e-mail
There are a great many steps that the instance goes through to send an e-mail. Some may be skipped or shortcut, depending on the situation, but usually a great many steps are processed. An e-mail may not be sent if any one of these steps goes wrong!

A record is updated: Most notifications are triggered when a task changes state or a comment is added.
Use debugging techniques to determine what is changing. The next two steps may not be used if the Notification does not use events.

An event is fired: A Business Rule may fire an event. Look under System Policy > Events > Event Log to see if it was fired.
The event is processed: A Scheduled Job processes each event in turn. Look in the Event Log and ensure that all events have their state changed to Processed.
An Email Notification is processed: The event is associated with an Email Notification, or the Email Notification uses the Inserted and Updated checkboxes to monitor a table directly.
Conditions are evaluated: The platform checks the associated record and ensures the conditions are met. If not, no further processing occurs.
The receivers are evaluated: The recipients are determined from the logic in the Email Notification. The use of Send to event creator makes a big impact on this step.
The Notification Device is determined: The Notification Messages table is queried and the appropriate Notification Device is found. If the Notification Device is set to inactive, the recipient is dropped. The Notification field on the User record controls the Active flag of the Notification Devices.
Any Notification Device filters are applied: Any further conditions set in the Notification Preferences interface are evaluated, such as Schedule and Filter.
An e-mail record is generated: Variable substitution takes place on the Message Text, and a record is saved into the sys_email table with the details of the message, in the Outbox. The Email Client starts at this point.
The weight is evaluated: If an Email Notification with a lower weight has already been generated for the same event, the e-mail has its Mailbox field set to Skipped.
The email is sent: The SMTP Sender Scheduled Job runs every minute. It picks up all messages in the Outbox, generates the message ID, and connects to the SMTP server specified in the Email properties. This only occurs if Mail sending is enabled in the properties. Errors will be visible under System Mailboxes > Outbound > Failed.

The generated e-mails can be monitored in the System Mailboxes Application Menu, or through System Logs > Emails. They are categorized into Mailboxes, just like in an e-mail client. This should be considered a backend table, though some customers who want more control over e-mail notifications make it more accessible.

Knowing who the e-mail is from
ServiceNow uses one account when sending e-mails. This account is usually the one provided by ServiceNow, but it can be anything that supports SMTP: Exchange, Sendmail, NetMail, or even Gmail. The SMTP protocol lets the sender specify who the mail is from, and by default, no checks are done to ensure that the sender is allowed to send from that address. Every e-mail client lets you specify who the e-mail is from, so I could change the settings in Outlook to say my e-mail address is president@whitehouse.gov or primeminister@number10.gov.uk. Spammers and virus writers have taken advantage of this situation to fill our mailboxes with unwanted e-mails. Therefore, e-mail systems are doing more authentication and checking of addresses when a message is received. You may have seen your e-mail client saying that an e-mail has been delivered on behalf of another address when this validation fails, or the message may even fall straight into spam. ServiceNow uses SPF to specify which IP addresses can deliver service-now.com e-mails. Spam filters often use this to check whether a sender is authorized.
If you spoof the e-mail address, you may need to make an exception for ServiceNow. Read up more about it at: http://en.wikipedia.org/wiki/Sender_Policy_Framework. You may want to change the e-mail addresses on the instance to be your corporate domain. That means that your ServiceNow instance will send the message but will pretend that it is coming from another source. This runs the real risk of the e-mails being marked as spam. Instead, think about only changing the From display (not the e-mail address) or use your own e-mail account. Receiving e-mails Many systems can send e-mails. But isn't it annoying when they are broadcast only? When I get sent a message, I want to be able to reply to it. E-mail should be a conversation, not a fire-and-forget distribution mechanism. So what happens when you reply to a ServiceNow e-mail? It gets categorized, and then processed according to the settings in Inbound Email Actions. Lots of information is available on the wiki: http://wiki.servicenow.com/?title=Inbound_Email_Actions. Determining what an inbound e-mail is Every two minutes, the platform runs the POP Reader scheduled job. It connects to the e-mail account specified in the properties and pulls them all into the Email table, setting the Mailbox to be Inbox. Despite the name, the POP Reader job also supports IMAP accounts. This fires an event called email.read, which in turn starts the classification of the e-mail. It uses a series of logic decisions to determine how it should respond. The concept is that an inbound e-mail can be a reply to something that the platform has already sent out, is an e-mail that someone forwarded, or is part of an e-mail chain that the platform has not seen before; that is, it is a new e-mail. Each of these are handled differently, with different assumptions. As the first step in processing the e-mail, the platform attempts to find the sender in the User table. It takes the address that the e-mail was sent from as the key to search for. If it cannot find a User, it either creates a new User record (if the property is set), or uses the Guest account. Should this e-mail be processed at all? If either of the following conditions match, then the e-mail has the Mailbox set to skipped and no further processing takes place:     Does the subject line start with recognized text such as "out of office autoreply"?     Is the User account locked out? Is this a forward? Both of the following conditions must match, else the e-mail will be checked as a reply:     Does the subject line start with a recognized prefix (such as FW)?     Does the string "From" appear anywhere in the body? Is this a reply? One of the following conditions must match, else the e-mail will be processed as new:     Is there a valid, appropriate watermark that matches an existing record?     Is there an In-Reply-To header in the e-mail that references an e-mail sent by the instance?     Does the subject line start with a recognized prefix (such as RE) and contain a number prefix (such as MAI000100)? If none of these are affirmative, the e-mail is treated as a new e-mail. The prefixes and recognized text are controlled with properties available under System Properties > Email. This order of processing and logic cannot be changed. It is hardcoded into the platform. However, clever manipulation of the properties and prefixes allows great control over what will happen. One common request is to treat forwarded e-mails just like replies. 
To accomplish this, a nonsensical string should be added into the forward_subject_prefix, and the standard values added to the reply_subject prefix. property. For example, the following values could be used: Forward prefix: xxxxxxxxxxx Reply prefix: re:, aw:, r:, fw:, fwd:… This will ensure that a match with the forwarding prefixes is very unlikely, while the reply logic checks will be met. Creating Inbound Email Actions Once an e-mail has been categorized, it will run through the appropriate Inbound Email Action. The main purpose of an Inbound Email Action is to run JavaScript code that manipulates a target record in some way. The target record depends upon what the e-mail has been classified as: A forwarded or new e-mail will create a new record A reply will update an existing record Every Inbound Email Action is associated with a table and a condition, just like Business Rules. Since a reply must be associated with an existing record (usually found using the watermark), the platform will only look for Inbound Email Actions that are against the same table. The platform initializes the GlideRecord object current as the existing record. An e-mail classified as Reply must have an associated record, found via the watermark, the In-Reply-To header, or by running a search for a prefix stored in the sys_number table, or else it will not proceed. Forwarded and new e-mails will create new records. They will use the first Inbound Email Action that meets the condition, regardless of the table. It will then initialize a new GlideRecord object called current, expecting it to be inserted into the table. Accessing the e-mail information In order to make the scripting easier, the platform parses the e-mail and populates the properties of an object called email. Some of the more helpful properties are listed here: email.to is a comma-separated list of e-mail addresses that the e-mail was sent to and was CC'ed to. email.body_text contains the full text of the e-mail, but does not include the previous entries in the e-mail message chain. This behavior is controlled by a property. For example, anything that appears underneath two empty lines plus -----Original Message----- is ignored. email.subject is the subject line of the e-mail. email.from contains the e-mail address of the User record that the platform thinks sent the e-mail. email.origemail uses the e-mail headers to get the e-mail address of the original sender. email.body contains the body of the e-mail, separated into name:value pairs. For instance, if a line of the body was hello:world, it would be equivalent to email.body.hello = 'world'. Approving e-mails using Inbound Email Actions The previous section looked at how the platform can generate mailto links, ready for a user to select. They generate an e-mail that has the word approve or reject in the subject line and watermark in the body. This is a great example of how e-mail can be used to automate steps in ServiceNow. Approving via e-mail is often much quicker than logging in to the instance, especially if you are working remotely and are on the road. It means approvals happen faster, which in turn provides better service to the requesters and reduces the effort for our approvers. Win win! The Update Approval Request Inbound Email Action uses the information in the inbound e-mail to update the Approval record appropriately. Navigate to System Policy > Email > Inbound Actions to see what it does. 
We'll inspect a few lines of the code to get a feel for what is possible when automating actions with incoming e-mails.
Understanding the code in Update Approval Request
One of the first steps within the function, validUser, performs a check to ensure the sender is allowed to update this Approval. They must either be a delegate or the user themselves. Some companies prefer to use an e-Signature method to perform approval, where a password must be entered. This check is not up to that level, but it does go some way towards helping. E-mail addresses (and From strings) can be spoofed in an e-mail client.
Assuming the validation is passed, the Comments field of the Approval record is updated with the body of the e-mail:

current.comments = "reply from: " + email.from + "\n\n" + email.body_text;

In order to set the State field, and thus make the decision on the Approval request, the script simply searches for the existence of approve or reject within the subject line of the e-mail using the standard indexOf string function. If either is found, the state is set.

if (email.subject.indexOf("approve") >= 0)
    current.state = "approved";
if (email.subject.indexOf("reject") >= 0)
    current.state = "rejected";

Once the fields have been updated, it saves the record. This triggers the standard Business Rules and will run the Workflow as though this was done in the web interface.
Updating the Work notes of a Maintenance task
Most often, a reply to an e-mail is meant to add Additional comments or Work notes to a task. Using scripting, you could differentiate between the two scenarios by seeing who has sent the e-mail: a requester would provide Additional comments, and a fulfiller may give either, but it is safer to assume Work notes. Let's make a simple Inbound Email Action to process e-mails and populate the Work notes field. Navigate to System Policy > Email > Inbound Actions and click on New. Use these details:
Name: Work notes for Maintenance task
Target table: Maintenance [u_maintenance]
Active: <ticked>
Type: Reply
Script:

current.work_notes = "Reply from: " + email.origemail + "\n\n" + email.body_text;
current.update();

This script is very simple: it just updates our task record after setting the Work notes field with the e-mail address of the sender and the text they sent, separated out with a few new lines. The platform impersonates the sender, so the Activity Log will show the update as though it was done in the web interface. Once the record has been saved, the Business Rules run as normal. This includes ServiceNow sending out e-mails. Anyone who is in the Work notes list will receive the e-mail. If Send to event creator is ticked, it means the person who sent the e-mail may receive another in return, telling them they updated the task!
Having multiple incoming e-mail addresses
Many customers want to have logic based upon inbound e-mail addresses. For example, sending a new e-mail to invoices@gardiner-hotels.com would create a task for the Finance team, while wifi@gardiner-hotels.com creates a ticket for the Networking group. These are easy to remember and work with, and implementing ServiceNow should not mean that this simplicity is removed. ServiceNow provides a single e-mail account that is in the format instance@service-now.com and is not able to provide multiple or custom e-mail addresses.
There are two broad options for meeting this requirement: Checking multiple accounts Redirecting e-mails Using the Email Accounts plugin While ServiceNow only provides a single e-mail address, it has the ability to pull in e-mails from multiple e-mail accounts through the Email Accounts plugin. The wiki has more information here: http://wiki.servicenow.com/?title=Email_Accounts. Once the plugin has been activated, it converts the standard account information into a new Email Account [sys_email_account] record. There can be multiple Email Accounts for a particular instance, and the POP Reader job is repurposed to check each one. Once the e-mails have been brought into ServiceNow, they are treated as normal. Since ServiceNow does not provide multiple e-mail accounts, it is the customer's responsibility to create, maintain, and configure the instance with the details, including the username and passwords. The instance will need to connect to the e-mail account, which is often hosted within the customer's datacenter. This means that firewall rules or other security methods may need to be considered. Redirecting e-mails Instead of having the instance check multiple e-mail accounts, it is often preferable to continue to work with a single e-mail address. The additional e-mail addresses can be redirected to the one that ServiceNow provides. The majority of e-mail platforms, such as Microsoft Exchange, make it possible to redirect e-mail accounts. When an e-mail is received by the e-mail system, it is resent to the ServiceNow account. This process differs from e-mail forwarding: Forwarding involves adding the FW: prefix to the subject line, altering the message body, and changing the From address. Redirection sends the message unaltered, with the original To address, to the new address. There is little indication that the message has not come directly from the original sender. Redirection is often an easier method to work with than having multiple e-mail accounts. It gives more flexibility to the customer's IT team, since they do not need to provide account details to the instance, and enables them to change the redirection details easily. If a new e-mail address has to be added or an existing one decommissioned, only the e-mail platform needs to be involved. It also reduces the configuration on the ServiceNow instance; nothing needs to change. Processing multiple e-mail address Once the e-mails have been brought into ServiceNow, the platform will need to examine who the e-mail was sent to and make some decisions. This will allow the e-mails sent to wifi@gardiner-hotels.com to be routed as tasks to the networking team. There are several methods available for achieving this: A look-up table can be created, containing a list of e-mail addresses and a matching Group reference. The Inbound Email Script would use a GlideRecord query to find the right entry and populate the Assignment group on the new task. The e-mail address could be copied over into a new field on the task. Standard routing techniques, such as Assignment Rules and Data Lookup, could be used to examine the new field and populate the Assignment group. The Inbound Email Action could contain the addresses hardcoded in the script. While this is not a scalable or maintainable solution, it may be appropriate for a simple deployment. Recording Metrics ServiceNow provides several ways to monitor the progress of a task. These are often reported and e-mailed to the stakeholders, thus providing insight into the effectiveness of processes. 
Metrics are a way to record information. It allows the analysis and improvement of a process by measuring statistics, based upon particular defined criteria. Most often, these are time based. One of the most common metrics is how long it takes to complete a task: from when the record was created to the moment the Active flag became false. The duration can then be averaged out and compared over time, helping to answer questions such as Are we getting quicker at completing tasks? Metrics provide a great alternative to creating lots of extra fields and Business Rules on a table. Other metrics are more complex and may involve getting more than one result per task. How long does each Assignment group take to deal with the ticket? How long does an SLA get paused for? How many times does the incident get reassigned? The difference between Metrics and SLAs At first glance, a Metric appears to be very similar to an SLA, since they both record time. However, there are some key differences between Metrics and SLAs: There is no target or aim defined in a Metric. It cannot be breached; the duration is simply recorded. A Metric cannot be paused or made to work to a schedule. There is no Workflow associated with a Metric. In general, a Metric is a more straightforward measurement, designed for collecting statistics rather than being in the forefront when processing a task. Running Metrics Every time the Task table gets updated, the metrics events Business Rule fires an event called metric.update. A Script Action named Metric Update is associated with the event and calls the appropriate Metric Definitions. If you define a metric on a non-task-based table, make sure you fire the metric.update event through a Business Rule. The Metric Definition [metric_definition] table specifies how a metric should be recorded, while the Metric Instance [metric_instance] table records the results. As ever, each Metric Definition is applied to a specific table. The Type field of a Metric Definition refers to two situations: Field value duration is associated with a field on the table. Each time the field changes value, the platform creates a new Metric Instance. The duration for which that value was present is recorded. No code is required, but if some is given, it is used as a condition. Script calculation uses JavaScript to determine what the Metric Instance contains. Scripting a Metric Definition There are several predefined variables available to a Metric Definition: current refers to the GlideRecord under examination and definition is a GlideRecord of the Metric Definition. The MetricInstance Script Include provides some helpful functions, including startDuration and endDuration, but it is really only relevant for time-based metrics. Metrics can be used to calculate many statistics (like the number of times a task is reopened), but code must be written to accomplish this. Monitoring the duration of Maintenance tasks Navigate to Metrics > Definitions and click on New. Set the following fields: Name: Maintenance states Table: Maintenance [u_maintenance] Field: State Timeline: <ticked> Once saved, test it out by changing the State field on a Maintenance record to several different values. Make sure to wait 30 seconds or so between each State change, so that the Scheduled Job has time to fire. Right-click on the Form header and choose Metrics Timeline to visualize the changes in the State field. Adding the Metrics Related List to the Maintenance form will display all the captured data. 
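If you do define metrics on a custom, non-task table, remember the note above: the metric.update event has to be queued by your own Business Rule. The following is a minimal sketch only; the event parameters shown are assumptions, so mirror whatever the stock metrics events Business Rule on the Task table passes in your instance.

// Sketch of an after insert/update Business Rule on a non-task table.
// The parameters passed along with the event are illustrative assumptions;
// copy them from the out-of-the-box "metrics events" Business Rule.
gs.eventQueue('metric.update', current, gs.getUserID(), gs.getUserName());

With the event queued, the Metric Update Script Action takes over and evaluates any Metric Definitions against your table, just as it does for Task.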
Another Related List is available on the Maintenance Definition form. Summary This article showed how to deal with all the data collected in ServiceNow. The key to this is the automated processing of information. We started with exploring events. When things happen in ServiceNow, the platform can notice and set a flag for processing later. This keeps the system responsive for the user, while ensuring all the work that needs to get done, does get done. Scheduled Jobs is the background for a variety of functions: scheduled reports, scripts, or even task generation. They run on a periodic basis, such as every day or every hour. They are often used for the automatic closure of tasks if the requester hasn't responded recently. Email Notifications are a critical part of any business application. We explored how e-mails are used to let requesters know when they've got work to do, to give requesters a useful update, or when an approver must make a decision. We even saw how approvers can make that decision using only e-mail. Every user has a great deal of control over how they receive these notifications. The Notification Preferences interface lets them add multiple devices, including mobile phones to receive text messages. The Email Client in ServiceNow gives a simple, straightforward interface to send out e-mails, but the Additional comments and Work notes fields are often better and quicker to use. Every e-mail can include the contents of fields and even the output of scripts. Every two minutes, ServiceNow checks for e-mails sent to its account. If it finds any, the e-mail is categorized into being a reply, forward, or new and runs Inbound Email Actions to update or create new records.

article-image-regex-practice
Packt
04 Jun 2015
24 min read
Save for later

Regex in Practice

Packt
04 Jun 2015
24 min read
Knowing Regex's syntax allows you to model text patterns, but sometimes coming up with a good reliable pattern can be more difficult, so taking a look at some actual use cases can really help you learn some common design patterns. So, in this article by Loiane Groner and Gabriel Manricks, coauthors of the book JavaScript Regular Expressions, we will develop a form, and we will explore the following topics: Validating a name Validating e-mails Validating a Twitter username Validating passwords Validating URLs Manipulating text (For more resources related to this topic, see here.) Regular expressions and form validation By far, one of the most common uses for regular expressions on the frontend is for use with user submitted forms, so this is what we will be building. The form we will be building will have all the common fields, such as name, e-mail, website, and so on, but we will also experiment with some text processing besides all the validations. In real-world applications, you usually are not going to implement the parsing and validation code manually. You can create a regular expression and rely on some JavaScript libraries, such as: jQuery validation: Refer to http://jqueryvalidation.org/ Parsely.js: Refer to http://parsleyjs.org/ Even the most popular frameworks support the usage of regular expressions with its native validation engine, such as AngularJS (refer to http://www.ng-newsletter.com/posts/validations.html). Setting up the form This demo will be for a site that allows users to create an online bio, and as such, consists of different types of fields. However, before we get into this (since we won't be building a backend to handle the form), we are going to setup some HTML and JavaScript code to catch the form submission and extract/validate the data entered in it. To keep the code neat, we will create an array with all the validation functions, and a data object where all the final data will be kept. Here is a basic outline of the HTML code for which we begin by adding fields: <!DOCTYPE HTML> <html>    <head>        <title>Personal Bio Demo</title>    </head>    <body>        <form id="main_form">            <input type="submit" value="Process" />        </form>          <script>            // js goes here        </script>    </body> </html> Next, we need to write some JavaScript to catch the form and run through the list of functions that we will be writing. If a function returns false, it means that the verification did not pass and we will stop processing the form. In the event where we get through the entire list of functions and no problems arise, we will log out of the console and data object, which contain all the fields we extracted: <script>    var fns = [];    var data = {};      var form = document.getElementById("main_form");      form.onsubmit = function(e) {      e.preventDefault();          data = {};          for (var i = 0; i < fns.length; i++) {            if (fns[i]() == false) {                return;            }        }          console.log("Verified Data: ", data);    } </script> The JavaScript starts by creating the two variables I mentioned previously, we then pull the form's object from the DOM and set the submit handler. The submit handler begins by preventing a page from actually submitting, (as we don't have any backend code in this example) and then we go through the list of functions running them one by one. 
Validating fields
In this section, we will explore how to validate different types of fields manually, such as name, e-mail, website URL, and so on.
Matching a complete name
To get our feet wet, let's begin with a simple name field. It's something we have gone through briefly in the past, so it should give you an idea of how our system will work. The following code goes inside the script tags, but only after everything we have written so far:

function process_name() {
    var field = document.getElementById("name_field");
    var name = field.value;

    var name_pattern = /^(\S+) (\S*) ?\b(\S+)$/;

    if (name_pattern.test(name) === false) {
        alert("Name field is invalid");
        return false;
    }

    var res = name_pattern.exec(name);
    data.first_name = res[1];
    data.last_name = res[3];

    if (res[2].length > 0) {
        data.middle_name = res[2];
    }

    return true;
}

fns.push(process_name);

We get the name field in a similar way to how we got the form; then we extract the value and test it against a pattern that matches a full name. If the name doesn't match the pattern, we simply alert the user and return false to let the form handler know that the validation has failed. If the name field is in the correct format, we set the corresponding fields on the data object (remember, the middle name is optional here). The last line just adds this function to the array of functions, so it will be called when the form is submitted. The last thing required to get this working is to add the HTML for this form field, so inside the form tags (right before the submit button), you can add this text input:

Name: <input type="text" id="name_field" /><br />

Opening this page in your browser, you should be able to test it out by entering different values into the Name box. If you enter a valid name, you should get the data object printed out with the correct parameters; otherwise, you will see the "Name field is invalid" alert message.
Understanding the complete name Regex
Let's go back to the regular expression used to match the name entered by a user:

/^(\S+) (\S*) ?\b(\S+)$/

The following is a brief explanation of the Regex:
The ^ character asserts its position at the beginning of the string
The first capturing group (\S+): \S+ matches a non-whitespace character [^\r\n\t\f ], with the + quantifier matching between one and unlimited times
The second capturing group (\S*): \S* matches a non-whitespace character [^\r\n\t\f ], with the * quantifier matching between zero and unlimited times
" ?" matches the whitespace character, with the ? quantifier matching between zero and one time
\b asserts its position at a (^\w|\w$|\W\w|\w\W) word boundary
The third capturing group (\S+): \S+ matches a non-whitespace character [^\r\n\t\f ], with the + quantifier matching between one and unlimited times
$ asserts its position at the end of the string
Matching an e-mail with Regex
The next type of field we may want to add is an e-mail field. E-mails may look pretty simple at first glance, but there is a large variety of e-mails out there. You may just think of creating a word@word.word pattern, but the first section can contain many additional characters besides just letters, the domain can be a subdomain, or the suffix could have multiple parts (such as .co.uk for the UK). Our pattern will simply look for a group of characters that are not spaces or @ symbols for the first section. We will then want an @ symbol, followed by another set of characters that have at least one period, followed by the suffix, which in itself could contain another suffix.
So, this can be accomplished in the following manner:

/[^\s@]+@[^\s@.]+\.[^\s@]+/

The pattern of our example is very simple and will not match every valid e-mail address. There is an official standard for e-mail address regular expressions called RFC 5322. For more information, please read http://www.regular-expressions.info/email.html.
So, let's add the field to our page:

Email: <input type="text" id="email_field" /><br />

We can then add this function to verify it:

function process_email() {
    var field = document.getElementById("email_field");
    var email = field.value;

    var email_pattern = /^[^\s@]+@[^\s@.]+\.[^\s@]+$/;

    if (email_pattern.test(email) === false) {
        alert("Email is invalid");
        return false;
    }

    data.email = email;
    return true;
}

fns.push(process_email);

There is an HTML5 field type specifically designed for e-mails, but here we are verifying manually, as this is a Regex book. For more information, please refer to http://www.w3.org/TR/html-markup/input.email.html.
Understanding the e-mail Regex
Let's go back to the regular expression used to match the e-mail address entered by the user:

/^[^\s@]+@[^\s@.]+\.[^\s@]+$/

Following is a brief explanation of the Regex:
^ asserts its position at the beginning of the string
[^\s@]+ matches a single character that is not present in the following list, with the + quantifier matching between one and unlimited times: \s matches any whitespace character [\r\n\t\f ], and @ is the literal @ character
@ matches the @ character literally
[^\s@.]+ matches a single character that is not present in the following list, with the + quantifier matching between one and unlimited times: \s matches any whitespace character [\r\n\t\f ], and @ and . are the literal @ and . characters
\. matches the . character literally
[^\s@]+ matches a single character that is not present in the following list, with the + quantifier matching between one and unlimited times: \s matches any whitespace character [\r\n\t\f ], and @ is the literal @ character
$ asserts its position at the end of the string
Matching a Twitter name
The next field we are going to add is a field for a Twitter username. For the unfamiliar, a Twitter username is in the @username format, but when people enter this in, they sometimes include the preceding @ symbol and on other occasions, they only write the username by itself. Obviously, internally we would like everything to be stored uniformly, so we will need to extract the username, regardless of the @ symbol, and then manually prepend it with one, so regardless of whether it was there or not, the end result will look the same. So again, let's add a field for this:

Twitter: <input type="text" id="twitter_field" /><br />

Now, let's write the function to handle it:

function process_twitter() {
    var field = document.getElementById("twitter_field");
    var username = field.value;

    var twitter_pattern = /^@?(\w+)$/;

    if (twitter_pattern.test(username) === false) {
        alert("Twitter username is invalid");
        return false;
    }

    var res = twitter_pattern.exec(username);
    data.twitter = "@" + res[1];
    return true;
}

fns.push(process_twitter);

If a user inputs the @ symbol, it will be ignored, as we will add it manually after checking the username.
Understanding the Twitter username Regex
Let's go back to the regular expression used to match the username entered by the user:

/^@?(\w+)$/

This is a brief explanation of the Regex:
^ asserts its position at the start of the string
@? matches the @ character literally, with the ? quantifier matching between zero and one time
The first capturing group (\w+): \w+ matches a word character [a-zA-Z0-9_], with the + quantifier matching between one and unlimited times
$ asserts its position at the end of the string
Matching passwords
Another popular field, which can have some unique constraints, is a password field. Now, not every password field is interesting; you may just allow just about anything as a password, as long as the field isn't left blank. However, there are sites where you need to have at least one letter from each case, a number, and at least one other character. Considering all the ways these can be combined, creating a pattern that can validate this could be quite complex. A much better solution for this, and one that allows us to be a bit more verbose with our error messages, is to create four separate patterns and make sure the password matches each of them.
For the input, it's almost identical:

Password: <input type="password" id="password_field" /><br />

The process_password function is not very different from the previous examples, as we can see in its code as follows:

function process_password() {
    var field = document.getElementById("password_field");
    var password = field.value;

    var contains_lowercase = /[a-z]/;
    var contains_uppercase = /[A-Z]/;
    var contains_number = /[0-9]/;
    var contains_other = /[^a-zA-Z0-9]/;

    if (contains_lowercase.test(password) === false) {
        alert("Password must include a lowercase letter");
        return false;
    }

    if (contains_uppercase.test(password) === false) {
        alert("Password must include an uppercase letter");
        return false;
    }

    if (contains_number.test(password) === false) {
        alert("Password must include a number");
        return false;
    }

    if (contains_other.test(password) === false) {
        alert("Password must include a non-alphanumeric character");
        return false;
    }

    data.password = password;
    return true;
}

fns.push(process_password);

All in all, you may say that this is a pretty basic validation and something we have already covered, but I think it's a great example of working smart as opposed to working hard. Sure, we probably could have created one long pattern that would check everything together, but it would be less clear and less flexible. So, by breaking it into smaller and more manageable validations, we were able to make clear patterns, and at the same time, improve their usability with more helpful alert messages.
Matching URLs
Next, let's create a field for the user's website; the HTML for this field is:

Website: <input type="text" id="website_field" /><br />

A URL can have many different protocols, but for this example, let's restrict it to only http or https links. Next, we have the domain name with an optional subdomain, and we need to end it with a suffix. The suffix itself can be a single word, such as .com, or it can have multiple segments, such as .co.uk. All in all, our pattern looks similar to this:

/^(?:https?:\/\/)?\w+(?:\.\w+)?(?:\.[A-Z]{2,3})+$/i

Here, we are using multiple non-capture groups, both for when sections are optional and for when we want to repeat a segment. You may have also noticed that we are using the case-insensitive flag (/i) at the end of the regular expression, as links can be written in lowercase or uppercase.
Now, we'll implement the actual function:

function process_website() {
    var field = document.getElementById("website_field");
    var website = field.value;

    var pattern = /^(?:https?:\/\/)?\w+(?:\.\w+)?(?:\.[A-Z]{2,3})+$/i;

    if (pattern.test(website) === false) {
        alert("Website is invalid");
        return false;
    }

    data.website = website;
    return true;
}

fns.push(process_website);

At this point, you should be pretty familiar with the process of adding fields to our form and adding a function to validate them. So, for our remaining examples, let's shift our focus a bit from validating inputs to manipulating data.
Understanding the URL Regex
Let's go back to the regular expression used to match the URL entered by the user:

/^(?:https?:\/\/)?\w+(?:\.\w+)?(?:\.[A-Z]{2,3})+$/i

This is a brief explanation of the Regex:
^ asserts its position at the start of the string
(?:https?:\/\/)? is a non-capturing group, with the ? quantifier matching between zero and one time: http matches the characters http literally (case-insensitive); s? matches the s character literally (case-insensitive), with the ? quantifier matching between zero and one time; : matches the : character literally; \/ matches the / character literally, twice
\w+ matches a word character [a-zA-Z0-9_], with the + quantifier matching between one and unlimited times
(?:\.\w+)? is a non-capturing group, with the ? quantifier matching between zero and one time: \. matches the . character literally; \w+ matches a word character [a-zA-Z0-9_], with the + quantifier matching between one and unlimited times
(?:\.[A-Z]{2,3})+ is a non-capturing group, with the + quantifier matching between one and unlimited times: \. matches the . character literally; [A-Z]{2,3} matches a single character present in the list, with the {2,3} quantifier matching between 2 and 3 times, where A-Z is a single character in the range between A and Z (case-insensitive)
$ asserts its position at the end of the string
The i modifier: insensitive. Case-insensitive letters, meaning it will match both a-z and A-Z.
Manipulating data
We are going to add one more input to our form, which will be for the user's description. In the description, we will parse for things such as e-mails, and then create both a plain text and an HTML version of the user's description. The HTML for this form is pretty straightforward; we will be using a standard textbox and give it an appropriate field:

Description: <br />
<textarea id="description_field"></textarea><br />

Next, let's start with the bare scaffold needed to begin processing the form data:

function process_description() {
    var field = document.getElementById("description_field");
    var description = field.value;

    data.text_description = description;

    // More Processing Here

    data.html_description = "<p>" + description + "</p>";

    return true;
}

fns.push(process_description);

This code gets the text from the textbox on the page and then saves both a plain text version and an HTML version of it. At this stage, the HTML version is simply the plain text version wrapped between a pair of paragraph tags, but this is what we will be working on now. The first thing I want to do is split the text into paragraphs; in a text area, the user may have different kinds of split-ups: single lines and whole paragraphs. For our example, let's say that when the user enters a single new line character, we will add a <br /> tag, and if there is more than one character, we will create a new paragraph using the <p> tag.
Using the String.replace method
We are going to use JavaScript's replace method on the string object. This function can accept a Regex pattern as its first parameter and a function as its second; each time it finds the pattern, it will call the function, and anything returned by the function will be inserted in place of the matched text. So, for our example, we will be looking for new line characters, and in the function, we will decide whether we want to replace the new line with a break line tag or an actual new paragraph, based on how many new line characters it was able to pick up:

var line_pattern = /\n+/g;
description = description.replace(line_pattern, function(match) {
    if (match == "\n") {
        return "<br />";
    } else {
        return "</p><p>";
    }
});

The first thing you may notice is that we need to use the g flag in the pattern, so that it will look for all possible matches as opposed to only the first. Besides this, the rest is pretty straightforward. If you enter a description containing both single line breaks and blank lines into the form and look at the console output of the preceding code, you should see the single breaks replaced with <br /> tags and the blank lines starting new paragraphs.
Matching a description field
The next thing we need to do is try to extract e-mails from the text and automatically wrap them in a link tag. We have already covered a Regex pattern to capture e-mails, but we will need to modify it slightly, as our previous pattern expects that an e-mail is the only thing present in the text. In this situation, we are interested in all the e-mails included in a large body of text.
If you were simply looking for a word, you would be able to use the \b matcher, which matches any boundary (that can be the end of a word/the end of a sentence), so instead of the dollar sign, which we used before to denote the end of a string, we would place the boundary character to denote the end of a word. However, in our case it isn't quite good enough, as there are boundary characters that are valid e-mail characters; for example, the period character is valid. To get around this, we can use the boundary character in conjunction with a lookahead group and say that we want it to end with a word boundary, but only if it is followed by a space or the end of a sentence/string. This will ensure we aren't cutting off a subdomain or a part of a domain if there is some invalid information midway through the address.
Now, we aren't creating something that will try to parse e-mails no matter how they are entered; the point of creating validators and patterns is to force the user to enter something logical. That said, we assume that if the user wrote an e-mail address and then a period, he/she didn't enter an invalid address; rather, he/she entered an address and then ended a sentence (the period is not part of the address). In our code, we assume that to end an address, the user is either going to have a space after it, some kind of punctuation, or the end of the string/line. We no longer have to deal with lines because we converted them to HTML, but we do have to worry that our pattern doesn't pick up an HTML tag in the process. At the end of this, our pattern will look similar to this:

/\b[^\s<>@]+@[^\s<>@.]+\.[^\s<>@]+\b(?=.?(?:\s|<|$))/g

We start off with a word boundary; then we look for the pattern we had before. I added both the (>) greater-than and the (<) less-than characters to the group of disallowed characters, so that it will not pick up any HTML tags.
At the end of the pattern, you can see that we want to end on a word boundary, but only if it is followed by a space, an HTML tag, or the end of the string. The complete function, which does all the matching, is as follows:

function process_description() {
    var field = document.getElementById("description_field");
    var description = field.value;

    data.text_description = description;

    var line_pattern = /\n+/g;
    description = description.replace(line_pattern, function(match) {
        if (match == "\n") {
            return "<br />";
        } else {
            return "</p><p>";
        }
    });

    var email_pattern = /\b[^\s<>@]+@[^\s<>@.]+\.[^\s<>@]+\b(?=.?(?:\s|<|$))/g;
    description = description.replace(email_pattern, function(match){
        return "<a href='mailto:" + match + "'>" + match + "</a>";
    });

    data.html_description = "<p>" + description + "</p>";

    return true;
}

We can continue to add fields, but I think the point has been understood. You have a pattern that matches what you want, and with the extracted data, you are able to extract and manipulate the data into any format you may need.
Understanding the description Regex
Let's go back to the regular expression used to match e-mail addresses inside the description entered by the user:

/\b[^\s<>@]+@[^\s<>@.]+\.[^\s<>@]+\b(?=.?(?:\s|<|$))/g

This is a brief explanation of the Regex:
\b asserts its position at a (^\w|\w$|\W\w|\w\W) word boundary
[^\s<>@]+ matches a single character not present in the following list, with the + quantifier matching between one and unlimited times: \s matches a whitespace character [\r\n\t\f ], and <, >, and @ are the literal <, >, and @ characters (case-sensitive)
@ matches the @ character literally
[^\s<>@.]+ matches a single character not present in the following list, with the + quantifier matching between one and unlimited times: \s matches any whitespace character [\r\n\t\f ], and <, >, @, and . are the literal <, >, @, and . characters (case-sensitive)
\. matches the . character literally
[^\s<>@]+ matches a single character not present in the following list, with the + quantifier matching between one and unlimited times: \s matches a whitespace character [\r\n\t\f ], and <, >, and @ are the literal <, >, and @ characters (case-sensitive)
\b asserts its position at a (^\w|\w$|\W\w|\w\W) word boundary
(?=.?(?:\s|<|$)) is a positive lookahead, asserting that the Regex that follows can be matched: .? matches any character (except new line), with the ? quantifier matching between zero and one time; (?:\s|<|$) is a non-capturing group with three alternatives: \s matches any whitespace character [\r\n\t\f ], < matches the character < literally, and $ asserts its position at the end of the string
The g modifier: global match. It returns all matches of the regular expression, not only the first one
Explaining a Markdown example
More examples of regular expressions can be seen with the popular Markdown syntax (refer to http://en.wikipedia.org/wiki/Markdown). This is a situation where a user is forced to write things in a custom format, although it's still a format that saves typing and is easier to understand. For example, to create a link in Markdown, you would type something similar to this:

[Click Me](http://gabrielmanricks.com)

This would then be converted to:

<a href="http://gabrielmanricks.com">Click Me</a>

Disregarding any validation on the URL itself, this can easily be achieved using this pattern:

/\[([^\]]*)\]\(([^(]*)\)/g

It looks a little complex, because both the square brackets and the parentheses are special characters that need to be escaped.
Basically, what we are saying is that we want an open square bracket, anything up to the closing square bracket, then an open parenthesis, and again, anything until the closing parenthesis.
A good website to write Markdown documents is http://dillinger.io/.
Since we wrapped each section into its own capture group, we can write this function:

text.replace(/\[([^\]]*)\]\(([^(]*)\)/g, function(match, text, link){
    return "<a href='" + link + "'>" + text + "</a>";
});

We haven't been using capture groups in our manipulation examples, but if you use them, then the first parameter to the callback is the entire match (similar to the ones we have been working with) and then all the individual groups are passed as subsequent parameters, in the order in which they appear in the pattern.
Summary
In this article, we covered a couple of examples that showed us how to both validate user inputs and manipulate them. We also took a look at some common design patterns and saw how it's sometimes better to simplify the problem instead of using brute force in one pattern for the purpose of creating validations.
Resources for Article:
Further resources on this subject:
Getting Started with JSON [article]
Function passing [article]
YUI Test [article]
article-image-getting-started-multiplayer-game-programming
Packt
04 Jun 2015
33 min read
Save for later

Getting Started with Multiplayer Game Programming

Packt
04 Jun 2015
33 min read
In this article by Rodrigo Silveira author of the book Multiplayer gaming with HTML5 game development, if you're reading this, chances are pretty good that you are already a game developer. That being the case, then you already know just how exciting it is to program your own games, either professionally or as a highly gratifying hobby that is very time-consuming. Now you're ready to take your game programming skills to the next level—that is, you're ready to implement multiplayer functionality into your JavaScript-based games. (For more resources related to this topic, see here.) In case you have already set out to create multiplayer games for the Open Web Platform using HTML5 and JavaScript, then you may have already come to realize that a personal desktop computer, laptop, or a mobile device is not particularly the most appropriate device to share with another human player for games in which two or more players share the same game world at the same time. Therefore, what is needed in order to create exciting multiplayer games with JavaScript is some form of networking technology. We will be discussing the following principles and concepts: The basics of networking and network programming paradigms Socket programming with HTML5 Programming a game server and game clients Turn-based multiplayer games Understanding the basics of networking It is said that one cannot program games that make use of networking without first understanding all about the discipline of computer networking and network programming. Although having a deep understanding of any topic can be only beneficial to the person working on that topic, I don't believe that you must know everything there is to know about game networking in order to program some pretty fun and engaging multiplayer games. Saying that is the case is like saying that one needs to be a scholar of the Spanish language in order to cook a simple burrito. Thus, let us take a look at the most basic and fundamental concepts of networking. After you finish reading this article, you will know enough about computer networking to get started, and you will feel comfortable adding multiplayer aspects to your games. One thing to keep in mind is that, even though networked games are not nearly as old as single-player games, computer networking is actually a very old and well-studied subject. Some of the earliest computer network systems date back to the 1950s. Though some of the techniques have improved over the years, the basic idea remains the same: two or more computers are connected together to establish communication between the machines. By communication, I mean data exchange, such as sending messages back and forth between the machines, or one of the machines only sends the data and the other only receives it. With this brief introduction to the concept of networking, you are now grounded in the subject of networking, enough to know what is required to network your games—two or more computers that talk to each other as close to real time as possible. By now, it should be clear how this simple concept makes it possible for us to connect multiple players into the same game world. In essence, we need a way to share the global game data among all the players who are connected to the game session, then continue to update each player about every other player. There are several different techniques that are commonly used to achieve this, but the two most common approaches are peer-to-peer and client-server. 
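Before we look at the two architectures, here is a sketch of what "updating each player about every other player" can look like on a single client. The shape of the data is purely an assumption made for illustration; the idea is simply that every client keeps its own copy of the shared world and overwrites it with whatever the latest update says.

// A hypothetical snapshot of the shared game world, keyed by player ID.
var snapshot = {
    "player-1": { x: 120, y: 48, score: 3 },
    "player-2": { x: 64,  y: 90, score: 5 }
};

// Each client keeps its own copy of the world and simply replaces
// each entry with the most recent data it has received for that player.
var world = {};

function applyUpdate(update) {
    for (var id in update) {
        world[id] = update[id];
    }
}

applyUpdate(snapshot);

How that snapshot travels between machines is exactly what the following networking models decide.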
Both techniques present different opportunities, including advantages and disadvantages. In general, neither is particularly better than the other, but different situations and use cases may be better suited for one or the other technique. Peer-to-peer networking A simple way to connect players into the same virtual game world is through the peer-to-peer architecture. Although the name might suggest that only two peers ("nodes") are involved, by definition a peer-to-peer network system is one in which two or more nodes are connected directly to each other without a centralized system orchestrating the connection or information exchange. On a typical peer-to-peer setup, each peer serves the same function as every other one—that is, they all consume the same data and share whatever data they produce so that others can stay synchronized. In the case of a peer-to-peer game, we can illustrate this architecture with a simple game of Tic-tac-toe. Once both the players have established a connection between themselves, whoever is starting the game makes a move by marking a cell on the game board. This information is relayed across the wire to the other peer, who is now aware of the decision made by his or her opponent, and can thus update their own game world. Once the second player receives the game's latest state that results from the first player's latest move, the second player is able to make a move of their own by checking some available space on the board. This information is then copied over to the first player who can update their own world and continue the process by making the next desired move. The process goes on until one of the peers disconnects or the game ends as some condition that is based on the game's own business logic is met. In the case of the game of Tic-tac-toe, the game would end once one of the players has marked three spaces on the board forming a straight line or if all nine cells are filled, but neither player managed to connect three cells in a straight path. Some of the benefits of peer-to-peer networked games are as follows: Fast data transmission: Here, the data goes directly to its intended target. In other architectures, the data could go to some centralized node first, then the central node (or the "server") contacts the other peer, sending the necessary updates. Simpler setup: You would only need to think about one instance of your game that, generally speaking, handles its own input, sends its input to other connected peers, and handles their output as input for its own system. This can be especially handy in turn-based games, for example, most board games such as Tic-tac-toe. More reliability: Here one peer that goes offline typically won't affect any of the other peers. However, in the simple case of a two-player game, if one of the players is unable to continue, the game will likely cease to be playable. Imagine, though, that the game in question has dozens or hundreds of connected peers. If a handful of them suddenly lose their Internet connection, the others can continue to play. However, if there is a server that is connecting all the nodes and the server goes down, then none of the other players will know how to talk to each other, and nobody will know what is going on. On the other hand, some of the more obvious drawbacks of peer-to-peer architecture are as follows: Incoming data cannot be trusted: Here, you don't know for sure whether or not the sender modified the data. 
The data that is input into a game server will also suffer from the same challenge, but once the data is validated and broadcasted to all the other peers, you can be more confident that the data received by each peer from the server will have at least been sanitized and verified, and will be more credible. Fault tolerance can be very low: If enough players share the game world, one or more crashes won't make the game unplayable to the rest of the peers. Now, if we consider the many cases where any of the players that suddenly crash out of the game negatively affect the rest of the players, we can see how a server could easily recover from the crash. Data duplication when broadcasting to other peers: Imagine that your game is a simple 2D side scroller, and many other players are sharing that game world with you. Every time one of the players moves to the right, you receive the new (x, y) coordinates from that player, and you're able to update your own game world. Now, imagine that you move your player to the right by a very few pixels; you would have to send that data out to all of the other nodes in the system. Overall, peer-to-peer is a very powerful networking architecture and is still widely used by many games in the industry. Since current peer-to-peer web technologies are still in their infancy, most JavaScript-powered games today do not make use of peer-to-peer networking. For this and other reasons that should become apparent soon, we will focus almost exclusively on the other popular networking paradigm, namely, the client-server architecture. Client-server networking The idea behind the client-server networking architecture is very simple. If you squint your eyes hard enough, you can almost see a peer-to-peer graph. The most obvious difference between them, is that, instead of every node being an equal peer, one of the nodes is special. That is, instead of every node connecting to every other node, every node (client) connects to a main centralized node called the server. While the concept of a client-server network seems clear enough, perhaps a simple metaphor might make it easier for you to understand the role of each type of node in this network format as well as differentiate it from peer-to-peer . In a peer-to-peer network, you can think of it as a group of friends (peers) having a conversation at a party. They all have access to all the other peers involved in the conversation and can talk to them directly. On the other hand, a client-server network can be viewed as a group of friends having dinner at a restaurant. If a client of the restaurant wishes to order a certain item from the menu, he or she must talk to the waiter, who is the only person in that group of people with access to the desired products and the ability to serve the products to the clients. In short, the server is in charge of providing data and services to one or more clients. In the context of game development, the most common scenario is when two or more clients connect to the same server; the server will keep track of the game as well as the distributed players. Thus, if two players are to exchange information that is only pertinent to the two of them, the communication will go from the first player to and through the server and will end up at the other end with the second player. Following the example of the two players involved in a game of Tic-tac-toe, we can see how similar the flow of events is on a client-server model. 
Again, the main difference is that players are unaware of each other and only know what the server tells them. While you can very easily mimic a peer-to-peer model by using a server to merely connect the two players, most often the server is used much more actively than that. There are two ways to engage the server in a networked game, namely in an authoritative and a non-authoritative way. That is to say, you can have the enforcement of the game's logic strictly in the server, or you can have the clients handle the game logic, input validation, and so on. Today, most games using the client-server architecture actually use a hybrid of the two (authoritative and non-authoritative servers). For all intents and purposes, however, the server's purpose in life is to receive input from each of the clients and distribute that input throughout the pool of connected clients. Now, regardless of whether you decide to go with an authoritative server instead of a non-authoritative one, you will notice that one of challenges with a client-server game is that you will need to program both ends of the stack. You will have to do this even if your clients do nothing more than take input from the user, forward it to the server, and render whatever data they receive from the server; if your game server does nothing more than forward the input that it receives from each client to every other client, you will still need to write a game client and a game server. We will discuss game clients and servers later. For now, all we really need to know is that these two components are what set this networking model apart from peer-to-peer. Some of the benefits of client-server networked games are as follows: Separation of concerns: If you know anything about software development, you know that this is something you should always aim for. That is, good, maintainable software is written as discrete components where each does one "thing", and it is done well. Writing individual specialized components lets you focus on performing one individual task at a time, making your game easier to design, code, test, reason, and maintain. Centralization: While this can be argued against as well as in favor of, having one central place through which all communication must flow makes it easier to manage such communication, enforce any required rules, control access, and so forth. Less work for the client: Instead of having a client (peer) in charge of taking input from the user as well as other peers, validating all the input, sharing data among other peers, rendering the game, and so on, the client can focus on only doing a few of these things, allowing the server to offload some of this work. This is particularly handy when we talk about mobile gaming, and how much subtle divisions of labor can impact the overall player experience. For example, imagine a game where 10 players are engaged in the same game world. In a peer-to-peer setup, every time one player takes an action, he or she would need to send that action to nine other players (in other words, there would need to be nine network calls, boiling down to more mobile data usage). On the other hand, on a client-server configuration, one player would only need to send his or her action to one of the peers, that is, the server, who would then be responsible for sending that data to the remaining nine players. 
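As a sketch of that last point, here is roughly what a minimal, non-authoritative relay server could look like. The article has not introduced any server-side technology yet, so treat the choice of Node.js, the third-party ws package (npm install ws), and the port number purely as illustrative assumptions: the server accepts a message from one client and forwards it to every other connected client.

// relay-server.js -- a minimal, non-authoritative relay (sketch only).
var WebSocket = require("ws");
var wss = new WebSocket.Server({ port: 8080 });

wss.on("connection", function (socket) {
    socket.on("message", function (data) {
        // Forward the sender's input to every other connected client.
        wss.clients.forEach(function (client) {
            if (client !== socket && client.readyState === WebSocket.OPEN) {
                client.send(data.toString());
            }
        });
    });
});

A real game server would usually also validate the input before broadcasting it, which is where the authoritative and non-authoritative approaches discussed next start to differ.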
Common drawbacks of client-server architectures, whether or not the server is authoritative, are as follows:

Communication takes longer to propagate: In the very best possible scenario imaginable, every message sent from the first player to the second player would take twice as long to be delivered as compared to a peer-to-peer connection. That is, the message would first be sent from the first player to the server and then from the server to the second player. There are many techniques that are used today to solve the latency problem faced in this scenario, some of which we will discuss in much more depth later. However, the underlying dilemma will always be there.

More complexity due to more moving parts: It doesn't really matter how you slice the pizza; the more code you need to write (and trust me, when you build two separate modules for a game, you will write more code), the greater your mental model will have to be. While much of your code can be reused between the client and the server (especially if you use well-established programming techniques, such as object-oriented programming), at the end of the day, you need to manage a greater level of complexity.

Single point of failure and network congestion: Up until now, we have mostly discussed the case where only a handful of players participates in the same game. However, the more common case is that many groups of players play different games at the same time. Using the same example of the two-player game of Tic-tac-toe, imagine that there are thousands of players facing each other in individual games. In a peer-to-peer setup, once a couple of players have directly paired off, it is as though there are no other players enjoying that game. The only thing keeping these two players from continuing their game is their own connection with each other. On the other hand, if the same thousands of players are connected to each other through a server sitting between the two, then two singled-out players might notice severe delays between messages because the server is so busy handling all of the messages from and to all of the other people playing isolated games. Worse yet, these two players not only need to worry about maintaining their own connection with each other through the server, but they also have to hope that the server's connection between them and their opponent will remain active.

All in all, many of the challenges involved in client-server networking are well studied and understood, and many of the problems you're likely to face during your multiplayer game development will already have been solved by someone else. Client-server is a very popular and powerful game networking model, and the required technology for it, which is available to us through HTML5 and JavaScript, is well developed and widely supported.

Networking protocols – UDP and TCP

By discussing some of the ways in which your players can talk to each other across some form of network, we have so far only skimmed over how that communication is actually done. Let us then describe what protocols are and how they apply to networking and, more importantly, multiplayer game development.

The word protocol can be defined as a set of conventions or a detailed plan of a procedure [Citation [Def. 3,4]. (n.d.). In Merriam Webster Online, Retrieved February 12, 2015, from http://www.merriam-webster.com/dictionary/protocol]. In computer networking, a protocol describes to the receiver of a message how the data is organized so that it can be decoded.
For example, imagine that you have a multiplayer beat 'em up game, and you want to tell the game server that your player just issued a kick command and moved 3 units to the left. What exactly do you send to the server? Do you send a string with a value of "kick", followed by the number 3? Or do you send the number first, followed by a capitalized letter "K", indicating that the action taken was a kick? The point I'm trying to make is that, without a well-understood and agreed-upon protocol, it is impossible to successfully and predictably communicate with another computer.

The two networking protocols that we'll discuss in this section, which are also the two most widely used protocols in multiplayer networked games, are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). Both protocols provide communication services between clients in a network system. In simple terms, they are protocols that allow us to send and receive packets of data in such a way that the data can be identified and interpreted in a predictable way.

When data is sent through TCP, the application running in the source machine first establishes a connection with the destination machine. Once a connection has been established, data is transmitted in packets in such a way that the receiving application can then put the data back together in the appropriate order. TCP also provides built-in error checking mechanisms so that, if a packet is lost, the target application can notify the sender application, and any missing packets are sent again until the entire message is received. In short, TCP is a connection-based protocol that guarantees the delivery of the full data in the correct order.

Use cases where this behavior is desirable are all around us. When you download a game from a web server, for example, you want to make sure that the data comes in correctly. You want to be sure that your game assets will be properly and completely downloaded before your users start playing your game. While this guarantee of delivery may sound very reassuring, it also makes TCP a relatively slow process, and, as we'll see shortly, speed can sometimes matter more than knowing that the data will arrive in full.

In contrast, UDP transmits packets of data (called datagrams) without the use of a pre-established connection. The main goal of the protocol is to be a very fast and frictionless way of sending data towards some target application. In essence, you can think of UDP as the brave employees who dress up as their company's mascot and stand outside their store waving a large banner in the hope that at least some of the people driving by will see them and give them their business.

While at first UDP may seem like a reckless protocol, the use cases that make it so desirable and effective include the many situations when you care more about speed than about occasionally missing packets, getting duplicate packets, or getting them out of order. You may also want to choose UDP over TCP when you don't care about the reply from the receiver. With TCP, whether or not you need some form of confirmation or reply from the receiver of your message, it will still take the time to reply back to you, at least acknowledging that the message was received. Sometimes, you may not care whether or not the server received the data.

A more concrete example of a scenario where UDP is a far better choice than TCP is when you need a heartbeat from the client letting the server know whether the player is still there.
If you need to let your server know that the session is still active every so often, and you don't care if one of the heartbeats gets lost every now and again, then it would be wise to use UDP. In short, for any data that is not mission-critical and that you can afford to lose, UDP might be the best option.

In closing, keep in mind that, just as peer-to-peer and client-server models can be built side by side, and in the same way that your game server can be a hybrid of authoritative and non-authoritative, there is absolutely no reason why your multiplayer games should only use TCP or UDP. Use whichever protocol a particular situation calls for.

Network sockets

There is one other protocol that we'll cover very briefly, but only so that you can see the need for network sockets in game development. As a JavaScript programmer, you are doubtlessly familiar with Hypertext Transfer Protocol (HTTP). This is the protocol in the application layer that web browsers use to fetch your games from a web server. While HTTP is a great protocol for reliably retrieving documents from web servers, it was not designed to be used in real-time games; therefore, it is not ideal for this purpose.

The way HTTP works is very simple: a client sends a request to a server, which then returns a response to the client. The response includes a completion status code, indicating to the client that the request is either in process, needs to be forwarded to another address, or has finished successfully or erroneously. There are a handful of things to note about HTTP that will make it clear that a better protocol is needed for real-time communication between the client and server.

Firstly, after each response is received by the requester, the connection is closed. Thus, before making each and every request, a new connection must be established with the server. Most of the time, an HTTP request will be sent through TCP, which, as we've seen, can be slow, relatively speaking.

Secondly, HTTP is by design a stateless protocol. This means that, every time you request a resource from a server, the server has no idea who you are and what the context of the request is. (It doesn't know whether this is your first request ever or if you're a frequent requester.) A common solution to this problem is to include a unique string with every HTTP request that the server keeps track of, and can thus provide information about each individual client on an ongoing basis. You may recognize this as a standard session. The major downside of this solution, at least with regard to real-time gaming, is that mapping a session cookie to the user's session takes additional time.

Finally, the major factor that makes HTTP unsuitable for multiplayer game programming is that the communication is one-way: only the client can connect to the server, and the server replies back through the same connection. In other words, the game client can tell the game server that a punch command has been entered by the user, but the game server cannot pass that information along to other clients. Think of it like a vending machine. As a client of the machine, we can request specific items that we wish to buy. We formalize this request by inserting money into the vending machine, and then we press the appropriate button. Under no circumstance will a vending machine issue commands to a person standing nearby. That would be like a vending machine dispensing food first and expecting people to deposit the money inside it afterwards.
The answer to this lack of functionality in HTTP is pretty straightforward. A network socket is an endpoint in a connection that allows for two-way communication between the client and the server. Think of it more like a telephone call than a vending machine. During a telephone call, either party can say whatever they want at any given time. Most importantly, the connection between both parties remains open throughout the duration of the conversation, making the communication process highly efficient.

WebSocket is a protocol built on top of TCP that allows web-based applications to have two-way communication with a server. The way a WebSocket is created consists of several steps, including a protocol upgrade from HTTP to WebSocket. Thankfully, all of the heavy lifting is done behind the scenes by the browser and JavaScript. For now, the key takeaway here is that with a TCP socket (yes, there are other types of socket, including UDP sockets), we can reliably communicate with a server, and the server can talk back to us as needed.

Socket programming in JavaScript

Let's now bring the conversation about network connections, protocols, and sockets to a close by talking about the tools, JavaScript and WebSockets, that bring everything together, allowing us to program awesome multiplayer games in the language of the open Web.

The WebSocket protocol

Modern browsers and other JavaScript runtime environments have implemented the WebSocket protocol in JavaScript. Don't make the mistake of thinking that just because we can create WebSocket objects in JavaScript, WebSockets are part of JavaScript. The standard that defines the WebSocket protocol is language-agnostic and can be implemented in any programming language. Thus, before you start to deploy your JavaScript games that make use of WebSockets, ensure that the environment that will run your game uses an implementation of the ECMA standard that also implements WebSockets. In other words, not all browsers will know what to do when you ask for a WebSocket connection.

For the most part, though, the latest versions, as of this writing, of the most popular browsers today (namely, Google Chrome, Safari, Mozilla Firefox, Opera, and Internet Explorer) implement the current latest revision of RFC 6455. Previous versions of WebSockets (such as protocol versions 76, 7, or 10) are slowly being deprecated and have been removed by some of the previously mentioned browsers.

Probably the most confusing thing about the WebSocket protocol is the way each version of the protocol is named. The very first draft (which dates back to 2010) was named draft-hixie-thewebsocketprotocol-75. The next version was named draft-hixie-thewebsocketprotocol-76. Some people refer to these versions as 75 and 76, which can be quite confusing, especially since the fourth version of the protocol is named draft-ietf-hybi-thewebsocketprotocol-07, which is named in the draft as WebSocket Version 7. The current version of the protocol (RFC 6455) is 13.

Let us take a quick look at the programming interface (API) that we'll use within our JavaScript code to interact with a WebSocket server. Keep in mind that we'll need to write both the JavaScript clients that use WebSockets to consume data as well as the WebSocket server, which uses WebSockets but plays the role of the server. The difference between the two will become apparent as we go over some examples, starting with a brief sketch of the server side below.
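As a first taste of the server side, here is a minimal sketch of a broadcast-style WebSocket server written for Node.js. It assumes the third-party ws package (installed with npm install ws), which is one common choice but by no means the only one, and the port number and relay logic are purely illustrative.

// server.js - a minimal broadcast server sketch (assumes the npm "ws" package)
var WebSocketServer = require('ws').Server;
var wss = new WebSocketServer({ port: 8080 });

wss.on('connection', function (socket) {
  socket.on('message', function (data) {
    // Relay whatever one client sends to every other connected client.
    wss.clients.forEach(function (client) {
      if (client !== socket && client.readyState === 1 /* OPEN */) {
        client.send(data);
      }
    });
  });
});

console.log('WebSocket server listening on ws://localhost:8080');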
Creating a client-side WebSocket

The following code snippet creates a new object of type WebSocket that connects the client to some backend server. The constructor takes two parameters; the first is required and represents the URL where the WebSocket server is running and expecting connections. The second parameter is an optional list of sub-protocols that the server may implement.

var socket = new WebSocket('ws://www.game-domain.com');

Although this one line of code may seem simple and harmless enough, here are a few things to keep in mind:

We are no longer in HTTP territory. The address to your WebSocket server now starts with ws:// instead of http://. Similarly, when we work with secure (encrypted) sockets, we would specify the server's URL as wss://, just like in https://.

It may seem obvious to you, but a common pitfall that those getting started with WebSockets fall into is that, before you can establish a connection with the previous code, you need a WebSocket server running at that domain.

WebSockets implement the same-origin security model. As you may have already seen with other HTML5 features, the same-origin policy states that you can only access a resource through JavaScript if both the client and the server are in the same domain. For those who are not familiar with the same-domain (also known as the same-origin) policy, the three things that constitute a domain, in this context, are the protocol, host, and port of the resource being accessed. In the previous example, the protocol, host, and port number were, respectively, ws (and not wss, http, or ssh), www.game-domain.com (any other sub-domain, such as game-domain.com or beta.game-domain.com, would violate the same-origin policy), and 80 (by default, WebSocket connects to port 80, and to port 443 when it uses wss). Since the server in the previous example binds to port 80, we don't need to explicitly specify the port number. However, had the server been configured to run on a different port, say 2667, then the URL string would need to include a colon followed by the port number at the end of the host name, such as ws://www.game-domain.com:2667.

As with everything else in JavaScript, WebSocket instances attempt to connect to the backend server asynchronously. Thus, you should not attempt to issue commands on your newly created socket until you're sure that the connection has been established; otherwise, JavaScript will throw an error that may crash your entire game. This can be done by registering a callback function on the socket's onopen event as follows:

var socket = new WebSocket('ws://www.game-domain.com');
socket.onopen = function(event) {
   // socket ready to send and receive data
};

Once the socket is ready to send and receive data, you can send messages to the server by calling the socket object's send method, which takes a string as the message to be sent.

// Assuming a connection was previously established
socket.send('Hello, WebSocket world!');

Most often, however, you will want to send more meaningful data to the server, such as objects, arrays, and other data structures that have more meaning on their own. In these cases, we can simply serialize our data as JSON strings.

var player = {
   nickname: 'Juju',
   team: 'Blue'
};

socket.send(JSON.stringify(player));

Now, the server can receive that message and work with it as the same object structure that the client sent, by running it through the parse method of the JSON object.
var player = JSON.parse(event.data);
player.nickname === 'Juju'; // true
player.team === 'Blue'; // true
player.id === undefined; // true

If you look at the previous example closely, you will notice that we extract the message that is sent through the socket from the data attribute of some event object. Where did that event object come from, you ask? Good question! The way we receive messages from the socket is the same on both the client and server sides of the socket. We must simply register a callback function on the socket's onmessage event, and the callback will be invoked whenever a new message is received. The argument passed into the callback function will contain an attribute named data, which will contain the raw string with the message that was sent.

socket.onmessage = function(event) {
   event instanceof MessageEvent; // true

   var msg = JSON.parse(event.data);
};

Other events on the socket object on which you can register callbacks include onerror, which is triggered whenever an error related to the socket occurs, and onclose, which is triggered whenever the state of the socket changes to CLOSED; in other words, whenever the server closes the connection with the client for any reason or the connected client closes its connection.

As mentioned previously, the socket object will also have a property called readyState, which behaves in a similar manner to the equally named attribute in AJAX objects (or, more appropriately, XMLHttpRequest objects). This attribute represents the current state of the connection and can have one of four values at any point in time. This value is an unsigned integer between 0 and 3, inclusive of both numbers. For clarity, there are four accompanying constants on the WebSocket class that map to the four numerical values of the instance's readyState attribute. The constants are as follows:

WebSocket.CONNECTING: This has a value of 0 and means that the connection between the client and the server has not yet been established.

WebSocket.OPEN: This has a value of 1 and means that the connection between the client and the server is open and ready for use. Whenever the object's readyState attribute changes from CONNECTING to OPEN, which will only happen once in the object's life cycle, the onopen callback will be invoked.

WebSocket.CLOSING: This has a value of 2 and means that the connection is being closed.

WebSocket.CLOSED: This has a value of 3 and means that the connection is now closed (or could not be opened to begin with).

Once the readyState has changed to a new value, it will never return to a previous state in the same instance of the socket object. Thus, if a socket object is CLOSING or has already become CLOSED, it will never OPEN again. In this case, you would need a new instance of WebSocket if you would like to continue to communicate with the server.

To summarize, let us bring together the simple WebSocket API features that we discussed previously and create a convenient function that simplifies data serialization, error checking, and error handling when communicating with the game server:

function sendMsg(socket, data) {
   if (socket.readyState === WebSocket.OPEN) {
     socket.send(JSON.stringify(data));

     return true;
   }

   return false;
};

Game clients

Earlier, we talked about the architecture of a multiplayer game that was based on the client-server pattern. Since this is the approach we will take for the games that we'll be developing, let us define some of the main roles that the game client will fulfill.
From a higher level, a game client will be the interface between the human player and the rest of the game universe (which includes the game server and the other human players who are connected to it). Thus, the game client will be in charge of taking input from the player, communicating this to the server, receiving any further instructions and information from the server, and then rendering the final output to the human player again. Depending on the type of game server used, the client can be more sophisticated than just an input application that renders static data received from the server. For example, the client could very well simulate what the game server will do and present the result of this simulation to the user while the server performs the real calculations and tells the results to the client. The biggest selling point of this technique is that the game will seem a lot more dynamic and real-time to the user, since the client responds to input almost instantly.

Game servers

The game server is primarily responsible for connecting all the players to the same game world and keeping the communication going between them. However, as you will soon realize, there may be cases where you will want the server to be more sophisticated than a routing application. For example, just because one of the players is telling the server to inform the other participants that the game is over, and that the player sending the message is the winner, we may still want to confirm the information before deciding that the game is in fact over.

With this idea in mind, we can label the game server as being one of two kinds: authoritative or non-authoritative. In an authoritative game server, the game's logic is actually running in memory (although it normally doesn't render any graphical output like the game clients certainly will) all the time. As each client reports information to the server by sending messages through its corresponding socket, the server updates the current game state and sends the updates back to all of the players, including the original sender. This way, we can be more certain that any data coming from the server has been verified and is accurate.

In a non-authoritative server, the clients take on a much more involved part in enforcing the game logic, which means placing a lot more trust in the clients. As suggested previously, what we can do is take the best of both worlds and create a mix of the two techniques. What we will do is have a strictly authoritative server, but clients that are smart and can do some of the work on their own. Since the server has the ultimate say in the game, however, any messages received by clients from the server are considered the ultimate truth and supersede any conclusions the client came to on its own.

Summary

Overall, we discussed the basics of networking and network programming paradigms. We saw how WebSockets makes it possible to develop real-time, multiplayer games in HTML5. Finally, we implemented a simple game client and game server using widely supported web technologies and built a fun game of Tic-tac-toe.

Resources for Article:

Further resources on this subject: HTML5 Game Development – A Ball-shooting Machine with Physics Engine [article] Creating different font files and using web fonts [article] HTML5 Canvas [article]

The Splunk Web Framework

Packt
04 Jun 2015
10 min read
In this article by the author, Kyle Smith, of the book, Splunk Developer's Guide, we learn about search-related and view-related modules. We will be covering the following topics:

Search-related modules
View-related modules

(For more resources related to this topic, see here.)

Search-related modules

Let's talk JavaScript modules. For each module, we will review their primary purpose, their module path, the default variable used in an HTML dashboard, and the JavaScript instantiation of the module. We will also cover which attributes are required and which are optional.

SearchManager

The SearchManager is a primary driver of any dashboard. This module contains an entire search job, including the query, properties, and the actual dispatch of the job. Let's instantiate an object, and dissect the options from this sample code:

Module Path: splunkjs/mvc/searchmanager
Default Variable: SearchManager
JavaScript Object instantiation:

   var mySearchManager = new SearchManager({
       id: "search1",
       earliest_time: "-24h@h",
       latest_time: "now",
       preview: true,
       cache: false,
       search: "index=_internal | stats count by sourcetype"
   }, {tokens: true, tokenNamespace: "submitted"});

The only required property is the id property. This is a reference ID that will be used to access this object from other instantiated objects later in the development of the page. It is best to name it something concise, yet descriptive, with no spaces. The search property is optional, and contains the SPL query that will be dispatched from the module. Make sure to escape any quotes properly; if not, you may cause a JavaScript exception. earliest_time and latest_time are time modifiers that restrict the range of the events. At the end of the options object, notice the second object with token references. This is what automatically executes the search. Without these options, you would have to trigger the search manually. There are a few other properties shown, but you can refer to the actual documentation at the main documentation page http://docs.splunk.com/DocumentationStatic/WebFramework/1.1/compref_searchmanager.html. SearchManagers are set to autostart on page load. To prevent this, set autostart to false in the options.

SavedSearchManager

The SavedSearchManager is very similar in operation to the SearchManager, but works with a saved report instead of an ad hoc query. The advantage of using a SavedSearchManager is in performance. If the report is scheduled, you can configure the SavedSearchManager to use the previously run jobs to load the data. If any other user runs the report within Splunk, the SavedSearchManager can reuse that user's results in the manager to boost performance. Let's take a look at a few sections of our code:

Module Path: splunkjs/mvc/savedsearchmanager
Default Variable: SavedSearchManager
JavaScript Object instantiation:

       var mySavedSearchManager = new SavedSearchManager({
           id: "savedsearch1",
           searchname: "Saved Report 1",
           "dispatch.earliest_time": "-24h@h",
           "dispatch.latest_time": "now",
           preview: true,
           cache: true
       });

The only two required properties are id and searchname. Both of these must be present in order for this manager to run correctly. The other options are very similar to the SearchManager, except for the dispatch options. The SearchManager has the option "earliest_time", whereas the SavedSearchManager uses the option "dispatch.earliest_time".
They both have the same restriction but are named differently. The additional options are listed on the main documentation page http://docs.splunk.com/DocumentationStatic/WebFramework/1.1/compref_savedsearchmanager.html.

PostProcessManager

The PostProcessManager does just that: it post-processes the results of a main search. This works in the same way as the post processing done in SimpleXML; a main search to load the event set, and a secondary search to perform additional analysis and transformation. Using this manager has its own performance considerations as well. By loading a single job first, and then performing additional commands on those results, you avoid having concurrent searches for the same information. Your usage of CPU and RAM will be less, as you only store one copy of the results, instead of multiple copies.

Module Path: splunkjs/mvc/postprocessmanager
Default Variable: PostProcessManager
JavaScript Object instantiation:

       var mySecondarySearch = new PostProcessManager({
           id: "after_search1",
           search: "stats count by sourcetype",
           managerid: "search1"
       });

The property id is the only required property. The module won't do anything when instantiated with only an id property, but you can set it up to populate later. The other options are similar to the SearchManager, the major difference being that the search property in this case is appended to the search property of the manager listed in the managerid property. For example, if the manager search is search index=_internal source=*splunkd.log, and the post process manager search is stats count by host, then the entire search for the post process manager is search index=_internal source=*splunkd.log | stats count by host. The additional options are listed on the main documentation page http://docs.splunk.com/DocumentationStatic/WebFramework/1.1/compref_postprocessmanager.html.

View-related modules

These modules are related to the views and data visualizations that are native to Splunk. They range in use from charts that display data to control groups, such as radio groups or dropdowns. These are also included with Splunk and are included by default in the RequireJS declaration.

ChartView

The ChartView displays a series of data in the formats listed later in this section, where each chart type is described and presented in the same way. Each ChartView is instantiated in the same way; the only difference is in which searches are used with which chart.

Module Path: splunkjs/mvc/chartview
Default Variable: ChartView
JavaScript Object instantiation:

       var myBarChart = new ChartView({
           id: "myBarChart",
           managerid: "searchManagerId",
           type: "bar",
           el: $("#mybarchart")
       });

The only required property is the id property. This assigns the object an id that can be later referenced as needed. The el option refers to the HTML element on the page that this view will be assigned to and created within. The managerid relates to an existing search, saved search, or post process manager object. The results are passed from the manager into the chart view and displayed as indicated. Each chart view can be customized extensively using the charting.* properties. For example, charting.chart.overlayFields, when set to a comma-separated list of field names, will overlay those fields over the chart of other data, making it possible to display SLA times over the top of Customer Service Metrics. A minimal sketch of wiring a search manager to a chart view is shown below.
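As an illustration, here is one way these pieces might be wired together on an HTML dashboard. It follows the module paths given above, but the element ID, the search itself, and the charting option shown are hypothetical, and the exact RequireJS setup can differ between Splunk versions.

require([
    "splunkjs/mvc/searchmanager",
    "splunkjs/mvc/chartview",
    "splunkjs/mvc/simplexml/ready!"
], function(SearchManager, ChartView) {

    // Dispatch a simple search when the page loads.
    new SearchManager({
        id: "sourcetypeSearch",
        earliest_time: "-24h@h",
        latest_time: "now",
        search: "index=_internal | stats count by sourcetype"
    });

    // Render the results of that search as a bar chart.
    new ChartView({
        id: "sourcetypeChart",
        managerid: "sourcetypeSearch",
        type: "bar",
        "charting.legend.placement": "none",
        el: $("#sourcetypeChart")
    }).render();
});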
The full list of configurable options can be found at the following link: http://docs.splunk.com/Documentation/Splunk/latest/Viz/ChartConfigurationReference. The different types of ChartView Now that we've introduced the ChartView module, let's look at the different types of charts that are built-in. This section has been presented in the following format: Name of Chart Short description of the chart type Type property for use in the JavaScript configuration Example chart command that can be displayed with this chart type Example image of the chart The different ChartView types we will cover in this section include the following: Area The area chart is similar to the line chart, and compares quantitative data. The graph is filled with color to show volume. This is commonly used to show statistics of data over time. An example of an area chart is as follows: timechart span=1h max(results.collection1{}.meh_clicks) as MehClicks max(results.collection1{}.visitors) as Visits Bar The bar chart is similar to the column chart, except that the x and y axes have been switched, and the bars run horizontally and not vertically. The bar chart is used to compare different categories. An example of a bar chart is as follows: stats max(results.collection1{}.visitors) as Visits max(results.collection1{}.meh_clicks) as MehClicks by results.collection1{}.title.text Column The column chart is similar to the bar chart, but the bars are displayed vertically. An example of a column chart is as follows: timechart span=1h avg(DPS) as "Difference in Products Sold" Filler gauge The filler gauge is a Splunk-provided visualization. It is intended for single values, normally as a percentage, but can be adjusted to use discrete values as well. The gauge uses different colors for different ranges of values, by default using green, yellow, and red, in that order. These colors can also be changed using the charting.* properties. One of the differences between this gauge and the other single value gauges is that it shows both the color and value close together, whereas the others do not. An example of a filler gauge chart is as follows: eval diff = results.collection1{}.meh_clicks / results.collection1{}.visitors * 100 | stats latest(diff) as D Line The line chart is similar to the area chart but does not fill the area under the line. This chart can be used to display discrete measurements over time. An example of a line chart is as follows: timechart span=1h max(results.collection1{}.meh_clicks) as MehClicks max(results.collection1{}.visitors) as Visits Marker gauge The marker gauge is a Splunk native visualization intended for use with a single value. Normally this will be a percentage of a value, but can be adjusted as needed. The gauge uses different colors for different ranges of values, by default using green, yellow, and red, in that order. These colors can also be changed using the charting.* properties. An example of a marker gauge chart is as follows: eval diff = results.collection1{}.meh_clicks / results.collection1{}.visitors * 100 | stats latest(diff) as D Pie Chart A pie chart is useful for displaying percentages. It gives you the ability to quickly see which part of the "pie" is disproportionate to the others. Actual measurements may not be relevant. An example of a pie chart is as follows: top op_action Radial gauge The radial gauge is another single value chart provided by Splunk. It is normally used to show percentages, but can be adjusted to show discrete values. 
The gauge uses different colors for different ranges of values, by default using green, yellow, and red, in that order. These colors can also be changed using the charting.* properties. An example of a radial gauge is as follows: eval diff = MC / V * 100 | stats latest(diff) as D

Scatter

The scatter plot can plot two sets of data on an x and y axis chart (Cartesian coordinates). This chart is primarily time independent, and is useful for finding correlations (but not necessarily causation) in data. An example of a scatter plot is as follows: table MehClicks Visitors

Summary

We covered some deeper elements of Splunk applications and visualizations. We reviewed each of the SplunkJS modules covered here, showed how to instantiate them, and gave an example of each of the search-related and view-related modules.

Resources for Article:

Further resources on this subject: Introducing Splunk [article] Lookups [article] Loading data, creating an app, and adding dashboards and reports in Splunk [article]

Getting Started with Hyper-V Architecture and Components

Packt
04 Jun 2015
19 min read
In this article by Vinícius R. Apolinário, author of the book Learning Hyper-V, we will cover the following topics: Hypervisor architecture Type 1 and 2 Hypervisors Microkernel and Monolithic Type 1 Hypervisors Hyper-V requirements and processor features Memory configuration Non-Uniform Memory Access (NUMA) architecture (For more resources related to this topic, see here.) Hypervisor architecture If you've used Microsoft Virtual Server or Virtual PC, and then moved to Hyper-V, I'm almost sure that your first impression was: "Wow, this is much faster than Virtual Server". You are right. And there is a reason why Hyper-V performance is much better than Virtual Server or Virtual PC. It's all about the architecture. There are two types of Hypervisor architectures. Hypervisor Type 1, like Hyper-V and ESXi from VMware, and Hypervisor Type 2, like Virtual Server, Virtual PC, VMware Workstation, and others. The objective of the Hypervisor is to execute, manage and control the operation of the VM on a given hardware. For that reason, the Hypervisor is also called Virtual Machine Monitor (VMM). The main difference between these Hypervisor types is the way they operate on the host machine and its operating systems. As Hyper-V is a Type 1 Hypervisor, we will cover Type 2 first, so we can detail Type 1 and its benefits later. Type 1 and Type 2 Hypervisors Hypervisor Type 2, also known as hosted, is an implementation of the Hypervisor over and above the OS installed on the host machine. With that, the OS will impose some limitations to the Hypervisor to operate, and these limitations are going to reflect on the performance of the VM. To understand that, let me explain how a process is placed on the processor: the processor has what we call Rings on which the processes are placed, based on prioritization. The main Rings are 0 and 3. Kernel processes are placed on Ring 0 as they are vital to the OS. Application processes are placed on Ring 3, and, as a result, they will have less priority when compared to Ring 0. The issue on Hypervisors Type 2 is that it will be considered an application, and will run on Ring 3. Let's have a look at it: As you can see from the preceding diagram, the hypervisor has an additional layer to access the hardware. Now, let's compare it with Hypervisor Type 1: The impact is immediate. As you can see, Hypervisor Type 1 has total control of the underlying hardware. In fact, when you enable Virtualization Assistance (hardware-assisted virtualization) at the server BIOS, you are enabling what we call Ring -1, or Ring decompression, on the processor and the Hypervisor will run on this Ring. The question you might have is "And what about the host OS?" If you install the Hyper-V role on a Windows Server for the first time, you may note that after installation, the server will restart. But, if you're really paying attention, you will note that the server will actually reboot twice. This behavior is expected, and the reason it will happen is because the OS is not only installing and enabling Hyper-V bits, but also changing its architecture to the Type 1 Hypervisor. In this mode, the host OS will operate in the same way a VM does, on top of the Hypervisor, but on what we call parent partition. The parent partition will play a key role as the boot partition and in supporting the child partitions, or guest OS, where the VMs are running. The main reason for this partition model is the key attribute of a Hypervisor: isolation. 
For Microsoft Hyper-V Server you don't have to install the Hyper-V role, as it will be installed when you install the OS, so you won't be able to see the server booting twice. With isolation, you can ensure that a given VM will never have access to another VM. That means that if you have a compromised VM, with isolation, the VM will never infect another VM or the host OS. The only way a VM can access another VM is through the network, like all other devices in your network. Actually, the same is true for the host OS. This is one of the reasons why you need an antivirus for the host and the VMs, but this will be discussed later. The major difference between Type 1 and Type 2 now is that kernel processes from both host OS and VM OS will run on Ring 0. Application processes from both host OS and VM OS will run on Ring 3. However, there is one piece left. The question now is "What about device drivers?" Microkernel and Monolithic Type 1 Hypervisors Have you tried to install Hyper-V on a laptop? What about an all-in-one device? A PC? A server? An x64 based tablet? They all worked, right? And they're supposed to work. As Hyper-V is a Microkernel Type 1 Hypervisor, all the device drivers are hosted on the parent partition. A Monolithic Type 1 Hypervisor hosts its drivers on the Hypervisor itself. VMware ESXi works this way. That's why you should never use a standard ESXi media to install an ESXi host. The hardware manufacturer will provide you with an appropriate media with the correct drivers for the specific hardware. The main advantage of the Monolithic Type 1 Hypervisor is that, as it always has the correct driver installed, you will never have a performance issue due to an incorrect driver. On the other hand, you won't be able to install this on any device. The Microkernel Type 1 Hypervisor, on the other hand, hosts its drivers on the parent partition. That means that if you installed the host OS on a device, and the drivers are working, the Hypervisor, and in this case Hyper-V, will work just fine. There are other hardware requirements. These will be discussed later in this article. The other side of this is that if you use a generic driver, or a wrong version of it, you may have performance issues, or even driver malfunction. What you have to keep in mind here is that Microsoft does not certify drivers for Hyper-V. Device drivers are always certified for Windows Server. If the driver is certified for Windows Server, it is also certified for Hyper-V. But you always have to ensure the use of correct driver for a given hardware. Let's take a better look at how Hyper-V works as a Microkernel Type 1 Hypervisor: As you can see from the preceding diagram, there are multiple components to ensure that the VM will run perfectly. However, the major component is the Integration Components (IC), also called Integration Services. The IC is a set of tools that you should install or upgrade on the VM, so that the VM OS will be able to detect the virtualization stack and run as a regular OS on a given hardware. To understand this more clearly, let's see how an application accesses the hardware and understand all the processes behind it. When the application tries to send a request to the hardware, the kernel is responsible for interpreting this call. As this OS is running on an Enlightened Child Partition (Means that IC is installed), the Kernel will send this call to the Virtual Service Client (VSC) that operates as a synthetic device driver. 
The VSC is responsible for communicating with the Virtual Service Provider (VSP) on the parent partition, through VMBus, so the VSC can use the hardware resource. The VMBus will then be able to communicate with the hardware for the VM. The VMBus, a channel-based communication, is actually responsible for communicating with the parent partition and hardware. For the VMBus to access the hardware, it will communicate directly with a component on the Hypervisor called hypercalls. These hypercalls are then redirected to the hardware. However, only the parent partition can actually access the physical processor and memory. The child partitions access a virtual view of these components that are translated on the guest and the host partitions. New processors have a feature called Second Level Address Translation (SLAT) or Nested Paging. This feature is extremely important on high performance VMs and hosts, as it helps reduce the overhead of the virtual to physical memory and processor translation. On Windows 8, SLAT is a requirement for Hyper-V. It is important to note that Enlightened Child Partitions, or partitions with IC, can be Windows or Linux OS. If the child partitions have a Linux OS, the name of the component is Linux Integration Services (LIS), but the operation is actually the same. Another important fact regarding ICs is that they are already present on Windows Server 2008 or later. But, if you are running a newer version of Hyper-V, you have to upgrade the IC version on the VM OS. For example, if you are running Hyper-V 2012 R2 on the host OS and the guest OS is running Windows Server 2012 R2, you probably don't have to worry about it. But if you are running Hyper-V 2012 R2 on the host OS and the guest OS is running Windows Server 2012, then you have to upgrade the IC on the VM to match the parent partition version. Running guest OS Windows Server 2012 R2 on a VM on top of Hyper-V 2012 is not recommended. For Linux guest OS, the process is the same. Linux kernel version 3 or later already have LIS installed. If you are running an old version of Linux, you should verify the correct LIS version of your OS. To confirm the Linux and LIS versions, you can refer to an article at http://technet.microsoft.com/library/dn531030.aspx. Another situation is when the guest OS does not support IC or LIS, or an Unenlightened Child Partition. In this case, the guest OS and its kernel will not be able to run as an Enlightened Child Partition. As the VMBus is not present in this case, the utilization of hardware will be made by emulation and performance will be degraded. This only happens with old versions of Windows and Linux, like Windows 2000 Server, Windows NT, and CentOS 5.8 or earlier, or in case that the guest OS does not have or support IC. Now that you understand how the Hyper-V architecture works, you may be thinking "Okay, so for all of this to work, what are the requirements?" Hyper-V requirements and processor features At this point, you can see that there is a lot of effort for putting all of this to work. In fact, this architecture is only possible because hardware and software companies worked together in the past. The main goal of both type of companies was to enable virtualization of operating systems without changing them. Intel and AMD created, each one with its own implementation, a processor feature called virtualization assistance so that the Hypervisor could run on Ring 0, as explained before. But this is just the first requirement. 
There are other requirements as well, which are as follows:

Virtualization assistance (also known as Hardware-assisted virtualization): This feature was created to remove the necessity of changing the OS for virtualizing it. On Intel processors, it is known as Intel VT-x. All recent processor families support this feature, including Core i3, Core i5, and Core i7. The complete list of processors and features can be found at http://ark.intel.com/Products/VirtualizationTechnology. You can also use this tool to check if your processor meets this requirement, which can be downloaded at: https://downloadcenter.intel.com/Detail_Desc.aspx?ProductID=1881&DwnldID=7838. On AMD processors, this technology is known as AMD-V. Like Intel, all recent processor families support this feature. AMD provides a tool to check processor compatibility that can be downloaded at http://www.amd.com/en-us/innovations/software-technologies/server-solution/virtualization.

Data Execution Prevention (DEP): This is a security feature that marks memory pages as either executable or nonexecutable. For Hyper-V to run, this option must be enabled in the System BIOS. On Intel-based processors, this feature is called Execute Disable bit (Intel XD bit), and on AMD-based processors it is called No Execute Bit (AMD NX bit). This configuration will vary from one System BIOS to another. Check with your hardware vendor how to enable it in the System BIOS.

x64 (64-bit) based processor: This processor feature uses 64-bit memory addresses. Although you may find that all new processors are x64, you might want to check if this is true before starting your implementation. The compatibility checkers above, from Intel and AMD, will show you if your processor is x64.

Second Level Address Translation (SLAT): As discussed before, SLAT is not a requirement for Hyper-V to work. This feature provides much more performance on the VMs as it removes the need for translating physical and virtual pages of memory. It is highly recommended to have the SLAT feature on the processor, as it provides more performance on high performance systems. As also discussed before, SLAT is a requirement if you want to use Hyper-V on Windows 8. To check if your processor has the SLAT feature, use the Sysinternals tool Coreinfo, which can be downloaded at http://technet.microsoft.com/en-us/sysinternals/cc835722.aspx.

There are some specific processor features that are not used exclusively for virtualization, but when the VM is initiated, it will use these specific features from the processor. If the VM is initiated and these features are allocated on the guest OS, you can't simply remove them. This is a problem if you are going to Live Migrate this VM from one host to another; if these specific features are not available, you won't be able to perform the operation. At this moment, you have to understand that Live Migration moves a powered-on VM from one host to another. If you try to Live Migrate a VM between hosts with different processor types, you may be presented with an error. Live Migration is only permitted between the same processor vendor: Intel-Intel or AMD-AMD. Intel-AMD Live Migration is not allowed under any circumstance. If the processor is the same on both hosts, Live Migration and Share Nothing Live Migration will work without problems. But even within the same vendor, there can be different processor families. In this case, you can remove these specific features from the Virtual Processor presented to the VM. To do that, open Hyper-V Manager | Settings... 
| Processor | Processor Compatibility. Mark the Migrate to a physical computer with a different processor version option. This option is only available if the VM is powered off. Keep in mind that enabling this option will remove processor-specific features for the VM. If you are going to run an application that requires these features, they will not be available and the application may not run. Now that you have checked all the requirements, you can start planning your server for virtualization with Hyper-V. This is true from the perspective that you understand how Hyper-V works and what are the requirements for it to work. But there is another important subject that you should pay attention to when planning your server: memory. Memory configuration I believe you have heard this one before "The application server is running under performance". In the virtualization world, there is an obvious answer to it: give more virtual hardware to the VM. Although it seems to be the logical solution, the real effect can be totally opposite. During the early days, when servers had just a few sockets, processors, and cores, a single channel made the communication between logical processors and memory. But server hardware has evolved, and today, we have servers with 256 logical processors and 4 TB of RAM. To provide better communication between these components, a new concept emerged. Modern servers with multiple logical processors and high amount of memory use a new design called Non-Uniform Memory Access (NUMA) architecture. Non-Uniform Memory Access (NUMA) architecture NUMA is a memory design that consists of allocating memory to a given node, or a cluster of memory and logical processors. Accessing memory from a processor inside the node is notably faster than accessing memory from another node. If a processor has to access memory from another node, the performance of the process performing the operation will be affected. Basically, to solve this equation you have to ensure that the process inside the guest VM is aware of the NUMA node and is able to use the best available option: When you create a virtual machine, you decide how many virtual processors and how much virtual RAM this VM will have. Usually, you assign the amount of RAM that the application will need to run and meet the expected performance. For example, you may ask a software vendor on the application requirements and this software vendor will say that the application would be using at least 8 GB of RAM. Suppose you have a server with 16 GB of RAM. What you don't know is that this server has four NUMA nodes. To be able to know how much memory each NUMA node has, you must divide the total amount of RAM installed on the server by the number of NUMA nodes on the system. The result will be the amount of RAM of each NUMA node. In this case, each NUMA node has a total of 4 GB of RAM. Following the instructions of the software vendor, you create a VM with 8 GB of RAM. The Hyper-V standard configuration is to allow NUMA spanning, so you will be able to create the VM and start it. Hyper-V will accommodate 4 GB of RAM on two NUMA nodes. This NUMA spanning configuration means that a processor can access the memory on another NUMA node. As mentioned earlier, this will have an impact on the performance if the application is not aware of it. On Hyper-V, prior to the 2012 version, the guest OS was not informed about the NUMA configuration. 
Basically, in this case, the guest OS would see one NUMA node with 8 GB of RAM, and the allocation of memory would be made without NUMA restrictions, impacting the final performance of the application. Hyper-V 2012 and 2012 R2 have the same feature—the guest OS will see the virtual NUMA (vNUMA) presented to the child partition. With this feature, the guest OS and/or the application can make a better choice on where to allocate memory for each process running on this VM. NUMA is not a virtualization technology. In fact, it has been used for a long time, and even applications like SQL Server 2005 already used NUMA to better allocate the memory that its processes are using. Prior to Hyper-V 2012, if you wanted to avoid this behavior, you had two choices: Create the VM and allocate the maximum vRAM of a single NUMA node for it, as Hyper-V will always try to allocate the memory inside of a single NUMA node. In the above case, the VM should not have more than 4 GB of vRAM. But for this configuration to really work, you should also follow the next choice. Disable NUMA Spanning on Hyper-V. With this configuration disabled, you will not be able to run a VM if the memory configuration exceeds a single NUMA node. To do this, you should clear the Allow virtual machines to span physical NUMA nodes checkbox on Hyper-V Manager | Hyper-V Settings... | NUMA Spanning. Keep in mind that disabling this option will prevent you from running a VM if no nodes are available. You should also remember that even with Hyper-V 2012, if you create a VM with 8 GB of RAM using two NUMA nodes, the application on top of the guest OS (and the guest OS) must understand the NUMA topology. If the application and/or guest OS are not NUMA aware, vNUMA will not have effect and the application can still have performance issues. At this point you are probably asking yourself "How do I know how many NUMA nodes I have on my server?" This was harder to find in the previous versions of Windows Server and Hyper-V Server. In versions prior to 2012, you should open the Performance Monitor and check the available counters in Hyper-V VM Vid NUMA Node. The number of instances represents the number of NUMA Nodes. In Hyper-V 2012, you can check the settings for any VM. Under the Processor tab, there is a new feature available for NUMA. Let's have a look at this screen to understand what it represents: In Configuration, you can easily confirm how many NUMA nodes the host running this VM has. In the case above, the server has only 1 NUMA node. This means that all memory will be allocated close to the processor. Multiple NUMA nodes are usually present on servers with high amount of logical processors and memory. In the NUMA topology section, you can ensure that this VM will always run with the informed configuration. This is presented to you because of a new Hyper-V 2012 feature called Share Nothing Live Migration, which will be explained in detail later. This feature allows you to move a VM from one host to another without turning the VM off, with no cluster and no shared storage. As you can move the VM turned on, you might want to force the processor and memory configuration, based on the hardware of your worst server, ensuring that your VM will always meet your performance expectations. The Use Hardware Topology button will apply the hardware topology in case you moved the VM to another host or in case you changed the configuration and you want to apply the default configuration again. 
To summarize, if you want to make sure that your VM will not have performance problems, check how many NUMA nodes your server has and divide the total amount of installed memory by that number; the result is the amount of memory available on each node. Creating a VM with more memory than a single node holds will make Hyper-V present a vNUMA topology to the guest OS. It is equally important to ensure that the guest OS and applications are NUMA aware, so that they can use this information to allocate memory for each process on the correct node. Understanding NUMA helps you avoid problems caused by host configuration and by misconfiguration of the VM. Even so, in some cases, despite careful planning of the VM size, you will reach a point where the VM's memory is under pressure. In these cases, Hyper-V can help with another feature called Dynamic Memory. Summary In this article, we learned about the Hypervisor architecture and the different Hypervisor types, and briefly explored Microkernel and Monolithic Type 1 Hypervisors. In addition, this article explained the Hyper-V requirements and processor features, memory configuration, and the NUMA architecture. Resources for Article: Further resources on this subject: Planning a Compliance Program in Microsoft System Center 2012 [Article] So, what is Microsoft © Hyper-V server 2008 R2? [Article] Deploying Applications and Software Updates on Microsoft System Center 2012 Configuration Manager [Article]

Installing jQuery

Packt
04 Jun 2015
25 min read
 In this article by Alex Libby, author of the book Mastering jQuery, we will examine some of the options available to help develop your skills even further. (For more resources related to this topic, see here.) Local or CDN, I wonder…? Which version…? Do I support old IE…? Installing jQuery is a thankless task that has to be done countless times by any developer—it is easy to imagine that person asking some of the questions. It is easy to imagine why most people go with the option of using a Content Delivery Network (CDN) link, but there is more to installing jQuery than taking the easy route! There are more options available, where we can be really specific about what we need to use—throughout this article, we will. We'll cover a number of topics, which include: Downloading and installing jQuery Customizing jQuery downloads Building from Git Using other sources to install jQuery Adding source map support Working with Modernizr as a fallback Intrigued? Let's get started. Downloading and installing jQuery As with all projects that require the use of jQuery, we must start somewhere—no doubt you've downloaded and installed jQuery a thousand times; let's just quickly recap to bring ourselves up to speed. If we browse to http://www.jquery.com/download, we can download jQuery using one of the two methods: downloading the compressed production version or the uncompressed development version. If we don't need to support old IE (IE6, 7, and 8), then we can choose the 2.x branch. If, however, you still have some diehards who can't (or don't want to) upgrade, then the 1.x branch must be used instead. To include jQuery, we just need to add this link to our page: <script src="http://code.jquery.com/jquery-X.X.X.js"></script> Here, X.X.X marks the version number of jQuery or the Migrate plugin that is being used in the page. Conventional wisdom states that the jQuery plugin (and this includes the Migrate plugin too) should be added to the <head> tag, although there are valid arguments to add it as the last statement before the closing <body> tag; placing it here may help speed up loading times to your site. This argument is not set in stone; there may be instances where placing it in the <head> tag is necessary and this choice should be left to the developer's requirements. My personal preference is to place it in the <head> tag as it provides a clean separation of the script (and the CSS) code from the main markup in the body of the page, particularly on lighter sites. I have even seen some developers argue that there is little perceived difference if jQuery is added at the top, rather than at the bottom; some systems, such as WordPress, include jQuery in the <head> section too, so either will work. The key here though is if you are perceiving slowness, then move your scripts to just before the <body> tag, which is considered a better practice. Using jQuery in a development capacity A useful point to note at this stage is that best practice recommends that CDN links should not be used within a development capacity; instead, the uncompressed files should be downloaded and referenced locally. Once the site is complete and is ready to be uploaded, then CDN links can be used. Adding the jQuery Migrate plugin If you've used any version of jQuery prior to 1.9, then it is worth adding the jQuery Migrate plugin to your pages. 
The jQuery Core team made some significant changes to jQuery from this version; the Migrate plugin will temporarily restore the functionality until such time that the old code can be updated or replaced. The plugin adds three properties and a method to the jQuery object, which we can use to control its behavior: Property or Method Comments jQuery.migrateWarnings This is an array of string warning messages that have been generated by the code on the page, in the order in which they were generated. Messages appear in the array only once even if the condition has occurred multiple times, unless jQuery.migrateReset() is called. jQuery.migrateMute Set this property to true in order to prevent console warnings from being generated in the debugging version. If this property is set, the jQuery.migrateWarnings array is still maintained, which allows programmatic inspection without console output. jQuery.migrateTrace Set this property to false if you want warnings but don't want traces to appear on the console. jQuery.migrateReset() This method clears the jQuery.migrateWarnings array and "forgets" the list of messages that have been seen already. Adding the plugin is equally simple—all you need to do is add a link similar to this, where X represents the version number of the plugin that is used: <script src="http://code.jquery.com/jquery-migrate- X.X.X.js"></script> If you want to learn more about the plugin and obtain the source code, then it is available for download from https://github.com/jquery/jquery-migrate. Using a CDN We can equally use a CDN link to provide our jQuery library—the principal link is provided by MaxCDN for the jQuery team, with the current version available at http://code.jquery.com. We can, of course, use CDN links from some alternative sources, if preferred—a reminder of these is as follows: Google (https://developers.google.com/speed/libraries/devguide#jquery) Microsoft (http://www.asp.net/ajaxlibrary/cdn.ashx#jQuery_Releases_on_the_CDN_0) CDNJS (http://cdnjs.com/libraries/jquery/) jsDelivr (http://www.jsdelivr.com/#%!jquery) Don't forget though that if you need, we can always save a copy of the file provided on CDN locally and reference this instead. The jQuery CDN will always have the latest version, although it may take a couple of days for updates to appear via the other links. Using other sources to install jQuery Right. Okay, let's move on and develop some code! "What's next?" I hear you ask. Aha! If you thought downloading and installing jQuery from the main site was the only way to do this, then you are wrong! After all, this is about mastering jQuery, so you didn't think I will only talk about something that I am sure you are already familiar with, right? Yes, there are more options available to us to install jQuery than simply using the CDN or main download page. Let's begin by taking a look at using Node. Each demo is based on Windows, as this is the author's preferred platform; alternatives are given, where possible, for other platforms. Using Node JS to install jQuery So far, we've seen how to download and reference jQuery, which is to use the download from the main jQuery site or via a CDN. The downside of this method is the manual work required to keep our versions of jQuery up to date! Instead, we can use a package manager to help manage our assets. Node.js is one such system. 
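Returning briefly to the Migrate plugin before we look at package managers: here is a small sketch of how the diagnostic properties listed in the table above might be used during development. The version numbers in the script URLs are only examples; substitute the versions you are actually working with:

<script src="http://code.jquery.com/jquery-1.11.1.min.js"></script>
<script src="http://code.jquery.com/jquery-migrate-1.2.1.min.js"></script>
<script>
// Keep collecting warnings, but stop the plugin writing them to the console
jQuery.migrateMute = true;

jQuery(function () {
  // Review anything the plugin flagged while the page initialized
  if (jQuery.migrateWarnings.length) {
    console.log("jQuery Migrate warnings:", jQuery.migrateWarnings);
    jQuery.migrateReset(); // clear the list once it has been reviewed
  }
});
</script>

Checks like this belong in development builds only; once the offending code has been rewritten, both the plugin and the diagnostics should be removed. With that noted, let's get back to installing jQuery through a package manager.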
Let's take a look at the steps that need to be performed in order to get jQuery installed: We first need to install Node.js—head over to http://www.nodejs.org in order to download the package for your chosen platform; accept all the defaults when working through the wizard (for Mac and PC). Next, fire up a Node Command Prompt and then change to your project folder. In the prompt, enter this command: npm install jquery Node will fetch and install jQuery—it displays a confirmation message when the installation is complete: You can then reference jQuery by using this link: <name of drive>:websitenode_modulesjquerydistjquery.min.js. Node is now installed and ready for use—although we've installed it in a folder locally, in reality, we will most likely install it within a subfolder of our local web server. For example, if we're running WampServer, we can install it, then copy it into the /wamp/www/js folder, and reference it using http://localhost/js/jquery.min.js. If you want to take a look at the source of the jQuery Node Package Manager (NPM) package, then check out https://www.npmjs.org/package/jquery. Using Node to install jQuery makes our work simpler, but at a cost. Node.js (and its package manager, NPM) is primarily aimed at installing and managing JavaScript components and expects packages to follow the CommonJS standard. The downside of this is that there is no scope to manage any of the other assets that are often used within websites, such as fonts, images, CSS files, or even HTML pages. "Why will this be an issue?," I hear you ask. Simple, why make life hard for ourselves when we can manage all of these assets automatically and still use Node? Installing jQuery using Bower A relatively new addition to the library is the support for installation using Bower—based on Node, it's a package manager that takes care of the fetching and installing of packages from over the Internet. It is designed to be far more flexible about managing the handling of multiple types of assets (such as images, fonts, and CSS files) and does not interfere with how these components are used within a page (unlike Node). For the purpose of this demo, I will assume that you have already installed it; if not, you will need to revisit it before continuing with the following steps: Bring up the Node Command Prompt, change to the drive where you want to install jQuery, and enter this command: bower install jquery This will download and install the script, displaying the confirmation of the version installed when it has completed. The library is installed in the bower_components folder on your PC. It will look similar to this example, where I've navigated to the jquery subfolder underneath. By default, Bower will install jQuery in its bower_components folder. Within bower_components/jquery/dist/, we will find an uncompressed version, compressed release, and source map file. We can then reference jQuery in our script using this line: <script src="/bower_components/jquery/jquery.js"></script> We can take this further though. If we don't want to install the extra files that come with a Bower installation by default, we can simply enter this in a Command Prompt instead to just install the minified version 2.1 of jQuery: bower install http://code.jquery.com/jquery-2.1.0.min.js Now, we can be really clever at this point; as Bower uses Node's JSON files to control what should be installed, we can use this to be really selective and set Bower to install additional components at the same time. 
Let's take a look and see how this will work—in the following example, we'll use Bower to install jQuery 2.1 and 1.10 (the latter to provide support for IE6-8). In the Node Command Prompt, enter the following command: bower init This will prompt you for answers to a series of questions, at which point you can either fill out information or press Enter to accept the defaults. Look in the project folder; you should find a bower.json file within. Open it in your favorite text editor and then alter the code as shown here: {"ignore": [ "**/.*", "node_modules", "bower_components","test", "tests" ] ,"dependencies": {"jquery-legacy": "jquery#1.11.1","jquery-modern": "jquery#2.10"}} At this point, you have a bower.json file that is ready for use. Bower is built on top of Git, so in order to install jQuery using your file, you will normally need to publish it to the Bower repository. Instead, you can install an additional Bower package, which will allow you to install your custom package without the need to publish it to the Bower repository: In the Node Command Prompt window, enter the following at the prompt: npm install -g bower-installer When the installation is complete, change to your project folder and then enter this command line: bower-installer The bower-installer command will now download and install both the versions of jQuery. At this stage, you now have jQuery installed using Bower. You're free to upgrade or remove jQuery using the normal Bower process at some point in the future. If you want to learn more about how to use Bower, there are plenty of references online; https://www.openshift.com/blogs/day-1-bower-manage-your-client-side-dependencies is a good example of a tutorial that will help you get accustomed to using Bower. In addition, there is a useful article that discusses both Bower and Node, available at http://tech.pro/tutorial/1190/package-managers-an-introductory-guide-for-the-uninitiated-front-end-developer. Bower isn't the only way to install jQuery though—while we can use it to install multiple versions of jQuery, for example, we're still limited to installing the entire jQuery library. We can improve on this by referencing only the elements we need within the library. Thanks to some extensive work undertaken by the jQuery Core team, we can use the Asynchronous Module Definition (AMD) approach to reference only those modules that are needed within our website or online application. Using the AMD approach to load jQuery In most instances, when using jQuery, developers are likely to simply include a reference to the main library in their code. There is nothing wrong with it per se, but it loads a lot of extra code that is surplus to our requirements. A more efficient method, although one that takes a little effort in getting used to, is to use the AMD approach. In a nutshell, the jQuery team has made the library more modular; this allows you to use a loader such as require.js to load individual modules when needed. It's not suitable for every approach, particularly if you are a heavy user of different parts of the library. However, for those instances where you only need a limited number of modules, then this is a perfect route to take. Let's work through a simple example to see what it looks like in practice. Before we start, we need one additional item—the code uses the Fira Sans regular custom font, which is available from Font Squirrel at http://www.fontsquirrel.com/fonts/fira-sans. 
Let's make a start using the following steps: The Fira Sans font doesn't come with a web format by default, so we need to convert the font to use the web font format. Go ahead and upload the FiraSans-Regular.otf file to Font Squirrel's web font generator at http://www.fontsquirrel.com/tools/webfont-generator. When prompted, save the converted file to your project folder in a subfolder called fonts. We need to install jQuery and RequireJS into our project folder, so fire up a Node.js Command Prompt and change to the project folder. Next, enter these commands one by one, pressing Enter after each: bower install jquerybower install requirejs We need to extract a copy of the amd.html and amd.css files—it contains some simple markup along with a link to require.js; the amd.css file contains some basic styling that we will use in our demo. We now need to add in this code block, immediately below the link for require.js—this handles the calls to jQuery and RequireJS, where we're calling in both jQuery and Sizzle, the selector engine for jQuery: <script>require.config({paths: {"jquery": "bower_components/jquery/src","sizzle": "bower_components/jquery/src/sizzle/dist/sizzle"}});require(["js/app"]);</script> Now that jQuery has been defined, we need to call in the relevant modules. In a new file, go ahead and add the following code, saving it as app.js in a subfolder marked js within our project folder: define(["jquery/core/init", "jquery/attributes/classes"],function($) {$("div").addClass("decoration");}); We used app.js as the filename to tie in with the require(["js/app"]); reference in the code. If all went well, when previewing the results of our work in a browser. Although we've only worked with a simple example here, it's enough to demonstrate how easy it is to only call those modules we need to use in our code rather than call the entire jQuery library. True, we still have to provide a link to the library, but this is only to tell our code where to find it; our module code weighs in at 29 KB (10 KB when gzipped), against 242 KB for the uncompressed version of the full library! Now, there may be instances where simply referencing modules using this method isn't the right approach; this may apply if you need to reference lots of different modules regularly. A better alternative is to build a custom version of the jQuery library that only contains the modules that we need to use and the rest are removed during build. It's a little more involved but worth the effort—let's take a look at what is involved in the process. Customizing the downloads of jQuery from Git If we feel so inclined, we can really push the boat out and build a custom version of jQuery using the JavaScript task runner, Grunt. The process is relatively straightforward but involves a few steps; it will certainly help if you have some prior familiarity with Git! The demo assumes that you have already installed Node.js—if you haven't, then you will need to do this first before continuing with the exercise. Okay, let's make a start by performing the following steps: You first need to install Grunt if it isn't already present on your system—bring up the Node.js Command Prompt and enter this command: npm install -g grunt-cli Next, install Git—for this, browse to http://msysgit.github.io/ in order to download the package. Double-click on the setup file to launch the wizard, accepting all the defaults is sufficient for our needs. 
If you want more information on how to install Git, head over and take a look at https://github.com/msysgit/msysgit/wiki/InstallMSysGit for more details. Once Git is installed, change to the jquery folder from within the Command Prompt and enter this command to download and install the dependencies needed to build jQuery: npm install The final stage of the build process is to build the library into the file we all know and love; from the same Command Prompt, enter this command: grunt Browse to the jquery folder—within this will be a folder called dist, which contains our custom build of jQuery, ready for use. If there are modules within the library that we don't need, we can run a custom build. We can set the Grunt task to remove these when building the library, leaving in those that are needed for our project. For a complete list of all the modules that we can exclude, see https://github.com/jquery/jquery#modules. For example, to remove AJAX support from our build, we can run this command in place of step 5, as shown previously: grunt custom:-ajax This results in a file saving on the original raw version of 30 KB as shown in the following screenshot: The JavaScript and map files can now be incorporated into our projects in the usual way. For a detailed tutorial on the build process, this article by Dan Wellman is worth a read (https://www.packtpub.com/books/content/building-custom-version-jquery). Using a GUI as an alternative There is an online GUI available, which performs much the same tasks, without the need to install Git or Grunt. It's available at hhttp://projects.jga.me/jquery-builder/, although it is worth noting that it hasn't been updated for a while! Okay, so we have jQuery installed; let's take a look at one more useful function that will help in the event of debugging errors in our code. Support for source maps has been made available within jQuery since version 1.9. Let's take a look at how they work and see a simple example in action. Adding source map support Imagine a scenario, if you will, where you've created a killer site, which is running well, until you start getting complaints about problems with some of the jQuery-based functionality that is used on the site. Sounds familiar? Using an uncompressed version of jQuery on a production site is not an option; instead we can use source maps. Simply put, these map a compressed version of jQuery against the relevant line in the original source. Historically, source maps have given developers a lot of heartache when implementing, to the extent that the jQuery Team had to revert to disabling the automatic use of maps! For best effects, it is recommended that you use a local web server, such as WAMP (PC) or MAMP (Mac), to view this demo and that you use Chrome as your browser. Source maps are not difficult to implement; let's run through how you can implement them: Extract a copy of the sourcemap folder and save it to your project area locally. Press Ctrl + Shift + I to bring up the Developer Tools in Chrome. Click on Sources, then double-click on the sourcemap.html file—in the code window, and finally click on 17. Now, run the demo in Chrome—we will see it paused; revert back to the developer toolbar where line 17 is highlighted. 
The relevant calls to the jQuery library are shown on the right-hand side of the screen: If we double-click on the n.event.dispatch entry on the right, Chrome refreshes the toolbar and displays the original source line (highlighted) from the jQuery library, as shown here: It is well worth spending the time to get to know source maps—all the latest browsers support it, including IE11. Even though we've only used a simple example here, it doesn't matter as the principle is exactly the same, no matter how much code is used in the site. For a more in-depth tutorial that covers all the browsers, it is worth heading over to http://blogs.msdn.com/b/davrous/archive/2014/08/22/enhance-your-javascript-debugging-life-thanks-to-the-source-map-support-available-in-ie11-chrome-opera-amp-firefox.aspx—it is worth a read! Adding support for source maps We've just previewed the source map, source map support has already been added to the library. It is worth noting though that source maps are not included with the current versions of jQuery by default. If you need to download a more recent version or add support for the first time, then follow these steps: Source maps can be downloaded from the main site using http://code.jquery.com/jquery-X.X.X.min.map, where X represents the version number of jQuery being used. Open a copy of the minified version of the library and then add this line at the end of the file: //# sourceMappingURL=jquery.min.map Save it and then store it in the JavaScript folder of your project. Make sure you have copies of both the compressed and uncompressed versions of the library within the same folder. Let's move on and look at one more critical part of loading jQuery: if, for some unknown reason, jQuery becomes completely unavailable, then we can add a fallback position to our site that allows graceful degradation. It's a small but crucial part of any site and presents a better user experience than your site simply falling over! Working with Modernizr as a fallback A best practice when working with jQuery is to ensure that a fallback is provided for the library, should the primary version not be available. (Yes, it's irritating when it happens, but it can happen!) Typically, we might use a little JavaScript, such as the following example, in the best practice suggestions. This would work perfectly well but doesn't provide a graceful fallback. Instead, we can use Modernizr to perform the check for us and provide a graceful degradation if all fails. Modernizr is a feature detection library for HTML5/CSS3, which can be used to provide a standardized fallback mechanism in the event of a functionality not being available. You can learn more at http://www.modernizr.com. As an example, the code might look like this at the end of our website page. We first try to load jQuery using the CDN link, falling back to a local copy if that hasn't worked or an alternative if both fail: <body><script src="js/modernizr.js"></script><script type="text/javascript">Modernizr.load([{load: 'http://code.jquery.com/jquery-2.1.1.min.js',complete: function () {// Confirm if jQuery was loaded using CDN link// if not, fall back to local versionif ( !window.jQuery ) {Modernizr.load('js/jquery-latest.min.js');}}},// This script would wait until fallback is loaded, beforeloading{ load: 'jquery-example.js' }]);</script></body> In this way, we can ensure that jQuery either loads locally or from the CDN link—if all else fails, then we can at least make a graceful exit. 
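To make that graceful exit concrete, here is one possible sketch of the final fallback branch. The no-jquery class name, the requires-jquery and jquery-fallback hooks, and the CSS rules are purely illustrative choices and not part of Modernizr or jQuery; the idea is simply to flag the document so that the stylesheet can hide JavaScript-dependent widgets and reveal a static alternative:

<script>
Modernizr.load([{
  load: 'http://code.jquery.com/jquery-2.1.1.min.js',
  complete: function () {
    if (!window.jQuery) {
      // The CDN copy failed - try the local copy, then degrade if that fails too
      Modernizr.load({
        load: 'js/jquery-latest.min.js',
        complete: function () {
          if (!window.jQuery) {
            document.documentElement.className += ' no-jquery';
          }
        }
      });
    }
  }
}]);
</script>

/* In the site stylesheet */
.jquery-fallback { display: none; }
.no-jquery .requires-jquery { display: none; }
.no-jquery .jquery-fallback { display: block; }

Any widget that depends on jQuery carries the requires-jquery class, while a static alternative (for example, a plain link to the same content) carries jquery-fallback and only appears when both copies of the library have failed to load.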
Best practices for loading jQuery So far, we've examined several ways of loading jQuery into our pages, over and above the usual route of downloading the library locally or using a CDN link in our code. Now that we have it installed, it's a good opportunity to cover some of the best practices we should try to incorporate into our pages when loading jQuery: Always try to use a CDN to include jQuery on your production site. We can take advantage of the high availability and low latency offered by CDN services; the library may already be precached too, avoiding the need to download it again. Try to implement a fallback on your locally hosted library of the same version. If CDN links become unavailable (and they are not 100 percent infallible), then the local version will kick in automatically, until the CDN link becomes available again: <script type="text/javascript" src="//code.jquery.com/jquery-1.11.1.min.js"></script><script>window.jQuery || document.write('<scriptsrc="js/jquery-1.11.1.min.js"></script>')</script> Note that although this will work equally well as using Modernizr, it doesn't provide a graceful fallback if both the versions of jQuery should become unavailable. Although one hopes to never be in this position, at least we can use CSS to provide a graceful exit! Use protocol-relative/protocol-independent URLs; the browser will automatically determine which protocol to use. If HTTPS is not available, then it will fall back to HTTP. If you look carefully at the code in the previous point, it shows a perfect example of a protocol-independent URL, with the call to jQuery from the main jQuery Core site. If possible, keep all your JavaScript and jQuery inclusions at the bottom of your page—scripts block the rendering of the rest of the page until they have been fully rendered. Use the jQuery 2.x branch, unless you need to support IE6-8; in this case, use jQuery 1.x instead—do not load multiple jQuery versions. If you load jQuery using a CDN link, always specify the complete version number you want to load, such as jquery-1.11.1.min.js. If you are using other libraries, such as Prototype, MooTools, Zepto, and so on, that use the $ sign as well, try not to use $ to call jQuery functions and simply use jQuery instead. You can return the control of $ back to the other library with a call to the $.noConflict() function. For advanced browser feature detection, use Modernizr. It is worth noting that there may be instances where it isn't always possible to follow best practices; circumstances may dictate that we need to make allowances for requirements, where best practices can't be used. However, this should be kept to a minimum where possible; one might argue that there are flaws in our design if most of the code doesn't follow best practices! Summary If you thought that the only methods to include jQuery were via a manual download or using a CDN link, then hopefully this article has opened your eyes to some alternatives—let's take a moment to recap what we have learned. We kicked off with a customary look at how most developers are likely to include jQuery before quickly moving on to look at other sources. We started with a look at how to use Node, before turning our attention to using the Bower package manager. Next, we had a look at how we can reference individual modules within jQuery using the AMD approach. We then moved on and turned our attention to creating custom builds of the library using Git. 
We then covered how we can use source maps to debug our code, with a look at enabling support for them within Google's Chrome browser. To round out our journey of loading jQuery, we saw what might happen if we can't load jQuery at all and how we can get around this, by using Modernizr to allow our pages to degrade gracefully. We then finished the article with some of the best practices that we can follow when referencing jQuery. Resources for Article: Further resources on this subject: Using different jQuery event listeners for responsive interaction [Article] Building a Custom Version of jQuery [Article] Learning jQuery [Article]
Preparing Optimizations

Packt
04 Jun 2015
11 min read
In this article by Mayur Pandey and Suyog Sarda, authors of LLVM Cookbook, we will look into the following recipes: Various levels of optimization Writing your own LLVM pass Running your own pass with the opt tool Using another pass in a new pass (For more resources related to this topic, see here.) Once the source code transformation completes, the output is in the LLVM IR form. This IR serves as a common platform for converting into assembly code, depending on the backend. However, before converting into an assembly code, the IR can be optimized to produce more effective code. The IR is in the SSA form, where every new assignment to a variable is a new variable itself—a classic case of an SSA representation. In the LLVM infrastructure, a pass serves the purpose of optimizing LLVM IR. A pass runs over the LLVM IR, processes the IR, analyzes it, identifies the optimization opportunities, and modifies the IR to produce optimized code. The command-line interface opt is used to run optimization passes on LLVM IR. Various levels of optimization There are various levels of optimization, starting at 0 and going up to 3 (there is also s for space optimization). The code gets more and more optimized as the optimization level increases. Let's try to explore the various optimization levels. Getting ready... Various optimization levels can be understood by running the opt command-line interface on LLVM IR. For this, an example C program can first be converted to IR using the Clang frontend. Open an example.c file and write the following code in it: $ vi example.c int main(int argc, char **argv) { int i, j, k, t = 0; for(i = 0; i < 10; i++) {    for(j = 0; j < 10; j++) {      for(k = 0; k < 10; k++) {        t++;      }    }    for(j = 0; j < 10; j++) {      t++;    } } for(i = 0; i < 20; i++) {    for(j = 0; j < 20; j++) {      t++;    }    for(j = 0; j < 20; j++) {      t++;    } } return t; } Now convert this into LLVM IR using the clang command, as shown here: $ clang –S –O0 –emit-llvm example.c A new file, example.ll, will be generated, containing LLVM IR. This file will be used to demonstrate the various optimization levels available. How to do it… Do the following steps: The opt command-line tool can be run on the IR-generated example.ll file: $ opt –O0 –S example.ll The –O0 syntax specifies the least optimization level. Similarly, you can run other optimization levels: $ opt –O1 –S example.ll $ opt –O2 –S example.ll $ opt –O3 –S example.ll How it works… The opt command-line interface takes the example.ll file as the input and runs the series of passes specified in each optimization level. It can repeat some passes in the same optimization level. To see which passes are being used in each optimization level, you have to add the --debug-pass=Structure command-line option with the previous opt commands. See Also To know more on various other options that can be used with the opt tool, refer to http://llvm.org/docs/CommandGuide/opt.html Writing your own LLVM pass All LLVM passes are subclasses of the pass class, and they implement functionality by overriding the virtual methods inherited from pass. LLVM applies a chain of analyses and transformations on the target program. A pass is an instance of the Pass LLVM class. Getting ready Let's see how to write a pass. Let's name the pass function block counter; once done, it will simply display the name of the function and count the basic blocks in that function when run. First, a Makefile needs to be written for the pass. 
Follow the given steps to write a Makefile: Open a Makefile in the llvm lib/Transform folder: $ vi Makefile Specify the path to the LLVM root folder and the library name, and make this pass a loadable module by specifying it in Makefile, as follows: LEVEL = ../../.. LIBRARYNAME = FuncBlockCount LOADABLE_MODULE = 1 include $(LEVEL)/Makefile.common This Makefile specifies that all the .cpp files in the current directory are to be compiled and linked together in a shared object. How to do it… Do the following steps: Create a new .cpp file called FuncBlockCount.cpp: $ vi FuncBlockCount.cpp In this file, include some header files from LLVM: #include "llvm/Pass.h" #include "llvm/IR/Function.h" #include "llvm/Support/raw_ostream.h" Include the llvm namespace to enable access to LLVM functions: using namespace llvm; Then start with an anonymous namespace: namespace { Next declare the pass: struct FuncBlockCount : public FunctionPass { Then declare the pass identifier, which will be used by LLVM to identify the pass: static char ID; FuncBlockCount() : FunctionPass(ID) {} This step is one of the most important steps in writing a pass—writing a run function. Since this pass inherits FunctionPass and runs on a function, a runOnFunction is defined to be run on a function: bool runOnFunction(Function &F) override {      errs() << "Function " << F.getName() << 'n';      return false;    } }; } This function prints the name of the function that is being processed. The next step is to initialize the pass ID: char FuncBlockCount::ID = 0; Finally, the pass needs to be registered, with a command-line argument and a name: static RegisterPass<FuncBlockCount> X("funcblockcount", "Function Block Count", false, false); Putting everything together, the entire code looks like this: #include "llvm/Pass.h" #include "llvm/IR/Function.h" #include "llvm/Support/raw_ostream.h" using namespace llvm; namespace { struct FuncBlockCount : public FunctionPass { static char ID; FuncBlockCount() : FunctionPass(ID) {} bool runOnFunction(Function &F) override {    errs() << "Function " << F.getName() << 'n';    return false; }            };        }        char FuncBlockCount::ID = 0;        static RegisterPass<FuncBlockCount> X("funcblockcount", "Function Block Count", false, false); How it works A simple gmake command compiles the file, so a new file FuncBlockCount.so is generated at the LLVM root directory. This shared object file can be dynamically loaded to the opt tool to run it on a piece of LLVM IR code. How to load and run it will be demonstrated in the next section. See also To know more on how a pass can be built from scratch, visit http://llvm.org/docs/WritingAnLLVMPass.html Running your own pass with the opt tool The pass written in the previous recipe, Writing your own LLVM pass, is ready to be run on the LLVM IR. This pass needs to be loaded dynamically for the opt tool to recognize and execute it. How to do it… Do the following steps: Write the C test code in the sample.c file, which we will convert into an .ll file in the next step: $ vi sample.c   int foo(int n, int m) { int sum = 0; int c0; for (c0 = n; c0 > 0; c0--) {    int c1 = m;  for (; c1 > 0; c1--) {      sum += c0 > c1 ? 1 : 0;    } } return sum; } Convert the C test code into LLVM IR using the following command: $ clang –O0 –S –emit-llvm sample.c –o sample.ll This will generate a sample.ll file. 
Run the new pass with the opt tool, as follows: $ opt -load (path_to_.so_file)/FuncBlockCount.so -funcblockcount sample.ll The output will look something like this: Function foo How it works… As seen in the preceding code, the shared object loads dynamically into the opt command-line tool and runs the pass. It goes over the function and displays its name. It does not modify the IR. Further enhancement in the new pass is demonstrated in the next recipe. See also To know more about the various types of the Pass class, visit http://llvm.org/docs/WritingAnLLVMPass.html#pass-classes-and-requirements Using another pass in a new pass A pass may require another pass to get some analysis data, heuristics, or any such information to decide on a further course of action. The pass may just require some analysis such as memory dependencies, or it may require the altered IR as well. The new pass that you just saw simply prints the name of the function. Let's see how to enhance it to count the basic blocks in a loop, which also demonstrates how to use other pass results. Getting ready The code used in the previous recipe remains the same. Some modifications are required, however, to enhance it—as demonstrated in next section—so that it counts the number of basic blocks in the IR. How to do it… The getAnalysis function is used to specify which other pass will be used: Since the new pass will be counting the number of basic blocks, it requires loop information. This is specified using the getAnalysis loop function: LoopInfo *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo(); This will call the LoopInfo pass to get information on the loop. Iterating through this object gives the basic block information: unsigned num_Blocks = 0; Loop::block_iterator bb; for(bb = L->block_begin(); bb != L->block_end();++bb)    num_Blocks++; errs() << "Loop level " << nest << " has " << num_Blocks << " blocksn"; This will go over the loop to count the basic blocks inside it. However, it counts only the basic blocks in the outermost loop. To get information on the innermost loop, recursive calling of the getSubLoops function will help. Putting the logic in a separate function and calling it recursively makes more sense: void countBlocksInLoop(Loop *L, unsigned nest) { unsigned num_Blocks = 0; Loop::block_iterator bb; for(bb = L->block_begin(); bb != L->block_end();++bb)    num_Blocks++; errs() << "Loop level " << nest << " has " << num_Blocks << " blocksn"; std::vector<Loop*> subLoops = L->getSubLoops(); Loop::iterator j, f; for (j = subLoops.begin(), f = subLoops.end(); j != f; ++j)    countBlocksInLoop(*j, nest + 1); } virtual bool runOnFunction(Function &F) { LoopInfo *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo(); errs() << "Function " << F.getName() + "n"; for (Loop *L : *LI)    countBlocksInLoop(L, 0); return false; } How it works… The newly modified pass now needs to run on a sample program. 
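One detail is worth adding before the pass is run: because it now calls getAnalysis<LoopInfoWrapperPass>(), the pass should also declare that dependency to the pass manager by overriding getAnalysisUsage. The following is a minimal sketch; the include shown is where LoopInfoWrapperPass normally lives, but verify it against the LLVM release you are building with, and place the method inside the FuncBlockCount struct next to runOnFunction:

#include "llvm/Analysis/LoopInfo.h"

// Tell the pass manager that this pass needs LoopInfo and does not modify the IR,
// so LoopInfoWrapperPass is scheduled to run before it.
void getAnalysisUsage(AnalysisUsage &AU) const override {
  AU.addRequired<LoopInfoWrapperPass>();
  AU.setPreservesAll();
}

Without this override, the getAnalysis call can fail at runtime because the loop information has not been computed for the function being processed.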
Follow the given steps to modify and run the sample program: Open the sample.c file and replace its content with the following program: int main(int argc, char **argv) { int i, j, k, t = 0; for(i = 0; i < 10; i++) {    for(j = 0; j < 10; j++) {      for(k = 0; k < 10; k++) {        t++;      }    }    for(j = 0; j < 10; j++) {      t++;    } } for(i = 0; i < 20; i++) {    for(j = 0; j < 20; j++) {      t++;    }    for(j = 0; j < 20; j++) {      t++;    } } return t; } Convert it into a .ll file using Clang: $ clang –O0 –S –emit-llvm sample.c –o sample.ll Run the new pass on the previous sample program: $ opt -load (path_to_.so_file)/FuncBlockCount.so - funcblockcount sample.ll The output will look something like this: Function main Loop level 0 has 11 blocks Loop level 1 has 3 blocks Loop level 1 has 3 blocks Loop level 0 has 15 blocks Loop level 1 has 7 blocks Loop level 2 has 3 blocks Loop level 1 has 3 blocks There's more… The LLVM's pass manager provides a debug pass option that gives us the chance to see which passes interact with our analyses and optimizations, as follows: $ opt -load (path_to_.so_file)/FuncBlockCount.so - funcblockcount sample.ll –disable-output –debug-pass=Structure Summary In this article you have explored various optimization levels, and the optimization techniques kicking at each level. We also saw the step-by-step approach to writing our own LLVM pass. Resources for Article: Further resources on this subject: Integrating a D3.js visualization into a simple AngularJS application [article] Getting Up and Running with Cassandra [article] Cassandra Architecture [article]

Upgrading VMware Virtual Infrastructure Setups

Packt
04 Jun 2015
13 min read
In this article by Kunal Kumar and Christian Stankowic, authors of the book VMware vSphere Essentials, you will learn how to correctly upgrade VMware virtual infrastructure setups. (For more resources related to this topic, see here.) This article will cover the following topics: Prerequisites and preparations Upgrading vCenter Server Upgrading ESXi hosts Additional steps after upgrading An example scenario Let's start with a realistic scenario that is often found in data centers these days. I assume that your virtual infrastructure consists of components such as: Multiple VMware ESXi hosts Shared storage (NFS or Fibre-channel) VMware vCenter Server and vSphere Update Manager In this example, a cluster consisting of two ESXi hosts (esxi1 and esxi2) is running VMware ESXi 5.5. On a virtual machine (vc1), a Microsoft Windows Server system is running vCenter Server and vSphere Update Manager (vUM) 5.5. This article is written as a step-by-step guide to upgrade these particular vSphere components to the most recent version, which is 6.0. Example scenario consisting of two ESXi hosts with shared storage and vCenter Server Prerequisites and preparations Before we start the upgrade, we need to fulfill the following prerequisites: Ensure ESXi version support by the hardware vendor Gurarantee ESXi version support on used hardware by VMware Create a backup of the ESXi images and vCenter Server First of all, we need to refer to our hardware vendor's support matrix to ensure that our physical hosts running VMware ESXi are supported in the new release. Hardware vendors evaluate their systems before approving upgrades to customers. As an example, Dell offers a comprehensive list for their PowerEdge servers at http://topics-cdn.dell.com/pdf/vmware-esxi-6.x_Reference%20Guide2_en-us.pdf. Here are some additional links for alternative hardware vendors: Hewlett-Packard: http://h17007.www1.hp.com/us/en/enterprise/servers/supportmatrix/vmware.aspx IBM: http://www-03.ibm.com/systems/info/x86servers/serverproven/compat/us/nos/vmware.html Cisco UCS: http://www.cisco.com/web/techdoc/ucs/interoperability/matrix/matrix.html When using Fibre-channel-based storage systems, you might also need to ensure fulfilling that vendor's support matrix. Please check out your vendor's website or contact support for this information. VMware also offers a comprehensive list of tested hardware setups at http://www.vmware.com/resources/compatibility/pdf/vi_systems_guide.pdf. In their Compatibility Guide portal, VMware enabled customers to browse for particular server systems—this information might be more recent than the aforementioned PDF file. Creating a backup of ESXi Before upgrading our ESXi hosts, we also need to make sure that we have a valid backup. In case things go wrong, we might need this backup to restore the previous ESXi version. For creating a backup of the hard disk ESXi is installed on, there are a plenty of tools in the market that implement image-based backups. One possible solution, which is free, is Clonezilla. Clonezilla is a Linux-based live medium that can easily create backup images of hard disks. To create a backup using Clonezilla, proceed with the following steps: Download the Clonezilla ISO image from their website. Make sure you select the AMD64 architecture and the ISO file format. Enable maintenance mode for the particular ESXi host. Make sure you migrate virtual machines to alternative nodes or power them off. Connect the ISO file to the ESXi host and boot from CD. 
Also, connect a USB drive to the host. This drive will be used to store the backup. Boot from CD and select Clonezilla live. Wait until the boot process completes. When prompted, select your keyboard layout (for example, en_US.utf8) and select Don't touch keymap. In the Start Clonezilla menu, select Start_Clonezilla and device-image. This mode creates an image of the medium ESXi is running on and stores it in the USB storage. Select local_dev and choose the USB storage connected to the host from the list in the next step. Select a folder for storing the backup (optional). Select Beginner and savedisk to store the entire disk ESXi resides on as an image. Enter a name for the backup. Select the hard disk containing the ESXi installation and proceed. You can also specify whether Clonezilla should check the image after creating it (highly recommended). Afterwards, confirm the backup process. The backup job will start immediately. Once the backup completes, select reboot from the menu to reboot the host. A running backup job in Clonezilla To restore a backup using Clonezilla, perform the following steps after booting the Clonezilla media: Complete steps 1 to 8 from the previous guide. Select Beginner and restoredisk to restore the entire disk. Select the image from the USB storage and the hard drive the image should be restored on. Acknowledge the restore process. Once the restoration completes, select reboot from the menu to reboot the host. For the system running vCenter Server, we can easily create a VM snapshot, or also use Clonezilla if a physical machine is used instead. The upgrade path It is very important to execute the particular upgrade tasks in the following order: Upgrade VMware vCenter Server Upgrade the particular ESXi hosts Reformat or upgrade the VMFS data stores (if applicable) Upgrading additional components, such as distributed virtual switches, or additional appliances The first step is to upgrade vCenter Server. This is necessary to ensure that we are able to manage our ESXi hosts after upgrading them. Newer vCenter Server versions are downward compatible with numerous ESXi versions. To double-check this, we can look up the particular version support by browsing VMware's Product Interoperability Matrix on their website. Click on Solution Interoperability, choose VMware vCenter Server from the drop-down menu, and select the version you want to upgrade to. In our example, we will choose the most recent release, 6.0, and select VMware ESX/ESXi from the Add Platform/Solution drop-down menu. VMware Product Interoperability Matrix for vCenter Server and ESXi vCenter Server 6.0 supports management of VMware ESXi 5.0 and higher. We need to ensure the same support agreement for any other used products, such as these: VMware vSphere Update Manager VMware vCenter Operations (if applicable) VMware vSphere Data Protection In other words, we need to upgrade all additional vSphere and vCenter Server components to ensure full functionality. Upgrading vCenter Server Upgrading vCenter Server is the most crucial step, as this is our central management platform. The upgrade process varies according to the chosen architecture. Upgrading Windows-based vCenter Server installations is quite easy, as the installation supports in-place upgrades. When using the vCenter Server Appliance (vCSA), there is no in-place upgrade; it is necessary to deploy a new vCSA and import the settings from the old installation. This process varies between the particular vCSA versions. 
For upgrading from vCSA 5.0 or 5.1 to 5.5, VMware offers a comprehensive article at http://kb.vmware.com/kb/2058441. To upgrade vCenter Server 5.x on Windows to 6.0 using the Easy Install method, proceed with the following steps: Mount the vCenter Server 6.x installation media (VMware-VIMSetup-all-6.0.0-xxx.iso) on the server running vCenter Server. Wait until the installation wizard starts; if it doesn't start, double-click on the CD/DVD icon in Windows Explorer. Select vCenter Server for Windows and click on Install to start the installation utility. Accept the End-User License Agreement (EULA). Enter the current vCenter Single-Sign-On password and proceed with the next step. The installation utility begins to execute pre-upgrade checks; this might take some time. If you're running vCenter Server along with Microsoft SQL Server Express Edition, the database will be migrated to VMware vPostgres. Review and change (if necessary) the network ports of your vCenter Server installation. If needed, change the directories for vCenter Server and the Embedded Platform Controller (ESC). Carefully review the upgrade information displayed in the wizard. Also verify that you have created a backup of your system and the database. Then click on Upgrade to start the upgrade. After the upgrade, vSphere Web Client can be used to connect to the upgraded vCenter Server system. Also note that the Microsoft SQL Server Express Edition database is not used anymore. Upgrading ESXi hosts Upgrading ESXi hosts can be done using two methods: Using the installation media from the VMware website vSphere Update Manager If you need to upgrade a large number of ESXi hosts, I recommend that you use vSphere Update Manager to save time, as it can automate the particular steps. For smaller landscapes, using the installation media is easier. For using vUM to upgrade ESXi hosts, VMware offers a guide on their knowledge base at http://kb.vmware.com/kb/1019545. In order to upgrade an ESXi host using the installation media, perform the following steps: First of all, enable maintenance mode for the particular ESXi host. Make sure you migrate the virtual machines to alternative nodes or power them off. Connect the installation media to the ESXi host and boot from CD. Once the setup utility becomes available, press Enter to start the installation wizard. Accept the End-User License Agreement (EULA) by pressing F11. Select the disk containing the current ESXi installation. In the ESXi found dialog, select Upgrade. Review the installation information and press F11 to start the upgrade. After the installation completes, press Enter to reboot the system. After the system has rebooted, it will automatically reconnect to vCenter Server. Select the particular ESXi host to see whether the version has changed. In this example, the ESXi host has been successfully upgraded to version 6.0: Version information of an updated ESXi host running release 6.0 Repeat all of these steps for all the remaining ESXi hosts. Note that running an ESXi cluster with mixed versions should only be a temporary solution. It is not recommended to mix various ESXi releases in production usage, as the various features of ESXi might not perform as expected in mixed clusters. 
Additional steps After upgrading vCenter Server and our ESXi hosts, there are additional steps that can be done: Reformating or upgrading VMFS data stores Upgrading distributed virtual switches Upgrading virtual machine's hardware versions Upgrading VMFS data stores VMware's VMFS (Virtual Machine Filesystem) is the most used filesystem for shared storage. It can be used along with local storage, iSCSI, or Fibre-channel storage. Particularly, ESX(i) releases support various versions of VMFS. Let's take a look at the major differences:   VMFS 2   VMFS 3   VMFS 5   Supported by ESX 2.x, ESXi 3.x/4.x (read-only) ESX(i) 3.x and higher ESXi 5.x and higher Block size(s) 1, 8, 64, or 256 MB 1, 2, 4, or 8 MB 1 MB (fixed) Maximum file size 1 MB block size: 456 MB 8 MB block size: 2.5 TB 64 MB block size: 28.5 TB 256 MB block size: 64 TB 1 MB block size: 256 MB 2 MB block size: 512 GB 4 MB block size: 1 TB 8 MB block size: 2 TB 62 TB Files per volume Ca. 256 (no directories supported) Ca. 37,720 Ca. 130,690 When migrating from an ESXi version such as 4.x or older, it is possible to upgrade VMFS data stores to version 5. VMFS 2 cannot be upgraded to VMFS 5; it first needs to be upgraded to VMFS 3. To enable the upgrade, a VMFS 2 volume must not have a block size more than 8 MB, as VMFS 3 only supports block sizes up to 8 MB. In comparison with older VMFS versions, VMFS 5 supports larger file sizes and more files per volume. I highly recommend that you reformat VMFS data stores instead of upgrading them, as the upgrade does not change the filesystem's block size. Because of this limitation, you won't benefit from all the new VMFS 5 features after an upgrade. To upgrade a VMFS 3 volume to VMFS 5, perform these steps: Log in to vSphere Web Client. Go to the Storage pane. Click on the data store to upgrade and go to Settings under the Manage tab. Click on Upgrade to VMFS5. Then click on OK to start the upgrade. VMware vNetwork Distributed Switch When using vNetwork Distributed Switches (also often called dvSwitches) it is recommended to perform an upgrade to the latest version. In comparison with vNetwork Standard Switches (also called vSwitches), dvSwitches are created at the vCenter Server level and replicated to all subscribed ESXi hosts. When creating a dvSwitch, the administrator can choose between various dvSwitch versions. After upgrading vCenter Server and the ESXi hosts, additional features can be unlocked by upgrading the dvSwitch. Let's take a look at some commonly used dvSwitch versions:   vDS 5.0   vDS 5.1   vDS 5.5   vDS 6.0   Compatible with ESXi 5.0 and higher ESXi 5.1 and higher ESXi 5.5 and higher ESXi 6.0 Common features Network I/O Control, load-based teaming, traffic shaping, VM port blocking, PVLANs (private VLANs), network vMotion, and port policies Additional features Network resource pools, NetFlow, and port mirroring VDS 5.0 +, management network rollback, network health checks, enhanced port mirroring, and LACP (Link Aggregation Control Protocol) VDS 5.1 +, traffic filtering, and enhanced LACP functionality VDS 5.5 +, multicast snooping, and Network I/O Control version 3 (bandwidth guarantee) It is also possible to use the old version furthermore, as vCenter Server is downward compatible with numerous dvSwitch versions. Upgrading a dvSwitch is a task that cannot be undone. During the upgrade, it is possible that virtual machines will lose their network connectivity for some seconds. 
After the upgrade, older ESXi hosts will not be able to participate in the distributed switch setup. To upgrade a dvSwitch, perform the following steps: Log in to vSphere Web Client. Go to the Networking pane and select the dvSwitch to upgrade. Start the upgrade wizard for the selected dvSwitch, choose the target version, and complete the wizard. After upgrading the dvSwitch, you will notice that the version has changed: Version information of a dvSwitch running VDS 6.0 Virtual machine hardware version Every virtual machine is created with a virtual machine hardware version specified (also called VMHW or vHW). A vHW version defines a set of particular limitations and features, such as controller types or network cards. To benefit from new virtual machine features, you need to upgrade the vHW version. ESXi hosts support a range of vHW versions, but it is always advisable to use the most recent one. Once the vHW version is upgraded, the virtual machine can no longer be started on older ESXi versions that don't support that vHW version. Let's take a deeper look at some popular vHW versions:   vSphere 4.1   vSphere 5.1   vSphere 5.5   vSphere 6.0   Maximum vHW 7 9 10 11 Virtual CPUs 8 64 128 Virtual RAM 255 GB 1 TB 4 TB vDisk size 2 TB 62 TB SCSI adapters / targets 4/60 SATA adapters / targets Not supported 4/30 Parallel / Serial Ports 3/4 3/32 USB controllers / devices per VM 1/20 (USB 1.x + 2.x) 1/20 (USB 1.x, 2.x + 3.x) The upgrade cannot be undone. Also, it might be necessary to update VMware Tools and the drivers of the operating system running in the virtual machine. Summary In this article, we learned how to correctly upgrade VMware virtual infrastructure setups. If you want to know more about VMware vSphere and virtual infrastructure setups, go ahead and get your copy of Packt Publishing's book VMware vSphere Essentials. Resources for Article: Further resources on this subject: Networking [article] The Design Documentation [article] VMware View 5 Desktop Virtualization [article]

Plotting in Haskell

Packt
04 Jun 2015
10 min read
In this article by James Church, author of the book Learning Haskell Data Analysis, we will see the different methods of data analysis by plotting data using Haskell. The other topics that this article covers are using GHCi, scaling data, and comparing stock prices.

(For more resources related to this topic, see here.)

Can you perform data analysis in Haskell? Yes, and you might even find that you enjoy it. We are going to take a few snippets of Haskell and put some plots of stock market data together. To get started, the following software needs to be installed:

- The Haskell platform (http://www.haskell.org/platform)
- Gnuplot (http://www.gnuplot.info/)

The cabal command-line tool is the tool used to install packages in Haskell. There are three packages that we may need in order to analyze the stock market data. To use cabal, you will use the cabal install [package names] command. Run the following command to install the CSV parsing package, the EasyPlot package, and the Either package:

$ cabal install csv easyplot either

Once you have the necessary software and packages installed, we are all set for some introductory analysis in Haskell.

We need data

It is difficult to perform an analysis of data without data. The Internet is rich with sources of data. Since this tutorial looks at stock market data, we need a source. Visit the Yahoo! Finance website to find the history of every publicly traded stock on the New York Stock Exchange, adjusted to reflect splits over time. The good folks at Yahoo! provide this resource in the csv file format.

We begin by downloading the entire history of the Apple company from Yahoo! Finance (http://finance.yahoo.com). You can find the content for Apple by performing a quote lookup from the Yahoo! Finance home page for the AAPL symbol (that is, 2 As, not 2 Ps). On this page, you can find the link for Historical Prices. On the Historical Prices page, identify the link that says Download to Spreadsheet. The complete link to Apple's historical prices can be found at the following link: http://real-chart.finance.yahoo.com/table.csv?s=AAPL.

We should take a moment to explore our dataset. Here are the column headers in the csv file:

- Date: This is a string that represents the date of a particular day in Apple's history
- Open: This is the opening value of one share
- High: This is the high trade value over the course of this day
- Low: This is the low trade value over the course of this day
- Close: This is the final price of the share at the end of this trading day
- Volume: This is the total number of shares traded on this day
- Adj Close: This is a variation on the closing price that adjusts for dividend payouts and company splits

Another feature of this dataset is that the rows are written in reverse chronological order. The most recent date in the table is the first; the oldest is the last. Yahoo! Finance provides this table (Apple's historical prices) under the unhelpful name table.csv. I renamed the csv file provided by Yahoo! Finance to aapl.csv.
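Before parsing anything, it can help to picture what one row of this file looks like as a Haskell value. The following record type is purely illustrative; the GHCi session that follows works with raw lists of strings rather than this type, and the field names are my own:

-- Illustrative only: one row of the Yahoo! Finance csv, described above.
data Quote = Quote
  { quoteDate :: String
  , open      :: Double
  , high      :: Double
  , low       :: Double
  , close     :: Double
  , volume    :: Int
  , adjClose  :: Double
  } deriving (Show)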
Start GHCi

The interactive prompt for Haskell is GHCi. On the command line, type ghci. We begin by importing our newly installed libraries from the prompt:

> import Data.List
> import Text.CSV
> import Data.Either.Combinators
> import Graphics.EasyPlot

Parse the csv file that you just downloaded using the parseCSVFromFile command. This command will return an Either type, which represents one of the two things that happened: your file was parsed (Right) or something went wrong (Left). We can inspect the type of our result with the :t command:

> eitherErrorOrCells <- parseCSVFromFile "aapl.csv"
> :t eitherErrorOrCells
eitherErrorOrCells :: Either Text.Parsec.Error.ParseError CSV

Did we get an error for our result? For this, we are going to use the fromRight' and fromLeft' functions. Remember, Right is right and Left is wrong. When we run fromLeft', we should see this message saying that our content is in the Right:

> fromLeft' eitherErrorOrCells
*** Exception: Data.Either.Combinators.fromLeft: Argument takes form 'Right _'

Pull the cells of our csv file into cells. We can see the first four rows of our content using take 5 (which will pull our header line and the first four data rows):

> let cells = fromRight' eitherErrorOrCells
> take 5 cells
[["Date","Open","High","Low","Close","Volume","Adj Close"],["2014-11-10","552.40","560.63","551.62","558.23","1298900","558.23"],["2014-11-07","555.60","555.60","549.35","551.82","1589100","551.82"],["2014-11-06","555.50","556.80","550.58","551.69","1649900","551.69"],["2014-11-05","566.79","566.90","554.15","555.95","1645200","555.95"]]

The last column in our csv file is Adj Close, which is the column we would like to plot. Count the columns (starting with 0), and you will find that Adj Close is number 6. Everything else can be dropped. (Here, we are also using the init function to drop the last row of the data, which is an empty list. Grabbing the 6th element of an empty list will not work in Haskell.)

> map (\x -> x !! 6) (take 5 (init cells))
["Adj Close","558.23","551.82","551.69","555.95"]

We know that this column represents the adjusted close prices. We should drop our header row. Since we use tail to drop the header row, take 5 returns the first five adjusted close prices:

> map (\x -> x !! 6) (take 5 (tail (init cells)))
["558.23","551.82","551.69","555.95","564.19"]

We should store all of our adjusted close prices in a value called adjCloseAAPLOriginal:

> let adjCloseAAPLOriginal = map (\x -> x !! 6) (tail (init cells))

These are still raw strings. We need to convert them to a Double type with the read function:

> let adjCloseAAPL = map read adjCloseAAPLOriginal :: [Double]

We are almost done massaging our data. We need to make sure that every value in adjCloseAAPL is paired with an index position for the purpose of plotting. Remember that our adjusted closes are in reverse chronological order. This will create a list of tuples, which can be passed to the plot function:

> let aapl = zip (reverse [1.0..genericLength adjCloseAAPL]) adjCloseAAPL
> take 5 aapl
[(2577,558.23),(2576,551.82),(2575,551.69),(2574,555.95),(2573,564.19)]

Plotting

> plot (PNG "aapl.png") $ Data2D [Title "AAPL"] [] aapl
True

The following chart is the result of the preceding command:

Open aapl.png, which should be newly created in your current working directory. This is a typical default chart created by EasyPlot. We can see the entire history of the Apple stock price. For most of this history, the adjusted share price was less than $10 per share. At about the 6,000th trading day, we see the quick ascension of the share price to over $100 per share. Most of the time, when we take a look at a share price, we are only interested in the tail portion (say, the last year of changes). Our data is already reversed, so the newest close prices are at the front.
There are 252 trading days in a year, so we can take the first 252 elements of our value and plot them. While we are at it, we are going to change the style of the plot to a line plot:

> let aapl252 = take 252 aapl
> plot (PNG "aapl_oneyear.png") $ Data2D [Title "AAPL", Style Lines] [] aapl252
True

The following chart is the result of the preceding command:

Scaling data

Looking at the share price of a single company over the course of a year will tell you whether the price is trending upward or downward. While this is good, we can get better information about the growth by scaling the data. To scale a dataset to reflect the percent change, we subtract the first element in the list from each value, divide the result by the first element, and then multiply by 100. Here, we create a simple function called percentChange. We then scale the values 100 to 105 using this new function. (Using the :t command is not necessary, but I like to use it to make sure that I have at least the desired type signature correct.)

> let percentChange first value = 100.0 * (value - first) / first
> :t percentChange
percentChange :: Fractional a => a -> a -> a
> map (percentChange 100) [100..105]
[0.0,1.0,2.0,3.0,4.0,5.0]

We will use this new function to scale our Apple dataset. Our tuple of values can be split using the fst (for the first value, containing the index) and snd (for the second value, containing the adjusted close) functions:

> let firstValue = snd (last aapl252)
> let aapl252scaled = map (\pair -> (fst pair, percentChange firstValue (snd pair))) aapl252
> plot (PNG "aapl_oneyear_pc.png") $ Data2D [Title "AAPL PC", Style Lines] [] aapl252scaled
True

The following chart is the result of the preceding command:

Let's take a look at the preceding chart. Notice that it looks identical to the one we just made, except that the y axis has changed. The values on the left-hand side of the chart are now the fluctuating percent changes of the stock from a year ago. To the investor, this information is more meaningful.

Comparing stock prices

Every publicly traded company has a different stock price. When you hear that Company A has a share price of $10 and Company B has a price of $100, there is almost no meaningful content to this statement. We can arrive at a meaningful analysis by plotting the scaled history of the two companies on the same plot.

Our Apple dataset uses the index position of the trading day for the x axis. This is fine for a single plot, but in order to combine plots, we need to make sure that all plots start at the same index. In order to prepare our existing data of Apple stock prices, we will adjust our index variable to begin at 0:

> let firstIndex = fst (last aapl252scaled)
> let aapl252scaled = map (\pair -> (fst pair - firstIndex, percentChange firstValue (snd pair))) aapl252

We will compare Apple to Google. Google uses the symbol GOOGL (spelled Google without the e). I downloaded the history of Google from Yahoo! Finance and performed the same steps that I previously wrote with our Apple dataset. Since the steps are identical, they also make a good candidate for a small reusable function, sketched next.
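The following is only a sketch of such a helper (the function name and design are mine, not code from the book). It assumes the same conventions as above: column 6 holds Adj Close, the header row and the trailing empty row are dropped, and a parse failure simply aborts. It belongs in a source file rather than at the GHCi prompt:

import Data.List (genericLength)
import Text.CSV (parseCSVFromFile)

-- Load a Yahoo! Finance csv file and return the most recent `days` adjusted
-- closes, re-indexed to start at 0 and scaled to percent change from the
-- oldest value in that window.
loadScaledSeries :: FilePath -> Int -> IO [(Double, Double)]
loadScaledSeries path days = do
  result <- parseCSVFromFile path
  case result of
    Left err    -> error (show err)          -- abort on a parse error
    Right cells -> do
      let adjClose = map (\row -> read (row !! 6)) (tail (init cells)) :: [Double]
          series   = take days (zip (reverse [1.0 .. genericLength adjClose]) adjClose)
          firstVal = snd (last series)
          firstIdx = fst (last series)
          rescale (i, v) = (i - firstIdx, 100.0 * (v - firstVal) / firstVal)
      return (map rescale series)

With a helper like this, loadScaledSeries "aapl.csv" 252 and loadScaledSeries "googl.csv" 252 would produce both scaled series directly.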
Here is the same preparation done step by step at the GHCi prompt for Google:

> -- Prep Google for analysis
> eitherErrorOrCells <- parseCSVFromFile "googl.csv"
> let cells = fromRight' eitherErrorOrCells
> let adjCloseGOOGLOriginal = map (\x -> x !! 6) (tail (init cells))
> let adjCloseGOOGL = map read adjCloseGOOGLOriginal :: [Double]
> let googl = zip (reverse [1.0..genericLength adjCloseGOOGL]) adjCloseGOOGL
> let googl252 = take 252 googl
> let firstValue = snd (last googl252)
> let firstIndex = fst (last googl252)
> let googl252scaled = map (\pair -> (fst pair - firstIndex, percentChange firstValue (snd pair))) googl252

Now, we can plot the share prices of Apple and Google on the same chart, with Apple plotted in red and Google plotted in blue:

> plot (PNG "aapl_googl.png") [Data2D [Title "AAPL PC", Style Lines, Color Red] [] aapl252scaled, Data2D [Title "GOOGL PC", Style Lines, Color Blue] [] googl252scaled]
True

The following chart is the result of the preceding command:

You can compare the growth rate of the stock price for these two competing companies yourself; I believe that the contrast is enough to let the image speak for itself. This type of analysis is useful in the investment strategy known as growth investing. I am not recommending this as a strategy, nor am I recommending either of these two companies for the purpose of an investment. I am recommending Haskell as your language of choice for performing data analysis.

Summary

In this article, we read data from a csv file and plotted it. The other topics covered in this article were using GHCi and EasyPlot for plotting, scaling data, and comparing stock prices.

Resources for Article:

Further resources on this subject:

- The Hunt for Data [article]
- Getting started with Haskell [article]
- Driving Visual Analyses with Automobile Data (Python) [article]
Predicting Hospital Readmission Expense Using Cascading

Packt
04 Jun 2015
10 min read
In this article by Michael Covert, author of the book Learning Cascading, we will look at a system that allows health care providers to create complex predictive models that can assess who is most at risk for readmission, using Cascading.

(For more resources related to this topic, see here.)

Overview

Hospital readmission is an event that health care providers are attempting to reduce, and it is the primary target of new regulations of the Affordable Care Act, passed by the US government. A readmission is defined as any reentry to a hospital 30 days or less after a prior discharge. The financial impact of this is that US Medicare and Medicaid will either not pay or will reduce the payment made to hospitals for expenses incurred. By the end of 2014, over 2,600 hospitals will incur these losses from a Medicare and Medicaid tab that is thought to exceed $24 billion annually.

Hospitals are seeking ways to predict when a patient is susceptible to readmission so that actions can be taken to fully treat the patient before discharge. Many of them are using big data and machine learning-based predictive analytics. One such predictive engine is MedPredict from Analytics Inside, a company based in Westerville, Ohio. MedPredict is the predictive modeling component of the MedMiner suite of health care products. These products use Concurrent's Cascading products to perform nightly rescoring of inpatients using a highly customizable calculation known as LACE, which stands for the following:

- Length of stay: This refers to the number of days a patient has been in hospital.
- Acute admission through the emergency department: This refers to whether a patient has arrived through the ER.
- Comorbidities: A comorbidity refers to the presence of two or more individual conditions in a patient. Each condition is designated by a diagnosis code, which can also indicate complications and severity of the condition. In LACE, certain conditions are associated with the probability of readmission through statistical analysis. For instance, a diagnosis of AIDS, COPD, diabetes, and so on will each increase the probability of readmission. So, each diagnosis code is assigned points, with other points indicating the "seriousness" of the condition. Diagnosis codes refer to the International Classification of Disease codes; both the version 9 (ICD-9) and the newer version 10 (ICD-10) standards are in use.
- Emergency visits: This refers to the number of emergency room visits the patient has made in a particular window of time.

The LACE engine looks at a patient's history and computes a score that is a predictor of readmission. In order to compute the comorbidity score, the Charlson Comorbidity Index (CCI) calculation is used. It is a statistical calculation that factors in the age and complexity of the patient's condition.
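To make the scoring concrete, here is a small, purely illustrative sketch of the arithmetic behind a LACE score, shown here in Haskell. It is not the MedPredict or Cascading implementation, and the point values follow the commonly published LACE weights, which, as noted above, hospitals customize:

-- Illustrative only: one common LACE point mapping; real deployments tune these.
laceScore :: Int -> Bool -> Int -> Int -> Int
laceScore losDays acuteAdmission cci edVisits =
    losPoints + acutePoints + comorbidityPoints + edPoints
  where
    losPoints                                            -- L: length of stay in days
      | losDays < 1   = 0
      | losDays <= 3  = losDays
      | losDays <= 6  = 4
      | losDays <= 13 = 5
      | otherwise     = 7
    acutePoints       = if acuteAdmission then 3 else 0  -- A: admitted through the ER
    comorbidityPoints = if cci >= 4 then 5 else cci      -- C: Charlson index, capped
    edPoints          = min edVisits 4                   -- E: recent ER visits, capped

For example, laceScore 7 True 2 2 evaluates to 5 + 3 + 2 + 2 = 12, which under the usual interpretation flags elevated readmission risk. The production engine described below computes this kind of sum as a Cascading subassembly on Hadoop, with the point tables supplied as hospital parameters.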
Using Cascading to control predictive modeling

The full data workflow to compute the probability of readmission is as follows:

1. Read all hospital records and reformat them into patient records, diagnosis records, and discharge records.
2. Read all data related to patient diagnoses and diagnosis records, that is, ICD-9/10 codes, date of diagnosis, complications, and so on.
3. Read all tracked diagnosis records and join them with patient data to produce a diagnosis (comorbidity) score by summing up comorbidity "points".
4. Read all data related to patient admissions, that is, records associated with admission and discharge, length of stay, hospital, admittance location, stay type, and so on.
5. Read the patient profile record, that is, age, race, gender, ethnicity, eye color, body mass indicator, and so on.
6. Compute all intermediate scores for age, emergency visits, and comorbidities.
7. Calculate the LACE score (refer to Figure 2) and assign a date and time to it.
8. Take all the patient information mentioned in the preceding points and run it through MedPredict to produce a variety of metrics:
   - Expected length of stay
   - Expected expense
   - Expected outcome
   - Probability of readmission

Figure 1 – The data workflow

The Cascading LACE engine

The calculational aspects of computing LACE scores make it ideal for Cascading as a series of reusable subassemblies. Firstly, the extraction, transformation, and loading (ETL) of patient data is complex and costly. Secondly, the calculations are data-intensive. The CCI alone has to examine a patient's medical history and find all matching diagnosis codes (such as ICD-9 or ICD-10) to assign a score. This score must be augmented by the patient's age, and lastly, a patient's inpatient discharge records must be examined for admittance through the ER as well as emergency room visits.

Also, many hospitals desire to customize these calculations. The LACE engine supports and facilitates this since scores are adjustable at the diagnosis code level, and MedPredict automatically produces metrics about how significant an individual feature is to the resulting score.

Medical data is quite complex too. For instance, the particular diagnosis codes that represent cancer are many, and their meanings are quite nuanced. In some cases, metastasis (spreading of cancer to other locations in the body) may have occurred, and this is treated as a more severe situation. In other situations, measured values may be "bucketed"; for example, we track the number of emergency room visits over 1 year, 6 months, 90 days, and 30 days.

The Cascading LACE engine performs these calculations easily. It is customized through a set of hospital-supplied parameters, and it has the capability to perform full calculations nightly due to its usage of Hadoop. Using this capability, a patient's record can track the full history of the LACE index over time. Additionally, different sets of LACE indices can be computed simultaneously, maybe one used for diabetes, another for Chronic Obstructive Pulmonary Disorder (COPD), and so on.

Figure 2 – The LACE subassembly

MedPredict tracking

The LACE engine metrics feed into MedPredict along with many other variables cited previously. These records are rescored nightly and the patient history is updated. This patient history is then used to analyze trends and generate alerts when the patient shows an increased likelihood of variance from the desired metric values.

What Cascading does for us

We chose Cascading to help reduce the complexity of our development efforts. MapReduce provided us with the scalability that we desired, but we found that we were developing massive amounts of code to do so. Reusability was difficult, and the Java code library was becoming large. By shifting to Cascading, we found that we could encapsulate our code better and achieve significantly greater reusability. Additionally, we reduced complexity as well. The Cascading API provides simplification and understandability, which accelerates our development velocity metrics and also reduces bugs and maintenance cycles.

We allow Cascading to control the end-to-end workflow of these nightly calculations. It handles preprocessing and formatting of data.
Then, it handles running these calculations in parallel, allowing high-speed hash joins to be performed and each leg of the calculation to be split into a parallel pipe. Next, all these calculations are merged and the final score is produced. The last step is to analyze the patient trends and generate alerts where potential problems are likely to occur.

Cascading has allowed us to produce a reusable assembly that is highly parameterized, thereby allowing hospitals to customize their usage. Not only can thresholds, scores, and bucket sizes be varied, but, if desired, additional information can be included for things such as medical procedures performed on the patient. The local mode of Cascading allows for easy testing, and it also provides a scaled-down version that can be run against a small number of patients. However, by using Cascading in the Hadoop mode, massive scalability can be achieved against very large patient populations and ICD-9/10 code sets.

Concurrent also provides an excellent framework for predictive modeling using machine learning through its Pattern component. MedPredict uses this to integrate its predictive engine, which is written using Cascading, MapReduce, and Mahout. Pattern provides an interface for the integration of other external analysis products through the exchange of Predictive Model Markup Language (PMML), an XML dialect that allows many of the MedPredict proprietary machine learning algorithms to be directly incorporated into the full Cascading LACE workflow. MedPredict then produces a variety of predictive metrics in a single pass of the data. The LACE scores (current values and historical trends) are used as features for these predictions. Additionally, Concurrent provides a product called Driven that greatly reduces the development cycle time for such large, complex applications. Their Lingual product provides seamless integration with relational databases, which is also key to enterprise integration.

Results

Numerous studies have now been performed using LACE risk estimates. Many hospitals have shown the ability to reduce readmission rates by 5-10 percent due to early intervention and specific guidance given to a patient as a result of an elevated LACE score. Other studies are examining the efficacy of additional metrics and of segmenting patients into better identifying groups, such as heart failure, cancer, diabetes, and so on. Additional effort is being put into studying the effect of modifying the values of the comorbidity scores, taking into account combinations and complications. In some cases, even more dramatic improvements have taken place using these techniques. For up-to-date information, search for LACE readmissions, which will provide current information about implementations and results.

Analytics Inside LLC

Analytics Inside is based in Westerville, Ohio. It was founded in 2005 and specializes in advanced analytical solutions and services. Analytics Inside produces the RelMiner family of relationship mining systems. These systems are based on machine learning, big data, graph theory, data visualization, and Natural Language Processing (NLP). For further information, visit our website at http://www.AnalyticsInside.us, or e-mail us at info@AnalyticsInside.us.
MedMiner

MedMiner Advanced Analytics for Health Care is an integrated software system designed to help an organization or patient care team in the following ways:

- Predicting the outcomes of patient cases and tracking these predictions over time
- Generating alerts based on patient case trends that will help direct remediation
- Complying better with ARRA value-based purchasing and meaningful use guidelines
- Providing management dashboards that can be used to set guidelines and track performance
- Tracking performance of drug usage, interactions, potentials for drug diversion, and pharmaceutical fraud
- Extracting medical information contained within text documents
- Treating data security as a key design point:
  - PHI can be hidden through external linkages, so data exchange is not required
  - If PHI is required, it is kept safe through heavy encryption, virus scanning, and data isolation
- Using both cloud-based and on-premise capabilities to meet client needs

Concurrent Inc.

Concurrent Inc. is the leader in big data application infrastructure, delivering products that help enterprises create, deploy, run, and manage data applications at scale. The company's flagship enterprise solution, Driven, was designed to accelerate the development and management of enterprise data applications. Concurrent is the team behind Cascading, the most widely deployed technology for data applications, with more than 175,000 user downloads a month. Used by thousands of businesses, including eBay, Etsy, The Climate Corporation, and Twitter, Cascading is the de facto standard in open source application infrastructure technology. Concurrent is headquartered in San Francisco and can be found online at http://concurrentinc.com.

Summary

Hospital readmission is an event that health care providers are attempting to reduce, and it is a primary target of new regulation from the Affordable Care Act, passed by the US government. This article described a system that allows health care providers to create complex predictive models that can assess who is most at risk for such readmission, using Cascading.

Resources for Article:

Further resources on this subject:

- Hadoop Monitoring and its aspects [article]
- Introduction to Hadoop [article]
- YARN and Hadoop [article]

Microsoft Azure – Developing Web API for Mobile Apps

Packt
03 Jun 2015
9 min read
Azure Websites is an excellent platform to deploy and manage a Web API. Microsoft Azure provides, however, another alternative in the form of Azure Mobile Services, which targets mobile application developers. In this article by Nikhil Sachdeva, coauthor of the book Building Web Services with Microsoft Azure, we delve into the capabilities of Azure Mobile Services and how it provides a quick and easy development ecosystem to develop Web APIs that support mobile apps.

(For more resources related to this topic, see here.)

Creating a Web API using Mobile Services

In this section, we will create a Mobile Services-enabled Web API using Visual Studio 2013. For our fictitious scenario, we will create an Uber-like service, but for medical emergencies. In the case of a medical emergency, users will have the option to send a request using their mobile device. Additionally, third-party applications and services can integrate with the Web API to display doctor availability. All requests sent to the Web API will follow this process flow:

1. The request will be persisted to a data store.
2. An algorithm will find a doctor that matches the incoming request based on availability and proximity.
3. Push Notifications will be sent to update the physician and patient.

Creating the project

Mobile Services provides two options to create a project:

- Using the Management portal, we can create a new Mobile Service and download a preassembled package that contains the Web API as well as the targeted mobile platform project
- Using Visual Studio templates

The Management portal approach is easier to implement and gives a jumpstart by creating and configuring the project. However, for the scope of this article, we will use the Visual Studio template approach.

For more information on creating a Mobile Services Web API using the Azure Management Portal, please refer to http://azure.microsoft.com/en-us/documentation/articles/mobile-services-dotnet-backend-windows-store-dotnet-get-started/.

Azure Mobile Services provides a Visual Studio 2013 template to create a .NET Web API; we will use this template for our scenario. Note that the Azure Mobile Services template is only available from Visual Studio 2013 Update 2 onward. Creating a Mobile Service in Visual Studio 2013 requires the following steps:

1. Create a new Azure Mobile Service project and assign it a Name, Location, and Solution. Click OK.
2. In the next tab, we have a familiar ASP.NET project type dialog. However, we notice a few differences from the traditional ASP.NET dialog, which are as follows:
   - The Web API option is enabled by default and is the only choice available
   - The Authentication tab is disabled by default
   - The Test project option is disabled
   - The Host in the cloud option automatically suggests Mobile Services and is currently the only choice
3. Select the default settings and click on OK. Visual Studio 2013 prompts developers to enter their Azure credentials in case they are not already logged in.

For more information on Azure tools for Visual Studio, please visit https://msdn.microsoft.com/en-us/library/azure/ee405484.aspx.

Since we are building a new Mobile Service, the next screen gathers information about how to configure the service. We can specify existing Azure resources in our subscription or create new ones from within Visual Studio. Select the appropriate options and click on Create. The options are described here:
| Option | Description |
| Subscription | This lists the name of the Azure subscription where the service will be deployed. Select from the dropdown if multiple subscriptions are available. |
| Name | This is the name of the Mobile Services deployment; it will eventually become the root DNS URL for the mobile service unless a custom domain is specified (for example, contoso.azure-mobile.net). |
| Runtime | This allows selection of the runtime. Note that, as of writing this book, only the .NET framework was supported in Visual Studio, so this option is currently prepopulated and disabled. |
| Region | Select the Azure data center where the Web API will be deployed. As of writing this book, Mobile Services is available in the following regions: West US, East US, North Europe, East Asia, and West Japan. For details on the latest regional availability, please refer to http://azure.microsoft.com/en-us/regions/#services. |
| Database | By default, a SQL Azure database gets associated with every Mobile Services deployment. It comes in handy if SQL is being used as the data store. However, in scenarios where different data stores such as table storage or MongoDB may be used, we still create this SQL database. We can select a free 20 MB SQL database or an existing paid standard SQL database. For more information about SQL tiers, please visit http://azure.microsoft.com/en-us/pricing/details/sql-database. |
| Server user name | Provide the server user name for the Azure SQL database. |
| Server password | Provide a password for the Azure SQL database. |

This process creates the required entities in the configured Azure subscription. Once completed, we have a new Web API project in the Visual Studio solution. The following screenshot is a representation of a new Mobile Service project:

When we create a Mobile Service Web API project, the following NuGet packages are referenced in addition to the default ASP.NET Web API NuGet packages:

| Package | Description |
| WindowsAzure MobileServices Backend | This package enables developers to build a scalable and secure .NET mobile backend hosted in Microsoft Azure. We can also incorporate structured storage, user authentication, and push notifications. Assembly: Microsoft.WindowsAzure.Mobile.Service |
| Microsoft Azure Mobile Services .NET Backend Tables | This package contains the common infrastructure needed when exposing structured storage as part of the .NET mobile backend hosted in Microsoft Azure. Assembly: Microsoft.WindowsAzure.Mobile.Service.Tables |
| Microsoft Azure Mobile Services .NET Backend Entity Framework Extension | This package contains all types necessary to surface structured storage (using Entity Framework) as part of the .NET mobile backend hosted in Microsoft Azure. Assembly: Microsoft.WindowsAzure.Mobile.Service.Entity |

Additionally, the following third-party packages are installed:

| Package | Description |
| EntityFramework | Since Mobile Services provides a default SQL database, it leverages Entity Framework to provide an abstraction for the data entities. |
| AutoMapper | AutoMapper is a convention-based object-to-object mapper. It is used to map legacy custom entities to DTO objects in Mobile Services. |
| OWIN Server and related assemblies | Mobile Services uses OWIN as the default hosting mechanism. The current template also adds the Microsoft OWIN Katana packages to run the solution in IIS, and OWIN security packages for Google, Azure AD, Twitter, and Facebook. |
| Autofac | This is the favorite Inversion of Control (IoC) framework. |
| Azure Service Bus | Microsoft Azure Service Bus provides the Notification Hub functionality. |

We now have our Mobile Services Web API project created.
The default project added by Visual Studio is not an empty project but a sample implementation of a Mobile Services-enabled Web API. In fact, a controller and an Entity Data Model are already defined in the project. If we hit F5 now, we can see a running sample in the local dev environment.

Note that Mobile Services modifies the WebApiConfig file under the App_Start folder to accommodate some initialization and configuration changes:

ConfigOptions options = new ConfigOptions();

HttpConfiguration config = ServiceConfig.Initialize(new ConfigBuilder(options));

In the preceding code, the ServiceConfig.Initialize method defined in the Microsoft.WindowsAzure.Mobile.Service assembly is called to load the hosting provider for our mobile service. It loads all assemblies from the current application domain and searches for types with HostConfigProviderAttribute. If it finds one, the custom host provider is loaded; otherwise, the default host provider is used.

Let's extend the project to develop our scenario.

Defining the data model

We now create the required entities and data model. Note that while the entities have been kept simple for this article, in a real-world application it is recommended to define a data architecture before creating any data entities. For our scenario, we create two entities that inherit from EntityData. These are described here.

Record

Record is an entity that represents data for the medical emergency. We use the Record entity when invoking CRUD operations using our controller. We also use this entity to update doctor allocation and the status of the request, as shown:

namespace Contoso.Hospital.Entities
{
    /// <summary>
    /// Emergency record for the hospital
    /// </summary>
    public class Record : EntityData
    {
        public string PatientId { get; set; }

        public string InsuranceId { get; set; }

        public string DoctorId { get; set; }

        public string Emergency { get; set; }

        public string Description { get; set; }

        public string Location { get; set; }

        public string Status { get; set; }
    }
}

Doctor

The Doctor entity represents the doctors that are registered practitioners in the area; the service will search for the availability of a doctor based on the properties of this entity. We will also assign the primary DoctorId to the Record type when a doctor is assigned to an emergency. The schema for the Doctor entity is as follows:

namespace Contoso.Hospital.Entities
{
    public class Doctor : EntityData
    {
        public string Speciality { get; set; }

        public string Location { get; set; }

        public bool Availability { get; set; }
    }
}

Summary

In this article, we looked at a solution for developing a Web API that targets mobile developers.

Resources for Article:

Further resources on this subject:

- Security in Microsoft Azure [article]
- Azure Storage [article]
- High Availability, Protection, and Recovery using Microsoft Azure [article]