Packt
06 Feb 2015
13 min read

Tour of Xcode

In this article, written by Jayant Varma, the author of Xcode 6 Essentials, we take a close look at Xcode, the tool you will use for every aspect of app development for Apple devices. It is a good idea to become familiar with the interface, its sections, the shortcut keys, and so on.

Starting Xcode

Xcode, like many other Mac applications, is found in the Applications folder or the Launchpad. On starting Xcode, you are greeted with a launch screen that offers some entry points for working with Xcode. Mostly, you will select Create a new Xcode project, or Check out an existing project if you have an existing project to continue working on. Xcode remembers what it was doing last, so if you had a project or file open, it will open those windows again.

Creating a new project

After selecting the Create a new Xcode project option, a wizard guides us through getting started.

Selecting the project type

The first step is to select the type of project you want to create. At the moment, there are two distinct platforms to choose from: mobile (iOS) and desktop (OS X). Within each of those, you can select the project template you want. The screenshot displays a standard configuration for iOS application projects. The templates are self-sufficient, that is, when the Run button is pressed, the app compiles and runs. It might do nothing, as this is a minimal template. After selecting the type of project, we can move on to the next step.

Setting the project options

This step allows you to set the options, namely the application name, the organization name, the identifier, the language, and the devices to support. In the past, the language was always Objective-C; with Xcode 6, however, there are two options: Objective-C and Swift.

Setting the project properties

On creation, the main screen is displayed.
Here it offers the option to change other details related to the application, such as the version number and build. It also allows you to configure the team ID and the certificates used for signing the application, either to test on a mobile device or for distribution to the App Store, and to set the compatibility with earlier OS versions. The orientation, app icons, splash screens, and so on are also set from this screen. If you want to set these up later in the project, that is fine; this screen can be accessed at any time and does not stop you from developing. The settings only need to be in place before deploying to a device or creating an App Store ready application.

Xcode overview

Let us have a look at the Xcode interface to familiarize ourselves with it, as this helps improve productivity when building your application. The top section, immediately following the traffic lights (window chrome), displays the Play and Stop buttons, which run and stop the project. The breadcrumb toolbar displays the project-specific settings with respect to the product and the target. With an iOS project, the target could be a particular simulator for iPhone, iPad, and so on, or a physical device (number 5 in the following screenshot). Just under this are the vertical areas that make up the main content area, with all the files, editors, UI, and so on. These can be displayed or hidden as required and can be stacked vertically or horizontally. The distinct areas in Xcode are as follows:

Project navigation (number 1)
Editor and assistant editor (numbers 2 and 3)
Utility/inspector (number 4)
The toolbar (numbers 5 and 6)

These sections can be switched on and off (shown or hidden) as required to make space for other sections or to give you more screen space to work with.

Sections in Xcode

The project section

The project navigation section has three subsections, the topmost being the project toolbar, which has eight icons. These can be seen in the following screenshot.
The next subsection contains the project files and all the assets required for this project. The bottommost subsection consists of recently edited files and filters. You can use keyboard shortcuts to access these areas quickly with the CMD + 1...8 keys.

The eight areas available under project navigation are key, and for a beginner to Xcode they can be a bit daunting. When you run the project, the current area might change and display another, leaving you wondering how to get back to the project (file) navigator. Getting familiar with these areas is always helpful, and the easiest way to navigate between them is with the CMD + 1...8 keys.

Project navigator (CMD + 1): This displays all of the files, folders, assets, frameworks, and so on that are part of this project. It is displayed as a hierarchical view and is the way the majority of developers access their files, folders, and so on.
Symbol navigator (CMD + 2): This displays all of the classes and the members and methods available in them. This is the easiest way to navigate quickly to a method/function or attribute/property.
Search navigator (CMD + 3): This allows you to search the project for a particular match. It is quite useful for finding and replacing text.
Issues navigator (CMD + 4): This displays the warnings and errors that occur while typing your code or on building and running it. It also displays the results of the static analyzer.
Tests navigator (CMD + 5): This displays the tests present in your code, either added by yourself or the default ones created with the project.
Debug navigator (CMD + 6): This displays information about the application when you choose to run it, including detailed figures on CPU usage, memory usage, disk usage, threads, and so on.
Breakpoint navigator (CMD + 7): This displays all the breakpoints in your project from all files. It also allows you to create exception and symbolic breakpoints.
Log navigator (CMD + 8): This displays a log of all actions carried out, namely compiling, building, and running. It is most useful for checking the results of automated builds.

The editor and assistant editor sections

The second area contains the editor and assistant editor sections. These display the code, the XIB (as appropriate), storyboard files, device previews, and so on. Each of the subsections has a jump bar at the top that relates to files; it allows navigating back and forth between files and displays the location of the file in the workspace. To the right of this is a mini issues navigator that displays all warnings and errors. In the case of the assistant editors, it also displays two buttons: one to add a new assistant editor area and another to close it.

Source code editors

While we are looking at the interface, it is worth noting that the Xcode code editor is a very advanced editor with many features, a lot of which are now seen as standard in text editors. Some of the features that make working with Xcode easier are as follows:

Code folding: This feature helps to hide code at points such as function declarations, loops, matching braces, and so on. When a function or portion of code is folded, it is hidden from view, allowing you to see other areas of the code that would otherwise not be visible unless you scrolled.
Syntax highlighting: This is one of the most useful features as it helps you, the developer, to differentiate at a glance between your source code, variables, constants, and strings. Xcode has syntax highlighting for a couple of languages, as mentioned earlier.
Context help: This is one of the best features. When you hover over a word in the source code with OPT pressed, it shows a dotted underline and the cursor changes to a question mark. When you click on a word with the dotted underline and the question mark cursor, it displays a popup with details about that word.
It also highlights all instances of that word in the file. The popup shows as much information as is available. If the word is a variable or a function that you have added to the code, it displays the name of the file where it was declared. If it is a word contained in the Apple libraries, it displays the description and other additional details.
Context jump: This feature allows jumping to the point of declaration of a word. This is achieved by clicking on the word while keeping the CMD key pressed. It is mainly helpful for finding out how a function is declared and what parameters it expects. It can also be useful for getting information on the enumerators and constants used with that function. The jump could be within the file you are editing or to the header files where the symbols are declared.
Edit all in scope: This feature lets you edit all of the instances of a word together rather than using search and replace. For example, if you want to change the name of a variable and ensure that all instances in the file are changed, but not the occurrences inside text, you can use this option to change it quickly.
Catching mistakes with fix-it: This is another feature in Xcode that will save you a lot of time and hassle. As you type, Xcode keeps analyzing the code and looking for errors. If you have declared a variable and not used it in your code, Xcode immediately draws attention to it, suggesting that the variable is unused. If, however, it was supposed to be a pointer and you have declared it without *, Xcode immediately flags it as an error stating that the interface type cannot be statically allocated. It offers a fix-it solution of inserting *, and the code shows a greyed * character where it will be added.
This helps the developer fix commonly overlooked issues such as missing semicolons, missing declarations, or misspelled variable names.
Code completion: This is the feature that makes writing code so much easier. Type in a few letters of a function name and Xcode pops up a list of functions, constants, methods, and so on that start with those letters, and displays all of the required parameters (as applicable) including the return type. When an entry is selected, Xcode adds token placeholders that can be replaced with the actual parameter values. The results might vary from person to person depending on the settings and the speed of the system you run Xcode on.

The assistant editor

The assistant editor is mainly used to display the counterparts and related files of the file open in the primary editor (generally used when working with Objective-C, where the .h and .m files are the related files). The assistant editors track the contents of the editor. Xcode is quite intelligent and knows the corresponding sections and counterparts. When you click on a file, it opens in the editor. However, if you press OPT + Shift while clicking on the file, you are presented with an interactive dialog to select where to open the file. The options include the primary editor or the assistant editor. You can also add assistant editors as required.

Another way to open a file quickly is to use the Open Quickly option, which has a shortcut key of CMD + Shift + O. This displays a textbox that allows accessing a file from the project.

The utility/inspector section

The last section contains the inspector and library. This section changes based on the type of file selected in the current editor. The inspector has six tabs/sections, which are as follows:

The file inspector (CMD + OPT + 1): This displays the physical file information for the file selected. For code files, it is the text encoding, the targets that it belongs to, and the physical file path.
For a storyboard, it is the physical file path, and it also allows setting attributes such as auto layout and size classes (new in Xcode 6).
The quick help inspector (CMD + OPT + 2): This displays information about the class or object selected.
The identity inspector (CMD + OPT + 3): This displays the class name, ID, and other details that identify the object selected.
The attributes inspector (CMD + OPT + 4): This displays the attributes of the object selected, such as whether it is the initial root view controller, whether it extends under the top bars, whether it has a navigation bar, and others. It also displays the user-defined attributes (a new feature with Xcode 6).
The size inspector (CMD + OPT + 5): This displays the size of the control selected and the associated constraints that help position it in the container.
The connections inspector (CMD + OPT + 6): This displays the connections created in Interface Builder between the UI and the code.

The lower half of this inspector contains four options that help you work efficiently. They are as follows:

The file template library: This contains the options to create a new class, protocol, and so on. These are the same options that are available when selecting File | New from the menu.
The code snippets library: This is a wonderful but not widely used option. It can hold code snippets that help you avoid writing repetitive blocks of code in your app. You can drag and drop a snippet into your code in the editor. It also offers settings such as shortcuts, scopes, platforms, and languages. So you can have a shortcut such as appDidLoad (for example) that inserts the code to create and populate a button, and you can limit the snippet to the appropriate platform by setting it to iOS or OS X.
After creating a code snippet, as soon as you type the first few characters of its shortcut, the code snippet shows up in the list of autocomplete options. Adding a code snippet is as easy as dragging the selected code from the editor onto the snippet area. It is a little tricky because the moment you start dragging, it could break your selection highlight. You need to select the text, click (hold), and then drag it.
The object library: This is the toolbox that contains all of the controls that you need for creating your UI, be it a button, a label, a Table View, a view, a View Controller, or anything else.
The media library: This contains the list of all images and other media types that are available to this project/workspace.

Summary

In this article, you have seen a quick tour of Xcode. Keep the shortcuts and tips handy, as they really do help get things done faster. The code snippets are a wonderful feature that allows for quickly setting up commonly used code with shortcut keywords.
Packt
06 Feb 2015
19 min read

Basic and Interactive Plots

In this article by Atmajitsinh Gohil, author of the book R Data Visualization Cookbook, we will cover the following topics:

A simple bar plot
A simple line plot
Line plot to tell an effective story
Merging histograms
Making an interactive bubble plot

The main motivation behind this article is to introduce the basics of plotting in R and an element of interactivity via the googleVis package. The basic plots are important because many packages developed in R use the basic plot arguments, and hence understanding them creates a good foundation for new R users. We will start by exploring the scatter plots in R, which are the most basic plots for exploratory data analysis, and then delve into interactive plots. Every section will start with an introduction to basic R plots and we will build interactive plots thereafter.

We will utilize the power of R analytics and implement the plots using the googleVis package to introduce the element of interactivity. The googleVis package provides an interface between R and the Google Chart API to create interactive plots. A range of plots is available with the googleVis package, which gives us the advantage of plotting the same data on various plots and selecting the one that delivers an effective message. The package undergoes regular updates and releases, and new charts are implemented with every release.

Readers should note that there are other alternatives available to create interactive plots in R, but it is not possible to explore all of them, and hence I have selected googleVis to display interactive elements in a chart. I have selected it purely based on my experience with interactivity in plots. Another good interactive option is offered by GGobi.

A simple bar plot

A bar plot can often be confused with a histogram. Histograms are used to study the distribution of data, whereas bar plots are used to study categorical data.
Both plots may look similar to the naked eye, but the main difference is that the width of a bar in a bar plot is not of significance, whereas in a histogram the width of each bar represents the range of values in a bin and its height the frequency of data in that range. In this recipe, I have made use of the infant mortality rate in India. The data is made available by the Government of India. The main objective is to study the basics of a bar plot in R, as shown in the following screenshot:

How to do it…

We start the recipe by importing our data in R using the read.csv() function. R will search for the data under the current directory, and hence we use the setwd() function to set our working directory:

    setwd("D:/book/scatter_Area/chapter2")
    data = read.csv("infant.csv", header = TRUE)

Once we import the data, we would like to process it by ordering it. We order the data using the order() function in R. We would like R to order the column Total2011 in decreasing order:

    data = data[order(data$Total2011, decreasing = TRUE),]

We use the ifelse() function to create a new column. We will utilize this new column to add different colors to the bars in our plot. We could also write a loop in R to do this task, but we will keep this for later. The ifelse() function is quick and easy. We instruct R to assign yes if values in the column Total2011 are more than 12.2 and no otherwise. The 12.2 value is not randomly chosen but is the average infant mortality rate of India:

    new = ifelse(data$Total2011 > 12.2, "yes", "no")

Next, we would like to join the vector of yes and no to our original dataset. In R, we can join columns using the cbind() function. Rows can be combined using rbind():

    data = cbind(data, new)

When we initially plot the bar plot, we observe that we need more space at the bottom of the plot. We adjust the margins of a plot in R by passing the mar argument within the par() function.
The mar argument takes a vector of four values: the bottom, left, top, and right margins:

    par(mar = c(10,5,5,5))

Next, we generate a bar plot in R using the barplot() function. The abline() function is used to add a horizontal line on the bar plot:

    barplot(data$Total2011, las = 2, names.arg = data$India, width = 0.80,
            border = NA, ylim = c(0,20), col = "#e34a33",
            main = "Infant Mortality Rate of India in 2011")
    abline(h = 12.2, lwd = 2, col = "white", lty = 2)

How it works…

The order() function returns the permutation of row indices that sorts the rows (in decreasing or increasing order) based on the variable. We would like to plot the bars from highest to lowest, and hence we need to arrange the data. The ifelse() function is used to generate a new column. We will use this column under the There's more… section of this recipe. The first argument of the ifelse() function is the logical test to be performed. The second argument is the value to be assigned if the test is true, and the third argument is the value to be assigned if the logical test fails.

The first argument in the barplot() function defines the height of the bars, and horiz = TRUE (not used in our code) instructs R to plot the bars horizontally. The default setting in R plots the bars vertically. The names.arg argument is used to label the bars. We also specify border = NA to remove the borders, and las = 2 sets the orientation of our labels. Try replacing the las value with 0, 1, 2, or 3 and observe how the orientation of our labels changes. The h argument in the abline() function sets the y value at which a horizontal line is drawn (the v argument would draw a vertical line instead). The lwd, lty, and col arguments are used to define the width, line type, and color of the line.

There's more…

While plotting a bar plot, it's a good practice to order the data in ascending or descending order. An unordered bar plot does not convey the right message, and the plot is hard to read when more bars are involved.
When we observe a plot, we are interested in getting the most information out of it, and ordering the data is the first step toward achieving this objective. We have not yet specified how we can use the column created with the ifelse() and cbind() functions in the plot. If we would like to color the bars differently to let readers know which states have infant mortality above the country level, we can do this by passing col = data$new in place of col = "#e34a33". Note that this relies on the new column being stored as a factor, so that R maps its levels to palette colors; if the column is a character vector (the default from R 4.0.0 onward), col = as.factor(data$new) is one way to get the same effect.

See also

New York Times has a very interesting implementation of an interactive bar chart, which can be accessed at http://www.nytimes.com/interactive/2007/09/28/business/20070930_SAFETY_GRAPHIC.html

A simple line plot

Line plots are simply lines connecting all the x and y dots. They are very easy to interpret and are widely used to display an upward or downward trend in data. In this recipe, we will use the googleVis package and create an interactive R line plot. We will learn how we can emphasize certain variables in our data. The following line plot shows fertility rate:

Getting ready

We will use the googleVis package to generate a line plot.

How to do it…

In order to construct a line chart, we will install and load the googleVis package in R. We will also import the fertility data using the read.csv() function:

    install.packages("googleVis")
    library(googleVis)
    frt = read.csv("fertility.csv", header = TRUE, sep = ",")

The fertility data is downloaded from the OECD website.
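As a rough illustration of what the read.csv() import above produces, the following sketch parses a tiny inline CSV instead of fertility.csv. The column layout (a Year column followed by one column per country) is inferred from the recipe; the values are made-up placeholders rather than OECD figures, and the names csv_text and frt_demo are hypothetical. It also shows that read.csv(), by default, rewrites a header such as "New Zealand" into the syntactically valid name New.Zealand.

```r
# Hypothetical stand-in for fertility.csv: placeholder values, not OECD data.
csv_text <- "Year,Australia,New Zealand,OECD34
1970,3.0,3.0,3.0
1990,2.0,2.0,2.0
2010,2.0,2.0,1.5"

# text = lets read.csv() parse a string instead of a file on disk.
frt_demo <- read.csv(text = csv_text, header = TRUE, sep = ",")

# check.names = TRUE (the default) turns "New Zealand" into "New.Zealand".
print(names(frt_demo))

# dim() reports the number of rows, then the number of columns.
print(dim(frt_demo))
```

Whatever names print here are the ones to use when referring to columns of the data frame later on.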
We can construct our line object using the gvisLineChart() function:

    line = gvisLineChart(frt, xvar = "Year",
      yvar = c("Australia","Austria","Belgium","Canada","Chile","OECD34"),
      options = list(
        width = 1100, height = 500, backgroundColor = "#FFFF99",
        title = "Fertility Rate in OECD countries",
        vAxis = "{title : 'Total Fertility Rate', gridlines:{color:'#DEDECE',count : 4}, ticks : [0,1,2,3,4]}",
        series = "{0:{color:'black', visibleInLegend :false},
                   1:{color:'#BDBD9D', visibleInLegend :false},
                   2:{color:'#BDBD9D', visibleInLegend :false},
                   3:{color:'#BDBD9D', visibleInLegend :false},
                   4:{color:'#BDBD9D', visibleInLegend :false},
                   5:{color:'#3333FF', visibleInLegend :true}}"))

We can construct the visualization using the plot() function in R:

    plot(line)

How it works…

The first three arguments of the gvisLineChart() function are the data and the names of the columns to be plotted on the x-axis and y-axis. The options argument lists the chart API options to add and modify elements of a chart. For the purpose of this recipe, we will use part of the dataset. Hence, when we assign the series to be plotted under yvar = c(), we specify the column names that we would like to be plotted in our chart. Note that the series numbering starts at 0, and hence Australia, which is the first column, is in fact series 0 and not 1. Because only six columns are passed to yvar here, OECD34 is series 5.

For the purpose of this exercise, let's assume that we would like to demonstrate the mean fertility rate among all OECD economies to our audience. We can achieve this using series {} under options = list(). The series option allows us to customize a specific series in our dataset. Under the gvisLineChart() function, we instruct the Google Chart API to color the OECD series (series 5) and Australia (series 0) with different colors and also to make the legend visible only for OECD and not for every series.
It would be best to display all the legends, but we use this to show the flexibility that comes with the Google Chart API. Finally, we can use the plot() function to plot the chart in a browser. The following screenshot displays a part of the data. The dim() function gives us a general idea about the dimensions of the fertility data:

New York Times visualizations often combine line plots with bar charts and pie charts. Readers should try constructing such a visualization. We can use the gvisMerge() function to merge plots. The function allows merging of just two plots, and hence readers would have to nest multiple gvisMerge() calls to create a similar visualization. The same can also be constructed in R, but we would lose the interactive element.

See also

The OECD website provides economic data related to OECD member countries. The data can be freely downloaded from the website http://www.oecd.org/statistics/.
A New York Times visualization that combines bar charts and line charts can be accessed at http://www.nytimes.com/imagepages/2009/10/16/business/20091017_CHARTS_GRAPHIC.html.

Line plot to tell an effective story

In the previous recipe, we learned how to plot a very basic line plot and use some of the options. In this recipe, we will go a step further and make use of specific visual cues such as color and line width for easy interpretation. Line charts are a great tool to visualize time series data. The fertility data is discrete, but connecting points over time provides our audience with a direction. The visualization shows the amazing progress countries such as Mexico and Turkey have achieved in reducing their fertility rate. OECD defines fertility rate as follows: "Refers to the number of children that would be born per woman, assuming no female mortality at child-bearing ages and the age-specific fertility rates of a specified country and reference period." Line plots have been widely used by New York Times to create very interesting infographics.
This recipe is inspired by one of the New York Times visualizations. It is important to understand that many of the infographics created by professionals are created using D3.js or Processing. We will not go into the details of these, but it is good to know how this software works and how it can be used to create visualizations.

Getting ready

We need to install and load the googleVis package to construct a line chart.

How to do it…

To generate an interactive plot, we will load the fertility data in R using the read.csv() function. To generate a line chart that plots the entire dataset, we will use the gvisLineChart() function:

    line = gvisLineChart(frt, xvar = "Year",
      yvar = c("Australia","Austria","Belgium","Canada","Chile","Czech.Republic",
               "Denmark","Estonia","Finland","France","Germany","Greece","Hungary",
               "Iceland","Ireland","Israel","Italy","Japan","Korea","Luxembourg","Mexico",
               "Netherlands","New.Zealand","Norway","Poland","Portugal","Slovakia","Slovenia",
               "Spain","Sweden","Switzerland","Turkey","United.Kingdom","United.States","OECD34"),
      options = list(
        width = 1200, backgroundColor = "#ADAD85",
        title = "Fertility Rate in OECD countries",
        vAxis = "{gridlines:{color:'#DEDECE',count : 3}, ticks : [0,1,2,3,4]}",
        series = "{0:{color:'#BDBD9D', visibleInLegend :false},
                   20:{color:'#009933', visibleInLegend :true},
                   31:{color:'#996600', visibleInLegend :true},
                   34:{color:'#3333FF', visibleInLegend :true}}"))

To display our visualization in a new browser window, we use the generic R plot() function:

    plot(line)

How it works…

The arguments passed to the gvisLineChart() function are exactly the same as discussed under the simple line plot, with some minor changes. We would like to plot the entire data for this exercise, and hence we have to state all the column names in yvar = c(). Also, we would like to color all the series with the same color but highlight Mexico, Turkey, and the OECD average.
We have achieved this in the previous code using series {}, where we further specify and customize colors and legend visibility for specific countries. In this particular plot, we have made use of the same color for all the economies but have highlighted Mexico and Turkey to signify the development and growth that took place in the 5-year period. It would also be effective if our audience could compare the OECD average with Mexico and Turkey. This provides the audience with a benchmark they can compare against. If we plot all the legends, the plot may become too crowded, and 34 legends may not make a very attractive plot. We avoid this by making only specific legends visible.

See also

D3 is a great tool to develop interactive visualizations and can be accessed at http://d3js.org/.
Processing is open source software that originated at the MIT Media Lab and can be downloaded from https://processing.org/.
A good resource to pick colors and use them in our plots is the following link: http://www.w3schools.com/tags/ref_colorpicker.asp.
I have used New York Times infographics as an inspiration for this plot. You can find a collection of visualizations put out by New York Times in 2011 by going to this link: http://www.smallmeans.com/new-york-times-infographics/.

Merging histograms

Histograms help in studying the underlying distribution. They are even more useful when we compare more than one histogram on the same plot; this provides us with greater insight into the skewness and the overall distribution. In this recipe, we will study how to plot a histogram using the googleVis package and how to merge more than one histogram on the same page. We will only merge two plots, but we can merge more plots and adjust the width of each plot. This makes it easier to compare all the plots on the same page.
The following plot shows two merged histograms:

How to do it…

In order to generate a histogram, we will install the googleVis package as well as load it in R:

    install.packages("googleVis")
    library(googleVis)

We have downloaded the prices of two different stocks and have calculated their daily returns over the entire period. We can load the data in R using the read.csv() function. Our main aim in this recipe is to create two different histograms and place them side by side in a browser. Hence, we need to divide our data into three different data frames. For the purpose of this recipe, we will plot the aapl and msft data frames:

    stk = read.csv("stock_cor.csv", header = TRUE, sep = ",")
    aapl = data.frame(stk$AAPL)
    msft = data.frame(stk$MSFT)
    googl = data.frame(stk$GOOGL)

To generate the histograms, we use the gvisHistogram() function:

    al = gvisHistogram(aapl, options = list(
      histogram = "{bucketSize :1}", legend = "none",
      title = 'Distribution of AAPL Returns', width = 500,
      hAxis = "{showTextEvery: 5, title: 'Returns'}",
      vAxis = "{gridlines : {count:4}, title : 'Frequency'}"))
    mft = gvisHistogram(msft, options = list(
      histogram = "{bucketSize :1}", legend = "none",
      title = 'Distribution of MSFT Returns', width = 500,
      hAxis = "{showTextEvery: 5, title: 'Returns'}",
      vAxis = "{gridlines : {count:4}, title : 'Frequency'}"))

We combine the two gvis objects in one browser page using the gvisMerge() function:

    mrg = gvisMerge(al, mft, horizontal = TRUE)
    plot(mrg)

How it works…

The data.frame() function is used to construct a data frame in R. We require this step as we do not want to plot all three histograms on the same plot. Note the use of the $ notation in the data.frame() function. The first argument in the gvisHistogram() function is our data stored as a data frame. We can display the individual histograms using plot(al) and plot(mft), but in this recipe, we will plot the final merged output.
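The $ extraction described above can be tried without the stock file. This is a minimal sketch on a made-up data frame (stk_demo and its values are hypothetical placeholders, not real returns). It shows that stk_demo$AAPL yields a plain numeric vector and that wrapping it in data.frame() gives a one-column data frame, which is the shape the recipe passes to gvisHistogram().

```r
# Hypothetical daily returns (placeholder numbers, not market data).
stk_demo <- data.frame(AAPL = c(0.5, -1.2, 0.8),
                       MSFT = c(0.1, 0.4, -0.3))

aapl_vec <- stk_demo$AAPL              # $ pulls one column out as a vector
aapl_df  <- data.frame(stk_demo$AAPL)  # wrap it back into a data frame

print(is.numeric(aapl_vec))
print(dim(aapl_df))     # rows, then columns
print(names(aapl_df))   # the column is named after the expression used
```

Because the single column takes its name from the expression, stk_demo["AAPL"] is an alternative that returns a one-column data frame and keeps the original column name.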
We observe that most of the attributes of the histogram function are the same as discussed in previous recipes. The histogram functionality uses an algorithm to create buckets, but we can control this using bucketSize, as in histogram = "{bucketSize: 1}". Try using different bucket sizes and observe how the buckets in the histograms change. More options related to histograms can be found at the following link under the Controlling Buckets section: https://developers.google.com/chart/interactive/docs/gallery/histogram#Buckets We have utilized showTextEvery, which controls how many horizontal axis labels are shown. We have used 5, so that only every fifth label is displayed, to make the histogram more compact. Our main objective is to observe the distribution, and the plot serves our purpose. Finally, we use plot() to display the chart in our browser. We follow the same steps to plot the return distribution of Microsoft (MSFT). Now, we would like to place both plots side by side and view the differences in the distributions. We use the gvisMerge() function to place the histograms side by side. In our recipe, we have two plots, one for AAPL and one for MSFT. The default setting stacks the charts vertically, but we can specify horizontal = TRUE to place them horizontally. Making an interactive bubble plot My first encounter with a bubble plot was while watching a TED video of Hans Rosling. The video led me to search for ways of creating bubble plots in R; a very good introduction to this is available on the Flowing Data website. The advantage of a bubble plot is that it allows us to visualize a third variable, which in our case is the size of the bubble. In this recipe, I have made use of the googleVis package to plot a bubble plot, but you can also implement this with the basic R plot functions. The advantage of the Google Chart API is its interactivity and the ease with which the charts can be attached to a web page. 
Also note that we could use squares instead of circles, but this is not implemented in the Google Chart API yet. In order to implement a bubble plot, I have downloaded the crime dataset by state. The details regarding the link and definition of crime data are available in the crime.txt file and are shown in the following screenshot: How to do it… As with all the plots in this article, we will install and load the googleVis package. We will also import our data file in R using the read.csv() function: crm = read.csv("crimeusa.csv", header = TRUE, sep = ",") We can construct our bubble charts using the gvisBubbleChart() function in R: bub1 = gvisBubbleChart(crm, idvar = "States", xvar = "Robbery", yvar = "Burglary", sizevar = "Population", colorvar = "Year", options = list(legend = "none", width = 900, height = 600, title = "Crime per State in 2012", sizeAxis = "{maxSize: 40, minSize: 0.5}", vAxis = "{title: 'Burglary'}", hAxis = "{title: 'Robbery'}")) bub2 = gvisBubbleChart(crm, idvar = "States", xvar = "Robbery", yvar = "Burglary", sizevar = "Population", options = list(legend = "none", width = 900, height = 600, title = "Crime per State in 2012", sizeAxis = "{maxSize: 40, minSize: 0.5}", vAxis = "{title: 'Burglary'}", hAxis = "{title: 'Robbery'}")) How it works… The gvisBubbleChart() function uses six attributes to create a bubble chart, which are as follows: data: This is the data defined as a data frame, in our example, crm idvar: This is the column that is used to assign IDs to the bubbles, in our example, States xvar: This is the column in the data to plot on the x-axis, in our example, Robbery yvar: This is the column in the data to plot on the y-axis, in our example, Burglary sizevar: This is the column used to define the size of the bubble, in our example, Population colorvar: This is the column used to define the color, in our example, Year We can define the minimum and maximum sizes of each bubble using minSize and maxSize, respectively, under the sizeAxis option. 
Note that we have used gvisMerge to portray the differences between the two bubble plots. In the plot on the right, we have not made use of colorvar and hence all the bubbles are of the same color. There's more… The Google Chart API makes it easier for us to plot a bubble chart, but the same can be achieved using the basic R plot functions. We can make use of the symbols() function to create such a plot. The symbols need not be circles; they can be squares as well. By this time, you should have watched Hans Rosling's TED lecture and might be wondering how you could create a motion chart with bubbles floating around. The Google Charts API has the ability to create motion charts, and readers can use the googleVis reference manual to learn about this. See also The TED video by Hans Rosling can be accessed at http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen The Flowing Data website generates bubble charts using the basic R plot function and can be accessed at http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/ The Animated Bubble Chart by the New York Times can be accessed at http://2010games.nytimes.com/medals/map.html Summary This article introduces some of the basic R plots, such as line and bar charts. It also discusses the basic elements of interactive plots using the googleVis package in R. This article is a great resource for understanding the basic R plotting techniques. Resources for Article: Further resources on this subject: Using R for Statistics, Research, and Graphics [article] Data visualization [article] Visualization as a Tool to Understand Data [article]

Packt
06 Feb 2015
27 min read

Lync 2013 Hybrid and Lync Online

In this article, by the authors, Fabrizio Volpe, Alessio Giombini, Lasse Nordvik Wedø, and António Vargas of the book, Lync Server Cookbook, we will cover the following recipes: Introducing Lync Online Administering with the Lync Admin Center Using Lync Online Remote PowerShell Using Lync Online cmdlets Introducing Lync in a hybrid scenario Planning and configuring a hybrid deployment Moving users to the cloud Moving users back on-premises Debugging Lync Online issues (For more resources related to this topic, see here.) Introducing Lync Online Lync Online is part of the Office 365 offer and provides online users with the same Instant Messaging (IM), presence, and conferencing features that we would expect from an on-premises deployment of Lync Server 2013. Enterprise Voice, however, is not available on Office 365 tenants (or at least, it is available only with limitations regarding both specific Office 365 plans and geographical locations). There is no doubt that forthcoming versions of Lync and Office 365 will add what is needed to also support all the Enterprise Voice features in the cloud. Right now, the best that we are able to achieve is to move workloads, homing a part of our Lync users (the ones with no telephony requirements) in Office 365, while the remaining Lync users are homed on-premises. These solutions might be interesting for several reasons, including the fact that we can avoid the costs of expanding our existing on-premises resources by moving a part of our Lync-enabled users to Office 365. The previously mentioned configuration, which involves different kinds of Lync tenants, is called a hybrid deployment of Lync, and we will see how to configure it and move our users from online to on-premises and vice versa. In this Article, every time we talk about Lync Online and Office 365, we will assume that we have already configured an Office tenant. 
Administering with the Lync Admin Center Lync Online provides the Lync Admin Center (LAC), a dedicated control panel, to manage Lync settings. To open it, access the Office 365 portal and select Service settings, Lync, and Manage settings in the Lync admin center, as shown in the following screenshot: LAC, if you compare it with the on-premises Lync Control Panel (or with the Lync Management Shell), offers few options. For example, it is not possible to create or delete users directly inside Lync. We will see some of the tasks we are able to perform in LAC, and then, we will move to the (more powerful) Remote PowerShell. There is an alternative path to open LAC. From the Office 365 portal, navigate to Users & Groups | Active Users. Select a user, after which you will see a Quick Steps area with an Edit Lync Properties link that will open the user-editable part of LAC. How to do it... LAC is divided into five areas: users, organization, dial-in conferencing, meeting invitation, and tools, as you can see in the following screenshot: The Users panel will show us the configuration of the Lync Online enabled users. 
It is possible to modify the settings with the Edit option (the small pencil icon on the right): I have tried to summarize all the available options (inside the general, external communications, and dial-in conferencing tabs) in the following screenshot: Some of the user's settings are worth a mention; in the General tab, we have the following:    The Record Conversations and meetings option enables the Start recording option in the Lync client    The Allow anonymous attendees to dial-out option controls whether the anonymous users that are dialing-in to a conference are required to call the conferencing service directly or are authorized for callback    The For compliance, turn off non-archived features option disables Lync features that are not recorded by In-Place Hold for Exchange When you place an Exchange 2013 mailbox on In-Place Hold or Litigation Hold, the Microsoft Lync 2013 content (instant messaging conversations and files shared in an online meeting) is archived in the mailbox. In the dial-in conferencing tab, we have the configuration required for dial-in conferencing. The provider's drop-down menu shows a list of third parties that are able to deliver this kind of feature. The Organization tab manages privacy for presence information, push services, and external access (the equivalent of the Lync federation on-premises). If you enable external access, we will have the option to turn on Skype federation, as we can see in the following screenshot: The Dial-In Conferencing option is dedicated to the configuration of the external providers. The Meeting Invitation option allows the user to customize the Lync Meeting invitation. The Tools options offer a collection of troubleshooting resources. See also For details about Exchange In-Place Hold, see the TechNet post In-Place Hold and Litigation Hold at http://technet.microsoft.com/en-us/library/ff637980(v=exchg.150).aspx. 
Using Lync Online Remote PowerShell Managing a remote Lync deployment with Remote PowerShell has been possible since Lync 2010. This feature has always required a direct connection from the management station to the remote Lync deployment, and a series of steps that is not always simple to set up. Lync Online supports Remote PowerShell using a dedicated (64-bit only) PowerShell module, the Lync Online Connector. It is used to manage online users, and it is interesting because there are many settings and automation options that are available only through PowerShell. Getting ready Lync Online Connector requires one of the following operating systems: Windows 7 (with Service Pack 1), Windows Server 2008 R2, Windows Server 2012, Windows Server 2012 R2, Windows 8, or Windows 8.1. At least PowerShell 3.0 is needed. To check the installed version, we can use the $PSVersionTable variable. The result will be like the one in the following screenshot (taken on Windows 8.1, which uses PowerShell 4.0): How to do it... Download Windows PowerShell Module for Lync Online from the Microsoft site at http://www.microsoft.com/en-us/download/details.aspx?id=39366 and install it. It is useful to store our Office 365 credentials in an object. It is possible to launch the cmdlets in step 3 without doing so, and we will be prompted for the Office 365 administrator credentials, but we will then have to enter the authentication information again every time it is required. To store the credentials, we can use the $credential = Get-Credential statement in a PowerShell session. We will be prompted for our username and password for Lync Online, as shown in the following screenshot: To use the Online Connector, open a PowerShell session and use the New-CsOnlineSession cmdlet. One of the ways to start a remote PowerShell session is $session = New-CsOnlineSession -Credential $credential. Now, we need to import the session that we have created with Lync Online inside PowerShell, with the Import-PSSession $session cmdlet. 
A temporary Windows PowerShell module will be created, which contains all the Lync Online cmdlets. The name of the temporary module will be similar to the one we can see in the following screenshot: Now, we will have the cmdlets of the Lync Online module loaded in memory, in addition to any command that we already have available in PowerShell. How it works... The feature is based on a PowerShell module, the LyncOnlineConnector, shown in the following screenshot: It contains only two cmdlets, the Set-WinRMNetworkDelayMS and New-CsOnlineSession cmdlets. The latter will load the required cmdlets in memory. As we have seen in the previous steps, the Online Connector adds the Lync Online PowerShell cmdlets to the ones already available. This is something we will use when talking about hybrid deployments, where we will start from the Lync Management Shell and then import the module for Lync Online. It is a good habit to verify (and close) your previous remote sessions. This can be done by selecting a specific session (using Get-PSSession and then pointing to a specific session with the Remove-PSSession statement) or closing all the existing ones with the Get-PSSession | Remove-PSSession cmdlet. In the previous versions of the module, Microsoft Online Services Sign-In Assistant was required. This prerequisite was removed from the latest version. There's more... There are some checks that we are able to perform when using the PowerShell module for Lync Online. By launching the New-CsOnlineSession cmdlet with the –verbose switch, we will see all the messages related to the opening of the session. The result should be similar to the one shown in the following screenshot: Another verification comes from the Get-Command -Module tmp_gffrkflr.ufz command, where the module name (in this example, tmp_gffrkflr.ufz) is the temporary module we saw during the Import-PSSession step. The output of the command will show all the Lync Online cmdlets that we have loaded in memory. 
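The connection, verification, and cleanup steps described in this recipe can be gathered into one short script. What follows is only a minimal sketch: it assumes that the Lync Online Connector module is installed and that the account is an Office 365 administrator, the variable name $module is illustrative, and it has not been run against a real tenant.

```powershell
# Sketch only: requires the Lync Online Connector module and an Office 365 tenant.

# The module needs at least PowerShell 3.0.
if ($PSVersionTable.PSVersion.Major -lt 3) {
    throw "PowerShell 3.0 or later is required."
}

# Close any remote sessions left over from earlier work.
Get-PSSession | Remove-PSSession

# Prompt once for the Office 365 administrator credentials and reuse them.
$credential = Get-Credential

# Open the remote session; -Verbose prints the messages related to the connection.
$session = New-CsOnlineSession -Credential $credential -Verbose

# Import-PSSession returns the temporary module that holds the Lync Online cmdlets.
$module = Import-PSSession $session

# List the cmdlets that were loaded from that temporary module.
Get-Command -Module $module.Name | Sort-Object Name | Select-Object Name
```

Capturing the object returned by Import-PSSession avoids copying the temporary module name (tmp_gffrkflr.ufz in the example above) by hand, because that name changes with every session.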
The Import-PSSession cmdlet imports all commands except the ones that have the same name of a cmdlet that already exists in the current PowerShell session. To overwrite the existing cmdlets, we can use the -AllowClobber parameter. See also During the introduction of this section, we also discussed the possibility to administer on-premises, remote Lync Server 2013 deployment with a remote PowerShell session. John Weber has written a great post about it in his blog Lync 2013 Remote Admin with PowerShell at http://tsoorad.blogspot.it/2013/10/lync-2013-remote-admin-with-powershell.html, which is helpful if you want to use the previously mentioned feature. Using Lync Online cmdlets In the previous recipe, we outlined the steps required to establish a remote PowerShell session with Lync Online. We have less than 50 cmdlets, as shown in the result of the Get-Command -Module command in the following screenshot: Some of them are specific for Lync Online, such as the following: Get-CsAudioConferencingProvider Get-CsOnlineUser Get-CsTenant Get-CsTenantFederationConfiguration Get-CsTenantHybridConfiguration Get-CsTenantLicensingConfiguration Get-CsTenantPublicProvider New-CsEdgeAllowAllKnownDomains New-CsEdgeAllowList New-CsEdgeDomainPattern Set-CsTenantFederationConfiguration Set-CsTenantHybridConfiguration Set-CsTenantPublicProvider Update-CsTenantMeetingUrl All the remaining cmdlets can be used either with Lync Online or with the on-premises version of Lync Server 2013. We will see the use of some of the previously mentioned cmdlets. How to do it... The Get-CsTenant cmdlet will list Lync Online tenants configured for use in our organization. The output of the command includes information such as the preferred language, registrar pool, domains, and assigned plan. The Get-CsTenantHybridConfiguration cmdlet gathers information about the hybrid configuration of Lync. 
Management of the federation capability for Lync Online (the feature that enables Instant Messaging and Presence information exchange with users of other domains) is based on the allowed domain and blocked domain lists, as we can see in the organization and external communications screen of LAC, shown in the following screenshot: There are similar ways to manage federation from the Lync Online PowerShell, but they require putting together different statements as follows:     We can use an accept all domains excluding the ones in the exceptions list approach. To do this, we store the output of the New-CsEdgeAllowAllKnownDomains cmdlet in a variable. Then, we can use the Set-CsTenantFederationConfiguration cmdlet to allow all the domains (except the ones in the block list) for our tenant. We can use the example on TechNet (http://technet.microsoft.com/en-us/library/jj994088.aspx) and integrate it with Get-CsTenant to retrieve the tenant ID.     If we prefer, we can use a block all domains but permit the ones in the allow list approach. We have to define a domain name (pattern) for every domain to allow with the New-CsEdgeDomainPattern cmdlet, and each one of them will be saved in a variable. Then, the New-CsEdgeAllowList cmdlet will create a list of allowed domains from the variables. Finally, the Set-CsTenantFederationConfiguration cmdlet will be used. The tenant we will work on is cc3b6a4e-3b6b-4ad4-90be-6faa45d05642. The example on TechNet (http://technet.microsoft.com/en-us/library/jj994023.aspx) will be used: $x = New-CsEdgeDomainPattern -Domain "contoso.com" $y = New-CsEdgeDomainPattern -Domain "fabrikam.com" $newAllowList = New-CsEdgeAllowList -AllowedDomain $x,$y Set-CsTenantFederationConfiguration -Tenant "cc3b6a4e-3b6b-4ad4-90be-6faa45d05642" -AllowedDomains $newAllowList The Get-CsOnlineUser cmdlet provides information about users enabled on Office 365. The result will show both users synced with Active Directory and users homed in the cloud. 
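The first federation approach described above (accept all domains except the blocked ones) is not spelled out in code, so here is a minimal sketch of how it might look. It assumes an open remote Lync Online session and a single tenant returned by Get-CsTenant; the variable names $tenantId and $allowAll are illustrative, and the script has not been run against a real tenant.

```powershell
# Sketch only: run inside a remote Lync Online PowerShell session.

# Read the tenant ID instead of typing the GUID by hand.
# Assumption: Get-CsTenant returns exactly one tenant.
$tenantId = (Get-CsTenant).TenantId

# Build the "allow all known domains" object.
$allowAll = New-CsEdgeAllowAllKnownDomains

# Apply it to the tenant; domains in the blocked list stay blocked.
Set-CsTenantFederationConfiguration -Tenant $tenantId -AllowedDomains $allowAll

# Review the resulting federation settings.
Get-CsTenantFederationConfiguration | Format-List AllowedDomains, BlockedDomains
```

Reading the tenant ID from Get-CsTenant is the integration the recipe hints at; it keeps the GUID out of the script, which helps if the same script is reused for another tenant.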
The Get-CsOnlineUser cmdlet supports filters to limit the output; for example, Get-CsOnlineUser -Identity fab will gather information about the user whose alias is fab. This is an account synced from the on-premises Directory Services, so the value of the DirSyncEnabled parameter will be True. See also All the cmdlets of the Remote PowerShell for Lync Online are listed in the TechNet post Lync Online cmdlets at http://technet.microsoft.com/en-us/library/jj994021.aspx. This is the main source of details on each individual cmdlet. Introducing Lync in a hybrid scenario In a Lync hybrid deployment, we have the following: User accounts and related information homed in the on-premises Directory Services and replicated to Office 365. A part of our Lync users that consume on-premises resources and a part of them that use online (Office 365 / Lync Online) resources. The same (public) domain name used both online and on-premises (Lync-split DNS). Other Office 365 services and integration with other applications available to all our users, irrespective of where their Lync is provisioned. One way to define Lync hybrid configuration is as an on-premises Lync deployment federated with an Office 365 / Lync Online tenant subscription. While it is not a perfect explanation, it gives us an idea of the scenario we are talking about. Not all the features of Lync Server 2013 (especially the ones related to Enterprise Voice) are available to Lync Online users. This limitation, along with other reasons (company policies, compliance requirements, and so on), might make a hybrid deployment of Lync the best available solution. What we have to clarify now is how to make those users on different deployments talk to each other, see each other's presence status, and so on. What we will see in this section is a high-level overview of the required steps. The Planning and configuring a hybrid deployment recipe will provide more details about the individual steps. 
The list of steps here is the one required to configure a hybrid deployment, starting from Lync on-premises. In the following sections, we will also see the opposite scenario (with our initial deployment in the cloud). How to do it... It is required to have an available Office 365 tenant configuration. Our subscription has to include Lync Online. We have to configure an Active Directory Federation Services (AD FS) server in our domain and make it available to the Internet using a public FQDN and an SSL certificate released from a third-party certification authority. Office 365 must be enabled to synchronize with our company's Directory Services, using Active Directory Sync. Our Office 365 tenant must be federated. The last step is to configure Lync for a hybrid deployment. There's more... One of the requirements for a hybrid distribution of Lync is an on-premises deployment of Lync Server 2013 or Lync Server 2010. For Lync Server 2010, it is required to have the latest available updates installed, both on the Front Ends and on the Edge servers. It is also required to have the Lync Server 2013 administrative tools installed on a separate server. More details about supported configuration are available on the TechNet post Planning for Lync Server 2013 hybrid deployments at http://technet.microsoft.com/en-us/library/jj205403.aspx. DNS SRV records for hybrid deployments, _sipfederationtls._tcp.<domain> and _sip._tls.<domain>, should point to the on-premises deployment. The lyncdiscover. <domain> record will point to the FQDN of the on-premises reverse proxy server. The _sip._tls. <domain> SRV record will resolve to the public IP of the Access Edge service of Lync on-premises. Depending on the kind of service we are using for Lync, Exchange, and SharePoint, only a part of the features related to the integration with the additional services might be available. For example, skills search is available only if we are using Lync and SharePoint on-premises. 
The following TechNet post Supported Lync Server 2013 hybrid configurations at http://technet.microsoft.com/en-us/library/jj945633.aspx offers a matrix of features / service deployment combinations. See also Interesting information about Lync Hybrid configuration is presented in sessions available on Channel9 and coming from the Lync Conference 2014 (Lync Online Hybrid Deep Dive at http://channel9.msdn.com/Events/Lync-Conference/Lync-Conference-2014/ONLI302) and from TechEd North America 2014 (Microsoft Lync Online Hybrid Deep Dive at http://channel9.msdn.com/Events/TechEd/NorthAmerica/2014/OFC-B341#fbid=). Planning and configuring a hybrid deployment The planning phase for a hybrid deployment starts from a simple consideration: do we have an on-premises deployment of Lync Server? If the previously mentioned scenario is true, do we want to move users to the cloud or vice versa? Although the first situation is by far the most common, we have to also consider the case in which we have our first deployment in the cloud. How to do it... This step is all that is required for the scenario that starts from Lync Online. We have to completely deploy our Lync on-premises. Establish a remote PowerShell session with Office 365. Use the shared SIP address cmdlet Set-CsTenantFederationConfiguration -SharedSipAddressSpace $True to enable Office 365 to use a Shared Session Initiation Protocol (SIP) address space with our on-premises deployment. To verify this, we can use the Get-CsTenantFederationConfiguration command. The SharedSipAddressSpace value should be set to True. All the following steps are for the scenario that starts from the on-premises Lync deployment. After we have subscribed with a tenant, the first step is to add the public domain we use for our Lync users to Office 365 (so that we can split it on the two deployments). To access the Office 365 portal, select Domains. The next step is Specify a domain name and confirm ownership. 
We will be required to type a domain name. If our domain is hosted on some specific providers (such as GoDaddy), the verification process can be automated, or we have to proceed manually. The process requires to add one DNS record (TXT or MX), like the ones shown in the following screenshot: If we need to check our Office 365 and on-premises deployments before continuing with the hybrid deployment, we can use the Setup Assistant for Office 365. The tool is available inside the Office 365 portal, but we have to launch it from a domain-joined computer (the login must be performed with the domain administrative credentials). In the Setup menu, we have a Quick Start and an Extend Your Setup option (we have to select the second one). The process can continue installing an app or without software installation, as shown in the following screenshot: The app (which makes the assessment of the existing deployment easier) is installed by selecting Next in the previous screen (it requires at least Windows 7 with Service Pack 1, .NET Framework 3.5, and PowerShell 2.0). Synchronization with the on-premises Active Directory is required. This last step federates Lync Server 2013 with Lync Online to allow communication between our users. The first cmdlet to use is Set-CSAccessEdgeConfiguration -AllowOutsideUsers 1 -AllowFederatedUsers 1 -UseDnsSrvRouting -EnablePartnerDiscovery 1. Note that the -EnablePartnerDiscovery parameter is required. Setting it to 1 enables automatic discovery of federated partner domains. It is possible to set it to 0. The second required cmdlet is New-CSHostingProvider -Identity LyncOnline -ProxyFqdn "sipfed.online.lync.com" -Enabled $true -EnabledSharedAddressSpace $true -HostsOCSUsers $true –VerificationLevel UseSourceVerification -IsLocal $false -AutodiscoverUrl https://webdir.online.lync.com/Autodiscover/AutodiscoverService.svc/root. 
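The two on-premises federation cmdlets above can be combined into one script for the Lync Server Management Shell. This is a minimal sketch under a few assumptions: it uses $true where the text uses 1 (PowerShell converts 1 to $true for Boolean parameters), it removes an existing LyncOnline hosting provider before recreating it, and the variable name $existing is illustrative. It has not been run against a real deployment.

```powershell
# Sketch only: run in the on-premises Lync Server Management Shell.

# Allow remote and federated users, with DNS SRV routing and partner discovery.
Set-CsAccessEdgeConfiguration -AllowOutsideUsers $true -AllowFederatedUsers $true `
    -UseDnsSrvRouting -EnablePartnerDiscovery $true

# If a LyncOnline hosting provider already exists, remove it so it can be recreated.
$existing = Get-CsHostingProvider -Identity LyncOnline -ErrorAction SilentlyContinue
if ($existing) {
    Remove-CsHostingProvider -Identity LyncOnline
}

# Define Lync Online as a hosting provider that shares our SIP address space.
New-CsHostingProvider -Identity LyncOnline -ProxyFqdn "sipfed.online.lync.com" `
    -Enabled $true -EnabledSharedAddressSpace $true -HostsOCSUsers $true `
    -VerificationLevel UseSourceVerification -IsLocal $false `
    -AutodiscoverUrl "https://webdir.online.lync.com/Autodiscover/AutodiscoverService.svc/root"
```

Removing and recreating the provider keeps the script repeatable; updating the existing entry with Set-CsHostingProvider is the alternative when the provider must not disappear, even briefly.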
The result of the commands is shown in the following screenshot: If Lync Online is already defined, we have to use the Set-CsHostingProvider cmdlet, or we can remove it (Remove-CsHostingProvider -Identity LyncOnline) and then create it using the previously mentioned cmdlet. There's more... In the Lync hybrid scenario, users created in the on-premises directory are replicated to the cloud, while users generated in the cloud will not be replicated on-premises. Lync Online users are managed using the Office 365 portal, while the users on-premises are managed using the usual tools (Lync Control Panel and Lync Management Shell). Moving users to the cloud When we move users from Lync on-premises to the cloud, some user information is lost. The operation requires the Lync administrative tools and the PowerShell module for Lync Online to be installed on the same computer. If we install the module for Lync Online before the administrative tools for Lync Server 2013, the OCSCore.msi file overwrites the LyncOnlineConnector.ps1 file, and New-CsOnlineSession will require a -TargetServer parameter. In this situation, we have to reinstall the Lync Online module (see the following post on the Microsoft support site at http://support.microsoft.com/kb/2955287). Getting ready Remember that to move users to Lync Online, they must be enabled for both Lync Server on-premises and Lync Online (so we have to assign each user a license for Lync Online by using the Office 365 portal). Moving a user with no assigned license fails with the error Move-CsUser : HostedMigration fault: Error=(507), Description=(User must has an assigned license to use Lync Online. For more details, refer to the Microsoft support site at http://support.microsoft.com/kb/2829501. How to do it... Open a new Lync Management Shell session and launch the remote session on Office 365 with the sequence of cmdlets we saw earlier. 
We have to add the -AllowClobber parameter so that the Lync Online module's cmdlets are able to overwrite the corresponding Lync Management Shell cmdlets: $credential = Get-Credential $session = New-CsOnlineSession -Credential $credential Import-PSSession $session -AllowClobber Open the Lync Admin Center (as we have seen in the dedicated section) by going to Service settings | Lync | Manage settings in the Lync Admin Center, and copy the first part of the URL, for example, https://admin0e.online.lync.com. Append the string /HostedMigration/hostedmigrationservice.svc to that URL (in our example, the result will be https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.svc). The following cmdlet will move users from Lync on-premises to Lync Online. The required parameters are the identity of the Lync user and the URL that we prepared in step 2. The user identity is fabrizio.volpe@absoluteuc.biz: Move-CsUser -Identity fabrizio.volpe@absoluteuc.biz -Target sipfed.online.lync.com -Credential $credential -HostedMigrationOverrideUrl https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.svc Usually, we are required to insert (again) the Office 365 administrative credentials, after which we will receive a warning about the fact that we are moving our user to a different version of the service, like the one in the following screenshot: See the There's more... section of this recipe for details about user information that is migrated to Lync Online. We are able to quickly verify whether the user has moved to Lync Online by using the Get-CsUser | fl DisplayName,HostingProvider,RegistrarPool,SipAddress command. For users homed on-premises, HostingProvider is equal to SRV: and RegistrarPool is madhatter.wonderland.lab (the name of the internal Lync Front End). 
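The URL preparation, the move, and the verification described above can be sketched as a single sequence. This is a minimal sketch, assuming the Lync Online session has already been imported with -AllowClobber and that $credential holds the Office 365 administrator credentials; $adminHost, $migrationUrl, and $user are illustrative variable names, and the host name is the example value from this recipe, not one to reuse as-is.

```powershell
# Sketch only: run in the Lync Management Shell after importing the Lync Online session.

# First part of the Lync Admin Center URL (example value from this recipe).
$adminHost    = "https://admin0e.online.lync.com"
$migrationUrl = "$adminHost/HostedMigration/hostedmigrationservice.svc"

# SIP address of the user to move (example value from this recipe).
$user = "fabrizio.volpe@absoluteuc.biz"

# Move the user to Lync Online, reusing the saved administrator credentials.
Move-CsUser -Identity $user -Target "sipfed.online.lync.com" `
    -Credential $credential -HostedMigrationOverrideUrl $migrationUrl

# Check where the user is homed after the move.
Get-CsUser -Identity $user |
    Format-List DisplayName, HostingProvider, RegistrarPool, SipAddress
```

Building the migration URL from a single variable keeps the host name in one place, which avoids mismatches between the copied URL and the one passed to Move-CsUser.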
For users homed in Lync Online, HostingProvider is sipfed.online.lync.com and RegistrarPool is empty, as shown in the following screenshot (the user Fabrizio is homed on-premises, while the user Fabrizio volpe is homed in the cloud): There's more... If we plan to move more than one user, we have to add a selection and pipe it to the cmdlet we have already used, removing the -Identity parameter. For example, to move all users from an Organizational Unit (OU) (for example, LyncUsers in the Wonderland.Lab domain) to Lync Online, we can use Get-CsUser -OU "OU=LyncUsers,DC=wonderland,DC=lab" | Move-CsUser -Target sipfed.online.lync.com -Credential $credential -HostedMigrationOverrideUrl https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.svc. We are also able to move users based on a parameter to match using the -Filter parameter of Get-CsUser. As we mentioned earlier, not all the user information is migrated to Lync Online. Contact lists, groups, and access control lists are migrated, while meetings, contents, and schedules are lost. We can use the Lync Meeting Update Tool to update the meeting links (which change when our user's home server changes) and automatically send updated meeting invitations to participants. There is a 64-bit version (http://www.microsoft.com/en-us/download/details.aspx?id=41656) and a 32-bit version (http://www.microsoft.com/en-us/download/details.aspx?id=41657) of the previously mentioned tool. Moving users back on-premises It is possible to move back users that have been moved from the on-premises Lync deployment to the cloud, and it is also possible to move on-premises users that have been defined and enabled directly in Office 365. In the latter scenario, it is important to also create the user in the on-premises domain (Directory Service). How to do it… The Lync Online user must be created in the Active Directory (for example, I will define the BornOnCloud user that already exists in Office 365). 
The user must be enabled in the on-premises Lync deployment, for example, using the Lync Management Shell with the following cmdlet: Enable-CsUser -Identity "BornOnCloud" -SipAddress "SIP:BornOnCloud@absoluteuc.biz" -HostingProviderProxyFqdn "sipfed.online.lync.com" Sync the Directory Services. Now, we have to save our Office 365 administrative credentials in a variable ($cred = Get-Credential) and then move the user from Lync Online to the on-premises Front End using the Lync Management Shell (the -HostedMigrationOverrideUrl parameter has the same value that we used in the previous section): Move-CsUser -Identity BornOnCloud@absoluteuc.biz -Target madhatter.wonderland.lab -Credential $cred -HostedMigrationOverrideUrl https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.svc The Get-CsUser | fl DisplayName,HostingProvider,RegistrarPool,SipAddress command is used to verify whether the user has moved as expected. See also Guy Bachar has published an interesting post on his blog, Moving Users back to Lync on-premises from Lync Online (http://guybachar.wordpress.com/2014/03/31/moving-users-back-to-lync-on-premises-from-lync-online/), where he shows how he solved some errors related to the user move by modifying the HostedMigrationOverrideUrl parameter. Debugging Lync Online issues Getting ready When moving from an on-premises solution to a cloud tenant, the first aspect we have to accept is that we will not have the same level of control over the deployment that we had before. The tools we will list are helpful in resolving issues related to Lync Online, but the level of insight into an issue that they give a system administrator is not the same as the one we have with tools such as Snooper or OCSLogger. Knowing this, the more users we move to the cloud, the more we will have to use the online instruments. 
How to do it… The Set up Lync Online external communications site on Microsoft Support (http://support.microsoft.com/common/survey.aspx?scid=sw;en;3592&showpage=1) is a guided walk-through that helps in setting up communication between our Lync Online users and external domains. The tool provides guidelines to assist in the setup of Lync Online for small to enterprise businesses. As you can see in the following screenshot, every single task is well explained: The Remote Connectivity Analyzer (RCA) (https://testconnectivity.microsoft.com/) is an outstanding tool to troubleshoot both Lync on-premises and Lync Online. The web page includes tests to analyze common errors and misconfigurations related to Microsoft services such as Exchange, Lync, and Office 365. To test different scenarios, it is necessary to use various network protocols and ports. If we are working on a firewall-protected network, using the RCA, we are also able to test services that are not directly available to us. For Lync Online, there are some tests that are especially interesting; in the Office 365 tab, the Office 365 General Tests section includes the Office 365 Lync Domain Name Server (DNS) Connectivity Test and the Office 365 Single Sign-On Test, as shown in the following screenshot: The Single Sign-On test is really useful in a scenario. The test requires our domain username and password, both synced with the on-premises Directory Services. The steps include searching the FQDN of our AD FS server on an Internet DNS, verifying the certificate and connectivity, and then validating the token that contains the credentials. The Client tab offers to download the Microsoft Connectivity Analyzer Tool and the Microsoft Lync Connectivity Analyzer Tool, which we will see in the following two dedicated steps: The Microsoft Connectivity Analyzer Tool makes many of the tests we see in the RCA available on our desktop. 
The list of prerequisites is provided in the article Microsoft Connectivity Analyzer Tool (http://technet.microsoft.com/library/jj851141(v=exchg.80).aspx), and includes Windows Vista/Windows 2008 or later versions of the operating system, .NET Framework 4.5, and an Internet browser, such as Internet Explorer, Chrome, or Firefox. For the Lync tests, a 64-bit operating system is mandatory, and the UCMA runtime 4.0 is also required (it is part of Lync Server 2013 setup, and is also available for download at http://www.microsoft.com/en-us/download/details.aspx?id=34992). The tools propose ways to solve different issues, and then, they run the same tests available on the RCA site. We are able to save the results in an HTML file. The Microsoft Lync Connectivity Analyzer Tool is dedicated to troubleshooting the clients for mobile devices (the Lync Windows Store app and Lync apps). It tests all the required configurations, including autodiscover and webticket services. The 32-bit version is available at http://www.microsoft.com/en-us/download/details.aspx?id=36536, while the 64-bit version can be downloaded from http://www.microsoft.com/en-us/download/details.aspx?id=36535. .NET Framework 4.5 is required. The tool itself requires a few configuration parameters; we have to insert the user information that we usually add in the Lync app, and we have to use a couple of drop-down menus to describe the scenario we are testing (on-premises or Internet, and the kind of client we are going to test). The Show drop-down menu enables us to look not only at a summary of the test results but also at the detailed information. 
The detailed view includes all the information and requests sent and received during the test, with the FQDN included in the answer ticket from our services, and so on, as shown in the following screenshot:

The Troubleshooting Lync Online sign-in post is a support page, available in two different versions (admins and users), and is a walk-through to help admins (or users) troubleshoot login issues. The admin version is available at http://support.microsoft.com/common/survey.aspx?scid=sw;en;3695&showpage=1, while the user version is available at http://support.microsoft.com/common/survey.aspx?scid=sw;en;3719&showpage=1. Based on our answers to the different scenario questions, the site will propose information or solution steps. The following screenshot is part of the resolution for the login issues of a company that has an enterprise subscription with a custom domain:

The Office 365 portal includes some information to help us monitor our Lync subscription. In the Service Health menu, navigate to Service Health; we have a list of all the incidents and service issues of the past days. In the Reports menu, we have statistics about our Office 365 consumption, including Lync. In the following screenshot, we can see the previously mentioned pages:

There's more...

One interesting aspect of the Microsoft Lync Connectivity Analyzer Tool that we have seen is that it enables testing for on-premises or Office 365 accounts (both from inside our network and from the Internet). This capability makes it a great tool to troubleshoot the configuration for Lync on the mobile devices that we have deployed in our internal network. This setup is usually complex, including hair-pinning and split DNS, so the diagnostics are important to quickly find misconfigured services.
See also The Troubleshooting Lync Sign-in Errors (Administrators) page on Office.com at http://office.microsoft.com/en-001/communicator-help/troubleshooting-lync-sign-in-errors-administrators-HA102759022.aspx contains a list of messages related to sign-in errors with a suggested solution or a link to additional external resources. Summary In this article, we have learned about managing Lync 2013 and Lync Online and using Lync Online Remote PowerShell and Lync Online cmdlets. Resources for Article: Further resources on this subject: Adding Dialogs [article] Innovation of Communication and Information Technologies [article] Choosing Lync 2013 Clients [article]

Packt
06 Feb 2015
12 min read

Qlik Sense's Vision

In this article by Christopher Ilacqua, Henric Cronström, and James Richardson, authors of the book Learning Qlik® Sense, we will look at the evolving requirements that compel organizations to readdress how they deliver business intelligence and support data-driven decision-making. This is important as it supplies some of the reasons as to why Qlik® Sense is relevant and important to their success. The purpose of covering these factors is so that you can consider and plan for them in your organization. Among other things, in this article, we will cover the following topics: The ongoing data explosion The rise of in-memory processing Barrierless BI through Human-Computer Interaction The consumerization of BI and the rise of self-service The use of information as an asset The changing role of IT (For more resources related to this topic, see here.) Evolving market factors Technologies are developed and evolved in response to the needs of the environment they are created and used within. The most successful new technologies anticipate upcoming changes in order to help people take advantage of altered circumstances or reimagine how things are done. Any market is defined by both the suppliers—in this case, Qlik®—and the buyers, that is, the people who want to get more use and value from their information. Buyers' wants and needs are driven by a variety of macro and micro factors, and these are always in flux in some markets more than others. This is obviously and apparently the case in the world of data, BI, and analytics, which has been changing at a great pace due to a number of factors discussed further in the rest of this article. Qlik Sense has been designed to be the means through which organizations and the people that are a part of them thrive in a changed environment. Big, big, and even bigger data A key factor is that there's simply much more data in many forms to analyze than before. We're in the middle of an ongoing, accelerating data boom. 
According to Science Daily, 90 percent of the world's data was generated over the past two years. The fact is that with technologies such as Hadoop and NoSQL databases, we now have unprecedented access to cost-effective data storage. With vast amounts of data now storable and available for analysis, people need a way to sort the signal from the noise. People from a wider variety of roles, not all of them BI users or business analysts, are demanding better, greater access to data, regardless of where it comes from. Qlik Sense's fundamental design centers on bringing varied data together for exploration in an easy and powerful way.

The slow spinning down of the disk

At the same time, we are seeing a shift in how computation occurs and, potentially, how information is managed. Fundamentals of the computing architectures that we've used for decades, the spinning disk and moving read head, are becoming outmoded. This means of storing and accessing data has been around since Edison invented the cylinder phonograph in 1877. It's about time this changed. This technology has served us very well; it was elegant and reliable, but it has limitations, primarily speed limitations. Fundamentals that we take for granted today in BI, such as relational and multidimensional storage models, were built around these limitations. So were our IT skills, whether we realized it at the time or not. With the use of in-memory processing and 64-bit addressable memory spaces, these limitations are gone! This means a complete change in how we think about analysis. Processing data in memory means we can do analysis that was impractical or impossible with the old approach. With in-memory computing, analysis that would have taken days before now takes just seconds (or much less). However, why does it matter? Because it allows us to use time more effectively; after all, time is the most finite resource of all.
In-memory computing enables us to ask more questions, test more scenarios, do more experiments, debunk more hypotheses, explore more data, and run more simulations in the short window available to us. For IT, it means no longer trying to second-guess what users will do months or years in advance and trying to premodel it in order to achieve acceptable response times. People hate watching the hourglass spin. Qlik Sense's predecessor QlikView® was built on the exploitation of in-memory processing; Qlik Sense has it at its core too. Ubiquitous computing and the Internet of Things You may know that more than a billion people use Facebook, but did you know that the majority of those people do so from a mobile device? The growth in the number of devices connected to the Internet is absolutely astonishing. According to Cisco's Zettabyte Era report, Internet traffic from wireless devices will exceed traffic from wired devices in 2014. If we were writing this article even as recently as a year ago, we'd probably be talking about mobile BI as a separate thing from desktop or laptop delivered analytics. The fact of the matter is that we've quickly gone beyond that. For many people now, the most common way to use technology is on a mobile device, and they expect the kind of experience they've become used to on their iOS or Android device to be mirrored in complex software, such as the technology they use for visual discovery and analytics. From its inception, Qlik Sense has had mobile usage in the center of its design ethos. It's the first data discovery software to be built for mobiles, and that's evident in how it uses HTML5 to automatically render output for the device being used, whatever it is. Plug in a laptop running Qlik Sense to a 70-inch OLED TV and the visual output is resized and re-expressed to optimize the new form factor. So mobile is the new normal. This may be astonishing but it's just the beginning. 
Mobile technology isn't just a medium to deliver information to people, but an accelerator of data production for analysis too. By 2020, pretty much everyone and an increasing number of things will be connected to the Internet. There are 7 billion people on the planet today. Intel predicts that by 2020, more than 31 billion devices will be connected to the Internet. That's not just devices used by people directly to consume or share information. More and more things will be put online and communicate their state: cars, fridges, lampposts, shoes, rubbish bins, pets, plants, heating systems, you name it. These devices will generate a huge amount of data from sensors that monitor all kinds of measurable attributes: temperature, velocity, direction, orientation, and time. This means an increasing opportunity to understand a huge gamut of data, but without the right technology and approaches it will be complex to analyze what is going on. Old methods of analysis won't work, as they don't move quickly enough. The variety and volume of information that can be analyzed will explode at an exponential rate. The rise of this type of big data makes us redefine how we build, deliver, and even promote analytics. It is an opportunity for those organizations that can exploit it through analysis; this can sort the signals from the noise and make sense of the patterns in the data. Qlik Sense is designed as just such a signal booster; it lets users zoom and pan through information that would otherwise be too large for them to easily understand.

Unbound Human-Computer Interaction

We touched on the boundary between the computing power and the humans using it in the previous section. Increasingly, we're removing barriers between humans and technology. Take the rise of touch devices. Users don't want to just view data presented to them in a static form. Instead, they want to "feel" the data and interact with it. The same is increasingly true of BI.
The adoption of BI tools has been too low because the technology has been hard to use: in the past, BI tools often required people to conform to the tool's way of working, rather than reflecting the user's way of thinking. The aspiration for Qlik Sense (when part of the QlikView.Next project) was that the software should be both "gorgeous and genius". The genius part obviously refers to the built-in intelligence, the smarts, the software will have. The gorgeous part is often misunderstood or at least oversimplified. Yes, it means cosmetically attractive (which is important) but much more importantly, it means enjoyable to use and experience. In other words, Qlik Sense should never be jarring to users but seamless, perhaps almost transparent to them, inducing a state of mental flow that encourages thinking about the question being considered rather than the tool used to answer it. The aim was to be of most value to people. Qlik Sense will empower users to explore their data and uncover hidden insights, naturally.

Evolving customer requirements

It is not only the external market drivers that impact how we use information. Our organizations and the people that work within them are also changing in their attitude towards technology, how they express ideas through data, and how they increasingly make use of data as a competitive weapon.

Consumerization of BI and the rise of self-service

The consumerization of any technology space is all about how enterprises are affected by, and can take advantage of, new technologies and models that originate and develop in the consumer market, rather than in the enterprise IT sector. The reality is that individuals react quicker than enterprises to changes in technology. As such, consumerization cannot be stopped, nor is it something an organization chooses to adopt; it can, however, be embraced. While it's not viable to build a BI strategy around consumerization alone, its impact must be considered.
Consumerization makes itself felt in three areas: Technology: Most investment in innovation occurs in the consumer space first, with enterprise vendors incorporating consumer-derived features after the fact. (Think about how vendors added the browser as a UI for business software applications.) Economics: Consumer offerings are often less expensive or free (to try) with a low barrier of entry. This drives prices down, including enterprise sectors, and alters selection behavior. People: Demographics, which is the flow of Millennial Generation into the workplace, and the blurring of home/work boundaries and roles, which may be seen from a traditional IT perspective as rogue users, with demands to BYOPC or device. In line with consumerization, BI users want to be able to pick up and just use the technology to create and share engaging solutions; they don't want to read the manual. This places a high degree of importance on the Human-Computer Interaction (HCI) aspects of a BI product (refer to the preceding list) and governed access to information and deployment design. Add mobility to this and you get a brand new sourcing and adoption dynamic in BI, one that Qlik engendered, and Qlik Sense is designed to take advantage of. Think about how Qlik Sense Desktop was made available as a freemium offer. Information as an asset and differentiator As times change, so do differentiators. For example, car manufacturers in the 1980s differentiated themselves based on reliability, making sure their cars started every single time. Today, we expect that our cars will start; reliability is now a commodity. The same is true for ERP systems. Originally, companies implemented ERPs to improve reliability, but in today's post-ERP world, companies are shifting to differentiating their businesses based on information. This means our focus changes from apps to analytics. 
And analytics apps, like those delivered by Qlik Sense, help companies access the data they need to set themselves apart from the competition. However, to get maximum return from information, the analysis must be delivered fast enough, and in sync with the operational tempo people need. Things are speeding up all the time. For example, take the fashion industry. Large mainstream fashion retailers used to work two seasons per year. Those that stuck to that were destroyed by fast fashion retailers. The same is true for old style, system-of-record BI tools; they just can't cope with today's demands for speed and agility. The rise of information activism A new, tech-savvy generation is entering the workforce, and their expectations are different than those of past generations. The Beloit College Mindset List for the entering class of 2017 gives the perspective of students entering college this year, how they see the world, and the reality they've known all their lives. For this year's freshman class, Java has never been just a cup of coffee and a tablet is no longer something you take in the morning. This new generation of workers grew up with the Internet and is less likely to be passive with data. They bring their own devices everywhere they go, and expect it to be easy to mash-up data, communicate, and collaborate with their peers. The evolution and elevation of the role of IT We've all read about how the role of IT is changing, and the question CIOs today must ask themselves is: "How do we drive innovation?". IT must transform from being gatekeepers (doers) to storekeepers (enablers), providing business users with self-service tools they need to be successful. However, to achieve this transformation, they need to stock helpful tools and provide consumable information products or apps. Qlik Sense is a key part of the armory that IT needs to provide to be successful in this transformation. 
Summary In this article, we looked at the factors that provide the wider context for the use of Qlik Sense. The factors covered arise out of both increasing technical capability and demands to compete in a globalized, information-centric world, where out-analyzing your competitors is a key success factor. Resources for Article: Further resources on this subject: Securing QlikView Documents [article] Conozca QlikView [article] Introducing QlikView elements [article]

Packt
06 Feb 2015
33 min read

The Five Kinds of Python Functions Python 3.4 Edition

This article is written by Steven Lott, author of the book Functional Python Programming. You can find more about him at http://slott-softwarearchitect.blogspot.com. (For more resources related to this topic, see here.)

What's This About?

We're going to look at various ways that Python 3 lets us define things which behave like functions. The proper term here is Callable: we're looking at objects that can be called like a function. We'll look at the following Python constructs:

  • Function definitions
  • Higher-order functions
  • Function wrappers (around methods)
  • Lambdas
  • Callable objects
  • Generator functions and the yield statement

And yes, we're aware that the list above has six items on it. That's because higher-order functions in Python aren't really all that complex or different. In some languages, functions that take functions as arguments involve special syntax. In Python, it's simple and common and barely worth mentioning as a separate topic. We'll look at when it's appropriate and inappropriate to use one or the other of these various functional forms.

Some background

Let's take a quick peek at a basic bit of mathematical formalism. We'll look at a function as an abstract formalism. We often annotate it like this:

y = f(x)

This shows us that f() is a function. It has one argument, x, and will map this to a single value, y. Some mathematical functions are written in front, for example, y=sin x. Some are written in other places around the argument, for example, y=|x|. In Python, the syntax is more consistent, for example, we use a function like this:

>>> abs(-5)
5

We've applied the abs() function to an argument value of -5. The argument value was mapped to a value of 5.

Terminology

Consider the following function:

(q, r) = f(a, b)

In this definition, the argument is a pair of values, (a,b). This is called the domain. We can summarize it as the domain of values for which the function is defined. Outside this domain, the function is not defined.
In Python, we get a TypeError exception if we provide one value or three values as the argument. The function maps the domain pair to a pair of values, (q,r). This is the range of the function. We can call this the range of values that could be returned by the function.

Mathematical function features

As we look at the abstract mathematical definition of functions, we note that functions are generally assumed to have no hysteresis; they have no history or memory of prior use. This is sometimes called the property of being idempotent: the results are always the same for a given argument value. We see this in Python as a common feature. But it's not universally true. We'll look at a number of exceptions to the rule of idempotence. Here's an example of the usual situation:

>>> int("10f1", 16)
4337

The value returned from the evaluation of int("10f1", 16) never changes. There are, however, some common examples of non-idempotent functions in Python.

Examples of hysteresis

Here are some common situations where a function has hysteresis. In some cases, results vary based on history. In other cases, results vary based on events in some external environment:

  • Random number generators. We don't want them to produce the same value over and over again. The Python random.randrange() function is clearly not idempotent.
  • OS functions depend on the state of the machine as a whole. The os.listdir() function returns values that depend on the use of functions such as os.unlink(), os.rename(), and open() (among several others). While the rules are generally simple, they require a stateful object outside the narrow world of the code itself.

These are examples of Python functions that don't completely fit the formal mathematical definition; they lack idempotence, and their values depend on history, other functions, or both.

Function Definitions

Python has two statements that are essential features of function definition.
The def statement specifies the domain and the return statement(s) specify the range. A simplified gloss of the syntax is as follows:

def name(params):
    body
    return expression

In effect, the function's domain is defined by the parameters provided in the def statement. This list of parameter names is not all the information on the domain, however. Even if we use one of the Python extensions to add type annotations, that's still not all the information. There may be if statements in the body of the function that impose additional explicit restrictions. There may be other functions that impose their own kind of implicit restrictions. If, for example, the body included math.sqrt(), then there would be an implicit restriction that some values must be non-negative. The return statements provide the function's range. An empty return statement means a range of simply None values. When there are multiple return statements, the range is the union of the ranges of all the return statements. This mapping between Python syntax and mathematical concepts isn't very complete. We need more information about a function.

Example definition

Here's an example of a function definition:

def odd(n):
    """odd(n) -> boolean, true if n is odd."""
    return n % 2 == 1

What does this definition tell us? Several things, such as:

  • Domain: We know that this function accepts n, a single object.
  • Range: Boolean value, True if n is an odd number. This is the most likely interpretation. It's also remotely possible that the class of n has repurposed the __mod__() or __rmod__() methods, in which case the semantics can be pretty obscure.

Because of the inherent ambiguity in Python, this function has provided a triple-quoted docstring with a summary of the function. This is a best practice, and should be followed universally except in articles like this where it gets too long-winded to include a docstring everywhere.
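As a rough illustration of the points about domain and range, the following sketch uses a hypothetical function, safe_root(), which is not part of the original article. The def statement names one parameter, an if statement in the body handles None explicitly, math.sqrt() adds an implicit restriction to non-negative numbers, and the two return statements give a range that is the union of None and non-negative floats.

```python
import math


def safe_root(x):
    """safe_root(x) -> float or None (hypothetical example)."""
    if x is None:
        # First return statement: the range includes None.
        return None
    # Implicit restriction: math.sqrt() raises ValueError when x < 0,
    # so negative numbers are outside the domain.
    return math.sqrt(x)


print(safe_root(9))
print(safe_root(None))
try:
    safe_root(-1)
except ValueError as error:
    print("outside the domain:", error)
```

Nothing in the def line alone reveals either restriction; they only become visible by reading the body, which is the gap described above.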
In this case, the docstring doesn't state unambiguously that n is intended to be a number. There are two ways to handle this gap, they are as follows:

  • Actually include words like "n is a number" in the docstring
  • Include test cases in the docstring that show the required behavior

Either is acceptable. Both are preferable.

Using a function

To complete this example, here's how we'd use this odd little function named odd():

>>> odd(3)
True
>>> odd(4)
False

This kind of example text can be included in the docstring to create two test cases that offer insight into what the function really means.

The lack of declarations

More verbose type declarations, as used in many popular programming languages, aren't actually enough information to fully specify a function's domain and range. To be rigorously complete, we need type definitions that include optional predicates. Take a look at the following expression:

isinstance(n,int) and n >= 0

The assert statement is a good place for this kind of additional argument domain checking. This isn't the perfect solution because assert statements can be disabled very easily. It can help during design and testing and it can help people to read your code. The fussy formal declarations of data type used in other languages are not really needed in Python. Python replaces an up-front claim about required types with a runtime search for appropriate class methods. This works because each Python object has all the type information bound into it. Static compile-time type information is redundant, since the runtime type information is complete. A Python function definition is pretty spare. It includes the minimal amount of information about the function. There are no formal declarations of parameter types or return type. This odd little function will work with any object that implements the % operator. Generally, this means any object that implements __mod__() or __rmod__(). This means most subclasses of numbers.Number.
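The two ideas above, test cases in the docstring and an assert for domain checking, can be sketched together as follows. This is a minimal sketch, not the author's code: strict_odd() is a hypothetical variant added for illustration, and the doctest module is used here only as one possible way to run the examples embedded in the docstring.

```python
import doctest
from fractions import Fraction


def odd(n):
    """odd(n) -> boolean, true if n is odd. n is a number.

    >>> odd(3)
    True
    >>> odd(4)
    False
    """
    return n % 2 == 1


def strict_odd(n):
    """Hypothetical variant of odd() with an explicit domain check."""
    # assert statements are skipped when Python runs with the -O option.
    assert isinstance(n, int) and n >= 0, "n must be a non-negative integer"
    return odd(n)


# Run the two examples embedded in the docstring of odd().
parser = doctest.DocTestParser()
test = parser.get_doctest(odd.__doc__, {"odd": odd}, "odd", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)

# odd() accepts any object that implements %, such as a Fraction.
print(odd(Fraction(7, 1)))

# strict_odd() rejects values outside the stated domain.
try:
    strict_odd(-3)
except AssertionError as error:
    print("rejected:", error)
```

The plain odd() stays as permissive as the % operator allows, while the stricter wrapper documents and enforces the intended domain during design and testing.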
It also means instances of any class that happens to provide these methods. That could become very weird, but it is still possible. We hesitate to think about non-numeric objects that work with the number-like % operator.

Some Python features

In Python, the functions we declare are proper first-class objects. This means that they have attributes, can be assigned to variables, and can be placed into collections. Quite a few clever things can be done with function objects. One of the most elegant things is to use a function as an argument or a return value from another function. The ability to do this means that we can easily create and use higher-order functions in Python. In languages such as C (and C++), functions aren't proper first-class objects. A pointer to a function, however, is a first-class object in C. But the function itself is a block of code that can't easily be manipulated. We'll look at a number of simple ways in which we can write, and use, higher-order functions in Python.

Functions are objects

Consider the following example:

>>> not_even = odd
>>> not_even(3)
True

We've assigned the odd little function object to a new variable, not_even. This creates an alias for a function. While this isn't always the best idea, there are times when we might want to provide an alternate name for a function as part of maintaining backward compatibility with a previous release of a library.

Using functions

Consider the following function definition:

def some_test(function, value):
    print(function, value)
    return function(value)

This function's domain includes arguments named function and value. We can see that it prints the arguments, then applies the function argument to the given value. When we use the preceding function, it looks like this:

>>> some_test(odd, 3)
<function odd at 0x613978> 3
True

The some_test() function accepted a function as an argument.
When we printed the function, we got a summary, <function odd at 0x613978>, that shows us some information about the object. We also show a summary of the argument value, 3. When we applied the function to a value, we got the expected result. We can, of course, extend this concept. In particular, we can apply a single function to many values.

Higher-order Functions

Higher-order functions become particularly useful when we apply them to collections of objects. The built-in map() function applies a simple function to each value in an argument sequence. Here's an example:

>>> list(map(odd, [1,2,3,4]))
[True, False, True, False]

We've used the map() function to apply the odd() function to each value in the sequence. This is a lot like evaluating:

>>> [odd(x) for x in [1,2,3,4]]

We've created a list comprehension instead of applying a higher-order map() function. This is equivalent to the following snippet:

[odd(1), odd(2), odd(3), odd(4)]

Here, we've manually applied the odd() function to each value in a sequence.

Yes, that's a diesel engine alternator and some hoses:

We'll use this alternator as a subject for some concrete examples of higher-order functions.

Diesel engine background

Some basic diesel engine mechanics. The following is some basic information:

  • The engine turns the alternator.
  • The alternator generates pulses that drive the tachometer, amongst other things, like charging the batteries.
  • The alternator provides an indirect measurement of engine RPMs. Direct measurement would involve connecting to a small geared shaft. It's difficult and expensive.

We already have a tachometer; it's just incorrect. The new alternator has new wheels. The ratios between engine and alternator have changed. We're not interested in installing a new tachometer. Instead, we'll create a conversion from a number on the tachometer, which is calibrated to the old alternator, to a proper number of engine RPMs.
This has to allow for the change in ratio between the original alternator and the new alternator. Let's collect some data and see what we can figure out about engine RPMs.

New alternator

First approximation: all we did was get new wheels. We can presume that the old tachometer was correct. Since the new wheel is smaller, we'll have higher alternator RPMs. That means higher readings on the old tachometer.

Here's the key question: how far wrong are the RPMs? The old wheel size was approximately 3.5 and the new wheel size is approximately 2.5 (the unit cancels out of the ratio). We can compute the potential ratio between what the tach says and what the engine is really doing:

    >>> 3.5/2.5
    1.4
    >>> 1/_
    0.7142857142857143

That's nice. Is it right? Can we really just multiply displayed RPMs by 0.7 to get actual engine RPMs? Let's create the conversion card first, then collect some more data.

Use case

Given the RPM on the tachometer, what's the real RPM of the engine? Use the following function to find the RPM:

    def eng(r):
        return r/1.4

Use it like the following:

    >>> eng(2100)
    1500.0

This seems useful. Tach says 2100, engine (theoretically) spinning at 1500, more or less. Let's confirm our hypothesis with some real data.

Data collection

Over a period of time, we recorded tachometer readings and actual RPMs using a visual RPM measuring device. The visual device requires a strip of reflective tape on one of the engine wheels. It uses a laser and counts returns per minute. Simple. Elegant. Accurate. It's really inconvenient. But it got us some data we could digest. Skipping some boring statistics, we wind up with the following function that maps displayed RPMs to actual RPMs:

    def eng2(r):
        return 0.7724*r**1.0134

Here's a sample result:

    >>> eng2(2100)
    1797.1291903589386

When the tach says 2100, the engine is measured as spinning at about 1800 RPM. That's not quite the same as the theoretical model. But it's so close that it gives us a lot of confidence in this version. Of course, the number displayed is hideous.
All that floating-point cruft is crazy. What can we do? Rounding is only part of the solution. We need to think through the use case. After all, we use this standing at the helm of the boat; how much detail is appropriate?

Limits and ranges

The engine has governors and only runs between 800 and 2500 RPM. There's a very tight limit here. Realistically, we're talking about this small range of values:

    >>> list(range(800, 2500, 200))
    [800, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400]

There's no sensible reason for providing any more detailed engine RPMs. It's a sailboat; top speed is 7.5 knots (nautical miles per hour). Wind and current have far more impact on the boat speed than the difference between 1600 and 1700 RPMs. The tach can't be read to closer than 100-200 RPM. It's not digital; it's a red pointer near little tick lines. There's no reason to preserve more than a few bits of precision.

Example of Tach translation

Given the engine RPMs and the conversion function, we can deduce that the tachometer display will be between 1000 and 3200. This will map to engine RPMs in the range of about 800 to 2500. We can confirm this with a mapping like this:

    >>> list(map(eng2, range(1000,3200,200)))
    [847.3098694826986, 1019.258964596305, 1191.5942982618956,
     1364.2609728487703, 1537.2178605443924, 1710.4329833319157,
     1883.8807562755746, 2057.5402392829747, 2231.3939741669838,
     2405.4271806626366, 2579.627182659544]

We've applied the eng2() mapping from tach to engine RPM. For tach readings between 1000 and 3200 in steps of 200, we've computed the actual engine RPMs. For those who use spreadsheets a lot, the range() function is like filling a column with values. The map(eng2, …) function is like filling an adjacent column with a calculation. We've created the result of applying a function to each value of a given range. As shown, this is a little difficult to use. We need to do a little more cleanup. What other function do we need to apply to the results?
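The step of deducing the tachometer range from the engine's governed range can be sketched by inverting eng2() algebraically. This is only an illustration: tach_for() is a hypothetical helper name that does not appear in the article, and it assumes the fitted eng2() formula holds across the whole range.

```python
def eng2(r):
    """Empirical mapping from displayed tach RPM to engine RPM (from the article)."""
    return 0.7724 * r ** 1.0134


def tach_for(rpm):
    """Hypothetical helper: algebraic inverse of eng2().

    Solving rpm = 0.7724 * r**1.0134 for r gives
    r = (rpm / 0.7724) ** (1 / 1.0134).
    """
    return (rpm / 0.7724) ** (1 / 1.0134)


# Tach readings that correspond to the governed engine limits.
for rpm in (800, 2500):
    print(rpm, round(tach_for(rpm)))
```

The two readings come out a little under 1000 and a little over 2900, so the 1000 to 3200 range used in the mapping comfortably covers the governed engine range.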
Round to 100

Here's a function that will round to the nearest 100 (despite the name, round() goes to the nearest multiple, not always up):

    def next100(n):
        return int(round(n, -2))

We could call this a kind of composite function built from a partial application of the round() and int() functions. If we map this function to the previous results, we get something a little easier to work with. How does this look?

    >>> tach = range(1000,3200,200)
    >>> list(map(next100, map(eng2, tach)))
    [800, 1000, 1200, 1400, 1500, 1700, 1900, 2100, 2200, 2400, 2600]

This expression is a bit complex; let's break it down into three discrete steps:

First, map the eng2() function to tach numbers between 1000 and 3200. The result is effectively a sequence of values (it's not actually a list, it's a lazy map object, a potential list).
Second, map the next100() function to the results of the previous mapping.
Finally, collect a single list object from the results.

We've applied two functions, eng2() and next100(), to a list of values. In principle, we've created a kind of composite function, (next100 ∘ eng2)(rpm). Python doesn't support function composition directly, hence the complex-looking map of map syntax.

Interleave sequences of values

The final step is to create a table that shows both the tachometer reading and the computed engine RPMs. We need to interleave the input and output values into a single list of pairs. Here are the tach readings we're working with:

    >>> tach = range(1000,3200,200)

Here are the engine RPMs:

    >>> engine = list(map(next100, map(eng2, tach)))

Here's how we can interleave the two to create something that shows our tachometer reading and engine RPMs:

    >>> list(zip(tach, engine))
    [(1000, 800), (1200, 1000), (1400, 1200), (1600, 1400), (1800, 1500), (2000, 1700),
     (2200, 1900), (2400, 2100), (2600, 2200), (2800, 2400), (3000, 2600)]

The rest is pretty-printing. What's important is that we could take functions like eng() or eng2() and apply them to columns of numbers, creating columns of results.
The map() function means that we don't have to write explicit for loops to simply apply a function to a sequence of values.

Map is lazy

We have a few other observations about the Python higher-order functions. First, these functions are lazy: they don't compute any results until required by other statements or expressions. Because they don't actually create intermediate list objects, they may be quite fast. The laziness feature is true for the built-in higher-order functions map() and filter(). It's also true for many of the functions in the itertools library. Many of these functions don't simply create a list object; they yield values as requested.

For debugging purposes, we use list() to see what's being produced. If we don't apply list() to the result of a lazy function, we simply see that it's a lazy object. Here's an example:

    >>> map(lambda x:x*1.4, range(1000,3200,200))
    <map object at 0x102130610>

We don't see a proper result here, because the lazy map() function didn't do anything. The list(), tuple(), or set() functions will force a lazy map() function to actually get up off the couch and compute something.

Function Wrappers

There are a number of Python functions which are syntactic sugar for method functions. One example is the len() function. This function behaves as if it had the following definition:

    def len(obj):
        return obj.__len__()

The function acts like it's simply invoking the object's built-in __len__() method. There are several Python functions that exist only to make the syntax a little more readable. Post-fix syntax purists would prefer to see syntax such as some_list.len(). Those who like their code to look a little more mathematical prefer len(some_list). Some people will go so far as to claim that the presence of prefix functions means that Python isn't strictly object-oriented. This is false; Python is very strictly object-oriented. It doesn't, however, use only postfix method notation.
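As a small illustration of the len() wrapper idea, here is a sketch with a user-defined class. The Readings class is a hypothetical name invented for this example; it is not part of the article.

```python
class Readings:
    """Hypothetical container, used only to show how len() delegates."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        # len(obj) ends up calling this method on the object's class.
        return len(self._values)


readings = Readings(range(1000, 3200, 200))

print(len(readings))        # prefix, function-style notation
print(readings.__len__())   # the method the wrapper relies on
```

Both calls report the same count; the function form simply reads more like ordinary mathematical notation.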
We can write function wrappers to make some method functions a little more palatable. Another good example is the divmod() function. This relies on two method functions, such as the following:

    a.__divmod__(b)
    b.__rdivmod__(a)

The usual operator rules apply here. If the class for object a implements __divmod__(), then that's used to compute the result. If not, then the same test is made for the class of object b; if there's an __rdivmod__() implementation, that will be used to compute the results. Otherwise, it's undefined and we'll get an exception.

Why wrap a method?

Function wrappers for methods are syntactic sugar. They exist to make object methods look like simple functions. In some cases, the functional view is more succinct and expressive. Sometimes the object involved is obvious. For example, the os module functions provide access to OS-level libraries. The OS object is concealed inside the module. Sometimes the object is implied. For example, the random module makes a Random instance for us. We can simply call random.randint() without worrying about the object that was required for this to work properly.

Lambdas

A lambda is an anonymous function with a degenerate body. It's like a function in some respects and it's unlike a function because of the following two things:

A lambda has no name.
A lambda has no statements.

A lambda's body is a single expression, nothing more. This expression can have parameters, however, which is why a lambda is a handy form of a callable function. The syntax is essentially as follows:

    lambda params : expression

Here's a concrete example:

    lambda r: 0.7724*r**1.0134

You may recognize this as the eng2() function defined previously. We don't always need a complete, formal function. Sometimes, we just need an expression that has parameters. Speaking theoretically, a lambda is a one-argument function. When we have multi-argument functions, we can transform them to a series of one-argument lambda forms.
This transformation can be helpful for optimization. None of that applies to Python. We'll move on.

Using a Lambda with map

Here are two equivalent results:

    map(eng2, tach)
    map(lambda r: 0.7724*r**1.0134, tach)

Here's a previous example, using the lambda instead of the function:

    >>> tach = range(1000,3200,200)
    >>> list(map(lambda r: 0.7724*r**1.0134, tach))
    [847.3098694826986, 1019.258964596305, 1191.5942982618956,
     1364.2609728487703, 1537.2178605443924, 1710.4329833319157,
     1883.8807562755746, 2057.5402392829747, 2231.3939741669838,
     2405.4271806626366, 2579.627182659544]

You could scroll back to see that the results are the same. If we're doing a small thing once only, a lambda object might be more clear than a complete function definition. The emphasis here is on small and once only. If we start trying to reuse a lambda object, or feel the need to assign a lambda object to a variable, we should really consider a function definition and the associated docstring and doctest features.

Another use of Lambdas

A common use of lambdas is with three other higher-order functions: the list method sort() and the built-in functions min() and max(). We might use one of these with a list object:

    some_list.sort(key=lambda x: expr)
    min(some_list, key=lambda x: expr)
    max(some_list, key=lambda x: expr)

In each case, we're using a lambda object to embed an expression into the argument values for a function. In some cases, the expression might be very sophisticated; in other cases, it might be something as trivial as lambda x: x[1]. When the expression is trivial, a lambda object is a good idea. If the expression is going to get reused, however, a lambda object might be a bad idea.

You can do this… But…

The following kind of statement makes sense:

    some_name = lambda x: 3*x+1

We've created a callable object that takes a single argument value and returns a numeric value. It behaves much like the following definition:

    def some_name(x): return 3*x+1

There are some differences. Most notably the following: A lambda object is all on one line of code.
A possible advantage.
There's no docstring. A disadvantage for lambdas of any complexity.
Nor is there any doctest in the missing docstring. A significant problem for a lambda object that requires testing. There are ways to test lambdas with doctest outside a docstring, but it seems simpler to switch to a full function definition.
We can't easily apply decorators to it. To do it, we lose the @decorator syntax.
We can't use any Python statements in it. In particular, no try-except block is possible.

For these reasons, we suggest limiting the use of lambdas to truly trivial situations.

Callable Objects

A callable object fits the model of a function. The unifying feature of all of the things we've looked at is that they're callable. Functions are the primary example of being callable, but objects can also be callable. Callable objects can be subclasses of collections.abc.Callable. Because of Python's flexibility, this isn't a requirement, it's merely a good idea. To be callable, a class only needs to provide a __call__() method. Here's a complete callable class definition:

    from collections.abc import Callable

    class Engine(Callable):
        def __call__(self, tach):
            return 0.7724*tach**1.0134

We've imported the collections.abc.Callable class. This will provide some assurance that any class that extends this abstract superclass will provide a definition for the __call__() method. This is a handy error-checking feature. Our class extends Callable by providing the needed __call__() method. In this case, the __call__() method performs a calculation on the single parameter value, returning a single result. Here's a callable object built from this class:

    eng = Engine()

This creates a function-like object that we can then use. We can evaluate eng(1000) to get the engine RPMs when the tach reads 1000.

Callable objects step-by-step

There are two parts to making a function a callable object. We'll emphasize these for folks who are new to object-oriented programming:

Define a class.
Generally, we make this a subclass of collections.abc.Callable. Technically, we only need to implement a __call__() method. It helps to use the proper superclass because it might help catch a few common mistakes.
Create an instance of the class. This instance will be a callable object.

The object that's created will be very similar to a defined function, and very similar to a lambda object that's been assigned to a variable. While it will be similar to a def statement, it will have one important additional feature: hysteresis. This can be the source of endless bugs. It can also be a way to improve performance.

Callables can have hysteresis

Here's an example of a callable object that uses hysteresis as a kind of optimization:

    class Factorial(Callable):
        def __init__(self):
            self.previous = {}
        def __call__(self, n):
            if n not in self.previous:
                self.previous[n] = self.compute(n)
            return self.previous[n]
        def compute(self, n):
            if n == 0: return 1
            return n*self.__call__(n-1)

Here's how we can use this:

    >>> fact = Factorial()
    >>> fact(5)
    120

We create an instance of the class, and then call the instance to compute a value for us.

The initializer

The initialization method looks like this:

    def __init__(self):
        self.previous = {}

This method creates a cache of previously computed values. This is a technique called memoization. If we've already computed a result once, it's in the self.previous cache; we don't need to compute it again, we already know the answer.

The Callable interface

The required __call__() method looks like this:

    def __call__(self, n):
        if n not in self.previous:
            self.previous[n] = self.compute(n)
        return self.previous[n]

We've checked the memoization cache first. If the value is not there, we're forced to compute the answer and insert it into the cache. The final answer is always a value in the cache. A common "what if" question is: what if we have a function of multiple arguments?
There are two minuscule changes to support more complex arguments. Use def __call__(self, *n): and self.compute(*n). Since we're only computing factorial, there's no need to over-generalize.

The Compute method

The essential computation has been allocated to a method called compute. It looks like this:

    def compute(self, n):
        if n == 0: return 1
        return n*self.__call__(n-1)

This does the real work of the callable object: it computes n!. In this case, we've used a pretty standard recursive factorial definition. This recursion relies on the __call__() method to check the cache for previous values. If we don't expect to compute values larger than 1000! (a 2,568 digit number, by the way) the recursion works nicely. If we think we need to compute really large factorials, we'll need to use a different approach. With the functools and operator modules imported, the following expression computes very large factorials:

    functools.reduce(operator.mul, range(1,n+1))

Either way, we can depend on the internal memoization to leverage previous results.

Note the potential issue

Hysteresis (memory of what came before) is available to callable objects. We call functions and lambdas stateless, whereas callable objects can be stateful. This may be desirable to optimize performance. We can memoize the previous results, or we can design an object that's simply confusing. Consider a function like divmod() that returns two values. We could try to define a callable object that first returns the quotient and, on the second call with the same arguments, returns the remainder:

    >>> crazy_divmod(355,113)
    3
    >>> crazy_divmod(355,113)
    16

This is technically possible. But it's crazy. Warning: stay away. We generally expect idempotence: functions do the same thing each time. Implementing memoization didn't alter the basic idempotence of our factorial function.

Generator Functions

Here's a fun generator, the Collatz function. The function creates a sequence using a simple pair of rules.
We could call this rule Half-Or-Three-Plus-One (HOTPO). We'll call it collatz():

    def collatz(n):
        if n % 2 == 0:
            return n//2
        else:
            return 3*n+1

Each integer argument yields a next integer. These can form a chain. For example, if we start with collatz(13), we get 40. The value of collatz(40) is 20. Here's the sequence of values:

    13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1

At 1, it loops: 1 → 4 → 2 → 1 …

Interestingly, all chains seem to lead, eventually, to 1. To explore this, we need a simple function that will build a chain from a given starting value.

Successive values

Here's an ordinary function (not yet a generator) that will build a list object. This iterates through values in the sequence until it reaches 1, when it terminates:

    def col_list(n):
        seq = [n]
        while n != 1:
            n = collatz(n)
            seq.append(n)
        return seq

This is not wrong. But it's not really the most useful implementation. This always creates a sequence object. In many cases, we don't really want an object, we only want information about the sequence. We might only want the length, or the largest number, or the sum. This is where a generator function might be more useful. A generator function yields elements from the sequence instead of building the sequence as a single object.

Generator functions

To create a generator function, we write a function that has a loop; inside the loop, there's a yield statement. A function with a yield statement is effectively an Iterable object: it can be used in a for statement to produce values. It doesn't create a big list object; it creates the items that can be accumulated into a list or tuple object. A generator function is lazy: it doesn't compute anything unless forced to by another function needing results. We can iterate through as many (or as few) of the results as we need. For example, list(some_generator()) forces all values to be returned. For another example of laziness, look at how range() objects work.
If we evaluate range(10), we only get a lazy range object. If we evaluate list(range(10)), we get a list object.

The Collatz generator

Here's a generator function that will produce sequences of values using the collatz() rule shown previously:

    def col_iter(n):
        yield n
        while n != 1:
            n = collatz(n)
            yield n

When we use this in a for loop or with the list() function, this will yield the argument number. While the number is not 1, it will apply the collatz() function and yield successive values in the chain. When it has yielded 1, it will terminate. One common pattern for generator functions is to replace all list-accumulation statements with yield statements. Instead of building a list one item at a time, we yield each item. The col_iter() function is lazy. We don't get an answer unless we use list() or tuple() or some variation of a for statement context.

Using a generator function

Here's how this function looks in practice:

    >>> for i in col_iter(3):
    ...     print(i)
    ...
    3
    10
    5
    16
    8
    4
    2
    1

We've used the generator function in a for loop so that it will yield all of the values until it terminates.

Collatz function sequences

Now we can do some exploration of this Collatz sequence. Here are a few evaluations of the col_iter() function:

    >>> list(col_iter(3))
    [3, 10, 5, 16, 8, 4, 2, 1]
    >>> list(col_iter(5))
    [5, 16, 8, 4, 2, 1]
    >>> list(col_iter(6))
    [6, 3, 10, 5, 16, 8, 4, 2, 1]
    >>> list(col_iter(13))
    [13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

There's an interesting pattern here. It seems that from 16, we know the rest. Generalizing this: from any number we've already seen, we know the rest. Wait. This means that memoization might be a big help in exploring the values created by this sequence. When we start combining function design patterns like this, we're doing functional programming. We're stepping outside the box of purely object-oriented Python.
Alternate styles

Here is an alternative version of the collatz() function:

    def collatz2(n):
        return n//2 if n%2 == 0 else 3*n+1

This simply collapses the if statements into a single if expression and may not help much. We also have this:

    collatz3 = lambda n: n//2 if n%2 == 0 else 3*n+1

We've collapsed the expression into a lambda object. Helpful? Perhaps not. On the other hand, the function doesn't really need all of the overhead of a full function definition and multiple statements. The lambda object seems to capture everything relevant.

Functions as objects

Here's a higher-order function that will produce values until some ending condition is met. We can plug in one of the versions of the collatz() function and a termination test into this general-purpose function:

    def recurse_until(ending, the_function, n):
        yield n
        while not ending(n):
            n = the_function(n)
            yield n

This requires two plug-in functions; they are as follows:

ending() is a function to test whether we're done, for example, lambda n: n==1
the_function() is a form of the Collatz function

We've completely uncoupled the general idea of recursively applying a function from a specific function and a specific terminating condition.

Using the recurse_until() function

We can apply this higher-order recurse_until() function like this:

    >>> recurse_until(lambda n: n==1, collatz2, 9)
    <generator object recurse_until at 0x1021278c0>

What's that? That's how a lazy generator looks: it didn't return any values because we didn't demand any values. We need to use it in a loop or some kind of expression that iterates through all available values. The list() function, for example, will collect all of the values.

Getting the list of values

Here's how we make the lazy generator do some work:

    >>> list(_)
    [9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

The _ variable is the previously computed value. It relieves us from the burden of having to write an assignment statement.
We can write an expression, see the results, and know the results were automatically saved in the _ variable.

Project Euler #14

Which starting number, under one million, produces the longest chain? Try it without memoization. It's really simple; here it is for starting numbers below 11:

    >>> collatz_len = [len(list(recurse_until(lambda n: n==1, collatz2, i)))
    ...     for i in range(1,11)]
    >>> results = zip(collatz_len, range(1,11))
    >>> max(results)
    (20, 9)
    >>> list(col_iter(9))
    [9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

We defined collatz_len as a list, built with a list comprehension. The comprehension evaluates len(something) for i in range(1,11). This means we'll be collecting ten values into the list, each of which is the length of something. The something is a list object built from the recurse_until(lambda n: n==1, collatz2, i) generator that we discussed. This will compute the sequence of Collatz values starting from i and proceeding until the value in the sequence is 1.

We've zipped the lengths with the original values of i. This will create pairs of lengths and starting numbers. The maximum length will now have a starting value associated with it so that we can confirm that the results match our expectations.

And yes, this Project Euler problem could, in principle, be solved in a single line of code. Will this scale to 100? 1,000? 1,000,000? How much will memoization help?

Summary

In this article, we've looked at five (or six) kinds of Python callables. They all fit the y = f(x) model of a function to varying degrees. When is it appropriate to use each of these different ways to express the same essential concept?

Functions are created with def and return. It shouldn't come as a surprise that this should cover most cases. This allows a docstring comment and doctest test cases. We could call these def functions, since they're built with the def statement.
Higher-order functions (map(), filter(), and the itertools library) are generally written as plain-old def functions. They're higher-order because they accept functions as arguments or return functions as results. Otherwise, they're just functions.

Function wrappers (len(), divmod(), pow(), str(), and repr()) are function syntax wrapped around object methods. These are def'd functions with very tiny bodies. We use them because a.pow(2) doesn't seem as clear as pow(a,2).

Lambdas are appropriate for one-time use of something so simple that it doesn't deserve being wrapped in a def statement body. In some cases, we have a small nugget of code that seems more clear when written as a lambda expression rather than a more complete function definition. Simple filter rules and simple computations are often more clearly shown as a lambda object.

The Callable objects have a special property that other functions lack: hysteresis. They can retain the results of previous calculations. We've used this hysteresis property to implement memoizing. This can be a huge performance boost. Callable objects can be used badly, however, to create objects that have simply bizarre behavior. Most functions should strive for idempotence: the same arguments should yield the same results.

Generator functions are created with a def statement and at least one yield statement. These functions are iterable. They can be used in a for statement to examine each resulting value. They can also be used with functions like list(), tuple(), and set() to create an actual object from the iterable sequence of values. We might combine them with higher-order functions to do complex processing, one item at a time.

It's important to work with each of these kinds of callables. If you only have one tool, a hammer, then every problem has to be reshaped into a nail before you can solve it. Once you have multiple tools available, you can pick the tool that provides the most succinct and expressive solution to the problem.
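As a compact recap of the kinds of callables summarized above, here is a sketch that expresses the same tach-to-engine conversion four ways. The names eng2_lambda, EngineCallable, and eng_iter are hypothetical and chosen only for this illustration; only eng2() and its formula come from the article.

```python
def eng2(r):
    """A plain def function."""
    return 0.7724 * r ** 1.0134


# A lambda: the same expression, without a name of its own or a docstring.
eng2_lambda = lambda r: 0.7724 * r ** 1.0134


class EngineCallable:
    """A callable object; it could also carry state such as a cache."""

    def __call__(self, tach):
        return 0.7724 * tach ** 1.0134


def eng_iter(readings):
    """A generator function: yields one converted value at a time."""
    for r in readings:
        yield eng2(r)


tach = range(1000, 3200, 200)

via_def = list(map(eng2, tach))            # higher-order map() with a def
via_lambda = list(map(eng2_lambda, tach))  # higher-order map() with a lambda
via_object = list(map(EngineCallable(), tach))
via_generator = list(eng_iter(tach))

print(via_def == via_lambda == via_object == via_generator)
```

All four produce the same sequence; the choice between them is about documentation, testability, state, and laziness rather than about the result.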
Packt
06 Feb 2015
18 min read

Getting Your Own Video and Feeds

"One server to satisfy them all" could have been the name of this article by David Lewin, the author of BeagleBone Media Center. We now have a great media server where we can share any media, but we would like to be more independent so that we can choose the functionalities the server can have. The goal of this article is to let you cross the bridge, where you are going to increase your knowledge by getting your hands dirty. After all, you want to build your own services, so why not create your own contents as well. (For more resources related to this topic, see here.) More specifically, here we will begin by building a webcam streaming service from scratch, and we will see how this can interact with what we have implemented previously in the server. We will also see how to set up a service to retrieve RSS feeds. We will discuss the services in the following sections: Installing and running MJPG-Streamer Detecting the hardware device and installing drivers and libraries for a webcam Configuring RSS feeds with Leed Detecting the hardware device and installing drivers and libraries for a webcam Even though today many webcams are provided with hardware encoding capabilities such as the Logitech HD Pro series, we will focus on those without this capability, as we want to have a low budget project. You will then learn how to reuse any webcam left somewhere in a box because it is not being used. At the end, you can then create a low cost video conference system as well. How to know your webcam As you plug in the webcam, the Linux kernel will detect it, so you can read every detail it's able to retrieve about the connected device. We are going to see two ways to retrieve the webcam we have plugged in: the easy one that is not complete and the harder one that is complete. "All magic comes with a price."                                                                                     
–Rumpelstiltskin, Once Upon a Time

Often, at a certain point in your installation, you have to choose between the easy or the hard way. Most of the time, powerful Linux commands or tools don't seem easy at first, but after some experiments you'll discover that they really can make your life better. Let's start with the fast and easy way, which is lsusb:

    debian@arm:~$ lsusb
    Bus 001 Device 002: ID 046d:0802 Logitech, Inc. Webcam C200
    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

This just confirms that the webcam is running well and is seen correctly on the USB bus. Most of the time we want more details, because a hardware installation is not exactly as described in books or documentation, so you might encounter slight differences. This is where the second solution comes in. Among its advantages, you are able to see each step that took place when the USB device was discovered by the board and Linux:

    debian@arm:~$ dmesg

A UVC device (here, a Logitech C200) was used to obtain these messages. Most probably, you won't have exactly the same output, but it should be close enough that you can interpret it easily. The messages to look for are:

New USB device found: This is the main message. In case of any issue, check for its presence first. If it is missing, the problem is at the hardware level rather than a software or configuration error.
idVendor and idProduct: This message indicates that the device has been detected. This information is useful for checking the manufacturer details.

Most recent webcams are compatible with the Linux USB Video Class (UVC); you can check yours at http://www.ideasonboard.org/uvc/#devices.
Among all the messages, you should also look for the one that says Registered new interface driver interface, because failing to find it can be a clue that Linux could detect the device but wasn't able to install it. The new device will be detected as /dev/video0. Nevertheless, at start, you may see your webcam under a different device name depending on your BeagleBone configuration, for example, if a video-capable cape is already plugged in.

Setting up your webcam

Now we know what is seen at the USB level. The next step is to use the crucial Video4Linux driver, which is like a Swiss army knife for anything related to video capture:

    debian@arm:~$ sudo apt-get install v4l-utils

The primary use of this tool is to inquire about what the webcam can provide:

    debian@arm:~$ v4l2-ctl --all

There are four distinct sections that let you know how your webcam will be used according to the current settings:

Driver info (1): This contains the following information:
    The name, vendor, and product IDs that we found in the system messages
    The driver info (the kernel's version)
    Capabilities: the device is able to provide video streaming
Video capture supported format(s) (2): This contains the following information:
    What resolution(s) are to be used. As this example uses an old webcam, there is not much to choose from, but devices nowadays can easily offer a lot of choices.
    The pixel format, which is all about how the data is encoded; more details can be retrieved about format capabilities (see the next paragraph).
    The remaining information is relevant only if you want precise detail.
Crop capabilities (3): This contains your current settings. Indeed, you can define the video crop window that will be used. If needed, use the crop settings: --set-crop-output=top=<x>,left=<y>,width=<w>,height=<h>
Video input (4): This contains the following information:
    The input number. Here we have used 0, which is the one that we found previously.
    Its current status.
The famous frames per second, which gives you a local rate. This is not what you will obtain when streaming from a server, as network latency will lower this value.

You can list the capabilities for each parameter. For instance, if you want to see all the video formats the webcam can provide, type this command:

debian@arm:~$ v4l2-ctl --list-formats

Here, we see that we can also use the MJPEG format provided directly by the cam. While this part is not mandatory, such a hardware tour is worthwhile because it tells you what you can do with your device. It is also a good habit to be able to retrieve diagnostics when the webcam shows signs of trouble. If you would like more in-depth knowledge about your device, install the uvcdynctrl package, which lets you retrieve all the formats and frame rates supported.

Installing and running MJPG-Streamer

Now that we have checked the chain from the hardware level up to the driver, we can install the software that will make use of Video4Linux for video streaming. Here comes MJPG-Streamer. This application provides a JPEG stream on the network that browsers and most video applications can read. We are also interested in this solution because it is made for systems with modest CPUs, so we can run MJPG-Streamer as a service. With this streamer, you can also use the webcam's built-in hardware compression and even control webcam features such as pan, tilt, rotation, and zoom.
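To make the idea of "a JPEG stream" more concrete, here is a small Python sketch that cuts a buffer of received bytes into individual JPEG images by looking for the standard JPEG start-of-image (0xFFD8) and end-of-image (0xFFD9) markers. It is only an illustration under simplifying assumptions: the function name split_jpeg_frames is invented here, the sketch ignores the part headers a streaming server sends between images, and it would be confused by JPEGs that embed a thumbnail. It is not how MJPG-Streamer itself is implemented.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_jpeg_frames(buffer):
    """Split buffer into complete JPEG frames.

    Returns (frames, remainder), where remainder holds the bytes of a frame
    that has started but not finished yet, so the caller can prepend it to
    the next chunk read from the network.
    """
    frames = []
    while True:
        start = buffer.find(SOI)
        if start == -1:
            # No frame start found; keep a trailing 0xFF in case the
            # marker is split across two network reads.
            return frames, (buffer[-1:] if buffer.endswith(b"\xff") else b"")
        end = buffer.find(EOI, start + 2)
        if end == -1:
            # Frame started but is not complete yet.
            return frames, buffer[start:]
        frames.append(buffer[start:end + 2])
        buffer = buffer[end + 2:]


if __name__ == "__main__":
    first = SOI + b"frame-one" + EOI
    second = SOI + b"frame-two" + EOI
    data = b"header\r\n" + first + b"\r\nheader\r\n" + second + SOI + b"part"
    frames, rest = split_jpeg_frames(data)
    print(len(frames), "complete frames,", len(rest), "bytes pending")
```

Returning the unfinished tail rather than discarding it is the main design choice: a network read rarely ends exactly on a frame boundary, so the caller keeps the remainder and appends the next chunk to it.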
Installing MJPG-Streamer

Before installing MJPG-Streamer, we will install all the necessary dependencies:

debian@arm:~$ sudo apt-get install subversion libjpeg8-dev imagemagick

Next, we will retrieve the code from the project:

debian@arm:~$ svn checkout http://svn.code.sf.net/p/mjpg-streamer/code/ mjpg-streamer-code

You can now build the executable from the sources you just downloaded by performing the following steps:

Enter the local directory you have just downloaded:

debian@arm:~$ cd mjpg-streamer-code/mjpg-streamer

Then enter the following command:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ make

When the compilation is complete, we end up with some new files. In the directory listing, the new files shown in green are produced by the compilation: the executable and some plugins. That's all that is needed, so the application is now ready. We can now try it out. Not so much to do after all, don't you think?

Starting the application

This section aims at getting you started quickly with MJPG-Streamer. At the end, we'll see how to start it as a service on boot. Before getting started, the server requires some plugins to be copied into the system library directory:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ sudo cp input_uvc.so output_http.so /usr/lib

The MJPG-Streamer application has to know the path where these files can be found, so we define the following environment variable (note that the entries are separated by a colon):

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ export LD_LIBRARY_PATH=/usr/lib:$LD_LIBRARY_PATH

Enough preparation! Time to start streaming:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -i "input_uvc.so" -o "output_http.so -w www"

As the program starts, the input parameters that will be taken into consideration are displayed.
You can now identify this information, as it has been explained previously:

The detected device from V4L2
The resolution that will be displayed, according to your settings
Which port will be opened
Some controls that depend on your camera capabilities (tilt, pan, and so on)

If you need to change the port used by MJPG-Streamer, add -p xxxx to the output plugin options, as follows:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -i "input_uvc.so" -o "output_http.so -w www -p 1234"

Let's add some security

If you want to add some security, then you should set the credentials:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -i "input_uvc.so" -o "output_http.so -w ./www -c debian:temppwd"

Credentials can always be stolen and used without your consent. The best way to ensure that your stream stays confidential would be to encrypt it. So if you intend to use strong encryption for secured applications, the crypto-cape is worth taking a look at: http://datko.net/2013/10/03/howto_crypto_beaglebone_black/.

"I'm famous" – your first stream

That's it. The webcam is now accessible to everyone on the network from the BeagleBone; to view the video, point your browser to http://192.168.0.15:8080/. You will then see the default welcome screen, bravo!

Your first contact with the MJPG-Server

You might wonder how you would get informed about which port to use among those already assigned.

Using our stream across the network

Now that the webcam is available across the network, you have several options to handle this:

You can use the direct flow available from the home page. On the left-hand side menu, just click on the stream tab.
Using VLC, you can open the stream with the direct link available at http://192.168.0.15:8080/?action=stream. The VideoLAN menu tab is an M3U-playlist link generator that you can click on. This will generate a playlist file you can open thereafter.
In this case, VLC is efficient, as you can transcode the webcam stream to any format you need. Although it's not mandatory, this solution is the most efficient, as it frees the BeagleBone's CPU so that your server can focus on providing services.
Using MediaDrop, we can integrate this new stream into our shiny MediaDrop server, keeping in mind that MediaDrop currently doesn't support direct local streams. You can create a new post with the related URL link in the message body, as shown in the following screenshot:

Starting the streaming service automatically on boot

In the beginning, we saw that MJPG-Streamer needs only one command line to be started. We can put it in a bash script, but starting it as a service on boot is far better. For this, use a console text editor (nano or vim) and create a file dedicated to this service. Let's call it start_mjpgstreamer and add the following commands:

#! /bin/sh
# /etc/init.d/start_mjpgstreamer
export LD_LIBRARY_PATH="/home/debian/mjpg-streamer/mjpg-streamer-code/mjpg-streamer:$LD_LIBRARY_PATH"
EXEC_PATH="/home/debian/mjpg-streamer/mjpg-streamer-code/mjpg-streamer"
$EXEC_PATH/mjpg_streamer -i "input_uvc.so" -o "output_http.so -w $EXEC_PATH/www"

You can then use administrator rights to start it as a service:

debian@arm:~$ sudo /etc/init.d/start_mjpgstreamer start

Once the script is executable and registered with the init system (on Debian, this is typically done with update-rc.d start_mjpgstreamer defaults), MJPG-Streamer will be started automatically on the next reboot.

Exploring new capabilities to install

For those about to explore, we salute you!

Plugins

Remember that at the beginning of this article, we began the demonstration with two plugins:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -i "input_uvc.so" -o "output_http.so -w www"

If we take a moment to look at these plugins, we will understand that the first plugin is responsible for handling the webcam directly from the driver.
Simply ask for help and options as follows:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer --input "input_uvc.so --help"

The second plugin is about the web server settings:

The path to the directory that contains the final web server HTML pages. This implies that you can modify the existing pages with a little effort or create new ones based on those provided.
The port to be used. As mentioned previously, each server is assigned its own port; here you define which one this service will use.

You can discover many other options by asking:

debian@arm:~$ ./mjpg_streamer --output "output_http.so --help"

Apart from input_uvc and output_http, there are other plugins available to play with. Take a look at the plugins directory.

Another tool for the webcam

The MJPG-Streamer project is dedicated to streaming over the network, but it is not the only one. For instance, do you have any specific needs such as monitoring your house/son/cat/Jon Snow figurine? buuuuzzz: if you answered yes to the last one, you just defined yourself as a geek. Well, in that case the Motion project is for you; just install the motion package and start it with the default motion.conf configuration. It will then record videos and pictures of any moving object or person it detects. Like MJPG-Streamer, Motion aims to be a low CPU consumer, so it works very well on the BeagleBone Black.

Configuring RSS feeds with Leed

Our server can handle videos, pictures, and music from any source, and it would be cool to have another tool to retrieve news from some RSS providers. This can be done with Leed, an RSS project designed for servers. The final result is shown in the following screenshot:

This project has a "quick and easy" installation spirit, so you can give it a try without hassle. Leed (for Light Feed) allows you to access RSS feeds from any browser, so no RSS reader application is needed, and every user in your network can read them as well.
You install it on the server and feeds are automatically updated. Well, the truth behind the scenes is that a cron task does this for you. You will be guided to set up this synchronisation after the installation.

Creating the environment for Leed in three steps

We already have Apache, MySQL, and PHP installed, so only a few steps remain to run Leed:

Create a database for Leed
Download the project code and set permissions
Install Leed itself

Creating a database for Leed

You will begin by opening a MySQL session:

debian@arm:~$ mysql -u root -p

What we need here is a dedicated Leed user with its own database. They are created as follows:

create user 'debian_leed'@'localhost' IDENTIFIED BY 'temppwd';
create database leed_db;
use leed_db;
grant create, insert, update, select, delete on leed_db.* to debian_leed@localhost;
exit

Downloading the project code and setting permissions

We prepared our server to have its environment ready for Leed, so after getting the latest version, we'll get it working with Apache by performing the following steps:

From your home directory, retrieve the latest project code. This will also create a dedicated directory:

debian@arm:~$ git clone https://github.com/ldleman/Leed.git
debian@arm:~$ ls
mediadrop mjpg-streamer Leed music

Now, we need to put this new directory where the Apache server can find it:

debian@arm:~$ sudo mv Leed /var/www/

Change the permissions for the application:

debian@arm:~$ chmod 777 /var/www/Leed/ -R

Installing Leed

When you go to the server address (http://192.168.0.15/Leed/install.php), you'll get the following installation screen:

We now need to fill in the database details that we previously defined and add the Administrator credentials as well. Now save and quit. Don't worry about the explanations; we'll discuss these settings later. It's important that all items from the prerequisites list on the right are green.
Otherwise, a warning message will be displayed about the wrong permissions settings, as shown in the following screenshot:

After the configuration, the installation is complete: Leed is now ready for you.

Setting up a cron job for feed updates

If you want automatic updates for your feeds, you'll need to define a synchronization task with cron:

Modify cron jobs:

debian@arm:~$ sudo crontab -e

Add the following line:

0 * * * * wget -q -O /var/www/Leed/logsCron "http://192.168.0.15/Leed/action.php?action=synchronize"

Save it and your feeds will be refreshed every hour.

Finally, a little cleanup: remove install.php for security reasons:

debian@arm:~$ rm /var/www/Leed/install.php

Using Leed to add your RSS feed

To add feeds, open the Manage menu, and in Feed Options (on the right-hand side) select Preferences; then just paste the RSS link and add it with the button:

You might find it useful to organize your feeds into groups, as we did for movies in MediaDrop. The Rename button serves this purpose. For example, here a TV Shows category has been created, so every feed related to this type will be grouped on the main screen.

Some Leed preferences settings in a server environment

You will be asked to choose between two synchronisation modes: Complete and Graduated.

Complete: This is to be used on a regular computer, as it updates all your feeds in a row, which is a CPU-consuming task
Graduated: This looks for the 10 oldest feeds and updates them if required

You can also allow anonymous people to read your feeds. Setting Allow anonymous readers to Yes will let your guests access your feeds but not add any.

Extending Leed with plugins

If you want to extend Leed's capabilities, you can use the Leed Market (as the author calls it) from Feed options in the Manage menu. There, you'll be directed to the Leed Market space.
Installation is just a matter of downloading the ZIP file with all plugins:

debian@arm:~/Leed$ wget https://github.com/ldleman/Leed-market/archive/master.zip
debian@arm:~/Leed$ sudo unzip master.zip

Let's use the AdBlock plugin for this example:

Copy the content of the AdBlock plugin directory to where Leed can see it:

debian@arm:~/Leed$ sudo cp -r Leed-market-master/adblock /var/www/Leed/plugins

Log in and set up the plugin by navigating to Manage | Available Plugins, and then activate adblock with Enable, as follows:

In this article, we covered:

Some words about the hardware
How to know your webcam
Configuring RSS feeds with Leed

Summary

In this article, we had some good experiments with the hardware part of the server "from the ground," and ended by successfully setting up the webcam service on boot. We discovered hardware detection, a way to "talk" with our local webcam and thus to see what happens when we plug a device into the BeagleBone. Along the way, we also used video4linux to retrieve information about the device, learned about configuring devices, and encountered MJPG-Streamer. It's better to be on our own than to depend on GUI interfaces, where you always wonder where you need to click. In the end, our efforts have been rewarded, as we ended up with a web page we can use and modify according to our tastes. RSS news can also be provided by our server, so that you can manage all your feeds in one place, read them anywhere, and even organize dedicated groups.

Plenty of hardware and software concepts have been covered. Think of this article as a concrete example you can use and adapt to understand how Linux works. I hope you enjoyed this freedom of choice, as you drag ideas and drop them in your BeagleBone as services. We entered the DIY area, showing you ways to explore further. You can argue that we can choose the software but still use off-the-shelf commercial devices.
Resources for Article: Further resources on this subject: Using PVR with Raspbmc [Article] Pulse width modulator [Article] Making the Unit Very Mobile - Controlling Legged Movement [Article]
Packt
06 Feb 2015
4 min read

Upgrading the interface

In this article by Marco Schwartz and Oliver Manickum, authors of the book Programming Arduino with LabVIEW, we will see how to design an interface using LabVIEW. (For more resources related to this topic, see here.)

At this stage, we know that our two sensors are working and that they are interfaced correctly with the LabVIEW interface. However, we can do better; for now, we simply have a text display of the measurements, which is not pleasant to read. Also, the light-level measurement goes from 0 to 5, which doesn't mean anything to somebody looking at the interface for the first time. Therefore, we will modify the interface slightly. We will add a temperature gauge to display the data coming from the temperature sensor, and we will modify the output of the reading from the photocell to display the measurement from 0 (no light) to 100 percent (maximum brightness).

We first need to place the different display elements. To do this, perform the following steps:

Start with Front Panel. You can use a temperature gauge for the temperature and a simple slider indicator for Light Level. You will find both in the Indicators submenu of LabVIEW.
After that, simply place them on the right-hand side of the interface and delete the other indicators we used earlier. Also, name the new indicators accordingly so that we know which element to connect them to later.

Then, it is time to go back to Block Diagram to connect the new elements we just added in Front Panel. For the temperature element, it is easy: you can simply connect the temperature gauge to the TMP36 output pin. For the light level, we will make slightly more complicated changes. We will divide the measured value coming out of the Analog Read element by 5, thus obtaining an output value between 0 and 1. Then, we will multiply this value by 100, to end up with a value going from 0 to 100 percent of the ambient light level.
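The arithmetic behind these two blocks is simple enough to check outside LabVIEW. The following Python sketch mirrors the divide and multiply operations described above; it assumes the reading is a value between 0 and 5 (interpreted here as volts), and the names light_percent and V_REF are chosen only for this illustration.

```python
V_REF = 5.0  # assumed full-scale value of the analog reading (the text gives a 0 to 5 range)


def light_percent(reading, full_scale=V_REF):
    """Convert an analog reading in the range 0..full_scale to a percentage."""
    ratio = reading / full_scale  # divide block: yields 0.0 .. 1.0
    return ratio * 100.0          # multiply block: yields 0 .. 100 percent


if __name__ == "__main__":
    for volts in (0.0, 1.25, 2.5, 5.0):
        print(f"{volts:.2f} -> {light_percent(volts):.0f} %")
```

A reading of 2.5, for example, lands in the middle of the slider at 50 percent, which is the behaviour the two LabVIEW operator blocks are meant to produce.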
To do so, perform the following steps:

The first step is to place two elements corresponding to the two mathematical operations we want to do: a divide operator and a multiply operator. You can find both of them in the Functions panel of LabVIEW. Simply place them close to the Analog Read element in your program.
After that, right-click on one of the inputs of each operator element, and go to Create | Constant to create a constant input for each block. Add a value of 5 for the division block, and add a value of 100 for the multiply block.
Finally, connect the output of the Analog Read element to the input of the division block, the output of this block to the input of the multiply block, and the output of the multiply block to the input of the Light Level indicator.

You can now go back to Front Panel to see the new interface in action. You can run the program again by clicking on the little arrow on the toolbar. You should immediately see that Temperature is now indicated by the gauge on the right and that Light Level changes on the slider as you cover the sensor with your hand.

Summary

In this article, we connected a temperature sensor and a light-level sensor to Arduino and built a simple LabVIEW program to read data from these sensors. Then, we built a nice graphical interface to visualize the data coming from these sensors. There are many ways you can build other projects based on what you learned in this article. You can, for example, connect more temperature and/or light-level sensors to the Arduino board and display these measurements in the interface. You can also connect other kinds of sensors that are supported by LabVIEW, for example, other analog sensors. For instance, you can add a barometric pressure sensor or a humidity sensor to the project to build an even more complete weather-measurement station.
One other interesting extension of this article will be to use the storage and plotting capabilities of LabVIEW to dynamically plot the history of the measured data inside the LabVIEW interface. Resources for Article: Further resources on this subject: The Arduino Mobile Robot [article] Using the Leap Motion Controller with Arduino [article] Avoiding Obstacles Using Sensors [article]
Packt
06 Feb 2015
12 min read

Visualforce Development with Apex

In this article by Matt Kaufman and Michael Wicherski, authors of the book Learning Apex Programming, we will see how we can use Apex to extend the Salesforce1 Platform. We will also see how to create a customized Force.com page. (For more resources related to this topic, see here.)

Apex on its own is a powerful tool to extend the Salesforce1 Platform. It allows you to define your own database logic and fully customize the behavior of the platform. Sometimes, controlling what happens behind the scenes isn't enough. You might have a complex process that needs to step users through a wizard or need to present data in a format that isn't native to the Salesforce1 Platform, or maybe even make things look like your corporate website. Anytime you need to go beyond custom logic and implement a custom interface, you can turn to Visualforce.

Visualforce is the user interface framework for the Salesforce1 Platform. It supports the use of HTML, JavaScript, CSS, and Flash, all of which enable you to build your own custom web pages. These web pages are stored and hosted by the Salesforce1 Platform and can be exposed to just your internal users, your external community users, or publicly to the world. But wait, there's more! Also included with Visualforce is a robust markup language. This markup language (which is also referred to as Visualforce) allows you to bind your web pages to data and actions stored on the platform. It also allows you to leverage Apex for code-based objects and actions. Like the rest of the platform, the markup portion of Visualforce is upgraded three times a year with new tags and features. All of these features mean that Visualforce is very powerful.

s-con-what?

Before the introduction of Visualforce, the Salesforce1 Platform had a feature called s-controls. These were simple files where you could write HTML, CSS, and JavaScript. There was no custom markup language included.
In order to make things look like the Force.com GUI, a lot of HTML was required. Even a simple input form for a new Account record took a great deal of HTML code. The following is just a small, condensed excerpt of what the HTML would look like if you wanted to recreate such a screen from scratch:

<div class="bPageTitle"><div class="ptBody"><div class="content">
<img src="/s.gif" class="pageTitleIcon" title="Account" />
<h1 class="pageType">
   Account Edit<span class="titleSeparatingColon">:</span>
</h1>
<h2 class="pageDescription"> New Account</h2>
<div class="blank">&nbsp;</div>
</div>
<div class="links"></div></div><div class="ptBreadcrumb"></div></div>
<form action="/001/e" method="post" onsubmit="if (window.ffInAlert) { return false; }if (window.sfdcPage &amp;&amp; window.sfdcPage.disableSaveButtons) { return window.sfdcPage.disableSaveButtons(); }">
<div class="bPageBlock brandSecondaryBrd bEditBlock secondaryPalette">
<div class="pbHeader">
   <table border="0" cellpadding="0" cellspacing="0"><tbody>
     <tr>
     <td class="pbTitle">
     <img src="/s.gif" width="12" height="1" class="minWidth" style="margin-right: 0.25em;margin-right: 0.25em;margin-right: 0.25em;">
     <h2 class="mainTitle">Account Edit</h2>
     </td>
     <td class="pbButton" id="topButtonRow">
     <input value="Save" class="btn" type="submit">
     <input value="Cancel" class="btn" type="submit">
     </td>
     </tr>
   </tbody></table>
</div>
<div class="pbBody">
   <div class="pbSubheader brandTertiaryBgr first tertiaryPalette" >
   <span class="pbSubExtra"><span class="requiredLegend brandTertiaryFgr"><span class="requiredExampleOuter"><span class="requiredExample">&nbsp;</span></span>
     <span class="requiredMark">*</span>
     <span class="requiredText"> = Required Information</span>
     </span></span>
     <h3>Account Information<span class="titleSeparatingColon">:</span> </h3>
   </div>
   <div class="pbSubsection">
   <table class="detailList" border="0" cellpadding="0" cellspacing="0"><tbody>
     <tr>
       <td class="labelCol requiredInput">
       <label><span class="requiredMark">*</span>Account Name</label>
     </td>
     <td class="dataCol col02">
       <div class="requiredInput"><div class="requiredBlock"></div>
       <input id="acc2" name="acc2" size="20" type="text">
       </div>
     </td>
     <td class="labelCol">
       <label>Website</label>
     </td>
     <td class="dataCol">
       <span>
       <input id="acc12" name="acc12" size="20" type="text">
       </span>
     </td>
     </tr>
   </tbody></table>
   </div>
</div>
<div class="pbBottomButtons">
   <table border="0" cellpadding="0" cellspacing="0"><tbody>
   <tr>
     <td class="pbTitle"><img src="/s.gif" width="12" height="1" class="minWidth" style="margin-right: 0.25em;margin-right: 0.25em;margin-right: 0.25em;">&nbsp;</td>
     <td class="pbButtonb" id="bottomButtonRow">
     <input value=" Save " class="btn" title="Save" type="submit">
     <input value="Cancel" class="btn" type="submit">
     </td>
   </tr>
   </tbody></table>
</div>
<div class="pbFooter secondaryPalette"><div class="bg"> </div></div>
</div>
</form>

We did our best to trim down this HTML to as little as possible. Despite all of our efforts, it still took up more space than we wanted. The really sad part is that all of that code only results in the following screenshot:

Not only was it time consuming to write all this HTML, but odds were that we wouldn't get it exactly right the first time. Worse still, every time the business requirements changed, we had to go through the exhausting effort of modifying the HTML code. Something had to change in order to provide us relief. That something was the introduction of Visualforce and its markup language.

Your own personal Force.com

The markup tags in Visualforce correspond to various parts of the Force.com GUI.
These tags allow you to quickly generate HTML markup without actually writing any HTML. It's really one of the greatest tricks of the Salesforce1 Platform. You can easily create your own custom screens that look just like the built-in ones with less effort than it would take you to create a web page for your corporate website. Take a look at the Visualforce markup that corresponds to the HTML and screenshot we showed you earlier:

<apex:page standardController="Account" >
<apex:sectionHeader title="Account Edit" subtitle="New Account" />
<apex:form>
   <apex:pageBlock title="Account Edit" mode="edit" >
     <apex:pageBlockButtons>
       <apex:commandButton value="Save" action="{!save}" />
       <apex:commandButton value="Cancel" action="{!cancel}" />
     </apex:pageBlockButtons>
     <apex:pageBlockSection title="Account Information" >
       <apex:inputField value="{!account.Name}" />
       <apex:inputField value="{!account.Website}" />
     </apex:pageBlockSection>
   </apex:pageBlock>
</apex:form>
</apex:page>

Impressive! With merely these 15 lines of markup, we can render nearly 100 lines of the earlier HTML. Don't believe us? Try it out yourself.

Creating a Visualforce page

Just like triggers and classes, Visualforce pages can be created and edited using the Force.com IDE. The Force.com GUI also includes a web-based editor to work with Visualforce pages. To create a new Visualforce page, perform these simple steps:

Right-click on your project and navigate to New | Visualforce Page. The Create New Visualforce Page window appears as shown:
Enter the label and name for your new page in the Label and Name fields, respectively. For this example, use myTestPage.
Select the API version for the page. For this example, keep it at the default value.
Click on Finish. A progress bar will appear followed by your new Visualforce page.

Remember that you always want to create your code in a Sandbox or Developer Edition org, not directly in Production.
It is technically possible to edit Visualforce pages in Production, but you're breaking all sorts of best practices when you do.

Similar to other markup languages, every tag in a Visualforce page must be closed. Tags and their corresponding closing tags must also occur in a proper order. The values of tag attributes are enclosed by double quotes; however, single quotes can be used inside the value to denote text values. Every Visualforce page starts with the <apex:page> tag and ends with </apex:page> as shown:

<apex:page>
<!-- Your content goes here -->
</apex:page>

Within the <apex:page> tags, you can paste your existing HTML as long as it is properly ordered and closed. The result will be a web page hosted by the Salesforce1 Platform.

Not much to see here

If you are a web developer, then there's a lot you can do with Visualforce pages. Using HTML, CSS, and images, you can create really pretty web pages that educate your users. If you have some programming skills, you can also use JavaScript in your pages to allow for interaction. If you have access to web services, you can use JavaScript to call the web services and make a really powerful application. Check out the following Visualforce page for an example of what you can do:

<apex:page>
<script type="text/javascript">
function doStuff(){
   var x = document.getElementById("myId");
   console.log(x);
}
</script>
<img src="http://www.thisbook.com/logo.png" />
<h1>This is my title</h1>
<h2>This is my subtitle</h2>
<p>In a world where books are full of code, there was only one that taught you everything you needed to know about Apex!</p>
<ol>
   <li>My first item</li>
   <li>Etc.</li>
</ol>
<span id="myId"></span>
<iframe src="http://www.thisbook.com/mypage.html" />
<form action="http://thisbook.com/submit.html" >
   <input type="text" name="yoursecret" />
</form>
</apex:page>

All of this code is standalone and really has nothing to do with the Salesforce1 Platform other than being hosted by it.
However, what really makes Visualforce powerful is its ability to interact with your data, which allows your pages to be more dynamic. Even better, you can write Apex code to control how your pages behave, so instead of relying on client-side JavaScript, your logic can run server side.

Summary

In this article, we learned about a few features of Apex and how we can use it to extend the Salesforce1 Platform. We also created a custom Force.com page. Well, you've made a lot of progress. Not only can you write code to control how the database behaves, but you can create beautiful-looking pages too. You're an Apex rock star and nothing is going to hold you back. It's time to show your skills to the world. If you want to dig deeper, buy the book Learning Apex Programming and learn, in a simple step-by-step fashion, how to use Apex, the language for extending the Salesforce1 Platform.

Resources for Article:

Further resources on this subject:
Learning to Fly with Force.com [article]
Building, Publishing, and Supporting Your Force.com Application [article]
Adding a Geolocation Trigger to the Salesforce Account Object [article]
Packt
06 Feb 2015
32 min read

Remote Access

In this article by Jordan Krause, author of the book Windows Server 2012 R2 Administrator Cookbook, we will see how Windows Server 2012 R2 by Microsoft brings a whole new way of looking at remote access. Companies have historically relied on third-party tools to connect remote users into the network, such as traditional and SSL VPN provided by appliances from large networking vendors. I'm here to tell you those days are gone. Those of us running Microsoft-centric shops can now rely on Microsoft technologies to connect our remote workforce. Better yet is that these technologies are included with the Server 2012 R2 operating system, and have functionality that is much improved over anything that a traditional VPN can provide. Regular VPN does still have a place in the remote access space, and the great news is that you can also provide it with Server 2012 R2. Our primary focus for this article will be DirectAccess (DA). DA is kind of like automatic VPN. There is nothing the user needs to do in order to be connected to work. Whenever they are on the Internet, they are also connected automatically to the corporate network. DirectAccess is an amazing way to have your Windows 7 and Windows 8 domain joined systems connected back to the network for data access and for management of those traveling machines. DirectAccess has actually been around since 2008, but the first version came with some steep infrastructure requirements and was not widely used. Server 2012 R2 brings a whole new set of advantages and makes implementation much easier than in the past. I still find many server and networking admins who have never heard of DirectAccess, so let's spend some time together exploring some of the common tasks associated with it. 
In this article, we will cover the following recipes:

• Configuring DirectAccess, VPN, or a combination of the two
• Pre-staging Group Policy Objects (GPOs) to be used by DirectAccess
• Enhancing the security of DirectAccess by requiring certificate authentication
• Building your Network Location Server (NLS) on its own system

There are two "flavors" of remote access available in Windows Server 2012 R2. The most common way to implement the Remote Access role is to provide DirectAccess for your Windows 7 and Windows 8 domain-joined client computers, and VPN for the rest. The DirectAccess machines are typically your company-owned corporate assets. One of the primary reasons that DirectAccess is usually only for company assets is that the client machines must be joined to your domain, because the DirectAccess configuration settings are brought down to the client through a GPO. I doubt you want home and personal computers joining your domain.

VPN is therefore used for down-level clients such as Windows XP, and for home and personal devices that want to access the network. Since this is a traditional VPN listener with all the regular protocols available, such as PPTP, L2TP, and SSTP, it can even work to connect devices such as smartphones.

There is a third function available within the Server 2012 R2 Remote Access role, called the Web Application Proxy (WAP). This function is not used for connecting remote computers fully into the network as DirectAccess and VPN are; rather, WAP is used for publishing internal web resources out to the Internet. For example, if you are running Exchange and Lync Server inside your network and want to publish access to these web-based resources to the Internet for external users to connect to, WAP would be a mechanism that could publish access to these resources. The term for publishing out to the Internet like this is reverse proxy, and WAP can act as such. It can also behave as an ADFS Proxy.
For further information on the WAP role, please visit: http://technet.microsoft.com/en-us/library/dn584107.aspx

One of the most confusing parts about setting up DirectAccess is that there are many different ways to do it. Some are good ideas, while others are not. Before we get rolling with recipes, we are going to cover a series of questions and answers to help guide you toward a successful DA deployment. The first question that always presents itself when setting up DA is "How do I assign IP addresses to my DirectAccess server?". This is quite a loaded question, because the answer depends on how you plan to implement DA, which features you plan to utilize, and even upon how secure you believe your DirectAccess server to be. Let me ask you some questions, pose potential answers to those questions, and discuss the effects of making each decision.

DirectAccess Planning Q&A

Which client operating systems can connect using DirectAccess?

Answer: Windows 7 Ultimate, Windows 7 Enterprise, and Windows 8.x Enterprise. You'll notice that the Professional SKU is missing from this list. That is correct: Windows 7 and Windows 8 Pro do not contain the DirectAccess connectivity components. Yes, this does mean that Surface Pro tablets cannot utilize DirectAccess out of the box. However, I have seen many companies now install Windows 8 Enterprise onto their Surface tablets, effectively turning them into "Surface Enterprises." This works fine and does indeed enable them to be DirectAccess clients. In fact, I am currently typing this text on a DirectAccess-connected Surface "Pro turned Enterprise" tablet.

Do I need one or two NICs on my DirectAccess server?

Answer: Technically, you could set it up either way. In practice, however, it really is designed for a dual-NIC implementation. Single-NIC DirectAccess works okay sometimes to establish a proof of concept to test out the technology.
But I have seen too many problems with single-NIC implementations in the field to ever recommend them for production use. Stick with two network cards, one facing the internal network and one facing the Internet.

Do my DirectAccess servers have to be joined to the domain?

Answer: Yes.

Does DirectAccess have site-to-site failover capabilities?

Answer: Yes, though only Windows 8.x client computers can take advantage of it. This functionality is called Multi-Site DirectAccess. Multiple DA servers that are spread out geographically can be joined together in a multi-site array. Windows 8 client computers keep track of each individual entry point and are able to swing between them as needed or at user preference. Windows 7 clients do not have this capability and will always connect through their primary site.

What are these things called 6to4, Teredo, and IP-HTTPS I have seen in the Microsoft documentation?

Answer: 6to4, Teredo, and IP-HTTPS are all IPv6 transition tunneling protocols. All DirectAccess packets that are moving across the Internet between DA client and DA server are IPv6 packets. If your internal network is IPv4, then when those packets reach the DirectAccess server they are translated into IPv4 packets by special components called DNS64 and NAT64. While these functions handle the translation of packets from IPv6 into IPv4 when necessary inside the corporate network, the key point here is that all DirectAccess packets that are traveling over the Internet part of the connection are always IPv6. Since the majority of the Internet is still IPv4, this means that we must tunnel those IPv6 packets inside something to get them across the Internet. That is the job of 6to4, Teredo, and IP-HTTPS. 6to4 encapsulates IPv6 packets inside IPv4 headers and shuttles them around the Internet using IP protocol 41. Teredo similarly encapsulates IPv6 packets inside IPv4 headers, but then uses UDP port 3544 to transport them.
IP-HTTPS encapsulates IPv6 packets inside an HTTP stream that is encrypted with TLS and carried over IPv4, essentially creating an HTTPS stream across the Internet. This, like any HTTPS traffic, utilizes TCP port 443. The DirectAccess traffic traveling inside any of these tunnels is always encrypted, since DirectAccess itself is protected by IPsec.

Do I want to enable my clients to connect using Teredo?

Answer: Most of the time, the answer here is yes. Probably the biggest factor that weighs on this decision is whether or not you are still running Windows 7 clients. When Teredo is enabled in an environment, this gives the client computers an opportunity to connect using Teredo, rather than all clients connecting in over the IP-HTTPS protocol. IP-HTTPS is sort of the "catchall" for connections, but Teredo will be preferred by clients if it is available. For Windows 7 clients, Teredo is quite a bit faster than IP-HTTPS. So enabling Teredo on the server side means your Windows 7 clients (the ones connecting via Teredo) will have quicker response times, and the load on your DirectAccess server will be lessened. This is because Windows 7 clients that connect over IP-HTTPS encrypt all of the traffic twice, once with IPsec and again with the TLS of the IP-HTTPS tunnel. This also means that the DA server is encrypting and decrypting everything that comes and goes twice. In Windows 8, there is an enhancement that brings IP-HTTPS performance almost on par with Teredo, and so environments that are fully cut over to Windows 8 will receive less benefit from the extra work that goes into making sure Teredo works.

Can I place my DirectAccess server behind a NAT?

Answer: Yes, though there is a downside. Teredo cannot work if the DirectAccess server is sitting behind a NAT. For Teredo to be available, the DA server must have an External NIC that has two consecutive public IP addresses. True public addresses. If you place your DA server behind any kind of NAT, Teredo will not be available and all clients will connect using the IP-HTTPS protocol.
Again, if you are using Windows 7 clients, this will decrease their speed and increase the load on your DirectAccess server.

How many IP addresses do I need on a standalone DirectAccess server?

Answer: I am going to leave single-NIC implementation out of this answer since I don't recommend it anyway. For scenarios where you are sitting the External NIC behind a NAT or, for any other reason, are limiting your DA to IP-HTTPS only, we need one external address and one internal address. The external address can be a true public address or a private NATed DMZ address. The same goes for the internal address; it could be a true internal IP or a DMZ IP. Make sure both NICs are not plugged into the same DMZ, however. For a better installation scenario that allows Teredo connections to be possible, you would need two consecutive public IP addresses on the External NIC and a single internal IP on the Internal NIC. This internal IP could be either true internal or DMZ. But the public IPs would really have to be public for Teredo to work.

Do I need an internal PKI?

Answer: Maybe. If you want to connect Windows 7 clients, then the answer is yes. If you are completely Windows 8, then technically you do not need an internal PKI. But you really should use it anyway. Using an internal PKI, which can be a single, simple Windows CA server, increases the security of your DirectAccess infrastructure. You'll find out during this article just how easy it is to require certificates as part of the tunnel-building authentication process.

Configuring DirectAccess, VPN, or a combination of the two

Now that we have some general ideas about how we want to implement our remote access technologies, where do we begin? Most services that you want to run on a Windows Server begin with a role installation, but the implementation of remote access begins before that. Let's walk through the process of taking a new server and turning it into a Microsoft Remote Access server.
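Before assigning addresses to the new server, the addressing rules from the planning Q&A can be sanity-checked with a few lines of Python. This is only a sketch of the rules as stated in the answers above: the function name and its parameters are made up for illustration, the addresses in the usage note are examples, and 6to4 is left out because the article does not spell out when it is available.

```python
import ipaddress


def available_transports(external_ips, behind_nat):
    """Return the IPv6 transition protocols a DA server could offer.

    Encodes only the rules stated in the planning Q&A: IP-HTTPS is the
    catchall, and Teredo needs two consecutive, truly public IPv4
    addresses on an External NIC that is not behind a NAT.
    """
    transports = ["IP-HTTPS"]  # TCP 443, available in every scenario
    addrs = sorted(ipaddress.IPv4Address(ip) for ip in external_ips)
    # Look for any neighbouring pair that is public and differs by one.
    has_consecutive_public = any(
        a.is_global and b.is_global and int(b) - int(a) == 1
        for a, b in zip(addrs, addrs[1:])
    )
    if not behind_nat and has_consecutive_public:
        transports.insert(0, "Teredo")  # UDP 3544, preferred by clients
    return transports
```

Calling `available_transports(["131.107.0.2", "131.107.0.3"], behind_nat=False)` reports both Teredo and IP-HTTPS, while the same call with private DMZ addresses or with `behind_nat=True` leaves IP-HTTPS only, which is the trade-off the NAT answer describes.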
Getting ready

All of our work will be accomplished on a new Windows Server 2012 R2. We are taking the two-NIC approach to networking, and so we have two NICs installed on this server. The Internal NIC is plugged into the corporate network and the External NIC is plugged into the Internet for the sake of simplicity. The External NIC could just as well be plugged into a DMZ.

How to do it...

Follow these steps to turn your new server into a Remote Access server:

1. Assign IP addresses to your server. Remember, the most important part is making sure that the Default Gateway goes on the External NIC only.
2. Join the new server to your domain.
3. Install an SSL certificate onto your DirectAccess server that you plan to use for the IP-HTTPS listener. This is typically a certificate purchased from a public CA.
4. If you're planning to use client certificates for authentication, make sure to pull down a copy of the certificate to your DirectAccess server.

You want to make sure certificates are in place before you start with the configuration of DirectAccess. This way the wizards will be able to automatically pull in information about those certificates in the first run. If you don't, DA will set itself up to use self-signed certificates, which are a security no-no.

5. Use Server Manager to install the Remote Access role. You should only do this after completing the steps listed earlier.
6. If you plan to load balance multiple DirectAccess servers together at a later time, make sure to also install the feature called Network Load Balancing.
7. After selecting your role and feature, you will be asked which Remote Access role services you want to install. For our purposes in getting the remote workforce connected back into the corporate network, we want to choose DirectAccess and VPN (RAS).
8. Now that the role has been successfully installed, you will see a yellow exclamation mark notification near the top of Server Manager indicating that you have some Post-deployment Configuration that needs to be done.

Do not click on Open the Getting Started Wizard! Unfortunately, Server Manager leads you to believe that launching the Getting Started Wizard (GSW) is the logical next step. However, using the GSW as the mechanism for configuring your DirectAccess settings is kind of like roasting a marshmallow with a pair of tweezers. In order to ensure you have the full range of options available to you as you configure your remote access settings, and that you don't get burned later, make sure to launch the configuration this way:

9. Click on the Tools menu from inside Server Manager and launch the Remote Access Management Console.
10. In the left window pane, click on Configuration | DirectAccess and VPN.
11. Click on the second link, the one that says Run the Remote Access Setup Wizard. Please note that once again the top option is to run that pesky Getting Started Wizard. Don't do it! I'll explain why in the How it works… section of this recipe.
12. Now you have a choice that you will have to answer for yourself. Are you configuring only DirectAccess, only VPN, or a combination of the two? Simply click on the option that you want to deploy.
13. Following your choice, you will see a series of steps (steps 1 through 4) that need to be accomplished. This series of mini-wizards will guide you through the remainder of the DirectAccess and VPN particulars.

This recipe isn't large enough to cover every specific option included in those wizards, but at least you now know the correct way to bring a DirectAccess/VPN server into operation.

How it works...

The remote access technologies included in Server 2012 R2 have great functionality, but their initial configuration can be confusing.
Following the procedure listed in this recipe will set you on the right path to be successful in your deployment, and prevent you from running into issues down the road. The reason that I absolutely recommend you stay away from using the "shortcut" deployment method provided by the Getting Started Wizard is twofold:

• GSW skips a lot of options as it sets up DirectAccess, so you don't really have any understanding of how it works after finishing. You may have DA up and running, but have no idea how it's authenticating or working under the hood. This holds so much potential for problems later, should anything suddenly stop working.
• GSW employs a number of bad security practices in order to save time and effort in the setup process. For example, using the GSW usually means that your DirectAccess server will be authenticating users without client certificates, which is not a best practice. Also, it will co-host something called the NLS website on itself, which is also not a best practice. Those who utilize the GSW to configure DirectAccess will find that their GPO, which contains the client connectivity settings, will be security-filtered to the Domain Computers group. Even though it also contains a WMI filter that is supposed to limit that policy application to mobile hardware such as laptops, this is a terribly scary thing to see inside GPO filtering settings. You probably don't want all of your laptops to immediately start getting DA connectivity settings, but that is exactly what the GSW does for you. Perhaps worst, the GSW will create and make use of self-signed SSL certificates to validate its web traffic, even the traffic coming in from the Internet! This is a terrible practice and is the number one reason that should convince you that clicking on the Getting Started Wizard is not in your best interests.
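If you inherit a DirectAccess server and are not sure how it was built, the habits listed above make a handy checklist. The following Python sketch simply encodes that checklist; the `DeploymentSettings` class and its field names are hypothetical and do not correspond to any Microsoft API, so you would fill them in by hand from what you see in the Remote Access Management Console and in Group Policy.

```python
from dataclasses import dataclass


@dataclass
class DeploymentSettings:
    """Hypothetical hand-filled summary of a DirectAccess deployment."""
    requires_client_certificates: bool
    nls_hosted_on_da_server: bool
    client_gpo_security_filter: str
    ssl_cert_self_signed: bool


def gsw_style_warnings(settings):
    """Return the Getting Started Wizard habits present in a deployment."""
    warnings = []
    if not settings.requires_client_certificates:
        warnings.append("tunnels authenticate without client certificates")
    if settings.nls_hosted_on_da_server:
        warnings.append("NLS website is co-hosted on the DirectAccess server")
    if settings.client_gpo_security_filter.strip().lower() == "domain computers":
        warnings.append("client GPO is security-filtered to Domain Computers")
    if settings.ssl_cert_self_signed:
        warnings.append("self-signed SSL certificate is in use")
    return warnings
```

An empty list back from `gsw_style_warnings` does not prove a deployment is sound; it only means none of the four shortcuts called out in this recipe are present.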
Pre-staging Group Policy Objects (GPOs) to be used by DirectAccess

One of the great things about DirectAccess is that all of the connectivity settings the client computers need in order to connect are contained within a Group Policy Object (GPO). This means that you can turn new client computers into DirectAccess-connected clients without ever touching that system. Once configured properly, all you need to do is add the new computer account to an Active Directory security group, and during the next automatic Group Policy refresh cycle (usually within 90 minutes), that new laptop will be connecting via DirectAccess whenever outside the corporate network.

You can certainly choose not to pre-stage anything with the GPOs and DirectAccess will still work. When you get to the end of the DA configuration wizards, it will inform you that two new GPOs are about to be created inside Active Directory. One GPO is used to contain the DirectAccess server settings and the other GPO is used to contain the DirectAccess client settings. If you allow the wizard to handle the generation of these GPOs, it will create them, link them, filter them, and populate them with settings automatically. About half of the time I see folks do it this way and they are forever happy with letting the wizard manage those GPOs now and in the future.

The other half of the time, it is desired that we maintain a little more personal control over the GPOs. If you are setting up a new DA environment but your credentials don't have permission to create GPOs, the wizard is not going to be able to create them either. In this case, you will need to work with someone on your Active Directory team to get them created. Another reason to manage the GPOs manually is to have better control over placement of these policies. When you let the DirectAccess wizard create the GPOs, it will link them to the top level of your domain.
It also sets Security Filtering on those GPOs so they are not going to be applied to everything in your domain, but when you open up the Group Policy Management Console you will always see those DirectAccess policies listed right up there at the top level of the domain. Sometimes this is simply not desirable. So for this reason also, you may want to create and manage the GPOs by hand, so that you can place and link them exactly where you want them.

The key factors here are to make sure that your DirectAccess Server Settings GPO applies to only the DirectAccess server or servers in your environment, and that the DirectAccess Client Settings GPO applies to only the DA client computers that you plan to enable in your network. The best practice here is to filter this GPO so that it applies only to a specific Active Directory security group, giving you full control over which computer accounts are in that group. I have seen some folks do it based only on the OU links and include whole OUs in the filtering for the clients GPO (foregoing the use of an AD group at all), but doing it this way makes it quite a bit more difficult to add or remove machines from the access list in the future.
Getting ready

While the DirectAccess wizards themselves are run from the DirectAccess server, our work with this recipe is not. The Group Policy settings that we will be configuring are all accomplished within Active Directory, and we will be doing the work from a Domain Controller in our environment.

How to do it...

To pre-stage Group Policy Objects (GPOs) for use with DirectAccess:

1. On your Domain Controller, launch the Group Policy Management Console.
2. Expand Forest | Domains | Your Domain Name. There should be a listing here called Group Policy Objects. Right-click on that and choose New.
3. Name your new GPO something like DirectAccess Server Settings.
4. Click on the new DirectAccess Server Settings GPO and it should open up automatically to the Scope tab. We need to adjust the Security Filtering section so that this GPO only applies to our DirectAccess server. This is a critical step for each GPO to ensure the settings that are going to be placed here do not get applied to the wrong computers.
5. Remove Authenticated Users that is prepopulated in that list. The list should now be empty.
6. Click the Add… button and search for the computer account of your DirectAccess server. Mine is called RA-01. By default this window will only search user accounts, so you will need to adjust Object Types to include Computers before it will allow you to add your server into this filtering list. Your Security Filtering list should now contain only the computer account of your DirectAccess server.
7. Now click on the Details tab of your GPO.
8. Change the GPO Status to be User configuration settings disabled. We do this because our GPO is only going to contain computer-level settings, nothing at the user level.
9. The last thing to do is link your GPO to an appropriate container. Since we have Security Filtering enabled, our GPO is only ever going to apply its settings to the RA-01 server; however, without creating a link, the GPO will not even attempt to apply itself to anything. My RA-01 server is sitting inside the OU called Remote Access Servers. So I will right-click on my Remote Access Servers OU and choose Link an Existing GPO….
10. Choose the new DirectAccess Server Settings from the list of available GPOs and click on the OK button. This creates the link and puts the GPO into action. Since there are not yet any settings inside the GPO, it won't actually make any changes on the server. The DirectAccess configuration wizards take care of populating the GPO with the settings that are needed.
11. Now we simply need to rinse and repeat all of these steps to create another GPO, something like DirectAccess Client Settings. You want to set up the client settings GPO in the same way. Make sure that it is filtering to only the Active Directory security group that you created to contain your DirectAccess client computers. And make sure to link it to an appropriate container that will include those computer accounts. So your clients GPO will look much like the server one, but with your client security group in the Security Filtering list.

How it works...

Creating GPOs in Active Directory is a simple enough task, but it is critical that you configure the Links and Security Filtering correctly. If you do not take care to ensure that these DirectAccess connection settings are only going to apply to the machines that actually need them, you could create a world of trouble: internal servers could receive remote access connection settings and then have connection issues while inside the network.

Enhancing the security of DirectAccess by requiring certificate authentication

When a DirectAccess client computer builds its IPsec tunnels back to the corporate network, it has the ability to require a certificate as part of that authentication process. In earlier versions of DirectAccess, the one in Server 2008 R2 and the one provided by Unified Access Gateway (UAG), these certificates were required in order to make DirectAccess work.
Setting up the certificates really isn't a big deal at all; as long as there is a CA server in your network you are already prepared to issue the certs needed at no cost. Unfortunately, though, there must have been enough complaints back to Microsoft in order for them to make these certificates "recommended" instead of "required" and they created a new mechanism in Windows 8 and Server 2012 called KerberosProxy that can be used to authenticate the tunnels instead. This allows the DirectAccess tunnels to build without the computer certificate, making that authentication process less secure. I'm here to strongly recommend that you still utilize certificates in your installs! They are not difficult to set up, and using them makes your tunnel authentication stronger. Further, many of you may not have a choice and will still be required to install these certificates. Only simple DirectAccess scenarios that are all Windows 8 on the client side can get away with the shortcut method of foregoing certs. Anybody who still wants to connect Windows 7 via DirectAccess will need to use certificates on all of their client computers, both Windows 7 and Windows 8. In addition to Windows 7 access, anyone who intends to use the advanced features of DirectAccess such as load balancing, multi-site, or two-factor authentication will also need to utilize these certificates. With any of these scenarios, certificates become a requirement again, not a recommendation. In my experience, almost everyone still has Windows 7 clients that would benefit from being DirectAccess connected, and it's always a good idea to make your DA environment redundant by having load balanced servers. This further emphasizes the point that you should just set up certificate authentication right out of the gate, whether or not you need it initially. 
You might decide to make a change later that would require certificates, and it would be easier to have them installed from the get-go rather than trying to incorporate them later into a running DA environment.

Getting ready

In order to distribute certificates, you will need a CA server running in your network. Once certificates are distributed to the appropriate places, the rest of our work will be accomplished from our Server 2012 R2 DirectAccess server.

How to do it...

Follow these steps to make use of certificates as part of the DirectAccess tunnel authentication process:

1. The first thing that you need to do is distribute certificates to your DirectAccess servers and all DirectAccess client computers. The easiest way to do this is by using the built-in Computer template provided by default in a Windows CA server. If you desire to build a custom certificate template for this purpose, you can certainly do so. I recommend that you duplicate the Computer template and build it from there. Whenever I create a custom template for use with DirectAccess, I try to make sure that it meets the following criteria:

• The Subject Name of the certificate should match the Common Name of the computer (which is also the FQDN of the computer).
• The Subject Alternative Name (SAN) of the certificate should match the DNS Name of the computer (which is also the FQDN of the computer).
• The certificate should serve the Intended Purposes of both Client Authentication and Server Authentication.

2. You can issue the certificates manually using Microsoft Management Console (MMC). Otherwise, you can lessen your hands-on administrative duties by enabling Autoenrollment.
3. Now that we have certificates distributed to our DirectAccess clients and servers, log in to your primary DirectAccess server and open up the Remote Access Management Console.
4. Click on Configuration in the top-left corner. You should now see steps 1 through 4 listed.
5. Click Edit… listed under Step 2.
6. Now you can either click Next twice or click on the word Authentication to jump directly to the authentication screen.
7. Check the box that says Use computer certificates.
8. Now we have to specify the Certification Authority server that issued our client certificates. If you used an intermediate CA to issue your certs, make sure to check the appropriate checkbox. Otherwise, most of the time, certificates are issued from a root CA and in this case you would simply click on the Browse… button and look for your CA in the list.

This screen is sometimes confusing because people expect to have to choose the certificate itself from the list. This is not the case. What you are actually choosing from this list is the Certificate Authority server that issued the certificates.

9. Make any other appropriate selections on the Authentication screen. For example, many times when we require client certificates for authentication, it is because we have Windows 7 computers that we want to connect via DirectAccess. If that is the case for you, select the checkbox for Enable Windows 7 client computers to connect via DirectAccess.

How it works...

Requiring certificates as part of your DirectAccess tunnel authentication process is a good idea in any environment. It makes the solution more secure, and enables advanced functionality. The primary driver for most companies to require these certificates is the enablement of Windows 7 clients to connect via DirectAccess, but I suggest that anyone using DirectAccess in any capacity make use of these certs. They are simple to deploy, easy to configure, and give you some extra peace of mind that only computers that have a certificate issued directly to them from your own internal CA server are going to be able to connect through your DirectAccess entry point.
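The three template criteria from the first step of this recipe are easy to get subtly wrong, so here is a rough Python sketch that checks them against a certificate's fields. It assumes you have already extracted the subject common name, the SAN DNS names, and the enhanced key usage OIDs by some other means; the `CertInfo` class and the function name are invented for this illustration, while the two OIDs are the standard identifiers for Client Authentication and Server Authentication.

```python
from dataclasses import dataclass, field

# Standard extended key usage OIDs (RFC 5280).
CLIENT_AUTH = "1.3.6.1.5.5.7.3.2"
SERVER_AUTH = "1.3.6.1.5.5.7.3.1"


@dataclass
class CertInfo:
    """Hypothetical, already-parsed view of a machine certificate."""
    subject_common_name: str
    san_dns_names: list = field(default_factory=list)
    enhanced_key_usages: set = field(default_factory=set)


def unmet_da_criteria(cert, computer_fqdn):
    """Return the criteria a certificate fails; an empty list means it passes."""
    fqdn = computer_fqdn.lower()
    problems = []
    if cert.subject_common_name.lower() != fqdn:
        problems.append("Subject Name does not match the computer FQDN")
    if fqdn not in [name.lower() for name in cert.san_dns_names]:
        problems.append("SAN DNS Name does not match the computer FQDN")
    if not {CLIENT_AUTH, SERVER_AUTH} <= set(cert.enhanced_key_usages):
        problems.append("missing Client or Server Authentication purpose")
    return problems
```

For a laptop named `laptop01.mydomain.local` (an example name), a certificate whose subject and SAN both carry that FQDN and whose usages include both OIDs comes back with no problems; a web-server-only certificate would be flagged for the missing Client Authentication purpose.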
Building your Network Location Server (NLS) on its own system

If you zipped through the default settings when configuring DirectAccess, or worse used the Getting Started Wizard, chances are that your Network Location Server (NLS) is running right on the DirectAccess server itself. This is not the recommended method for using NLS; it really should be running on a separate web server. In fact, if you later want to do something more advanced such as setting up load balanced DirectAccess servers, you're going to have to move NLS off onto a different server anyway. So you might as well do it right the first time.

NLS is a very simple requirement, yet a critical one. It is just a website. It doesn't matter what content the site has, and it only has to run inside your network. Nothing has to be externally available. In fact, nothing should be externally available, because you only want this site being accessed internally. This NLS website is a large part of the mechanism by which DirectAccess client computers figure out when they are inside the office and when they are outside. If they can see the NLS website, they know they are inside the network and will disable DirectAccess name resolution, effectively turning off DA. If they do not see the NLS website, they will assume they are outside the corporate network and enable DirectAccess name resolution.

There are two gotchas with setting up an NLS website:

• The first is that it must be HTTPS, so it does need a valid SSL certificate. Since this website is only running inside the network and being accessed from domain-joined computers, this SSL certificate can easily be one that has been issued from your internal CA server. So there is no cost associated there.
• The second catch that I have encountered a number of times is that for some reason the default IIS splash screen page doesn't make for a very good NLS website.
If you set up a standard IIS web server and use the default site as NLS, sometimes it works to validate the connections and sometimes it doesn't. Given that, I always set up a specific site that I create myself, just to be on the safe side.

So let's work together to follow the exact process I always take when setting up NLS websites in a new DirectAccess environment.

Getting ready

Our NLS website will be hosted on an IIS server we have that runs Server 2012 R2. Most of the work will be accomplished from this web server, but we will also be creating a DNS record and will utilize a Domain Controller for that task.

How to do it...

Let's work together to set up our new Network Location Server website:

1. First decide on an internal DNS name to use for this website and set it up in DNS of your domain. I am going to use nls.mydomain.local and am creating a regular Host (A) record that points nls.mydomain.local at the IP address of my web server.
2. Now log in to that web server and let's create some simple content for this new website. Create a new folder called C:\NLS.
3. Inside your new folder, create a new Default.htm file.
4. Edit this file and throw some simple text in there. I usually say something like "This is the NLS website used by DirectAccess. Please do not delete or modify me!".
5. Remember, this needs to be an HTTPS website, so before we try setting up the actual website, we should acquire the SSL certificate that we need to use with this site. Since this certificate is coming from my internal CA server, I'm going to open up MMC on my web server to accomplish this task.
6. Once MMC is opened, add the Certificates snap-in. Make sure to choose Computer account and then Local computer when it prompts you for which certificate store you want to open.
7. Expand Certificates (Local Computer) | Personal | Certificates.
8. Right-click on this Certificates folder and choose All Tasks | Request New Certificate….
9. Click Next twice and you should see your list of certificate templates that are available on your internal CA server. If you do not see one that looks appropriate for requesting a website certificate, you may need to check over the settings on your CA server to make sure the correct templates are configured for issuing.
10. My template is called Custom Web Server. Since this is a web server certificate, there is some additional information that I need to provide in my request in order to successfully issue a certificate. So I go ahead and click on the link that says More information is required to enroll for this certificate. Click here to configure settings.
11. Drop down the Subject name | Type menu and choose the option Common name.
12. Enter a common name for our website into the Value field, which in my case is nls.mydomain.local.
13. Click the Add button and your CN should move over to the right side of the screen.
14. Click on OK then click on the Enroll button. You should now have an SSL certificate sitting in your certificates store that can be used to authenticate traffic moving to our nls.mydomain.local name.
15. Open up Internet Information Services (IIS) Manager, and browse to the Sites folder. Go ahead and remove the default website that IIS automatically set up, so that we can create our own NLS website without any fear of conflict.
16. Click on the Add Website… action.
17. Populate the site information, pointing the physical path at your C:\NLS folder and using an https binding. Make sure to choose your own IP address and SSL certificate from the lists, of course.
18. Click the OK button and you now have an NLS website running successfully in your network. You should be able to open up a browser on a client computer sitting inside the network and successfully browse to https://nls.mydomain.local.

How it works...

In this recipe, we configured a basic Network Location Server website for use with our DirectAccess environment.
This site will do exactly what we need it to when our DA client computers try to validate whether they are inside or outside the corporate network. While this recipe meets our requirements for NLS, and in fact puts us into a good practice of installing DirectAccess with NLS being hosted on its own web server, there is yet another step you could take to make it even better. Currently this web server is a single point of failure for NLS. If this web server goes down or has a problem, we would have DirectAccess client computers inside the office who would think they are outside, and they would have some major name resolution problems until we sorted out the NLS problem. Given that, it is a great idea to make NLS redundant. You could cluster servers together, use Microsoft Network Load Balancing ( NLB ), or even use some kind of hardware load balancer if you have one available in your network. This way you could run the same NLS website on multiple web servers and know that your clients will still work properly in the event of a web server failure. Summary This article encourages you to use Windows Server 2012 R2 as the connectivity platform that brings your remote computers into the corporate network. We discussed DirectAccess and VPN in this article. We also saw how to configure DirectAccess and VPN, and how to secure DirectAccess using certificate authentication. Resources for Article: Further resources on this subject: Cross-premise Connectivity [article] Setting Up and Managing E-mails and Batch Processing [article] Upgrading from Previous Versions [article]

Packt
06 Feb 2015
22 min read

Event-driven Programming

In this article, Alan Thorn, the author of the book Mastering Unity Scripting, covers the following topics:

Events
Event management

(For more resources related to this topic, see here.)

The Update events for MonoBehaviour objects seem to offer a convenient place for executing code that should run regularly over time, spanning multiple frames, and possibly multiple scenes. When creating sustained behaviors over time, such as artificial intelligence for enemies or continuous motion, it may seem that there are almost no alternatives to filling an Update function with many if and switch statements, branching your code in different directions depending on what your objects need to do at the current time. But when the Update events are seen this way, as a default place to implement prolonged behaviors, it can lead to severe performance problems for larger and more complex games.

On deeper analysis, it's not difficult to see why this would be the case. Typically, games are full of so many behaviors, and there are so many things happening at once in any one scene, that implementing them all through the Update functions is simply unfeasible. Consider the enemy characters alone: they need to know when the player enters and leaves their line of sight, when their health is low, when their ammo has expired, when they're standing on harmful terrain, when they're taking damage, when they're moving or not, and lots more. On thinking initially about this range of behaviors, it seems that all of them require constant and continuous attention because enemies should always know, instantly, when changes in these properties occur as a result of the player input. That is, perhaps, the main reason why the Update function seems to be the most suitable place in these situations, but there are better alternatives, namely, event-driven programming. By seeing your game and your application in terms of events, you can make considerable savings in performance. This article then considers the issue of events and how to manage them game wide.

Events

Game worlds are fully deterministic systems; in Unity, the scene represents a shared 3D Cartesian space and timeline inside which finite GameObjects exist. Things only happen within this space when the game logic and code permit them to. For example, objects can only move when there is code somewhere that tells them to do so, and under specific conditions, such as when the player presses specific buttons on the keyboard. Notice from the example that behaviors are not simply random but are interconnected; objects move only when keyboard events occur. There is an important connection established between the actions, where one action entails another. These connections or linkages are referred to as events, each unique connection being a single event.

Events are not active but passive; they represent moments of opportunity but not action in themselves, such as a key press, a mouse click, an object entering a collider volume, the player being attacked, and so on. These are examples of events, and none of them say what the program should actually do, but only the kind of scenario that just happened. Event-driven programming starts with the recognition of events as a general concept and comes to see almost every circumstance in a game as an instantiation of an event; that is, as an event situated in time, not just an event concept but a specific event that happens at a specific time.

Understanding game events like these is helpful because all actions in a game can then be seen as direct responses to events as and when they happen. Specifically, events are connected to responses; an event happens and triggers a response. Further, the response can go on to become an event that triggers further responses and so on. In other words, the game world is a complete, integrated system of events and responses.
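The chain of events and responses described above can be illustrated with a small sketch in plain C#. It deliberately uses no Unity types so that it can run as a console program, and every name in it (EventChainDemo, KeyPressed, ObjectMoved, Log) is a hypothetical one chosen for this illustration, not part of Unity or of the book's code. It shows one event triggering a response, and that response in turn raising a second event:

```csharp
using System;
using System.Collections.Generic;

// Plain C# sketch (no Unity types); all names are illustrative assumptions.
public static class EventChainDemo
{
    // Events are passive: they only announce that something happened.
    public static event Action KeyPressed;
    public static event Action ObjectMoved;

    // Records the responses in the order they ran.
    public static readonly List<string> Log = new List<string>();

    public static void Run()
    {
        Log.Clear();

        // Response to the key press: move the object, which is itself a new event.
        Action onKey = () => { Log.Add("move object"); ObjectMoved?.Invoke(); };
        // Response to the movement: play a sound.
        Action onMove = () => Log.Add("play footstep sound");

        KeyPressed += onKey;
        ObjectMoved += onMove;

        // Nothing runs until an event actually occurs.
        KeyPressed?.Invoke();

        // Unsubscribe so repeated calls to Run do not stack handlers.
        KeyPressed -= onKey;
        ObjectMoved -= onMove;
    }

    public static void Main()
    {
        Run();
        Console.WriteLine(string.Join(" -> ", Log));
    }
}
```

No code runs between events in this sketch; the responses execute only when something is raised, which is the property the rest of the article builds on.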
Once the world is seen this way, the question then arises as to how it can help us improve performance over simply relying on the Update functions to move behaviors forward on every frame. The answer is simply to find ways to reduce the frequency of events. Now, stated in this way, it may sound like a crude strategy, but it's important. To illustrate, let's consider the example of an enemy character firing a weapon at the player during combat. Throughout the gameplay, the enemy will need to keep track of many properties. Firstly, their health, because when it runs low the enemy should seek out medical kits and aids to restore their health again. Secondly, their ammo, because when it runs low the enemy should seek to collect more. Finally, the enemy will need to make reasoned judgments about when to fire at the player, such as only when they have a clear line of sight. Now, by simply thinking about this scenario, we've already identified some connections between actions that might be identified as events. But before taking this consideration further, let's see how we might implement this behavior using an Update function, as shown in the following code sample 4-1. Then, we'll look at how events can help us improve on that implementation:

    // Update is called once per frame
    void Update ()
    {
        //Check enemy health
        //Are we dead?
        if(Health <= 0)
        {
            //Then perform die behaviour
            Die();
            return;
        }

        //Check for health low
        if(Health <= 20)
        {
            //Health is low, so find first-aid
            RunAndFindHealthRestore();
            return;
        }

        //Check ammo
        //Have we run out of ammo?
        if(Ammo <= 0)
        {
            //Then find more
            SearchMore();
            return;
        }

        //Health and ammo are fine. Can we see player? If so, shoot
        if(HaveLineOfSight)
        {
            FireAtPlayer();
        }
    }

The preceding code sample 4-1 shows a heavy Update function filled with lots of condition checking and responses.
In essence, the Update function attempts to merge event handling and response behaviors into one, and the result is an unnecessarily expensive process. If we think about the event connections between these different processes (the health and ammo check), we see how the code could be refactored more neatly. For example, ammo only changes on two occasions: when a weapon is fired or when new ammo is collected. Similarly, health only changes on two occasions: when an enemy is successfully attacked by the player or when an enemy collects a first-aid kit. In the first case, there is a reduction, and in the latter case, an increase. Since these are the only times when the properties change (the events), these are the only points where their values need to be validated. See the following code sample 4-2 for a refactored enemy, which includes C# properties and a much reduced Update function:

    using UnityEngine;
    using System.Collections;

    public class EnemyObject : MonoBehaviour
    {
        //-------------------------------------------------------
        //C# accessors for private variables
        public int Health
        {
            get{return _health;}
            set
            {
                //Clamp health between 0-100
                _health = Mathf.Clamp(value, 0, 100);

                //Check if dead
                if(_health <= 0)
                {
                    OnDead();
                    return;
                }

                //Check health and raise event if required
                if(_health <= 20)
                {
                    OnHealthLow();
                    return;
                }
            }
        }
        //-------------------------------------------------------
        public int Ammo
        {
            get{return _ammo;}
            set
            {
                //Clamp ammo between 0-50
                _ammo = Mathf.Clamp(value,0,50);

                //Check if ammo empty
                if(_ammo <= 0)
                {
                    //Call expired event
                    OnAmmoExpired();
                    return;
                }
            }
        }
        //-------------------------------------------------------
        //Internal variables for health and ammo
        private int _health = 100;
        private int _ammo = 50;
        //-------------------------------------------------------
        // Update is called once per frame
        void Update ()
        {
        }
        //-------------------------------------------------------
        //This event is called when health is low
        void OnHealthLow()
        {
            //Handle event response here
        }
        //-------------------------------------------------------
        //This event is called when enemy is dead
        void OnDead()
        {
            //Handle event response here
        }
        //-------------------------------------------------------
        //Ammo run out event
        void OnAmmoExpired()
        {
            //Handle event response here
        }
        //-------------------------------------------------------
    }

The enemy class in the code sample 4-2 has been refactored to an event-driven design, where properties such as Ammo and Health are validated not inside the Update function but on assignment. From here, events are raised wherever appropriate based on the newly assigned values. By adopting an event-driven design, we make our code both faster and cleaner; we drop the excess baggage and value checks found in the Update function of the code sample 4-1, and instead we only allow value-specific events to drive our code, knowing they'll be invoked only at the relevant times.

Event management

Event-driven programming can make our lives a lot easier. But no sooner do we accept events into the design than we come across a string of new problems that require a thorough resolution. Specifically, we saw in the code sample 4-2 how C# properties for health and ammo are used to validate and detect relevant changes and then to raise events (such as OnDead) where appropriate.
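To make the validate-on-assignment idea concrete, here is a minimal sketch in plain C# that can be run outside Unity. It is an assumption-laden stand-in for the EnemyObject class, not the book's code: the Enemy class, the DeadCount and LowHealthCount counters, and the use of Math.Max/Math.Min in place of Mathf.Clamp are all choices made for this illustration. The counters simply record how often each handler fires:

```csharp
using System;

// Plain C# stand-in for the Unity EnemyObject of code sample 4-2.
// Enemy, DeadCount and LowHealthCount are hypothetical names for illustration.
public class Enemy
{
    private int _health = 100;

    // How many times each handler has been invoked.
    public int DeadCount { get; private set; }
    public int LowHealthCount { get; private set; }

    public int Health
    {
        get { return _health; }
        set
        {
            // Clamp to 0-100 (Math.Max/Math.Min stand in for Mathf.Clamp here).
            _health = Math.Max(0, Math.Min(100, value));

            // Validation happens only here, at the moment of assignment.
            if (_health <= 0) { OnDead(); return; }
            if (_health <= 20) { OnHealthLow(); }
        }
    }

    private void OnDead() { DeadCount++; }
    private void OnHealthLow() { LowHealthCount++; }
}

public static class Program
{
    public static void Main()
    {
        var enemy = new Enemy();
        enemy.Health = 50;   // above both thresholds: no handler runs
        enemy.Health = 15;   // low health: OnHealthLow runs once
        enemy.Health = -30;  // clamped to 0: OnDead runs once
        Console.WriteLine($"{enemy.Health} {enemy.LowHealthCount} {enemy.DeadCount}");
    }
}
```

However many frames pass between the three assignments, the handlers run exactly as often as the values cross a threshold, and no checks run in between.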
This works fine in principle, at least when the enemy must be notified about events that happen to itself. However, what if an enemy needed to know about the death of another enemy or needed to know when a specified number of other enemies had been killed? Now, of course, thinking about this specific case, we could go back to the enemy class in the code sample 4-2 and amend it to call an OnDead event not just for the current instance but for all other enemies using functions such as SendMessage. But this doesn't really solve our problem in the general sense. In fact, let's state the ideal case straight away: we want every object to optionally listen for every type of event and to be notified about them as and when they happen, just as easily as if the event had happened to them. So the question that we face now is how to code an optimized system to allow easy event management like this. In short, we need an EventManager class that allows objects to listen to specific events. This system relies on three central concepts, as follows:

Event Listener: A listener refers to any object that wants to be notified about an event when it happens, even its own events. In practice, almost every object will be a listener for at least one event. An enemy, for example, may want notifications about low health and low ammo among others. In this case, it's a listener for at least two separate events. Thus, whenever an object expects to be told when an event happens, it becomes a listener.

Event Poster: In contrast to listeners, when an object detects that an event has occurred, it must announce or post a public notification about it that allows all other listeners to be notified. In the code sample 4-2, the enemy class detects the Ammo and Health events using properties and then calls the internal events, if required. But to be a true poster in this sense, we require that the object must raise events at a global level.

Event Manager: Finally, there's an overarching singleton Event Manager object that persists across levels and is globally accessible. This object effectively links listeners to posters. It accepts notifications of events sent by posters and then immediately dispatches the notifications to all appropriate listeners in the form of events.

Starting event management with interfaces

The first or original entity in the event handling system is the listener, the thing that should be notified about specific events as and when they happen. Potentially, a listener could be any kind of object or any kind of class; it simply expects to be notified about specific events. In short, the listener will need to register itself with the Event Manager as a listener for one or more specific events. Then, when the event actually occurs, the listener should be notified directly by a function call. So, technically, the listener raises a type-specificity issue for the Event Manager: how should the manager invoke an event on the listener if the listener could potentially be an object of any type? Of course, this issue can be worked around, as we've seen, using either SendMessage or BroadcastMessage. Indeed, there are event handling systems freely available online, such as NotificationCenter, that rely on these functions. However, we'll avoid them and use interfaces and polymorphism instead, as both SendMessage and BroadcastMessage rely heavily on reflection. Specifically, we'll create an interface from which all listener objects derive.

More information on the freely available NotificationCenter (C# version) is available from the Unity wiki at http://wiki.unity3d.com/index.php?title=CSharpNotificationCenter.

In C#, an interface is like a hollow abstract base class. Like a class, an interface brings together a collection of methods and functions into a single template-like unit.
But, unlike a class, an interface only allows you to define function prototypes such as the name, return type, and arguments for a function. It doesn't let you define a function body. The reason is that an interface simply defines the total set of functions that a derived class will have. The derived class may implement the functions however necessary, and the interface simply exists so that other objects can invoke the functions via polymorphism without knowing the specific type of each derived class. This makes interfaces a suitable candidate to create a Listener object. By defining a Listener interface that listener objects implement, every object has the ability to be a listener for events. The following code sample 4-3 demonstrates a sample Listener interface:

01 using UnityEngine;
02 using System.Collections;
03 //-----------------------------------------------------------
04 //Enum defining all possible game events
05 //More events should be added to the list
06 public enum EVENT_TYPE {GAME_INIT,
07                         GAME_END,
08                         AMMO_EMPTY,
09                         HEALTH_CHANGE,
10                         DEAD};
11 //-----------------------------------------------------------
12 //Listener interface to be implemented on Listener classes
13 public interface IListener
14 {
15     //Notification function invoked when events happen
16     void OnEvent(EVENT_TYPE Event_Type, Component Sender, object Param = null);
17 }
18 //-----------------------------------------------------------

The following are the comments for the code sample 4-3:

Lines 06-10: This enumeration should define a complete list of all possible game events that could be raised. The sample code lists only five game events: GAME_INIT, GAME_END, AMMO_EMPTY, HEALTH_CHANGE, and DEAD. Your game will presumably have many more. You don't actually need to use enumerations for encoding events; you could just use integers. But I've used enumerations to improve event readability in code.

Lines 13-17: The Listener interface is defined as IListener using the C# interfaces. It supports just one event, namely OnEvent. This function will be implemented by all derived classes and will be invoked by the manager whenever an event occurs for which the listener is registered. Notice that OnEvent is simply a function prototype; it has no body.

More information on C# interfaces can be found at http://msdn.microsoft.com/en-us/library/ms173156.aspx.

Using the IListener interface, we now have the ability to make a listener from any object using only class inheritance; that is, any object can now declare itself as a listener and potentially receive events. For example, a new MonoBehaviour component can be turned into a listener with the following code sample 4-4. This code derives from one base class (MonoBehaviour) and implements one interface (IListener); a C# class can have only one base class but can implement any number of interfaces. More information on multiple inheritance in C# can be found at http://www.dotnetfunda.com/articles/show/1185/multiple-inheritance-in-csharp:

    using UnityEngine;
    using System.Collections;

    public class MyCustomListener : MonoBehaviour, IListener
    {
        // Use this for initialization
        void Start () {}

        // Update is called once per frame
        void Update () {}
        //---------------------------------------
        //Implement OnEvent function to receive Events
        public void OnEvent(EVENT_TYPE Event_Type, Component Sender, object Param = null)
        {
        }
        //---------------------------------------
    }

Creating an EventManager

Any object can now be turned into a listener, as we've seen. But still the listeners must register themselves with a manager object of some kind. Thus, it is the duty of the manager to call the events on the listeners when the events actually happen. Let's now turn to the manager itself and its implementation details. The manager class will be called EventManager, as shown in the following code sample 4-5. This class, being a persistent singleton object, should be attached to an empty GameObject in the scene where it will be directly accessible to every other object through a static instance property. More on this class and its usage is considered in the subsequent comments:

001 using UnityEngine;
002 using System.Collections;
003 using System.Collections.Generic;
004 //-----------------------------------
005 //Singleton EventManager to send events to listeners
006 //Works with IListener implementations
007 public class EventManager : MonoBehaviour
008 {
009     #region C# properties
010     //-----------------------------------
011     //Public access to instance
012     public static EventManager Instance
013     {
014         get{return instance;}
015         set{}
016     }
017     #endregion
018
019     #region variables
020     // Notifications Manager instance (singleton design pattern)
021     private static EventManager instance = null;
022
023     //Array of listeners (all objects registered for events)
024     private Dictionary<EVENT_TYPE, List<IListener>> Listeners = new Dictionary<EVENT_TYPE, List<IListener>>();
025     #endregion
026     //-----------------------------------------------------------
027     #region methods
028     //Called at start-up to initialize
029     void Awake()
030     {
031         //If no instance exists, then assign this instance
032         if(instance == null)
033         {
034             instance = this;
035             DontDestroyOnLoad(gameObject);
036         }
037         else
038             DestroyImmediate(this);
039     }
040     //-----------------------------------------------------------
041     /// <summary>
042     /// Function to add listener to array of listeners
043     /// </summary>
044     /// <param name="Event_Type">Event to Listen for</param>
045     /// <param name="Listener">Object to listen for event</param>
046     public void AddListener(EVENT_TYPE Event_Type, IListener Listener)
047     {
048         //List of listeners for this event
049         List<IListener> ListenList = null;
050
051         // Check existing event type key. If exists, add to list
052         if(Listeners.TryGetValue(Event_Type, out ListenList))
053         {
054             //List exists, so add new item
055             ListenList.Add(Listener);
056             return;
057         }
058
059         //Otherwise create new list as dictionary key
060         ListenList = new List<IListener>();
061         ListenList.Add(Listener);
062         Listeners.Add(Event_Type, ListenList);
063     }
064     //-----------------------------------------------------------
065     /// <summary>
066     /// Function to post event to listeners
067     /// </summary>
068     /// <param name="Event_Type">Event to invoke</param>
069     /// <param name="Sender">Object invoking event</param>
070     /// <param name="Param">Optional argument</param>
071     public void PostNotification(EVENT_TYPE Event_Type, Component Sender, object Param = null)
072     {
073         //Notify all listeners of an event
074
075         //List of listeners for this event only
076         List<IListener> ListenList = null;
077
078         //If no event exists, then exit
079         if(!Listeners.TryGetValue(Event_Type, out ListenList))
080             return;
081
082         //Entry exists. Now notify appropriate listeners
083         for(int i=0; i<ListenList.Count; i++)
084         {
085             if(!ListenList[i].Equals(null))
086                 ListenList[i].OnEvent(Event_Type, Sender, Param);
087         }
088     }
089     //-----------------------------------------------------------
090     //Remove event from dictionary, including all listeners
091     public void RemoveEvent(EVENT_TYPE Event_Type)
092     {
093         //Remove entry from dictionary
094         Listeners.Remove(Event_Type);
095     }
096     //-----------------------------------------------------------
097     //Remove all redundant entries from the Dictionary
098     public void RemoveRedundancies()
099     {
100         //Create new dictionary
101         Dictionary<EVENT_TYPE, List<IListener>> TmpListeners = new Dictionary<EVENT_TYPE, List<IListener>>();
102
103         //Cycle through all dictionary entries
104         foreach(KeyValuePair<EVENT_TYPE, List<IListener>> Item in Listeners)
105         {
106             //Cycle all listeners, remove null objects
107             for(int i = Item.Value.Count-1; i>=0; i--)
108             {
109                 //If null, then remove item
110                 if(Item.Value[i].Equals(null))
111                     Item.Value.RemoveAt(i);
112             }
113
114             //If items remain in list, then add to tmp dictionary
115             if(Item.Value.Count > 0)
116                 TmpListeners.Add (Item.Key, Item.Value);
117         }
118
119         //Replace listeners object with new dictionary
120         Listeners = TmpListeners;
121     }
122     //-----------------------------------------------------------
123     //Called on scene change. Clean up dictionary
124     void OnLevelWasLoaded()
125     {
126         RemoveRedundancies();
127     }
128     //-----------------------------------------------------------
129     #endregion
130 }

More information on the OnLevelWasLoaded event can be found at http://docs.unity3d.com/ScriptReference/MonoBehaviour.OnLevelWasLoaded.html.

The following are the comments for the code sample 4-5:

Line 003: Notice the addition of the System.Collections.Generic namespace giving us access to additional Mono classes, including the Dictionary class. This class will be used throughout the EventManager class. In short, the Dictionary class is a collection that stores values against keys (key-value pairs), so that a value can be looked up quickly by its key. More information on the Dictionary class can be found at http://msdn.microsoft.com/en-us/library/xfhwa508%28v=vs.110%29.aspx.

Line 007: The EventManager class is derived from MonoBehaviour and should be attached to an empty GameObject in the scene where it will exist as a persistent singleton.

Line 024: A private member variable Listeners is declared using a Dictionary class. This structure maintains a hash table of key-value pairs, which can be looked up and searched like a database. The key-value pairing for the EventManager class takes the form of EVENT_TYPE and List<IListener>. In short, this means that a list of event types can be stored (such as HEALTH_CHANGE), and for each type there could be none, one, or more listeners that should be notified when the event occurs. In effect, the Listeners member is the primary data structure on which the EventManager relies to maintain who is listening for what.

Lines 029-039: The Awake function is responsible for the singleton functionality, that is, to make the EventManager class into a singleton object that persists across scenes.
Lines 046-063: The AddListener method of EventManager should be called by a Listener object once for each event for which it should listen. The method accepts two arguments: the event to listen for (Event_Type) and a reference to the listener object itself (derived from IListener), which should be notified if and when the event happens. The AddListener function is responsible for accessing the Listeners dictionary and generating a new key-value pair to store the connection between the event and the listener.

Lines 071-088: The PostNotification function can be called by any object, whether a listener or not, whenever an event is detected. When called, the EventManager looks up the entry for the current event in the dictionary, cycles through all listeners registered for it, and notifies them by invoking the OnEvent method through the IListener interface.

Lines 098-127: The final methods for the EventManager class are responsible for maintaining data integrity of the Listeners structure when a scene change occurs and the EventManager class persists. Although the EventManager class persists across scenes, the listener objects themselves in the Listeners variable may not do so. They may get destroyed on scene changes. If so, scene changes will invalidate some listeners, leaving the EventManager with invalid entries. Thus, the RemoveRedundancies method is called to find and eliminate all invalid entries. The OnLevelWasLoaded event is invoked automatically by Unity whenever a scene change occurs.

#region and #endregion

The two preprocessor directives #region and #endregion (in combination with the code folding feature) can be highly useful for improving the readability of your code and also for improving the speed with which you can navigate the source file. They add organization and structure to your source code without affecting its validity or execution. Effectively, #region marks the top of a code block and #endregion marks the end. Once a region is marked, it becomes foldable, that is, it becomes collapsible using the MonoDevelop code editor, provided the code folding feature is enabled. Collapsing a region of code is useful for hiding it from view, which allows you to concentrate on reading other areas relevant to your needs, as shown in the following screenshot:

Enabling code folding in MonoDevelop

To enable code folding in MonoDevelop, select Options from the Tools menu. This displays the Options window. From here, choose the General tab in the Text Editor option and click on Enable code folding as well as Fold #regions by default.

Using EventManager

Now, let's see how to put the EventManager class to work in a practical context from the perspective of listeners and posters in a single scene. First, to listen for an event (any event) a listener must register itself with the EventManager singleton instance. Typically, this will happen once and at the earliest opportunity, such as the Start function. Do not use the Awake function; this is reserved for an object's internal initialization as opposed to the functionality that reaches out beyond the current object to the states and setup of others.
See the following code sample 4-6 and notice that it relies on the Instance static property to retrieve a reference to the active EventManager singleton:

    //Called at start-up
    void Start()
    {
        //Add myself as listener for health change events
        EventManager.Instance.AddListener(EVENT_TYPE.HEALTH_CHANGE, this);
    }

Having registered listeners for one or more events, objects can then post notifications to EventManager as events are detected, as shown in the following code sample 4-7:

    public int Health
    {
        get{return _health;}
        set
        {
            //Clamp health between 0-100
            _health = Mathf.Clamp(value, 0, 100);

            //Post notification - health has been changed
            EventManager.Instance.PostNotification(EVENT_TYPE.HEALTH_CHANGE, this, _health);
        }
    }

Finally, after a notification is posted for an event, all the associated listeners are updated automatically through EventManager. Specifically, EventManager will call the OnEvent function of each listener, giving listeners the opportunity to parse event data and respond where needed, as shown in the following code sample 4-8:

    //Called when events happen
    public void OnEvent(EVENT_TYPE Event_Type, Component Sender, object Param = null)
    {
        //Detect event type
        switch(Event_Type)
        {
            case EVENT_TYPE.HEALTH_CHANGE:
                OnHealthChange(Sender, (int)Param);
                break;
        }
    }

Summary

This article focused on the manifold benefits available for your applications by adopting an event-driven framework consistently through the EventManager class. In implementing such a manager, we were able to rely on either interfaces or delegates, and either method is powerful and extensible. Specifically, we saw how it's easy to add more and more functionality into an Update function but how doing this can lead to severe performance issues. It is better to analyze the connections between your functionality and refactor it into an event-driven framework. Essentially, events are the raw material of event-driven systems. They represent a necessary connection between one action (the cause) and another (the response). To manage events, we created the EventManager class, an integrated class or system that links posters to listeners. It receives notifications from posters about events as and when they happen and then immediately dispatches a function call to all listeners for the event.

Resources for Article:

Further resources on this subject:
Customizing skin with GUISkin [Article]
2D Twin-stick Shooter [Article]
Components in Unity [Article]
Packt
06 Feb 2015
21 min read

Extending ElasticSearch with Scripting

In this article by Alberto Paro, the author of ElasticSearch Cookbook Second Edition, we will cover the following recipes:
Installing additional script plugins
Managing scripts
Sorting data using scripts
Computing return fields with scripting
Filtering a search via scripting
(For more resources related to this topic, see here.)

Introduction
ElasticSearch has a powerful way of extending its capabilities with custom scripts, which can be written in several programming languages. The most common ones are Groovy, MVEL, JavaScript, and Python. In this article, we will see how it's possible to create custom scoring algorithms, specially processed return fields, custom sorting, and complex update operations on records. The scripting concept of ElasticSearch can be seen as an advanced stored procedures system in the NoSQL world; so, for advanced usage of ElasticSearch, it is very important to master it.

Installing additional script plugins
ElasticSearch provides native scripting (Java code compiled into a JAR) and Groovy, but a lot of interesting languages are also available, such as JavaScript and Python. In older ElasticSearch releases, prior to version 1.4, the official scripting language was MVEL. Because MVEL was not well maintained by its developers and could not be sandboxed to prevent security issues, it was replaced with Groovy. Groovy scripting is now provided by default in ElasticSearch. The other scripting languages can be installed as plugins.

Getting ready
You will need a working ElasticSearch cluster.

How to do it...
In order to install JavaScript language support for ElasticSearch (1.3.x), perform the following steps:
From the command line, simply enter the following command:
    bin/plugin --install elasticsearch/elasticsearch-lang-javascript/2.3.0
This will print the following result:
    -> Installing elasticsearch/elasticsearch-lang-javascript/2.3.0...
Trying http://download.elasticsearch.org/elasticsearch/elasticsearch-lang-javascript/ elasticsearch-lang-javascript-2.3.0.zip... Downloading ....DONE Installed lang-javascript If the installation is successful, the output will end with Installed; otherwise, an error is returned. To install Python language support for ElasticSearch, just enter the following command: bin/plugin -install elasticsearch/elasticsearch-lang-python/2.3.0 The version number depends on the ElasticSearch version. Take a look at the plugin's web page to choose the correct version. How it works... Language plugins allow you to extend the number of supported languages to be used in scripting. During the ElasticSearch startup, an internal ElasticSearch service called PluginService loads all the installed language plugins. In order to install or upgrade a plugin, you need to restart the node. The ElasticSearch community provides common scripting languages (a list of the supported scripting languages is available on the ElasticSearch site plugin page at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-plugins.html), and others are available in GitHub repositories (a simple search on GitHub allows you to find them). The following are the most commonly used languages for scripting: Groovy (http://groovy.codehaus.org/): This language is embedded in ElasticSearch by default. It is a simple language that provides scripting functionalities. This is one of the fastest available language extensions. Groovy is a dynamic, object-oriented programming language with features similar to those of Python, Ruby, Perl, and Smalltalk. It also provides support to write a functional code. JavaScript (https://github.com/elasticsearch/elasticsearch-lang-javascript): This is available as an external plugin. The JavaScript implementation is based on Java Rhino (https://developer.mozilla.org/en-US/docs/Rhino) and is really fast. 
Python (https://github.com/elasticsearch/elasticsearch-lang-python): This is available as an external plugin, based on Jython (http://jython.org). It allows Python to be used as a script engine. Considering several benchmark results, it's slower than other languages. There's more... Groovy is preferred if the script is not too complex; otherwise, a native plugin provides a better environment to implement complex logic and data management. The performance of every language is different; the fastest one is the native Java. In the case of dynamic scripting languages, Groovy is faster, as compared to JavaScript and Python. In order to access document properties in Groovy scripts, the same approach will work as in other scripting languages: doc.score: This stores the document's score. doc['field_name'].value: This extracts the value of the field_name field from the document. If the value is an array or if you want to extract the value as an array, you can use doc['field_name'].values. doc['field_name'].empty: This returns true if the field_name field has no value in the document. doc['field_name'].multivalue: This returns true if the field_name field contains multiple values. If the field contains a geopoint value, additional methods are available, as follows: doc['field_name'].lat: This returns the latitude of a geopoint. If you need the value as an array, you can use the doc['field_name'].lats method. doc['field_name'].lon: This returns the longitude of a geopoint. If you need the value as an array, you can use the doc['field_name'].lons method. doc['field_name'].distance(lat,lon): This returns the plane distance, in miles, from a latitude/longitude point. If you need to calculate the distance in kilometers, you should use the doc['field_name'].distanceInKm(lat,lon) method. doc['field_name'].arcDistance(lat,lon): This returns the arc distance, in miles, from a latitude/longitude point. 
If you need to calculate the distance in kilometers, you should use the doc['field_name'].arcDistanceInKm(lat,lon) method.
doc['field_name'].geohashDistance(geohash): This returns the distance, in miles, from a geohash value. If you need to calculate the same distance in kilometers, you should use the doc['field_name'].geohashDistanceInKm(geohash) method.
By using these helper methods, it is possible to create advanced scripts that boost a document by distance, which can be very handy when developing geolocation-centered applications.

Managing scripts
Depending on your scripting usage, there are several ways to customize ElasticSearch to use your script extensions. In this recipe, we will see how to provide scripts to ElasticSearch via files, indexes, or inline.

Getting ready
You will need a working ElasticSearch cluster populated with the populate script (chapter_06/populate_aggregations.sh), available at https://github.com/aparo/elasticsearch-cookbook-second-edition.

How to do it...
To manage scripting, perform the following steps:
Dynamic scripting is disabled by default for security reasons; we need to activate it in order to use dynamic scripting languages such as JavaScript or Python. To do this, we need to turn off the disable flag (script.disable_dynamic: false) in the ElasticSearch configuration file (config/elasticsearch.yml) and restart the cluster.
To increase security, ElasticSearch does not allow you to specify scripts for non-sandbox languages. Scripts can be placed in the scripts directory inside the configuration directory. To provide a script in a file, we'll put a my_script.groovy script in the config/scripts location with the following code content:
    doc["price"].value * factor
If the dynamic script is enabled (as done in the first step), ElasticSearch allows you to store the scripts in a special index, .scripts.
To put my_script in the index, execute the following command in the command terminal (the double quotes inside the script are escaped so that the request body remains valid JSON):
    curl -XPOST localhost:9200/_scripts/groovy/my_script -d '{ "script":"doc[\"price\"].value * factor" }'
The script can be used by simply referencing it in the script_id field; use the following command:
    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?&pretty=true&size=3' -d '{
      "query": { "match_all": {} },
      "sort": {
        "_script" : {
          "script_id" : "my_script",
          "lang" : "groovy",
          "type" : "number",
          "ignore_unmapped" : true,
          "params" : { "factor" : 1.1 },
          "order" : "asc"
        }
      }
    }'

How it works...
ElasticSearch allows you to load your script in different ways; each of these methods has its pros and cons. The most secure way to load or import scripts is to provide them as files in the config/scripts directory. This directory is continuously scanned for new files (by default, every 60 seconds). The scripting language is automatically detected by the file extension, and the script name depends on the filename. If the file is put in subdirectories, the directory path becomes part of the filename; for example, if it is config/scripts/mysub1/mysub2/my_script.groovy, the script name will be mysub1_mysub2_my_script. If the script is provided via a filesystem, it can be referenced in the code via the "script": "script_name" parameter.
Scripts can also be available in the special .scripts index. These are the REST end points:
To retrieve a script, use the following code:
    GET http://<server>/_scripts/<language>/<id>
To store a script, use the following code:
    PUT http://<server>/_scripts/<language>/<id>
To delete a script, use the following code:
    DELETE http://<server>/_scripts/<language>/<id>
The indexed script can be referenced in the code via the "script_id": "id_of_the_script" parameter.
The recipes that follow will use inline scripting because it's easier to use during the development and testing phases.
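The file-naming rule described above (subdirectories joined with underscores, extension used to detect the language) can be illustrated with a short sketch. This is a minimal illustration in Python using only the standard library; the function name script_name_and_extension and the default scripts_dir value are hypothetical helpers for this example, not part of ElasticSearch.

```python
from pathlib import PurePosixPath


def script_name_and_extension(path, scripts_dir="config/scripts"):
    """Derive (script name, file extension) for a script file.

    Follows the rule stated in the text: the path below the scripts
    directory becomes the name, with directory separators replaced by
    underscores, and the file extension identifies the language.
    """
    relative = PurePosixPath(path).relative_to(scripts_dir)
    extension = relative.suffix.lstrip(".")
    # Drop the extension, then join every path component with "_".
    name = "_".join(relative.with_suffix("").parts)
    return name, extension


# The example from the text:
print(script_name_and_extension("config/scripts/mysub1/mysub2/my_script.groovy"))
# ('mysub1_mysub2_my_script', 'groovy')
```

Note that the function returns the raw file extension; how ElasticSearch maps an extension to a language name is not modelled here.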
Generally, a good practice is to develop using inline dynamic scripting in a request, because it's faster to prototype. Once the script is ready and no changes are needed, it can be stored in the index since it is simpler to call and manage. In production, a best practice is to disable dynamic scripting and store the script on the disk (generally, dumping the indexed script to disk).

See also
The scripting page on the ElasticSearch website at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html

Sorting data using script
ElasticSearch provides scripting support for the sorting functionality. In real world applications, there is often a need to replace the default sort by match score with an algorithm that depends on the context and some external variables. Some common scenarios are given as follows:
Sorting places near a point
Sorting by most-read articles
Sorting items by custom user logic
Sorting items by revenue

Getting ready
You will need a working ElasticSearch cluster and an index populated with the script, which is available at https://github.com/aparo/elasticsearch-cookbook-second-edition.

How to do it...
In order to sort using scripting, perform the following steps:
If you want to order your documents by the price field multiplied by a factor parameter (that is, sales tax), the search will be as shown in the following code:
    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?&pretty=true&size=3' -d '{
      "query": { "match_all": {} },
      "sort": {
        "_script" : {
          "script" : "doc[\"price\"].value * factor",
          "lang" : "groovy",
          "type" : "number",
          "ignore_unmapped" : true,
          "params" : { "factor" : 1.1 },
          "order" : "asc"
        }
      }
    }'
In this case, we have used a match_all query and a sort script.
If everything is correct, the result returned by ElasticSearch should be as shown in the following code: { "took" : 7, "timed_out" : false, "_shards" : {    "total" : 5,    "successful" : 5,    "failed" : 0 }, "hits" : {    "total" : 1000,    "max_score" : null,    "hits" : [ {      "_index" : "test-index",      "_type" : "test-type",      "_id" : "161",      "_score" : null, "_source" : … truncated …,      "sort" : [ 0.0278578661440021 ]    }, {      "_index" : "test-index",      "_type" : "test-type",      "_id" : "634",      "_score" : null, "_source" : … truncated …,     "sort" : [ 0.08131364254827411 ]    }, {      "_index" : "test-index",      "_type" : "test-type",      "_id" : "465",      "_score" : null, "_source" : … truncated …,      "sort" : [ 0.1094966959069832 ]    } ] } } How it works... The sort scripting allows you to define several parameters, as follows: order (default "asc") ("asc" or "desc"): This determines whether the order must be ascending or descending. script: This contains the code to be executed. type: This defines the type to convert the value. params (optional, a JSON object): This defines the parameters that need to be passed. lang (by default, groovy): This defines the scripting language to be used. ignore_unmapped (optional): This ignores unmapped fields in a sort. This flag allows you to avoid errors due to missing fields in shards. Extending the sort with scripting allows the use of a broader approach to score your hits. ElasticSearch scripting permits the use of every code that you want. You can create custom complex algorithms to score your documents. There's more... 
Groovy provides a lot of built-in functions (mainly taken from Java's Math class) that can be used in scripts, as shown in the following table:

    Function                 Description
    time()                   The current time in milliseconds
    sin(a)                   Returns the trigonometric sine of an angle
    cos(a)                   Returns the trigonometric cosine of an angle
    tan(a)                   Returns the trigonometric tangent of an angle
    asin(a)                  Returns the arc sine of a value
    acos(a)                  Returns the arc cosine of a value
    atan(a)                  Returns the arc tangent of a value
    toRadians(angdeg)        Converts an angle measured in degrees to an approximately equivalent angle measured in radians
    toDegrees(angrad)        Converts an angle measured in radians to an approximately equivalent angle measured in degrees
    exp(a)                   Returns Euler's number raised to the power of a value
    log(a)                   Returns the natural logarithm (base e) of a value
    log10(a)                 Returns the base 10 logarithm of a value
    sqrt(a)                  Returns the correctly rounded positive square root of a value
    cbrt(a)                  Returns the cube root of a double value
    IEEEremainder(f1, f2)    Computes the remainder operation on two arguments, as prescribed by the IEEE 754 standard
    ceil(a)                  Returns the smallest (closest to negative infinity) value that is greater than or equal to the argument and is equal to a mathematical integer
    floor(a)                 Returns the largest (closest to positive infinity) value that is less than or equal to the argument and is equal to a mathematical integer
    rint(a)                  Returns the value that is closest in value to the argument and is equal to a mathematical integer
    atan2(y, x)              Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta)
    pow(a, b)                Returns the value of the first argument raised to the power of the second argument
    round(a)                 Returns the closest integer to the argument
    random()                 Returns a random double value
    abs(a)                   Returns the absolute value of a value
    max(a, b)                Returns the greater of the two values
    min(a, b)                Returns the smaller of the two values
    ulp(d)                   Returns the size of the unit in the last place of the argument
    signum(d)                Returns the signum function of the argument
    sinh(x)                  Returns the hyperbolic sine of a value
    cosh(x)                  Returns the hyperbolic cosine of a value
    tanh(x)                  Returns the hyperbolic tangent of a value
    hypot(x,y)               Returns sqrt(x^2+y^2) without an intermediate overflow or underflow

If you want to retrieve records in a random order, you can use a script with a random method, as shown in the following code:
    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?&pretty=true&size=3' -d '{
      "query": { "match_all": {} },
      "sort": {
        "_script" : {
          "script" : "Math.random()",
          "lang" : "groovy",
          "type" : "number",
          "params" : {}
        }
      }
    }'
In this example, for every hit, the new sort value is computed by executing the Math.random() scripting function.

See also
The official ElasticSearch documentation at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html

Computing return fields with scripting
ElasticSearch allows you to define complex expressions that can be used to return a new calculated field value. These special fields are called script_fields, and they can be expressed with a script in every available ElasticSearch scripting language.

Getting ready
You will need a working ElasticSearch cluster and an index populated with the script (chapter_06/populate_aggregations.sh), which is available at https://github.com/aparo/elasticsearch-cookbook-second-edition.

How to do it...
In order to compute return fields with scripting, perform the following steps:
Return the following script fields:
"my_calc_field": This concatenates the text of the "name" and "description" fields
"my_calc_field2": This multiplies the "price" value by the "discount" parameter
From the command line, execute the following code:
    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?&pretty=true&size=3' -d '{
      "query": { "match_all": {} },
      "script_fields" : {
        "my_calc_field" : {
          "script" : "doc[\"name\"].value + \" -- \" + doc[\"description\"].value"
        },
        "my_calc_field2" : {
          "script" : "doc[\"price\"].value * discount",
          "params" : { "discount" : 0.8 }
        }
      }
    }'
If everything works all right, this is how the result returned by ElasticSearch should be:
    {
      "took" : 4,
      "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
      "hits" : {
        "total" : 1000,
        "max_score" : 1.0,
        "hits" : [ {
          "_index" : "test-index",
          "_type" : "test-type",
          "_id" : "4",
          "_score" : 1.0,
          "fields" : {
            "my_calc_field" : "entropic -- accusantium",
            "my_calc_field2" : 5.480038242170081
          }
        }, {
          "_index" : "test-index",
          "_type" : "test-type",
          "_id" : "9",
          "_score" : 1.0,
          "fields" : {
            "my_calc_field" : "frankie -- accusantium",
            "my_calc_field2" : 34.79852410178313
          }
        }, {
          "_index" : "test-index",
          "_type" : "test-type",
          "_id" : "11",
          "_score" : 1.0,
          "fields" : {
            "my_calc_field" : "johansson -- accusamus",
            "my_calc_field2" : 11.824173084636591
          }
        } ]
      }
    }

How it works...
The scripting fields are similar to executing an SQL function on a field during a select operation. In ElasticSearch, after a search phase is executed and the hits to be returned are calculated, if some fields (standard or script) are defined, they are calculated and returned.
A script field can be written in any of the supported languages. It is evaluated against each returned document, and any parameters defined for the script (such as discount in the example) are passed to the script function. The script function is a code snippet; it can contain everything that the language allows you to write, but it must evaluate to a value (or a list of values).

See also
The Installing additional script plugins recipe in this article to install additional languages for scripting
The Sorting data using script recipe for a reference of the extra built-in functions in Groovy scripts

Filtering a search via scripting
ElasticSearch scripting allows you to extend the traditional filter with custom scripts. Using scripting to create a custom filter is a convenient way to write scripting rules that are not provided by Lucene or ElasticSearch, and to implement business logic that is not available in the query DSL.

Getting ready
You will need a working ElasticSearch cluster and an index populated with the (chapter_06/populate_aggregations.sh) script, which is available at https://github.com/aparo/elasticsearch-cookbook-second-edition.

How to do it...
In order to filter a search using a script, perform the following steps:
Write a search with a filter that keeps only the documents whose age value is greater than the parameter value:
    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?&pretty=true&size=3' -d '{
      "query": {
        "filtered": {
          "filter": {
            "script": {
              "script": "doc[\"age\"].value > param1",
              "params" : { "param1" : 80 }
            }
          },
          "query": { "match_all": {} }
        }
      }
    }'
In this example, all the documents in which the value of age is greater than param1 are qualified to be returned.
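Because the script text itself contains double quotes, a hand-written request body is easy to get wrong: the inner quotes must be escaped to keep the JSON valid. One way to avoid this is to build the body programmatically. The following is a minimal sketch in Python using only the standard json module; the helper name script_filter_body is hypothetical, and the body layout simply mirrors the filtered query used in this recipe.

```python
import json


def script_filter_body(script, params, lang=None):
    """Build the search body for a script filter wrapped in a filtered query."""
    script_filter = {"script": script, "params": params}
    if lang is not None:
        # Optional: the recipe states that the language defaults to groovy.
        script_filter["lang"] = lang
    return {
        "query": {
            "filtered": {
                "filter": {"script": script_filter},
                "query": {"match_all": {}},
            }
        }
    }


body = script_filter_body('doc["age"].value > param1', {"param1": 80})
# json.dumps escapes the inner double quotes of the script for us.
payload = json.dumps(body)
print(payload)
```

The resulting payload string can then be sent as the request body with any HTTP client; sending it is deliberately left out of the sketch.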
If everything works correctly, the result returned by ElasticSearch should be as shown here: { "took" : 30, "timed_out" : false, "_shards" : {    "total" : 5,    "successful" : 5,    "failed" : 0 }, "hits" : {    "total" : 237,    "max_score" : 1.0,    "hits" : [ {      "_index" : "test-index",      "_type" : "test-type",      "_id" : "9",      "_score" : 1.0, "_source" :{ … "age": 83, … }    }, {      "_index" : "test-index",      "_type" : "test-type",      "_id" : "23",      "_score" : 1.0, "_source" : { … "age": 87, … }    }, {      "_index" : "test-index",      "_type" : "test-type",      "_id" : "47",      "_score" : 1.0, "_source" : {…. "age": 98, …}    } ] } } How it works... The script filter is a language script that returns a Boolean value (true/false). For every hit, the script is evaluated, and if it returns true, the hit passes the filter. This type of scripting can only be used as Lucene filters, not as queries, because it doesn't affect the search (the exceptions are constant_score and custom_filters_score). These are the scripting fields: script: This contains the code to be executed params: These are optional parameters to be passed to the script lang (defaults to groovy): This defines the language of the script The script code can be any code in your preferred and supported scripting language that returns a Boolean value. There's more... Other languages are used in the same way as Groovy. For the current example, I have chosen a standard comparison that works in several languages. 
To execute the same script using the JavaScript language, use the following code:
    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?&pretty=true&size=3' -d '{
      "query": {
        "filtered": {
          "filter": {
            "script": {
              "script": "doc[\"age\"].value > param1",
              "lang":"javascript",
              "params" : { "param1" : 80 }
            }
          },
          "query": { "match_all": {} }
        }
      }
    }'
For Python, use the following code:
    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?&pretty=true&size=3' -d '{
      "query": {
        "filtered": {
          "filter": {
            "script": {
              "script": "doc[\"age\"].value > param1",
              "lang":"python",
              "params" : { "param1" : 80 }
            }
          },
          "query": { "match_all": {} }
        }
      }
    }'

See also
The Installing additional script plugins recipe in this article to install additional languages for scripting
The Sorting data using script recipe in this article to get a reference of the extra built-in functions in Groovy scripts

Summary
In this article, you have learned how to use scripting to extend the functional capabilities of ElasticSearch using different programming languages.

Resources for Article: Further resources on this subject: Indexing the Data [Article] Low-Level Index Control [Article] Designing Puppet Architectures [Article]

Packt
05 Feb 2015
26 min read

Introduction to Apache ZooKeeper

In this article by Saurav Haloi, author of the book Apache ZooKeeper Essentials, we will learn about Apache ZooKeeper, a software project of the Apache Software Foundation that provides an open source solution to the various coordination problems in large distributed systems. ZooKeeper as a centralized coordination service is distributed and highly reliable, running on a cluster of servers called a ZooKeeper Ensemble. Distributed consensus, group management, presence protocols, and leader election are implemented by the service so that applications do not need to reinvent the wheel by implementing them on their own. On top of these, the primitives exposed by ZooKeeper can be used by applications to build much more powerful abstractions for solving a wide variety of problems. (For more resources related to this topic, see here.)

Apache ZooKeeper is implemented in Java. It ships with C, Java, Perl, and Python client bindings. Community-contributed client libraries are available for a plethora of languages such as Go, Scala, Erlang, and so on. Apache ZooKeeper is widely used by a large number of organizations, such as Yahoo Inc., Twitter, Netflix, and Facebook, in their distributed application platforms as a coordination service. In this article, we will look into the installation and configuration of Apache ZooKeeper and some of the concepts associated with it, followed by programming using the Python client library of ZooKeeper. We will also see how we can implement some of the important constructs of distributed programming using ZooKeeper.

Download and installation
ZooKeeper is supported by a wide variety of platforms. GNU/Linux and Oracle Solaris are supported as development and production platforms for both server and client. Windows and Mac OS X are recommended only as development platforms for both server and client. ZooKeeper is implemented in Java and requires Java 6 or later versions to run.
Let's download the stable version from one of the mirrors, say Georgia Tech's Apache download mirror (http://b.gatech.edu/1xElxRb) in the following example:
    $ wget http://www.gtlib.gatech.edu/pub/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz
    $ ls -alh zookeeper-3.4.6.tar.gz
    -rw-rw-r-- 1 saurav saurav 17M Feb 20 2014 zookeeper-3.4.6.tar.gz
Once we have downloaded the ZooKeeper tarball, installing and setting up a standalone ZooKeeper node is pretty simple and straightforward. Let's extract the compressed tar archive into /usr/share:
    $ tar -C /usr/share -zxf zookeeper-3.4.6.tar.gz
    $ cd /usr/share/zookeeper-3.4.6/
    $ ls
    bin CHANGES.txt contrib docs ivy.xml LICENSE.txt
    README_packaging.txt recipes zookeeper-3.4.6.jar zookeeper-3.4.6.jar.md5
    build.xml conf dist-maven ivysettings.xml lib
    NOTICE.txt README.txt src zookeeper-3.4.6.jar.asc
    zookeeper-3.4.6.jar.sha1
The location where the ZooKeeper archive is extracted, in our case /usr/share/zookeeper-3.4.6, can be exported as ZK_HOME as follows:
    $ export ZK_HOME=/usr/share/zookeeper-3.4.6

Configuration
Once we have extracted the tarball, the next thing is to configure ZooKeeper. The conf folder holds the configuration files for ZooKeeper. ZooKeeper needs a configuration file called zoo.cfg in the conf folder inside the extracted ZooKeeper folder. There is a sample configuration file that contains some of the configuration parameters for reference. Let's create our configuration file with the following minimal parameters and save it in the conf directory:
    $ cat conf/zoo.cfg
    tickTime=2000
    dataDir=/var/lib/zookeeper
    clientPort=2181
The configuration parameters' meanings are explained here:
tickTime: This is measured in milliseconds; it is used for session registration and to do regular heartbeats by clients with the ZooKeeper service. The minimum session timeout will be twice the tickTime parameter.
dataDir: This is the location to store the in-memory state of ZooKeeper; it includes database snapshots and the transaction log of updates to the database. Extracting the ZooKeeper archive won't create this directory, so if this directory doesn't exist in the system, you will need to create it and set writable permission to it. clientPort: This is the port that listens for client connections, so it is where the ZooKeeper clients will initiate a connection. The client port can be set to any number, and different servers can be configured to listen on different ports. The default is 2181. ZooKeeper needs the JAVA_HOME environment variable to be set correctly. To see if this is set in your system, run the following command: $ echo $JAVA_HOME Starting the ZooKeeper server Now, considering that Java is installed and working properly, let's go ahead and start the ZooKeeper server. All ZooKeeper administration scripts to start/stop the server and invoke the ZooKeeper command shell are shipped along with the archive in the bin folder with the following code: $ pwd /usr/share/zookeeper-3.4.6/bin $ ls README.txt zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh The scripts with the .sh extension are for Unix platforms (GNU/Linux, Mac OS X, and so on), and the scripts with the .cmd extension are for Microsoft Windows operating systems. To start the ZooKeeper server in a GNU/Linux system, you need to execute the zkServer.sh script as follows. 
This script gives options to start, stop, restart, and see the status of the ZooKeeper server:
    $ ./zkServer.sh
    JMX enabled by default
    Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
To avoid going to the ZooKeeper install directory to run these scripts, you can include it in your PATH variable as follows:
    export PATH=$PATH:/usr/share/zookeeper-3.4.6/bin
Executing zkServer.sh with the start argument will start the ZooKeeper server. A successful start of the server will show the following output:
    $ zkServer.sh start
    JMX enabled by default
    Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
To verify that the ZooKeeper server has started, you can use the following ps command:
    $ ps -ef | grep zookeeper | grep -v grep | awk '{print $2}'
    5511
The ZooKeeper server's status can be checked with the zkServer.sh script as follows:
    $ zkServer.sh status
    JMX enabled by default
    Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: standalone

Connecting to ZooKeeper with a Java-based shell
To start the Java-based ZooKeeper command-line shell, we simply need to run zkCli.sh from the ZK_HOME/bin folder with the server IP and port as follows:
    ${ZK_HOME}/bin/zkCli.sh -server zk_server:port
In our case, we are running our ZooKeeper server on the same machine, so the server address will be localhost, or the loopback address 127.0.0.1. The port we configured was 2181, which is also the default:
    $ zkCli.sh -server localhost:2181
As we connect to the running ZooKeeper instance, we will see output similar to the following in the terminal (some output is omitted):
    Connecting to localhost:2181
    ...............
    ...............
    Welcome to ZooKeeper!
    JLine support is enabled
    .............
WATCHER:: WatchedEvent state:SyncConnected type:None path:null [zk: localhost:2181(CONNECTED) 0] To see a listing of the commands supported by the ZooKeeper Java shell, you can run the help command in the shell prompt: [zk: localhost:2181(CONNECTED) 0] help ZooKeeper -server host:port cmd args connect host:port get path [watch] ls path [watch] set path data [version] rmr path delquota [-n|-b] path quit printwatches on|off create [-s] [-e] path data acl stat path [watch] close ls2 path [watch] history listquota path setAcl path acl getAcl path sync path redo cmdno addauth scheme auth delete path [version] setquota -n|-b val path We can execute a few simple commands to get a feel of the command-line interface. Let's start by running the ls command, which, as in Unix, is used for listing: [zk: localhost:2181(CONNECTED) 1] ls / [zookeeper] Now, the ls command returned a string called zookeeper, which is a znode in the ZooKeeper terminology. We can create a znode through the ZooKeeper shell as follows: To begin with, let's create a HelloWorld znode with empty data. [zk: localhost:2181(CONNECTED) 2] create /HelloWorld "" Created /HelloWorld [zk: localhost:2181(CONNECTED) 3] ls / [zookeeper, HelloWorld] We can delete the znode created by issuing the delete command as follows: [zk: localhost:2181(CONNECTED) 4] delete /HelloWorld [zk: localhost:2181(CONNECTED) 5] ls / [zookeeper] The ZooKeeper data model ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace of data registers. The namespace looks quite similar to a Unix filesystem. The data registers are known as znodes in the ZooKeeper nomenclature. ZooKeeper has two types of znodes: persistent and ephemeral. There is a third type that you might have heard of, called a sequential znode, which is a kind of a qualifier for the other two types. Both persistent and ephemeral znodes can be sequential znodes as well. 
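The znode types introduced here can be made concrete with a toy, in-memory model. The sketch below is written in Python and does not talk to a ZooKeeper server; the class ToyZNodeTree and its methods are hypothetical names invented for this illustration. It assumes that sequence numbers are kept per parent, start at zero, and are appended as a zero-padded 10-digit suffix, which is how sequential znode names typically look in practice.

```python
class ToyZNodeTree:
    """Toy in-memory model of znode types; NOT a ZooKeeper client."""

    def __init__(self):
        # path -> {"data": bytes, "owner": session id or None (persistent)}
        self.nodes = {"/": {"data": b"", "owner": None}}
        # parent path -> next sequence number handed out under that parent
        self.counters = {}

    def create(self, path, data=b"", session=None,
               ephemeral=False, sequential=False):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError("parent znode does not exist: %s" % parent)
        if ephemeral and session is None:
            raise ValueError("an ephemeral znode needs an owning session")
        if sequential:
            # Sequential is a qualifier: it only changes the final name.
            seq = self.counters.get(parent, 0)
            self.counters[parent] = seq + 1
            path = "%s%010d" % (path, seq)
        if path in self.nodes:
            raise ValueError("znode already exists: %s" % path)
        self.nodes[path] = {"data": data,
                            "owner": session if ephemeral else None}
        return path

    def close_session(self, session):
        # Ephemeral znodes disappear when the creating session ends.
        owned = [p for p, n in self.nodes.items() if n["owner"] == session]
        for path in owned:
            del self.nodes[path]


tree = ToyZNodeTree()
tree.create("/app")  # persistent: stays until explicitly deleted
first = tree.create("/app/worker-", session="s1",
                    ephemeral=True, sequential=True)
second = tree.create("/app/worker-", session="s2",
                     ephemeral=True, sequential=True)
tree.close_session("s1")  # removes only the znode owned by session s1
print(first, second, sorted(tree.nodes))
```

In this model, persistent znodes are simply those without an owning session, which mirrors the idea that sequential is a qualifier that can be combined with either of the other two types.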
The persistent znode

As the name suggests, persistent znodes have a lifetime in ZooKeeper's namespace until they're explicitly deleted. A znode can be deleted by calling the delete API call.

The ephemeral znode

An ephemeral znode is deleted by the ZooKeeper service when the creating client's session ends. A client's session can end because of a disconnection caused by a client crash or because the client explicitly terminates the connection.

The sequential znode

A sequential znode is assigned a sequence number by ZooKeeper as a part of its name during its creation. The value of a monotonically increasing counter (maintained by the parent znode) is appended to the name of the znode.

The ZooKeeper Watches

ZooKeeper is designed to be a scalable and robust centralized service for very large distributed applications. A common design anti-pattern when clients access such services is polling, that is, a pull kind of model. A pull model often suffers from scalability problems when implemented in large and complex distributed systems. To solve this problem, ZooKeeper's designers implemented a mechanism where clients can get notifications from the ZooKeeper service instead of polling for events. This resembles a push model, where notifications are pushed to the registered clients of the ZooKeeper service.

Clients can register with the ZooKeeper service for any changes associated with a znode. This registration is known as setting a watch on a znode in ZooKeeper terminology. Watches allow clients to get notifications when a znode changes in any way. A watch is a one-time trigger, which means that it fires only one notification. If a client receives a watch event and wants to be notified of future changes, it must set another watch.
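The one-time nature of a watch can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: ToyWatchRegistry and its set_watch() and fire() methods are hypothetical names invented for this illustration and are not part of ZooKeeper or any client library.

```python
from collections import defaultdict


class ToyWatchRegistry:
    """Hypothetical in-memory stand-in for a server that keeps one-time watches."""

    def __init__(self):
        self._watches = defaultdict(list)  # path -> pending callbacks

    def set_watch(self, path, callback):
        self._watches[path].append(callback)

    def fire(self, path, event):
        # Remove the pending callbacks first: each watch fires at most once.
        callbacks = self._watches.pop(path, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


registry = ToyWatchRegistry()
seen_once = []
seen_always = []


def one_shot(event):
    # Never re-registers, so it only sees the first change.
    seen_once.append(event)


def re_registering(event):
    seen_always.append(event)
    registry.set_watch("/config", re_registering)  # set another watch


registry.set_watch("/config", one_shot)
registry.set_watch("/config", re_registering)
for change in ("update-1", "update-2", "update-3"):
    registry.fire("/config", change)

print(seen_once)    # ['update-1']
print(seen_always)  # ['update-1', 'update-2', 'update-3']
```

The callback that does not set another watch sees only the first change, while the one that re-registers on every notification keeps receiving them.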
Whenever a watch is triggered, a notification is dispatched to the client that had set the watch. Watches are maintained in the ZooKeeper server to which a client is connected, and this makes them a fast and lean means of event notification.

The watches are triggered for the following three changes to a znode:

- Any changes to the data of a znode, such as when new data is written to the znode's data field using the setData operation.
- Any changes to the children of a znode. For instance, children of a znode are deleted with the delete operation.
- A znode being created or deleted, which could happen in the event that a new znode is added to a path or an existing one is deleted.

ZooKeeper also asserts the following guarantees with respect to watches and notifications:

- ZooKeeper ensures that watches are always ordered in a first in, first out (FIFO) manner and that notifications are always dispatched in order
- Watch notifications are delivered to a client before any other change is made to the same znode
- Watch events are ordered with respect to the updates seen by the ZooKeeper service

ZooKeeper operations

ZooKeeper's data model and its API support the following nine basic operations: create, delete, exists, getChildren, getData, setData, getACL, setACL, and sync.

Watches and ZooKeeper operations

The read operations on znodes, such as exists, getChildren, and getData, allow watches to be set on them. On the other hand, watches are triggered by znode write operations, such as create, delete, and setData. ACL operations do not participate in watches. The following table lists the actions that trigger a watch set by each read operation:

    Operation      Event-generating actions
    exists         A znode is created or deleted, or its data is updated
    getChildren    A child of a znode is created or deleted, or the znode itself is deleted
    getData        A znode is deleted or its data is updated
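The relationship between the read operations and the events their watches can receive can be restated as a small lookup. The following is a minimal sketch: the event names are the watch event types ZooKeeper uses, while the WATCH_EVENTS dictionary and the watches_fired() helper are hypothetical names introduced only for this illustration.

```python
# Which event types a watch set by each read operation can receive.
WATCH_EVENTS = {
    "exists": {"NodeCreated", "NodeDeleted", "NodeDataChanged"},
    "getData": {"NodeDeleted", "NodeDataChanged"},
    "getChildren": {"NodeChildrenChanged", "NodeDeleted"},
}


def watches_fired(event_type):
    """Return the read operations whose watches receive this event type."""
    return sorted(op for op, events in WATCH_EVENTS.items() if event_type in events)


print(watches_fired("NodeDeleted"))          # ['exists', 'getChildren', 'getData']
print(watches_fired("NodeChildrenChanged"))  # ['getChildren']
```

Deleting a znode is the one change that notifies watches set through all three read operations.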
The following are the types of watch events that might occur during a znode state change:

- NodeChildrenChanged: A znode's child is created or deleted
- NodeCreated: A znode is created in a ZooKeeper path
- NodeDataChanged: The data associated with a znode is updated
- NodeDeleted: A znode is deleted in a ZooKeeper path

Programming with Apache ZooKeeper with Python

ZooKeeper is easily programmable and has client bindings for a plethora of languages. It is shipped with official Java, C, Perl, and Python client libraries. Here, we will look at programming ZooKeeper with Python.

The official client binding for Python is developed on top of the C bindings. It can be found in the contrib/zkpython directory of the ZooKeeper distribution. To build and install the Python binding, refer to the instructions in the README file there. In this section, we will study another popular Python client library for ZooKeeper, called Kazoo (https://kazoo.readthedocs.org/).

Kazoo is a pure Python library for ZooKeeper, which means that unlike the official Python bindings, Kazoo is implemented fully in Python and has no dependency on the C bindings of ZooKeeper. Along with providing both synchronous and asynchronous APIs, the Kazoo library also provides APIs for some distributed data structure primitives such as distributed locks, leader election, distributed queues, and so on.

Installation of Kazoo is very simple and can be done with either the pip or the easy_install installer. Using pip, Kazoo can be installed with the following command:

    $ pip install kazoo

Using easy_install, Kazoo is installed as follows:

    $ easy_install kazoo

To verify whether Kazoo is installed properly, let's try to connect to the ZooKeeper instance and print the list of znodes in the root path of the tree from a Python session. In this example, we imported the KazooClient, which is the main ZooKeeper client class.
Then, we created an object of the class (an instance of KazooClient) by connecting to the ZooKeeper instance that is running on the localhost. Once we called the start() method, it initiated a connection to the ZooKeeper server. Once successfully connected, the instance contains the handle to the ZooKeeper session. Now, when we called the get_children() method on the root path of the ZooKeeper namespace, it returned a list of the children. Finally, we closed the connection by calling the stop() method.

A watcher implementation

Kazoo provides higher-level child and data watching APIs as a recipe through a module called kazoo.recipe.watchers. This module provides the implementation of DataWatch and ChildrenWatch along with another class called PatientChildrenWatch. The PatientChildrenWatch class returns values after the children of a node don't change for a period of time, unlike the other two, which return each time an event is generated.

Let's look at the implementation of a simple children watcher client, which will generate an event each time a znode is added or deleted from the ZooKeeper path:

    import signal
    from kazoo.client import KazooClient
    from kazoo.recipe.watchers import ChildrenWatch

    zoo_path = '/MyPath'

    # Connect to the local ZooKeeper server and make sure the path exists
    zk = KazooClient(hosts='localhost:2181')
    zk.start()
    zk.ensure_path(zoo_path)

    # Called with the current list of children on every change
    @zk.ChildrenWatch(zoo_path)
    def child_watch_func(children):
        print("List of Children %s" % children)

    # Keep the process alive so that events can be received
    while True:
        signal.pause()

In this simple implementation of a children watcher, we connect to the ZooKeeper server that is running on localhost and create the /MyPath path with the following call:

    zk.ensure_path(zoo_path)

We then set a children watcher on this path with the @zk.ChildrenWatch(zoo_path) decorator and register a callback method, child_watch_func, which prints the current list of children whenever an event is generated in /MyPath.
When we run this client watcher in a terminal, it starts listening to events. On another terminal, we create some znodes in /MyPath with the ZooKeeper shell. We observe that the children watcher client receives these znode creation events and prints the list of the current children in its terminal window. Similarly, if we delete the znodes that we just created, the watcher receives the events and prints the updated children listing in the terminal where it is running.

ZooKeeper recipes

In this section, you will learn to develop high-level distributed system constructs and data structures using ZooKeeper. As mentioned earlier, most of these constructs and functions are of utmost importance in building scalable distributed architectures, but they are fairly complicated to implement from scratch. Developers can often get bogged down while implementing these and integrating them with their application logic. In this section, you will learn how to develop algorithms to build some of these high-level functions using ZooKeeper primitives and its data model, and see how ZooKeeper makes this simple, scalable, and error free, with much less code.

Barrier

A barrier is a type of synchronization method used in distributed systems to block the processing of a set of nodes until a condition is satisfied. It defines a point where all nodes must stop their processing and cannot proceed until all the other nodes reach this barrier.

The algorithm to implement a barrier using ZooKeeper is as follows:

1. To start with, a znode is designated to be a barrier znode, say /zk_barrier.
2. The barrier is said to be active in the system if this barrier znode exists.
3. Each client calls the ZooKeeper API's exists() function on /zk_barrier and registers for watch events on the barrier znode (the watch flag is set to true).
4. If the exists() method returns false, the barrier no longer exists, and the client proceeds with its computation.
5. Else, if the exists() method returns true, the client just waits for watch events.
6. Whenever the barrier exit condition is met, the client in charge of the barrier will delete /zk_barrier.
7. The deletion triggers a watch event, and on getting this notification, the client calls the exists() function on /zk_barrier again.
8. This time, the call in step 7 returns false, and the clients can proceed further.

The barrier exists until the barrier znode ceases to exist! In this way, we can implement a barrier using ZooKeeper without much effort.

The example cited so far is a simple barrier that makes a group of distributed processes wait on some condition and then proceed together when the condition is met. There is another type of barrier that aids in synchronizing the beginning and end of a computation; this is known as a double barrier. The logic of a double barrier states that a computation is started when the required number of processes join the barrier. The processes leave after completing the computation, and when the number of processes participating in the barrier becomes zero, the computation is said to end.

The algorithm for a double barrier is implemented by having a barrier znode that serves as the parent of the individual process znodes participating in the computation. Its algorithm is outlined as follows:

Phase 1: Joining the barrier znode can be done as follows:

1. Suppose the barrier znode is represented by the znode /barrier.
2. Every client process registers with the barrier znode by creating an ephemeral znode with /barrier as the parent. In real scenarios, clients might register using their hostnames.
3. The client process sets a watch event for the existence of another znode called ready under the /barrier znode and waits for the node to appear.
4. A number N is predefined in the system; this governs the minimum number of clients that must join the barrier before the computation can start.
5. While joining the barrier, each client process finds the number of child znodes of /barrier:

       M = getChildren(/barrier, watch=false)

   If M is less than N, the client waits for the watch event registered in step 3. Else, if M is equal to N, the client process creates the ready znode under /barrier.
6. The creation of the ready znode in step 5 triggers the watch event, and each client starts the computation that it had been waiting to do.

Phase 2: Leaving the barrier can be done as follows:

1. On finishing the computation, the client process deletes the znode it created under /barrier (in step 2 of Phase 1: Joining the barrier).
2. The client process then finds the number of children under /barrier:

       M = getChildren(/barrier, watch=True)

   If M is not equal to 0, this client waits for notifications (observe that we have set the watch flag to True in the preceding call). If M is equal to 0, the client exits the barrier znode.

The preceding procedure suffers from a potential herd effect, where all client processes wake up to check the number of children left in the barrier when a notification is triggered. To avoid this, we can create a sequential ephemeral znode in step 2 of Phase 1: Joining the barrier. Every client process then watches for its next lowest sequential ephemeral znode to go away as an exit criterion. This way, only a single event is generated for any client completing the computation, and hence, not all clients need to wake up together to check their exit condition. For a large number of client processes participating in a barrier, the herd effect can negatively impact the scalability of the ZooKeeper service, and developers should be aware of such scenarios.
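The join and leave conditions of the double barrier can be traced with a single-process model. This is a minimal sketch and not a ZooKeeper client: the ToyDoubleBarrier class and its enter() and leave() methods are hypothetical names, a Python set stands in for the ephemeral children of /barrier, and a boolean stands in for the ready znode.

```python
class ToyDoubleBarrier:
    """Hypothetical single-process model of /barrier and its children."""

    def __init__(self, required):
        self.required = required  # N: clients needed before computing
        self.children = set()     # stands in for ephemeral znodes under /barrier
        self.ready = False        # stands in for the /barrier/ready znode

    def enter(self, client):
        """Phase 1: register, then create 'ready' once N clients have joined."""
        self.children.add(client)
        if len(self.children) == self.required:
            self.ready = True
        return self.ready

    def leave(self, client):
        """Phase 2: deregister; returns True once no participants remain."""
        self.children.discard(client)
        return len(self.children) == 0


barrier = ToyDoubleBarrier(required=3)
hosts = ("host-a", "host-b", "host-c")
print([barrier.enter(host) for host in hosts])  # [False, False, True]
print([barrier.leave(host) for host in hosts])  # [False, False, True]
```

In a real deployment the waiting is done through watches rather than return values, but the conditions being checked (M equal to N on entry, M equal to 0 on exit) are the same.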
A Java language implementation of a double barrier can be found in the ZooKeeper documentation at http://zookeeper.apache.org/doc/r3.4.6/zookeeperTutorial.html.

Queue

A distributed queue is a very common data structure used in distributed systems. A special implementation of a queue, called a producer-consumer queue, is where a collection of processes called producers generate or create new items and put them in the queue, while consumer processes remove the items from the queue and process them. The addition and removal of items in the queue follow a strict ordering of first in first out (FIFO).

A producer-consumer queue can be implemented using ZooKeeper. A znode will be designated to hold a queue instance, say queue-znode. All queue items are stored as znodes under this znode. Producers add an item to the queue by creating a znode under the queue-znode, and consumers retrieve the items by getting and then deleting a child from the queue-znode.

The FIFO order of the items is maintained using the sequential property of znodes provided by ZooKeeper. When a producer process creates a znode for a queue item, it sets the sequential flag. This lets ZooKeeper append a monotonically increasing sequence number to the znode name as a suffix. ZooKeeper guarantees that the sequence numbers are applied in order and are not reused. The consumer process processes the items in the correct order by looking at the sequence number of the znode.

The pseudocode for the algorithm to implement a producer-consumer queue using ZooKeeper is shown here:

1. Let /_QUEUE_ represent the top-level znode for our queue implementation, which is also called the queue-node.
2. Clients acting as producer processes put something into the queue by calling the create() method with the znode name as "queue-" and with the sequence and ephemeral flags set:

       create("queue-", SEQUENCE_EPHEMERAL)

   The sequence flag lets the new znode get a name like queue-N, where N is a monotonically increasing number.
3. Clients acting as consumer processes issue a getChildren() method call on the queue-node with the watch flag set to true:

       M = getChildren(/_QUEUE_, true)

   The client sorts the children list M, takes the lowest numbered child znode from the list, processes it by reading the data from the znode, and then deletes it.
4. The client picks up further items from the list and continues processing them. On reaching the end of the list, the client should check again whether any new items have been added to the queue by issuing another get_children() method call.
5. The algorithm continues until get_children() returns an empty list; this means that no more znodes or items are left under /_QUEUE_.

It's quite possible that in step 3, the deletion of a znode by a client will fail because some other client has gained access to the znode while this client was retrieving the item. In such scenarios, the client should retry the delete call, typically on the next item in its list.

Using this algorithm for a generic queue, we can also build a priority queue, where each item has a priority tagged to it. The algorithm and implementation are left as an exercise for the reader.

C and Java implementations of the distributed queue recipe are shipped along with the ZooKeeper distribution under the recipes folder. Developers can use this recipe to implement distributed queues in their applications.

Kazoo, the Python client library for ZooKeeper, has distributed queue implementations inside the kazoo.recipe.queue module. This queue implementation has built-in support for assigning priorities to queue items as well as for queue locking.
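The way sequential names preserve FIFO order can be seen in a small in-memory model. This is a sketch under stated assumptions, not a ZooKeeper client: ToyZnodeQueue, put(), and get() are hypothetical names, and a dictionary stands in for the children of /_QUEUE_. The zero-padded, ten-digit suffix mirrors the %010d counter format described in the ZooKeeper documentation for sequence nodes.

```python
class ToyZnodeQueue:
    """Hypothetical in-memory model of the children under /_QUEUE_."""

    def __init__(self):
        self._counter = 0    # per-parent sequence counter
        self._children = {}  # child znode name -> item data

    def put(self, data):
        # Producer: create a child named queue-N with a sequential suffix.
        name = "queue-%010d" % self._counter
        self._counter += 1
        self._children[name] = data
        return name

    def get(self):
        # Consumer: take the lowest-numbered child, then delete it.
        if not self._children:
            return None
        lowest = min(self._children)  # zero padding makes names sort in creation order
        return self._children.pop(lowest)


queue = ToyZnodeQueue()
names = [queue.put(item) for item in ("a", "b", "c")]
print(names[0])  # queue-0000000000
print([queue.get(), queue.get(), queue.get(), queue.get()])  # ['a', 'b', 'c', None]
```

Because the counter is zero-padded to a fixed width, a plain string sort of the child names matches the order in which the items were created.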
Lock

A lock in a distributed system is an important primitive that provides applications with a means to synchronize their access to shared resources. Distributed locks need to be globally synchronous to ensure that no two clients can hold the same lock at any instant of time.

Typical scenarios where locks are necessary are when the system as a whole needs to ensure that only one node of the cluster is allowed to carry out an operation at a given time, such as:

- Write to a shared database or file
- Act as a decision subsystem
- Process all I/O requests from other nodes

ZooKeeper can be used to implement mutually exclusive locks for processes that run on different servers across different networks and even geographically apart.

To build a distributed lock with ZooKeeper, a persistent znode is designated to be the main lock-znode. Client processes that want to acquire the lock will create an ephemeral znode with the sequential flag set under the lock-znode. The crux of the algorithm is that the lock is owned by the client process whose child znode has the lowest sequence number. ZooKeeper guarantees the order of the sequence numbers, as sequence znodes are numbered in a monotonically increasing order. Suppose there are three znodes under the lock-znode: l1, l2, and l3. The client process that created l1 will be the owner of the lock. If the client wants to release the lock, it simply deletes l1, and then the owner of l2 will be the lock owner, and so on.

The pseudocode for the algorithm to implement a distributed lock service with ZooKeeper is shown here:

Let the parent lock node be represented by a persistent znode, /_locknode_, in the ZooKeeper tree.

Phase 1: Acquire a lock with the following steps:

1. Call the create("/_locknode_/lock-", CreateMode=EPHEMERAL_SEQUENTIAL) method.
2. Call the getChildren("/_locknode_", false) method on the lock node. Here, the watch flag is set to false, as otherwise, it can lead to a herd effect.
3. If the znode created by the client in step 1 has the lowest sequence number suffix, then the client is the owner of the lock, and it exits the algorithm.
4. Call the exists("/_locknode_/<znode with the next lowest sequence number>", True) method.
5. If the exists() method returns false, go to step 2.
6. If the exists() method returns true, wait for notifications for the watch event set in step 4.

Phase 2: Release a lock as follows:

1. The client holding the lock deletes the node, thereby triggering the next client in line to acquire the lock.
2. The client that created the next higher sequence node will be notified and will hold the lock. The watch for this event was set in step 4 of Phase 1: Acquire a lock.

If the other clients also need to know about the change of lock ownership, they could set a watch on the /_locknode_ lock node for events of the NodeChildrenChanged type and determine the current owner. However, this is not recommended in a system with a large number of clients because of the herd effect.

If there was a partial failure in the creation of the znode due to connection loss, it's possible that the client won't be able to correctly determine whether it successfully created the child znode. To resolve such a situation, the client can store its session ID in the znode data field or even as a part of the znode name itself. As a client retains the same session ID after a reconnect, it can easily determine whether the child znode was created by it by looking at the session ID.

The idea of creating an ephemeral znode prevents a potential deadlock situation that might arise when a client dies while holding a lock. Since an ephemeral znode gets deleted when the session times out or expires, ZooKeeper will delete the znode created by the dead client, and the algorithm runs as usual. However, if the client hangs for some reason but the ZooKeeper session is still active, then we might get into a deadlock.
This can be solved by having a monitor client that triggers an alarm when the lock holding time for a client crosses a predefined timeout.

The ZooKeeper distribution is shipped with C and Java language implementations of a distributed lock in the recipes folder. The recipe implements the algorithm you have learned so far and also takes into account the problems associated with partial failure and the herd effect.

The previous recipe of a mutually exclusive lock can be modified to implement a shared lock as well. Readers can find the algorithm and pseudocode for a shared lock using ZooKeeper in the documentation at http://zookeeper.apache.org/doc/r3.4.6/recipes.html#Shared+Locks.

More ZooKeeper recipes are available at: http://zookeeper.apache.org/doc/trunk/recipes.html

Summary

In this article, we read about the fundamentals of Apache ZooKeeper, how to program it, and how to implement common distributed data structures with ZooKeeper. For more details on Apache ZooKeeper, please visit its project page.
Packt
05 Feb 2015
19 min read

Transformations Using Map/Reduce

In this article, written by Adam Boduch, author of the book Lo-Dash Essentials, we'll be looking at all the interesting things we can do with Lo-Dash and the map/reduce programming model. We'll start off with the basics, getting our feet wet with some basic mappings and basic reductions. As we progress through the article, we'll start introducing more advanced techniques to think in terms of map/reduce with Lo-Dash. The goal, once you've reached the end of this article, is to have a solid understanding of the Lo-Dash functions available that aid in mapping and reducing collections. Additionally, you'll start to notice how disparate Lo-Dash functions work together in the map/reduce domain. Ready?

Plucking values

Consider pluck() as your informal introduction to mapping because that's essentially what it does. It takes an input collection and maps it to a new collection, plucking only the properties we're interested in. This is shown in the following example:

    var collection = [
        { name: 'Virginia', age: 45 },
        { name: 'Debra', age: 34 },
        { name: 'Jerry', age: 55 },
        { name: 'Earl', age: 29 }
    ];

    _.pluck(collection, 'age');
    // → [ 45, 34, 55, 29 ]

This is about as simple a mapping operation as you'll find. In fact, you can do the same thing with map():

    var collection = [
        { name: 'Michele', age: 58 },
        { name: 'Lynda', age: 23 },
        { name: 'William', age: 35 },
        { name: 'Thomas', age: 41 }
    ];

    _.map(collection, 'name');
    // →
    // [
    //   "Michele",
    //   "Lynda",
    //   "William",
    //   "Thomas"
    // ]

As you'd expect, the output here is exactly the same as it would be with pluck(). In fact, pluck() is actually using the map() function under the hood. The callback passed to map() is constructed using property(), which just returns the specified property value. The map() function falls back to this plucking behavior when a string instead of a function is passed to it.
With that brief introduction to the nature of mapping, let's dig a little deeper and see what's possible in mapping collections.

Mapping collections

In this section, we'll explore mapping collections. Mapping one collection to another ranges from really simple callbacks, as we saw in the preceding section, to sophisticated ones. The callbacks that map each item in the collection can include or exclude properties and can calculate new values. We can also apply existing functions to these items. We'll also address the issue of filtering collections and how this can be done in conjunction with mapping.

Including and excluding properties

When applied to an object, the pick() function generates a new object containing only the specified properties. The opposite of this function, omit(), generates an object with every property except those specified. Since these functions work fine for individual object instances, why not use them on a collection? You can use both of these functions to shed properties from collections by mapping them to new ones, as shown in the following code:

    var collection = [
        { first: 'Ryan', last: 'Coleman', age: 23 },
        { first: 'Ann', last: 'Sutton', age: 31 },
        { first: 'Van', last: 'Holloway', age: 44 },
        { first: 'Francis', last: 'Higgins', age: 38 }
    ];

    _.map(collection, function(item) {
        return _.pick(item, [ 'first', 'last' ]);
    });
    // →
    // [
    //   { first: "Ryan", last: "Coleman" },
    //   { first: "Ann", last: "Sutton" },
    //   { first: "Van", last: "Holloway" },
    //   { first: "Francis", last: "Higgins" }
    // ]

Here, we're creating a new collection using the map() function. The callback function supplied to map() is applied to each item in the collection. The item argument is the original item from the collection. The callback is expected to return the mapped version of that item, and this version could be anything, including the original item itself. Be careful when manipulating the original item in map() callbacks.
If the item is an object and it's referenced elsewhere in your application, changing it could have unintended consequences.

We're returning a new object as the mapped item in the preceding code. This is done using the pick() function. We only care about the first and the last properties. Our newly mapped collection looks identical to the original, except that no item has an age property. The opposite approach, using omit(), is seen in the following code:

    var collection = [
        { first: 'Clinton', last: 'Park', age: 19 },
        { first: 'Dana', last: 'Hines', age: 36 },
        { first: 'Pete', last: 'Ross', age: 31 },
        { first: 'Annie', last: 'Cross', age: 48 }
    ];

    _.map(collection, function(item) {
        return _.omit(item, 'first');
    });
    // →
    // [
    //   { last: "Park", age: 19 },
    //   { last: "Hines", age: 36 },
    //   { last: "Ross", age: 31 },
    //   { last: "Cross", age: 48 }
    // ]

The preceding code follows the same approach as the pick() code. The only difference is that we're excluding the first property from the newly created collection. You'll also notice that we're passing a string containing a single property name instead of an array of property names.

In addition to passing strings or arrays as the argument to pick() or omit(), we can pass in a function callback. This is suitable when it's not very clear which objects in a collection should have which properties.
Using a callback like this inside a map() callback lets us perform detailed comparisons and transformations on collections while using very little code:

    function invalidAge(value, key) {
        return key === 'age' && value < 40;
    }

    var collection = [
        { first: 'Kim', last: 'Lawson', age: 40 },
        { first: 'Marcia', last: 'Butler', age: 31 },
        { first: 'Shawna', last: 'Hamilton', age: 39 },
        { first: 'Leon', last: 'Johnston', age: 67 }
    ];

    _.map(collection, function(item) {
        return _.omit(item, invalidAge);
    });
    // →
    // [
    //   { first: "Kim", last: "Lawson", age: 40 },
    //   { first: "Marcia", last: "Butler" },
    //   { first: "Shawna", last: "Hamilton" },
    //   { first: "Leon", last: "Johnston", age: 67 }
    // ]

The new collection generated by this code excludes the age property for items where the age value is less than 40. The callback supplied to omit() is applied to each key-value pair in the object. This code is a good illustration of the conciseness achievable with Lo-Dash. There's a lot of iterative code running here and there is no for or while statement in sight.

Performing calculations

It's time now to turn our attention to performing calculations in our map() callbacks. This entails looking at the item and, based on its current state, computing a new value that will ultimately be mapped to the new collection. This could mean extending the original item's properties or replacing one with a newly computed value. Whichever the case, it's a lot easier to map these computations than to write your own logic that applies these functions to every item in your collection. This is explained using the following example:

    var collection = [
        { name: 'Valerie', jqueryYears: 4, cssYears: 3 },
        { name: 'Alonzo', jqueryYears: 1, cssYears: 5 },
        { name: 'Claire', jqueryYears: 3, cssYears: 1 },
        { name: 'Duane', jqueryYears: 2, cssYears: 0 }
    ];

    _.map(collection, function(item) {
        return _.extend({
            experience: item.jqueryYears + item.cssYears,
            specialty: item.jqueryYears >= item.cssYears ? 'jQuery' : 'CSS'
        }, item);
    });
    // →
    // [
    //   {
    //     experience: 7,
    //     specialty: "jQuery",
    //     name: "Valerie",
    //     jqueryYears: 4,
    //     cssYears: 3
    //   },
    //   {
    //     experience: 6,
    //     specialty: "CSS",
    //     name: "Alonzo",
    //     jqueryYears: 1,
    //     cssYears: 5
    //   },
    //   {
    //     experience: 4,
    //     specialty: "jQuery",
    //     name: "Claire",
    //     jqueryYears: 3,
    //     cssYears: 1
    //   },
    //   {
    //     experience: 2,
    //     specialty: "jQuery",
    //     name: "Duane",
    //     jqueryYears: 2,
    //     cssYears: 0
    //   }
    // ]

Here, we're mapping each item in the original collection to an extended version of it. In particular, we're computing two new values for each item: experience and specialty. The experience property is simply the sum of the jqueryYears and cssYears properties. The specialty property is computed based on the larger value of the jqueryYears and cssYears properties.

Earlier, I mentioned the need to be careful when modifying items in map() callbacks. In general, it's a bad idea. It's helpful to try and remember that map() is used to generate new collections, not to modify existing collections. Here's an illustration of the horrific consequences of not being careful:

    var app = {},
        collection = [
            { name: 'Cameron', supervisor: false },
            { name: 'Lindsey', supervisor: true },
            { name: 'Kenneth', supervisor: false },
            { name: 'Caroline', supervisor: true }
        ];

    app.supervisor = _.find(collection, { supervisor: true });

    _.map(collection, function(item) {
        return _.extend(item, { supervisor: false });
    });

    console.log(app.supervisor);
    // → { name: "Lindsey", supervisor: false }

The destructive nature of this callback is not obvious at all, and the resulting bug is next to impossible for programmers to track down and diagnose. The callback essentially resets the supervisor property of each item. If these items are used anywhere else in the application, the supervisor property value will be clobbered whenever this map job is executed.
If you need to reset values like this, ensure that the change is made to a new mapped object and not to the original item.

Mapping also works with primitive values as the item. Often, we'll have an array of primitive values that we'd like transformed into an alternative representation. For example, let's say you have an array of sizes, expressed in bytes. You can map that array to a new collection with those sizes expressed as human-readable values, using the following code:

    function bytes(b) {
        var units = [ 'B', 'K', 'M', 'G', 'T', 'P' ],
            target = 0;

        while (b >= 1024) {
            b = b / 1024;
            target++;
        }

        return (b % 1 === 0 ? b : b.toFixed(1)) +
            units[target] + (target === 0 ? '' : 'B');
    }

    var collection = [ 1024, 1048576, 345198, 120120120 ];

    _.map(collection, bytes);
    // → [ "1KB", "1MB", "337.1KB", "114.6MB" ]

The bytes() function takes a numerical argument, which is the number of bytes to be formatted. Bytes are the starting unit. We just keep incrementing the target unit until we have something that is less than 1024. For example, the last item in our collection maps to '114.6MB'. The bytes() function can be passed directly to map() since it's expecting values in our collection as they are.

Calling functions

We don't always have to write our own callback functions for map(). Wherever it makes sense, we're free to leverage Lo-Dash functions to map our collection items. For example, let's say we have a collection and we'd like to know the size of each item. There's a size() Lo-Dash function we can use as our map() callback, as follows:

    var collection = [
        [ 1, 2 ],
        [ 1, 2, 3 ],
        { first: 1, second: 2 },
        { first: 1, second: 2, third: 3 }
    ];

    _.map(collection, _.size);
    // → [ 2, 3, 2, 3 ]

This code has the added benefit that the size() function returns consistent results, no matter what kind of argument is passed to it. In fact, any function that takes a single argument and returns a new value based on that argument is a valid candidate for a map() callback.
For instance, we could also map the minimum and maximum value of each item:

    var source = _.range(1000),
        collection = [
            _.sample(source, 50),
            _.sample(source, 100),
            _.sample(source, 150)
        ];

    _.map(collection, _.min);
    // → [ 20, 21, 1 ]

    _.map(collection, _.max);
    // → [ 931, 985, 991 ]

What if we want to map each item of our collection to a sorted version? We aren't sorting the collection itself, so the positions of the items within the collection don't matter; what we want sorted are the items themselves, if they're arrays, for instance. Let's see what happens with the following code:

    var collection = [
        [ 'Evan', 'Veronica', 'Dana' ],
        [ 'Lila', 'Ronald', 'Dwayne' ],
        [ 'Ivan', 'Alfred', 'Doug' ],
        [ 'Penny', 'Lynne', 'Andy' ]
    ];

    _.map(collection, _.compose(_.first, function(item) {
        return _.sortBy(item);
    }));
    // → [ "Dana", "Dwayne", "Alfred", "Andy" ]

This code uses the compose() function to construct a map() callback. The function that runs first (the rightmost argument to compose()) returns the sorted version of the item by passing it to sortBy(). The first() item of this sorted list is then returned as the mapped item. The end result is a new collection containing the alphabetically first item from each array in our collection, with three lines of code. This is not bad.

Filtering and mapping

Filtering and mapping are two closely related collection operations. Filtering extracts only those collection items that are of particular interest in a given context. Mapping transforms collections to produce new collections. But what if you only want to map a certain subset of your collection? Then it would make sense to chain together the filtering and mapping operations, right?
Here's an example of what that might look like:

    var collection = [
        { name: 'Karl', enabled: true },
        { name: 'Sophie', enabled: true },
        { name: 'Jerald', enabled: false },
        { name: 'Angie', enabled: false }
    ];

    _.compose(
        _.partialRight(_.map, 'name'),
        _.partialRight(_.filter, 'enabled')
    )(collection);
    // → [ "Karl", "Sophie" ]

This map is executed using compose() to build a function that is called right away, with our collection as the argument. The function is composed of two partials. We're using partialRight() for both functions because we want the collection supplied as the leftmost argument in both cases. The first partial function to run is filter(). We're partially applying the enabled argument, so this function will filter our collection before it's passed to map(). This brings us to the next partial in the function composition. The result of filtering the collection is passed to map(), which has the name argument partially applied. The end result is a collection containing the name strings of the enabled items.

The important thing to note about the preceding code is that the filtering operation takes place before the map() function is run. We could have stored the filtered collection in an intermediate variable instead of streamlining with compose(). Regardless of flavor, it's important that the items in your mapped collection correspond to the items in the source collection. It's conceivable to filter out the items in the map() callback by not returning anything, but this is ill-advised as it doesn't map well, both figuratively and literally.

Mapping objects

The previous section focused on collections and how to map them. But wait, objects are collections too, right? That is indeed correct, but it's worth differentiating between the more traditional collections, arrays, and plain objects. The main reason is that there are implications with ordering and keys when performing map/reduce.
At the end of the day, arrays and objects serve different use cases with map/reduce, and this article tries to acknowledge these differences.

Now we'll start looking at some techniques Lo-Dash programmers employ when working with objects and mapping them to collections. There are a number of factors to consider such as the keys within an object and calling methods on objects. We'll take a look at the relationship between key-value pairs and how they can be used in a mapping context.

Working with keys

We can use the keys of a given object in interesting ways to map the object to a new collection. For example, we can use the keys() function to extract the keys of an object and map them to values other than the property value, as shown in the following example:

    var object = {
        first: 'Ronald',
        last: 'Walters',
        employer: 'Packt'
    };

    _.map(_.sortBy(_.keys(object)), function(item) {
        return object[item];
    });
    // → [ "Packt", "Ronald", "Walters" ]

The preceding code builds an array of property values from object. It does so using map(), which is actually mapping the keys() array of object. These keys are sorted using sortBy(). So Packt is the first element of the resulting array because employer is alphabetically first in the object keys.

Sometimes, it's desirable to perform lookups in other objects and map those values to a target object. For example, not all APIs return everything you need for a given page, packaged in a neat little object. You have to do joins and build the data you need. This is shown in the following code:

    var users = {},
        preferences = {};

    _.each(_.range(100), function() {
        var id = _.uniqueId('user-');
        users[id] = { type: 'user' };
        preferences[id] = { emailme: !!(_.random()) };
    });

    _.map(users, function(value, key) {
        return _.extend({ id: key }, preferences[key]);
    });
    // →
    // [
    //   { id: "user-1", emailme: true },
    //   { id: "user-2", emailme: false },
    //   ...
    // ]

This example builds two objects, users and preferences.
In the case of each object, the keys are user identifiers that we're generating with uniqueId(). The user objects just have some dummy attribute in them, while the preferences objects have an emailme attribute, set to a random Boolean value.

Now let's say we need quick access to this preference for all users in the users object. As you can see, it's straightforward to implement using map() on the users object. The callback function returns a new object with the user ID. We extend this object with the preference for that particular user by looking it up by key.

Calling methods

Objects aren't limited to storing primitive strings and numbers. Properties can store functions as their values, or methods, as they're commonly referred to. However, depending on the context where you're using your object, methods aren't always callable, especially if you have little or no control over the context where your objects are used. One technique that's helpful in situations such as these is mapping the result of calling these methods and using this result in the context in question. Let's see how this can be done with the following code:

    var object = {
        first: 'Roxanne',
        last: 'Elliot',
        name: function() {
            return this.first + ' ' + this.last;
        },
        age: 38,
        retirement: 65,
        working: function() {
            return this.retirement - this.age;
        }
    };

    _.map(object, function(value, key) {
        var item = {};
        item[key] = _.isFunction(value) ? object[key]() : value;
        return item;
    });
    // →
    // [
    //   { first: "Roxanne" },
    //   { last: "Elliot" },
    //   { name: "Roxanne Elliot" },
    //   { age: 38 },
    //   { retirement: 65 },
    //   { working: 27 }
    // ]

    _.map(object, function(value, key) {
        var item = {};
        item[key] = _.result(object, key);
        return item;
    });
    // →
    // [
    //   { first: "Roxanne" },
    //   { last: "Elliot" },
    //   { name: "Roxanne Elliot" },
    //   { age: 38 },
    //   { retirement: 65 },
    //   { working: 27 }
    // ]

Here, we have an object with both primitive property values and methods that use these properties.
Now we'd like to map the results of calling those methods and we will experiment with two different approaches. The first approach uses the isFunction() function to determine whether the property value is callable or not. If it is, we call it and return that value. The second approach is a little easier to implement and achieves the same outcome. The result() function is applied to the object using the current key. This tests whether we're working with a function or not, so our code doesn't have to.

In the first approach to mapping method invocations, you might have noticed that we're calling the method using object[key]() instead of value(). The former retains the context as the object variable, but the latter loses the context, since it is invoked as a plain function without any object. So when you're writing mapping callbacks that call methods and not getting the expected results, make sure the method's context is intact.

Perhaps you have an object but you're not sure which properties are methods. You can use functions() to figure this out and then map the results of calling each method to an array, as shown in the following code:

    var object = {
        firstName: 'Fredrick',
        lastName: 'Townsend',
        first: function() {
            return this.firstName;
        },
        last: function() {
            return this.lastName;
        }
    };

    var methods = _.map(_.functions(object), function(item) {
        return [ _.bindKey(object, item) ];
    });

    _.invoke(methods, 0);
    // → [ "Fredrick", "Townsend" ]

The object variable has two methods, first() and last(). Assuming we didn't know about these methods, we can find them using functions(). Here, we're building a methods array using map(). The input is an array containing the names of all the methods of the given object. The value we're returning is interesting. It's a single-value array; you'll see why in a moment. The value of this array is a function built by passing the object and the name of the method to bindKey(). This function, when invoked, will always use object as its context.
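The difference between a detached method and a bound one can be sketched in plain JavaScript. This is only an illustration: the standard Function.prototype.bind is used here as a stand-in for bindKey(), and unlike bindKey() it captures the method at bind time rather than looking it up by key on each call.

```javascript
// Plain-JavaScript sketch of context loss and how binding prevents it.
const object = {
  firstName: 'Fredrick',
  first: function () { return this.firstName; }
};

// Detached reference: when called, `this` is no longer `object`.
// Depending on strict mode, the call returns undefined or throws a TypeError.
const detached = object.first;

// Bound reference: `this` is fixed to `object`, whatever the call site.
const bound = object.first.bind(object);

console.log(bound()); // 'Fredrick'
```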
Lastly, we use invoke() to invoke each method in our methods array, building a new result array. Recall that our map() callback returned an array. This was a simple hack to make invoke() work, since it's a convenient way to call methods. It generally expects a key as the second argument, but a numerical index works just as well, since they're both looked up in the same way.

Mapping key-value pairs

Just because you're working with an object doesn't mean it's ideal, or even necessary. That's what map() is for: mapping what you're given to what you need. For instance, the property values are sometimes all that matter for what you're doing, and you can dispense with the keys entirely. For that, we have the values() function and we feed the values to map():

    var object = {
        first: 'Lindsay',
        last: 'Castillo',
        age: 51
    };

    _.map(_.filter(_.values(object), _.isString), function(item) {
        return '<strong>' + item + '</strong>';
    });
    // → [ "<strong>Lindsay</strong>", "<strong>Castillo</strong>" ]

All we want from the object variable here is a list of property values, which are strings, so that we can format them. In other words, the fact that the keys are first, last, and age is irrelevant. So first, we call values() to build an array of values. Next, we pass that array to filter(), removing anything that's not a string. We then pass the output of this to map(), where we wrap each string in <strong> tags.

The opposite might also be true: the value is completely meaningless without its key.
If that's the case, it may be fitting to map key-value pairs to a new collection, as shown in the following example:

    function capitalize(s) {
        return s.charAt(0).toUpperCase() + s.slice(1);
    }

    function format(label, value) {
        return '<label>' + capitalize(label) + ':</label>' +
            '<strong>' + value + '</strong>';
    }

    var object = {
        first: 'Julian',
        last: 'Ramos',
        age: 43
    };

    _.map(_.pairs(object), function(pair) {
        return format.apply(undefined, pair);
    });
    // →
    // [
    //   "<label>First:</label><strong>Julian</strong>",
    //   "<label>Last:</label><strong>Ramos</strong>",
    //   "<label>Age:</label><strong>43</strong>"
    // ]

We're passing the result of running our object through the pairs() function to map(). The argument passed to our map callback function is an array, the first element being the key and the second being the value. It so happens that the format() function expects a key and a value to format the given string, so we're able to use format.apply() to call the function, passing it the pair array. This approach is just a matter of taste. There's no need to call pairs() before map(). We could just as easily have called format directly. But sometimes, this approach is preferred, and the reasons, not least of which is the style of the programmer, are wide and varied.

Summary

This article introduced you to the map/reduce programming model and how Lo-Dash tools help realize it in your application. First, we examined mapping collections, including how to choose which properties get included and how to perform calculations. We then moved on to mapping objects. Keys can have an important role in how objects get mapped to new objects and collections. There are also methods and functions to consider when mapping.
Packt
05 Feb 2015
9 min read

Building the next generation Web with Meteor

This article by Fabian Vogelsteller, the author of Building Single-page Web Apps with Meteor, explores the full-stack framework of Meteor. Meteor is not just a JavaScript library such as jQuery or AngularJS. It's a full-stack solution that contains frontend libraries, a Node.js-based server, and a command-line tool. All this together lets us write large-scale web applications in JavaScript, on both the server and client, using a consistent API.

Even though Meteor is quite young, a few companies such as https://lookback.io, https://respond.ly and https://madeye.io already use it in their production environments. If you want to see for yourself what's made with Meteor, take a look at http://madewith.meteor.com.

Meteor makes it easy for us to build web applications quickly and takes care of the boring processes such as file linking, minifying, and concatenating of files. Here are a few highlights of what is possible with Meteor:

  • We can build complex web applications amazingly fast using templates that automatically update themselves when data changes
  • We can push new code to all clients on the fly while they are using our app
  • Meteor core packages come with a complete account solution, allowing a seamless integration with Facebook, Twitter, and more
  • Data will automatically be synced across clients, keeping every client in the same state in almost real time
  • Latency compensation will make our interface appear super fast while the server response happens in the background

With Meteor, we never have to link files with the <script> tags in HTML. Meteor's command-line tool automatically collects JavaScript or CSS files in our application's folder and links them in the index.html file, which is served to clients on initial page load. This makes structuring our code in separate files as easy as creating them.
Meteor's command-line tool also watches all files inside our application's folder for changes and rebuilds them on the fly when they change. Additionally, it starts a Meteor server that serves the app's files to the clients. When a file changes, Meteor reloads the site of every client while preserving its state. This is called a hot code reload. In production, the build process also concatenates and minifies our CSS and JavaScript files.

By simply adding the less and coffee core packages, we can even write all styles in LESS and code in CoffeeScript with no extra effort. The command-line tool is also the tool for deploying and bundling our app so that we can run it on a remote server.

Sounds awesome? Let's take a look at what's needed to use Meteor.

Adding basic packages

Packages in Meteor are libraries that can be added to our projects. The nice thing about Meteor packages is that they are self-contained units, which run out of the box. They mostly add either some templating functionality or provide extra objects in the global namespace of our project. Packages can also add features to Meteor's build process, like the stylus package, which lets us write our app's style files with the stylus pre-processor syntax.

Writing templates in Meteor

Normally when we build websites, we build the complete HTML on the server side. This is quite straightforward: every page is built on the server, then it is sent to the client, and finally JavaScript adds some additional animation or dynamic behavior to it. This is not so in single-page apps, where each page needs to be already in the client's browser so that it can be shown at will. Meteor solves that problem by providing templates that exist in JavaScript and can be placed in the DOM at some point. These templates can have nested templates, allowing for an easy way to reuse and structure an app's HTML layout.
Since Meteor is so flexible in terms of folder and file structure, any *.html page can contain a template and will be parsed during Meteor's build process. This allows us to put all templates in the my-meteor-blog/client/templates folder. This folder structure is chosen as it helps us organize templates while our app grows. Meteor's template engine is called Spacebars, which is a derivative of the handlebars template engine. Spacebars is built on top of Blaze, which is Meteor's reactive DOM update engine.

Meteor and databases

Meteor currently uses MongoDB by default to store data on the server, although there are drivers planned for relational databases, too. If you are adventurous, you can try one of the community-built SQL drivers, such as the numtel:mysql package from https://atmospherejs.com/numtel/mysql.

MongoDB is a NoSQL database. This means it is based on a flat document structure instead of a relational table structure. Its document approach makes it ideal for JavaScript as documents are written in BSON, which is very similar to the JSON format.

Meteor has a database everywhere approach, which means we have the same API to query the database on the client as well as on the server. Yet, when we query the database on the client, we are only able to access data that we published to a client.

MongoDB uses a data structure called a collection, which is the equivalent of a table in an SQL database. Collections contain documents, where each document has its own unique ID. These documents are JSON-like structures and can contain properties with values, even with multiple dimensions:

    {
      "_id": "W7sBzpBbov48rR7jW",
      "myName": "My Document Name",
      "someProperty": 123456,
      "aNestedProperty": {
        "anotherOne": "With another string"
      }
    }

These collections are used to store data in the server's MongoDB as well as in the client-side minimongo collections; minimongo is an in-memory database mimicking the behavior of the real MongoDB.
The MongoDB API lets us use a simple JSON-based query language to get documents from a collection. We can pass additional options to only ask for specific fields or sort the returned documents. These are very powerful features, especially on the client side, to display data in various ways.

Data everywhere

In Meteor, we can use the browser console to update data, which means we update the database from the client. This works because Meteor automatically syncs these changes to the server and updates the database accordingly. This is happening because we have the autopublish and insecure core packages added to our project by default. The autopublish package automatically publishes all documents to every client, whereas the insecure package allows every client to update database records by their _id field. Obviously, this works well for prototyping but is infeasible for production, as every client could manipulate our database. If we remove the insecure package, we need to add "allow and deny" rules to determine what a client is allowed to update and what not; otherwise all updates will get denied.

Differences between client and server collections

Meteor has a database everywhere approach. This means it provides the same API on the client as on the server. The data flow is controlled using a publication subscription model. On the server sits the real MongoDB database, which stores data persistently. On the client, Meteor has a package called minimongo, which is a pure in-memory database mimicking most of MongoDB's query and update functions.

Every time a client connects to its Meteor server, Meteor downloads the documents the client subscribed to and stores them in its local minimongo database. From here, they can be displayed in a template or processed by functions. When the client updates a document, Meteor syncs it back to the server, where it is passed through any allow/deny functions before being persistently stored in the database.
This also works the other way around: when a document in the server-side database changes, it is automatically synced to every client that is subscribed to it, keeping every connected client up to date.

Syncing data – the current Web versus the new Web

In the current Web, most pages are either static files hosted on a server or dynamically generated by a server on a request. This is true for most server-side-rendered websites, for example, those written with PHP, Rails, or Django. Both of these techniques require no effort from the clients besides displaying the page; therefore, they are called thin clients.

In modern web applications, the idea of the browser has moved from thin clients to fat clients. This means most of the website's logic resides on the client and the client asks for the data it needs. Currently, this is mostly done via calls to an API server. This API server then returns data, commonly in JSON form, giving the client an easy way to handle it and use it appropriately.

Most modern websites are a mixture of thin and fat clients. Normal pages are server-side-rendered, where only some functionality, such as a chat box or news feed, is updated using API calls. Meteor, however, is built on the idea that it's better to use the calculation power of all clients instead of one single server. A pure fat client or a single-page app contains the entire logic of a website's frontend, which is sent down on the initial page load. The server then merely acts as a data source, sending only the data to the clients. This can happen by connecting to an API and utilizing AJAX calls, or as with Meteor, using a model called publication/subscription. In this model, the server offers a range of publications and each client decides which dataset it wants to subscribe to. Compared with AJAX calls, the developer doesn't have to take care of any downloading or uploading logic.
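The publication/subscription idea can be sketched with a toy, in-memory model in plain JavaScript. This is not Meteor's API, and it leaves out everything Meteor actually does over the network: ToyServer, publish, subscribe, and insert are hypothetical names chosen for this illustration only.

```javascript
// Toy publish/subscribe sketch in plain JavaScript. NOT Meteor's API:
// ToyServer, publish, subscribe and insert are hypothetical names.
class ToyServer {
  constructor() {
    this.documents = [];     // stand-in for the server-side database
    this.publications = {};  // publication name -> predicate over documents
    this.subscribers = [];   // { name, onChange } entries
  }

  // Offer a named dataset, described by a predicate over documents.
  publish(name, predicate) {
    this.publications[name] = predicate;
  }

  // A client picks a publication; it immediately receives the current
  // matches and is notified again whenever a matching document is added.
  subscribe(name, onChange) {
    this.subscribers.push({ name, onChange });
    onChange(this.documents.filter(this.publications[name]));
  }

  // Adding data on the server pushes fresh results to interested clients.
  insert(doc) {
    this.documents.push(doc);
    this.subscribers.forEach(({ name, onChange }) => {
      if (this.publications[name](doc)) {
        onChange(this.documents.filter(this.publications[name]));
      }
    });
  }
}

const server = new ToyServer();
server.publish('enabledUsers', doc => doc.enabled);

let clientCache = [];        // stand-in for the client's local copy
server.subscribe('enabledUsers', docs => { clientCache = docs; });

server.insert({ name: 'Karl', enabled: true });
server.insert({ name: 'Jerald', enabled: false });

console.log(clientCache.map(doc => doc.name)); // [ 'Karl' ]
```

The point of the sketch is the shape of the interaction: the client names the dataset it wants once, and from then on it only reacts to pushed results instead of issuing its own download requests.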
The Meteor client syncs all of the data automatically in the background as soon as it subscribes to a specific dataset. When data on the server changes, the server sends the updated documents to the clients and vice versa, as shown in the following diagram:

[Diagram: data syncing between the Meteor server and its clients]

Summary

Meteor comes with more great ways of building pure JavaScript applications, such as simple routing and simple ways to make components, which can be packaged for others to use. Meteor's reactivity model, which allows you to rerun any function and template helper at will, allows for consistent interfaces and simple dependency tracking, which is key for large-scale JavaScript applications. If you want to dig deeper, the book Building Single-page Web Apps with Meteor shows how to build your own blog as a single-page web application in a simple step-by-step fashion.
Packt
05 Feb 2015
11 min read

Google App Engine

In this article by Massimiliano Pippi, author of the book Python for Google App Engine, you will learn how to write a web application and see the platform in action. Web applications commonly provide a set of features such as user authentication and data storage. App Engine provides the services and tools needed to implement such features.

In this article, we will see:

  • Details of the webapp2 framework
  • How to authenticate users
  • Storing data on Google Cloud Datastore
  • Building HTML pages using templates

Experimenting on the Notes application

To better explore App Engine and Cloud Platform capabilities, we need a real-world application to experiment on; something that's not trivial to write, with a reasonable list of requirements. A good candidate is a note-taking application; we will name it Notes.

Notes enables users to add, remove, and modify a list of notes; a note has a title and a body of text. Users can only see their personal notes, so they must authenticate before using the application. The main page of the application will show the list of notes for logged-in users and a form to add new ones.

The code from the helloworld example is a good starting point. We can simply change the name of the root folder and the application field in the app.yaml file to match the new name we chose for the application, or we can start a new project from scratch named notes.
Authenticating users

The first requirement for our Notes application is to show the home page only to users who are logged in and to redirect others to the login form; the users service provided by App Engine is exactly what we need, and adding it to our MainHandler class is quite simple:

    import webapp2
    from google.appengine.api import users

    class MainHandler(webapp2.RequestHandler):
        def get(self):
            user = users.get_current_user()
            if user is not None:
                self.response.write('Hello Notes!')
            else:
                login_url = users.create_login_url(self.request.uri)
                self.redirect(login_url)

    app = webapp2.WSGIApplication([
        ('/', MainHandler)
    ], debug=True)

The users package we import on the second line of the previous code provides access to the users service functionalities. Inside the get() method of the MainHandler class, we first check whether the user visiting the page has logged in or not. If they have, the get_current_user() method returns an instance of the User class provided by App Engine, representing an authenticated user; otherwise, it returns None. If the user is valid, we provide the response as we did before; otherwise, we redirect them to the Google login form.

The URL of the login form is returned by the create_login_url() method, and we call it passing as a parameter the URL we want to redirect users to after a successful authentication. In this case, we want to redirect users to the same URL they are visiting, provided by webapp2 in the self.request.uri property. The webapp2 framework also provides handlers with a redirect() method we can use to conveniently set the right status and location properties of the response object so that the client browsers will be redirected to the login page.

HTML templates with Jinja2

Web applications provide rich and complex HTML user interfaces, and Notes is no exception, but so far, response objects in our applications contained just small pieces of text.
We could include HTML tags as strings in our Python modules and write them in the response body, but we can imagine how easily the code could become messy and hard to maintain. We need to completely separate the Python code from HTML pages, and that's exactly what a template engine does. A template is a piece of HTML code living in its own file and possibly containing additional, special tags; with the help of a template engine, from the Python script, we can load this file, properly parse special tags, if any, and return valid HTML code in the response body.

App Engine includes in the Python runtime a well-known template engine: the Jinja2 library. To make the Jinja2 library available to our application, we need to add this code to the app.yaml file under the libraries section:

    libraries:
    - name: webapp2
      version: "2.5.2"
    - name: jinja2
      version: latest

We can put the HTML code for the main page in a file called main.html inside the application root. We start with a very simple page:

    <!DOCTYPE html>
    <html>
    <head lang="en">
      <meta charset="UTF-8">
      <title>Notes</title>
    </head>
    <body>
      <div class="container">
        <h1>Welcome to Notes!</h1>
        <p>
          Hello, <b>{{user}}</b> - <a href="{{logout_url}}">Logout</a>
        </p>
      </div>
    </body>
    </html>

Most of the content is static, which means that it will be rendered as standard HTML as we see it, but there is a part that is dynamic and whose content depends on which data will be passed at runtime to the rendering process. This data is commonly referred to as the template context.

What has to be dynamic is the username of the current user and the link used to log out from the application. The HTML code contains two special elements written in the Jinja2 template syntax, {{user}} and {{logout_url}}, that will be substituted before the final output occurs.
Back to the Python script; we need to add the code to initialize the template engine before the MainHandler class definition:

    import os
    import jinja2

    jinja_env = jinja2.Environment(
        loader=jinja2.FileSystemLoader(os.path.dirname(__file__)))

The environment instance stores engine configuration and global objects, and it's used to load template instances; in our case, instances are loaded from HTML files on the filesystem in the same directory as the Python script.

To load and render our template, we add the following code to the MainHandler.get() method:

    class MainHandler(webapp2.RequestHandler):
        def get(self):
            user = users.get_current_user()
            if user is not None:
                logout_url = users.create_logout_url(self.request.uri)
                template_context = {
                    'user': user.nickname(),
                    'logout_url': logout_url,
                }
                template = jinja_env.get_template('main.html')
                self.response.out.write(
                    template.render(template_context))
            else:
                login_url = users.create_login_url(self.request.uri)
                self.redirect(login_url)

Similar to how we get the login URL, the create_logout_url() method provided by the users service returns the absolute URI to the logout procedure, which we assign to the logout_url variable.

We then create the template_context dictionary that contains the context values we want to pass to the template engine for the rendering process. We assign the nickname of the current user to the user key in the dictionary and the logout URL string to the logout_url key.

The get_template() method of the jinja_env instance takes the name of the file that contains the HTML code and returns a Jinja2 template object. To obtain the final output, we call the render() method on the template object, passing in the template_context dictionary, whose values are accessed by specifying their respective keys in the HTML file with the template syntax elements {{user}} and {{logout_url}}.
Handling forms

The main page of the application is supposed to list all the notes that belong to the current user but there isn't any way to create such notes at the moment. We need to display a web form on the main page so that users can submit details and create a note.

To display a form to collect data and create notes, we put the following HTML code right below the username and the logout link in the main.html template file:

    {% if note_title %}
    <p>Title: {{note_title}}</p>
    <p>Content: {{note_content}}</p>
    {% endif %}

    <h4>Add a new note</h4>
    <form action="" method="post">
      <div class="form-group">
        <label for="title">Title:</label>
        <input type="text" id="title" name="title" />
      </div>
      <div class="form-group">
        <label for="content">Content:</label>
        <textarea id="content" name="content"></textarea>
      </div>
      <div class="form-group">
        <button type="submit">Save note</button>
      </div>
    </form>

Before showing the form, a message is displayed only when the template context contains a variable named note_title. To do this, we use an if statement, executed between the {% if note_title %} and {% endif %} delimiters; similar delimiters are used to perform for loops or assign values inside a template.

The action property of the form tag is empty; this means that upon form submission, the browser will perform a POST request to the same URL, which in this case is the home page URL.
As our WSGI application maps the home page to the MainHandler class, we need to add a method to this class so that it can handle POST requests:

    class MainHandler(webapp2.RequestHandler):
        def get(self):
            user = users.get_current_user()
            if user is not None:
                logout_url = users.create_logout_url(self.request.uri)
                template_context = {
                    'user': user.nickname(),
                    'logout_url': logout_url,
                }
                template = jinja_env.get_template('main.html')
                self.response.out.write(
                    template.render(template_context))
            else:
                login_url = users.create_login_url(self.request.uri)
                self.redirect(login_url)

        def post(self):
            user = users.get_current_user()
            if user is None:
                self.error(401)
                # Stop here: without a user there is nothing to render.
                return

            logout_url = users.create_logout_url(self.request.uri)
            template_context = {
                'user': user.nickname(),
                'logout_url': logout_url,
                # Values submitted through the HTML form.
                'note_title': self.request.get('title'),
                'note_content': self.request.get('content'),
            }
            template = jinja_env.get_template('main.html')
            self.response.out.write(
                template.render(template_context))

When the form is submitted, the handler is invoked and the post() method is called. We first check whether a valid user is logged in; if not, we respond with an HTTP 401: Unauthorized error without serving any content in the response body, and return immediately. The explicit return matters because self.error() only sets the response status and does not stop the method, so the code that follows would otherwise fail when calling user.nickname() on None. Since the HTML template is the same one served by the get() method, we still need to add the logout URL and the user name to the context. In this case, we also store the data coming from the HTML form in the context. To access the form data, we call the get() method on the self.request object. The last three lines are boilerplate code to load and render the home page template.
We can move this code into a separate method of the handler class to avoid duplication:

    def _render_template(self, template_name, context=None):
        if context is None:
            context = {}

        template = jinja_env.get_template(template_name)
        return template.render(context)

In the get() and post() methods, we then use something like this to output the result of the template rendering:

    self.response.out.write(
        self._render_template('main.html', template_context))

We can now try to submit the form and check whether the note title and content are actually displayed above the form.

Summary

Thanks to App Engine, we have already implemented a rich set of features with relatively little effort. We have discovered some more details about the webapp2 framework and its capabilities by implementing a nontrivial request handler. We have learned how to use the App Engine users service to provide user authentication. We have delved into some fundamental details of Datastore, and we now know how to structure data in grouped entities and how to retrieve data effectively with ancestor queries. In addition, we have created an HTML user interface with the help of the Jinja2 template library, learning how to serve static content such as CSS files.

Resources for Article:

Further resources on this subject:
Machine Learning in IPython with scikit-learn [Article]
Introspecting Maya, Python, and PyMEL [Article]
Driving Visual Analyses with Automobile Data (Python) [Article]