
How-To Tutorials


Push your data to the Web

Packt
22 Feb 2016
27 min read
This article covers the following topics: An introduction to the Shiny app framework Creating your first Shiny app The connection between the server file and the user interface The concept of reactive programming Different types of interface layouts, widgets, and Shiny tags How to create a dynamic user interface Ways to share your Shiny applications with others How to deploy Shiny apps to the web (For more resources related to this topic, see here.) Introducing Shiny – the app framework The Shiny package delivers a powerful framework to build fully featured interactive Web applications just with R and RStudio. Basic Shiny applications typically consist of two components: ~/shinyapp |-- ui.R |-- server.R While the ui.R function represents the appearance of the user interface, the server.R function contains all the code for the execution of the app. The look of the user interface is based on the famous Twitter bootstrap framework, which makes the look and layout highly customizable and fully responsive. In fact, you only need to know R and how to use the shiny package to build a pretty web application. Also, a little knowledge of HTML, CSS, and JavaScript may help. If you want to check the general possibilities and what is possible with the Shiny package, it is advisable to take a look at the inbuilt examples. Just load the library and enter the example name: library(shiny) runExample("01_hello") As you can see, running the first example opens the Shiny app in a new window. This app creates a simple histogram plot where you can interactively change the number of bins. Further, this example allows you to inspect the corresponding ui.R and server.R code files. There are currently eleven inbuilt example apps: 01_hello 02_text 03_reactivity 04_mpg 05_sliders 06_tabsets 07_widgets 08_html 09_upload 10_download 11_timer These examples focus mainly on the user interface possibilities and elements that you can create with Shiny. Creating a new Shiny web app with RStudio RStudio offers a fast and easy way to create the basis of every new Shiny app. Just click on New Project and select the New Directory option in the newly opened window: After that, click on the Shiny Web Application field: Give your new app a name in the next step, and click on Create Project: RStudio will then open a ready-to-use Shiny app by opening a prefilled ui.R and server.R file: You can click on the now visible Run App button in the right corner of the file pane to display the prefilled example application. Creating your first Shiny application In your effort to create your first Shiny application, you should first create or consider rough sketches for your app. Questions that you might ask in this context are, What do I want to show? How do I want it to show?, and so on. Let's say we want to create an application that allows users to explore some of the variables of the mtcars dataset. The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). Sketching the final app We want the user of the app to be able to select one out of the three variables of the dataset that gets displayed in a histogram. Furthermore, we want users to get a summary of the dataset under the main plot. So, the following figure could be a rough project sketch: Constructing the user interface for your app We will reuse the already opened ui.R file from the RStudio example, and adapt it to our needs. 
The layout of the ui.R file for your first app is controlled by nested Shiny functions and looks like the following lines: library(shiny) shinyUI(pageWithSidebar( headerPanel("My First Shiny App"), sidebarPanel( selectInput(inputId = "variable", label = "Variable:", choices = c ("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), mainPanel( plotOutput("carsPlot"), verbatimTextOutput ("carsSummary") ) )) Creating the server file The server file holds all the code for the execution of the application: library(shiny) library(datasets) shinyServer(function(input, output) { output$carsPlot <- renderPlot({ hist(mtcars[,input$variable], main = "Histogram of mtcars variables", xlab = input$variable) }) output$carsSummary <- renderPrint({ summary(mtcars[,input$variable]) }) }) The final application After changing the ui.R and the server.R files according to our needs, just hit the Run App button and the final app opens in a new window: As planned in the app sketch, the app offers the user a drop-down menu to choose the desired variable on the left side, and shows a histogram and data summary of the selected variable on the right side. Deconstructing the final app into its components For a better understanding of the Shiny application logic and the interplay of the two main files, ui.R and server.R, we will disassemble your first app again into its individual parts. The components of the user interface We have divided the user interface into three parts: After loading the Shiny library, the complete look of the app gets defined by the shinyUI() function. In our app sketch, we chose a sidebar look; therefore, the shinyUI function holds the argument, pageWithSidebar(): library(shiny) shinyUI(pageWithSidebar( ... The headerPanel() argument is certainly the simplest component, since usually only the title of the app will be stored in it. In our ui.R file, it is just a single line of code: library(shiny) shinyUI(pageWithSidebar( titlePanel("My First Shiny App"), ... The sidebarPanel() function defines the look of the sidebar, and most importantly, handles the input of the variables of the chosen mtcars dataset: library(shiny) shinyUI(pageWithSidebar( titlePanel("My First Shiny App"), sidebarPanel( selectInput(inputId = "variable", label = "Variable:", choices = c ("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), ... Finally, the mainPanel() function ensures that the output is displayed. In our case, this is the histogram and the data summary for the selected variables: library(shiny) shinyUI(pageWithSidebar( titlePanel("My First Shiny App"), sidebarPanel( selectInput(inputId = "variable", label = "Variable:", choices = c ("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), mainPanel( plotOutput("carsPlot"), verbatimTextOutput ("carsSummary") ) )) The server file in detail While the ui.R file defines the look of the app, the server.R file holds instructions for the execution of the R code. Again, we use our first app to deconstruct the related server.R file into its main important parts. After loading the needed libraries, datasets, and further scripts, the function, shinyServer(function(input, output) {} ), defines the server logic: library(shiny) library(datasets) shinyServer(function(input, output) { The marked lines of code that follow translate the inputs of the ui.R file into matching outputs. 
In our case, the server-side output$ object is assigned to carsPlot, which in turn was called in the mainPanel() function of the ui.R file as plotOutput(). Moreover, the render* function, renderPlot() in our example, reflects the type of output; here, it is the histogram plot. Within the renderPlot() function, you can recognize the input$ object assigned to the variables that were defined in the user interface file: library(shiny) library(datasets) shinyServer(function(input, output) { output$carsPlot <- renderPlot({ hist(mtcars[,input$variable], main = "Histogram of mtcars variables", xlab = input$variable) }) ... In the following lines, you will see another type of render function, renderPrint(), and within the curly braces, the actual R function, summary(), with the defined input variable: library(shiny) library(datasets) shinyServer(function(input, output) { output$carsPlot <- renderPlot({ hist(mtcars[,input$variable], main = "Histogram of mtcars variables", xlab = input$variable) }) output$carsSummary <- renderPrint({ summary(mtcars[,input$variable]) }) }) There are plenty of different render functions. The most used are as follows: renderPlot: This creates normal plots renderPrint: This gives printed output types renderUI: This gives HTML or Shiny tag objects renderTable: This gives tables, data frames, and matrices renderText: This creates character strings All code outside the shinyServer() function runs only once, on the first launch of the app, while all the code in between the brackets and before the output functions runs as often as a user visits or refreshes the application. The code within the output functions runs every time a user changes the widget that belongs to the corresponding output. The connection between the server and the ui file As already inspected in our decomposed Shiny app, the input functions of the ui.R file are linked with the output functions of the server file. The following figure illustrates this again: The concept of reactivity Shiny uses a reactive programming model, and this is a big deal. By applying reactive programming, the framework is able to be fast, efficient, and robust. Briefly, whenever the input changes in the user interface, Shiny rebuilds the related output. Shiny uses three reactive objects: Reactive source Reactive conductor Reactive endpoint For simplicity, we use the formal signs of the RStudio documentation: The implementation of a reactive source is the reactive value; that of a reactive conductor is a reactive expression; and the reactive endpoint is also called the observer. The source and endpoint structure As shown in the previous section, the input defined in the ui.R file is linked to the output of the server.R file. For simplicity, we use the code from our first Shiny app again, along with the introduced formal signs: ... output$carsPlot <- renderPlot({ hist(mtcars[,input$variable], main = "Histogram of mtcars variables", xlab = input$variable) }) ... The input variable, in our app the Horsepower, Miles per Gallon, and Number of Carburetors choices, represents the reactive source. The histogram called carsPlot stands for the reactive endpoint. In fact, it is possible to link one reactive source to numerous reactive endpoints, and vice versa. In our Shiny app, we also connected the input variable to our first and second output, carsSummary: ... 
output$carsPlot <- renderPlot({ hist(mtcars[,input$variable], main = "Histogram of mtcars variables", xlab = input$variable) }) output$carsSummary <- renderPrint({ summary(mtcars[,input$variable]) }) ... To sum it up, this structure ensures that every time a user changes the input, the output refreshes automatically and accordingly. The purpose of the reactive conductor The reactive conductor differs from the reactive source and the reactive endpoint in that it can both be dependent and have dependents. Therefore, it can be placed between the source, which can only have dependents, and the endpoint, which in turn can only be dependent. The primary function of a reactive conductor is the encapsulation of heavy and difficult computations. In fact, reactive expressions cache the results of these computations. The following graph displays a possible connection of the three reactive types: In general, reactivity gives the impression of a simple directional system: an input arrives and an output occurs. You get the feeling that an input pushes information to an output. But this isn't the case. In reality, it works the other way around. The output pulls the information from the input. And this all works due to sophisticated server logic. The input sends a callback to the server, which in turn informs the output; the output then pulls the needed value from the input and shows the result to the user. But of course, for a user, this all feels like an instant update on any input change, and overall, like a responsive app's behavior. Of course, we have just touched upon the main aspects of reactivity, but now you know what's really going on under the hood of Shiny. Discovering the scope of the Shiny user interface After you know how to build a simple Shiny application, as well as how reactivity works, let us take a look at the next step: the various resources to create a custom user interface. Furthermore, there are nearly endless possibilities to shape the look and feel of the layout. As already mentioned, the entire HTML, CSS, and JavaScript logic and functions of the layout options are based on the highly flexible bootstrap framework. And, of course, everything is responsive by default, which makes it possible for the final application layout to adapt to the screen of any device. Exploring the Shiny interface layouts Currently, there are four common shinyUI() page layouts: pageWithSidebar() fluidPage() navbarPage() fixedPage() These page layouts can, in turn, be structured with different functions for a custom inner arrangement of the page. In the following sections, we introduce the most useful inner layout functions. As an example, we will use our first Shiny application again. The sidebar layout In the sidebar layout, the sidebarPanel() function is used as the input area and the mainPanel() function as the output area, just like in our first Shiny app. The sidebar layout uses the pageWithSidebar() function: library(shiny) shinyUI(pageWithSidebar( headerPanel("The Sidebar Layout"), sidebarPanel( selectInput(inputId = "variable", label = "This is the sidebarPanel", choices = c("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), mainPanel( tags$h2("This is the mainPanel"), plotOutput("carsPlot"), verbatimTextOutput("carsSummary") ) )) By changing only the first three functions, you can create exactly the same look with the fluidPage() layout. 
This is the sidebar layout with the fluidPage() function: library(shiny) shinyUI(fluidPage( titlePanel("The Sidebar Layout"), sidebarLayout( sidebarPanel( selectInput(inputId = "variable", label = "This is the sidebarPanel", choices = c("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), mainPanel( tags$h2("This is the mainPanel"), plotOutput("carsPlot"), verbatimTextOutput("carsSummary") ) ) ))   The grid layout The grid layout is where rows are created with the fluidRow() function. The input and output are made within free customizable columns. Naturally, a maximum of 12 columns from the bootstrap grid system must be respected. This is the grid layout with the fluidPage () function and a 4-8 grid: library(shiny) shinyUI(fluidPage( titlePanel("The Grid Layout"), fluidRow( column(4, selectInput(inputId = "variable", label = "Four-column input area", choices = c("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), column(8, tags$h3("Eight-column output area"), plotOutput("carsPlot"), verbatimTextOutput("carsSummary") ) ) )) As you can see from inspecting the previous ui.R file, the width of the columns is defined within the fluidRow() function, and the sum of these two columns adds up to 12. Since the allocation of the columns is completely flexible, you can also create something like the grid layout with the fluidPage() function and a 4-4-4 grid: library(shiny) shinyUI(fluidPage( titlePanel("The Grid Layout"), fluidRow( column(4, selectInput(inputId = "variable", label = "Four-column input area", choices = c("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), column(4, tags$h5("Four-column output area"), plotOutput("carsPlot") ), column(4, tags$h5("Another four-column output area"), verbatimTextOutput("carsSummary") ) ) )) The tabset panel layout The tabsetPanel() function can be built into the mainPanel() function of the aforementioned sidebar layout page. By applying this function, you can integrate several tabbed outputs into one view. This is the tabset layout with the fluidPage() function and three tab panels: library(shiny) shinyUI(fluidPage( titlePanel("The Tabset Layout"), sidebarLayout( sidebarPanel( selectInput(inputId = "variable", label = "Select a variable", choices = c("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), mainPanel( tabsetPanel( tabPanel("Plot", plotOutput("carsPlot")), tabPanel("Summary", verbatimTextOutput("carsSummary")), tabPanel("Raw Data", dataTableOutput("tableData")) ) ) ) )) After changing the code to include the tabsetPanel() function, the three tabs with the tabPanel() function display the respective output. With the help of this layout, you are no longer dependent on representing several outputs among themselves. Instead, you can display each output in its own tab, while the sidebar does not change. The position of the tabs is flexible and can be assigned to be above, below, right, and left. For example, in the following code file detail, the position of the tabsetPanel() function was assigned as follows: ... mainPanel( tabsetPanel(position = "below", tabPanel("Plot", plotOutput("carsPlot")), tabPanel("Summary", verbatimTextOutput("carsSummary")), tabPanel("Raw Data", tableOutput("tableData")) ) ) ... 
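The tabset examples above reference a tableData output that our original server.R does not define. As a hedged sketch (not part of the original article's code), the server file could be extended as follows; renderTable() pairs with tableOutput(), while the dataTableOutput() element used in the first tabset example would need renderDataTable() instead:

library(shiny)
library(datasets)

shinyServer(function(input, output) {
  output$carsPlot <- renderPlot({
    # Histogram of the variable selected in the sidebar
    hist(mtcars[, input$variable],
         main = "Histogram of mtcars variables",
         xlab = input$variable)
  })
  output$carsSummary <- renderPrint({
    summary(mtcars[, input$variable])
  })
  # Assumed addition for the "Raw Data" tab
  output$tableData <- renderTable({
    mtcars
  })
})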
The navlist panel layout The navlistPanel() function is similar to the tabsetPanel() function, and can be seen as an alternative if you need to integrate a large number of tabs. The navlistPanel() function also uses the tabPanel() function to include outputs: library(shiny) shinyUI(fluidPage( titlePanel("The Navlist Layout"), navlistPanel( "Discovering The Dataset", tabPanel("Plot", plotOutput("carsPlot")), tabPanel("Summary", verbatimTextOutput("carsSummary")), tabPanel("Another Plot", plotOutput("barPlot")), tabPanel("Even A Third Plot", plotOutput("thirdPlot"), "More Information", tabPanel("Raw Data", tableOutput("tableData")), tabPanel("More Datatables", tableOutput("moreData")) ) ))   The navbar page as the page layout In the previous examples, we have used the page layouts, fluidPage() and pageWithSidebar(), in the first line. But, especially when you want to create an application with a variety of tabs, sidebars, and various input and output areas, it is recommended that you use the navbarPage() layout. This function makes use of the standard top navigation of the bootstrap framework: library(shiny) shinyUI(navbarPage("The Navbar Page Layout", tabPanel("Data Analysis", sidebarPanel( selectInput(inputId = "variable", label = "Select a variable", choices = c("Horsepower" = "hp", "Miles per Gallon" = "mpg", "Number of Carburetors" = "carb"), selected = "hp") ), mainPanel( plotOutput("carsPlot"), verbatimTextOutput("carsSummary") ) ), tabPanel("Calculations" … ), tabPanel("Some Notes" … ) )) Adding widgets to your application After inspecting the most important page layouts in detail, we now look at the different interface input and output elements. By adding these widgets, panels, and other interface elements to an application, we can further customize each page layout. Shiny input elements Already, in our first Shiny application, we got to know a typical Shiny input element: the selection box widget. But, of course, there are a lot more widgets with different types of uses. All widgets can have several arguments; the minimum setup is to assign an inputId, which instructs the input slot to communicate with the server file, and a label to communicate with a widget. Each widget can also have its own specific arguments. As an example, we are looking at the code of a slider widget. In the previous screenshot are two versions of a slider; we took the slider range for inspection: sliderInput(inputId = "sliderExample", label = "Slider range", min = 0, max = 100, value = c(25, 75)) Besides the mandatory arguments, inputId and label, three more values have been added to the slider widget. The min and max arguments specify the minimum and maximum values that can be selected. In our example, these are 0 and 100. A numeric vector was assigned to the value argument, and this creates a double-ended range slider. This vector must logically be within the set minimum and maximum values. Currently, there are more than twenty different input widgets, which in turn are all individually configurable by assigning to them their own set of arguments. A brief overview of the output elements As we have seen, the output elements in the ui.R file are connected to the rendering functions in the server file. The mainly used output elements are: htmlOutput imageOutput plotOutput tableOutput textOutput verbatimTextOutput downloadButton Due to their unambiguous naming, the purpose of these elements should be clear. 
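To illustrate how the input widgets and output elements come together in a ui.R file, here is a small sketch; the IDs, labels, and values are made up for demonstration, and the matching server.R (with renderText(), renderPlot(), and renderPrint() calls) is not shown:

library(shiny)

shinyUI(fluidPage(
  titlePanel("Widgets and outputs"),
  sidebarLayout(
    sidebarPanel(
      numericInput(inputId = "obs", label = "Number of observations:",
                   value = 50, min = 10, max = 200),
      checkboxInput(inputId = "showRug", label = "Add a rug plot", value = FALSE)
    ),
    mainPanel(
      textOutput("obsText"),            # filled by renderText() on the server side
      plotOutput("obsPlot"),            # filled by renderPlot()
      verbatimTextOutput("obsSummary")  # filled by renderPrint()
    )
  )
))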
Individualizing your app even further with Shiny tags Although you don't need to know HTML to create stunning Shiny applications, you have the option to create highly customized apps by using raw HTML or so-called Shiny tags. To add raw HTML, you can use the HTML() function. We will focus on Shiny tags in the following list. Currently, there are over 100 different Shiny tag objects, which can be used to add text styling, colors, different headers, video and audio elements, lists, and many more things. You can use these tags by writing tags$tagname. The following is a brief list of useful tags: tags$h1: This is a first-level header; of course, you can also use the known h1-h6 headers tags$hr: This makes a horizontal line, also known as a thematic break tags$br: This makes a line break, a popular way to add some space tags$strong: This makes the text bold tags$div: This makes a division of text with a uniform style tags$a: This links to a webpage tags$iframe: This makes an inline frame for embedding possibilities The following ui.R file and corresponding screenshot show the usage of Shiny tags with an example: shinyUI(fluidPage( fluidRow( column(6, tags$h3("Customize your app with Shiny tags!"), tags$hr(), tags$a(href = "http://www.rstudio.com", "Click me"), tags$hr() ), column(6, tags$br(), tags$em("Look - the R project logo"), tags$br(), tags$img(src = "http://www.r-project.org/Rlogo.png") ) ), fluidRow( column(6, tags$strong("We can even add a video"), tags$video(src = "video.mp4", type = "video/mp4", autoplay = NA, controls = NA) ), column(6, tags$br(), tags$ol( tags$li("One"), tags$li("Two"), tags$li("Three")) ) ) )) Creating dynamic user interface elements We know how to build completely custom user interfaces with all the bells and whistles. But all the introduced types of interface elements are fixed and static. However, if you need to create dynamic interface elements, Shiny offers three ways to achieve this: The conditionalPanel() function The renderUI() function The use of directly injected JavaScript code In the following section, we only show how to use the first two ways because, firstly, they are built into the Shiny package and, secondly, the JavaScript method is indicated as experimental. Using conditionalPanel The conditionalPanel() function allows you to show or hide interface elements dynamically, and it is set in the ui.R file. The dynamic behavior of this function is achieved with JavaScript expressions, but as usual in the Shiny package, all you need to know is R programming. The following example application shows how this function works in the ui.R file: library(shiny) shinyUI(fluidPage( titlePanel("Dynamic Interface With Conditional Panels"), column(4, wellPanel( sliderInput(inputId = "n", label = "Number of points:", min = 10, max = 200, value = 50, step = 10) )), column(5, "The plot below will not be displayed when the slider value", "is less than 50.", conditionalPanel("input.n >= 50", plotOutput("scatterPlot", height = 300) ) ) )) The following example shows how this function works in the related server.R file: library(shiny) shinyServer(function(input, output) { output$scatterPlot <- renderPlot({ x <- rnorm(input$n) y <- rnorm(input$n) plot(x, y) }) }) The code for this example application was taken from the Shiny gallery of RStudio (http://shiny.rstudio.com/gallery/conditionalpanel-demo.html). As you can read in both code files, the defined condition, input.n, is the linchpin for the dynamic behavior of the example app. 
In the conditionalPanel() function, it is defined that inputId="n" must have a value of 50 or higher, while the input and output of the plot work as already defined. Taking advantage of the renderUI function In contrast to the previous approach, the renderUI() function is hooked into the server file to create a dynamic user interface. We have already introduced different render output functions in this article. The following example code shows the basic functionality using the ui.R file: # Partial example taken from the Shiny documentation numericInput("lat", "Latitude"), numericInput("long", "Longitude"), uiOutput("cityControls") The following example code shows the basic functionality of the related server.R file: # Partial example output$cityControls <- renderUI({ cities <- getNearestCities(input$lat, input$long) checkboxGroupInput("cities", "Choose Cities", cities) }) As described, the dynamic part of this method is defined in the renderUI() function as an output, which then gets displayed through the uiOutput() function in the ui.R file. Sharing your Shiny application with others Typically, you create a Shiny application not only for yourself, but also for other users. There are two main ways to distribute your app: either you let users download your application, or you deploy it on the web. Offering a download of your Shiny app By offering the option to download your final Shiny application, other users can run your app locally. Actually, there are four ways to deliver your app this way. No matter which way you choose, it is important that the user has R and the Shiny package installed on his/her computer. Gist Gist is a public code-sharing pasteboard from GitHub. To share your app this way, it is important that both the ui.R file and the server.R file are in the same Gist and have been named correctly. Take a look at the following screenshot: There are two options to run apps via Gist. First, just enter runGist("Gist_URL") in the console of RStudio; or second, just use the Gist ID and place it in the shiny::runGist("Gist_ID") function. Gist is a very easy way to share your application, but you need to keep in mind that your code is published on a third-party server. GitHub The next way to enable users to download your app is through a GitHub repository. To run an application from GitHub, you need to enter the command, shiny::runGitHub("Repository_Name", "GitHub_Account_Name"), in the console. Zip file There are two ways to share a Shiny application by zip file. You can either let the user download the zip file over the web, or you can share it via email, USB stick, memory card, or any other such device. To download a zip file via the web, you need to type runUrl("Zip_File_URL") in the console. Package Certainly, a much more labor-intensive but also publicly effective way is to create a complete R package for your Shiny application. This especially makes sense if you have built an extensive application that may help many other users. Another advantage is the fact that you can also publish your application on CRAN. Later in the book, we will show you how to create an R package. Deploying your app to the web After showing you the ways users can download your app and run it on their local machines, we will now check the options to deploy Shiny apps to the web. Shinyapps.io http://www.shinyapps.io/ is a Shiny app-hosting service by RStudio. 
There is a free-to-use account plan, but it is limited to a maximum of five applications and 25 so-called active hours, and the apps are branded with the RStudio logo. Nevertheless, this service is a great way to publish one's own applications quickly and easily to the web. To use http://www.shinyapps.io/ with RStudio, a few R packages and some additional operating system software need to be installed: RTools (if you use Windows) GCC (if you use Linux) Xcode Command Line Tools (if you use Mac OS X) The devtools R package The shinyapps package Since the shinyapps package is not on CRAN, you need to install it via GitHub by using the devtools package: if (!require("devtools")) install.packages("devtools") devtools::install_github("rstudio/shinyapps") library(shinyapps) When everything that is needed is installed, you are ready to publish your Shiny apps directly from the RStudio IDE. Just click on the Publish icon, and in the new window you will need to log in to your http://www.shinyapps.io/ account once, if you are using it for the first time. All other times, you can directly create a new Shiny app or update an existing app. After clicking on Publish, a new tab called Deploy opens in the console pane, showing you the progress of the deployment process. If something is set incorrectly, you can use the deployment log to find the error. When the deployment is successful, your app will be publicly reachable with its own web address on http://www.shinyapps.io/. Setting up a self-hosted Shiny server There are two editions of the Shiny Server software: an open source edition and a professional edition. The open source edition can be downloaded for free and you can use it on your own server. The Professional edition offers a lot more features and support from RStudio, but is also priced accordingly. Diving into the Shiny ecosystem Since the Shiny framework is such an awesome and powerful tool, a lot of people, and of course the creators of RStudio and Shiny, have built several packages around it that enormously extend its existing functionality. Covering the almost infinite possibilities for technical and visual individualization offered by the Shiny ecosystem would certainly go beyond the scope of this article. Therefore, we present only a few important directions to give a first impression. Creating apps with more files In this article, you have learned how to build a Shiny app consisting of two files: server.R and ui.R. For completeness, we first want to point out that it is also possible to create a single-file Shiny app. To do so, create a file called app.R. In this file, you can include both the server.R and the ui.R code. Furthermore, you can include global variables, data, and more. If you build larger Shiny apps with multiple functions, datasets, options, and more, it could be very confusing if you do all of it in just one file. Therefore, single-file Shiny apps are a good idea for simple and small exhibition apps with a minimal setup. Especially for large Shiny apps, it is recommended that you outsource extensive custom functions, datasets, images, and more into their own files, but put them into the same directory as the app. An example file setup could look like this: ~/shinyapp |-- ui.R |-- server.R |-- helpers.R |-- data |-- www |-- js |-- etc To access the helper file, you just need to add source("helpers.R") into the code of your server.R file. The same logic applies to any other R files.
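To make the single-file option mentioned above concrete, here is a minimal app.R sketch that mirrors our first application; it is only an illustration and assumes a reasonably recent shiny version that provides shinyApp():

library(shiny)

ui <- pageWithSidebar(
  headerPanel("My First Shiny App"),
  sidebarPanel(
    selectInput(inputId = "variable", label = "Variable:",
                choices = c("Horsepower" = "hp",
                            "Miles per Gallon" = "mpg",
                            "Number of Carburetors" = "carb"),
                selected = "hp")
  ),
  mainPanel(
    plotOutput("carsPlot"),
    verbatimTextOutput("carsSummary")
  )
)

server <- function(input, output) {
  output$carsPlot <- renderPlot({
    hist(mtcars[, input$variable],
         main = "Histogram of mtcars variables",
         xlab = input$variable)
  })
  output$carsSummary <- renderPrint({
    summary(mtcars[, input$variable])
  })
}

# Combines the user interface and the server logic in one file
shinyApp(ui = ui, server = server)

For anything beyond a small demonstration app, the multi-file layout described above remains easier to maintain.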
If you want to read in some data from your data folder, you store it in a variable that is also defined at the top of your server.R file, like this: myData <- readRDS("data/myDataset.rds") Expanding the Shiny package As said earlier, you can expand the functionalities of Shiny with several add-on packages. There are currently ten packages available on CRAN with different inbuilt functions to add some extra magic to your Shiny app. shinyAce: This package makes Ace editor bindings available to enable a rich text-editing environment within Shiny. shinybootstrap2: The latest Shiny package uses bootstrap 3; so, if you built your app with bootstrap 2 features, you need to use this package. shinyBS: This package adds the additional features of the original Twitter Bootstrap theme, such as tooltips, modals, and others, to Shiny. shinydashboard: This package comes from the folks at RStudio and enables the user to create stunning and multifunctional dashboards on top of Shiny. shinyFiles: This provides functionality for client-side navigation of the server-side file system in Shiny apps. shinyjs: By using this package, you can perform common JavaScript operations in Shiny applications without having to know any JavaScript. shinyRGL: This package provides Shiny wrappers for the RGL package. It exposes RGL's ability to export WebGL visualizations in a Shiny-friendly format. shinystan: This package is, in fact, not a real add-on. Shinystan is a fantastic full-blown Shiny application that gives users a graphical interface for Markov chain Monte Carlo simulations. shinythemes: This package gives you the option of changing the whole look and feel of your application by using different inbuilt bootstrap themes. shinyTree: This exposes bindings to jsTree, a JavaScript library that supports interactive trees, to enable rich, editable trees in Shiny. Of course, you can find a bunch of other packages with similar or even more functionalities, extensions, and also comprehensive Shiny apps on GitHub. Summary To learn more about Shiny, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Learning Shiny (https://www.packtpub.com/application-development/learning-shiny) Mastering Machine Learning with R (https://www.packtpub.com/big-data-and-business-intelligence/mastering-machine-learning-r) Mastering Data Analysis with R (https://www.packtpub.com/big-data-and-business-intelligence/mastering-data-analysis-r)


Training neural networks efficiently using Keras

Packt
22 Feb 2016
9 min read
In this article, we will take a look at Keras, one of the most recently developed libraries to facilitate neural network training. The development of Keras started in the early months of 2015; as of today, it has evolved into one of the most popular and widely used libraries built on top of Theano, and it allows us to utilize our GPU to accelerate neural network training. One of its prominent features is its very intuitive API, which allows us to implement neural networks in only a few lines of code. Once you have Theano installed, you can install Keras from PyPI by executing the following command from your terminal command line: pip install Keras For more information about Keras, please visit the official website at http://keras.io. To see what neural network training via Keras looks like, let's implement a multilayer perceptron to classify the handwritten digits from the MNIST dataset. The MNIST dataset can be downloaded from http://yann.lecun.com/exdb/mnist/ in four parts as listed here: train-images-idx3-ubyte.gz: These are training set images (9912422 bytes) train-labels-idx1-ubyte.gz: These are training set labels (28881 bytes) t10k-images-idx3-ubyte.gz: These are test set images (1648877 bytes) t10k-labels-idx1-ubyte.gz: These are test set labels (4542 bytes) After downloading and unzipping the archives, we place the files into a directory mnist in our current working directory, so that we can load the training as well as the test dataset using the following function: import os import struct import numpy as np def load_mnist(path, kind='train'): """Load MNIST data from `path`""" labels_path = os.path.join(path, '%s-labels-idx1-ubyte' % kind) images_path = os.path.join(path, '%s-images-idx3-ubyte' % kind) with open(labels_path, 'rb') as lbpath: magic, n = struct.unpack('>II', lbpath.read(8)) labels = np.fromfile(lbpath, dtype=np.uint8) with open(images_path, 'rb') as imgpath: magic, num, rows, cols = struct.unpack(">IIII", imgpath.read(16)) images = np.fromfile(imgpath, dtype=np.uint8).reshape(len(labels), 784) return images, labels X_train, y_train = load_mnist('mnist', kind='train') print('Rows: %d, columns: %d' % (X_train.shape[0], X_train.shape[1])) Rows: 60000, columns: 784 X_test, y_test = load_mnist('mnist', kind='t10k') print('Rows: %d, columns: %d' % (X_test.shape[0], X_test.shape[1])) Rows: 10000, columns: 784 In the following sections, we will walk through the code examples for using Keras step by step, which you can directly execute from your Python interpreter. However, if you are interested in training the neural network on your GPU, you can either put the code into a Python script, or download the respective code from the Packt Publishing website. In order to run the Python script on your GPU, execute the following command from the directory where the mnist_keras_mlp.py file is located: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python mnist_keras_mlp.py To continue with the preparation of the training data, let's cast the MNIST image array into 32-bit format: >>> import theano >>> theano.config.floatX = 'float32' >>> X_train = X_train.astype(theano.config.floatX) >>> X_test = X_test.astype(theano.config.floatX) Next, we need to convert the class labels (integers 0-9) into the one-hot format. 
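To make the one-hot format concrete before using the Keras helper, here is a small NumPy-only sketch; it is purely illustrative and not part of the training code:

import numpy as np

labels = np.array([5, 0, 4])               # original class labels
onehot = np.zeros((labels.shape[0], 10))   # one column per class 0-9
onehot[np.arange(labels.shape[0]), labels] = 1.0
print(onehot)   # row 0 has its 1 in column 5, row 1 in column 0, row 2 in column 4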
Fortunately, Keras provides a convenient tool for this: >>> from keras.utils import np_utils >>> print('First 3 labels: ', y_train[:3]) First 3 labels: [5 0 4] >>> y_train_ohe = np_utils.to_categorical(y_train) >>> print('\nFirst 3 labels (one-hot):\n', y_train_ohe[:3]) First 3 labels (one-hot): [[ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] [ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]] Now, we can get to the interesting part and implement a neural network. Compared to a plain MLP with logistic units, we will replace the logistic units in the hidden layer with hyperbolic tangent activation functions, replace the logistic function in the output layer with softmax, and add an additional hidden layer. Keras makes these tasks very simple, as you can see in the following code implementation: >>> from keras.models import Sequential >>> from keras.layers.core import Dense >>> from keras.optimizers import SGD >>> np.random.seed(1) >>> model = Sequential() >>> model.add(Dense(input_dim=X_train.shape[1], ... output_dim=50, ... init='uniform', ... activation='tanh')) >>> model.add(Dense(input_dim=50, ... output_dim=50, ... init='uniform', ... activation='tanh')) >>> model.add(Dense(input_dim=50, ... output_dim=y_train_ohe.shape[1], ... init='uniform', ... activation='softmax')) >>> sgd = SGD(lr=0.001, decay=1e-7, momentum=.9) >>> model.compile(loss='categorical_crossentropy', optimizer=sgd) First, we initialize a new model using the Sequential class to implement a feedforward neural network. Then, we can add as many layers to it as we like. However, since the first layer that we add is the input layer, we have to make sure that the input_dim attribute matches the number of features (columns) in the training set (here, 784). Also, we have to make sure that the number of output units (output_dim) and input units (input_dim) of two consecutive layers match. In the preceding example, we added two hidden layers with 50 hidden units plus 1 bias unit each. Note that bias units are initialized to 0 in fully connected networks in Keras. This is in contrast to the MLP implementation, where we initialized the bias units to 1, which is a more common (not necessarily better) convention. Finally, the number of units in the output layer should be equal to the number of unique class labels, that is, the number of columns in the one-hot encoded class label array. Before we can compile our model, we also have to define an optimizer. In the preceding example, we chose stochastic gradient descent optimization. Furthermore, we can set a decay constant to shrink the learning rate over the course of training, and a momentum term to smooth the weight updates. Lastly, we set the cost (or loss) function to categorical_crossentropy. The (binary) cross-entropy is just the technical term for the cost function in logistic regression, and the categorical cross-entropy is its generalization for multi-class predictions via softmax. After compiling the model, we can now train it by calling the fit method. Here, we are using mini-batch stochastic gradient descent with a batch size of 300 training samples per batch. We train the MLP over 50 epochs, and we can follow the optimization of the cost function during training by setting verbose=1. The validation_split parameter is especially handy, since it will reserve 10 percent of the training data (here, 6,000 samples) for validation after each epoch, so that we can check if the model is overfitting during training. >>> model.fit(X_train, ... y_train_ohe, ... nb_epoch=50, ... batch_size=300, ... verbose=1, ... validation_split=0.1, ... 
show_accuracy=True) Train on 54000 samples, validate on 6000 samples Epoch 0 54000/54000 [==============================] - 1s - loss: 2.2290 - acc: 0.3592 - val_loss: 2.1094 - val_acc: 0.5342 Epoch 1 54000/54000 [==============================] - 1s - loss: 1.8850 - acc: 0.5279 - val_loss: 1.6098 - val_acc: 0.5617 Epoch 2 54000/54000 [==============================] - 1s - loss: 1.3903 - acc: 0.5884 - val_loss: 1.1666 - val_acc: 0.6707 Epoch 3 54000/54000 [==============================] - 1s - loss: 1.0592 - acc: 0.6936 - val_loss: 0.8961 - val_acc: 0.7615 […] Epoch 49 54000/54000 [==============================] - 1s - loss: 0.1907 - acc: 0.9432 - val_loss: 0.1749 - val_acc: 0.9482 Printing the value of the cost function is extremely useful during training, since we can quickly spot whether the cost is decreasing and, if it is not, stop the algorithm early to tune the hyperparameter values. To predict the class labels, we can then use the predict_classes method to return the class labels directly as integers: >>> y_train_pred = model.predict_classes(X_train, verbose=0) >>> print('First 3 predictions: ', y_train_pred[:3]) >>> First 3 predictions: [5 0 4] Finally, let's print the model accuracy on the training and test sets: >>> train_acc = np.sum( ... y_train == y_train_pred, axis=0) / X_train.shape[0] >>> print('Training accuracy: %.2f%%' % (train_acc * 100)) Training accuracy: 94.51% >>> y_test_pred = model.predict_classes(X_test, verbose=0) >>> test_acc = np.sum(y_test == y_test_pred, ... axis=0) / X_test.shape[0] print('Test accuracy: %.2f%%' % (test_acc * 100)) Test accuracy: 94.39% Note that this is just a very simple neural network without optimized tuning parameters. If you are interested in playing more with Keras, please feel free to further tweak the learning rate, momentum, weight decay, and number of hidden units. Although Keras is a great library for implementing and experimenting with neural networks, there are many other Theano wrapper libraries that are worth mentioning. A prominent example is Pylearn2 (http://deeplearning.net/software/pylearn2/), which has been developed in the LISA lab in Montreal. Also, Lasagne (https://github.com/Lasagne/Lasagne) may be of interest to you if you prefer a more minimalistic but extensible library that offers more control over the underlying Theano code. Summary We caught a glimpse of one of the most beautiful and exciting families of algorithms in the whole machine learning field: artificial neural networks. I recommend following the work of the leading experts in this field, such as Geoff Hinton (http://www.cs.toronto.edu/~hinton/), Andrew Ng (http://www.andrewng.org), Yann LeCun (http://yann.lecun.com), Juergen Schmidhuber (http://people.idsia.ch/~juergen/), and Yoshua Bengio (http://www.iro.umontreal.ca/~bengioy), just to name a few. To learn more about machine learning, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Building Machine Learning Systems with Python (https://www.packtpub.com/big-data-and-business-intelligence/building-machine-learning-systems-python) Neural Network Programming with Java (https://www.packtpub.com/networking-and-servers/neural-network-programming-java) Resources for Article: Further resources on this subject: Python Data Analysis Utilities [article] Machine learning and Python – the Dream Team [article] Adding a Spark to R [article]


Social Media Insight Using Naive Bayes

Packt
22 Feb 2016
48 min read
Text-based datasets contain a lot of information, whether they are books, historical documents, social media, e-mail, or any of the other ways we communicate via writing. Extracting features from text-based datasets and using them for classification is a difficult problem. There are, however, some common patterns for text mining. (For more resources related to this topic, see here.) We look at disambiguating terms in social media using the Naive Bayes algorithm, which is a powerful and surprisingly simple algorithm. Naive Bayes takes a few shortcuts to properly compute the probabilities for classification, hence the term naive in the name. It can also be extended to other types of datasets quite easily and doesn't rely on numerical features. The model in this article is a baseline for text mining studies, as the process can work reasonably well for a variety of datasets. We will cover the following topics in this article: Downloading data from social network APIs Transformers for text Naive Bayes classifier Using JSON for saving and loading datasets The NLTK library for extracting features from text The F-measure for evaluation Disambiguation Text is often called an unstructured format. There is a lot of information there, but it is just there; no headings, no required format, loose syntax and other problems prohibit the easy extraction of information from text. The data is also highly connected, with lots of mentions and cross-references—just not in a format that allows us to easily extract it! We can compare the information stored in a book with that stored in a large database to see the difference. In the book, there are characters, themes, places, and lots of information. However, the book needs to be read and, more importantly, interpreted to gain this information. The database sits on your server with column names and data types. All the information is there and the level of interpretation needed is quite low. Information about the data, such as its type or meaning is called metadata, and text lacks it. A book also contains some metadata in the form of a table of contents and index but the degree is significantly lower than that of a database. One of the problems is the term disambiguation. When a person uses the word bank, is this a financial message or an environmental message (such as river bank)? This type of disambiguation is quite easy in many circumstances for humans (although there are still troubles), but much harder for computers to do. In this article, we will look at disambiguating the use of the term Python on Twitter's stream. A message on Twitter is called a tweet and is limited to 140 characters. This means there is little room for context. There isn't much metadata available although hashtags are often used to denote the topic of the tweet. When people talk about Python, they could be talking about the following things: The programming language Python Monty Python, the classic comedy group The snake Python A make of shoe called Python There can be many other things called Python. The aim of our experiment is to take a tweet mentioning Python and determine whether it is talking about the programming language, based only on the content of the tweet. Downloading data from a social network We are going to download a corpus of data from Twitter and use it to sort out spam from useful content. Twitter provides a robust API for collecting information from its servers and this API is free for small-scale usage. 
It is, however, subject to some conditions that you'll need to be aware of if you start using Twitter's data in a commercial setting. First, you'll need to sign up for a Twitter account (which is free). Go to http://twitter.com and register an account if you do not already have one. Next, you'll need to ensure that you only make a certain number of requests in a given time window. For the search API, this limit is currently 180 requests per 15-minute window. It can be tricky to ensure that you don't breach this limit, so it is highly recommended that you use a library to talk to Twitter's API. You will need a key to access Twitter's data. Go to http://twitter.com and sign in to your account. When you are logged in, go to https://apps.twitter.com/ and click on Create New App. Create a name and description for your app, along with a website address. If you don't have a website to use, insert a placeholder. Leave the Callback URL field blank for this app; we won't need it. Agree to the terms of use (if you do) and click on Create your Twitter application. Keep the resulting website open; you'll need the access keys that are on this page. Next, we need a library to talk to Twitter. There are many options; the one I like is simply called twitter, a well-established Python wrapper for Twitter's API. You can install twitter using pip3 install twitter if you are using pip to install your packages. If you are using another system, check the documentation at https://github.com/sixohsix/twitter. Create a new IPython Notebook to download the data. We will create several notebooks in this article for various different purposes, so it might be a good idea to also create a folder to keep track of them. This first notebook, ch6_get_twitter, is specifically for downloading new Twitter data. First, we import the twitter library and set our authorization tokens. The consumer key and consumer secret will be available on the Keys and Access Tokens tab on your Twitter app's page. To get the access tokens, you'll need to click on the Create my access token button, which is on the same page. Enter the keys into the appropriate places in the following code: import twitter consumer_key = "<Your Consumer Key Here>" consumer_secret = "<Your Consumer Secret Here>" access_token = "<Your Access Token Here>" access_token_secret = "<Your Access Token Secret Here>" authorization = twitter.OAuth(access_token, access_token_secret, consumer_key, consumer_secret) We are going to get our tweets from Twitter's search function. We will create a reader that connects to Twitter using our authorization, and then use that reader to perform searches. In the Notebook, we set the filename where the tweets will be stored: import os output_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "python_tweets.json") We also need the json library for saving our tweets: import json Next, create an object that can read from Twitter. We create this object with the authorization object that we set up earlier: t = twitter.Twitter(auth=authorization) We then open our output file for writing. We open it for appending; this allows us to rerun the script to obtain more tweets. We then use our Twitter connection to perform a search for the word Python. We only want the statuses that are returned for our dataset. This code takes the tweet, uses the json library to create a string representation using the dumps function, and then writes it to the file. 
It then creates a blank line under the tweet so that we can easily distinguish where one tweet starts and ends in our file: with open(output_filename, 'a') as output_file: search_results = t.search.tweets(q="python", count=100)['statuses'] for tweet in search_results: if 'text' in tweet: output_file.write(json.dumps(tweet)) output_file.write("\n\n") In the preceding loop, we also perform a check to see whether there is text in the tweet or not. Not all of the objects returned by Twitter will be actual tweets (some will be actions to delete tweets and others). The key difference is the inclusion of text as a key, which we test for. Running this for a few minutes will result in 100 tweets being added to the output file. You can keep rerunning this script to add more tweets to your dataset, keeping in mind that you may get some duplicates in the output file if you rerun it too fast (that is, before Twitter gets new tweets to return!). Loading and classifying the dataset After we have collected a set of tweets (our dataset), we need labels to perform classification. We are going to label the dataset by setting up a form in an IPython Notebook to allow us to enter the labels. The dataset we have stored is nearly in a JSON format. JSON is a format for data that doesn't impose much structure and is directly readable in JavaScript (hence the name, JavaScript Object Notation). JSON defines basic objects such as numbers, strings, lists, and dictionaries, making it a good format for storing datasets if they contain data that isn't numerical. If your dataset is fully numerical, you would save space and time using a matrix-based format like in NumPy. A key difference between our dataset and real JSON is that we included newlines between tweets. The reason for this was to allow us to easily append new tweets (the actual JSON format doesn't allow this easily). Our format is a JSON representation of a tweet, followed by a newline, followed by the next tweet, and so on. To parse it, we can use the json library, but we will have to first split the file by newlines to get the actual tweet objects themselves. Set up a new IPython Notebook (I called mine ch6_label_twitter) and enter the dataset's filename. This is the same filename in which we saved the data in the previous section. We also define the filename that we will use to save the labels to. The code is as follows: import os input_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "python_tweets.json") labels_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "python_classes.json") As stated, we will use the json library, so import that too: import json We create a list that will store the tweets we received from the file: tweets = [] We then iterate over each line in the file. We aren't interested in lines with no information (they separate the tweets for us), so check if the length of the line (minus any whitespace characters) is zero. If it is, ignore it and move to the next line. Otherwise, load the tweet using json.loads (which loads a JSON object from a string) and add it to our list of tweets. The code is as follows: with open(input_filename) as inf: for line in inf: if len(line.strip()) == 0: continue tweets.append(json.loads(line)) We are now interested in classifying whether an item is relevant to us or not (in this case, relevant means that it refers to the programming language Python). 
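To see where this labeling will lead, the following is a minimal sketch of Naive Bayes text classification using scikit-learn on a few made-up example texts; it only illustrates the algorithm and the F-measure named in the introduction and is not the pipeline developed in this article:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import f1_score

# Made-up examples: 1 = about the programming language, 0 = not
texts = ["I love coding in python every day",
         "my python script finally works",
         "saw a huge python at the zoo today",
         "monty python sketches are hilarious"]
labels = [1, 1, 0, 0]

vectorizer = CountVectorizer()               # simple bag-of-words features
X = vectorizer.fit_transform(texts)
model = MultinomialNB().fit(X, labels)

test = vectorizer.transform(["new python release for programmers"])
print(model.predict(test))                   # predicted class for the new text
print(f1_score(labels, model.predict(X)))    # F-measure on the toy training data

The classifier here is fit on toy data only; the real dataset first needs the labels we are about to collect.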
We will use the IPython Notebook's ability to embed HTML and talk between JavaScript and Python to create a viewer of tweets to allow us to easily and quickly classify the tweets as spam or not. The code will present a new tweet to the user (you) and ask for a label: is it relevant or not? It will then store the input and present the next tweet to be labeled. First, we create a list for storing the labels. These labels will be stored whether or not the given tweet refers to the programming language Python, and it will allow our classifier to learn how to differentiate between meanings. We also check if we have any labels already and load them. This helps if you need to close the notebook down midway through labeling. This code will load the labels from where you left off. It is generally a good idea to consider how to save at midpoints for tasks like this. Nothing hurts quite like losing an hour of work because your computer crashed before you saved the labels! The code is as follows: labels = [] if os.path.exists(labels_filename): with open(labels_filename) as inf: labels = json.load(inf) Next, we create a simple function that will return the next tweet that needs to be labeled. We can work out which is the next tweet by finding the first one that hasn't yet been labeled. The code is as follows: def get_next_tweet(): return tweet_sample[len(labels)]['text'] The next step in our experiment is to collect information from the user (you!) on which tweets are referring to Python (the programming language) and which are not. As of yet, there is not a good, straightforward way to get interactive feedback with pure Python in IPython Notebooks. For this reason, we will use some JavaScript and HTML to get this input from the user. Next we create some JavaScript in the IPython Notebook to run our input. Notebooks allow us to use magic functions to embed HTML and JavaScript (among other things) directly into the Notebook itself. Start a new cell with the following line at the top: %%javascript The code in here will be in JavaScript, hence the curly braces that are coming up. Don't worry, we will get back to Python soon. Keep in mind here that the following code must be in the same cell as the %%javascript magic function. The first function we will define in JavaScript shows how easy it is to talk to your Python code from JavaScript in IPython Notebooks. This function, if called, will add a label to the labels array (which is in python code). To do this, we load the IPython kernel as a JavaScript object and give it a Python command to execute. The code is as follows: function set_label(label){ var kernel = IPython.notebook.kernel; kernel.execute("labels.append(" + label + ")"); load_next_tweet(); } At the end of that function, we call the load_next_tweet function. This function loads the next tweet to be labeled. It runs on the same principle; we load the IPython kernel and give it a command to execute (calling the get_next_tweet function we defined earlier). However, in this case we want to get the result. This is a little more difficult. We need to define a callback, which is a function that is called when the data is returned. The format for defining callback is outside the scope of this book. If you are interested in more advanced JavaScript/Python integration, consult the IPython documentation. 
The code is as follows:

function load_next_tweet(){
    var code_input = "get_next_tweet()";
    var kernel = IPython.notebook.kernel;
    var callbacks = { 'iopub' : {'output' : handle_output}};
    kernel.execute(code_input, callbacks, {silent:false});
}

The callback function is called handle_output, which we will define now. This function gets called when the Python function that kernel.execute calls returns a value. As before, the full format of this is outside the scope of this book. However, for our purposes the result is returned as data of the type text/plain, which we extract and show in the #tweet_text div of the form we are going to create in the next cell. The code is as follows:

function handle_output(out){
    var res = out.content.data["text/plain"];
    $("div#tweet_text").html(res);
}

Our form will have a div that shows the next tweet to be labeled, which we will give the ID #tweet_text. We also create a textbox to enable us to capture key presses (otherwise, the Notebook will capture them and JavaScript won't do anything). This allows us to use the keyboard to set labels of 1 or 0, which is faster than using the mouse to click buttons, given that we will need to label at least 100 tweets.
Run the previous cell to embed some JavaScript into the page, although nothing will be shown to you in the results section. We are going to use a different magic function now, %%html. Unsurprisingly, this magic function allows us to directly embed HTML into our Notebook. In a new cell, start with this line:

%%html

For this cell, we will be coding in HTML and a little JavaScript. First, define a div element to store the current tweet to be labeled. I've also added some instructions for using this form. Then, create the #tweet_text div that will store the text of the next tweet to be labeled. As stated before, we also need to create a textbox to be able to capture key presses. The code is as follows:

<div name="tweetbox">
    Instructions: Click in the textbox. Enter a 1 if the tweet is relevant, enter 0 otherwise.<br>
    Tweet: <div id="tweet_text" value="text"></div><br>
    <input type="text" id="capture"></input><br>
</div>

Don't run the cell just yet!
Next, we create the JavaScript for capturing the key presses. This has to be defined after creating the form, as the #tweet_text div doesn't exist until the preceding code runs. We use the jQuery library (which IPython is already using, so we don't need to include the JavaScript file) to add a function that is called when key presses are made on the #capture textbox we defined. However, keep in mind that this is a %%html cell and not a JavaScript cell, so we need to enclose this JavaScript in <script> tags.
We are only interested in key presses if the user presses 0 or 1, in which case the relevant label is added. We can determine which key was pressed by the ASCII value stored in e.which. If the user presses 0 or 1, we append the label and clear out the textbox. The code is as follows:

<script>
$("input#capture").keypress(function(e) {
    if(e.which == 48) {
        set_label(0);
        $("input#capture").val("");
    } else if (e.which == 49){
        set_label(1);
        $("input#capture").val("");
    }
});

All other key presses are ignored. As a last bit of JavaScript for this article (I promise), we call the load_next_tweet() function. This will set the first tweet to be labeled and then close off the JavaScript. The code is as follows:

load_next_tweet();
</script>

After you run this cell, you will get an HTML textbox alongside the first tweet's text.
Click in the textbox and enter 1 if the tweet is relevant to our goal (in this case, that means the tweet is related to the programming language Python) and 0 if it is not. After you do this, the next tweet will load. Enter the label and the next one will load. This continues until the tweets run out.
When you finish all of this, simply save the labels to the output filename we defined earlier for the class values:

with open(labels_filename, 'w') as outf:
    json.dump(labels, outf)

You can call the preceding code even if you haven't finished. Any labeling you have done to that point will be saved, and running this Notebook again will pick up where you left off so that you can keep labeling your tweets. This might take a while! If you have a lot of tweets in your dataset, you'll need to classify all of them. If you are pushed for time, you can download the same dataset I used, which contains classifications.
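Before moving on, a quick sanity check on the labels can be helpful. The following short sketch (assuming the tweets and labels lists from this notebook are still in memory) reports how far through the labeling you are and how balanced the classes currently look:

from collections import Counter

# How many tweets have been labeled so far, and how many of each class?
print("Labeled {} of {} tweets".format(len(labels), len(tweets)))
print("Class balance:", Counter(labels))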
Creating a replicable dataset from Twitter
In data mining, there are lots of variables. These aren't just in the data mining algorithms; they also appear in the data collection, the environment, and many other factors. Being able to replicate your results is important, as it enables you to verify or improve upon them. Getting 80 percent accuracy on one dataset with algorithm X and 90 percent accuracy on another dataset with algorithm Y doesn't mean that Y is better. We need to be able to test on the same dataset in the same conditions to be able to compare them properly.
On running the preceding code, you will get a different dataset to the one I created and used. The main reason is that Twitter will return different search results for you than for me, based on when you performed the search. Even after that, your labeling of tweets might differ from mine. While there are obvious examples where a given tweet relates to the Python programming language, there will always be gray areas where the labeling isn't obvious. One tough gray area I ran into was tweets in non-English languages that I couldn't read. In this specific instance, there are options in Twitter's API for setting the language, but even these aren't going to be perfect.
Due to these factors, it is difficult to replicate experiments on datasets that are extracted from social media, and Twitter is no exception. Twitter explicitly disallows sharing datasets directly. One solution to this is to share only tweet IDs, which you can share freely. In this section, we will first create a tweet ID dataset that we can freely share. Then, we will see how to download the original tweets from this file to recreate the original dataset.
First, we save the replicable dataset of tweet IDs. Create another new IPython Notebook and set up the filenames. This is done in the same way we did for labeling, but there is a new filename where we can store the replicable dataset. The code is as follows:

import os
input_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "python_tweets.json")
labels_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "python_classes.json")
replicable_dataset = os.path.join(os.path.expanduser("~"), "Data", "twitter", "replicable_dataset.json")

We load the tweets and labels as we did in the previous notebook:

import json
tweets = []
with open(input_filename) as inf:
    for line in inf:
        if len(line.strip()) == 0:
            continue
        tweets.append(json.loads(line))
if os.path.exists(labels_filename):
    with open(labels_filename) as inf:
        labels = json.load(inf)

Now we create a dataset by looping over both the tweets and labels at the same time and saving them in a list:

dataset = [(tweet['id'], label) for tweet, label in zip(tweets, labels)]

Finally, we save the results in our file:

with open(replicable_dataset, 'w') as outf:
    json.dump(dataset, outf)

Now that we have the tweet IDs and labels saved, we can recreate the original dataset. If you are looking to recreate the dataset I used for this article, it can be found in the code bundle that comes with this book.
Loading the preceding dataset is not difficult, but it can take some time. Start a new IPython Notebook and set the dataset, label, and tweet ID filenames as before. I've adjusted the filenames here to ensure that you don't overwrite your previously collected dataset, but feel free to change these if you want. The code is as follows:

import os
tweet_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "replicable_python_tweets.json")
labels_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "replicable_python_classes.json")
replicable_dataset = os.path.join(os.path.expanduser("~"), "Data", "twitter", "replicable_dataset.json")

Then load the tweet IDs from the file using JSON:

import json
with open(replicable_dataset) as inf:
    tweet_ids = json.load(inf)

Extracting the labels from this dataset would be very easy: we could just iterate through it and pull them out with two lines of code (open the file and save the labels). However, we can't guarantee that we will get all the tweets we are after (for example, some may have been made private since collecting the dataset), and in that case the labels would be incorrectly indexed against the data. As an example, I tried to recreate the dataset just one day after collecting it, and already two of the tweets were missing (they might have been deleted or made private by their users). For this reason, it is important to keep only the labels for tweets that we actually recover. To do this, we first create an empty actual_labels list to store those labels, and then create a dictionary mapping the tweet IDs to the labels. The code is as follows:

actual_labels = []
label_mapping = dict(tweet_ids)

Next, we are going to create a Twitter connection to collect all of these tweets. This is going to take a little longer.
Import the twitter library that we used before, create an authorization token, and use it to create the twitter object:

import twitter
consumer_key = "<Your Consumer Key Here>"
consumer_secret = "<Your Consumer Secret Here>"
access_token = "<Your Access Token Here>"
access_token_secret = "<Your Access Token Secret Here>"
authorization = twitter.OAuth(access_token, access_token_secret, consumer_key, consumer_secret)
t = twitter.Twitter(auth=authorization)

Extract the tweet IDs into a list using the following command:

all_ids = [tweet_id for tweet_id, label in tweet_ids]

Then, we open our output file to save the tweets:

with open(tweet_filename, 'a') as output_file:

The Twitter API allows us to get 100 tweets at a time. Therefore, we iterate over each batch of 100 tweets:

    for start_index in range(0, len(tweet_ids), 100):

To search by ID, we first create a string that joins all of the IDs in this batch together:

        id_string = ",".join(str(i) for i in all_ids[start_index:start_index+100])

Next, we perform a statuses/lookup API call, which is defined by Twitter. We pass our list of IDs (which we turned into a string) into the API call in order to have those tweets returned to us:

        search_results = t.statuses.lookup(_id=id_string)

Then, for each tweet in the search results, we save it to our file in the same way we did when we were collecting the dataset originally:

        for tweet in search_results:
            if 'text' in tweet:
                output_file.write(json.dumps(tweet))
                output_file.write("\n\n")

As a final step here (and still under the preceding if block), we want to store the labeling of this tweet. We can do this using the label_mapping dictionary we created before, looking up the tweet ID. The code is as follows:

                actual_labels.append(label_mapping[tweet['id']])

Run the previous cell and the code will collect all of the tweets for you. If you created a really big dataset, this may take a while, as Twitter does rate-limit requests. As a final step, save the actual_labels to our classes file:

with open(labels_filename, 'w') as outf:
    json.dump(actual_labels, outf)
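Twitter's rate limits also apply to the statuses/lookup endpoint, so with a large ID list the loop above may be throttled. A hedged workaround is to pause briefly between batches; the pause length below is only a placeholder to tune against your own rate limit, and the loop body is otherwise the same as the one shown above:

import time

for start_index in range(0, len(all_ids), 100):
    id_string = ",".join(str(i) for i in all_ids[start_index:start_index+100])
    search_results = t.statuses.lookup(_id=id_string)
    # ... write the tweets and append the labels exactly as in the loop above ...
    time.sleep(5)  # placeholder delay between batches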
Text transformers
Now that we have our dataset, how are we going to perform data mining on it? Text-based datasets include books, essays, websites, manuscripts, programming code, and other forms of written expression. All of the algorithms we have seen so far deal with numerical or categorical features, so how do we convert our text into a format that the algorithms can deal with? There are a number of measurements that could be taken. For instance, average word and average sentence length are used to predict the readability of a document. However, there are many other feature types, such as word occurrence, which we will now investigate.
Bag-of-words
One of the simplest but highly effective models is to simply count each word in the dataset. We create a matrix where each row represents a document in our dataset and each column represents a word. The value of a cell is the frequency of that word in the document. Here's an excerpt from The Lord of the Rings by J.R.R. Tolkien:

Three Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in halls of stone,
Nine for Mortal Men, doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them.
In the Land of Mordor where the Shadows lie.
                        - J.R.R. Tolkien's epigraph to The Lord of The Rings

The word the appears nine times in this quote, while the words in, for, to, and one each appear four times. The word ring appears three times, as does the word of. We can create a dataset from this, choosing a subset of words and counting the frequency:

Word      | the | one | ring | to
Frequency |  9  |  4  |  3   |  4

We can use the Counter class to do a simple count for a given string. When counting words, it is normal to convert all letters to lowercase, which we do when creating the string. The code is as follows:

s = """Three Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in halls of stone,
Nine for Mortal Men, doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them.
In the Land of Mordor where the Shadows lie. """.lower()
words = s.split()
from collections import Counter
c = Counter(words)

Printing c.most_common(5) gives the list of the top five most frequently occurring words. Ties are not handled well, as only five entries are given and a very large number of words all share a tie for fifth place.
The bag-of-words model has three major types. The first is to use the raw frequencies, as shown in the preceding example. This has a drawback when documents vary in size from few words to many words, as the overall values will be very different. The second type is to use the normalized frequency, where each document's counts sum to 1. This is a much better solution, as the length of the document doesn't matter as much. The third type is to simply use binary features: a value of 1 if the word occurs at all and 0 if it doesn't. We will use the binary representation in this article.
Another popular (arguably more popular) method for performing normalization is called term frequency - inverse document frequency, or tf-idf. In this weighting scheme, term counts are first normalized to frequencies and then each term is down-weighted according to the number of documents in the corpus in which it appears.
There are a number of libraries for working with text data in Python. We will use a major one, called the Natural Language Toolkit (NLTK). The scikit-learn library also has the CountVectorizer class that performs a similar action, and it is recommended that you take a look at it. However, the NLTK version has more options for word tokenization. If you are doing natural language processing in Python, NLTK is a great library to use.
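To make the three bag-of-words variants concrete, the following small sketch derives raw counts, normalized frequencies, and binary features from the Counter built above (this is purely illustrative; later in the article NLTK and scikit-learn do this work for us):

total = sum(c.values())
raw_counts = dict(c)                   # e.g. raw_counts['the'] is 9 for the quote above
normalized = {word: count / total for word, count in c.items()}   # frequencies summing to 1
binary = {word: True for word in c}    # occurrence only, ignoring how often a word appears
print(raw_counts['the'], round(normalized['the'], 3), binary['the'])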
N-grams
A step up from single bag-of-words features is that of n-grams. An n-gram is a subsequence of n consecutive tokens. In this context, a word n-gram is a set of n words that appear in a row. They are counted the same way, with each n-gram treated as a "word" that is put in the bag. The value of a cell in this dataset is the frequency with which a particular n-gram appears in the given document.
The value of n is a parameter. For English, setting it to between 2 and 5 is a good start, although some applications call for higher values. As an example, for n=3, we extract the first few n-grams in the following quote:

Always look on the bright side of life.

The first n-gram (of size 3) is Always look on, the second is look on the, and the third is on the bright. As you can see, the n-grams overlap and each covers three words.
Word n-grams have advantages over using single words. This simple concept introduces some context to word use by considering its local environment, without a large overhead of understanding the language computationally.
A disadvantage of using n-grams is that the matrix becomes even sparser; a given word n-gram is unlikely to appear twice. Especially in social media and other short documents, a word n-gram is unlikely to appear across many different tweets, unless it is a retweet. In larger documents, however, word n-grams are quite effective for many applications.
Another form of n-gram for text documents is the character n-gram. Rather than using sets of words, we simply use sets of characters (although character n-grams have lots of options for how they are computed!). This type of dataset can pick up on words that are misspelled, as well as providing other benefits. We will test character n-grams in this article.
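As a concrete illustration, the following sketch extracts word and character n-grams using a plain sliding window (NLTK also ships helpers for this, but the underlying idea is just this window):

def extract_ngrams(tokens, n):
    # Slide a window of size n across the token sequence
    return [tuple(tokens[i:i+n]) for i in range(len(tokens) - n + 1)]

sentence = "Always look on the bright side of life"
print(extract_ngrams(sentence.split(), 3)[:3])
# [('Always', 'look', 'on'), ('look', 'on', 'the'), ('on', 'the', 'bright')]
print(extract_ngrams(list(sentence.lower()), 3)[:3])
# [('a', 'l', 'w'), ('l', 'w', 'a'), ('w', 'a', 'y')]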
Other features
There are other features that can be extracted too. These include syntactic features, such as the usage of particular words in sentences. Part-of-speech tags are also popular for data mining applications that need to understand meaning in text. Such feature types won't be covered in this book. If you are interested in learning more, I recommend Python 3 Text Processing with NLTK 3 Cookbook, Jacob Perkins, Packt Publishing.
Naive Bayes
Naive Bayes is a probabilistic model that is, unsurprisingly, built upon a naive interpretation of Bayesian statistics. Despite the naive aspect, the method performs very well in a large number of contexts. It can be used for classification of many different feature types and formats, but we will focus on one in this article: binary features in the bag-of-words model.
Bayes' theorem
For most of us, when we were taught statistics, we started from a frequentist approach. In this approach, we assume the data comes from some distribution and we aim to determine what the parameters are for that distribution. However, those parameters are (perhaps incorrectly) assumed to be fixed. We use our model to describe the data, even testing to ensure the data fits our model.
Bayesian statistics instead models how people (non-statisticians) actually reason: we have some data and we use that data to update our model of how likely something is to occur. In Bayesian statistics, we use the data to describe the model rather than using a model and confirming it with data (as per the frequentist approach).
Bayes' theorem computes the value of P(A|B), that is, knowing that B has occurred, what is the probability of A. In most cases, B is an observed event, such as it rained yesterday, and A is a prediction, such as it will rain today. For data mining, B is usually we observed this sample and A is it belongs to this class. We will see how to use Bayes' theorem for data mining in the next section. The equation for Bayes' theorem is:

P(A|B) = P(B|A) P(A) / P(B)

As an example, we want to determine the probability that an e-mail containing the word drugs is spam (as we believe that such a message may be pharmaceutical spam).
A, in this context, is the probability that this e-mail is spam. We can compute P(A), called the prior belief, directly from a training dataset by computing the percentage of e-mails in our dataset that are spam. If our dataset contains 30 spam messages for every 100 e-mails, P(A) is 30/100 or 0.3.
B, in this context, is this e-mail contains the word 'drugs'. Likewise, we can compute P(B) by computing the percentage of e-mails in our dataset containing the word drugs. If 10 e-mails in every 100 of our training dataset contain the word drugs, P(B) is 10/100 or 0.1. Note that we don't care whether the e-mail is spam or not when computing this value.
P(B|A) is the probability that an e-mail contains the word drugs if it is spam. It is also easy to compute from our training dataset: we look through our training set for spam e-mails and compute the percentage of them that contain the word drugs. Of our 30 spam e-mails, if 6 contain the word drugs, then P(B|A) is calculated as 6/30 or 0.2.
From here, we use Bayes' theorem to compute P(A|B), which is the probability that an e-mail containing the word drugs is spam. Using the previous equation, we see that the result is (0.2 x 0.3) / 0.1 = 0.6. This indicates that if an e-mail has the word drugs in it, there is a 60 percent chance that it is spam.
Note the empirical nature of the preceding example: we use evidence directly from our training dataset, not from some preconceived distribution. In contrast, a frequentist view would rely on us creating a distribution of the probability of words in e-mails to compute similar equations.
Naive Bayes algorithm
Looking back at our Bayes' theorem equation, we can use it to compute the probability that a given sample belongs to a given class. This allows the equation to be used as a classification algorithm.
With C as a given class and D as a sample in our dataset, we create the elements necessary for Bayes' theorem, and subsequently Naive Bayes. Naive Bayes is a classification algorithm that utilizes Bayes' theorem to compute the probability that a new data sample belongs to a particular class.
P(C) is the probability of a class, which is computed from the training dataset itself (as we did with the spam example). We simply compute the percentage of samples in our training dataset that belong to the given class.
P(D) is the probability of a given data sample. It can be difficult to compute this, as the sample is a complex interaction between different features, but luckily it is constant across all classes. Therefore, we don't need to compute it at all. We will see later how to get around this issue.
P(D|C) is the probability of the data point belonging to the class. This could also be difficult to compute due to the different features. However, this is where we introduce the naive part of the Naive Bayes algorithm. We naively assume that each feature is independent of the others. Rather than computing the full probability of P(D|C), we compute the probability of each feature D1, D2, D3, and so on, and then multiply them together:

P(D|C) = P(D1|C) x P(D2|C) x ... x P(Dn|C)

Each of these values is relatively easy to compute with binary features; we simply compute the percentage of times the feature takes that value in our training dataset. In contrast, if we were to perform a non-naive Bayes version of this part, we would need to compute the correlations between the different features for each class. Such computation is infeasible at best, and nearly impossible without vast amounts of data or adequate language analysis models.
From here, the algorithm is straightforward. We compute P(C|D) for each possible class, ignoring the P(D) term, and then choose the class with the highest probability. As the P(D) term is consistent across each of the classes, ignoring it has no impact on the final prediction.
How it works
As an example, suppose we have the following (binary) feature values from a sample in our dataset: [1, 0, 0, 1]. Our training dataset contains two classes, with 75 percent of samples belonging to class 0 and 25 percent belonging to class 1.
The likelihood of each feature having the value 1, for each class, is as follows:

For class 0: [0.3, 0.4, 0.4, 0.7]
For class 1: [0.7, 0.3, 0.4, 0.9]

These values are to be interpreted as: for feature 1, it is a 1 in 30 percent of cases for class 0.
We can now compute the probability that this sample should belong to class 0. P(C=0) = 0.75, which is the probability that the class is 0. P(D) isn't needed for the Naive Bayes algorithm. Let's take a look at the calculation:

P(D|C=0) = P(D1|C=0) x P(D2|C=0) x P(D3|C=0) x P(D4|C=0)
         = 0.3 x 0.6 x 0.6 x 0.7
         = 0.0756

The second and third values are 0.6, because the value of that feature in the sample was 0. The listed probabilities are for values of 1 for each feature. Therefore, the probability of a 0 is its inverse: P(0) = 1 - P(1).
Now, we can compute the probability of the data point belonging to this class. An important point to note is that we haven't computed P(D), so this isn't a real probability. However, it is good enough to compare against the same value for class 1. Let's take a look at the calculation:

P(C=0|D) = P(C=0) P(D|C=0)
         = 0.75 x 0.0756
         = 0.0567

Now, we compute the same values for class 1. P(C=1) = 0.25, and again P(D) isn't needed for Naive Bayes. Let's take a look at the calculation:

P(D|C=1) = P(D1|C=1) x P(D2|C=1) x P(D3|C=1) x P(D4|C=1)
         = 0.7 x 0.7 x 0.6 x 0.9
         = 0.2646

P(C=1|D) = P(C=1) P(D|C=1)
         = 0.25 x 0.2646
         = 0.06615

Normally, P(C=0|D) and P(C=1|D) should sum to 1. After all, those are the only two possible options! However, the values here do not sum to 1 because we haven't included the computation of P(D) in our equations.
The data point should be classified as belonging to class 1. You may have guessed this while going through the equations anyway; however, you may have been a bit surprised that the final decision was so close. After all, the probabilities in computing P(D|C) were much, much higher for class 1. This is because we introduced a prior belief that most samples generally belong to class 0. If the classes had been of equal size, the resulting probabilities would be much different. Try it yourself by changing both P(C=0) and P(C=1) to 0.5 for equal class sizes and computing the result again.
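To double-check the arithmetic, here is a short sketch that reproduces the worked example in code; the numbers are the ones from the example above, not values learned from data:

priors = {0: 0.75, 1: 0.25}
# Probability that each feature equals 1, per class (from the example above)
likelihood_of_one = {0: [0.3, 0.4, 0.4, 0.7],
                     1: [0.7, 0.3, 0.4, 0.9]}
sample = [1, 0, 0, 1]

def class_score(c):
    score = priors[c]
    for value, p_one in zip(sample, likelihood_of_one[c]):
        # Use P(1) when the feature is 1, and 1 - P(1) when it is 0
        score *= p_one if value == 1 else (1 - p_one)
    return score

print(class_score(0))  # approximately 0.0567
print(class_score(1))  # approximately 0.06615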
Application
We will now create a pipeline that takes a tweet and determines whether it is relevant or not, based only on the content of that tweet.
To perform the word extraction, we will be using NLTK, a library that contains a large number of tools for performing analysis on natural language. We will use NLTK in future articles as well. To get NLTK on your computer, use pip to install the package:

pip3 install nltk

If that doesn't work, see the NLTK installation instructions at www.nltk.org/install.html.
We are going to create a pipeline to extract the word features and classify the tweets using Naive Bayes. Our pipeline has the following steps:

1. Transform the original text documents into a dictionary of counts using NLTK's word_tokenize function.
2. Transform those dictionaries into a vector matrix using the DictVectorizer transformer in scikit-learn. This is necessary to enable the Naive Bayes classifier to read the feature values extracted in the first step.
3. Train the Naive Bayes classifier, as we have seen in previous articles.

We will need to create another Notebook (the last one for this article!) called ch6_classify_twitter for performing the classification.
Extracting word counts
We are going to use NLTK to extract our word counts. We still want to use it in a pipeline, but NLTK doesn't conform to our transformer interface. We will therefore need to create a basic transformer with both fit and transform methods, enabling us to use it in a pipeline.
First, set up the transformer class. We don't need to fit anything in this class, as this transformer simply extracts the words in the document. Therefore, our fit is an empty function, except that it returns self, which is necessary for transformer objects. Our transform is a little more complicated. We want to extract each word from each document and record True if it was discovered. We are only using binary features here: True if the word is in the document, False otherwise. If we wanted to use the frequency, we would set up counting dictionaries instead. Let's take a look at the code (the word_tokenize import is included here so that the class runs on its own):

from sklearn.base import TransformerMixin
from nltk import word_tokenize

class NLTKBOW(TransformerMixin):
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return [{word: True for word in word_tokenize(document)}
                for document in X]

The result is a list of dictionaries, where the first dictionary contains the words in the first tweet, and so on. Each dictionary has a word as the key and the value True to indicate that the word was discovered. Any word not in a dictionary is assumed not to have occurred in that tweet. Explicitly stating that a word's occurrence is False would also work, but it would take up needless space to store.
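A quick way to check that the transformer behaves as expected is to run it on a couple of short example strings (this assumes NLTK's tokenizer models have been downloaded, for example via nltk.download('punkt')):

bow = NLTKBOW()
sample_docs = ["I love the python programming language",
               "my python ate a mouse"]
for features in bow.transform(sample_docs):
    print(features)
# {'I': True, 'love': True, 'the': True, 'python': True, ...}
# {'my': True, 'python': True, 'ate': True, 'a': True, 'mouse': True}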
Converting dictionaries to a matrix
This step converts the dictionaries built in the previous step into a matrix that can be used with a classifier. This step is made quite simple by the DictVectorizer transformer.
The DictVectorizer class simply takes a list of dictionaries and converts them into a matrix. The features in this matrix are the keys in each of the dictionaries, and the values correspond to the occurrence of those features in each sample. Dictionaries are easy to create in code, but many data algorithm implementations prefer matrices. This makes DictVectorizer a very useful class.
In our dataset, each dictionary has words as keys, and a word only appears as a key if it actually occurs in the tweet. Therefore, our matrix will have each word as a feature and a value of True in the cell if the word occurred in the tweet. To use DictVectorizer, simply import it using the following command:

from sklearn.feature_extraction import DictVectorizer

Training the Naive Bayes classifier
Finally, we need to set up a classifier, and we are using Naive Bayes for this article. As our dataset contains only binary features, we use the BernoulliNB classifier, which is designed for binary features. As a classifier, it is very easy to use. As with DictVectorizer, we simply import it and add it to our pipeline:

from sklearn.naive_bayes import BernoulliNB

Putting it all together
Now comes the moment to put all of these pieces together. In our IPython Notebook, set the filenames and load the dataset and classes as we have done before. Set the filenames for both the tweets themselves (not the IDs!) and the labels that we assigned to them. The code is as follows:

import os
import json
input_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "python_tweets.json")
labels_filename = os.path.join(os.path.expanduser("~"), "Data", "twitter", "python_classes.json")

Load the tweets themselves. We are only interested in the content of the tweets, so we extract the text value and store only that. The code is as follows:

tweets = []
with open(input_filename) as inf:
    for line in inf:
        if len(line.strip()) == 0:
            continue
        tweets.append(json.loads(line)['text'])

Load the labels for each of the tweets:

with open(labels_filename) as inf:
    labels = json.load(inf)

Now, create a pipeline putting together the components from before. Our pipeline has three parts:

The NLTKBOW transformer we created
A DictVectorizer transformer
A BernoulliNB classifier

The code is as follows:

from sklearn.pipeline import Pipeline
pipeline = Pipeline([('bag-of-words', NLTKBOW()),
                     ('vectorizer', DictVectorizer()),
                     ('naive-bayes', BernoulliNB())
                     ])

We can nearly run our pipeline now, which we will do with cross_val_score as we have done many times before. Before that, though, we will introduce a better evaluation metric than the accuracy metric we used before. As we will see, accuracy is not adequate for datasets where the number of samples in each class differs.
Evaluation using the F1-score
When choosing an evaluation metric, it is always important to consider cases where that evaluation metric is not useful. Accuracy is a good evaluation metric in many cases, as it is easy to understand and simple to compute. However, it can be easily faked. In other words, in many cases you can create algorithms that have a high accuracy but poor utility.
While our dataset of tweets contains roughly 50 percent programming-related and 50 percent nonprogramming tweets (your results may vary), many datasets aren't as balanced as this. As an example, an e-mail spam filter may expect more than 80 percent of incoming e-mails to be spam. A spam filter that simply labels everything as spam is quite useless; however, it will obtain an accuracy of 80 percent!
To get around this problem, we can use other evaluation metrics. One of the most commonly employed is called the f1-score (also called the f-score, f-measure, or one of many other variations on this term). The f1-score is defined on a per-class basis and is based on two concepts: precision and recall. The precision is the percentage of the samples that were predicted as belonging to a specific class that actually belong to that class. The recall is the percentage of samples in the dataset that belong to a class and were actually labeled as belonging to that class.
In the case of our application, we could compute the value for both classes (relevant and not relevant). However, we are really interested in the relevant tweets. Therefore, our precision computation becomes the question: of all the tweets that were predicted as being relevant, what percentage were actually relevant? Likewise, the recall becomes the question: of all the relevant tweets in the dataset, how many were predicted as being relevant?
After you compute both the precision and recall, the f1-score is their harmonic mean:

F1 = 2 x (precision x recall) / (precision + recall)
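For example (with made-up numbers, purely for illustration): if a classifier predicts 50 tweets as relevant, 40 of which really are relevant, and the dataset contains 80 relevant tweets in total, then the precision is 40/50 = 0.8, the recall is 40/80 = 0.5, and the f1-score is about 0.615. The same calculation in code:

true_positives = 40       # predicted relevant and actually relevant
predicted_positive = 50   # everything predicted as relevant
actual_positive = 80      # everything actually relevant in the dataset

precision = true_positives / predicted_positive   # 0.8
recall = true_positives / actual_positive         # 0.5
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.615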
To use the f1-score in scikit-learn methods, simply set the scoring parameter to 'f1'. By default, this will return the f1-score of the class with the label 1. Running the code on our dataset, we simply use the following lines (the cross_val_score import is shown for completeness; in newer versions of scikit-learn it lives in sklearn.model_selection rather than sklearn.cross_validation):

from sklearn.cross_validation import cross_val_score
scores = cross_val_score(pipeline, tweets, labels, scoring='f1')

We then print out the average of the scores:

import numpy as np
print("Score: {:.3f}".format(np.mean(scores)))

The result is 0.798, which means we can accurately determine whether a tweet mentioning python relates to the programming language nearly 80 percent of the time. This is using a dataset with only 200 tweets in it. Go back and collect more data and you will find that the results improve! More data usually means better accuracy, but it is not guaranteed!
Getting useful features from models
One question you may ask is: what are the best features for determining whether a tweet is relevant or not? We can extract this information from our Naive Bayes model and find out which features are the best individually, according to Naive Bayes.
First, we fit a new model. While cross_val_score gives us a score across different folds of cross-validated testing data, it doesn't easily give us the trained models themselves. To do this, we simply fit our pipeline with the tweets, creating a new model. The code is as follows:

model = pipeline.fit(tweets, labels)

Note that we aren't really evaluating the model here, so we don't need to be as careful with the training/testing split. However, before you put these features into practice, you should evaluate them on a separate test split. We skip over that here for the sake of clarity.
A pipeline gives you access to the individual steps through the named_steps attribute and the name of the step (we defined these names ourselves when we created the pipeline object). For instance, we can get the Naive Bayes model:

nb = model.named_steps['naive-bayes']

From this model, we can extract the probabilities for each word. These are stored as log probabilities, which is simply log(P(A|f)), where f is a given feature. For BernoulliNB they are available through the feature_log_prob_ attribute, so we store them first (this line is needed for the sorting code below to run):

feature_probabilities = nb.feature_log_prob_

The reason these are stored as log probabilities is that the actual values are very low. For instance, the first value is -3.486, which corresponds to a probability of about 0.03 (around 3 percent). Logarithms of probabilities are used in computations involving small probabilities like this, as they prevent underflow errors where very small values are simply rounded to zero. Given that all of the probabilities are multiplied together, a single value of 0 would result in the whole answer always being 0! Regardless, the relationship between values is still the same: the higher the value, the more useful that feature is.
We can get the most useful features by sorting the array of log probabilities. We want descending order, so we simply negate the values first. The code is as follows:

top_features = np.argsort(-feature_probabilities[1])[:50]

The preceding code gives us just the indices, not the actual feature names, which isn't very useful on its own, so we map the feature indices to the actual values. The key is the DictVectorizer step of the pipeline, which created the matrices for us. Luckily, it also records the mapping, allowing us to find the feature names that correspond to the different columns. We can extract that part of the pipeline as follows:

dv = model.named_steps['vectorizer']

From here, we can print out the names of the top features by looking them up in the feature_names_ attribute of DictVectorizer. Enter the following lines into a new cell and run it to print out a list of the top features:

for i, feature_index in enumerate(top_features):
    print(i, dv.feature_names_[feature_index], np.exp(feature_probabilities[1][feature_index]))

The first few features include :, http, # and @. These are likely to be noise (although the use of a colon is not very common outside programming), based on the data we collected. Collecting more data is critical to smoothing out these issues.
Looking through the list, though, we get a number of more obvious programming features:

7 for 0.188679245283
11 with 0.141509433962
28 installing 0.0660377358491
29 Top 0.0660377358491
34 Developer 0.0566037735849
35 library 0.0566037735849
36 ] 0.0566037735849
37 [ 0.0566037735849
41 version 0.0471698113208
43 error 0.0471698113208

There are some others too that refer to Python in a work context, and therefore might be referring to the programming language (although freelance snake handlers may also use similar terms, they are less common on Twitter):

22 jobs 0.0660377358491
30 looking 0.0566037735849
31 Job 0.0566037735849
34 Developer 0.0566037735849
38 Freelancer 0.0471698113208
40 projects 0.0471698113208
47 We're 0.0471698113208

That last one usually appears in phrases of the form: We're looking for a candidate for this job.
Looking through these features gives us quite a few benefits. We could train people to recognize these tweets, look for commonalities (which give insight into a topic), or even get rid of features that make no sense. For example, the word RT appears quite high in this list; however, this is a common Twitter term for retweet (that is, forwarding on someone else's tweet). An expert could decide to remove this word from the list, making the classifier less prone to the noise we introduced by having a small dataset.
Summary
In this article, we looked at text mining: how to extract features from text, how to use those features, and ways of extending those features. In doing this, we looked at putting a tweet in context: was this tweet mentioning python referring to the programming language? We downloaded data from a web-based API, getting tweets from the popular microblogging website Twitter. This gave us a dataset that we labeled using a form we built directly in the IPython Notebook.
We also looked at reproducibility of experiments. While Twitter doesn't allow you to send copies of your data to others, it does allow you to send the tweets' IDs. Using this, we created code that saved the IDs and recreated most of the original dataset. Not all tweets were returned; some had been deleted in the time between when the ID list was created and when the dataset was reproduced.
We used a Naive Bayes classifier to perform our text classification. This is built upon Bayes' theorem, which uses data to update the model, unlike the frequentist method that often starts with the model first. This allows the model to incorporate new data as it arrives and to include a prior belief. In addition, the naive part allows us to easily compute the probabilities without dealing with complex correlations between features.
The features we extracted were word occurrences: did this word occur in this tweet? This model is called bag-of-words. While it discards information about where a word was used, it still achieves a high accuracy on many datasets. The entire pipeline of using the bag-of-words model with Naive Bayes is quite robust. You will find that it can achieve quite good scores on most text-based tasks, and it is a great baseline to have before you try more advanced models. As another advantage, the Naive Bayes classifier doesn't have any parameters that need to be set (although there are some if you wish to do some tinkering).
In the next article, we will look at extracting features from another type of data, graphs, in order to make recommendations on who to follow on social media.
Dynamic Graphics

Packt
22 Feb 2016
64 min read
There is no question that the rendering system of modern graphics devices is complicated. Even rendering a single triangle to the screen engages many of these components, since GPUs are designed for large amounts of parallelism, as opposed to CPUs, which are designed to handle virtually any computational scenario. Modern graphics rendering is a high-speed dance of processing and memory management that spans software, hardware, multiple memory spaces, multiple languages, multiple processors, multiple processor types, and a large number of special-case features that can be thrown into the mix. To make matters worse, every graphics situation we will come across is different in its own way. Running the same application against a different device, even by the same manufacturer, often results in an apples-versus-oranges comparison due to the different capabilities and functionality they provide. It can be difficult to determine where a bottleneck resides within such a complex chain of devices and systems, and it can take a lifetime of industry work in 3D graphics to have a strong intuition about the source of performance issues in modern graphics systems. Thankfully, Profiling comes to the rescue once again. If we can gather data about each component, use multiple performance metrics for comparison, and tweak our Scenes to see how different graphics features affect their behavior, then we should have sufficient evidence to find the root cause of the issue and make appropriate changes. So in this article, you will learn how to gather the right data, dig just deep enough into the graphics system to find the true source of the problem, and explore various solutions to work around a given problem. There are many more topics to cover when it comes to improving rendering performance, so in this article we will begin with some general techniques on how to determine whether our rendering is limited by the CPU or by the GPU, and what we can do about either case. We will discuss optimization techniques such as Occlusion Culling and Level of Detail (LOD) and provide some useful advice on Shader optimization, as well as large-scale rendering features such as lighting and shadows. Finally, since mobile devices are a common target for Unity projects, we will also cover some techniques that may help improve performance on limited hardware. (For more resources related to this topic, see here.) Profiling rendering issues Poor rendering performance can manifest itself in a number of ways, depending on whether the device is CPU-bound, or GPU-bound; in the latter case, the root cause could originate from a number of places within the graphics pipeline. This can make the investigatory stage rather involved, but once the source of the bottleneck is discovered and the problem is resolved, we can expect significant improvements as small fixes tend to reap big rewards when it comes to the rendering subsystem. The CPU sends rendering instructions through the graphics API, that funnel through the hardware driver to the GPU device, which results in commands entering the GPU's Command Buffer. These commands are processed by the massively parallel GPU system one by one until the buffer is empty. But there are a lot more nuances involved in this process. 
The following shows a (greatly simplified) diagram of a typical GPU pipeline (which can vary based on technology and various optimizations), and the broad rendering steps that take place during each stage: The top row represents the work that takes place on the CPU, the act of calling into the graphics API, through the hardware driver, and pushing commands into the GPU. Ergo, a CPU-bound application will be primarily limited by the complexity, or sheer number, of graphics API calls. Meanwhile, a GPU-bound application will be limited by the GPU's ability to process those calls, and empty the Command Buffer in a reasonable timeframe to allow for the intended frame rate. This is represented in the next two rows, showing the steps taking place in the GPU. But, because of the device's complexity, they are often simplified into two different sections: the front end and the back end. The front end refers to the part of the rendering process where the GPU has received mesh data, a draw call has been issued, and all of the information that was fed into the GPU is used to transform vertices and run through Vertex Shaders. Finally, the rasterizer generates a batch of fragments to be processed in the back end. The back end refers to the remainder of the GPU's processing stages, where fragments have been generated, and now they must be tested, manipulated, and drawn via Fragment Shaders onto the frame buffer in the form of pixels. Note that "Fragment Shader" is the more technically accurate term for Pixel Shaders. Fragments are generated by the rasterization stage, and only technically become pixels once they've been processed by the Shader and drawn to the Frame Buffer. There are a number of different approaches we can use to determine where the root cause of a graphics rendering issue lies: Profiling the GPU with the Profiler Examining individual frames with the Frame Debugger Brute Force Culling GPU profiling Because graphics rendering involves both the CPU and GPU, we must examine the problem using both the CPU Usage and GPU Usage areas of the Profiler as this can tell us which component is working hardest. For example, the following screenshot shows the Profiler data for a CPU-bound application. The test involved creating thousands of simple objects, with no batching techniques taking place. This resulted in an extremely large Draw Call count (around 15,000) for the CPU to process, but giving the GPU relatively little work to do due to the simplicity of the objects being rendered: This example shows that the CPU's "rendering" task is consuming a large amount of cycles (around 30 ms per frame), while the GPU is only processing for less than 16 ms, indicating that the bottleneck resides in the CPU. Meanwhile, Profiling a GPU-bound application via the Profiler is a little trickier. This time, the test involves creating a small number of high polycount objects (for a low Draw Call per vertex ratio), with dozens of real-time point lights and an excessively complex Shader with a texture, normal texture, heightmap, emission map, occlusion map, and so on, (for a high workload per pixel ratio). The following screenshot shows Profiler data for the example Scene when it is run in a standalone application: As we can see, the rendering task of the CPU Usage area matches closely with the total rendering costs of the GPU Usage area. We can also see that the CPU and GPU time costs at the bottom of the image are relatively similar (41.48 ms versus 38.95 ms). 
This is very unintuitive as we would expect the GPU to be working much harder than the CPU. Be aware that the CPU/GPU millisecond cost values are not calculated or revealed unless the appropriate Usage Area has been added to the Profiler window. However, let's see what happens when we test the same exact Scene through the Editor: This is a better representation of what we would expect to see in a GPU-bound application. We can see how the CPU and GPU time costs at the bottom are closer to what we would expect to see (2.74 ms vs 64.82 ms). However, this data is highly polluted. The spikes in the CPU and GPU Usage areas are the result of the Profiler Window UI updating during testing, and the overhead cost of running through the Editor is also artificially increasing the total GPU time cost. It is unclear what causes the data to be treated this way, and this could certainly change in the future if enhancements are made to the Profiler in future versions of Unity, but it is useful to know this drawback. Trying to determine whether our application is truly GPU-bound is perhaps the only good excuse to perform a Profiler test through the Editor. The Frame Debugger A new feature in Unity 5 is the Frame Debugger, a debugging tool that can reveal how the Scene is rendered and pieced together, one Draw Call at a time. We can click through the list of Draw Calls and observe how the Scene is rendered up to that point in time. It also provides a lot of useful details for the selected Draw Call, such as the current render target (for example, the shadow map, the camera depth texture, the main camera, or other custom render targets), what the Draw Call did (drawing a mesh, drawing a static batch, drawing depth shadows, and so on), and what settings were used (texture data, vertex colors, baked lightmaps, directional lighting, and so on). The following screenshot shows a Scene that is only being partially rendered due to the currently selected Draw Call within the Frame Debugger. Note the shadows that are visible from baked lightmaps that were rendered during an earlier pass before the object itself is rendered: If we are bound by Draw Calls, then this tool can be effective in helping us figure out what the Draw Calls are being spent on, and determine whether there are any unnecessary Draw Calls that are not having an effect on the scene. This can help us come up with ways to reduce them, such as removing unnecessary objects or batching them somehow. We can also use this tool to observe how many additional Draw Calls are consumed by rendering features, such as shadows, transparent objects, and many more. This could help us, when we're creating multiple quality levels for our game, to decide what features to enable/disable under the low, medium, and high quality settings. Brute force testing If we're poring over our Profiling data, and we're still not sure we can determine the source of the problem, we can always try the brute force method: cull a specific activity from the Scene and see if it results in greatly increased performance. If a small change results in a big speed improvement, then we have a strong clue about where the bottleneck lies. There's no harm in this approach if we eliminate enough unknown variables to be sure the data is leading us in the right direction. We will cover different ways to brute force test a particular issue in each of the upcoming sections. 
CPU-bound If our application is CPU-bound, then we will observe a generally poor FPS value within the CPU Usage area of the Profiler window due to the rendering task. However, if VSync is enabled the data will often get muddied up with large spikes representing pauses as the CPU waits for the screen refresh rate to come around before pushing the current frame buffer. So, we should make sure to disable the VSync block in the CPU Usage area before deciding the CPU is the problem. Brute-forcing a test for CPU-bounding can be achieved by reducing Draw Calls. This is a little unintuitive since, presumably, we've already been reducing our Draw Calls to a minimum through techniques such as Static and Dynamic Batching, Atlasing, and so forth. This would mean we have very limited scope for reducing them further. What we can do, however, is disable the Draw-Call-saving features such as batching and observe if the situation gets significantly worse than it already is. If so, then we have evidence that we're either already, or very close to being, CPU-bound. At this point, we should see whether we can re-enable these features and disable rendering for a few choice objects (preferably those with low complexity to reduce Draw Calls without over-simplifying the rendering of our scene). If this results in a significant performance improvement then, unless we can find further opportunities for batching and mesh combining, we may be faced with the unfortunate option of removing objects from our scene as the only means of becoming performant again. There are some additional opportunities for Draw Call reduction, including Occlusion Culling, tweaking our Lighting and Shadowing, and modifying our Shaders. These will be explained in the following sections. However, Unity's rendering system can be multithreaded, depending on the targeted platform, which version of Unity we're running, and various settings, and this can affect how the graphics subsystem is being bottlenecked by the CPU, and slightly changes the definition of what being CPU-bound means. Multithreaded rendering Multithreaded rendering was first introduced in Unity v3.5 in February 2012, and enabled by default on multicore systems that could handle the workload; at the time, this was only PC, Mac, and Xbox 360. Gradually, more devices were added to this list, and since Unity v5.0, all major platforms now enable multithreaded rendering by default (and possibly some builds of Unity 4). Mobile devices were also starting to feature more powerful CPUs that could support this feature. Android multithreaded rendering (introduced in Unity v4.3) can be enabled through a checkbox under Platform Settings | Other Settings | Multithreaded Rendering. Multithreaded rendering on iOS can be enabled by configuring the application to make use of the Apple Metal API (introduced in Unity v4.6.3), under Player Settings | Other Settings | Graphics API. When multithreaded rendering is enabled, tasks that must go through the rendering API (OpenGL, DirectX, or Metal), are handed over from the main thread to a "worker thread". The worker thread's purpose is to undertake the heavy workload that it takes to push rendering commands through the graphics API and driver, to get the rendering instructions into the GPU's Command Buffer. This can save an enormous number of CPU cycles for the main thread, where the overwhelming majority of other CPU tasks take place. This means that we free up extra cycles for the majority of the engine to process physics, script code, and so on. 
Incidentally, the mechanism by which the main thread notifies the worker thread of tasks operates in a very similar way to the Command Buffer that exists on the GPU, except that the commands are much more high-level, with instructions like "render this object, with this Material, using this Shader", or "draw N instances of this piece of procedural geometry", and so on. This feature has been exposed in Unity 5 to allow developers to take direct control of the rendering subsystem from C# code. This customization is not as powerful as having direct API access, but it is a step in the right direction for Unity developers to implement unique graphical effects. Confusingly, the Unity API name for this feature is called "CommandBuffer", so be sure not to confuse it with the GPU's Command Buffer. Check the Unity documentation on CommandBuffer to make use of this feature: http://docs.unity3d.com/ScriptReference/Rendering.CommandBuffer.html. Getting back to the task at hand, when we discuss the topic of being CPU-bound in graphics rendering, we need to keep in mind whether or not the multithreaded renderer is being used, since the actual root cause of the problem will be slightly different depending on whether this feature is enabled or not. In single-threaded rendering, where all graphics API calls are handled by the main thread, and in an ideal world where both components are running at maximum capacity, our application would become bottlenecked on the CPU when 50 percent or more of the time per frame is spent handling graphics API calls. However, resolving these bottlenecks can be accomplished by freeing up work from the main thread. For example, we might find that greatly reducing the amount of work taking place in our AI subsystem will improve our rendering significantly because we've freed up more CPU cycles to handle the graphics API calls. But, when multithreaded rendering is taking place, this task is pushed onto the worker thread, which means the same thread isn't being asked to manage both engine work and graphics API calls at the same time. These processes are mostly independent, and even though additional work must still take place in the main thread to send instructions to the worker thread in the first place (via the internal CommandBuffer system), it is mostly negligible. This means that reducing the workload in the main thread will have little-to-no effect on rendering performance. Note that being GPU-bound is the same regardless of whether multithreaded rendering is taking place. GPU Skinning While we're on the subject of CPU-bounding, one task that can help reduce CPU workload, at the expense of additional GPU workload, is GPU Skinning. Skinning is the process where mesh vertices are transformed based on the current location of their animated bones. The animation system, working on the CPU, only transforms the bones, but another step in the rendering process must take care of the vertex transformations to place the vertices around those bones, performing a weighted average over the bones connected to those vertices. This vertex processing task can either take place on the CPU or within the front end of the GPU, depending on whether the GPU Skinning option is enabled. This feature can be toggled under Edit | Project Settings | Player Settings | Other Settings | GPU Skinning. Front end bottlenecks It is not uncommon to use a mesh that contains a lot of unnecessary UV and Normal vector data, so our meshes should be double-checked for this kind of superfluous fluff. 
We should also let Unity optimize the structure for us, which minimizes cache misses as vertex data is read within the front end. We will also learn some useful Shader optimization techniques shortly, when we begin to discuss back end optimizations, since many optimization techniques apply to both Fragment and Vertex Shaders. The only attack vector left to cover is finding ways to reduce actual vertex counts. The obvious solutions are simplification and culling; either have the art team replace problematic meshes with lower polycount versions, and/or remove some objects from the scene to reduce the overall polygon count. If these approaches have already been explored, then the last approach we can take is to find some kind of middle ground between the two. Level Of Detail Since it can be difficult to tell the difference between a high quality distant object and a low quality one, there is very little reason to render the high quality version. So, why not dynamically replace distant objects with something more simplified? Level Of Detail (LOD) is a broad term referring to the dynamic replacement of features based on their distance or form factor relative to the camera. The most common implementation is mesh-based LOD: dynamically replacing a mesh with lower and lower detailed versions as the camera gets farther and farther away. Another example might be replacing animated characters with versions featuring fewer bones, or less sampling for distant objects, in order to reduce animation workload. The built-in LOD feature is available in the Unity 4 Pro Edition and all editions of Unity 5. However, it is entirely possible to implement it via Script code in Unity 4 Free Edition if desired. Making use of LOD can be achieved by placing multiple objects in the Scene and making them children of a GameObject with an attached LODGroup component. The LODGroup's purpose is to generate a bounding box from these objects, and decide which object should be rendered based on the size of the bounding box within the camera's field of view. If the object's bounding box consumes a large area of the current view, then it will enable the mesh(es) assigned to lower LOD groups, and if the bounding box is very small, it will replace the mesh(es) with those from higher LOD groups. If the mesh is too far away, it can be configured to hide all child objects. So, with the proper setup, we can have Unity replace meshes with simpler alternatives, or cull them entirely, which eases the burden on the rendering process. Check the Unity documentation for more detailed information on the LOD feature: http://docs.unity3d.com/Manual/LevelOfDetail.html. This feature can cost us a large amount of development time to fully implement; artists must generate lower polygon count versions of the same object, and level designers must generate LOD groups, configure them, and test them to ensure they don't cause jarring transitions as the camera moves closer or farther away. It also costs us in memory and runtime CPU; the alternative meshes need to be kept in memory, and the LODGroup component must routinely test whether the camera has moved to a new position that warrants a change in LOD level. In this era of graphics card capabilities, vertex processing is often the least of our concerns. Combined with the additional sacrifices needed for LOD to function, developers should avoid preoptimizing by automatically assuming LOD will help them.
Excessive use of the feature will lead to burdening other parts of our application's performance, and chew up precious development time, all for the sake of paranoia. If it hasn't been proven to be a problem, then it's probably not a problem! Scenes that feature large, expansive views of the world, and lots of camera movement, should consider implementing this technique very early, as the added distance and massive number of visible objects will exacerbate the vertex count enormously. Scenes that are always indoors, or feature a camera with a viewpoint looking down at the world (real-time strategy and MOBA games, for example) should probably steer clear of implementing LOD from the beginning. Games somewhere between the two should avoid it until necessary. It all depends on how many vertices are expected to be visible at any given time and how much variability in camera distance there will be. Note that some game development middleware companies offer third-party tools for automated LOD mesh generation. These might be worth investigating to compare their ease of use versus quality loss versus cost effectiveness. Disable GPU Skinning As previously mentioned, we could enable GPU Skinning to reduce the burden on a CPU-bound application, but enabling this feature will push the same workload into the front end of the GPU. Since Skinning is one of those "embarrassingly parallel" processes that fits well with the GPU's parallel architecture, it is often a good idea to perform the task on the GPU. But this task can chew up precious time in the front end preparing the vertices for fragment generation, so disabling it is another option we can explore if we're bottlenecked in this area. Again, this feature can be toggled under Edit | Project Settings | Player Settings | Other Settings | GPU Skinning. GPU Skinning is available in Unity 4 Pro Edition, and all editions of Unity 5. Reduce tessellation There is one last task that takes place in the front end process and that we need to consider: tessellation. Tessellation through Geometry Shaders can be a lot of fun, as it is a relatively underused technique that can really make our graphical effects stand out from the crowd of games that only use the most common effects. But, it can contribute enormously to the amount of processing work taking place in the front end. There are no simple tricks we can exploit to improve tessellation, besides improving our tessellation algorithms, or easing the burden caused by other front end tasks to give our tessellation tasks more room to breathe. Either way, if we have a bottleneck in the front end and are making use of tessellation techniques, we should double-check that they are not consuming the lion's share of the front end's budget. Back end bottlenecks The back end is the more interesting part of the GPU pipeline, as many more graphical effects take place during this stage. Consequently, it is the stage that is significantly more likely to suffer from bottlenecks. There are two brute force tests we can attempt: Reduce resolution Reduce texture quality These changes will ease the workload during two important stages at the back end of the pipeline: fill rate and memory bandwidth, respectively. Fill rate tends to be the most common source of bottlenecks in the modern era of graphics rendering, so we will cover it first. Fill rate By reducing screen resolution, we have asked the rasterization system to generate significantly fewer fragments and transpose them over a smaller canvas of pixels. 
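This brute-force test can also be wired up from script so that we can toggle it mid-game. The following is a minimal sketch; the key bindings and the target resolutions are hypothetical:

using UnityEngine;

// Hypothetical brute-force fill rate test: drop the output resolution at runtime.
// If the frame rate improves significantly, fill rate is a likely suspect.
public class FillRateTest : MonoBehaviour
{
    void Update()
    {
        if (Input.GetKeyDown(KeyCode.F1))
        {
            Screen.SetResolution(800, 600, Screen.fullScreen);    // low resolution
        }
        if (Input.GetKeyDown(KeyCode.F2))
        {
            Screen.SetResolution(2560, 1440, Screen.fullScreen);  // restore high resolution
        }
    }
}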
This will reduce the fill rate consumption of the application, giving a key part of the rendering pipeline some additional breathing room. Ergo, if performance suddenly improves with a screen resolution reduction, then fill rate should be our primary concern. Fill rate is a very broad term referring to the speed at which the GPU can draw fragments. But, this only includes fragments that have survived all of the various conditional tests we might have enabled within the given Shader. A fragment is merely a "potential pixel," and if it fails any of the enabled tests, then it is immediately discarded. This can be an enormous performance-saver as the pipeline can skip the costly drawing step and begin work on the next fragment instead. One such example is Z-testing, which checks whether the fragment from a closer object has already been drawn to the same pixel. If so, then the current fragment is discarded. If not, then the fragment is pushed through the Fragment Shader and drawn over the target pixel, which consumes exactly one draw from our fill rate. Now imagine multiplying this process across thousands of overlapping objects, each generating hundreds or thousands of potential fragments at high screen resolutions, causing millions, or even billions, of fragments to be generated each and every frame. It should be fairly obvious that skipping as many of these draws as we can will result in big rendering cost savings. Graphics card manufacturers typically advertise a particular fill rate as a feature of the card, usually in the form of gigapixels per second, but this is a bit of a misnomer, as it would be more accurate to call it gigafragments per second; however, this argument is mostly academic. Either way, larger values tell us that the device can potentially push more fragments through the pipeline, so with a budget of 30 GPix/s and a target frame rate of 60 Hz, we can afford to process 30,000,000,000/60 = 500 million fragments per frame before being bottlenecked on fill rate. With a resolution of 2560x1440, and a best-case scenario where each pixel is only drawn over once, then we could theoretically draw the entire scene about 135 times without any noticeable problems. Sadly, this is not a perfect world, and unless we take significant steps to avoid it, we will always end up with some amount of redraw over the same pixels due to the order in which objects are rendered. This is known as overdraw, and it can be very costly if we're not careful. The reason that resolution is a good attack vector to check for fill rate bounding is that it is a multiplier. A reduction from a resolution of 2560x1440 to 800x600 is an improvement factor of about eight, which could reduce fill rate costs enough to make the application perform well again. Overdraw Determining how much overdraw we have can be represented visually by rendering all objects with additive alpha blending and a very transparent flat color. Areas of high overdraw will show up more brightly as the same pixel is drawn over with additive blending multiple times. This is precisely how the Scene view's Overdraw shading mode reveals how much overdraw our scene is suffering. The following screenshot shows a scene with several thousand boxes drawn normally, and drawn using the Scene view's Overdraw shading mode: At the end of the day, fill rate is provided as a means of gauging the best-case behavior. In other words, it's primarily a marketing term and mostly theoretical.
But, the technical side of the industry has adopted the term as a way of describing the back end of the pipeline: the stage where fragment data is funneled through our Shaders and drawn to the screen. If every fragment required an absolute minimum level of processing (such as a Shader that returned a constant color), then we might get close to that theoretical maximum. The GPU is a complex beast, however, and things are never so simple. The nature of the device means it works best when given many small tasks to perform. But, if the tasks get too large, then fill rate is lost due to the back end not being able to push through enough fragments in time and the rest of the pipeline is left waiting for tasks to do. There are several more features that can potentially consume our theoretical fill rate maximum, including but not limited to alpha testing, alpha blending, texture sampling, the amount of fragment data being pulled through our Shaders, and even the color format of the target render texture (the final Frame Buffer in most cases). The bad news is that this gives us a lot of subsections to cover, and a lot of ways to break the process, but the good news is it gives us a lot of avenues to explore to improve our fill rate usage. Occlusion Culling One of the best ways to reduce overdraw is to make use of Unity's Occlusion Culling system. The system works by partitioning Scene space into a series of cells and flying through the world with a virtual camera making note of which cells are invisible from other cells (are occluded) based on the size and position of the objects present. Note that this is different to the technique of Frustum Culling, which culls objects not visible from the current camera view. This feature is always active in all versions, and objects culled by this process are automatically ignored by the Occlusion Culling system. Occlusion Culling is available in the Unity 4 Pro Edition and all editions of Unity 5. Occlusion Culling data can only be generated for objects properly labeled Occluder Static and Occludee Static under the StaticFlags dropdown. Occluder Static is the general setting for static objects where we want it to hide other objects, and be hidden by large objects in its way. Occludee Static is a special case for transparent objects that allows objects behind them to be rendered, but we want them to be hidden if something large blocks their visibility. Naturally, because one of the static flags must be enabled for Occlusion Culling, this feature will not work for dynamic objects. The following screenshot shows how effective Occlusion Culling can be at reducing the number of visible objects in our Scene: This feature will cost us in both application footprint and incur some runtime costs. It will cost RAM to keep the Occlusion Culling data structure in memory, and there will be a CPU processing cost to determine which objects are being occluded in each frame. The Occlusion Culling data structure must be properly configured to create cells of the appropriate size for our Scene, and the smaller the cells, the longer it takes to generate the data structure. But, if it is configured correctly for the Scene, Occlusion Culling can provide both fill rate savings through reduced overdraw, and Draw Call savings by culling non-visible objects. Shader optimization Shaders can be a significant fill rate consumer, depending on their complexity, how much texture sampling takes place, how many mathematical functions are used, and so on. 
Shaders do not directly consume fill rate, but do so indirectly because the GPU must calculate or fetch data from memory during Shader processing. The GPU's parallel nature means any bottleneck in a thread will limit how many fragments can be pushed into the thread at a later date, but parallelizing the task (sharing small pieces of the job between several agents) provides a net gain over serial processing (one agent handling each task one after another). The classic example is a vehicle assembly line. A complete vehicle requires multiple stages of manufacture to complete. The critical path to completion might involve five steps: stamping, welding, painting, assembly, and inspection, and each step is completed by a single team. For any given vehicle, no stage can begin before the previous one is finished, but whatever team handled the stamping for the last vehicle can begin stamping for the next vehicle as soon as it has finished. This organization allows each team to become masters of their particular domain, rather than trying to spread their knowledge too thin, which would likely result in less consistent quality in the batch of vehicles. We can double the overall output by doubling the number of teams, but if any team gets blocked, then precious time is lost for any given vehicle, as well as all future vehicles that would pass through the same team. If these delays are rare, then they can be negligible in the grand scheme, but if not, and one stage takes several minutes longer than normal each and every time it must complete the task, then it can become a bottleneck that threatens the release of the entire batch. The GPU parallel processors work in a similar way: each processor thread is an assembly line, each processing stage is a team, and each fragment is a vehicle. If the thread spends a long time processing a single stage, then time is lost on each fragment. This delay will multiply such that all future fragments coming through the same thread will be delayed. This is a bit of an oversimplification, but it often helps to paint a picture of how poorly optimized Shader code can chew up our fill rate, and how small improvements in Shader optimization provide big benefits in back end performance. Shader programming and optimization have become a very niche area of game development. Their abstract and highly-specialized nature requires a very different kind of thinking to generate Shader code compared to gameplay and engine code. They often feature mathematical tricks and back-door mechanisms for pulling data into the Shader, such as precomputing values in texture files. Because of this, and the importance of optimization, Shaders tend to be very difficult to read and reverse-engineer. Consequently, many developers rely on prewritten Shaders, or visual Shader creation tools from the Asset Store such as Shader Forge or Shader Sandwich. This simplifies the act of initial Shader code generation, but might not result in the most efficient form of Shaders. If we're relying on pre-written Shaders or tools, we might find it worthwhile to perform some optimization passes over them using some tried-and-true techniques. So, let's focus on some easily reachable ways of optimizing our Shaders. Consider using Shaders intended for mobile platforms The built-in mobile Shaders in Unity do not have any specific restrictions that force them to only be used on mobile devices. They are simply optimized for minimum resource usage (and tend to feature some of the other optimizations listed in this section). 
Desktop applications are perfectly capable of using these Shaders, but they tend to feature a loss of graphical quality. It only becomes a question of whether the loss of graphical quality is acceptable. So, consider doing some testing with the mobile equivalents of common Shaders to see whether they are a good fit for your game. Use small data types GPUs can calculate with smaller data types more quickly than larger types (particularly on mobile platforms!), so the first tweak we can attempt is replacing our float data types (32-bit, floating point) with smaller versions such as half (16-bit, floating point), or even fixed (12-bit, fixed point). The size of the data types listed above will vary depending on what floating point formats the target platform prefers. The sizes listed are the most common. The importance for optimization is in the relative size between formats. Color values are good candidates for precision reduction, as we can often get away with less precise color values without any noticeable loss in coloration. However, the effects of reducing precision can be very unpredictable for graphical calculations. So, changes such as these can require some testing to verify whether the reduced precision is costing too much graphical fidelity. Note that the effects of these tweaks can vary enormously between one GPU architecture and another (for example, AMD versus Nvidia versus Intel), and even GPU brands from the same manufacturer. In some cases, we can make some decent performance gains for a trivial amount of effort. In other cases, we might see no benefit at all. Avoid changing precision while swizzling Swizzling is the Shader programming technique of creating a new vector (an array of values) from an existing vector by listing the components in the order in which we wish to copy them into the new structure. Here are some examples of swizzling:

float4 input = float4(1.0, 2.0, 3.0, 4.0); // initial test value
float2 val1 = input.yz;   // swizzle two components
float3 val2 = input.zyx;  // swizzle three components in a different order
float3 val3 = input.yyy;  // swizzle the same component multiple times
float sclr = input.w;
float3 val4 = sclr.xxx;   // swizzle a scalar multiple times

We can use both the xyzw and rgba representations to refer to the same components, sequentially. It does not matter whether it is a color or vector; they just make the Shader code easier to read. We can also list components in any order we like to fill in the desired data, repeating them if necessary. Converting from one precision type to another in a Shader can be a costly operation, but converting the precision type while simultaneously swizzling can be particularly painful. If we have mathematical operations that rely on being swizzled into different precision types, it would be wiser if we simply absorbed the high-precision cost from the very beginning, or reduced precision across the board to avoid the need for changes in precision. Use GPU-optimized helper functions The Shader compiler often performs a good job of reducing mathematical calculations down to an optimized version for the GPU, but compiled custom code is unlikely to be as effective as both the Cg library's built-in helper functions and the additional helpers provided by the Unity Cg include files. If we are using Shaders that include custom function code, perhaps we can find an equivalent helper function within the Cg or Unity libraries that can do a better job than our custom code can.
These extra include files can be added to our Shader within the CGPROGRAM block like so:

CGPROGRAM
// other includes
#include "UnityCG.cginc"
// Shader code here
ENDCG

Example Cg library functions to use are abs() for absolute values, lerp() for linear interpolation, mul() for multiplying matrices, and step() for step functionality. Useful UnityCG.cginc functions include WorldSpaceViewDir() for calculating the direction towards the camera, and Luminance() for converting a color to grayscale. Check the following URL for a full list of Cg standard library functions: http://http.developer.nvidia.com/CgTutorial/cg_tutorial_appendix_e.html. Check the Unity documentation for a complete and up-to-date list of possible include files and their accompanying helper functions: http://docs.unity3d.com/Manual/SL-BuiltinIncludes.html. Disable unnecessary features Perhaps we can make savings by simply disabling Shader features that aren't vital. Does the Shader really need multiple passes, transparency, Z-writing, alpha-testing, and/or alpha blending? Will tweaking these settings or removing these features give us a good approximation of our desired effect without losing too much graphical fidelity? Making such changes is a good way of making fill rate cost savings. Remove unnecessary input data Sometimes the process of writing a Shader involves a lot of back and forth experimentation in editing code and viewing it in the Scene. The typical result of this is that input data that was needed when the Shader was going through early development is now surplus fluff once the desired effect has been obtained, and it's easy to forget what changes were made when/if the process drags on for a long time. But, these redundant data values can cost the GPU valuable time as they must be fetched from memory even if they are not explicitly used by the Shader. So, we should double check our Shaders to ensure all of their input geometry, vertex, and fragment data is actually being used. Only expose necessary variables Exposing unnecessary variables from our Shader to the accompanying Material(s) can be costly as the GPU can't assume these values are constant. This means the Shader code cannot be compiled into a more optimized form. This data must be pushed from the CPU with every pass since they can be modified at any time through the Material's methods such as SetColor(), SetFloat(), and so on. If we find that, towards the end of the project, we always use the same value for these variables, then they can be replaced with a constant in the Shader to remove such excess runtime workload. The only cost is obfuscating what could be critical graphical effect parameters, so this should be done very late in the process. Reduce mathematical complexity Complicated mathematics can severely bottleneck the rendering process, so we should do whatever we can to limit the damage. Complex mathematical functions could be replaced with a texture that is fed into the Shader and provides a pre-generated table for runtime lookup. We may not see any improvement with functions such as sin and cos, since they've been heavily optimized to make use of GPU architecture, but complex methods such as pow, exp, log, and other custom mathematical processes can only be optimized so much, and would be good candidates for simplification. This is assuming we only need one or two input values, which are represented through the X and Y coordinates of the texture, and mathematical accuracy isn't of paramount importance.
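For illustration, here is a hedged sketch of how such a lookup table might be baked into a texture from script and handed to a Material; the _LookupTex property name, the exponent, and the table size are all hypothetical and would need to match whatever the Shader expects:

using UnityEngine;

// Hypothetical example: precompute pow(x, 2.2) into a 256x1 lookup texture so the
// Shader can replace the pow() call with a single texture sample.
public class LookupTextureBaker : MonoBehaviour
{
    void Start()
    {
        Texture2D lut = new Texture2D(256, 1, TextureFormat.RGBA32, false);
        lut.wrapMode = TextureWrapMode.Clamp;

        for (int x = 0; x < 256; ++x)
        {
            float t = x / 255f;
            float value = Mathf.Pow(t, 2.2f);   // the "expensive" function, computed once up front
            lut.SetPixel(x, 0, new Color(value, value, value, 1f));
        }
        lut.Apply();

        // "_LookupTex" must match a sampler declared in the target Shader.
        GetComponent<Renderer>().material.SetTexture("_LookupTex", lut);
    }
}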
This will cost us additional graphics memory to store the texture at runtime (more on this later), but if the Shader is already receiving a texture (which they are in most cases) and the alpha channel is not being used, then we could sneak the data in through the texture's alpha channel, costing us literally no performance, and the rest of the Shader code and graphics system would be none-the-wiser. This will involve the customization of art assets to include such data in any unused color channel(s), requiring coordination between programmers and artists, but is a very good way of saving Shader processing costs with no runtime sacrifices. In fact, Material properties and textures are both excellent entry points for pushing work from the Shader (the GPU) onto the CPU. If a complex calculation does not need to vary on a per pixel basis, then we could expose the value as a property in the Material, and modify it as needed (accepting the overhead cost of doing so from the previous section Only expose necessary variables). Alternatively, if the result varies per pixel, and does not need to change often, then we could generate a texture file from script code, containing the results of the calculations in the RGBA values, and pulling the texture into the Shader. Lots of opportunities arise when we ignore the conventional application of such systems, and remember to think of them as just raw data being transferred around. Reduce texture lookups While we're on the subject of texture lookups, they are not trivial tasks for the GPU to process and they have their own overhead costs. They are the most common cause of memory access problems within the GPU, especially if a Shader is performing samples across multiple textures, or even multiple samples across a single texture, as they will likely inflict cache misses in memory. Such situations should be simplified as much as possible to avoid severe GPU memory bottlenecking. Even worse, sampling a texture in a random order would likely result in some very costly cache misses for the GPU to suffer through, so if this is being done, then the texture should be reordered so that it can be sampled in a more sequential order. Avoid conditional statements In modern day CPU architecture, conditional statements undergo a lot of clever predictive techniques to make use of instruction-level parallelism. This is a feature where the CPU attempts to predict which direction a conditional statement will go in before it has actually been resolved, and speculatively begins processing the most likely result of the conditional using any free components that aren't being used to resolve the conditional (fetching some data from memory, copying some floats into unused registers, and so on). If it turns out that the decision is wrong, then the current result is discarded and the proper path is taken instead. So long as the cost of speculative processing and discarding false results is less than the time spent waiting to decide the correct path, and it is right more often than it is wrong, then this is a net gain for the CPU's speed. However, this feature is not possible on GPU architecture because of its parallel nature. The GPU's cores are typically managed by some higher-level construct that instructs all cores under its command to perform the same machine-code-level instruction simultaneously. So, if the Fragment Shader requires a float to be multiplied by 2, then the process will begin by having all cores copy data into the appropriate registers in one coordinated step. 
Only when all cores have finished copying to the registers will the cores be instructed to begin the second step: multiplying all registers by 2. Thus, when this system stumbles into a conditional statement, it cannot resolve the two statements independently. It must determine how many of its child cores will go down each path of the conditional, grab the list of required machine code instructions for one path, resolve them for all cores taking that path, and repeat for each path until all possible paths have been processed. So, for an if-else statement (two possibilities), it will tell one group of cores to process the "true" path, then ask the remaining cores to process the "false" path. Unless every core takes the same path, it must process both paths every time. So, we should avoid branching and conditional statements in our Shader code. Of course, this depends on how essential the conditional is to achieving the graphical effect we desire. But, if the conditional is not dependent on per pixel behavior, then we would often be better off absorbing the cost of unnecessary mathematics than inflicting a branching cost on the GPU. For example, we might be checking whether a value is non-zero before using it in a calculation, or comparing against some global flag in the Material before taking one action or another. Both of these cases would be good candidates for optimization by removing the conditional check. Reduce data dependencies The compiler will try its best to optimize our Shader code into the more GPU-friendly low-level language so that it is not waiting on data to be fetched when it could be processing some other task. For example, the following poorly-optimized code could be written in our Shader:

float sum = input.color1.r;
sum = sum + input.color2.g;
sum = sum + input.color3.b;
sum = sum + input.color4.a;
float result = CalculateSomething(sum);

If we were able to force the Shader compiler to compile this code into machine code instructions as it is written, then this code has a data dependency such that each calculation cannot begin until the last finishes due to the dependency on the sum variable. But, such situations are often detected by the Shader compiler and optimized into a version that uses instruction-level parallelism (the code shown next is the high-level code equivalent of the resulting machine code):

float sum1, sum2, sum3, sum4;
sum1 = input.color1.r;
sum2 = input.color2.g;
sum3 = input.color3.b;
sum4 = input.color4.a;
float sum = sum1 + sum2 + sum3 + sum4;
float result = CalculateSomething(sum);

In this case, the compiler would recognize that it can fetch the four values from memory in parallel and complete the summation once all four have been fetched independently via thread-level parallelism. This can save a lot of time, relative to performing the four fetches one after another. However, long chains of data dependency can absolutely murder Shader performance. If we create a strong data dependency in our Shader's source code, then it has been given no freedom to make such optimizations. For example, the following data dependency would be painful on performance, as one step cannot be completed without waiting on another to fetch data and perform the appropriate calculation:

float4 val1 = tex2D(_tex1, input.texcoord.xy);
float4 val2 = tex2D(_tex2, val1.yz);
float4 val3 = tex2D(_tex3, val2.zw);

Strong data dependencies such as these should be avoided whenever possible.
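Before moving on, note that a few of the suggestions above can be tested very cheaply from script; for example, the mobile Shader swap mentioned earlier can be trialed on a single Renderer with something like the following minimal sketch (the component name is hypothetical, and Shader.Find only locates Shaders that are actually included in the build):

using UnityEngine;

// Hypothetical test helper: swap a Renderer over to one of Unity's built-in
// mobile Shaders to compare its fill rate cost against the original Shader.
public class MobileShaderSwap : MonoBehaviour
{
    void Start()
    {
        Shader mobileDiffuse = Shader.Find("Mobile/Diffuse");
        if (mobileDiffuse != null && mobileDiffuse.isSupported)
        {
            GetComponent<Renderer>().material.shader = mobileDiffuse;
        }
    }
}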
Surface Shaders If we're using Unity's Surface Shaders, which are a way for Unity developers to get to grips with Shader programming in a more simplified fashion, then the Unity Engine takes care of converting our Surface Shader code for us, abstracting away some of the optimization opportunities we have just covered. However, it does provide some miscellaneous values that can be used as replacements, which reduce accuracy but simplify the mathematics in the resulting code. Surface Shaders are designed to handle the general case fairly efficiently, but optimization is best achieved with a personal touch. The approxview attribute will approximate the view direction, saving costly operations. halfasview will reduce the precision of the view vector, but beware of its effect on mathematical operations involving multiple precision types. noforwardadd will limit the Shader to only considering a single directional light, reducing Draw Calls since the Shader will render in only a single pass, but reducing lighting complexity. Finally, noambient will disable ambient lighting in the Shader, removing some extra mathematical operations that we may not need. Use Shader-based LOD We can force Unity to render distant objects using simpler Shaders, which can be an effective way of saving fill rate, particularly if we're deploying our game onto multiple platforms or supporting a wide range of hardware capability. The LOD keyword can be used in the Shader to set the onscreen size factor that the Shader supports. If the current LOD level does not match this value, it will drop to the next fallback Shader and so on until it finds the Shader that supports the given size factor. We can also change a given Shader object's LOD value at runtime using the maximumLOD property. This feature is similar to the mesh-based LOD covered earlier, and uses the same LOD values for determining object form factor, so it should be configured as such. Memory bandwidth Another major component of back end processing and a potential source of bottlenecks is memory bandwidth. Memory bandwidth is consumed whenever a texture must be pulled from a section of the GPU's main video memory (also known as VRAM). The GPU contains multiple cores that each have access to the same area of VRAM, but they also each contain a much smaller, local Texture Cache that stores the current texture(s) the GPU has been most recently working with. This is similar in design to the multitude of CPU cache levels that allow memory transfer up and down the chain, as a workaround for the fact that faster memory will, invariably, be more expensive to produce, and hence smaller in capacity compared to slower memory. Whenever a Fragment Shader requests a sample from a texture that is already within the core's local Texture Cache, then it is lightning fast and barely perceivable. But, if a texture sample request is made, that does not yet exist within the Texture Cache, then it must be pulled in from VRAM before it can be sampled. This fetch request risks cache misses within VRAM as it tries to find the relevant texture. The transfer itself consumes a certain amount of memory bandwidth, specifically an amount equal to the total size of the texture file stored within VRAM (which may not be the exact size of the original file, nor the size in RAM, due to GPU-level compression). It's for this reason that, if we're bottlenecked on memory bandwidth, then performing a brute force test by reducing texture quality would suddenly result in a performance improvement. 
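This brute-force test can also be triggered from script via the global texture quality setting; a minimal sketch follows, with hypothetical key bindings (0 means full resolution, 1 half, 2 quarter, and so on):

using UnityEngine;

// Hypothetical brute-force memory bandwidth test: globally drop texture
// resolution at runtime and watch for a sudden frame rate improvement.
public class TextureQualityTest : MonoBehaviour
{
    void Update()
    {
        if (Input.GetKeyDown(KeyCode.F3))
        {
            QualitySettings.masterTextureLimit = 2;   // quarter resolution textures
        }
        if (Input.GetKeyDown(KeyCode.F4))
        {
            QualitySettings.masterTextureLimit = 0;   // restore full resolution
        }
    }
}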
We've shrunk the size of our textures, easing the burden on the GPU's memory bandwidth, allowing it to fetch the necessary textures much quicker. Globally reducing texture quality can be achieved by going to Edit | Project Settings | Quality | Texture Quality and setting the value to Half Res, Quarter Res, or Eighth Res. In the event that memory bandwidth is bottlenecked, the GPU will keep fetching the necessary texture files, but the entire process will be throttled as the Texture Cache waits for the data to appear before processing the fragment. The GPU won't be able to push data back to the Frame Buffer in time to be rendered onto the screen, blocking the whole process and culminating in a poor frame rate. Ultimately, proper usage of memory bandwidth is a budgeting concern. For example, with a memory bandwidth of 96 GB/sec per core and a target frame rate of 60 frames per second, the GPU can afford to pull 96/60 = 1.6 GB worth of texture data every frame before being bottlenecked on memory bandwidth. Memory bandwidth is often listed on a per core basis, but some GPU manufacturers may try to mislead you by multiplying memory bandwidth by the number of cores in order to list a bigger, but less practical number. Because of this, research may be necessary to confirm the memory bandwidth limit we have for the target GPU hardware is given on a per core basis. Note that this value is not the maximum limit on the texture data that our game can contain in the project, nor in CPU RAM, not even in VRAM. It is a metric that limits how much texture swapping can occur during one frame. The same texture could be pulled back and forth multiple times in a single frame depending on how many Shaders need to use it, the order that the objects are rendered, and how often texture sampling must occur, so rendering just a few objects could consume whole gigabytes of memory bandwidth if they all require the same high quality, massive textures, require multiple secondary texture maps (normal maps, emission maps, and so on), and are not batched together, because there simply isn't enough Texture Cache space available to keep a single texture file long enough to exploit it during the next rendering pass. There are several approaches we can take to solve bottlenecks in memory bandwidth. Use less texture data This approach is simple, straightforward, and always a good idea to consider. Reducing texture quality, either through resolution or bit rate, is not ideal for graphical quality, but we can sometimes get away with using 16-bit textures without any noticeable degradation. Mip Maps are another excellent way of reducing the amount of texture data being pushed back and forth between VRAM and the Texture Cache. Note that the Scene View has a Mipmaps Shading Mode, which will highlight textures in our scene blue or red depending on whether the current texture scale is appropriate for the current Scene View's camera position and orientation. This will help identify what textures are good candidates for further optimization. Mip Maps should almost always be used in 3D Scenes, unless the camera moves very little. Test different GPU Texture Compression formats Texture Compression techniques help reduce our application's footprint (executable file size) and runtime CPU memory usage, that is, the storage area where all texture resource data is kept until it is needed by the GPU. However, once the data reaches the GPU, it uses a different form of compression to keep texture data small.
The common formats are DXT, PVRTC, ETC, and ASTC. To make matters more confusing, each platform and GPU hardware supports different compression formats, and if the device does not support the given compression format, then it will be handled at the software level. In other words, the CPU will need to stop and recompress the texture to the desired format the GPU wants, as opposed to the GPU taking care of it with a specialized hardware chip. The compression options are only available if a texture resource has its Texture Type field set to Advanced. Using any of the other texture type settings will simplify the choices, and Unity will make a best guess when deciding which format to use for the target platform, which may not be ideal for a given piece of hardware and thus will consume more memory bandwidth than necessary. The best approach to determining the correct format is to simply test a bunch of different devices and Texture Compression techniques and find one that fits. For example, common wisdom says that ETC is the best choice for Android since more devices support it, but some developers have found their game works better with the DXT and PVRTC formats on certain devices. Beware that, if we're at the point where individually tweaking Texture Compression techniques is necessary, then hopefully we have exhausted all other options for reducing memory bandwidth. By going down this road, we could be committing to supporting many different devices each in their own specific way. Many of us would prefer to keep things simple with a general solution instead of personal customization and time-consuming handiwork to work around problems like this. Minimize texture sampling Can we modify our Shaders to remove some texture sampling overhead? Did we add some extra texture lookup files to give ourselves some fill rate savings on mathematical functions? If so, we might want to consider lowering the resolution of such textures or reverting the changes and solving our fill rate problems in other ways. Essentially, the less texture sampling we do, the less often we need to use memory bandwidth and the closer we get to resolving the bottleneck. Organize assets to reduce texture swaps This approach basically comes back to Batching and Atlasing again. Are there opportunities to batch some of our biggest texture files together? If so, then we could save the GPU from having to pull in the same texture files over and over again during the same frame. As a last resort, we could look for ways to remove some textures from the entire project and reuse similar files. For instance, if we have fill rate budget to spare, then we may be able to use some Fragment Shaders to make a handful of texture files appear in our game with different color variations. VRAM limits One last consideration related to textures is how much VRAM we have available. Most texture transfer from CPU to GPU occurs during initialization, but can also occur when a non-existent texture is first required by the current view. This process is asynchronous and will result in a blank texture being used until the full texture is ready for rendering. As such, we should avoid too much texture variation across our Scenes. Texture preloading Even though it doesn't strictly relate to graphics performance, it is worth mentioning that the blank texture that is used during asynchronous texture loading can be jarring when it comes to game quality.
We would like a way to control and force the texture to be loaded from disk to the main memory and then to VRAM before it is actually needed. A common workaround is to create a hidden GameObject that features the texture and place it somewhere in the Scene on the route that the player will take towards the area where it is actually needed. As soon as the textured object becomes a candidate for the rendering system (even if it's technically hidden), it will begin the process of copying the data towards VRAM. This is a little clunky, but is easy to implement and works sufficiently well in most cases. We can also control such behavior via Script code by changing a hidden Material's texture:

GetComponent<Renderer>().material.mainTexture = textureToPreload;

Texture thrashing In the rare event that too much texture data is loaded into VRAM, and the required texture is not present, the GPU will need to request it from the main memory and overwrite the existing texture data to make room. This is likely to worsen over time as the memory becomes fragmented, and it introduces a risk that the texture just flushed from VRAM needs to be pulled again within the same frame. This will result in a serious case of memory "thrashing", and should be avoided at all costs. This is less of a concern on modern consoles such as the PS4, Xbox One, and WiiU, since they share a common memory space for both CPU and GPU. This design is a hardware-level optimization given the fact that the device is always running a single application, and almost always rendering 3D graphics. But, all other platforms must share time and space with multiple applications and be capable of running without a GPU. They therefore feature separate CPU and GPU memory, and we must ensure that the total texture usage at any given moment remains below the available VRAM of the target hardware. Note that this "thrashing" is not precisely the same as hard disk thrashing, where memory is copied back and forth between main memory and virtual memory (the swap file), but it is analogous. In either case, data is being unnecessarily copied back and forth between two regions of memory because too much data is being requested in too short a time period for the smaller of the two memory regions to hold it all. Thrashing such as this can be a common cause of dreadful graphics performance when games are ported from modern consoles to the desktop and should be treated with care. Avoiding this behavior may require customizing texture quality and file sizes on a per-platform and per-device basis. Be warned that some players are likely to notice these inconsistencies if we're dealing with hardware from the same console or desktop GPU generation. As many of us will know, even small differences in hardware can lead to a lot of apples-versus-oranges comparisons, but hardcore gamers will expect a similar level of quality across the board. Lighting and Shadowing Lighting and Shadowing can affect all parts of the graphics pipeline, and so they will be treated separately. This is perhaps one of the most important parts of game art and design to get right. Good Lighting and Shadowing can turn a mundane scene into something spectacular as there is something magical about professional coloring that makes it visually appealing. Even the low-poly art style (think Monument Valley) relies heavily on a good lighting and shadowing profile in order to allow the player to distinguish one object from another.
But, this isn't an art book, so we will focus on the performance characteristics of various Lighting and Shadowing features. Unity offers two styles of dynamic light rendering, as well as baked lighting effects through lightmaps. It also provides multiple ways of generating shadows with varying levels of complexity and runtime processing cost. Between the two, there are a lot of options to explore, and a lot of things that can trip us up if we're not careful. The Unity documentation covers all of these features in an excellent amount of detail (start with this page and work through them: http://docs.unity3d.com/Manual/Lighting.html), so we'll examine these features from a performance standpoint. Let's tackle the two main light rendering modes first. This setting can be found under Edit | Project Settings | Player | Other Settings | Rendering, and can be configured on a per-platform basis. Forward Rendering Forward Rendering is the classical form of rendering lights in our scene. Each object is likely to be rendered in multiple passes through the same Shader. How many passes are required will be based on the number, distance, and brightness of light sources. Unity will try to prioritize which directional light is affecting the object the most and render the object in a "base pass" as a starting point. It will then take up to four of the most powerful point lights nearby and re-render the same object multiple times through the same Fragment Shader. The next four point lights will then be processed on a per-vertex basis. All remaining lights are treated as a giant blob by means of a technique called spherical harmonics. Some of this behavior can be simplified by setting a light's Render Mode to values such as Not Important, and changing the value of Edit | Project Settings | Quality | Pixel Light Count. This value limits how many lights will be treated on a per pixel basis, but is overridden by any lights with a Render Mode set to Important. It is therefore up to us to use this combination of settings responsibly. As you can imagine, the design of Forward Rendering can utterly explode our Draw Call count very quickly in scenes with a lot of point lights present, due to the number of render states being configured and Shader passes being reprocessed. CPU-bound applications should avoid this rendering mode if possible. More information on Forward Rendering can be found in the Unity documentation: http://docs.unity3d.com/Manual/RenderTech-ForwardRendering.html. Deferred Shading Deferred Shading or Deferred Rendering as it is sometimes known, is only available on GPUs running at least Shader Model 3.0. In other words, any desktop graphics card made after around 2004. The technique has been around for a while, but it has not resulted in a complete replacement of the Forward Rendering method due to the caveats involved and limited support on mobile devices. Anti-aliasing, transparency, and animated characters receiving shadows are all features that cannot be managed through Deferred Shading alone and we must use the Forward Rendering technique as a fallback. Deferred Shading is so named because actual shading does not occur until much later in the process; that is, it is deferred until later. From a performance perspective, the results are quite impressive as it can generate very good per pixel lighting with surprisingly little Draw Call effort. The advantage is that a huge amount of lighting can be accomplished using only a single pass through the lighting Shader. 
The main disadvantages include the additional costs if we wish to pile on advanced lighting features such as Shadowing and any steps that must pass through Forward Rendering in order to complete, such as transparency. The Unity documentation contains an excellent source of information on the Deferred Shading technique, its advantages, and its pitfalls: http://docs.unity3d.com/Manual/RenderTech-DeferredShading.html Vertex Lit Shading (legacy) Technically, there are more than two lighting methods. Unity allows us to use a couple of legacy lighting systems, only one of which may see actual use in the field: Vertex Lit Shading. This is a massive simplification of lighting, as lighting is only considered per vertex, and not per pixel. In other words, entire faces are colored based on the incoming light color, and not individual pixels. It is not expected that many, or really any, 3D games will make use of this legacy technique, as a lack of shadows and proper lighting make visualizations of depth very difficult. It is mostly relegated to 2D games that don't intend to make use of shadows, normal maps, and various other lighting features, but it is there if we need it. Real-time Shadows Soft Shadows are expensive, Hard Shadows are cheap, and No Shadows are free. Shadow Resolution, Shadow Projection, Shadow Distance, and Shadow Cascades are all settings we can find under Edit | Project Settings | Quality | Shadows that we can use to modify the behavior and complexity of our shadowing passes. That summarizes almost everything we need to know about Unity's real-time shadowing techniques from a high-level performance standpoint. We will cover shadows more in the following section on optimizing our lighting effects. Lighting optimization With a cursory glance at all of the relevant lighting techniques, let's run through some techniques we can use to improve lighting costs. Use the appropriate Shading Mode It is worth testing both of the main rendering modes to see which one best suits our game. Deferred Shading is often used as a backup in the event that Forward Rendering is becoming a burden on performance, but it really depends on where else we're finding bottlenecks as it is sometimes difficult to tell the difference between them. Use Culling Masks A Light Component's Culling Mask property is a layer-based mask that can be used to limit which objects will be affected by the given Light. This is an effective way of reducing lighting overhead, assuming that the layer interactions also make sense with how we are using layers for physics optimization. Objects can only be a part of a single layer, and reducing physics overhead probably trumps lighting overhead in most cases; thus, if there is a conflict, then this may not be the ideal approach. Note that there is limited support for Culling Masks when using Deferred Shading. Because of the way it treats lighting in a very global fashion, only four layers can be disabled from the mask, limiting our ability to optimize its behavior through this method. Use Baked Lightmaps Baking Lighting and Shadowing into a Scene is significantly less processor-intensive than generating them at runtime. The downside is the added application footprint, memory consumption, and potential for memory bandwidth abuse. Ultimately, unless a game's lighting effects are being handled exclusively through Legacy Vertex Lighting or a single Directional Light, then it should probably include Lightmapping to make some huge budget savings on lighting calculations. 
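Baking is normally configured and started through the Lighting window, but purely for illustration, here is a hedged sketch of an Editor helper that kicks off an asynchronous bake from script; the menu path is hypothetical and the file must live in an Editor folder:

using UnityEditor;
using UnityEngine;

// Hypothetical Editor helper: starts an asynchronous lightmap bake for the
// currently open Scene from a custom menu item.
public static class LightmapBakeMenu
{
    [MenuItem("Tools/Bake Lightmaps")]
    public static void Bake()
    {
        if (!Lightmapping.BakeAsync())
        {
            Debug.LogWarning("Lightmap bake could not be started.");
        }
    }
}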
Relying entirely on real-time lighting and shadows is a recipe for disaster unless the game is trying to win an award for the smallest application file size of all time. Optimize Shadows Shadowing passes mostly consume our Draw Calls and fill rate, but the amount of vertex position data we feed into the process and our selection for the Shadow Projection setting will affect the front end's ability to generate the required shadow casters and shadow receivers. We should already be attempting to reduce vertex counts to solve front end bottlenecking in the first place, and making this change will be an added multiplier towards that effort. Draw Calls are consumed during shadowing by rendering visible objects into a separate buffer (known as the shadow map) as either a shadow caster, a shadow receiver, or both. Each object that is rendered into this map will consume another Draw Call, which makes shadows a huge performance cost multiplier, so it is often a setting that games will expose to users via quality settings, allowing users with weaker hardware to reduce the effect or even disable it entirely. Shadow Distance is a global multiplier for runtime shadow rendering. The fewer shadows we need to draw, the happier the entire rendering process will be. There is little point in rendering shadows at a great distance from the camera, so this setting should be configured specifically for our game and how much shadowing we expect to witness during gameplay. It is also a common setting that is exposed to the user to reduce the burden of rendering shadows. Higher values of Shadow Resolution and Shadow Cascades will increase our memory bandwidth and fill rate consumption. Both of these settings can help curb the effects of artefacts in shadow rendering, but at the cost of a much larger shadow map that must be moved around and a larger canvas to draw into. The Unity documentation contains an excellent summary on the topic of the aliasing effect of shadow maps and how the Shadow Cascades feature helps to solve the problem: http://docs.unity3d.com/Manual/DirLightShadows.html. It's worth noting that Soft Shadows do not consume any more memory or CPU overhead relative to Hard Shadows, as the only difference is a more complex Shader. This means that applications with enough fill rate to spare can enjoy the improved graphical fidelity of Soft Shadows. Optimizing graphics for mobile Unity's ability to deploy to mobile devices has contributed greatly to its popularity among hobbyist, small, and mid-size development teams. As such, it would be prudent to cover some approaches that are more beneficial for mobile platforms than for desktop and other devices. Note that any, and all, of the following approaches may become obsolete soon, if they aren't already. The mobile device market is moving blazingly fast, and the following techniques as they apply to mobile devices merely reflect conventional wisdom from the last half decade. We should test the assumptions behind these approaches from time to time to see whether the limitations of mobile devices still fit the mobile marketplace. Minimize Draw Calls Mobile applications are more often bottlenecked on Draw Calls than on fill rate. Not that fill rate concerns should be ignored (nothing should, ever!), but this makes it almost necessary for any mobile application of reasonable quality to implement Mesh Combining, Batching, and Atlasing techniques from the very beginning.
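As one illustration of the batching side of this advice, non-moving scenery can be combined into static batches at runtime; a minimal sketch follows, assuming a root object whose children never move, rotate, or scale after the call (the field name is hypothetical):

using UnityEngine;

// Hypothetical example: combine all meshes under a root object into static
// batches at runtime. Draws are still grouped by Material, so fewer Materials
// means fewer resulting batches.
public class RuntimeStaticBatcher : MonoBehaviour
{
    public GameObject environmentRoot;   // parent of the static scenery

    void Start()
    {
        // The children of environmentRoot must remain stationary afterwards,
        // or the combined batches will render incorrectly.
        StaticBatchingUtility.Combine(environmentRoot);
    }
}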
Deferred Rendering is also the preferred technique as it fits well with other mobile-specific concerns, such as avoiding transparency and large numbers of animated characters. Minimize the Material count This concern goes hand in hand with the concepts of Batching and Atlasing. The fewer Materials we use, the fewer Draw Calls will be necessary. This strategy will also help with concerns relating to VRAM and memory bandwidth, which tend to be very limited on mobile devices. Minimize texture size Most mobile devices feature a very small Texture Cache relative to desktop GPUs. For instance, the iPhone 3G can only support a total texture size of 1024x1024 due to running OpenGLES1.1 with simple vertex rendering techniques. Meanwhile the iPhone 3GS, iPhone 4, and iPad generation run OpenGLES 2.0, which only supports textures up to 2048x2048. Later generations can support textures up to 4096x4096. Double check the device hardware we are targeting to be sure it supports the texture file sizes we wish to use (there are too many Android devices to list here).
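A quick runtime check can at least log the limit for the current device during testing; a minimal sketch (the component name is hypothetical):

using UnityEngine;

// Hypothetical diagnostic: logs the largest texture dimension the current
// device's GPU supports, so oversized assets can be flagged during device testing.
public class TextureSizeCheck : MonoBehaviour
{
    void Start()
    {
        Debug.Log("Max supported texture size: " + SystemInfo.maxTextureSize);
    }
}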
However, later-generation devices are never the most common devices in the mobile marketplace. If we wish our game to reach a wide audience (increasing its chances of success), then we must be willing to support weaker hardware. Note that textures that are too large for the GPU will be downscaled by the CPU during initialization, wasting valuable loading time, and leaving us with unintended graphical fidelity. This makes texture reuse of paramount importance for mobile devices due to the limited VRAM and Texture Cache sizes available. Make textures square and power-of-2 The GPU will find it difficult, or simply be unable to compress the texture if it is not in a square format, so make sure you stick to the common development convention and keep things square and sized to a power of 2. Use the lowest possible precision formats in Shaders Mobile GPUs are particularly sensitive to precision formats in its Shaders, so the smallest formats should be used. On a related note, format conversion should be avoided for the same reason. Avoid Alpha Testing Mobile GPUs haven't quite reached the same levels of chip optimization as desktop GPUs, and Alpha Testing remains a particularly costly task on mobile devices. In most cases it should simply be avoided in favor of Alpha Blending. Summary If you've made it this far without skipping ahead, then congratulations are in order. That was a lot of information to absorb for just one component of the Unity Engine, but then it is clearly the most complicated of them all, requiring a matching depth of explanation. Hopefully, you've learned a lot of approaches to help you improve your rendering performance and enough about the rendering pipeline to know how to use them responsibly! To learn more about Unity 5, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Unity 5 Game Optimization (https://www.packtpub.com/game-development/unity-5-game-optimization) Unity 5.x By Example (https://www.packtpub.com/game-development/unity-5x-example) Unity 5.x Cookbook (https://www.packtpub.com/game-development/unity-5x-cookbook) Unity 5 for Android Essentials (https://www.packtpub.com/game-development/unity-5-android-essentials) Resources for Article: Further resources on this subject: The Vertex Functions [article] UI elements and their implementation [article] Routing for Yii Demystified [article]

Concurrency and Parallelism with Swift 2

Packt
22 Feb 2016
35 min read
When I first started learning Objective-C, I already had a good understanding of concurrency and multitasking with my background in other languages such as C and Java. This background made it very easy for me to create multithreaded applications using threads in Objective-C. Then, Apple changed everything for me when they released Grand Central Dispatch (GCD) with OS X 10.6 and iOS 4. At first, I went into denial; there was no way GCD could manage my application's threads better than I could. Then I entered the anger phase, GCD was hard to use and understand. Next was the bargaining phase, maybe I can use GCD with my threading code, so I could still control how the threading worked. Then there was the depression phase, maybe GCD does handle the threading better than I can. Finally, I entered the wow phase; this GCD thing is really easy to use and works amazingly well. After using Grand Central Dispatch and Operation Queues with Objective-C, I do not see a reason for using manual threads with Swift. In this artcle, we will learn the following topics: Basics of concurrency and parallelism How to use GCD to create and manage concurrent dispatch queues How to use GCD to create and manage serial dispatch queues How to use various GCD functions to add tasks to the dispatch queues How to use NSOperation and NSOperationQueues to add concurrency to our applications (For more resources related to this topic, see here.) Concurrency and parallelism Concurrency is the concept of multiple tasks starting, running, and completing within the same time period. This does not necessarily mean that the tasks are executing simultaneously. In order for tasks to be run simultaneously, our application needs to be running on a multicore or multiprocessor system. Concurrency allows us to share the processor or cores with multiple tasks; however, a single core can only execute one task at a given time. Parallelism is the concept of two or more tasks running simultaneously. Since each core of our processor can only execute one task at a time, the number of tasks executing simultaneously is limited to the number of cores within our processors. Therefore, if we have, for example, a four-core processor, then we are limited to only four tasks running simultaneously. Today's processors can execute tasks so quickly that it may appear that larger tasks are executing simultaneously. However, within the system, the larger tasks are actually taking turns executing subtasks on the cores. In order to understand the difference between concurrency and parallelism, let's look at how a juggler juggles balls. If you watch a juggler, it seems they are catching and throwing multiple balls at any given time; however, a closer look reveals that they are, in fact, only catching and throwing one ball at a time. The other balls are in the air waiting to be caught and thrown. If we want to be able to catch and throw multiple balls simultaneously, we need to add multiple jugglers. This example is really good because we can think of jugglers as the cores of a processer. A system with a single core processor (one juggler), regardless of how it seems, can only execute one task (catch and throw one ball) at a time. If we want to execute more than one task at a time, we need to use a multicore processor (more than one juggler). Back in the old days when all the processors were single core, the only way to have a system that executed tasks simultaneously was to have multiple processors in the system. 
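Since the degree of parallelism is capped by the number of cores, it can be useful to ask the system how many cores are actually available at runtime. The following is a minimal sketch, not taken from the original text, that uses the NSProcessInfo API; the printed wording is our own:

import Foundation

// Ask the operating system how many cores we can use.
// activeProcessorCount may be lower than processorCount if the
// OS has temporarily taken cores offline (for example, to save power).
let info = NSProcessInfo.processInfo()
print("Total cores: \(info.processorCount)")
print("Active cores: \(info.activeProcessorCount)")
print("At most \(info.activeProcessorCount) tasks can run truly in parallel right now.")

Whatever that number is, adding more processors was historically the only way for single-core systems to execute tasks simultaneously.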
This also required specialized software to take advantage of the multiple processors. In today's world, just about every device has a processor that has multiple cores, and both the iOS and OS X operating systems are designed to take advantage of the multiple cores to run tasks simultaneously. Traditionally, the way applications added concurrency was to create multiple threads; however, this model does not scale well to an arbitrary number of cores. The biggest problem with using threads was that our applications ran on a variety of systems (and processors), and in order to optimize our code, we needed to know how many cores/processors could be efficiently used at a given time, which is sometimes not known at the time of development. In order to solve this problem, many operating systems, including iOS and OS X, started relying on asynchronous functions. These functions are often used to initiate tasks that could possibly take a long time to complete, such as making an HTTP request or writing data to disk. An asynchronous function typically starts the long running task and then returns prior to the task completion. Usually, this task runs in the background and uses a callback function (such as closure in Swift) when the task completes. These asynchronous functions work great for the tasks that the OS provides them for, but what if we needed to create our own asynchronous functions and do not want to manage the threads ourselves? For this, Apple provides a couple of technologies. In this artcle, we will be covering two of these technologies. These are GCD and operation queues. GCD is a low-level C-based API that allows specific tasks to be queued up for execution and schedules the execution on any of the available processor cores. Operation queues are similar to GCD; however, they are Cocoa objects and are internally implemented using GCD. Let's begin by looking at GCD. Grand Central Dispatch Grand Central Dispatch provides what is known as dispatch queues to manage submitted tasks. The queues manage these submitted tasks and execute them in a first-in, first- out (FIFO) order. This ensures that the tasks are started in the order they were submitted. A task is simply some work that our application needs to perform. As examples, we can create tasks that perform simple calculations, read/write data to disk, make an HTTP request, or anything else that our application needs to do. We define these tasks by placing the code inside either a function or a closure and adding it to a dispatch queue. GCD provides three types of queues: Serial queues: Tasks in a serial queue (also known as a private queue) are executed one at a time in the order they were submitted. Each task is started only after the preceding task is completed. Serial queues are often used to synchronize access to specific resources because we are guaranteed that no two tasks in a serial queue will ever run simultaneously. Therefore, if the only way to access the specific resource is through the tasks in the serial queue, then no two tasks will attempt to access the resource at the same time or be out of order. Concurrent queues: Tasks in a concurrent queue (also known as a global dispatch queue) execute concurrently; however, the tasks are still started in the order that they were added to the queue. The exact number of tasks that can be executing at any given instance is variable and is dependent on the system's current conditions and resources. 
The decision on when to start a task is up to GCD and is not something that we can control within our application. Main dispatch queue: The main dispatch queue is a globally available serial queue that executes tasks on the application's main thread. Since tasks put into the main dispatch queue run on the main thread, it is usually called from a background queue when some background processing has finished and the user interface needs to be updated. Dispatch queues offer a number of advantages over traditional threads. The first and foremost advantage is, with dispatch queues, the system handles the creation and management of threads rather than the application itself. The system can scale the number of threads, dynamically based on the overall available resources of the system and the current system conditions. This means that dispatch queues can manage the threads with greater efficiency than we could. Another advantage of dispatch queues is we are able to control the order that our tasks are started. With serial queues, not only do we control the order in which tasks are started, but also ensure that one task does not start before the preceding one is complete. With traditional threads, this can be very cumbersome and brittle to implement, but with dispatch queues, as we will see later in this artcle, it is quite easy. Creating and managing dispatch queues Let's look at how to create and use a dispatch queue. The following three functions are used to create or retrieve queues. These functions are as follows: dispatch_queue_create: This creates a dispatch queue of either the concurrent or serial type dispatch_get_global_queue: This returns a system-defined global concurrent queue with a specified quality of service dispatch_get_main_queue: This returns the serial dispatch queue associated with the application's main thread We will also be looking at several functions that submit tasks to a queue for execution. These functions are as follows: dispatch_async: This submits a task for asynchronous execution and returns immediately. dispatch_sync: This submits a task for synchronous execution and waits until it is complete before it returns. dispatch_after: This submits a task for execution at a specified time. dispatch_once: This submits a task to be executed once and only once while this application is running. It will execute the task again if the application restarts. Before we look at how to use the dispatch queues, we need to create a class that will help us demonstrate how the various types of queues work. This class will contain two basic functions. The first function will simply perform some basic calculations and then return a value. Here is the code for this function, which is named doCalc(): func doCalc() { var x=100 var y = x*x _ = y/x } The other function, which is named performCalculation(), accepts two parameters. One is an integer named iterations, and the other is a string named tag. The performCalculation () function calls the doCalc() function repeatedly until it calls the function the same number of times as defined by the iterations parameter. We also use the CFAbsoluteTimeGetCurrent() function to calculate the elapsed time it took to perform all of the iterations and then print the elapse time with the tag string to the console. This will let us know when the function completes and how long it took to complete it. 
The code for this function looks similar to this:

func performCalculation(iterations: Int, tag: String) {
    let start = CFAbsoluteTimeGetCurrent()
    for var i = 0; i < iterations; i++ {
        self.doCalc()
    }
    let end = CFAbsoluteTimeGetCurrent()
    print("time for \(tag): \(end - start)")
}

These functions will be used together to keep our queues busy, so we can see how they work. Let's begin by looking at the GCD functions, using the dispatch_queue_create() function to create both concurrent and serial queues.

Creating queues with the dispatch_queue_create() function

The dispatch_queue_create() function is used to create both concurrent and serial queues. The syntax of the dispatch_queue_create() function looks similar to this:

func dispatch_queue_create(label: UnsafePointer<Int8>, attr: dispatch_queue_attr_t!) -> dispatch_queue_t!

It takes the following parameters:

label: This is a string label that is attached to the queue to uniquely identify it in debugging tools, such as Instruments and crash reports. It is recommended that we use a reverse DNS naming convention. This parameter is optional and can be nil.
attr: This specifies the type of queue to make. This can be DISPATCH_QUEUE_SERIAL, DISPATCH_QUEUE_CONCURRENT, or nil. If this parameter is nil, a serial queue is created.

The return value for this function is the newly created dispatch queue. Let's see how to use the dispatch_queue_create() function by creating a concurrent queue and seeing how it works.

Some programming languages use the reverse DNS naming convention to name certain components. This convention is based on a registered domain name that is reversed. As an example, if we worked for a company that had a domain name mycompany.com with a product called widget, the reverse DNS name will be com.mycompany.widget.

Creating concurrent dispatch queues with the dispatch_queue_create() function

The following line creates a concurrent dispatch queue with the label of cqueue.hoffman.jon:

let queue = dispatch_queue_create("cqueue.hoffman.jon", DISPATCH_QUEUE_CONCURRENT)

As we saw in the beginning of this section, there are several functions that we can use to submit tasks to a dispatch queue. When we work with queues, we generally want to use the dispatch_async() function to submit tasks because when we submit a task to a queue, we usually do not want to wait for a response. The dispatch_async() function has the following signature:

func dispatch_async(queue: dispatch_queue_t!, block: dispatch_block_t!)

The following example shows how to use the dispatch_async() function with the concurrent queue we just created:

let c = {
    calculation.performCalculation(1000, tag: "async0")
}
dispatch_async(queue, c)

In the preceding code, we created a closure, which represents our task, that simply calls the performCalculation() function of the DoCalculation instance, requesting that it runs through 1000 iterations of the doCalc() function. Finally, we use the dispatch_async() function to submit the task to the concurrent dispatch queue. This code will execute the task in a concurrent dispatch queue, which is separate from the main thread.

While the preceding example works perfectly, we can actually shorten the code a little bit. The next example shows that we do not need to create a separate closure as we did in the preceding example; we can also submit the task to execute like this:

dispatch_async(queue) {
    calculation.performCalculation(10000000, tag: "async1")
}

This shorthand version is how we usually submit small code blocks to our queues.
If we have larger tasks, or tasks that we need to submit multiple times, we will generally want to create a closure and submit the closure to the queue as we showed originally.

Let's see how the concurrent queue actually works by adding several items to the queue and looking at the order and time in which they return. The following code will add three tasks to the queue. Each task will call the performCalculation() function with various iteration counts. Remember that the performCalculation() function will execute the calculation routine continuously until it has been executed the number of times defined by the iteration count passed in. Therefore, the larger the iteration count we pass into the performCalculation() function, the longer it should take to execute. Let's take a look at the following code:

dispatch_async(queue) {
    calculation.performCalculation(10000000, tag: "async1")
}
dispatch_async(queue) {
    calculation.performCalculation(1000, tag: "async2")
}
dispatch_async(queue) {
    calculation.performCalculation(100000, tag: "async3")
}

Notice that each of the functions is called with a different value in the tag parameter. Since the performCalculation() function prints out the tag variable with the elapsed time, we can see the order in which the tasks complete and the time it took to execute. If we execute the preceding code, we should see the following results:

time for async2: 0.000200986862182617
time for async3: 0.00800204277038574
time for async1: 0.461670994758606

The elapsed time will vary from one run to the next and from system to system.

Since the queues function in a FIFO order, the task that had the tag of async1 was executed first. However, as we can see from the results, it was the last task to finish. Since this is a concurrent queue, if it is possible (if the system has available resources), the blocks of code will execute concurrently. This is why the tasks with the tags of async2 and async3 completed prior to the task that had the async1 tag, even though the execution of the async1 task began before the other two. Now, let's see how a serial queue executes tasks.

Creating a serial dispatch queue with the dispatch_queue_create() function

A serial queue functions a little differently than a concurrent queue. A serial queue will only execute one task at a time and will wait for one task to complete before starting the next task. This queue, like the concurrent dispatch queue, follows a first-in, first-out order. The following line of code will create a serial queue with the label of squeue.hoffman.jon:

let queue2 = dispatch_queue_create("squeue.hoffman.jon", DISPATCH_QUEUE_SERIAL)

Notice that we create the serial queue with the DISPATCH_QUEUE_SERIAL attribute. If you recall, when we created the concurrent queue, we created it with the DISPATCH_QUEUE_CONCURRENT attribute. We can also set this attribute to nil, which will create a serial queue by default. However, it is recommended to always set the attribute to either DISPATCH_QUEUE_SERIAL or DISPATCH_QUEUE_CONCURRENT to make it easier to identify which type of queue we are creating.

As we saw with the concurrent dispatch queues, we generally want to use the dispatch_async() function to submit tasks because when we submit a task to a queue, we usually do not want to wait for a response. If, however, we did want to wait for a response, we would use the dispatch_sync() function.
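As a quick aside, here is a minimal, hedged sketch of what waiting with dispatch_sync() might look like; it is not part of the original example code, and the queue label and messages are our own:

let waitQueue = dispatch_queue_create("wqueue.hoffman.jon", DISPATCH_QUEUE_SERIAL)

// dispatch_sync() blocks the calling thread until the submitted
// task has finished executing on the target queue.
dispatch_sync(waitQueue) {
    print("running on the background serial queue")
}
print("this line only runs after the task above has completed")

// Caution: never call dispatch_sync() targeting the queue that the
// current code is already running on (for example, the main queue
// from the main thread); the queue would wait on itself and deadlock.

Most of the time, though, we will stick with dispatch_async(), as the following examples do.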
var calculation = DoCalculations()
let c = {
    calculation.performCalculation(1000, tag: "sync0")
}
dispatch_async(queue2, c)

Just like with the concurrent queues, we do not need to create a closure to submit a task to the queue. We can also submit the task like this:

dispatch_async(queue2) {
    calculation.performCalculation(100000, tag: "sync1")
}

Let's see how the serial queue works by adding several items to the queue and looking at the order and time in which they complete. The following code will add three tasks, which will call the performCalculation() function with various iteration counts, to the queue:

dispatch_async(queue2) {
    calculation.performCalculation(100000, tag: "sync1")
}
dispatch_async(queue2) {
    calculation.performCalculation(1000, tag: "sync2")
}
dispatch_async(queue2) {
    calculation.performCalculation(100000, tag: "sync3")
}

Just like with the concurrent queue example, we call the performCalculation() function with various iteration counts and different values in the tag parameter. Since the performCalculation() function prints out the tag string with the elapsed time, we can see the order in which the tasks complete and the time it takes to execute. If we execute this code, we should see the following results:

time for sync1: 0.00648999214172363
time for sync2: 0.00009602308273315
time for sync3: 0.00515800714492798

The elapsed time will vary from one run to the next and from system to system.

Unlike the concurrent queues, we can see that the tasks completed in the same order that they were submitted, even though the sync2 and sync3 tasks took considerably less time to complete. This demonstrates that a serial queue only executes one task at a time and that the queue waits for each task to complete before starting the next one.

Now that we have seen how to use the dispatch_queue_create() function to create both concurrent and serial queues, let's look at how we can get one of the four system-defined, global concurrent queues using the dispatch_get_global_queue() function.

Requesting concurrent queues with the dispatch_get_global_queue() function

The system provides each application with four concurrent global dispatch queues of different priority levels. The different priority levels are what distinguish these queues. The four priorities are:

DISPATCH_QUEUE_PRIORITY_HIGH: The items in this queue run with the highest priority and are scheduled before items in the default and low priority queues
DISPATCH_QUEUE_PRIORITY_DEFAULT: The items in this queue run at the default priority and are scheduled before items in the low priority queue but after items in the high priority queue
DISPATCH_QUEUE_PRIORITY_LOW: The items in this queue run with a low priority and are scheduled only after items in the high and default queues
DISPATCH_QUEUE_PRIORITY_BACKGROUND: The items in this queue run with a background priority, which has the lowest priority

Since these are global queues, we do not need to actually create them; instead, we ask for a reference to the queue with the priority level needed. To request a global queue, we use the dispatch_get_global_queue() function. This function has the following syntax:

func dispatch_get_global_queue(identifier: Int, flags: UInt) -> dispatch_queue_t!
Here, the following parameters are defined:

identifier: This is the priority of the queue we are requesting
flags: This is reserved for future expansion and should be set to zero at this time

We request a queue using the dispatch_get_global_queue() function, as shown in the following example:

let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)

In this example, we are requesting the global queue with the default priority. We can then use this queue exactly as we used the concurrent queues that we created with the dispatch_queue_create() function. The difference between the queues returned with the dispatch_get_global_queue() function and the ones created with the dispatch_queue_create() function is that with the dispatch_queue_create() function, we are actually creating a new queue. The queues that are returned with the dispatch_get_global_queue() function are global queues that are created when our application first starts; therefore, we are requesting a queue rather than creating a new one. When we use the dispatch_get_global_queue() function, we avoid the overhead of creating the queue; therefore, I recommend using the dispatch_get_global_queue() function unless you have a specific reason to create a queue.

Requesting the main queue with the dispatch_get_main_queue() function

The dispatch_get_main_queue() function returns the main queue for our application. The main queue is automatically created for the main thread when the application starts. This main queue is a serial queue; therefore, items in this queue are executed one at a time, in the order that they were submitted. We will generally want to avoid using this queue unless we need to update the user interface from a background thread. The dispatch_get_main_queue() function has the following syntax:

func dispatch_get_main_queue() -> dispatch_queue_t!

The following code example shows how to request the main queue:

let mainQueue = dispatch_get_main_queue()

We will then submit tasks to the main queue exactly as we would any other serial queue. Just remember that anything submitted to this queue will run on the main thread, which is the thread that all the user interface updates run on; therefore, if we submit a long-running task, the user interface will freeze until that task is completed.

In the previous sections, we saw how the dispatch_async() function submits tasks to concurrent and serial queues. Now, let's look at two additional functions that we can use to submit tasks to our queues. The first function we will look at is the dispatch_after() function.

Using the dispatch_after() function

There will be times that we need to execute tasks after a delay. If we were using a threading model, we would need to create a new thread, perform some sort of delay or sleep function, and execute our task. With GCD, we can use the dispatch_after() function. The dispatch_after() function takes the following syntax:

func dispatch_after(when: dispatch_time_t, queue: dispatch_queue_t, block: dispatch_block_t)

Here, the dispatch_after() function takes the following parameters:

when: This is the time at which we wish the queue to execute our task
queue: This is the queue that we want to execute our task in
block: This is the task to execute

As with the dispatch_async() and dispatch_sync() functions, we do not need to include our task as a parameter. We can include the task to execute between two curly brackets, exactly as we did previously with the dispatch_async() and dispatch_sync() functions.
As we can see from the dispatch_after() function, we use the dispatch_time_t type to define the time to execute the task. We use the dispatch_time() function to create the dispatch_time_t type. The dispatch_time() function has the following syntax:

func dispatch_time(when: dispatch_time_t, delta: Int64) -> dispatch_time_t

Here, the dispatch_time() function takes the following parameters:

when: This value is used as the basis for the time to execute the task. We generally pass the DISPATCH_TIME_NOW value to create the time, based on the current time.
delta: This is the number of nanoseconds to add to the when parameter to get our time.

We will use the dispatch_time() and dispatch_after() functions like this:

let delayInSeconds = 2.0
let eTime = dispatch_time(DISPATCH_TIME_NOW, Int64(delayInSeconds * Double(NSEC_PER_SEC)))
dispatch_after(eTime, queue2) {
    print("Times Up")
}

The preceding code will execute the task after a two-second delay. In the dispatch_time() function, we create a dispatch_time_t value that is two seconds in the future. The NSEC_PER_SEC constant is used to convert seconds into nanoseconds. After the two-second delay, we print the message, Times Up, to the console.

There is one thing to watch out for with the dispatch_after() function. Let's take a look at the following code:

let queue2 = dispatch_queue_create("squeue.hoffman.jon", DISPATCH_QUEUE_SERIAL)
let delayInSeconds = 2.0
let pTime = dispatch_time(DISPATCH_TIME_NOW, Int64(delayInSeconds * Double(NSEC_PER_SEC)))
dispatch_after(pTime, queue2) {
    print("Times Up")
}
dispatch_sync(queue2) {
    calculation.performCalculation(100000, tag: "sync1")
}

In this code, we begin by creating a serial queue and then adding two tasks to the queue. The first task uses the dispatch_after() function, and the second task uses the dispatch_sync() function. Our initial thought would be that when we executed this code within the serial queue, the first task would execute after a two-second delay and then the second task would execute; however, this would not be correct. The dispatch_after() call submits the first task and returns immediately, which lets the queue execute the next task while it waits for the correct time to run the first one. Therefore, even though we are running the tasks in a serial queue, the second task completes before the first task. The following is an example of the output if we run the preceding code:

time for sync1: 0.00407701730728149
Times Up

The final GCD function that we are going to look at is dispatch_once().

Using the dispatch_once() function

The dispatch_once() function will execute a task once, and only once, for the lifetime of the application. What this means is that the task will be executed and marked as executed, and then that task will not be executed again unless the application restarts. While the dispatch_once() function can be and has been used to implement the singleton pattern, there are other easier ways to do this. The dispatch_once() function is great for executing initialization tasks that need to run when our application initially starts. These initialization tasks can consist of initializing our data store or our variables and objects. The following code shows the syntax for the dispatch_once() function:

func dispatch_once(predicate: UnsafeMutablePointer<dispatch_once_t>, block: dispatch_block_t!)
Let's look at how to use the dispatch_once() function: var token: dispatch_once_t = 0 func example() { dispatch_once(&token) { print("Printed only on the first call") } print("Printed for each call") } In this example, the line that prints the message, Printed only on the first call, will be executed only once, no matter how many times the function is called. However, the line that prints the Printed for each call message will be executed each time the function is called. Let's see this in action by calling this function four times, like this: for i in 0..<4 { example() } If we execute this example, we should see the following output: Printed only on the first call Printed for each call Printed for each call Printed for each call Printed for each call Notice, in this example, that we only see the Printed only on the first call message once whereas we see the Printed for each call message all the four times that we call the function. Now that we have looked at GCD, let's take a look at operation queues. Using NSOperation and NSOperationQueue types The NSOperation and NSOperationQueues types, working together, provide us with an alternative to GCD for adding concurrency to our applications. Operation queues are Cocoa objects that function like dispatch queues and internally, operation queues are implemented using GCD. We define the tasks (NSOperations) that we wish to execute and then add the task to the operation queue (NSOperationQueue). The operation queue will then handle the scheduling and execution of tasks. Operation queues are instances of the NSOperationQueue class and operations are instances of the NSOperation class. The operation represents a single unit of work or task. The NSOperation type is an abstract class that provides a thread-safe structure for modeling the state, priority, and dependencies. This class must be subclassed in order to perform any useful work. Apple does provide two concrete implementations of the NSOperation type that we can use as-is for situations where it does not make sense to build a custom subclass. These subclasses are NSBlockOperation and NSInvocationOperation. More than one operation queue can exist at the same time, and actually, there is always at least one operation queue running. This operation queue is known as the main queue. The main queue is automatically created for the main thread when the application starts and is where all the UI operations are performed. There are several ways that we can use the NSOperation and NSOperationQueues classes to add concurrency to our application. In this artcle, we will look at three different ways. The first one we will look at is using the NSBlockOperation implementation of the NSOperation abstract class. Using the NSBlockOperation implementation of NSOperation In this section, we will be using the same DoCalculation class that we used in the Grand Central Dispatch section to keep our queues busy with work so that we can see how the NSOpererationQueues class work. The NSBlockOperation class is a concrete implementation of the NSOperation type that can manage the execution of one or more blocks. This class can be used to execute several tasks at once without the need to create separate operations for each task. Let's see how to use the NSBlockOperation class to add concurrency to our application. 
The following code shows how to add three tasks to an operation queue using a single NSBlockOperation instance:

let calculation = DoCalculations()
let operationQueue = NSOperationQueue()
let blockOperation1: NSBlockOperation = NSBlockOperation.init(block: {
    calculation.performCalculation(10000000, tag: "Operation 1")
})
blockOperation1.addExecutionBlock({
    calculation.performCalculation(10000, tag: "Operation 2")
})
blockOperation1.addExecutionBlock({
    calculation.performCalculation(1000000, tag: "Operation 3")
})
operationQueue.addOperation(blockOperation1)

In this code, we begin by creating an instance of the DoCalculations class and an instance of the NSOperationQueue class. Next, we create an instance of the NSBlockOperation class using the init constructor. This constructor takes a single parameter, which is a block of code that represents one of the tasks we want to execute in the queue. Next, we add two additional tasks to the NSBlockOperation instance using the addExecutionBlock() method.

This is one of the differences between dispatch queues and operations. With dispatch queues, if resources are available, the tasks are executed as they are added to the queue. With operations, the individual tasks are not executed until the operation itself is submitted to an operation queue.

Once we add all of the tasks to the NSBlockOperation instance, we then add the operation to the NSOperationQueue instance that we created at the beginning of the code. At this point, the individual tasks within the operation start to execute.

This example shows how to use NSBlockOperation to queue up multiple tasks and then pass the tasks to the operation queue. The tasks are started in a FIFO order; therefore, the first task that is added to the NSBlockOperation instance will be the first task executed. However, since the tasks can be executed concurrently if we have the available resources, the output from this code should look similar to this:

time for Operation 2: 0.00546294450759888
time for Operation 3: 0.0800899863243103
time for Operation 1: 0.484337985515594

What if we do not want our tasks to run concurrently? What if we wanted them to run serially, like the serial dispatch queue? We can set a property in our operation queue that defines the number of tasks that can be run concurrently in the queue. The property is called maxConcurrentOperationCount and is used like this:

operationQueue.maxConcurrentOperationCount = 1

However, if we added this line to our previous example, it would not work as expected. To see why this is, we need to understand what the property actually defines. If we look at Apple's NSOperationQueue class reference, the definition of the property says, "The maximum number of queued operations that can execute at the same time." What this tells us is that the maxConcurrentOperationCount property defines the number of operations (this is the key word) that can be executed at the same time. The NSBlockOperation instance, which we added all of our tasks to, represents a single operation; therefore, no other NSBlockOperation added to the queue will execute until the first one is complete, but the individual tasks within the operation will execute concurrently. To run the tasks serially, we would need to create a separate NSBlockOperation instance for each task (a sketch of this follows below). Using an instance of the NSBlockOperation class is a good approach if we have a number of tasks that we want to execute concurrently, but they will not start executing until we add the operation to an operation queue.
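To make the one-operation-per-task idea concrete, here is a minimal, hedged sketch that is not part of the original text; it reuses the DoCalculations class from earlier, and the tag strings are our own:

import Foundation

let calculation = DoCalculations()
let serialOperationQueue = NSOperationQueue()

// Allow only one operation to execute at a time. Because each task
// below is wrapped in its own NSBlockOperation, the tasks now run
// one after another instead of concurrently.
serialOperationQueue.maxConcurrentOperationCount = 1

serialOperationQueue.addOperation(NSBlockOperation(block: {
    calculation.performCalculation(10000000, tag: "Serial 1")
}))
serialOperationQueue.addOperation(NSBlockOperation(block: {
    calculation.performCalculation(10000, tag: "Serial 2")
}))
serialOperationQueue.addOperation(NSBlockOperation(block: {
    calculation.performCalculation(1000000, tag: "Serial 3")
}))

With this arrangement, the operation tagged Serial 2 should not start until Serial 1 has finished, and Serial 3 should not start until Serial 2 has finished.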
Let's look at a simpler way of adding tasks to an operation queue using the queues addOperationWithBlock() methods. Using the addOperationWithBlock() method of the operation queue The NSOperationQueue class has a method named addOperationWithBlock() that makes it easy to add a block of code to the queue. This method automatically wraps the block of code in an operation object and then passes that operation to the queue itself. Let's see how to use this method to add tasks to a queue: let operationQueue = NSOperationQueue() let calculation = DoCalculations() operationQueue.addOperationWithBlock() { calculation.performCalculation(10000000, tag: "Operation1") } operationQueue.addOperationWithBlock() { calculation.performCalculation(10000, tag: "Operation2") } operationQueue.addOperationWithBlock() { calculation.performCalculation(1000000, tag: "Operation3") } In the NSBlockOperation example, earlier in this artcle, we added the tasks that we wished to execute into an NSBlockOperation instance. In this example, we are adding the tasks directly to the operation queue, and each task represents one complete operation. Once we create the instance of the operation queue, we then use the addOperationWithBlock() method to add the tasks to the queue. Also, in the NSBlockOperation example, the individual tasks did not execute until all of the tasks were added to the NSBlockOperation object and then that operation was added to the queue. This addOperationWithBlock() example is similar to the GCD example where the tasks begin executing as soon as they are added to the operation queue. If we run the preceding code, the output should be similar to this: time for Operation2: 0.0115870237350464 time for Operation3: 0.0790849924087524 time for Operation1: 0.520610988140106 You will notice that the operations are executed concurrently. With this example, we can execute the tasks serially by using the maxConcurrentOperationCount property that we mentioned earlier. Let's try this by initializing the NSOperationQueue instance like this: var operationQueue = NSOperationQueue() operationQueue.maxConcurrentOperationCount = 1 Now, if we run the example, the output should be similar to this: time for Operation1: 0.418763995170593 time for Operation2: 0.000427007675170898 time for Operation3: 0.0441589951515198 In this example, we can see that each task waited for the previous task to complete prior to starting. Using the addOperationWithBlock() method to add tasks, the operation queue is generally easier than using the NSBlockOperation method; however, the tasks will begin as soon as they are added to the queue, which is usually the desired behavior. Now, let's look at how we can subclass the NSOperation class to create an operation that we can add directly to an operation queue. Subclassing the NSOperation class The previous two examples showed how to add small blocks of code to our operation queues. In these examples, we called the performCalculations method in the DoCalculation class to perform our tasks. These examples illustrate two really good ways to add concurrency for functionally that is already written, but what if, at design time, we want to design our DoCalculation class for concurrency? For this, we can subclass the NSOperation class. The NSOperation abstract class provides a significant amount of infrastructure. This allows us to very easily create a subclass without a lot of work. We should at least provide an initialization method and a main method. 
The main method will be called when the queue begins executing the operation: Let's see how to implement the DoCalculation class as a subclass of the NSOperation class; we will call this new class MyOperation: class MyOperation: NSOperation { let iterations: Int let tag: String init(iterations: Int, tag: String) { self.iterations = iterations self.tag = tag } override func main() { performCalculation() } func performCalculation() { let start = CFAbsoluteTimeGetCurrent() for var i=0; i<iterations; i++ { self.doCalc() } let end = CFAbsoluteTimeGetCurrent() print("time for (tag): (end-start)") } func doCalc() { let x=100 let y = x*x _ = y/x } } We begin by defining that the MyOperation class is a subclass of the NSOperation class. Within the implementation of the class, we define two class constants, which represent the iteration count and the tag that the performCalculations() method uses. Keep in mind that when the operation queue begins executing the operation, it will call the main() method with no parameters; therefore, any parameters that we need to pass in must be passed in through the initializer. In this example, our initializer takes two parameters that are used to set the iterations and tag classes constants. Then the main() method, that the operation queue is going to call to begin execution of the operation, simply calls the performCalculation() method. We can now very easily add instances of our MyOperation class to an operation queue, like this: var operationQueue = NSOperationQueue() operationQueue.addOperation(MyOperation (iterations: 10000000, tag: "Operation 1")) operationQueue.addOperation(MyOperation (iterations: 10000, tag: "Operation 2")) operationQueue.addOperation(MyOperation (iterations: 1000000, tag: "Operation 3")) If we run this code, we will see the following results: time for Operation 2: 0.00187397003173828 time for Operation 3: 0.104826986789703 time for Operation 1: 0.866684019565582 As we saw earlier, we can also execute the tasks serially by adding the following line, which sets the maxConcurrentOperationCount property of the operation queue: operationQueue.maxConcurrentOperationCount = 1 If we know that we need to execute some functionality concurrently prior to writing the code, I will recommend subclassing the NSOperation class, as shown in this example, rather than using the previous examples. This gives us the cleanest implementation; however, there is nothing wrong with using the NSBlockOperation class or the addOperationWithBlock() methods described earlier in this section. Summary Before we consider adding concurrency to our application, we should make sure that we understand why we are adding it and ask ourselves whether it is necessary. While concurrency can make our application more responsive by offloading work from our main application thread to a background thread, it also adds extra complexity to our code and overhead to our application. I have even seen numerous applications, in various languages, which actually run better after we pulled out some of the concurrency code. This is because the concurrency was not well thought out or planned. With this in mind, it is always a good idea to think and talk about concurrency while we are discussing the application's expected behavior. At the start of this artcle, we had a discussion about running tasks concurrently compared to running tasks in parallel. We also discussed the hardware limitation that limits how many tasks can run in parallel on a given device. 
Having a good understanding of those concepts is very important to understanding how and when to add concurrency to our projects. While GCD is not limited to system-level applications, before we use it in our application, we should consider whether operation queues would be easier and more appropriate for our needs. In general, we should use the highest level of abstraction that meets our needs. This will usually point us to using operation queues; however, there really is nothing preventing us from using GCD, and it may be more appropriate for our needs. One thing to keep in mind with operation queues is that they do add additional overhead because they are Cocoa objects. For the large majority of applications, this little extra overhead should not be an issue or even noticed; however, for some projects, such as games that need every last resource that they can get, this extra overhead might very well be an issue. Resources for Article: Further resources on this subject: Swift for Open Source Developers [article] Your First Swift 2 Project [article] Exploring Swift [article]

Component Composition

Packt
22 Feb 2016
38 min read
In this article, we understand how large-scale JavaScript applications amount to a series of communicating components. Composition is a big topic, and one that's relevant to scalable JavaScript code. When we start thinking about the composition of our components, we start to notice certain flaws in our design; limitations that prevent us from scaling in response to influencers. (For more resources related to this topic, see here.) The composition of a component isn't random—there's a handful of prevalent patterns for JavaScript components. We'll begin the article with a look at some of these generic component types that encapsulate common patterns found in every web application. Understanding that components implement patterns is crucial for extending these generic components in a way that scales. It's one thing to get our component composition right from a purely technical standpoint, it's another to easily map these components to features. The same challenge holds true for components we've already implemented. The way we compose our code needs to provide a level of transparency, so that it's feasible to decompose our components and understand what they're doing, both at runtime and at design time. Finally, we'll take a look at the idea of decoupling business logic from our components. This is nothing new, the idea of separation-of-concerns has been around for a long time. The challenge with JavaScript applications is that it touches so many things—it's difficult to clearly separate business logic from other implementation concerns. The way in which we organize our source code (relative to the components that use them) can have a dramatic effect on our ability to scale. Generic component types It's exceedingly unlikely that anyone, in this day and age, would set out to build a large scale JavaScript application without the help of libraries, a framework, or both. Let's refer to these collectively as tools, since we're more interested in using the tools that help us scale, and not necessarily which tools are better than other tools. At the end of the day, it's up to the development team to decide which tool is best for the application we're building, personal preferences aside. Guiding factors in choosing the tools we use are the type of components they provide, and what these are capable of. For example, a larger web framework may have all the generic components we need. On the other hand, a functional programming utility library might provide a lot of the low-level functionality we need. How these things are composed into a cohesive feature that scales, is for us to figure out. The idea is to find tools that expose generic implementations of the components we need. Often, we'll extend these components, building specific functionality that's unique to our application. This section walks through the most typical components we'd want in a large-scale JavaScript application. Modules Modules exist, in one form or another, in almost every programming language. Except in JavaScript. That's almost untrue though—ECMAScript 6, in it's final draft status at the time of this writing, introduces the notion of modules. However, there're tools out there today that allow us to modularize our code, without relying on the script tag. Large-scale JavaScript code is still a relatively new thing. Things like the script tag weren't meant to address issues like modular code and dependency management. RequireJS is probably the most popular module loader and dependency resolver. 
The fact that we need a library just to load modules into our front-end application speaks of the complexities involved. For example, module dependencies aren't a trivial matter when there's network latency and race conditions to consider. Another option is to use a transpiler like Browserify. This approach is gaining traction because it lets us declare our modules using the CommonJS format. This format is used by NodeJS, and the upcoming ECMAScript module specification is a lot closer to CommonJS than to AMD. The advantage is that the code we write today has better compatibility with back-end JavaScript code, and with the future. Some frameworks, like Angular or Marionette, have their own ideas of what modules are, albeit, more abstract ideas. These modules are more about organizing our code, than they are about tactfully delivering code from the server to the browser. These types of modules might even map better to other features of the framework. For example, if there's a centralized application instance that's used to manage our modules, the framework might provide a mean to manage modules from the application. Take a look at the following diagram: A global application component using modules as it's building blocks. Modules can be small, containing only one feature, or large, containing several features This lets us perform higher-level tasks at the module level (things like disabling modules or configuring them with arguments). Essentially, modules speak for features. They're a packaging mechanism that allows us to encapsulate things about a given feature that the rest of the application doesn't care about. Modules help us scale our application by adding high-level operations to our features, by treating our features as the building blocks. Without modules, we'd have no meaningful way to do this. The composition of modules look different depending on the mechanism used to declare the module. A module could be straightforward, providing a namespace from which objects can be exported. Or if we're using a specific framework module flavor, there could be much more to it. Like automatic event life cycles, or methods for performing boilerplate setup tasks. However we slice it, modules in the context of scalable JavaScript are a means to create larger building blocks, and a means to handle complex dependencies: // main.js // Imports a log() function from the util.js model. import log from 'util.js'; log('Initializing...'); // util.js // Exports a basic console.log() wrapper function. 'use strict'; export default function log(message) { if (console) { console.log(message); } } While it's easier to build large-scale applications with module-sized building blocks, it's also easier to tear a module out of an application and work with it in isolation. If our application is monolithic or our modules are too plentiful and fine-grained, it's very difficult for us to excise problem-spots from our code, or to test work in progress. Our component may function perfectly well on its own. It could have negative side-effects somewhere else in the system, however. If we can remove pieces of the puzzle, one at a time and without too much effort, we can scale the trouble-shooting process. Routers Any large-scale JavaScript application has a significant number of possible URIs. The URI is the address of the page that the user is looking at. They can navigate to this resource by clicking on links, or they may be taken to a new URI automatically by our code, perhaps in response to some user action. 
The web has always relied on URIs, long before the advent of large-scale JavaScript applications. URIs point to resources, and resources can be just about anything. The larger the application, the more resources, and the more potential URIs. Router components are tools we use in the front-end, to listen for these URI change events and respond to them accordingly. There's less reliance on the back-end web servers parsing the URI, and returning the new content. Most web sites still do this, but there're several disadvantages with this approach when it comes to building applications: The browser triggers events when the URI changes, and the router component responds to these changes. The URI changes can be triggered from the history API, or from location.hash The main problem is that we want the UI to be portable, as in, we want to be able to deploy it against any back-end and things should work. Since we're not assembling markup for the URI in the back-end, it doesn't make sense to parse the URI in the back-end either. We declaratively specify all the URI patterns in our router components. We generally refer to these as, routes. Think of a route as a blueprint, and a URI as an instance of that blueprint. This means that when the router receives a URI, it can correlate it to a route. That, in essence, is the responsibility of router components. Which is easy with smaller applications, but when we're talking about scale, further deliberation on router design is in order. As a starting point, we have to consider the URI mechanism we want to use. The two choices are basically listening to hash change events, or utilizing the history API. Using hash-bang URIs is probably the simplest approach. The history API available in every modern browser, on the other hand, lets us format URI's without the hash-bang—they look like real URIs. The router component in the framework we're using may support only one or the other, thus simplifying the decision. Some support both URI approaches, in which case we need to decide which one works best for our application. The next thing to consider about routing in our architecture is how to react to route changes. There're generally two approaches to this. The first is to declaratively bind a route to a callback function. This is ideal when the router doesn't have a lot of routes. The second approach is to trigger events when routes are activated. This means that there's nothing directly bound to the router. Instead, some other component listens for such an event. This approach is beneficial when there are lots of routes, because the router has no knowledge of the components, just the routes. Here's an example that shows a router component listening to route events: // router.js import Events from 'events.js' // A router is a type of event broker, it // can trigger routes, and listen to route // changes. export default class Router extends Events { // If a route configuration object is passed, // then we iterate over it, calling listen() // on each route name. This is translating from // route specs to event listeners. constructor(routes) { super(); if (routes != null) { for (let key of Object.keys(routes)) { this.listen(key, routes[key]); } } } // This is called when the caller is ready to start // responding to route events. We listen to the // "onhashchange" window event. We manually call // our handler here to process the current route. 
start() { window.addEventListener('hashchange', this.onHashChange.bind(this)); this.onHashChange(); } // When there's a route change, we translate this into // a triggered event. Remember, this router is also an // event broker. The event name is the current URI. onHashChange() { this.trigger(location.hash, location.hash); } }; // Creates a router instance, and uses two different // approaches to listening to routes. // // The first is by passing configuration to the Router. // The key is the actual route, and the value is the // callback function. // // The second uses the listen() method of the router, // where the event name is the actual route, and the // callback function is called when the route is activated. // // Nothing is triggered until the start() method is called, // which gives us an opportunity to set everything up. For // example, the callback functions that respond to routes // might require something to be configured before they can // run. import Router from 'router.js' function logRoute(route) { console.log(`${route} activated`); } var router = new Router({ '#route1': logRoute }); router.listen('#route2', logRoute); router.start(); Some of the code required to run these examples is omitted from the listings. For example, the events.js module is included in the code bundle that comes with this book, it's just not that relevant to the example. Also in the interest of space, the code examples avoid using specific frameworks and libraries. In practice, we're not going to write our own router or events API—our frameworks do that already. We're instead using vanillaES6 JavaScript, to illustrate points pertinent to scaling our applications Another architectural consideration we'll want to make when it comes to routing is whether we want a global, monolithic router, or a router per module, or some other component. The downside to having a monolithic router is that it becomes difficult to scale when it grows sufficiently large, as we keep adding features and routes. The advantage is that the routes are all declared in one place. Monolithic routers can still trigger events that all our components can listen to. The per-module approach to routing involves multiple router instances. For example, if our application has five components, each would have their own router. The advantage here is that the module is completely self-contained. Anyone working with this module doesn't need to look elsewhere to figure out which routes it responds to. Using this approach, we can also have a tighter coupling between the route definitions and the functions that respond to them, which could mean simpler code. The downside to this approach is that we lose the consolidated aspect of having all our routes declared in a central place. Take a look at the following diagram: The router to the left is global—all modules use the same instance to respond to URI events. The modules to the right have their own routers. These instances contain configuration specific to the module, not the entire application Depending on the capabilities of the framework we're using, the router components may or may not support multiple router instances. It may only be possible to have one callback function per route. There may be subtle nuances to the router events we're not yet aware of. Models/Collections The API our application interacts with exposes entities. Once these entities have been transferred to the browser, we will store a model of those entities. 
Collections are a bunch of related entities, usually of the same type. The tools we're using may or may not provide a generic model and/or collection components, or they may have something similar but named differently. The goal of modeling API data is a rough approximation of the API entity. This could be as simple as storing models as plain JavaScript objects and collections as arrays. The challenge with simply storing our API entities as plain objects in arrays is that some other component is then responsible for talking to the API, triggering events when the data changes, and for performing data transformations. We want other components to be able to transform collections and models where needed, in order to fulfill their duties. But we don't want repetitive code, and it's best if we're able to encapsulate the common things like transformations, API calls, and event life cycles. Take a look at the next diagram: Models encapsulate interaction with APIs, parsing data, and triggering events when data changes. This leads to simpler code outside of the models Hiding the details of how the API data is loaded into the browser, or how we issue commands, helps us scale our application as we grow. As we add more entities to the API, the complexity of our code grows too. We can throttle this complexity by constraining the API interactions to our model and collection components. Another scalability issue we'll face with our models and collections is where they fit in the big picture. That is, our application is really just one big component, composed of smaller components. Our models and collections map well to our API, but not necessarily to features. API entities are more generic than specific features, and are often used by several features. Which leaves us with an open question—where do our models and collections fit into components? Here's an example that shows specific views extending generic views. The same model can be passed to both: // A super simple model class. class Model { constructor(first, last, age) { this.first = first; this.last = last; this.age = age; } } // The base view, with a name method that // generates some output. class BaseView { name() { return `${this.model.first} ${this.model.last}`; } } // Extends BaseView with a constructor that accepts // a model and stores a reference to it. class GenericModelView extends BaseView { constructor(model) { super(); this.model = model; } } // Extends GenericModelView with specific constructor // arguments. class SpecificModelView extends BaseView { constructor(first, last, age) { super(); this.model = new Model(...arguments); } } var properties = [ 'Terri', 'Hodges', 41 ]; // Make sure the data is the same in both views. // The name() method should return the same result... console.log('generic view', new GenericModelView(new Model(...properties)).name()); console.log('specific view', new SpecificModelView(...properties).name()); On one hand, components can be completely generic with regard to the models and collections they use. On the other hand, some components are specific with their requirements—they can directly instantiate their collections. Configuring generic components with specific models and collections at runtime only benefits us when the component truly is generic, and is used in several places. Otherwise, we might as well encapsulate the models within the components that use them. Choosing the right approach helps us scale. Because, not all our components will be entirely generic or entirely specific. 
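Before moving on to controllers and views, here's one way the encapsulation described above might look in code. This is only a rough sketch: it assumes the same events.js broker used in the earlier listings (with its listen() and trigger() methods), and the /api/users endpoint, the "sync" event name, and the sortBy() helper are illustrative assumptions rather than parts of any particular framework.

import Events from 'events.js';

// A collection that hides how its entities are loaded. It owns the
// API call, triggers a "sync" event when data arrives, and keeps a
// common transformation in one place.
class UserCollection extends Events {
  constructor(baseUrl = '/api/users') {
    super();
    this.baseUrl = baseUrl;
    this.items = [];
  }

  // Loads entities from the API and triggers a "sync" event.
  // Components that render this collection listen for "sync";
  // they never talk to the API themselves.
  fetch() {
    return window.fetch(this.baseUrl)
      .then((response) => response.json())
      .then((items) => {
        this.items = items;
        this.trigger('sync', this);
        return this;
      });
  }

  // A common transformation, kept here so that views don't have to
  // repeat it. Returns a new array and leaves the collection untouched.
  sortBy(key) {
    return [...this.items].sort(
      (a, b) => (a[key] < b[key] ? -1 : a[key] > b[key] ? 1 : 0));
  }
}

// Views listen for "sync" instead of performing the XHR work themselves.
var users = new UserCollection();
users.listen('sync', (collection) => {
  console.log('loaded', collection.items.length, 'users');
});
users.fetch();

Because the API interaction lives in one place, swapping the endpoint, or stubbing it out entirely during testing, doesn't touch the components that render the data.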
Controllers/Views Depending on the framework we're using, and the design patterns our team is following, controllers and views can represent different things. There are simply too many MV* pattern and style variations to provide a meaningful distinction in terms of scale. The minute differences have trade-offs relative to similar but different MV* approaches. For our purpose of discussing large-scale JavaScript code, we'll treat them as the same type of component. If we decide to separate the two concepts in our implementation, the ideas in this section will be relevant to both types. Let's stick with the term views for now, knowing that we're covering both views and controllers, conceptually. These components interact with several other component types, including routers, models or collections, and templates, which are discussed in the next section. When something happens, the user needs to be notified about it. The view's job is to update the DOM. This could be as simple as changing an attribute on a DOM element, or as involved as rendering a new template:

A view component updating the DOM in response to router and model events

A view can update the DOM in response to several types of events. A route could have changed. A model could have been updated. Or something more direct, like a method call on the view component. Updating the DOM is not as straightforward as one might think. There's the performance to think about: what happens when our view is flooded with events? There's the latency to think about: how long will this JavaScript call stack run before stopping and actually allowing the DOM to render? Another responsibility of our views is responding to DOM events. These are usually triggered by the user interacting with our UI. The interaction may start and end with our view. For example, depending on the state of something like user input or one of our models, we might update the DOM with a message. Or we might do nothing, if the event handler is debounced, for instance. A debounced function groups multiple calls into one. For example, calling foo() 20 times in 10 milliseconds may only result in the implementation of foo() being called once. For a more detailed explanation, look at: http://drupalmotion.com/article/debounce-and-throttle-visual-explanation. Most of the time, the DOM events get translated into something else, either a method call or another event. For example, we might call a method on a model, or transform a collection. The end result, most of the time, is that we provide feedback by updating the DOM. This can be done either directly, or indirectly. In the case of direct DOM updates, it's simple to scale. In the case of indirect updates, or updates through side effects, scaling becomes more of a challenge. This is because, as the application acquires more moving parts, it becomes more difficult to form a mental map of cause and effect. Here's an example that shows a view listening to DOM events and model events. import Events from 'events.js'; // A basic model. It extends "Events" so it // can listen to events triggered by other components. class Model extends Events { constructor(enabled) { super(); this.enabled = !!enabled; } // Setters and getters for the "enabled" property. // Setting it also triggers an event. So other components // can listen to the "enabled" event.
set enabled(enabled) { this._enabled = enabled; this.trigger('enabled', enabled); } get enabled() { return this._enabled; } } // A view component that takes a model and a DOM element // as arguments. class View { constructor(element, model) { // When the model triggers the "enabled" event, // we adjust the DOM. model.listen('enabled', (enabled) => { element.setAttribute('disabled', !enabled); }); // Set the state of the model when the element is // clicked. This will trigger the listener above. element.addEventListener('click', () => { model.enabled = false; }); } } new View(document.getElementById('set'), new Model()); On the plus side to all this complexity, we actually get some reusable code. The view is agnostic as to how the model or router it's listening to is updated. All it cares about is specific events on specific components. This is actually helpful to us because it reduces the amount of special-case handling we need to implement. The DOM structure that's generated at runtime, as a result of rendering all our views, needs to be taken into consideration as well. For example, if we look at some of the top-level DOM nodes, they have nested structure within them. It's these top-level nodes that form the skeleton of our layout. Perhaps this is rendered by the main application view, and each of our views has a child-relationship to it. Or perhaps the hierarchy extends further down than that. The tools we're using most likely have mechanisms for dealing with these parent-child relationships. However, bear in mind that vast view hierarchies are difficult to scale. Templates Template engines used to reside mostly in the back-end framework. That's less true today, thanks in a large part to the sophisticated template rendering libraries available in the front-end. With large-scale JavaScript applications, we rarely talk to back-end services about UI-specific things. We don't say, "here's a URL, render the HTML for me". The trend is to give our JavaScript components a certain level autonomy—letting them render their own markup. Having component markup coupled with the components that render them is a good thing. It means that we can easily discern where the markup in the DOM is being generated. We can then diagnose issues and tweak the design of a large scale application. Templates help establish a separation of concerns with each of our components. The markup that's rendered in the browser mostly comes from the template. This keeps markup-specific code out of our JavaScript. Front-end template engines aren't just tools for string replacement; they often have other tools to help reduce the amount of boilerplate JavaScript code to write. For example, we can embed things like conditionals and for-each loops in our markup, where they're suited. Application-specific components The component types we've discussed so far are very useful for implementing scalable JavaScript code, but they're also very generic. Inevitably, during implementation we're going to hit a road block—the component composition patterns we've been following will not work for certain features. This is when it's time to step back and think about possibly adding a new type of component to our architecture. For example, consider the idea of widgets. These are generic components that are mainly focused on presentation and user interactions. Let's say that many of our views are using the exact same DOM elements, and the exact same event handlers. There's no point in repeating them in every view throughout our application. 
Might it be better if we were to factor it into a common component? A view might be overkill, perhaps we need a new type of widget component? Sometimes we'll create components for the sole purpose of composition. For example, we might have a component that glues together router, view, model/collection, and template components together to form a cohesive unit. Modules partially solve this problem but they aren't always enough. Sometimes we're missing that added bit of orchestration that our components need in order to communicate. Extending generic components We often discover, late in the development process, that the components we rely on are lacking something we need. If the base component we're using is designed well, then we can extend it, plugging in the new properties or functionality we need. In this section, we'll walk through some scenarios where we might need to extend the common generic components used throughout our application. If we're going to scale our code, we need to leverage these base components where we can. We'll probably want to start extending our own base components at some point too. Some tools are better than others at facilitating the extension mechanism through which we implement this specialized behavior. Identifying common data and functionality Before we look at extending the specific component types, it's worthwhile to consider the common properties and functionality that's common across all component types. Some of these things will be obvious up-front, while others are less pronounced. Our ability to scale depends, in part, on our ability to identify commonality across our components. If we have a global application instance, quite common in large JavaScript applications, global values and functionality can live there. This can grow unruly down the line though, as more common things are discovered. Another approach might be to have several global modules, as shown in the following diagram, instead of just a single application instance. Or both. But this doesn't scale from an understandability perspective: The ideal component hierarchy doesn't extend beyond three levels. The top level is usually found in a framework our application depends on As a rule-of-thumb, we should, for any given component, avoid extending it more than three levels down. For example, a generic view component from the tools we're using could be extended by our generic version of it. This would include properties and functionality that every view instance in our application requires. This is only a two-level hierarchy, and easy to manage. This means that if any given component needs to extend our generic view, it can do so without complicating things. Three-levels should be the maximum extension hierarchy depth for any given type. This is just enough to avoid unnecessary global data, going beyond this presents scaling issues because the hierarchy isn't easily grasped. Extending router components Our application may only require a single router instance. Even in this case, we may still need to override certain extension points of the generic router. In case of multiple router instances, there's bound to be common properties and functionality that we don't want to repeat. For example, if every route in our application follows the same pattern, with only subtle differences, we can implement the tools in our base router to avoid repetitious code. In addition to declaring routes, events take place when a given route is activated. 
Depending on the architecture of our application, different things need to happen. Maybe certain things always need to happen, no matter which route has been activated. This is where extending the router to provide our own functionality comes in handy. For example, we might have to validate permissions for a given route. It wouldn't make much sense for us to handle this through individual components, as this would not scale well with complex access control rules and a lot of routes. Extending models/collections Our models and collections, no matter what their specific implementation looks like, will share some common properties with one another, especially if they're targeting the same API, which is the common case. The specifics of a given model or collection revolve around the API endpoint, the data returned, and the possible actions taken. It's likely that we'll target the same base API path for all entities, and that all entities have a handful of shared properties. Rather than repeat ourselves in every model or collection instance, it's better to abstract the common data. In addition to sharing properties among our models and collections, we can share common behavior. For instance, it's quite likely that a given model isn't going to have sufficient data for a given feature. Perhaps that data can be derived by transforming the model. These types of transformations can be common, and abstracted in a base model or collection. It really depends on the types of features we're implementing and how consistent they are with one another. If we're growing fast and getting lots of requests for "outside-the-box" features, then we're more likely to implement data transformations inside the views that require these one-off changes to the models or collections they're using. Most frameworks take care of the nuances of performing XHR requests to fetch our data or perform actions. That's not the whole story, unfortunately, because our features will rarely map one-to-one with a single API entity. More likely, we will have a feature that requires several collections that are related to one another somehow, and a transformed collection. This type of operation can grow complex quickly, because we have to work with multiple XHR requests. We'll likely use promises to synchronize the fetching of these requests, and then perform the data transformation once we have all the necessary sources. Here's an example that shows a specific model extending a generic model, to provide new fetching behavior: // The base fetch() implementation of a model sets // some property values, and resolves the promise. class BaseModel { fetch() { return new Promise((resolve, reject) => { this.id = 1; this.name = 'foo'; resolve(this); }); } } // Extends BaseModel with a specific implementation // of fetch(). class SpecificModel extends BaseModel { // Overrides the base fetch() method. Returns // a promise that combines the original // implementation and the result of calling fetchSettings(). fetch() { return Promise.all([ super.fetch(), this.fetchSettings() ]); } // Returns a new Promise instance. Also sets a new // model property. fetchSettings() { return new Promise((resolve, reject) => { this.enabled = true; resolve(this); }); } } // Make sure the properties are all in place, as expected, // after the fetch() call completes.
new SpecificModel().fetch().then((result) => { var [ model ] = result; console.assert(model.id === 1, 'id'); console.assert(model.name === 'foo', 'name'); console.assert(model.enabled, 'enabled'); console.log('fetched'); }); Extending controllers/views When we have a base model or base collection, there are often properties shared between our controllers or views. That's because the job of a controller or a view is to render model or collection data. For example, if the same view is rendering the same model properties over and over, we can probably move that bit to a base view, and extend from that. Perhaps the repetitive parts are in the templates themselves. This means that we might want to consider having a base template inside a base view, as shown in the following diagram. Views that extend this base view inherit this base template. Depending on the library or framework at our disposal, extending templates like this may not be feasible. Or the nature of our features may make this difficult to achieve. For example, there might not be a common base template, but there might be a lot of smaller views and templates that can plug into larger components:

A view that extends a base view can populate the template of the base view, as well as inherit other base view functionalities

Our views also need to respond to user interactions. They may respond directly, or forward the events up the component hierarchy. In either case, if our features are at all consistent, there will be some common DOM event handling that we'll want to abstract into a common base view. This is a huge help in scaling our application, because as we add more features, the amount of new DOM event handling code is minimized. Mapping features to components Now that we have a handle on the most common JavaScript components, and the ways we'll want to extend them for use in our application, it's time to think about how to glue those components together. A router on its own isn't very useful. Nor is a standalone model, template, or controller. Instead, we want these things to work together, to form a cohesive unit that realizes a feature in our application. To do that, we have to map our features to components. We can't do this haphazardly either; we need to think about what's generic about our feature, and about what makes it unique. These feature properties will guide our design decisions on producing something that scales. Generic features Perhaps the most important aspects of component composition are consistency and reusability. While considering the scaling influences our application faces, we'll come up with a list of traits that all our components must carry: things like user management, access control, and other traits unique to our application. These, along with the other architectural perspectives (explored in more depth throughout the remainder of the book), form the core of our generic features:

A generic component, composed of other generic components from our framework.

The generic aspects of every feature in our application serve as a blueprint. They inform us when composing larger building blocks. These generic features account for the architectural factors that help us scale. And if we can encode these factors as parts of an aggregate component, we'll have an easier time scaling our application. What makes this design task challenging is that we have to look at these generic components not only from a scalable architecture perspective, but also from a feature-complete perspective.
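To make the blueprint idea concrete, here's a rough sketch of an aggregate feature component that glues the generic pieces together. It reuses the Router from the earlier listing; the UserModel, UserView, and the #users route are stand-ins invented for this example rather than components of a real framework.

import Router from 'router.js';

// Stand-ins for our generic base components. In a real application
// these would be the extended base model and base view discussed above.
class UserModel {
  fetch() {
    return Promise.resolve({ name: 'Terri' });
  }
}

class UserView {
  render(data) {
    console.log('rendering', data.name);
  }
}

// An aggregate feature component. Its only job is to wire the generic
// parts together: when its route activates, it fetches the model and
// renders the view.
class UserFeature {
  constructor(router) {
    this.model = new UserModel();
    this.view = new UserView();
    router.listen('#users', () => {
      this.model.fetch().then((data) => this.view.render(data));
    });
  }
}

var router = new Router();
new UserFeature(router);
router.start();

Every feature that follows this shape can be added or removed without touching the others, which is the kind of consistency the generic blueprint is meant to buy us.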
We'd like to think that if every feature behaved the same way, we'd be all set. If every feature followed an identical pattern, the sky would be the limit when it comes time to scale. But 100% consistent feature functionality is an illusion, more visible to JavaScript programmers than to users. The pattern breaks out of necessity. It's responding to this breakage in a scalable way that matters. This is why successful JavaScript applications continuously revisit the generic aspects of their features to ensure they reflect reality. Specific features When it's time to implement something that doesn't fit the pattern, we're faced with a scaling challenge. We have to pivot, and consider the consequences of introducing such a feature into our architecture. When patterns are broken, our architecture needs to change. This isn't a bad thing; it's a necessity. The limiting factor in our ability to scale in response to these new features lies with the generic aspects of our existing features. This means that we can't be too rigid with our generic feature components. If we're too demanding, we're setting ourselves up for failure. Before making any brash architectural decisions stemming from offbeat features, think about the specific scaling consequences. For example, does it really matter that the new feature uses a different layout and requires a template that's different from all other feature components? The state of the JavaScript scaling art revolves around finding the handful of essential blueprints to follow for our component composition. Everything else is up for discussion on how to proceed. Decomposing components Component composition is an activity that creates order: larger behavior out of smaller parts. We often need to move in the opposite direction during development. Even after development, we can learn how a component works by tearing the code apart and watching it run in different contexts. Component decomposition means that we're able to take the system apart and examine individual parts in a somewhat structured approach. Maintaining and debugging components Over the course of application development, our components accumulate abstractions. We do this to support a feature's requirements better, while simultaneously supporting some architectural property that helps us scale. The problem is that as the abstractions accumulate, we lose transparency into the functioning of our components. This transparency is not only essential for diagnosing and fixing issues, but also for how easy the code is to learn. For example, if there's a lot of indirection, it takes longer for a programmer to trace cause to effect. Time wasted on tracing code reduces our ability to scale from a developmental point of view. We're faced with two opposing problems. First, we need abstractions to address real-world feature requirements and architectural constraints. Second, the same abstractions erode our ability to master our own code due to a lack of transparency. The following is an example that shows a renderer component and a feature component. Renderers used by the feature are easily substitutable: // A Renderer instance takes a renderer function // as an argument. The render() method returns the // result of calling the function. class Renderer { constructor(renderer) { this.renderer = renderer; } render() { return this.renderer ? this.renderer(this) : ''; } } // A feature defines an output pattern. It accepts // header, content, and footer arguments. These are // Renderer instances.
class Feature { constructor(header, content, footer) { this.header = header; this.content = content; this.footer = footer; } // Renders the sections of the view. Each section // either has a renderer, or it doesn't. Either way, // content is returned. render() { var header = this.header ? `${this.header.render()}n` : '', content = this.content ? `${this.content.render()}n` : '', footer = this.footer ? this.footer.render() : ''; return `${header}${content}${footer}`; } } // Constructs a new feature with renderers for three sections. var feature = new Feature( new Renderer(() => { return 'Header'; }), new Renderer(() => { return 'Content'; }), new Renderer(() => { return 'Footer'; }) ); console.log(feature.render()); // Remove the header section completely, replace the footer // section with a new renderer, and check the result. delete feature.header; feature.footer = new Renderer(() => { return 'Test Footer'; }); console.log(feature.render()); A tactic that can help us cope with these two opposing scaling influencers is substitutability. In particular, the ease with which one of our components, or sub-components, can be replaced with something else. This should be really easy to do. So before we go introducing layers of abstraction, we need to consider how easy it's going to be to replace a complex component with a simple one. This can help programmers learn the code, and also help with debugging. For example, if we're able to take a complex component out of the system and replace it with a dummy component, we can simplify the debugging process. If the error goes away after the component is replaced, we have found the problematic component. Otherwise, we can rule out a component and keep digging elsewhere. Re-factoring complex components It's of course easier said than done to implement substitutability with our components, especially in the face of deadlines. Once it becomes impractical to easily replace components with others, it's time to consider re-factoring our code. Or at least the parts that make substitutability infeasible. It's a balancing act, getting the right level of encapsulation, and the right level of transparency. Substitution can also be helpful at a more granular level. For example, let's say a view method is long and complex. If there are several stages during the execution of that method, where we would like to run something custom, we can't. It's better to re-factor the single method into a handful of methods, each of which can be overridden. Pluggable business logic Not all of our business logic needs to live inside our components, encapsulated from the outside world. Instead, it would be ideal if we could write our business logic as a set of functions. In theory, this provides us with a clear separation of concerns. The components are there to deal with the specific architectural concerns that help us scale, and the business logic can be plugged into any component. In practice, excising business logic from components isn't trivial. Extending versus configuring There're two approaches we can take when it comes to building our components. As a starting point, we have the tools provided by our libraries and frameworks. From there, we can keep extending these tools, getting more specific as we drill deeper and deeper into our features. Alternatively, we can provide our component instances with configuration values. These instruct the component on how to behave. 
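Here's a small sketch of the same kind of specialization expressed both ways; the ListView class, its options, and the sample data are invented for illustration and don't come from any particular library.

// A generic list view that can be configured at construction time.
class ListView {
  constructor(options = {}) {
    this.sortKey = options.sortKey || 'name';
    // Only install a renderer if the caller configured one; otherwise
    // the prototype method below is used.
    if (options.renderItem) {
      this.renderItem = options.renderItem;
    }
  }

  // The default item renderer.
  renderItem(item) {
    return `<li>${item[this.sortKey]}</li>`;
  }

  render(items) {
    return [...items]
      .sort((a, b) => (a[this.sortKey] < b[this.sortKey] ? -1 : 1))
      .map((item) => this.renderItem(item))
      .join('');
  }
}

// The same specialization, expressed by extending rather than
// configuring. The caller no longer supplies anything.
class UserListView extends ListView {
  constructor() {
    super({ sortKey: 'last' });
  }

  renderItem(item) {
    return `<li>${item.last}, ${item.first}</li>`;
  }
}

var users = [
  { first: 'Terri', last: 'Hodges' },
  { first: 'Pat', last: 'Jones' }
];

// Configured at the call site...
console.log(new ListView({ sortKey: 'first' }).render(users));
// ...versus encapsulated in a subclass.
console.log(new UserListView().render(users));

Both produce the same kind of output; the difference is where the knowledge about the specialization lives.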
The advantage of extending things that would otherwise need to be configured is that the caller doesn't need to worry about them. And if we can get by, using this approach, all the better, because it leads to simpler code. Especially the code that's using the component. On the other hand, we could have generic feature components that can be used for a specific purpose, if only they support this configuration or that configuration option. This approach has the advantage of simpler component hierarchies, and less overall components. Sometimes it's better to keep components as generic as possible, within the realm of understandability. That way, when we need a generic component for a specific feature, we can use it without having to re-define our hierarchy. Of course, there's more complexity involved for the caller of that component, because they need to supply it with the configuration values. It's a trade-off that's up to us, the JavaScript architects of our application. Do we want to encapsulate everything, configure everything, or do we want to strike a balance between the two? Stateless business logic With functional programming, functions don't have side effects. In some languages, this property is enforced, in JavaScript it isn't. However, we can still implement side-effect-free functions in JavaScript. If a function takes arguments, and always returns the same output based on those arguments, then the function can be said to be stateless. It doesn't depend on the state of a component, and it doesn't change the state of a component. It just computes a value. If we can establish a library of business logic that's implemented this way, we can design some super flexible components. Rather than implement this logic directly in a component, we pass the behavior into the component. That way, different components can utilize the same stateless business logic functions. The tricky part is finding the right functions that can be implemented this way. It's not a good idea to implement these up-front. Instead, as the iterations of our application development progress, we can use this strategy to re-factor code into generic stateless functions that are shared by any component capable of using them. This leads to business logic that's implemented in a focused way, and components that are small, generic, and reusable in a variety of contexts. Organizing component code In addition to composing our components in such a way that helps our application scale, we need to consider the structure of our source code modules too. When we first start off with a given project, our source code files tend to map well to what's running in the client's browser. Over time, as we accumulate more features and components, earlier decisions on how to organize our source tree can dilute this strong mapping. When tracing runtime behavior to our source code, the less mental effort involved, the better. We can scale to more stable features this way because our efforts are focused more on the design problems of the day—the things that directly provide customer value: The diagram shows the mapping component parts to their implementation artifacts There's another dimension to code organization in the context of our architecture, and that's our ability to isolate specific code. We should treat our code just like our runtime components, which are self-sustained units that we can turn on or off. That is, we should be able to find all the source code files required for a given component, without having to hunt them down. 
If a component requires, say, 10 source code files—JavaScript, HTML, and CSS—then ideally these should all be found in the same directory. The exception, of course, is generic base functionality that's shared by all components. These should be as close to the surface as possible. Then it's easy to trace our component dependencies; they will all point to the top of the hierarchy. It's a challenge to scale the dependency graph when our component dependencies are all over the place. Summary This article introduced us to the concept of component composition. Components are the building blocks of a scalable JavaScript application. The common components we're likely to encounter include things like modules, models/collections, controllers/views, and templates. While these patterns help us achieve a level of consistency, they're not enough on their own to make our code work well under various scaling influencers. This is why we need to extend these components, providing our own generic implementations that specific features of our application can further extend and use. Depending on the various scaling factors our application encounters, different approaches may be taken in getting generic functionality into our components. One approach is to keep extending the component hierarchy, and keep everything encapsulated and hidden away from the outside world. Another approach is to plug logic and properties into components when they're created. The cost is more complexity for the code that's using the components. We ended the article with a look at how we might go about organizing our source code; so that it's structure better reflects that of our logical component design. This helps us scale our development effort, and helps isolate one component's code from others'. It's one thing to have well crafted components that stand by themselves. It's quite another to implement scalable component communication. For more information, refer to: https://www.packtpub.com/web-development/javascript-and-json-essentials https://www.packtpub.com/application-development/learning-javascript-data-structures-and-algorithms Resources for Article: Further resources on this subject: Welcome to JavaScript in the full stack [Article] Components of PrimeFaces Extensions [Article] Unlocking the JavaScript Core [Article]

Publication of Apps

Packt
22 Feb 2016
10 min read
Ever wondered if you could prepare and publish an app on Google Play and you needed a short article on how you could get this done quickly? Here it is! Go ahead, read this piece of article, and you'll be able to get your app running on Google Play. (For more resources related to this topic, see here.) Preparing to publish You probably don't want to upload any of the apps from this book, so the first step is to develop an app that you want to publish. Head over to https://play.google.com/apps/publish/ and follow the instructions to get a Google Play developer account. This was $25 at the time of writing and is a one-time charge with no limit on the number of apps you can publish. Creating an app icon Exactly how to design an icon is beyond the remit of this book. But, simply put, you need to create a nice image for each of the Android screen density categorizations. This is easier than it sounds. Design one nice app icon in your favorite drawing program and save it as a .png file. Then, visit http://romannurik.github.io/AndroidAssetStudio/icons-launcher.html. This will turn your single icon into a complete set of icons for every single screen density. Warning! The trade-off for using this service is that the website will collect your e-mail address for their own marketing purposes. There are many sites that offer a similar free service. Once you have downloaded your .zip file from the preceding site, you can simply copy the res folder from the download into the main folder within the project explorer. All icons at all densities have now been updated. Preparing the required resources When we log into Google Play to create a new listing in the store, there is nothing technical to handle, but we do need to prepare quite a few images that we will need to upload. Prepare upto 8 screenshots for each device type (a phone/tablet/TV/watch) that your app is compatible with. Don't crop or pad these images. Create a 512 x 512 pixel image that will be used to show off your app icon on the Google Play store. You can prepare your own icon, or the process of creating app icons that we just discussed will have already autogenerated icons for you. You also need to create three banner graphics, which are as follows: 1024 x 500 180 x 120 320 x 180 These can be screenshots, but it is usually worth taking a little time to create something a bit more special. If you are not artistically minded, you can place a screenshot inside some quite cool device art and then simply add a background image. You can generate some device art at https://developer.android.com/distribute/tools/promote/device-art.html. Then, just add the title or feature of your app to the background. The following banner was created with no skill at all, just with a pretty background purchased for $10 and the device art tool I just mentioned: Also, consider creating a video of your app. Recording video of your Android device is nearly impossible unless your device is rooted. I cannot recommend you to root your device; however, there is a tool called ARC (App Runtime for Chrome) that enables you to run APK files on your desktop. There is no debugging output, but it can run a demanding app a lot more smoothly than the emulator. It will then be quite simple to use a free, open source desktop capture program such as OBS (Open Broadcaster Software) to record your app running within ARC. You can learn more about ARC at https://developer.chrome.com/apps/getstarted_arc and about OBS at https://obsproject.com/. 
Building the publishable APK file What we are doing in this section is preparing the file that we will upload to Google Play. The format of the file we will create is .apk. This type of file is often referred to as an APK. The actual contents of this file are the compiled class files, all the resources that we've added, and the files and resources that Android Studio has autogenerated. We don't need to concern ourselves with the details, as we just need to follow these steps. The steps not only create the APK, but they also create a key and sign your app with the key. This process is required and it also protects the ownership of your app:   Note that this is not the same thing as copy protection/digital rights management. In Android Studio, open the project that you want to publish and navigate to Build | Generate Signed APK and a pop-up window will open, as shown: In the Generate Signed APK window, click on the Create new button. After this, you will see the New Key Store window, as shown in the following screenshot: In the Key store path field, browse to a location on your hard disk where you would like to keep your new key, and enter a name for your key store. If you don't have a preference, simply enter keys and click on OK. Add a password and then retype it to confirm it. Next, you need to choose an alias and type it into the Alias field. You can treat this like a name for your key. It can be any word that you like. Now, enter another password for the key itself and type it again to confirm. Leave Validity (years) at its default value of 25. Now, all you need to do is fill out your personal/business details. This doesn't need to be 100% complete as the only mandatory field is First and Last Name. Click on the OK button to continue. You will be taken back to the Generate Signed APK window with all the fields completed and ready to proceed, as shown in the following window: Now, click on Next to move to the next screen: Choose where you would like to export your new APK file and select release for the Build Type field. Click on Finish and Android Studio will build the shiny new APK into the location you've specified, ready to be uploaded to the App Store. Taking a backup of your key store in multiple safe places! The key store is extremely valuable. If you lose it, you will effectively lose control over your app. For example, if you try to update an app that you have on Google Play, it will need to be signed by the same key. Without it, you would not be able to update it. Think of the chaos if you had lots of users and your app needed a database update, but you had to issue a whole new app because of a lost key store. As we will need it quite soon, locate the file that has been built and ends in the .apk extension. Publishing the app Log in to your developer account at https://play.google.com/apps/publish/. From the left-hand side of your developer console, make sure that the All applications tab is selected, as shown: On the top right-hand side corner, click on the Add new application button, as shown in the next screenshot: Now, we have a bit of form filling to do, and you will need all the images from the Preparing to publish section that is near the start of the chapter. In the ADD NEW APPLICATION window shown next, choose a default language and type the title of your application: Now, click on the Upload APK button and then the Upload your first APK button and browse to the APK file that you built and signed in. 
Wait for the file to finish uploading: Now, from the inner left-hand side menu, click on Store Listing: We are faced with a fair bit of form filling here. If, however, you have all your images to hand, you can get through this in about 10 minutes. Almost all the fields are self-explanatory, and the ones that aren't have helpful tips next to the field entry box. Here are a few hints and tips to make the process smooth and produce a good end result: In the Full description and Short description fields, you enter the text that will be shown to potential users/buyers of your app. Be sure to make the description as enticing and exciting as you can. Mention all the best features in a clear list, but start the description with one sentence that sums up your app and what it does. Don't worry about the New content rating field as we will cover that in a minute. If you haven't built your app for tablet/phone devices, then don't add images in these tabs. If you have, however, make sure that you add a full range of images for each because these are the only images that the users of this type of device will see. When you have completed the form, click on the Save draft button at the top-right corner of the web page. Now, click on the Content rating tab and you can answer questions about your app to get a content rating that is valid (and sometimes varied) across multiple countries. The last tab you need to complete is the Pricing and Distribution tab. Click on this tab and choose the Paid or Free distribution button. Then, enter a price if you've chosen Paid. Note that if you choose Free, you can never change this. You can, however, unpublish it. If you chose Paid, you can click on Auto-convert prices now to set up equivalent pricing for all currencies around the world. In the DISTRIBUTE IN THESE COUNTRIES section, you can select countries individually or check the SELECT ALL COUNTRIES checkbox, as shown in the next screenshot:   The next six options under the Device categories and User programs sections in the context of what you have learned in this book should all be left unchecked. Do read the tips to find out more about Android Wear, Android TV, Android Auto, Designed for families, Google Play for work, and Google Play for education, however. Finally, you must check two boxes to agree with the Google consent guidelines and US export laws. Click on the Publish App button in the top-right corner of the web page and your app will soon be live on Google Play. Congratulations. Summary You can now start building Android apps. Don't run off and build the next Evernote, Runtatstic, or Angry Birds just yet. Head over to our book, Android Programming for Beginners: https://www.packtpub.com/application-development/android-programming-beginners. Here are a few more books that you can check out to learn more about Android: Android Studio Cookbook (https://www.packtpub.com/application-development/android-studio-cookbook) Learning Android Google Maps (https://www.packtpub.com/application-development/learning-android-google-maps) Android 6 Essentials (https://www.packtpub.com/application-development/android-6-essentials) Android Sensor Programming By Example (https://www.packtpub.com/application-development/android-sensor-programming-example) Resources for Article: Further resources on this subject: Saying Hello to Unity and Android[article] Android and iOS Apps Testing at a Glance[article] Testing with the Android SDK[article]

Adding a Spark to R

Packt
22 Feb 2016
3 min read
Spark is written in a language called Scala. It has interfaces for use from Java and Python, and from the recent version 1.4.0 it also supports R. This is called SparkR, which we will describe in the next section. The four classes of libraries available in Spark are SQL and DataFrames, Spark Streaming, MLlib (machine learning), and GraphX (graph algorithms). Currently, SparkR supports only SQL and DataFrames; the others are on the roadmap. Spark can be downloaded from the Apache project page at http://spark.apache.org/downloads.html. Starting from version 1.4.0, SparkR is included in Spark and no separate download is required. (For more resources related to this topic, see here.) SparkR Similar to RHadoop, SparkR is an R package that allows R users to use Spark APIs through the RDD class. For example, using SparkR, users can run jobs on Spark from RStudio. SparkR can be invoked from RStudio. To enable this, include the following lines in your .Rprofile file that R uses at startup to initialize the environment:

Sys.setenv(SPARK_HOME = "/.../spark-1.5.0-bin-hadoop2.6") # provide the correct path to the folder where Spark was downloaded
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

Once this is done, start RStudio and enter the following commands to start using SparkR:

> library(SparkR)
> sc <- sparkR.init(master="local")

As mentioned, as of the latest version, 1.5, at the time of writing, SparkR supports limited functionalities of R. This mainly includes data slicing and dicing and summary stat functions. The current version does not support the use of contributed R packages; however, this is planned for a future release. For machine learning, SparkR currently supports the glm() function. We will do an example in the next section. Linear regression using SparkR In the following example, we will illustrate how to use SparkR for machine learning.

> library(SparkR)
> sc <- sparkR.init(master="local")
> sqlContext <- sparkRSQL.init(sc)
> # Importing data
> df <- read.csv("/Users/harikoduvely/Projects/Book/Data/ENB2012_data.csv", header = T)
> # Excluding variables Y2, X6, X8 and removing records beyond 768, which contain mainly null values
> df <- df[1:768, c(1,2,3,4,5,7,9)]
> # Converting to a SparkR DataFrame
> dfsr <- createDataFrame(sqlContext, df)
> model <- glm(Y1 ~ X1 + X2 + X3 + X4 + X5 + X7, data = dfsr, family = "gaussian")
> summary(model)

Summary In this article we have seen examples of SparkR and linear regression using SparkR. For more information on Spark you can refer to: https://www.packtpub.com/big-data-and-business-intelligence/spark-python-developers https://www.packtpub.com/big-data-and-business-intelligence/spark-beginners Resources for Article: Further resources on this subject: Data Analysis Using R[article] Introducing Bayesian Inference[article] Bayesian Network Fundamentals[article]

VR Build and Run

Packt
22 Feb 2016
25 min read
Yeah well, this is cool and everything, but where's my VR? I WANT MY VR! Hold on kid, we're getting there. In this article, we are going to set up a project that can be built and run with a virtual reality head-mounted display (HMD) and then talk more in depth about how the VR hardware technology really works. We will be discussing the following topics: The spectrum of the VR device integration software Installing and building a project for your VR device The details and defining terms for how the VR technology really works (For more resources related to this topic, see here.) VR device integration software Before jumping in, let's understand the possible ways to integrate our Unity project with virtual reality devices. In general, your Unity project must include a camera object that can render stereographic views, one for each eye on the VR headset. Software for the integration of applications with the VR hardware spans a spectrum, from built-in support and device-specific interfaces to the device-independent and platform-independent ones. Unity's built-in VR support Since Unity 5.1, support for the VR headsets is built right into Unity. At the time of writing this article, there is direct support for Oculus Rift and Samsung Gear VR (which is driven by the Oculus software). Support for other devices has been announced, including Sony PlayStation Morpheus. You can use a standard camera component, like the one attached to Main Camera and the standard character asset prefabs. When your project is built with Virtual Reality Supported enabled in Player Settings, it renders stereographic camera views and runs on an HMD. The device-specific SDK If a device is not directly supported in Unity, the device manufacturer will probably publish the Unity plugin package. An advantage of using the device-specific interface is that it can directly take advantage of the features of the underlying hardware. For example, Steam Valve and Google have device-specific SDK and Unity packages for the Vive and Cardboard respectively. If you're using one of these devices, you'll probably want to use such SDK and Unity packages. (At the time of writing this article, these devices are not a part of Unity's built-in VR support.) Even Oculus, supported directly in Unity 5.1, provides SDK utilities to augment that interface (see, https://developer.oculus.com/documentation/game-engines/latest/concepts/unity-intro/). Device-specific software locks each build into the specific device. If that's a problem, you'll either need to do some clever coding, or take one of the following approaches instead. The OSVR project In January 2015, Razer Inc. led a group of industry leaders to announce the Open Source Virtual Reality (OSVR) platform (for more information on this, visit http://www.osvr.com/) with plans to develop open source hardware and software, including an SDK that works with multiple devices from multiple vendors. The open source middleware project provides device-independent SDKs (and Unity packages) so that you can write your code to a single interface without having to know which devices your users are using. With OSVR, you can build your Unity game for a specific operating system (such as Windows, Mac, and Linux) and then let the user configure the app (after they download it) for whatever hardware they're going to use. At the time of writing this article, the project is still in its early stage, is rapidly evolving, and is not ready for this article. However, I encourage you to follow its development. 
WebVR WebVR (for more information, visit http://webvr.info/) is a JavaScript API that is being built directly into major web browsers. It's like WebGL (2D and 3D graphics API for the web) with VR rendering and hardware support. Now that Unity 5 has introduced the WebGL builds, I expect WebVR to surely follow, if not in Unity then from a third-party developer. As we know, browsers run on just about any platform. So, if you target your game to WebVR, you don't even need to know the user's operating system, let alone which VR hardware they're using! That's the idea anyway. New technologies, such as the upcoming WebAssembly, which is a new binary format for the Web, will help to squeeze the best performance out of your hardware and make web-based VR viable. For WebVR libraries, check out the following: WebVR boilerplate: https://github.com/borismus/webvr-boilerplate GLAM: http://tparisi.github.io/glam/ glTF: http://gltf.gl/ MozVR (the Mozilla Firefox Nightly builds with VR): http://mozvr.com/downloads/ WebAssembly: https://github.com/WebAssembly/design/blob/master/FAQ.md 3D worlds There are a number of third-party 3D world platforms that provide multiuser social experiences in shared virtual spaces. You can chat with other players, move between rooms through portals, and even build complex interactions and games without having to be an expert. For examples of 3D virtual worlds, check out the following: VRChat: http://vrchat.net/ JanusVR: http://janusvr.com/ AltspaceVR: http://altvr.com/ High Fidelity: https://highfidelity.com/ For example, VRChat lets you develop 3D spaces and avatars in Unity, export them using their SDK, and load them into VRChat for you and others to share over the Internet in a real-time social VR experience. Creating the MeMyselfEye prefab To begin, we will create an object that will be a proxy for the user in the virtual environment. Let's create the object using the following steps: Open Unity and the project from the last article. Then, open the Diorama scene by navigating to File | Open Scene (or double-click on the scene object in Project panel, under Assets). From the main menu bar, navigate to GameObject | Create Empty. Rename the object MeMyselfEye (hey, this is VR!). Set its position up close into the scene, at Position (0, 1.4, -1.5). In the Hierarchy panel, drag the Main Camera object into MeMyselfEye so that it's a child object. With the Main Camera object selected, reset its transform values (in the Transform panel, in the upper right section, click on the gear icon and select Reset). The Game view should show that we're inside the scene. If you recall the Ethan experiment that we did earlier, I picked a Y-position of 1.4 so that we'll be at about the eye level with Ethan. Now, let's save this as a reusable prefabricated object, or prefab, in the Project panel, under Assets: In Project panel, under Assets, select the top-level Assets folder, right-click and navigate to Create | Folder. Rename the folder Prefabs. Drag the MeMyselfEye prefab into the Project panel, under Assets/Prefabs folder to create a prefab. Now, let's configure the project for your specific VR headset. Build for the Oculus Rift If you have a Rift, you've probably already downloaded Oculus Runtime, demo apps, and tons of awesome games. To develop for the Rift, you'll want to be sure that the Rift runs fine on the same machine on which you're using Unity. Unity has built-in support for the Oculus Rift. 
You just need to configure your Build Settings..., as follows: From main menu bar, navigate to File | Build Settings.... If the current scene is not listed under Scenes In Build, click on Add Current. Choose PC, Mac, & Linux Standalone from the Platform list on the left and click on Switch Platform. Choose your Target Platform OS from the Select list on the right (for example, Windows). Then, click on Player Settings... and go to the Inspector panel. Under Other Settings, check off the Virtual Reality Supported checkbox and click on Apply if the Changing editor vr device dialog box pops up. To test it out, make sure that the Rift is properly connected and turned on. Click on the game Play button at the top of the application in the center. Put on the headset, and IT SHOULD BE AWESOME! Within the Rift, you can look all around—left, right, up, down, and behind you. You can lean over and lean in. Using the keyboard, you can make Ethan walk, run, and jump just like we did earlier. Now, you can build your game as a separate executable app using the following steps. Most likely, you've done this before, at least for non-VR apps. It's pretty much the same: From the main menu bar, navigate to File | Build Settings.... Click on Build and set its name. I like to keep my builds in a subdirectory named Builds; create one if you want to. Click on Save. An executable will be created in your Builds folder. If you're on Windows, there may also be a rift_Data folder with built data. Run Diorama as you would do for any executable application—double-click on it. Choose the Windowed checkbox option so that when you're ready to quit, close the window with the standard Close icon in the upper right of your screen. Build for Google Cardboard Read this section if you are targeting Google Cardboard on Android and/or iOS. A good starting point is the Google Cardboard for Unity, Get Started guide (for more information, visit https://developers.google.com/cardboard/unity/get-started). The Android setup If you've never built for Android, you'll first need to download and install the Android SDK. Take a look at Unity manual for Android SDK Setup (http://docs.unity3d.com/Manual/android-sdksetup.html). You'll need to install the Android Developer Studio (or at least, the smaller SDK Tools) and other related tools, such as Java (JVM) and the USB drivers. It might be a good idea to first build, install, and run another Unity project without the Cardboard SDK to ensure that you have all the pieces in place. (A scene with just a cube would be fine.) Make sure that you know how to install and run it on your Android phone. The iOS setup A good starting point is Unity manual, Getting Started with iOS Development guide (http://docs.unity3d.com/Manual/iphone-GettingStarted.html). You can only perform iOS development from a Mac. You must have an Apple Developer Account approved (and paid for the standard annual membership fee) and set up. Also, you'll need to download and install a copy of the Xcode development tools (via the Apple Store). It might be a good idea to first build, install, and run another Unity project without the Cardboard SDK to ensure that you have all the pieces in place. (A scene with just a cube would be fine). Make sure that you know how to install and run it on your iPhone. Installing the Cardboard Unity package To set up our project to run on Google Cardboard, download the SDK from https://developers.google.com/cardboard/unity/download. 
Within your Unity project, import the CardboardSDKForUnity.unitypackage assets package, as follows: From the Assets main menu bar, navigate to Import Package | Custom Package.... Find and select the CardboardSDKForUnity.unitypackage file. Ensure that all the assets are checked, and click on Import. Explore the imported assets. In the Project panel, the Assets/Cardboard folder includes a bunch of useful stuff, including the CardboardMain prefab (which, in turn, contains a copy of CardboardHead, which contains the camera). There is also a set of useful scripts in the Cardboard/Scripts/ folder. Go check them out. Adding the camera Now, we'll put the Cardboard camera into MeMyselfEye, as follows: In the Project panel, find CardboardMain in the Assets/Cardboard/Prefabs folder. Drag it onto the MeMyselfEye object in the Hierarchy panel so that it's a child object. With CardboardMain selected in the Hierarchy, look at the Inspector panel and ensure that the Tap Is Trigger checkbox is checked. Select the Main Camera in the Hierarchy panel (inside MeMyselfEye) and disable it by unchecking the Enable checkbox in the upper left of its Inspector panel. Finally, apply these changes back onto the prefab, as follows: In the Hierarchy panel, select the MeMyselfEye object. Then, in its Inspector panel, next to Prefab, click on the Apply button. Save the scene. We have now replaced the default Main Camera with the VR one. The build settings If you know how to build and install from Unity to your mobile phone, doing it for Cardboard is pretty much the same: From the main menu bar, navigate to File | Build Settings.... If the current scene is not listed under Scenes In Build, click on Add Current. Choose Android or iOS from the Platform list on the left and click on Switch Platform. Then, click on Player Settings… in the Inspector panel. For Android, ensure that Other Settings | Virtual Reality Supported is unchecked, as that would be for GearVR (via the Oculus drivers), not Cardboard on Android! Under Other Settings, set Bundle Identifier to a valid string, such as com.YourName.VRisAwesome. Under Resolution and Presentation, set Default Orientation to Landscape Left. Play Mode To test it out, you do not need your phone connected. Just press the game's Play button at the top of the application in the center to enter Play Mode. You will see the split-screen stereographic views in the Game view panel. While in Play Mode, you can simulate the head movement as if you were viewing it with the Cardboard headset. Use Alt + mouse-move to pan and tilt forward or backwards. Use Ctrl + mouse-move to tilt your head from side to side. You can also simulate magnetic clicks (we'll talk more about user input in a later article) with mouse clicks. Note that since this emulates running on a phone, without a keyboard, the keyboard keys that we used to move Ethan do not work now. Building and running in Android To build your game as a separate executable app, perform the following steps: From the main menu bar, navigate to File | Build & Run. Set the name of the build. I like to keep my builds in a subdirectory named Build; you can create one if you want. Click on Save. This will generate an Android executable .apk file and then install the app onto your phone. The following screenshot shows the Diorama scene running on an Android phone with Cardboard (and the Unity development monitor in the background).
Building and running in iOS To build your game and run it on the iPhone, perform the following steps: Plug your phone into the computer via a USB cable/port. From the main menu bar, navigate to File | Build & Run. This allows you to create an Xcode project, launch Xcode, build your app inside Xcode, and then install the app onto your phone. Antique Stereograph (source https://www.pinterest.com/pin/493073859173951630/) The device-independent clicker At the time of writing this article, VR input has not yet been settled across all platforms. Input devices may or may not fit under Unity's own Input Manager and APIs. In fact, input for VR is a huge topic and deserves its own book. So here, we will keep it simple. As a tribute to the late Steve Jobs and a throwback to the origins of Apple Macintosh, I am going to limit these projects to mostly one-click inputs! Let's write a script for it, which checks for any click on the keyboard, mouse, or other managed device: In the Project panel, select the top-level Assets folder. Right-click and navigate to Create | Folder. Name it Scripts. With the Scripts folder selected, right-click and navigate to Create | C# Script. Name it Clicker. Double-click on the Clicker.cs file in the Projects panel to open it in the MonoDevelop editor. Now, edit the Script file, as follows: using UnityEngine; using System.Collections; public class Clicker { public bool clicked() { return Input.anyKeyDown; } } Save the file. If you are developing for Google Cardboard, you can add a check for the Cardboard's integrated trigger when building for mobile devices, as follows: using UnityEngine; using System.Collections; public class Clicker { public bool clicked() { #if (UNITY_ANDROID || UNITY_IPHONE) return Cardboard.SDK.CardboardTriggered; #else return Input.anyKeyDown; #endif } } Any scripts that we write that require user clicks will use this Clicker file. The idea is that we've isolated the definition of a user click to a single script, and if we change or refine it, we only need to change this file. How virtual reality really works So, with your headset on, you experienced the diorama! It appeared 3D, it felt 3D, and maybe you even had a sense of actually being there inside the synthetic scene. I suspect that this isn't the first time you've experienced VR, but now that we've done it together, let's take a few minutes to talk about how it works. The strikingly obvious thing is, VR looks and feels really cool! But why? Immersion and presence are the two words used to describe the quality of a VR experience. The Holy Grail is to increase both to the point where it seems so real, you forget you're in a virtual world. Immersion is the result of emulating the sensory inputs that your body receives (visual, auditory, motor, and so on). This can be explained technically. Presence is the visceral feeling that you get being transported there—a deep emotional or intuitive feeling. You can say that immersion is the science of VR, and presence is the art. And that, my friend, is cool. A number of different technologies and techniques come together to make the VR experience work, which can be separated into two basic areas: 3D viewing Head-pose tracking In other words, displays and sensors, like those built into today's mobile devices, are a big reason why VR is possible and affordable today. Suppose the VR system knows exactly where your head is positioned at any given moment in time. 
Suppose that it can immediately render and display the 3D scene for this precise viewpoint stereoscopically. Then, wherever and whenever you moved, you'd see the virtual scene exactly as you should. You would have a nearly perfect visual VR experience. That's basically it. Ta-dah! Well, not so fast. Literally. Stereoscopic 3D viewing Split-screen stereography was discovered not long after the invention of photography, like the popular stereograph viewer from 1876 shown in the following picture (B.W. Kilborn & Co, Littleton, New Hampshire, see http://en.wikipedia.org/wiki/Benjamin_W._Kilburn). A stereo photograph has separate views for the left and right eyes, which are slightly offset to create parallax. This fools the brain into thinking that it's a truly three-dimensional view. The device contains separate lenses for each eye, which let you easily focus on the photo close up. Similarly, rendering these side-by-side stereo views is the first job of the VR-enabled camera in Unity. Let's say that you're wearing a VR headset and you're holding your head very still so that the image looks frozen. It still appears better than a simple stereograph. Why? The old-fashioned stereograph has twin relatively small images rectangularly bound. When your eye is focused on the center of the view, the 3D effect is convincing, but you will see the boundaries of the view. Move your eyeballs around (even with the head still), and any remaining sense of immersion is totally lost. You're just an observer on the outside peering into a diorama. Now, consider what an Oculus Rift screen looks like without the headset (see the following screenshot): The first thing that you will notice is that each eye has a barrel shaped view. Why is that? The headset lens is a very wide-angle lens. So, when you look through it you have a nice wide field of view. In fact, it is so wide (and tall), it distorts the image (pincushion effect). The graphics software (SDK) does an inverse of that distortion (barrel distortion) so that it looks correct to us through the lenses. This is referred to as an ocular distortion correction. The result is an apparent field of view (FOV), that is wide enough to include a lot more of your peripheral vision. For example, the Oculus Rift DK2 has a FOV of about 100 degrees. Also of course, the view angle from each eye is slightly offset, comparable to the distance between your eyes, or the Inter Pupillary Distance (IPD). IPD is used to calculate the parallax and can vary from one person to the next. (Oculus Configuration Utility comes with a utility to measure and configure your IPD. Alternatively, you can ask your eye doctor for an accurate measurement.) It might be less obvious, but if you look closer at the VR screen, you see color separations, like you'd get from a color printer whose print head is not aligned properly. This is intentional. Light passing through a lens is refracted at different angles based on the wavelength of the light. Again, the rendering software does an inverse of the color separation so that it looks correct to us. This is referred to as a chromatic aberration correction. It helps make the image look really crisp. Resolution of the screen is also important to get a convincing view. If it's too low-res, you'll see the pixels, or what some refer to as a screen door effect. The pixel width and height of the display is an oft-quoted specification when comparing the HMD's, but the pixels per inch (ppi) value may be more important. 
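To put some rough numbers on the ideas above, here is a small illustrative C# sketch. It is not part of the project's scripts, and the IPD and distortion coefficients in it are assumed example values rather than real headset specifications; it simply shows the per-eye parallax offset and the polynomial form commonly used to model barrel distortion.

using UnityEngine;

// Illustrative only: the per-eye offset and radial distortion math that
// the VR SDK applies for us. The coefficient values are made up.
public class StereoMathSketch : MonoBehaviour
{
    public float ipd = 0.064f;   // assumed Inter Pupillary Distance in meters (~64 mm)
    public float k1 = 0.22f;     // assumed radial distortion coefficients
    public float k2 = 0.24f;

    void Start()
    {
        // Each eye camera sits half the IPD to either side of the head center,
        // which is what produces the parallax between the two rendered views.
        Vector3 leftEyeOffset = new Vector3(-ipd / 2f, 0f, 0f);
        Vector3 rightEyeOffset = new Vector3(ipd / 2f, 0f, 0f);
        Debug.Log("Eye offsets: " + leftEyeOffset + " / " + rightEyeOffset);

        // Barrel distortion pushes a point outward more the farther it is from
        // the lens center, using a polynomial in r squared.
        Vector2 uv = new Vector2(0.3f, 0.4f);   // a point in lens-centered coordinates
        float r2 = uv.sqrMagnitude;
        Vector2 distorted = uv * (1f + k1 * r2 + k2 * r2 * r2);
        Debug.Log("Distorted point: " + distorted);
    }
}

The SDK applies this kind of warp as the inverse of the lens's pincushion distortion, and it applies the chromatic aberration correction per color channel, so you never have to write this yourself.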
Other innovations in display technology such as pixel smearing and foveated rendering (showing a higher-resolution detail exactly where the eyeball is looking) will also help reduce the screen door effect. When experiencing a 3D scene in VR, you must also consider the frames per second (FPS). If FPS is too slow, the animation will look choppy. Things that affect FPS include the graphics processor (GPU) performance and complexity of the Unity scene (number of polygons and lighting calculations), among other factors. This is compounded in VR because you need to draw the scene twice, once for each eye. Technology innovations, such as GPUs optimized for VR, frame interpolation and other techniques, will improve the frame rates. For us developers, performance-tuning techniques in Unity, such as those used by mobile game developers, can be applied in VR. These techniques and optics help make the 3D scene appear realistic. Sound is also very important—more important than many people realize. VR should be experienced while wearing stereo headphones. In fact, when the audio is done well but the graphics are pretty crappy, you can still have a great experience. We see this a lot in TV and cinema. The same holds true in VR. Binaural audio gives each ear its own stereo view of a sound source in such a way that your brain imagines its location in 3D space. No special listening devices are needed. Regular headphones will work (speakers will not). For example, put on your headphones and visit the Virtual Barber Shop at https://www.youtube.com/watch?v=IUDTlvagjJA. True 3D audio, such as VisiSonics (licensed by Oculus), provides an even more realistic spatial audio rendering, where sounds bounce off nearby walls and can be occluded by obstacles in the scene to enhance the first-person experience and realism. Lastly, the VR headset should fit your head and face comfortably so that it's easy to forget that you're wearing it and should block out light from the real environment around you. Head tracking So, we have a nice 3D picture that is viewable in a comfortable VR headset with a wide field of view. If this was it and you moved your head, it'd feel like you have a diorama box stuck to your face. Move your head and the box moves along with it, and this is much like holding the antique stereograph device or the childhood View Master. Fortunately, VR is so much better. The VR headset has a motion sensor (IMU) inside that detects spatial acceleration and rotation rate on all three axes, providing what's called the six degrees of freedom. This is the same technology that is commonly found in mobile phones and some console game controllers. Mounted on your headset, when you move your head, the current viewpoint is calculated and used when the next frame's image is drawn. This is referred to as motion detection. Current motion sensors may be good if you wish to play mobile games on a phone, but for VR, it's not accurate enough. These inaccuracies (rounding errors) accumulate over time, as the sensor is sampled thousands of times per second, one may eventually lose track of where you are in the real world. This drift is a major shortfall of phone-based VR headsets such as Google Cardboard. It can sense your head motion, but it loses track of your head position. High-end HMDs account for drift with a separate positional tracking mechanism. 
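To see why those rounding errors matter, here is a toy C# simulation I've added for illustration (the bias and sample-rate figures are invented): a tiny constant error in the measured rotation rate, integrated thousands of times per second, grows into a very visible heading error.

using System;

// Toy gyro-drift simulation: naive integration of a slightly biased
// rotation rate accumulates error without bound.
class DriftSketch
{
    static void Main()
    {
        double trueRate = 0.0;       // the head is actually holding still (degrees/second)
        double bias = 0.01;          // assumed sensor bias of 0.01 degrees/second
        double dt = 1.0 / 1000.0;    // 1000 samples per second
        double estimatedYaw = 0.0;

        // Integrate ten minutes' worth of samples.
        for (int i = 0; i < 1000 * 600; i++)
        {
            double measuredRate = trueRate + bias;
            estimatedYaw += measuredRate * dt;
        }

        // 0.01 deg/s * 600 s = 6 degrees of drift, despite a motionless head.
        Console.WriteLine("Accumulated yaw error: " + estimatedYaw + " degrees");
    }
}

That slow, unbounded wander is exactly the drift described above, and it is why the positional tracking systems discussed next rely on an external reference rather than the IMU alone.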
The Oculus Rift provides that separate positional tracking with an outside-in approach, where an array of (invisible) infrared LEDs on the HMD is read by an external optical sensor (an infrared camera) to determine your position. You need to remain within the view of the camera for the head tracking to work. The SteamVR Vive Lighthouse technology reverses the arrangement: two or more simple laser-emitting base stations are placed in the room (much like the lasers in a barcode reader at the grocery checkout), and optical sensors on the headset read the sweeping rays to determine your position. Either way, the primary purpose is to accurately find the position of your head (and other similarly equipped devices, such as handheld controllers). Together, the position, tilt, and the forward direction of your head—or the head pose—is used by the graphics software to redraw the 3D scene from this vantage point. Graphics engines such as Unity are really good at this. Now, let's say that the screen is getting updated at 90 FPS, and you're moving your head. The software determines the head pose, renders the 3D view, and draws it on the HMD screen. However, you're still moving your head. So, by the time it's displayed, the image is a little out of date with respect to your then current position. This is called latency, and it can make you feel nauseous. Motion sickness caused by latency in VR occurs when you're moving your head and your brain expects the world around you to change exactly in sync. Any perceptible delay can make you uncomfortable, to say the least. Latency can be measured as the time from reading a motion sensor to rendering the corresponding image, or the sensor-to-pixel delay. According to Oculus' John Carmack: "A total latency of 50 milliseconds will feel responsive, but still noticeably laggy. 20 milliseconds or less will provide the minimum level of latency deemed acceptable." There are a number of very clever strategies that can be used to implement latency compensation. The details are outside the scope of this article and inevitably will change as device manufacturers improve on the technology. One of these strategies is what Oculus calls timewarp, which tries to guess where your head will be by the time the rendering is done, and uses that future head pose instead of the actual, detected one. All of this is handled in the SDK, so as a Unity developer, you do not have to deal with it directly. Meanwhile, as VR developers, we need to be aware of latency as well as the other causes of motion sickness. Latency can be reduced by rendering each frame faster (keeping to the recommended FPS); its effects can also be softened by discouraging the user from moving their head too quickly and by using other techniques to make them feel grounded and comfortable. Another thing that the Rift does to improve head tracking and realism is to use a skeletal representation of the neck so that the rotations it receives are mapped more accurately to the head rotation. For example, looking down at your lap produces a small forward translation of the view, since it's impossible to rotate one's head straight downward on the spot. Other than head tracking, stereography, and 3D audio, virtual reality experiences can be enhanced with body tracking, hand tracking (and gesture recognition), locomotion tracking (for example, VR treadmills), and controllers with haptic feedback. The goal of all of this is to increase your sense of immersion and presence in the virtual world.
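The prediction idea behind timewarp can be sketched in a few lines of C#. This is a deliberate simplification I'm adding for illustration, not Oculus' actual implementation: extrapolate the latest pose forward by the expected sensor-to-pixel latency and render from there.

using UnityEngine;

// Simplified head-pose prediction: rotate the current orientation forward
// by (angular velocity * expected latency). Real SDKs do this, and more,
// internally.
public static class PosePredictionSketch
{
    // latencySeconds is the assumed sensor-to-pixel delay, e.g. 0.02f for 20 ms.
    public static Quaternion PredictPose(Quaternion current,
                                         Vector3 angularVelocityDegPerSec,
                                         float latencySeconds)
    {
        Vector3 deltaAngles = angularVelocityDegPerSec * latencySeconds;
        return current * Quaternion.Euler(deltaAngles);
    }
}

At 20 milliseconds of latency and a 200-degrees-per-second head turn, the correction is 4 degrees, which is easily noticeable if it is skipped.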
Summary In this article, we discussed the different levels of device integration software and then installed the software that is appropriate for your target VR device. We also discussed what happens inside the hardware and software SDK that makes virtual reality work and how it matters to us VR developers. For more information on VR development and Unity refer to the following Packt books: Unity UI Cookbook, by Francesco Sapio: https://www.packtpub.com/game-development/unity-ui-cookbook Building a Game with Unity and Blender, by Lee Zhi Eng: https://www.packtpub.com/game-development/building-game-unity-and-blender Augmented Reality with Kinect, by Rui Wang: https://www.packtpub.com/application-development/augmented-reality-kinect Resources for Article: Further resources on this subject: Virtually Everything for Everyone [article] Unity Networking – The Pong Game [article] Getting Started with Mudbox 2013 [article]
A cross-platform solution with Xamarin.Forms and MVVM architecture

Packt
22 Feb 2016
9 min read
In this article by George Taskos, the author of the book, Xamarin Cross Platform Development Cookbook, we will discuss a cross-platform solution with Xamarin.Forms and MVVM architecture. Creating a cross-platform solution correctly requires a lot of things to be taken into consideration. In this article, we will quickly provide you with a starter MVVM architecture showing data retrieved over the network in a ListView control. (For more resources related to this topic, see here.) How to do it... In Xamarin Studio, click on File | New | Xamarin.Forms App. Provide the name XamFormsMVVM. Add the NuGet dependencies by right-clicking on each project in the solution and choosing Add | Add NuGet Packages…. Search for the packages XLabs.Forms and modernhttpclient, and install them. Repeat step 2 for the XamFormsMVVM portable class library and add the packages Microsoft.Net.Http and Newtonsoft.Json. In the XamFormsMVVM portable class library, create the following folders: Models, ViewModels, and Views. To create a folder, right-click on the project and select Add | New Folder. Right-click on the Models folder and select Add | New File…, choose the General | Empty Interface template, name it IDataService, and click on New, and add the following code: public interface IDataService { Task<IEnumerable<OrderModel>> GetOrdersAsync (); } Right-click on the Models folder again and select Add | New File…, choose the General | Empty Class template, name it DataService, and click on New, and add the following code: [assembly: Xamarin.Forms.Dependency (typeof (DataService))] namespace XamFormsMVVM { public class DataService : IDataService { protected const string BaseUrlAddress = @"https://api.parse.com/1/classes"; protected virtual HttpClient GetHttpClient() { HttpClient httpClient = new HttpClient(new NativeMessageHandler()); httpClient.BaseAddress = new Uri(BaseUrlAddress); httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue ("application/json")); return httpClient; } public async Task<IEnumerable<OrderModel>> GetOrdersAsync () { using (HttpClient client = GetHttpClient ()) { HttpRequestMessage requestMessage = new HttpRequestMessage(HttpMethod.Get, client.BaseAddress + "/Order"); requestMessage.Headers.Add("X-Parse-Application-Id", "fwpMhK1Ot1hM9ZA4iVRj49VFzDePwILBPjY7wVFy"); requestMessage.Headers.Add("X-Parse-REST-API-Key", "egeLQVTC7IsQJGd8GtRj3ttJVRECIZgFgR2uvmsr"); HttpResponseMessage response = await client.SendAsync(requestMessage); response.EnsureSuccessStatusCode (); string ordersJson = await response.Content.ReadAsStringAsync(); JObject jsonObj = JObject.Parse (ordersJson); JArray ordersResults = (JArray)jsonObj ["results"]; return JsonConvert.DeserializeObject <List<OrderModel>> (ordersResults.ToString ()); } } } } Right-click on the Models folder and select Add | New File…, choose the General | Empty Interface template, name it IDataRepository, and click on New, and add the following code: public interface IDataRepository { Task<IEnumerable<OrderViewModel>> GetOrdersAsync (); } Right-click on the Models folder and select Add | New File…, choose the General | Empty Class template, name it DataRepository, and click on New, and add the following code in that file: [assembly: Xamarin.Forms.Dependency (typeof (DataRepository))] namespace XamFormsMVVM { public class DataRepository : IDataRepository { private IDataService DataService { get; set; } public DataRepository () : this(DependencyService.Get<IDataService> ()) { } public DataRepository (IDataService dataService)
{ DataService = dataService; } public async Task<IEnumerable<OrderViewModel>> GetOrdersAsync () { IEnumerable<OrderModel> orders = await DataService.GetOrdersAsync ().ConfigureAwait (false); return orders.Select (o => new OrderViewModel (o)); } } } In the ViewModels folder, right-click on Add | New File… and name it OrderViewModel. Add the following code in that file: public class OrderViewModel : XLabs.Forms.Mvvm.ViewModel { string _orderNumber; public string OrderNumber { get { return _orderNumber; } set { SetProperty (ref _orderNumber, value); } } public OrderViewModel (OrderModel order) { OrderNumber = order.OrderNumber; } public override string ToString () { return string.Format ("[{0}]", OrderNumber); } } Repeat step 5 and create a class named OrderListViewModel.cs: public class OrderListViewModel : XLabs.Forms.Mvvm.ViewModel { protected IDataRepository DataRepository { get; set; } ObservableCollection<OrderViewModel> _orders; public ObservableCollection<OrderViewModel> Orders { get { return _orders; } set { SetProperty (ref _orders, value); } } public OrderListViewModel () : this(DependencyService.Get<IDataRepository> ()) { } public OrderListViewModel (IDataRepository dataRepository) { DataRepository = dataRepository; DataRepository.GetOrdersAsync ().ContinueWith (antecedent => { if (antecedent.Status == TaskStatus.RanToCompletion) { Orders = new ObservableCollection<OrderViewModel> (antecedent.Result); } }, TaskScheduler.FromCurrentSynchronizationContext ()); } } Right-click on the Views folder and choose Add | New File…, select the Forms | Forms Content Page Xaml, name it OrderListView, and click on New: <?xml version="1.0" encoding="UTF-8"?> <ContentPage xmlns="http://xamarin.com/schemas/2014/forms" xmlns:x="http://schemas.microsoft.com/winfx/2009/xaml" x:Class="XamFormsMVVM.OrderListView" Title="Orders"> <ContentPage.Content> <ListView ItemsSource="{Binding Orders}"/> </ContentPage.Content> </ContentPage> Go to XamFormsMVVM.cs and replace the contents with the following code: public App() { if (!Resolver.IsSet) { SetIoc (); } RegisterViews(); MainPage = new NavigationPage((Page)ViewFactory.CreatePage<OrderListViewModel, OrderListView>()); } private void SetIoc() { var resolverContainer = new SimpleContainer(); Resolver.SetResolver (resolverContainer.GetResolver()); } private void RegisterViews() { ViewFactory.Register<OrderListView, OrderListViewModel>(); } Run the application, and you will get results like the following screenshots: For Android: For iOS: How it works… A cross-platform solution should share as much logic and common operations as possible, such as retrieving and/or updating data in a local database or over the network, having your logic centralized, and coordinating components. With Xamarin.Forms, you even have a cross-platform UI, but this shouldn't stop you from separating the concerns correctly; the more abstracted you are from the user interface and programming against interfaces, the easier it is to adapt to changes and remove or add components. Starting with the models, we create a DataService implementation class with its equivalent interface, IDataService; it retrieves raw JSON data over the network from the Parse API and converts it to a list of OrderModel objects, which are POCO classes with just one property. Every time you invoke the GetOrdersAsync method, you get the same 100 orders from the server. Notice how we used the Dependency attribute declaration above the namespace to instruct DependencyService that we want to register this implementation class for the interface.
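One piece the recipe relies on but never lists is the OrderModel POCO that DataService deserializes the JSON into. Judging from OrderViewModel above, it only needs an OrderNumber property, so a minimal version could look like the following sketch; the JSON attribute and its field name are my assumptions, not code from the book.

using Newtonsoft.Json;

namespace XamFormsMVVM
{
    // Plain data-transfer object deserialized from the Parse "Order" class.
    public class OrderModel
    {
        // Assumed JSON field name; adjust it to match the actual payload.
        [JsonProperty("orderNumber")]
        public string OrderNumber { get; set; }
    }
}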
We took a step to improve the performance of the REST client API; although we do use the HTTPClient package, we pass a delegate handler, NativeMessageHandler, when constructing in the GetClient() method. This handler is part of the modernhttpclient NuGet package and it manages undercover to use a native REST API for each platform: NSURLSession in iOS and OkHttp in Android. The IDataService interface is used by the DataRepository implementation, which acts as a simple intermediate repository layer converting the POCO OrderModel received from the server in OrderViewModel instances. Any model that is meant to be used on a view is a ViewModel, the view's model, and also, when retrieving and updating data, you don't carry business logic. Only data logic that is known should be included as data transfer objects. Dependencies, such as in our case, where we have a dependency of IDataService for the DataRepository to work, should be clear to classes that will use the component, which is why we create a default empty constructor required from the XLabs ViewFactory class, but in reality, we always invoke the constructor that accepts an IDataService instance; this way, when we unit test this unit, we can pass our mock IDataService class and test the functionality of the methods. We are using the DependencyService class to register the implementation to its equivalent IDataRepository interface here as well. OrderViewModel inherits XLabs.Forms.ViewModel; it is a simple ViewModel class with one property raising property change notifications and accepting an OrderModel instance as a dependency in the default constructor. We override the ToString() method too for a default string representation of the object, which simplifies the ListView control without requiring us, in our example, to use a custom cell with DataTemplate. The second ViewModel in our architecture is the OrderListViewModel, which inherits XLabs.Forms.ViewModel too and has a dependency of IDataRepository, following the same pattern with a default constructor and a constructor with the dependency argument. This ViewModel is responsible for retrieving a list of OrderViewModel and holding it to an ObservableCollection<OrderViewModel> instance that raises collection change notifications. In the constructor, we invoke the GetOrdersAsync() method and register an action delegate handler to be invoked on the main thread when the task has finished passing the orders received in a new ObservableCollection<OrderViewModel> instance set to the Orders property. The view of this recipe is super simple: in XAML, we set the title property which is used in the navigation bar for each platform and we leverage the built-in data-binding mechanism of Xamarin.Forms to bind the Orders property in the ListView ItemsSource property. This is how we abstract the ViewModel from the view. But we need to provide a BindingContext class to the view while still not coupling the ViewModel to the view, and Xamarin Forms Labs is a great framework for filling the gap. XLabs has a ViewFactory class; with this API, we can register the mapping between a view and a ViewModel, and the framework will take care of injecting our ViewModel into the BindingContext class of the view. When a page is required in our application, we use the ViewFactory.CreatePage class, which will construct and provide us with the desired instance. 
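As noted above, the constructor overloads that accept an interface exist so that the data layer can be unit tested with a fake dependency. A rough sketch of such a test follows; the FakeDataService class, the NUnit attributes, and the sample order number are mine, not part of the recipe.

using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using NUnit.Framework;

namespace XamFormsMVVM.Tests
{
    // Hand-rolled fake so DataRepository can be exercised without the network.
    class FakeDataService : IDataService
    {
        public Task<IEnumerable<OrderModel>> GetOrdersAsync()
        {
            IEnumerable<OrderModel> orders = new List<OrderModel>
            {
                new OrderModel { OrderNumber = "A-001" }
            };
            return Task.FromResult(orders);
        }
    }

    [TestFixture]
    public class DataRepositoryTests
    {
        [Test]
        public async Task GetOrdersAsync_WrapsModelsInViewModels()
        {
            var repository = new DataRepository(new FakeDataService());

            var viewModels = await repository.GetOrdersAsync();

            Assert.That(viewModels.Single().OrderNumber, Is.EqualTo("A-001"));
        }
    }
}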
Xamarin Forms Labs uses a dependency resolver internally; this has to be set up early in the application startup entry point, so it is handled in the App.cs constructor. Run the iOS application in the simulator or device and in your preferred Android emulator or device; the result is the same with the equivalent native themes for each platform. Summary Xamarin.Forms is a great cross-platform UI framework that you can use to describe your user interface code declaratives in XAML, and it will be translated into the equivalent native views and pages with the ability of customizing each native application layer. Xamarin.Forms and MVVM are made for each other; the pattern fits naturally into the design of native cross-platform mobile applications and abstracts the view from the data easy using the built-in data-binding mechanism. Resources for Article: Further resources on this subject: Code Sharing Between iOS and Android [Article] Working with Xamarin.Android [Article] Sharing with MvvmCross [Article]
Working with Commands and Plugins

Packt
22 Feb 2016
26 min read
In this article written by Tom Ryder, author of the book Nagios Core Administration Cookbook, Second Edition, we will cover the following topics: Installing a plugin Removing a plugin Creating a new command Customizing an existing command (For more resources related to this topic, see here.) Introduction Nagios Core is perhaps best thought of as a monitoring framework and less as a monitoring tool.Its modular design allows any kind of program that returns appropriate values based on some kind of check as a check_command option for a host or service. This is where the concepts of commands and pluginscome into play. For Nagios Core, a plugin is any program that can be used to gather information about a host or service. To ensure that a host is responding to ping requests, we'd use a plugin, such as check_ping,which when run against a hostname or address—whether by Nagios Core or not—returns a status code to whatever called it, based on whether a response was received to the pingrequest within a certain period of time. This status code and any accompanying message is what Nagios Core uses to establish the state that a host or service is in. Plugins are generally just like any other program on a Unix-like system; they can be run from the command line, are subject to permissions and owner restrictions, can be written in any language, can read variables from their environment, and can take parameters and options to modify how they work. Most importantly, they are entirely separate from Nagios Core itself (even if programmed by the same people), and the way that they're used by the application can be changed. To allow for additional flexibility in how plugins are used, Nagios Core uses these programs according to the terms of a command definition. A command for a specific plugin defines the way in which that plugin is used, including its location in the filesystem, any parameters that should be passed to it, and any other options. In particular, parameters and options often include thresholds for the WARNINGand CRITICAL states. Nagios Core is usually downloaded and installed alongside a set of plugins called Nagios Plugins, available at https://nagios-plugins.org/, which this article assumes you have installed. These plugins were chosen because they cover the most common needs for a monitoring infrastructure quite well as a set, including checks for common services, such as web services, mail services, DNS services, and others as well as more generic checks, such as whether a TCP or UDP port is accessible and open on a server. It's possible that for most, if not all, of our monitoring needs, we won't need any other plugins—but if we do, Nagios Core makes it possible to use existing plugins in novel ways using custom command definitions, adding third-party plugins written by contributors on the Nagios Exchange website or even writing custom plugins ourselves from scratch in some special cases. Installing a plugin In this recipe, we'll install a custom plugin that we retrieved from Nagios Exchange onto a Nagios Core server so that we can use it in a Nagios Core command, and hence check a service with it. Getting ready You should have a Nagios Core 4.0 or newer server running with a few hosts and services configured already, and you should have found an appropriate plugin to install to solve some particular monitoring needs. Your Nagios Core server should have Internet connectivity to allow you to download the plugin directly from the website. 
In this example, we'll use check_rsync, which is available on the Web at https://exchange.nagios.org/directory/Plugins/Network-Protocols/Rsync/check_rsync/details. This particular plugin is quite simple, consisting of a single Perl script with only very basic dependencies. If you want to install this script as an example, the server will also need to have a Perl interpreter installed, for example, in /usr/bin/perl. This example will also include directly testing a server running an rsync(1) daemon called troy.example.net. How to do it... We can download and install a new plugin using the following steps: Copy the URL for the download link for the most recent version of the check_rsync plugin. Navigate to the plugins directory for the Nagios Core server. The default location is /usr/local/nagios/libexec: # cd /usr/local/nagios/libexec Download the plugin using the wget command into a file called check_rsync. It's important to enclose the URL in quotes: # wget 'https://exchange.nagios.org/components/com_mtree/attachment.php?link_id=307&cf_id=29' -O check_rsync Make the plugin executable using the chmod(1) and chown(1) commands: # chown root.nagios check_rsync # chmod 0770 check_rsync Run the plugin directly with no arguments to check that it runs and to get usage instructions. It's a good idea to test it as the nagios user using the su(8) or sudo(8) commands: # sudo -s -u nagios $ ./check_rsync Usage: check_rsync -H <host> [-p <port>] [-m <module>[,<user>,<password>] [-m <module>[,<user>,<password>]...]] Try running the plugin directly against a host running rsync(1) to check whether it works and reports a status: $ ./check_rsync -H troy.example.net The output normally starts with the status determined, with any extra information after a colon: OK: Rsync is up If all of this works, then the plugin is now installed and working correctly. How it works... Because Nagios Core plugins are programs in themselves, all that installing a plugin really amounts to is saving a program or script into an appropriate directory, in this case, /usr/local/nagios/libexec, where all the other plugins live. It's then available to be used the same way as any other plugin. The next step once the plugin is working is defining a command for it in the Nagios Core configuration so that it can be used to monitor hosts and/or services. This can be done with the Creating a new command recipe in this article. There's more... If we inspect the Perl script, we can see a little bit of how it works. It works like any other Perl script except perhaps for the fact that its return values are defined in a hash called %ERRORS, and the return values it chooses depend on what happens when it tries to check the rsync(1) process. This is the most important part of implementing a plugin for Nagios Core. Installation procedures for different plugins vary. In particular, many plugins are written in languages like C, and hence, they need to be compiled. One such plugin is the popular check_nrpe plugin. Rather than simply being saved into a directory and made executable, these sorts of plugins often follow the usual pattern of configuration, compilation, and installation: $ ./configure $ make # make install For many plugins that are built in this style, the final step of make install will often install the compiled plugin into the appropriate directory for us. In general, if instructions are included with the plugin, it pays to read them to see how best to install it.
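Under the hood, the %ERRORS hash simply maps status names to the exit codes that every Nagios Core plugin is expected to use: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN, with a single line of status text on standard output. A bare-bones shell sketch of that convention (a throwaway example of mine, not a plugin you would actually deploy) looks like this:

#!/bin/sh
# check_example: the smallest useful illustration of the plugin protocol.
# It expects a numeric value as its first argument, prints one line of
# status text, and exits with the matching status code.

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

if [ -z "$1" ]; then
    echo "EXAMPLE UNKNOWN: no value supplied"
    exit $STATE_UNKNOWN
fi

if [ "$1" -ge 90 ]; then
    echo "EXAMPLE CRITICAL: value is $1"
    exit $STATE_CRITICAL
elif [ "$1" -ge 75 ]; then
    echo "EXAMPLE WARNING: value is $1"
    exit $STATE_WARNING
fi

echo "EXAMPLE OK: value is $1"
exit $STATE_OK

Whatever language a plugin is written in, Nagios Core only ever sees that exit code and that line of output, which is why installing a plugin amounts to little more than dropping an executable into /usr/local/nagios/libexec.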
See also The Removing a plugin recipe The Creating a new command recipe Removing a plugin In this recipe, we'll remove a plugin that we no longer need as part of our Nagios Core installation. Perhaps it's not working correctly, the service it monitors is no longer available, or there are security or licensing concerns with its usage. Getting ready You should have a Nagios Core 4.0 or newer server running with a few hosts and services configured already and have a plugin that you would like to remove from the server. In this instance, we'll remove the now unneeded check_rsync plugin from our Nagios Core server. How to do it... We can remove a plugin from our Nagios Core instance using the following steps: Remove any part of the configuration that uses the plugin, including the hosts or services that use it for check_command and command definitions that refer to the program. As an example, the following definition for a command would no longer work after we remove the check_rsync plugin: define command { command_name check_rsync command_line $USER1$/check_rsync -H $HOSTADDRESS$ } Using a tool, such as grep(1), can be a good way to find mentions of the command and plugin: # grep -R check_rsync /usr/local/nagios/etc Change the directory on the Nagios Core server to wherever the plugins are kept. The default location is /usr/local/nagios/libexec: # cd /usr/local/nagios/libexec Delete the plugin with the rm(1) command: # rm check_rsync Validate the configuration and restart the Nagios Core server: # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg # /etc/init.d/nagios reload How it works... Nagios Core plugins are simply external programs that the server uses to perform checks of hosts and services. If a plugin is no longer needed, all that we need to do is remove references to it in our configuration, if any, and delete it from /usr/local/nagios/libexec. There's more... There's not usually any harm in leaving the plugin's program on the server even if Nagios Core isn't using it. It doesn't slow anything down or cause any other problems, and it may be needed later. Nagios Core plugins are generally quite small programs and should not really cause disk space concerns on a modern server. See also The Installing a plugin recipe The Creating a new command recipe Creating a new command In this recipe, we'll create a new command for a plugin that was just installed into the /usr/local/nagios/libexecdirectory in the Nagios Core server. This will define the way in which Nagios Core should use the plugin, and thereby allow it to be used as part of a service definition. Getting ready You should have a Nagios Core 4.0 or newer server running with a few hosts and services configured already and have a plugin installed for which you'd like to define a new command so that you can use it as part of a service definition. In this instance, we'll define a command for an installed check_rsyncplugin. How to do it... We can define a new command in our configuration as follows: Change to the directory containing the objects configuration for Nagios Core. 
The default location is /usr/local/nagios/etc/objects: # cd /usr/local/nagios/etc/objects Edit the commands.cfg file: # vi commands.cfg At the bottom of the file, add the following command definition: define command {     command_name  check_rsync     command_line  $USER1$/check_rsync -H $HOSTADDRESS$ } Validate the configuration and restart the Nagios Core server: # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg # /etc/init.d/nagios reload If the validation passed and the server restarted successfully, we should now be able to use the check_rsync command in a service definition. How it works... The configuration we added to the commands.cfgfile in the preceding steps defines a new command called check_rsync,which specifies a method for using the plugin of the same name to monitor a service. This enables us to use check_rsyncas a value for the check_commanddirective in a service declaration, which might look like this: define service {     use                  generic-service     host_name            troy.example.net     service_description  RSYNC     check_command        check_rsync } Only two directives are required for command definitions, and we've defined both: command_name: This defines the unique name with which we can reference the command when we use it in host or service definitions command_line: This defines the command line that should be executed by Nagios Core to make the appropriate check This particular command line also uses the following two macros: $USER1$: This expands to /usr/local/nagios/libexec, the location of the plugin binaries, including check_rsync. This is defined in the sample configuration in the /usr/local/nagios/etc/resource.cfg file. $HOSTADDRESS$: This expands to the address of any host for which this command is used as a host or service definition. So, if we used the command in a service, checking the rsync(1) server on troy.example.net, the completed command might look like this: $ /usr/local/nagios/libexec/check_rsync -H troy.example.net We can run this straight from the command line ourselves as the nagios userto see what kind of results it returns: $ /usr/local/nagios/libexec/check_rsync -H troy.example.net OK: Rsync is up There's more... A plugin can be used for more than one command. If we had a particular rsync(1) module, which we wanted to check named backup, we could write another command called check_rsync_backupas follows: define command {     command_name  check_rsync_backup     command_line  $USER1$/check_rsync -H $HOSTADDRESS$ -m backup } Alternatively, if one or more of our rsync(1) servers were running on an alternate port, say, port 5873, we could define a separate command check_rsync_altport for that: define command {     command_name  check_rsync_altport     command_line  $USER1$/check_rsync -H $HOSTADDRESS$ -p 5873 } Commands can thus be defined as precisely as we need them to be. We explore this in more detail in the Customizing an existing commandrecipe in this article. See also The Installing a plugin recipe The Customizing an existing command recipe Customizing an existing command In this recipe, we'll customize an existing command definition. There are a number of reasons why you might want to do this, but a common one is if a check is overzealous, sending notifications for the WARNING orCRITICALstates, which aren't actually terribly worrisome, or on the other hand, if a check is too "forgiving" and doesn't flag hosts or services as having problems when it would actually be appropriate to do so. 
Another reason is to account for peculiarities in your own network. For example, if you run HTTP daemons on a large number of hosts in your network on the alternative port 8080 that you need to check, it would be convenient to have a check_http_altport command available. We can do this by copying and altering the definition for the vanilla check_http command. Getting ready You should have a Nagios Core 4.0 or newer server running with a few hosts and services configured already. You should also already be familiar with the relationship between services, commands, and plugins. How to do it... We can customize an existing command definition as follows: Change to the directory containing the objects configuration for Nagios Core. The default location is /usr/local/nagios/etc/objects: # cd /usr/local/nagios/etc/objects Edit the commands.cfg or whichever file is an appropriate location for the check_http command: # vi commands.cfg Find the definition for the check_http command. In a default Nagios Core configuration, it should look something like this: # 'check_http' command definition define command {     command_name  check_http     command_line  $USER1$/check_http -I $HOSTADDRESS$ $ARG1$ } Copy this definition into a new definition directly under it and alter it to look like the following, renaming the command and adding a new option to its command line: # 'check_http_altport' command definition define command {     command_name  check_http_altport     command_line  $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$ } Validate the configuration and restart the Nagios Core server: # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg # /etc/init.d/nagios reload If the validation passed and the server restarted successfully, we should now be able to use the check_http_altport command, which is based on the original check_http command, in a service definition. How it works... The configuration we added to the commands.cfg file in the preceding steps reproduces the command definition for check_http, but changes it in two ways: It renames the command from check_http to check_http_altport, which is necessary to distinguish the commands from one another. Command names in Nagios Core, like host names, must be unique. It adds the -p 8080 option to the command line call, specifying that when the call to check_http is made, the check will be made using TCP port 8080 rather than the default of TCP port 80. The check_http_altport command can now be used as a check command in the same way a check_http command can be used. For example, a service definition that checks whether the sparta.example.net host is running an HTTP daemon on port 8080 might look something like this: define service {     use                  generic-service     host_name            sparta.example.net     service_description  HTTP_8080     check_command        check_http_altport } There's more... This recipe's title implies that we should customize the existing commands by editing them in-place, and indeed, this works fine if we really do want to do things this way. Instead of copying the command definition, we can just add -p 8080 or any other customization to the command line and change the original command. However, this is bad practice in most cases, mostly because it can break existing monitoring and can be potentially confusing to other administrators of the Nagios Core server.
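Note the trailing $ARG1$ macro that both definitions carry over from the stock command: it lets individual service definitions pass extra options to the plugin at check time, separated from the command name by a ! character. For instance, a one-off check of a specific URI could reuse the ordinary check_http command without any new definition at all (the /status URI here is just an example of mine):

define service {
    use                  generic-service
    host_name            sparta.example.net
    service_description  HTTP_STATUS_PAGE
    check_command        check_http!-u /status
}

Here, $ARG1$ expands to -u /status, so the executed command line becomes $USER1$/check_http -I sparta.example.net -u /status. For a variation that many services share, such as the alternative port above, a dedicated command is still the clearer approach.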
If we have a special case for monitoring, in this case, checking a nonstandard port for HTTP, then it's wise to create a whole new command based on the existing one with the customisations we need. Particularly if you share monitoring configuration duties with someone else on your team, changing the command can break the monitoring for anyone who had set up the services using the check_http command beforeyou changed it, meaning that their checks would all start failing because port 8080 would be checked instead. There is no limit to the number of commands you can define, so you can be very liberal in defining as many alternative commands as you need. It's a good idea to give them instructive names that say something about what they do as well as to add explanatory comments to the configuration file. You can add a comment to the file by prefixing it with a # character: # # 'check_http_altport' command_definition. This is to keep track of the # servers that have administrative panels running on an alternative port # to confer special privileges to a separate instance of Apache HTTPD # that we don't want to confer to the one for running public-facing # websites. # define command {     command_name  check_http_altport     command_line  $USER1$/check_http -H $HOSTADDRESS$ -p 8080 $ARG1$ } See also The Creating a new command recipe Writing a new plugin from scratch Even given the very useful standard plugins in the Nagios Plugins set and the large number of custom plugins available on Nagios Exchange, occasionally, as our monitoring setup becomes more refined, we may well find that there is some service or property of a host that we would like to check, but for which there doesn't seem to be any suitable plugin available. Every network is different, and sometimes, the plugins that others have generously donated their time to make for the community don't quite cover all your bases. Generally, the more specific your monitoring requirements get, the less likely it is for there to be a plugin available that does exactly what you need. In this example, we'll deal with a very particular problem that we'll assume can't be dealt with effectively by any known Nagios Core plugins, and we'll write one ourselves using Perl. Here's the example problem. Our Linux security team wants to be able to automatically check whether any of our servers are running kernels that have known exploits. However, they're not worried about every vulnerable kernel, only certain ones. They have provided us with the version numbers of three kernels that have small vulnerabilities that they're not particularly worried about but that do need patching, and one they're extremely worried about. Let's say the minor vulnerabilities are in the kernels with version numbers 2.6.19, 2.6.24, and 3.0.1. The serious vulnerability is in the kernel with version number 2.6.39. Note that these version numbers in this case are arbitrary and don't necessarily reflect any real kernel vulnerabilities! The team could log in to all of the servers individually to check them, but the servers are of varying ages and access methods, and they are managed by different people. They would also have to check manually more than once because it's possible that a naive administrator could upgrade to a kernel that's known to be vulnerable in an older release, and they also might want to add other vulnerable kernel numbers for checking later on. 
So, the team have asked us to solve the problem with Nagios Core monitoring, and we've decided that the best way to do it is to write our own plugin, check_vuln_kernel, thatchecks the output of uname(1)for a kernel version string, and then does the following: If it's one of the slightly vulnerable kernels, it will return a WARNING state so that we can let the security team know that they should address it when they're next able to. If it's the highly vulnerable kernel version, it will return a CRITICAL state so that the security team knows that a patched kernel needs to be installed immediately. If uname(1) gives an error or output we don't understand, it will return an UNKNOWN state, alerting the team to a bug in the plugin or possibly more serious problems with the server. Otherwise, it returns an OK state, confirming that the kernel is not known to be a vulnerable one. Finally, in the Nagios Core monitoring, they want to be able to see at a glance what the kernel version is and whether it's vulnerable or not. For the purposes of this example, we'll only monitor the Nagios Core server; however, via NRPE, we'd be able to install this plugin on the other servers that require this monitoring, they'll work just fine here as well. While this problem is very specific, we'll approach it in a very general way, which you'll be able to adapt to any solution where it's required for a Nagios plugin to: Run a command and pull its output into a variable. Check the output for the presence or absence of certain patterns. Return an appropriate status based on those tests. All that this means is that if you're able to do this, you'll be able to monitor anything effectively from Nagios Core! Getting ready You should have a Nagios Core 4.0 or newer server running with a few hosts and services configured already. You should also already be familiar with the relationship between services, commands, and plugins. You should have Perl installed, at least version 5.10. This will include the required POSIX module. You should also have the Perl modules Nagios::Plugin(or Monitoring::Plugin) andReadonly installed. On Debian-like systems, you can install this with the following: # apt-get install libnagios-plugin-perl libreadonly-perl On RPM-based systems, such as CentOS or Fedora Core, the following command should work: # yum install perl-Nagios-Plugin perl-Readonly This will be a rather long recipe that ties in a lot of Nagios Core concepts. You should be familiar with all the following concepts: Defining new hosts and services and how they relate to one another Defining new commands and how they relate to the plugins they call Installing, testing, and using Nagios Core plugins Some familiarity with Perl would also be helpful, but it is not required. We'll include comments to explain what each block of code is doing in the plugin. How to do it... We can write, test, and implement our example plugin as follows: Change to the directory containing the plugin binaries for Nagios Core. The default location is /usr/local/nagios/libexec: # cd /usr/local/nagios/libexec Start editing a new file called check_vuln_kernel: # vi check_vuln_kernel Include the following code in it. Take note of the comments, which explain what each block of code is doing. 
#!/usr/bin/env perl   # Use strict Perl style use strict; use warnings; use utf8;   # Require at least Perl v5.10 use 5.010;   # Require a few modules, including Nagios::Plugin use Nagios::Plugin; use POSIX; use Readonly;   # Declare some constants with patterns that match bad kernels Readonly::Scalar my $CRITICAL_REGEX => qr/^2[.]6[.]39[^\d]/msx; Readonly::Scalar my $WARNING_REGEX =>   qr/^(?:2[.]6[.](?:19|24)|3[.]0[.]1)[^\d]/msx;   # Run POSIX::uname() to get the kernel version string my @uname   = uname(); my $version = $uname[2];   # Create a new Nagios::Plugin object my $np = Nagios::Plugin->new();   # If we couldn't get the version, bail out with UNKNOWN if ( !$version ) {     $np->nagios_die('Could not read kernel version string'); }   # Exit with CRITICAL if the version string matches the critical pattern if ( $version =~ $CRITICAL_REGEX ) {     $np->nagios_exit( CRITICAL, $version ); }   # Exit with WARNING if the version string matches the warning pattern if ( $version =~ $WARNING_REGEX ) {     $np->nagios_exit( WARNING, $version ); }   # Exit with OK if neither of the patterns matched $np->nagios_exit( OK, $version ); Make the plugin owned by the nagios group and executable with chmod(1): # chown root.nagios check_vuln_kernel # chmod 0770 check_vuln_kernel Run the plugin directly to test it: # sudo -s -u nagios $ ./check_vuln_kernel VULN_KERNEL OK: 3.16.0-4-amd64 We should now be able to use the plugin in a command, and hence in a service check just like any other command. How it works... The code we added in the new plugin file, check_vuln_kernel, earlier is actually quite simple: It runs Perl's POSIX uname implementation to get the version number of the kernel If that didn't work, it exits with the UNKNOWN status If the version number matches anything in a pattern containing critical version numbers, it exits with the CRITICAL status If the version number matches anything in a pattern containing warning version numbers, it exits with the WARNING status Otherwise, it exits with the OK status It also prints the status as a string along with the kernel version number, if it was able to retrieve one. We might set up a command definition for this plugin, as follows: define command {     command_name  check_vuln_kernel     command_line  $USER1$/check_vuln_kernel } In turn, we might set up a service definition for that command, as follows: define service {     use                  local-service     host_name            localhost     service_description  VULN_KERNEL     check_command        check_vuln_kernel } If the kernel was not vulnerable, the service's appearance in the web interface might be something like this: However, if the monitoring server itself happened to be running a vulnerable kernel, it might look more like this (and send consequent notifications, if configured to do so): There's more... This may be a simple plugin, but its structure can be generalized to all sorts of monitoring tasks. If we can figure out the correct logic to return the status we want in an appropriate programming language, then we can write a plugin to do basically anything. A plugin like this could just as effectively be written in C for improved performance, but we'll assume for simplicity's sake that high performance for the plugin is not required, and we can instead use a language that's better suited for quick ad hoc scripts like this one, in this case, Perl. The utils.sh file, also in /usr/local/nagios/libexec, allows us to write in shell script if we'd prefer that.
If you prefer Python, the nagiosplugin library should meet your needs for both Python 2 and Python 3. Ruby users may like the nagiosplugin gem. If you write a plugin that you think could be generally useful for the Nagios community at large, consider putting it under a free software license and submitting it to the Nagios Exchange so that others can benefit from your work. Community contribution and support is what has made Nagios Core such a great monitoring platform in such wide use. Any plugin you publish in this way should conform to the Nagios Plugin Development Guidelines. At the time of writing, these are available at https://nagios-plugins.org/doc/guidelines.html. You may find older Nagios Core plugins written in Perl using the utils.pm file instead of Nagios::Plugin or Monitoring::Plugin. This will work fine, but Nagios::Plugin is recommended, as it includes more functionality out of the box and tends to be easier to use.

See also

The Creating a new command recipe
The Customizing an existing command recipe

Summary

In this article, we learned how to install a custom plugin retrieved from Nagios Exchange onto a Nagios Core server so that we can use it in a Nagios Core command, how to remove a plugin that we no longer need from our Nagios Core installation, and how to create, write, and customize new commands.

Resources for Article: Further resources on this subject: An Introduction To NODE.JS Design Patterns [article] Developing A Basic Site With NODE.JS And EXPRESS [article] Creating Our First App With IONIC [article]

Customizing heat maps (Intermediate)

Packt
22 Feb 2016
11 min read
This article will help you explore more advanced functions to customize the layout of heat maps. The main focus lies on the usage of different color palettes, but we will also cover other useful features, such as cell notes, which will be used in this recipe. (For more resources related to this topic, see here.) To ensure that our heat maps look good in any situation, we will make use of different color palettes in this recipe, and we will even learn how to create our own. Further, we will add some more extras to our heat maps, including visual aids such as cell note labels, which will make them even more useful and accessible as a tool for visual data analysis. The following image shows a heat map with cell notes and an alternative color palette created from the arabidopsis_genes.csv data set:

Getting ready

Download the 5644OS_03_01.r script and the Arabidopsis_genes.csv data set from your account at http://www.packtpub.com and save them to your hard drive. I recommend that you save the script and data file to the same folder on your hard drive. If you execute the script from a different location to the data file, you will have to change the current R working directory accordingly. The script will check automatically if any additional packages need to be installed in R.

How to do it...

Execute the following code in R via the 5644OS_03_01.r script and take a look at the PDF file custom_heatmaps.pdf that will be created in the current working directory:

### loading packages
if (!require("gplots")) {
  install.packages("gplots", dependencies = TRUE)
  library(gplots)
}
if (!require("RColorBrewer")) {
  install.packages("RColorBrewer", dependencies = TRUE)
  library(RColorBrewer)
}

### reading in data
gene_data <- read.csv("arabidopsis_genes.csv")
row_names <- gene_data[,1]
gene_data <- data.matrix(gene_data[,2:ncol(gene_data)])
rownames(gene_data) <- row_names

### setting heatmap.2() default parameters
heat2 <- function(...) heatmap.2(gene_data,
                                 tracecol = "black",
                                 dendrogram = "column",
                                 Rowv = NA,
                                 trace = "none",
                                 margins = c(8,10),
                                 density.info = "density",
                                 ...)

pdf("custom_heatmaps.pdf")

### 1) customizing colors

# 1.1) in-built color palettes
heat2(col = terrain.colors(n = 1000), main = "1.1) Terrain Colors")

# 1.2) RColorBrewer palettes
heat2(col = brewer.pal(n = 9, "YlOrRd"), main = "1.2) Brewer Palette")

# 1.3) creating own color palettes
my_colors <- c(y1 = "#F7F7D0", y2 = "#FCFC3A", y3 = "#D4D40D",
               b1 = "#40EDEA", b2 = "#18B3F0", b3 = "#186BF0",
               r1 = "#FA8E8E", r2 = "#F26666", r1 = "#C70404")
heat2(col = my_colors, main = "1.3) Own Color Palette")

my_palette <- colorRampPalette(c("blue", "yellow", "red"))(n = 1000)
heat2(col = my_palette, main = "1.3) ColorRampPalette")

# 1.4) gray scale
heat2(col = gray(level = (0:100)/100), main = "1.4) Gray Scale")

### 2) adding cell notes
fold_change <- 2^gene_data
rounded_fold_changes <- round(fold_change, 2)
heat2(cellnote = rounded_fold_changes, notecex = 0.5, notecol = "black",
      col = my_palette, main = "2) Cell Notes")

### 3) adding column side colors
heat2(ColSideColors = c("red", "gray", "red", rep("green",13)),
      main = "3) ColSideColors")

dev.off()

How it works...

Primarily, we will be using read.csv() and heatmap.2() to read data into R and construct our heat maps.
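For orientation, this is roughly what a bare-bones call looks like before any of the recipe's customizations are applied; a minimal sketch, assuming the gplots package is installed and gene_data is the numeric matrix prepared by the script above:

library(gplots)

# Plain heatmap.2() call: clustered rows and columns, the default
# heat.colors palette, and no trace lines drawn inside the cells
heatmap.2(gene_data, trace = "none", col = heat.colors(1000))

Everything that follows refines this call, which is why wrapping the shared arguments in the heat2() helper defined in the script saves so much repetition.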
In this recipe, however, we will focus on advanced features to enhance our heat maps, such as customizing color and other visual elements: Inspecting the arabidopsis_genes.csv data set: The arabidopsis_genes.csv file contains a compilation of gene expression data from the model plant Arabidopsis thaliana. I obtained the freely available data of 16 different genes as log 2 ratios of target and reference gene from the Arabidopsis eFP Browser (http://bar.utoronto.ca/efp_arabidopsis/). For each gene, expression data of 47 different areas of the plant is available in this data file. Reading the data and converting it into a numeric matrix: We have to convert the data table into a numeric matrix first before we can construct our heat maps: gene_data <- read.csv("arabidopsis_genes.csv") row_names <- gene_data[,1] gene_data <- data.matrix(gene_data[,2:ncol(gene_data)]) rownames(gene_data) <- row_names Creating a customized heatmap.2() function: To reduce typing efforts, we are defining our own version of the heatmap.2() function now, where we will include some arguments that we are planning to keep using throughout this recipe: heat2 <- function(...) heatmap.2(gene_data, tracecol = "black", dendrogram = "column", Rowv = NA, trace = "none", margins = c(8,10), density.info = "density", ...) So, each time we call our newly defined heat2() function, it will behave similar to the heatmap.2() function, except for the additional arguments that we will pass along. We also include a new argument, black, for the tracecol parameter, to better distinguish the density plot in the color key from the background. The built-in color palettes: There are four more color palettes available in the base R that we could use instead of the heat.colors palette: rainbow, terrain.colors, topo.colors, and cm.colors. So let us make use of the terrain.colors color palette now, which will give us a nice color transition from green over yellow to rose: heat2(col = terrain.colors(n = 1000), main = "1.1) Terrain Colors") Every number for the parameter n that is larger than the default value 12 will add additional colors, which will make the transition smoother. A value of 1000 for the n parameter should be more than sufficient to make the transition between the individual colors indistinguishable to the human eye. The following image shows a side-by-side comparison of the heat.colors and terrain.colors color palettes using a different number of color shades: Further, it is also possible to reverse the direction of the color transition. For example, if we want to have a heat.color transition from yellow to red instead of red to yellow in our heat map, we could simply define a reverse function: rev_heat.colors <- function(x) rev(heat.colors(x)) heat2(col = rev_heat.colors(500)) RColorBrewer palettes: A lot of color palettes are available from the RColorBrewer package. To see how they look like, you can type display.brewer.all() into the R command-line after loading the RColorBrewer package. However, in contrast to the dynamic range color palettes that we have seen previously, the RColorBrewer palettes have a distinct number of different colors. 
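If you would rather have a textual overview than the display.brewer.all() plot, the package also ships a small lookup table; a quick sketch, assuming RColorBrewer is loaded as in the script above:

library(RColorBrewer)

# Data frame listing every palette with its maximum number of colors,
# its category (sequential, diverging, or qualitative), and colorblind safety
brewer.pal.info

# For example, check how many colors the YlOrRd palette provides
brewer.pal.info["YlOrRd", "maxcolors"]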
So to select all nine colors from the YlOrRd palette, a gradient from yellow to red, we use the following command: heat2(col = brewer.pal(n = 9, "YlOrRd"), main = "1.2) Brewer Palette") The following image gives you a good overview of all the different color palettes that are available from the RColorBrewer package: Creating our own color palettes: Next, we will see how we can create our own color palettes. A whole bunch of different colors are already defined in R. An overview of those colors can be seen by typing colors() into the command line of R. The most convenient way to assign new colors to a color palette is using hex colors (hexadecimal colors). Many different online tools are freely available that allow us to obtain the necessary hex codes. A great example is color picker (http://www.colorpicker.com), which allows us to choose from a rich color table and provides us with the corresponding hex codes. Once we gather all the hexadecimal codes for the colors that we want to use for our color palette, we can assign them to a variable as we have done before with the explicit color names: my_colors <- c(y1 = "#F7F7D0", y2 = "#FCFC3A", y3 = "#D4D40D", b1 = "#40EDEA", b2 = "#18B3F0", b3 = "#186BF0", r1 = "#FA8E8E", r2 = "#F26666", r1 = "#C70404") heat2(col = my_colors, main = "1.3) Own Color Palette") This is a very handy approach for creating a color key with very distinct colors. However, the downside of this method is that we have to provide a lot of different colors if we want to create a smooth color gradient; we have used 1000 different colors for the terrain.color() palette to get a smooth transition in the color key! Using colorRampPalette for smoother color gradients: A convenient approach to create a smoother color gradient is to use the colorRampPalette() function, so we don't have to insert all the different colors manually. The function takes a vector of different colors as an argument. Here, we provide three colors: blue for the lower end of the color key, yellow for the middle range, and red for the higher end. As we did it for the in-built color palettes, such as heat.color, we assign the value 1000 to the n parameter: my_palette <- colorRampPalette(c("blue", "yellow", "red"))(n = 1000) heat2(col = my_palette, main = "1.3) ColorRampPalette") In this case, it is more convenient to use discrete color names over hex colors, since we are using the colorRampPalette() function to create a gradient and do not need all the different shades of a particular color. Grayscales: It might happen that the medium or device that we use to display our heat maps does not support colors. Under these circumstances, we can use the gray palette to create a heat map that is optimized for those conditions. The level parameter of the gray() function takes a vector with values between 0 and 1 as an argument, where 0 represents black and 1 represents white, respectively. For a smooth gradient, we use a vector with 100 equally spaced shades of gray ranging from 0 to 1. heat2(col = gray(level = (0:200)/200), main ="1.4) Gray Scale") We can make use of the same color palettes for the levelplot() function too. It works in a similar way as it did for the heatmap.2() function that we are using in this recipe. However, inside the levelplot() function call, we must use col.regions instead of the simple col, so that we can include a color palette argument. Adding cell notes to our heat map: Sometimes, we want to show a data set along with our heat map. 
A neat way is to use so-called cell notes to display data values inside the individual heat map cells. The underlying data matrix for the cell notes does not necessarily have to be the same numeric matrix we used to construct our heat map, as long as it has the same number of rows and columns. As we recall, the data we read from arabidopsis_genes.csv resembles log 2 ratios of sample and reference gene expression levels. Let us calculate the fold changes of the gene expression levels now and display them—rounded to two digits after the decimal point—as cell notes on our heat map: fold_change <- 2^gene_data rounded_fold_changes <- round(fold_change, 2) heat2(cellnote = rounded_fold_changes, notecex = 0.5, notecol = "black", col = rev_heat.colors, main = "Cell Notes") The notecex parameter controls the size of the cell notes. Its default size is 1, and every argument between 0 and 1 will make the font smaller, whereas values larger than 1 will make the font larger. Here, we decreased the font size of the cell notes by 50 percent to fit it into the cell boundaries. Also, we want to display the cell notes in black to have a nice contrast to the colored background; this is controlled by the notecol parameter. Row and column side colors: Another approach to pronounce certain regions, that is, rows or columns on the heat map is to make use of row and column side colors. The ColSideColors argument will place a colored box between the dendrogram and heat map that can be used to annotate certain columns. We pass our vector with colors to ColSideColors, where its length must be equal to the number of columns of the heat map. Here, we want to color the first and third column red, the second one gray, and all the remaining 13 columns green: heat2(ColSideColors = c("red", "gray", "red", rep("green", 13)), main = "ColSideColors") You can see in the following image how the column side colors look like when we include the ColSideColors argument as shown previously: Attentive readers may have noticed that the order of colors in the column color box slightly differs from the order of colors we passed as a vector to ColSideColors. We see red two times next to each other, followed by a green and a gray box. This is due to the fact that the columns of our heat map have been reordered by the hierarchical clustering algorithm. Summary To learn more about the similar technology, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Instant R Starter (https://www.packtpub.com/big-data-and-business-intelligence/instant-r-starter-instant) Machine Learning with R - Second Edition (https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition) Mastering RStudio – Develop, Communicate, and Collaborate with R (https://www.packtpub.com/application-development/mastering-rstudio-%E2%80%93-develop-communicate-and-collaborate-r) Resources for Article: Further resources on this subject: Data Analysis Using R[article] Big Data Analysis[article] Big Data Analysis (R and Hadoop)[article]

Architectural and Feature Overview

Packt
22 Feb 2016
12 min read
 In this article by Giordano Scalzo, the author of Learning VMware App Volumes, we are going to look a little deeper into the different component parts that make up an App Volumes solution. Then, once you are familiar with these different components, we will discuss how they fit and work together. (For more resources related to this topic, see here.) App Volumes Components We are going to start by covering an overview of the different core components that make up the complete App Volumes solution, a glossary if you like. These are either the component parts of the actual App Volumes solution or additional components that are required to build your complete environment. App Volumes Manager The App Volumes Manager is the heart of the solution. Installed on a Windows Server operating system, the App Volumes Manager controls the application delivery engine and also provides you the access to a web-based dashboard and console from where you can manage your entire App Volumes environment. You will get your first glimpse of the App Volumes Manager when you complete the installation process and start the post-installation tasks, where you will configure details about your virtual host servers, storage, Active Directory, and other environment variables. Once you have completed the installation tasks, you will use the App Volumes Manager to perform tasks, such as creating new and updating existing AppStacks, creating Writable Volumes as well as then assigning both AppStacks and Writable Volumes to end users or virtual desktop machines. The App Volumes Manager also manages the virtual desktop machine that has the App Volumes Agent installed. Once virtual desktop machine has the agent installed, then it will then appear within the App Volumes Manager inventory so that you are able to configure assignments. In summary the App Volumes Manager performs the following functions: It provides the following functionality: Orchestrates the key infrastructure components such as, Active Directory, AppStack or Writable Volumes attachments, virtual hosting infrastructure (ESXi hosts and vCenter Servers) Manages assignments of AppStack or Writable Volumes to users, groups, and virtual desktop machines Collates AppStacks and Writable Volumes usage Provides a history of administrative actions Acts as a broker for the App Volumes agents for automated assignment of AppStacks and Writable Volumes as virtual desktop machines boot up and the end user logs in Provides a web-based graphical interface from which to manage the entire environment Throughout this article you will see the following icon used in any drawings or schematics to denote the App Volumes Manager. App Volumes Agent The App Volumes Agent is installed onto a virtual desktop machine on which you want to be able to attach AppStacks or Writable Volumes, and runs as a service on that machine. As such it is invisible to the end user. When you attach AppStack or Writable Volume to a virtual machine, then the agent acts as a filter driver and takes care of any application calls and file system redirects between the operating system and the App Stack or Writable Volume. Rather than seeing your AppStack, which appears as an additional hard drive within the operating system, the agent makes the applications appear as if they were natively installed. So, for example, the icons for your applications will automatically appear on your desktop/taskbar. The App Volumes Agent is also responsible for registering the virtual machine with the App Volumes Manager. 
Throughout this article, you will see the following icon used in any drawings or schematics to denote the App Volumes Agent. The App Volumes Agent can also be installed onto an RDSH host server to allow the attaching of AppStacks within a hosted applications environment. AppStacks An AppStack is a read-only volume that contains your applications, which is mounted as a Virtual Machine Disk file (VMDK) for VMware environments, or as a Virtual Hard Disk file (VHD) for Citrix and Microsoft environments on your virtual desktop machine, or RDSH host server. An AppStack is created using a provisioning machine, which has the App Volumes Agent installed on it. Then, as a part of the provisioning process, you mount an empty container (VMDK or VHD file) and then install the application(s) as you would do normally. The App Volumes Agent redirects the installation files, file system, and registry settings to the AppStack. Once completed, AppStack is set to read-only, which then allows one AppStack to be used for multiple users. This not only helps you reduce the storage requirements (an App Stack is also thin provisioned) but also allows any application that is delivered via AppStack to be centrally managed and updated. AppStacks are then delivered to the end users either as individual user assignments or via group membership using Active Directory. Throughout this article, you will see the following icon used in any drawings or schematics to denote AppStack. Writable Volumes One of the many use cases that was not best suited to a virtual desktop environment was that of developers, where they would need to install various different applications and other software. To cater for this use case, you would need to deploy a dedicated, persistent desktop to meet their requirements. This method of deployment is not necessarily the most cost-effective method, which potentially requires additional infrastructure resources, and management. With App Volumes, this all changes with the Writable Volumes feature. In the same way as you assign AppStack containing preinstalled and configured applications to an end user, with Writable Volumes, you attach an empty container as a VMDK file to their virtual desktop machine into which they can install their own applications. This virtual desktop machine will be running the App Volumes Agent, which provides the filter between any applications that the end user installs into the Writable Volume and the native operating system of the virtual desktop machine. The user then has their own drive onto which they can install applications. Now you can deploy nonpersistent, floating desktops for these users and attach not only their corporate applications via AppStacks, but also their own user-installed applications via a Writable Volume. Throughout this article, you will see the following icon used in any drawings or schematics to denote a Writable Volume. Provisioning virtual machine Although not an actual part of the App Volumes software, a key component is to have a clean virtual desktop machine to use as reference point from which to create your AppStacks from. This is known as the provisioning machine. Once you have your provisioning virtual desktop machine, you first install the App Volumes Agent onto it. Then, from the App Volumes Manager, you initiate the provisioning process, which attaches an empty VMDK file to the provisioning virtual desktop machine, and then prompts you, as the IT admin, to install the application. 
Before you start the installation of the application(s) that you are going to create as AppStack, it’s a good practice to take a snapshot before you start. in this way, you can roll back to your clean virtual desktop machine state before installation, ready to create the next AppStack. Throughout this article, you will see the following icon used in any drawings or schematics to denote a provisioning machine. A Broker Integration service The Broker Integration service is installed on a VMware Horizon View Connection Server, and it provides faster log on times for the end users who have access to a Horizon View virtual desktop machine. Throughout this article, you will see the following icon used in any drawings or schematics to denote the Broker Integration Service. Storage Groups Again, although not a specific component of App Volumes, you have the ability to define Storage Groups to store your AppStacks and Writable Volumes. Storage Groups are primarily used to provide replication of AppStacks and distribute Writable Volumes across multiple data stores. With AppStack storage groups, you can define a group of data stores that will be used to store the same AppStacks, enabling replication to be automatically deployed on those data stores. With Writable Volumes, only some of the storage group settings will apply attributes for the storage group, for example, the template location and distribution strategy. The distribution strategy allows you to define how writable volumes are distributed across the storage group. There are two settings for this as described: Spread: This will distribute files evenly across all the storage locations. When a file is created, the storage with the most available space is used. Round-Robin: This works by distributing the Writable Volume files sequentially, using the storage location that was used the longest amount of time ago. In this article, you will see the following icon used in any drawings or schematics to denote storage groups. We have introduced you to the core components that make up the App Volumes deployment. App Volumes Architecture Now that you understand what each of the individual components is used for, the next step is to look at how they all fit together to form the complete solution. We are going to break the architecture down into two parts. The first part will be focused on the application delivery and virtual desktop machines from an end user’s perspective. In the second part, we will look more at the supporting and underlying infrastructure, so the view from an IT administrator’s point of view. Finally, in the infrastructure section, we will look at the infrastructure with a networking hat on and illustrate the various network ports we are going to require to be available to us. So let's go back and look at our first part, what the end user will see. In this example, we have a virtual desktop machine to run a Windows operating system as the starting point of our solution. Onto that virtual desktop machine, we have installed the App Volumes Agent. We also have some core applications already installed onto this virtual desktop machine as a part of the core/parent image. These would be applications that are delivered to every user, such as Adobe Reader for example. This is exactly the same best practice as we would normally follow in any other virtual desktop environment. The updates here would be taken care of by updating the parent image and then using the recompose feature of linked clones in Horizon View. 
With the agent installed, the virtual desktop machine will appear in the App Volumes Manager console from where we can start to assign AppStacks to our Active Directory users and groups. When a user who has been assigned AppStack or Writable Volume logs in to a virtual desktop machine, AppStack that has been assigned to them will be attached to that virtual desktop machine, and the applications within that AppStack will seamlessly appear on the desktop. Users will also have access to their Writable Volume. The following diagram illustrates an example deployment from the virtual desktop machines perspective, as we have just described. Moving on to the second part of our focus on the architecture, we are now going to look at the underlying/supporting infrastructure. As a starting point, all of our infrastructure components have been deployed as virtual machines. They are hosted on the VMware vSphere platform. The following diagram illustrates the infrastructure components and how they fit together to deliver the applications to the virtual desktop machines. In the top section of the diagram, we have the virtual desktop machine running our Windows operating system with the App Volumes Agent installed. Along with acting as the filter driver, the agent talks to the App Volumes Manager (1) to read user assignment information for who can access which AppStacks and Writable Volumes. The App Volumes Manager also communicates with Active Directory (2) to read user, group, and machine information to assign AppStacks and Writable Volumes. The virtual desktop machine also talks to Active Directory to authenticate user logins (3). The App Volumes Manager also needs access to a SQL database (4), which stores the information about the assignments, AppStacks, Writable Volumes, and so on. A SQL database is also a requirement for vCenter Server (5), and if you are using the linked clone function of Horizon View, then a database is required for the View Composer. The final part of this diagram shows the App Volumes storage groups that are used to store the AppStacks and the Writable Volumes. These get mounted to the virtual desktop machines as virtual disks or VMDK files (6). Following on from the architecture and the how the different components fit together and communicate, later we are going to cover which ports need to be open to allow the communication between the various services and components. Network ports Now, we are going to cover the firewall ports that are required to be open in order for the App Volumes components to communicate with the other infrastructure components. The diagram here shows the port numbers (highlighted in the boxes) that are required to be open for each component to communicate. It's worth ensuring that these ports are configured before you start the deployment of App Volumes. Summary In this article, we introduced you to the individual components that make up the App Volumes solution and what task each of them performs. We then went on to look at how those components fit into the overall solution architecture, as well as how the architecture works. Resources for Article:   Further resources on this subject: Elastic Load Balancing [article] Working With CEPH Block Device [article] Android and IOs Apps Testing At A Glance [article]

What is Naïve Bayes classifier?

Packt
22 Feb 2016
9 min read
The name Naïve Bayes comes from the basic assumption in the model that the probability of a particular feature $X_i$ is independent of any other feature $X_j$, given the class label $C_K$. This implies the following:

$$P(X_1, X_2, \ldots, X_n \mid C_K) = \prod_{i=1}^{n} P(X_i \mid C_K)$$

Using this assumption and the Bayes rule, one can show that the probability of class $C_K$, given features $\{X_1, X_2, X_3, \ldots, X_n\}$, is given by:

$$P(C_K \mid X_1, X_2, \ldots, X_n) = \frac{P(C_K)\prod_{i=1}^{n} P(X_i \mid C_K)}{P(X_1, X_2, \ldots, X_n)}$$

Here, $P(X_1, X_2, X_3, \ldots, X_n)$ is the normalization term obtained by summing the numerator over all the values of $K$. It is also called the Bayesian evidence or partition function $Z$. The classifier selects as the target class the class label that maximizes the posterior class probability $P(C_K \mid \{X_1, X_2, X_3, \ldots, X_n\})$:

$$\hat{C} = \underset{K}{\arg\max}\ P(C_K)\prod_{i=1}^{n} P(X_i \mid C_K)$$

The Naïve Bayes classifier is a baseline classifier for document classification. One reason for this is that the underlying assumption that each feature (words or m-grams) is independent of the others, given the class label, typically holds good for text. Another reason is that the Naïve Bayes classifier scales well when there is a large number of documents. There are two implementations of Naïve Bayes. In Bernoulli Naïve Bayes, features are binary variables that encode whether a feature (m-gram) is present or absent in a document. In multinomial Naïve Bayes, the features are frequencies of m-grams in a document. To avoid issues when a frequency is zero, Laplace smoothing is applied to the feature vectors by adding a 1 to each count. Let's look at multinomial Naïve Bayes in some detail. Let $n_i$ be the number of times the feature $X_i$ occurred in the class $C_K$ in the training data. Then, the likelihood function of observing a feature vector $X = \{X_1, X_2, X_3, \ldots, X_n\}$, given a class label $C_K$, is given by:

$$P(X \mid C_K) \propto \prod_{i=1}^{n} \theta_{Ki}^{\,X_i}$$

Here, $\theta_{Ki}$ is the probability of observing the feature $X_i$ in the class $C_K$. Using the Bayes rule, the posterior probability of observing the class $C_K$, given a feature vector $X$, is given by:

$$P(C_K \mid X) = \frac{P(C_K)\prod_{i=1}^{n} \theta_{Ki}^{\,X_i}}{Z}$$

Taking the logarithm on both sides and ignoring the constant term $Z$, we get the following:

$$\log P(C_K \mid X) = \log P(C_K) + \sum_{i=1}^{n} X_i \log \theta_{Ki}$$

So, by taking the logarithm of the posterior distribution, we have converted the problem into a linear model with $\log \theta_{Ki}$ as the coefficients to be determined from the data. This can be easily solved. Generally, instead of raw term frequencies, one uses TF-IDF (term frequency multiplied by inverse document frequency) with the document length normalized to improve the performance of the model. The R package e1071 (Miscellaneous Functions of the Department of Statistics) by T.U. Wien contains an R implementation of Naïve Bayes. For this article, we will use the SMS spam dataset from the UCI Machine Learning repository. The dataset consists of 425 SMS spam messages collected from the UK forum Grumbletext, where consumers can submit spam SMS messages. The dataset also contains 3375 normal (ham) SMS messages from the NUS SMS corpus maintained by the National University of Singapore. The dataset can be downloaded from the UCI Machine Learning repository (https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection). Let's say that we have saved this as the file SMSSpamCollection.txt in the working directory of R (actually, you need to open it in Excel and save it as a tab-delimited file for it to be read into R properly).
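Before we turn to the real dataset, here is a small hand-rolled sketch of the multinomial computation above on a toy term-count matrix; it is purely illustrative, all the variable names are invented, and the actual model fitting later in this article is done by the e1071 package:

# Toy term-count matrix: two documents per class, three features (terms)
toy_counts <- rbind(spam1 = c(3, 0, 1),
                    spam2 = c(2, 1, 0),
                    ham1  = c(0, 2, 2),
                    ham2  = c(1, 3, 1))
toy_class <- c("spam", "spam", "ham", "ham")

# Laplace-smoothed estimates of theta_Ki = P(feature i | class K):
# add 1 to every count, then normalize within each class
theta <- t(sapply(split(as.data.frame(toy_counts), toy_class), function(d) {
  n_i <- colSums(d) + 1
  n_i / sum(n_i)
}))

# Class priors and the log-posterior score of a new document x
prior  <- table(toy_class) / length(toy_class)
x      <- c(2, 0, 1)
scores <- log(prior[rownames(theta)]) + log(theta) %*% x

# The predicted class is the one with the highest score
rownames(theta)[which.max(scores)]

The scores computed here are exactly the $\log P(C_K) + \sum_i X_i \log \theta_{Ki}$ expression derived above, which is all the classifier needs in order to pick a class.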
Then, the command to read the file into the tm (text mining) package would be the following: >spamdata ←read.table("SMSSpamCollection.txt",sep="\t",stringsAsFactors = default.stringsAsFactors()) We will first separate the dependent variable y and independent variables x and split the dataset into training and testing sets in the ratio 80:20, using the following R commands: >samp←sample.int(nrow(spamdata),as.integer(nrow(spamdata)*0.2),replace=F) >spamTest ←spamdata[samp,] >spamTrain ←spamdata[-samp,] >ytrain←as.factor(spamTrain[,1]) >ytest←as.factor(spamTest[,1]) >xtrain←as.vector(spamTrain[,2]) >xtest←as.vector(spamTest[,2]) Since we are dealing with text documents, we need to do some standard preprocessing before we can use the data for any machine learning models. We can use the tm package in R for this purpose. In the next section, we will describe this in some detail. Text processing using the tm package The tm package has methods for data import, corpus handling, preprocessing, metadata management, and creation of term-document matrices. Data can be imported into the tm package either from a directory, a vector with each component a document, or a data frame. The fundamental data structure in tm is an abstract collection of text documents called Corpus. It has two implementations; one is where data is stored in memory and is called VCorpus (volatile corpus) and the second is where data is stored in the hard disk and is called PCorpus (permanent corpus). We can create a corpus of our SMS spam dataset by using the following R commands; prior to this, you need to install the tm package and SnowballC package by using the install.packages("packagename") command in R: >library(tm) >library(SnowballC) >xtrain ← VCorpus(VectorSource(xtrain)) First, we need to do some basic text processing, such as removing extra white space, changing all words to lowercase, removing stop words, and stemming the words. This can be achieved by using the following functions in the tm package: >#remove extra white space >xtrain ← tm_map(xtrain,stripWhitespace) >#remove punctuation >xtrain ← tm_map(xtrain,removePunctuation) >#remove numbers >xtrain ← tm_map(xtrain,removeNumbers) >#changing to lower case >xtrain ← tm_map(xtrain,content_transformer(tolower)) >#removing stop words >xtrain ← tm_map(xtrain,removeWords,stopwords("english")) >#stemming the document >xtrain ← tm_map(xtrain,stemDocument) Finally, the data is transformed into a form that can be consumed by machine learning models. This is the so called document-term matrix form where each document (SMS in this case) is a row, the terms appearing in all documents are the columns, and the entry in each cell denotes how many times each word occurs in one document: >#creating Document-Term Matrix >xtrain ← as.data.frame.matrix(DocumentTermMatrix(xtrain)) The same set of processes is done on the xtest dataset as well. The reason we converted y to factors and xtrain to a data frame is to match the input format for the Naïve Bayes classifier in the e1071 package. Model training and prediction You need to first install the e1071 package from CRAN. The naiveBayes() function can be used to train the Naïve Bayes model. The function can be called using two methods. 
The following is the first method: >naiveBayes(formula,data,laplace=0, ,subset,na.action=na.pass) Here formula stands for the linear combination of independent variables to predict the following class: >class ~ x1+x2+… Also, data stands for either a data frame or contingency table consisting of categorical and numerical variables. If we have the class labels as a vector y and dependent variables as a data frame x, then we can use the second method of calling the function, as follows: >naiveBayes(x,y,laplace=0,…) We will use the second method of calling in our example. Once we have a trained model, which is an R object of class naiveBayes, we can predict the classes of new instances as follows: >predict(object,newdata,type=c(class,raw),threshold=0.001,eps=0,…) So, we can train the Naïve Bayes model on our training dataset and score on the test dataset by using the following commands: >#Training the Naive Bayes Model >nbmodel ← naiveBayes(xtrain,ytrain,laplace=3) >#Prediction using trained model >ypred.nb ← predict(nbmodel,xtest,type = "class",threshold = 0.075) >#Converting classes to 0 and 1 for plotting ROC >fconvert ← function(x){ if(x == "spam"){ y ← 1} else {y ← 0} y } >ytest1 ← sapply(ytest,fconvert,simplify = "array") >ypred1 ← sapply(ypred.nb,fconvert,simplify = "array") >roc(ytest1,ypred1,plot = T)  Here, the ROC curve for this model and dataset is shown. This is generated using the pROC package in CRAN: >#Confusion matrix >confmat ← table(ytest,ypred.nb) >confmat pred.nb ytest ham spam ham 143 139 spam 9 35 From the ROC curve and confusion matrix, one can choose the best threshold for the classifier, and the precision and recall metrics. Note that the example shown here is for illustration purposes only. The model needs be to tuned further to improve accuracy. We can also print some of the most frequent words (model features) occurring in the two classes and their posterior probabilities generated by the model. This will give a more intuitive feeling for the model exercise. The following R code does this job: >tab ← nbmodel$tables >fham ← function(x){ y ← x[1,1] y } >hamvec ← sapply(tab,fham,simplify = "array") >hamvec ← sort(hamvec,decreasing = T) >fspam ← function(x){ y ← x[2,1] y } >spamvec ← sapply(tab,fspam,simplify = "array") >spamvec ← sort(spamvec,decreasing = T) >prb ← cbind(spamvec,hamvec) >print.table(prb)  The output table is as follows: word Prob(word|spam) Prob(word|ham) call 0.6994 0.4084 free 0.4294 0.3996 now 0.3865 0.3120 repli 0.2761 0.3094 text 0.2638 0.2840 spam 0.2270 0.2726 txt 0.2270 0.2594 get 0.2209 0.2182 stop 0.2086 0.2025 The table shows, for example, that given a document is spam, the probability of the word call appearing in it is 0.6994, whereas the probability of the same word appearing in a normal document is only 0.4084. Summary In this article, we learned a basic and popular method for classification, Naïve Bayes, implemented using the Bayesian approach. For further information on Bayesian models, you can refer to: https://www.packtpub.com/big-data-and-business-intelligence/data-analysis-r https://www.packtpub.com/big-data-and-business-intelligence/building-probabilistic-graphical-models-python Resources for Article: Further resources on this subject: Introducing Bayesian Inference [article] Practical Applications of Deep Learning [article] Machine learning in practice [article]

Building A Recommendation System with Azure

Packt
19 Feb 2016
7 min read
Recommender systems are common these days. You may not have noticed, but you might already be a user or receiver of such a system somewhere. Most of the well-performing e-commerce platforms use recommendation systems to recommend items to their users. When you see on the Amazon website that a book is recommended to you based on your earlier preferences, purchases, and browse history, Amazon is actually using such a recommendation system. Similarly, Netflix uses its recommendation system to suggest movies for you. (For more resources related to this topic, see here.) A recommender or recommendation system is used to recommend a product or information often based on user characteristics, preferences, history, and so on. So, a recommendation is always personalized. Until recently, it was not so easy or straightforward to build a recommender, but Azure ML makes it really easy to build one as long as you have your data ready. This article introduces you to the concept of recommendation systems and also the model available in ML Studio for you to build your own recommender system. It then walks you through the process of building a recommendation system with a simple example. The Matchbox recommender Microsoft has developed a large-scale recommender system based on a probabilistic model (Bayesian) called Matchbox. This model can learn about a user's preferences through observations made on how they rate items, such as movies, content, or other products. Based on those observations, it recommends new items to the users when requested. Matchbox uses the available data for each user in the most efficient way possible. The learning algorithm it uses is designed specifically for big data. However, its main feature is that Matchbox takes advantage of metadata available for both users and items. This means that the things it learns about one user or item can be transferred across to other users or items. You can find more information about the Matchbox model at the Microsoft Research project link. Kinds of recommendations The Matchbox recommender supports the building of four kinds of recommenders, which will include most of the scenarios. Let's take a look at the following list: Rating Prediction: This predicts ratings for a given user and item, for example, if a new movie is released, the system will predict what will be your rating for that movie out of 1-5. Item Recommendation: This recommends items to a given user, for example, Amazon suggests you books or YouTube suggests you videos to watch on its home page (especially when you are logged in). Related Users: This finds users that are related to a given user, for example, LinkedIn suggests people that you can get connected to or Facebook suggests friends to you. Related Items: This finds the items related to a given item, for example, a blog site suggests you related posts when you are reading a blog post. Understanding the recommender modules The Matchbox recommender comes with three components; as you might have guessed, a module each to train, score, and evaluate the data. The modules are described as follows. The train Matchbox recommender This module contains the algorithm and generates the trained algorithm, as shown in the following screenshot: This module takes the values for the following two parameters. The number of traits This value decides how many implicit features (traits) the algorithm will learn about that are related to every user and item. The higher this value, the precise it would be as it would lead to better prediction. 
Typically, it takes a value in the range of 2 to 20. The number of recommendation algorithm iterations It is the number of times the algorithm iterates over the data. The higher this value, the better would the predictions be. Typically, it takes a value in the range of 1 to 10. The score matchbox recommender This module lets you specify the kind of recommendation and corresponding parameters you want: Rating Prediction Item Prediction Related Users Related Items Let's take a look at the following screenshot: The ML Studio help page for the module provides details of all the corresponding parameters. The evaluate recommender This module takes a test and a scored dataset and generates evaluation metrics, as shown in the following screenshot: It also lets you specify the kind of recommendation, such as the score module and corresponding parameters. Building a recommendation system Now, it would be worthwhile that you learn to build one by yourself. We will build a simple recommender system to recommend restaurants to a given user. ML Studio includes three sample datasets, described as follows: Restaurant customer data: This is a set of metadata about customers, including demographics and preferences, for example, latitude, longitude, interest, and personality. Restaurant feature data: This is a set of metadata about restaurants and their features, such as food type, dining style, and location, for example, placeID, latitude, longitude, price. Restaurant ratings: This contains the ratings given by users to restaurants on a scale of 0 to 2. It contains the columns: userID, placeID, and rating. Now, we will build a recommender that will recommend a given number of restaurants to a user (userID). To build a recommender perform the following steps: Create a new experiment. In the Search box in the modules palette, type Restaurant. The preceding three datasets get listed. Drag them all to the canvas one after another. Drag a Split module and connect it to the output port of the Restaurant ratings module. On the properties section to the right, choose Splitting mode as Recommender Split. Leave the other parameters at their default values. Drag a Project Columns module to the canvas and select the columns: userID, latitude, longitude, interest, and personality. Similarly, drag another Project Columns module and connect it to the Restaurant feature data module and select the columns: placeID, latitude, longitude, price, the_geom_meter, and address, zip. Drag a Train Matchbox Recommender module to the canvas and make connections to the three input ports, as shown in the following screenshot: Drag a Score Matchbox Recommender module to the canvas and make connections to the three input ports and set the property's values, as shown in the following screenshot: Run the experiment and when it gets completed, right-click on the output of the Score Matchbox Recommender module and click on Visualize to explore the scored data. You can note the different restaurants (IDs) recommended as items for a user from the test dataset. The next step is to evaluate the scored prediction. Drag the Evaluate Recommender module to the canvas and connect the second output of the Split module to its first input port and connect the output of the Score Matchbox Recommender module to its second input. Leave the module at its default properties. Run the experiment again and when finished, right-click on the output port of the Evaluate Recommender module and click on Visualize to find the evaluation metric. 
The evaluation metric Normalized Discounted Cumulative Gain (NDCG) is estimated from the ground truth ratings given in the test set. Its value ranges from 0.0 to 1.0, where 1.0 represents the most ideal ranking of the entities. Summary You started with gaining the basic knowledge about a recommender system. You then understood the Matchbox recommender that comes with ML Studio along with its components. You also explored different kinds of recommendations that you can make with it. Finally, you ended up building a simple recommendation system to recommend restaurants to a given user. For more information on Azure, take a look at the following books also by Packt Publishing: Learning Microsoft Azure (https://www.packtpub.com/networking-and-servers/learning-microsoft-azure) Microsoft Windows Azure Development Cookbook (https://www.packtpub.com/application-development/microsoft-windows-azure-development-cookbook) Resources for Article: Further resources on this subject: Introduction to Microsoft Azure Cloud Services[article] Microsoft Azure – Developing Web API for Mobile Apps[article] Security in Microsoft Azure[article]