Visualization Dashboard Design

Packt
10 Jan 2017
18 min read
In this article by David Baldwin, the author of the book Mastering Tableau, we will cover how to create effective dashboards. Since that fateful week in Manhattan, I've read Edward Tufte, Stephen Few, and other thought leaders in the data visualization space. This knowledge has been very fruitful. For instance, quite recently a colleague told me that one of his clients thought a particular dashboard had too many bar charts and he wanted some variation. I shared the following two quotes:

Show data variation, not design variation. – Edward Tufte in The Visual Display of Quantitative Information

Variety might be the spice of life, but, if it is introduced on a dashboard for its own sake, the display suffers. – Stephen Few in Information Dashboard Design

Those quotes proved helpful for my colleague. Hopefully, the following information will prove helpful to you. Additionally, I would like to draw attention to Alberto Cairo, a relatively new voice providing new insight. Each of these authors should be considered a must-read for anyone working in data visualization. This article covers the following topics:

- Visualization design theory
- Dashboard design
- Sheet selection

Visualization design theory

Any discussion on designing dashboards should begin with information about constructing well-designed content. The quality of the dashboard layout and the utilization of technical tips and tricks do not matter if the content is subpar. In other words, we should consider the worksheets displayed on dashboards and ensure that those worksheets are well designed. Therefore, our discussion will begin with a consideration of visualization design principles. Regarding these principles, it's tempting to declare a set of rules such as:

- To plot change over time, use a line graph
- To show breakdowns of the whole, use a treemap
- To compare discrete elements, use a bar chart
- To visualize correlation, use a scatter plot

But of course, even a cursory review of the preceding list brings to mind many variations and alternatives! Thus, we will consider various rules while always keeping in mind that rules (at least rules such as these) are meant to be broken.

Formatting rules

The following formatting rules encompass fonts, lines, and bands. Fonts are, of course, an obvious formatting consideration. Lines and bands, however, may not be something you typically think of when formatting, especially when considering formatting from the perspective of Microsoft Word. But if we broaden formatting considerations to think of Adobe Illustrator, InDesign, and other graphic design tools, then lines and bands are certainly considered. This illustrates that data visualization is closely related to graphic design and that formatting considers much more than just textual layout.

Rule – keep the font choice simple

Typically, using one or two fonts on a dashboard is advisable. More fonts can create a confusing environment and interfere with readability. Fonts chosen for titles should be thick and solid, while the body fonts should be easy to read. As of Tableau 10.0, choosing appropriate fonts is simple because of the new Tableau Font Family. Go to Format | Font to display the Format Font window to see and choose these new fonts. Assuming your dashboard is primarily intended for the screen, sans serif fonts are best. On the rare occasions a dashboard is primarily intended for print, you may consider serif fonts, particularly if the print resolution is high.
Rule – Trend line > Fever line > Reference line > Drop line > Zero line > Grid line

The preceding pseudo formula is intended to communicate line visibility. For example, trend line visibility should be greater than fever line visibility. Visibility is usually enhanced by increasing line thickness but may be enhanced via color saturation or by choosing a dotted or dashed line over a solid line. The trend line, if present, is usually the most visible line on the graph. Trend lines are displayed via the Analytics pane and can be adjusted via Format | Lines. The fever line (for example, the line used on a time-series chart) should not be so heavy as to obscure twists and turns in the data. Although a fever line may be displayed as dotted or dashed by utilizing the Pages shelf, this is usually not advisable because it may obscure visibility. The thickness of a fever line can be adjusted by clicking on the Size shelf in the Marks View card. Reference lines are usually less prevalent than either fever or trend lines and can be formatted by going to Format | Reference lines. Drop lines are not frequently used. To deploy drop lines, right-click in a blank portion of the view and go to Drop lines | Show drop lines. Next, click on a point in the view to display a drop line. To format drop lines, go to Format | Droplines. Drop lines are relevant only if at least one axis is utilized in the visualization. Zero lines (sometimes referred to as base lines) display only if zero or negative values are included in the view or positive numerical values are relatively close to zero. Format zero lines by going to Format | Lines. Grid lines should be the most muted lines on the view and may be dispensed with altogether. Format grid lines by going to Format | Lines.

Rule – band in groups of three to five

Visualizations composed of a tall table of text or horizontal bars should segment dimension members in groups of three to five.

Exercise – banding

1. Navigate to https://public.tableau.com/profile/david.baldwin#!/ to locate and download the workbook.
2. Navigate to the worksheet titled Banding.
3. Select the Superstore data source and place Product Name on the Rows shelf.
4. Double-click on Discount, Profit, Quantity, and Sales.
5. Navigate to Format | Shading and set Band Size under Row Banding so that three to five lines of text are encompassed by each band. Be sure to set an appropriate color for both Pane and Header.

Note that after completing the preceding five steps, Tableau defaulted to banding every other row. This default formatting is fine for a short table but is quite busy for a tall table. The band-in-groups-of-three-to-five rule is influenced by Dona W. Wong, who, in her book The Wall Street Journal Guide to Information Graphics, recommends separating long tables or bar charts with thin rules into groups of three to five to help the readers read across.

Color rules

It seems slightly ironic to discuss color rules in a black-and-white publication such as Mastering Tableau. Nonetheless, even in a monochromatic setting, a discussion of color is relevant. For example, exclusive use of black text communicates differently than using variations of gray. The following survey of color rules should be helpful to ensure that you use colors effectively in a variety of settings.

Rule – keep colors simple and limited

Stick to the basic hues and provide only a few (perhaps three to five) hue variations.
Alberto Cairo, in his book The Functional Art: An Introduction to Information Graphics and Visualization, provides insights into why this is important. The limited capacity of our visual working memory helps explain why it's not advisable to use more than four or five colors or pictograms to identify different phenomena on maps and charts. Rule – respect the psychological implication of colors In Western society, there is a color vocabulary so pervasive, it's second nature. Exit signs marking stairwell locations are red. Traffic cones are orange. Baby boys are traditionally dressed in blue while baby girls wear pink. Similarly, in Tableau reds and oranges should usually be associated with negative performance while blues and greens should be associated with positive performance. Using colors counterintuitively can cause confusion. Rule – be colorblind-friendly Colorblindness is usually manifested as an inability to distinguish red and green or blue and yellow. Red/green and blue/yellow are on opposite sides of the color wheel. Consequently, the challenges these color combinations present for colorblind individuals can be easily recreated with image editing software such as Photoshop. If you are not colorblind, convert an image with these color combinations to grayscale and observe. The challenge presented to the 8.0% of the males and 0.5% of the females who are color blind becomes immediately obvious! Rule – use pure colors sparingly The resulting colors from the following exercise should be a very vibrant red, green, and blue. Depending on the monitor, you may even find it difficult to stare directly at the colors. These are known as pure colors and should be used sparingly; perhaps only to highlight particularly important items. Exercise – using pure colors Open the workbook and navigate to the worksheet entitled Pure Colors. Select the Superstore data source and place Category on both the Rows shelf and the Color shelf. Set the Fit to Entire View. Click on the Color shelf and choose Edit Colors…. In the Edit Colors dialog box, double-click on the color icons to the left of each dimension member; that is, Furniture, Office Supplies, and Technology: Within the resulting dialog box, set furniture to an HTML value of #0000ff, Office Supplies to #ff0000, and Technology to #00ff00. Rule – color variations over symbol variation Deciphering different symbols takes more mental energy for the end user than distinguishing color. Therefore color variation should be used over symbol variation. This rule can actually be observed in Tableau defaults. Create a scatter plot and place a dimension with many members on the Color shelf and Shape shelf respectively. Note that by default, the view will display 20 unique colors but only 10 unique shapes. Older versions of Tableau (such as Tableau 9.0) display warnings that include text such as “…the recommended maximum for this shelf is 10”: Visualization type rules We won't spend time here to delve into a lengthy list of visualization type rules. However, it does seem appropriate to review at least a couple of rules. In the following exercise, we will consider keeping shapes simple and effectively using pie charts. Rule – keep shapes simple Too many shape details impede comprehension. This is because shape details draw the user's focus away from the data. Consider the following exercise on using two different shopping cart images. Exercise – shapes Open the workbook associated and navigate to the worksheet entitled Simple Shopping Cart. 
Note that the visualization is a scatterplot showing the top 10 selling Sub-Categories in terms of total sales and profits. On your computer, navigate to the Shapes directory located in the My Tableau Repository. On my computer, the path is C:\Users\David Baldwin\Documents\My Tableau Repository\Shapes. Within the Shapes directory, create a folder named My Shapes. Reference the link included in the comment section of the worksheet to download the assets. In the downloaded material, find the images titled Shopping_Cart and Shopping_Cart_3D and copy those images into the My Shapes directory created previously. Within Tableau, access the Simple Shopping Cart worksheet. Click on the Shape shelf and then select More Shapes. Within the Edit Shape dialog box, click on the Reload Shapes button. Select the My Shapes palette and set the shape to the simple shopping cart. After closing the dialog box, click on the Size shelf and adjust as desired. Also adjust other aspects of the visualization as desired. Navigate to the 3D Shopping Cart worksheet and then repeat steps 8 to 11. Instead of using the simple shopping cart, use the 3D shopping cart.

Compare the two visualizations. Which version of the shopping cart is more attractive? Likely the cart with the 3D look was your choice. Why not choose the more attractive image? Making visualizations attractive is only of secondary concern. The primary goal is to display the data as clearly and efficiently as possible. A simple shape is grasped more quickly and intuitively than a complex shape. Besides, the cuteness of the 3D image will quickly wear off.

Rule – use pie charts sparingly

Edward Tufte makes an acrid (and somewhat humorous) comment against the use of pie charts in his book The Visual Display of Quantitative Information: A table is nearly always better than a dumb pie chart; the only worse design than a pie chart is several of them. Given their low density and failure to order numbers along a visual dimension, pie charts should never be used. The present sentiment in data visualization circles is largely sympathetic to Tufte's criticism. There may, however, be some exceptions; that is, some circumstances where a pie chart is optimal. Consider the following visualization: which of the four visualizations best demonstrates that A accounts for 25% of the whole? Clearly it is the pie chart! Therefore, perhaps it is fairer to refer to pie charts as limited and to use them sparingly as opposed to considering them inherently evil.

Compromises

In this section, we will transition from more or less strict rules to compromises. Often, building visualizations is a balancing act. It's common to encounter contradictory directions from books, blogs, consultants, and within organizations. One person may insist on utilizing every pixel of space while another urges simplicity and whitespace. One counsels a guided approach while another recommends building wide-open dashboards that allow end users to discover their own path. The avant-garde may crave esoteric visualizations while those of a more conservative bent prefer to stay with the conventional. We now explore a few of the more common competing requests and suggest compromises.

Make the dashboard simple versus make the dashboard robust

Recently a colleague showed me a complex dashboard he had just completed.
Although he was pleased that he had managed to get it working well, he felt the need to apologize by saying, “I know it's dense and complex but it's what the client wanted.” Occam's Razor encourages the simplest possible solution for any problem. For my colleague's dashboard, the simplest solution was rather complex. This is OK! Complexity in Tableau dashboarding need not be shunned. But a clear understanding of some basic guidelines can help the author intelligently determine how to compromise between demands for simplicity and demands for robustness. More frequent data updates necessitate simpler design. Some Tableau dashboards may be near-real-time. Third-party technology may be utilized to force a browser displaying a dashboard via Tableau Server to refresh every few minutes to ensure the absolute latest data displays. In such cases, the design should be quite simple. The end user must be able to see at a glance all pertinent data and should not use that dashboard for extensive analysis. Conversely, a dashboard that is refreshed monthly can support high complexity and thus may be used for deep exploration. Greater end user expertise supports greater dashboard complexity. Know thy users. If they want easy, at-a-glance visualizations, keep the dashboards simple. If they like deep dives, design accordingly. Smaller audiences require more precise design. If only a few people monitor a given dashboard, it may require a highly customized approach. In such cases, specifications may be detailed, complex, and difficult to execute and maintain because the small user base has expectations that may not be natively easy to produce in Tableau. Screen resolution and visualization complexity are proportional. Users with low-resolution devices will need to interact fairly simply with a dashboard. Thus the design of such a dashboard will likely be correspondingly uncomplicated. Conversely, high-resolution devices support greater complexity. Greater distance from the screen requires larger dashboard elements. If the dashboard is designed for conference room viewing, the elements on the dashboard may need to be fairly large to meet the viewing needs of those far from the screen. Thus the dashboard will likely be relatively simple. Conversely, a dashboard to be viewed primarily on end users desktops can be more complex. Although these points are all about simple versus complex, do not equate simple with easy. A simple and elegantly designed dashboard can be more difficult to create than a complex dashboard. In the words of Steve Jobs: Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it's worth it in the end because once you get there, you can move mountains. Present dense information versus present sparse information Normally, a line graph should have a maximum of four to five lines. However, there are times when you may wish to display many lines. A compromise can be achieved by presenting many lines and empowering the end user to highlight as desired. The following line graph displays the percentage of Internet usage by country from 2000 to 2012. Those countries with the largest increases have been highlighted. Assuming that Highlight Selected Items has been activated within the Color legend, the end user can select items (countries in this case) from the legend to highlight as desired. 
Or, even better, a worksheet can be created listing all countries and used in conjunction with a highlight action on a dashboard to focus attention on selected items on the line graph: Tell a story versus allow a story to be discovered Albert Cairo, in his excellent book The Functional Art: An Introduction to Information Graphics and Visualization, includes a section where he interviews prominent data visualization and information graphics professionals. Two of these interviews are remarkable for their opposing views. I… feel that many visualization designers try to transform the user into an editor.  They create these amazing interactive tools with tons of bubbles, lines, bars, filters, and scrubber bars, and expect readers to figure the story out by themselves, and draw conclusions from the data. That's not an approach to information graphics I like. – Jim Grimwade The most fascinating thing about the rise of data visualization is exactly that anyone can explore all those large data sets without anyone telling us what the key insight is. – Moritz Stefaner Fortunately, the compromise position can be found in the Jim Grimwade interview: [The New York Times presents] complex sets of data, and they let you go really deep into the figures and their connections. But beforehand, they give you some context, some pointers as to what you can do with those data. If you don't do this… you will end up with a visualization that may look really beautiful and intricate, but that will leave readers wondering, What has this thing really told me? What is this useful for? – Jim Grimwade Although the case scenarios considered in the preceding quotes are likely quite different from the Tableau work you are involved in, the underlying principles remain the same. You can choose to tell a story or build a platform that allows the discovery of numerous stories. Your choice will differ depending on the given dataset and audience. If you choose to create a platform for story discovery, be sure to take the New York Times approach suggested by Grimwade. Provide hints, pointers, and good documentation to lead your end user to successfully interact with the story you wish to tell or successfully discover their own story. Document, Document, Document! But don't use any space! Immediately above we considered the suggestion Provide hints, pointers, and good documentation… but there's an issue. These things take space. Dashboard space is precious. Often Tableau authors are asked to squeeze more and more stuff on a dashboard and are hence looking for ways to conserve space. Here are some suggestions for maximizing documentation on a dashboard while minimally impacting screen real estate. Craft titles for clear communication Titles are expected. Not just a title for a dashboard and worksheets on the dashboard, but also titles for legends, filters and other objects. These titles can be used for effective and efficient documentation. For instance a filter should not just read Market. Instead it should say something like Select a Market. Notice the imperative statement. The user is being told to do something and this is a helpful hint. Adding a couple of words to a title will usually not impact dashboard space. Use subtitles to relay instructions A subtitle will take some extra space but it does not have to be much. A small, italicized font immediately underneath a title is an obvious place a user will look at for guidance. Consider an example: red represents loss. 
This short sentence could be used as a subtitle that may eliminate the need for a legend and thus actually save space.

Use intuitive icons

Consider a use case of navigating from one dashboard to another. Of course, you could associate an action with some hyperlinked text stating Click here to navigate to another dashboard. But this seems quite unnecessary when an action can be associated with a small, innocuous arrow, such as is natively used in PowerPoint, to communicate the same thing.

Store more extensive documentation in a tooltip associated with a help icon

A small question mark in the top-right corner of an application is common. This clearly communicates where to go if additional help is required. As shown in the following exercise, it's easy to create a similar feature on a Tableau dashboard.

Summary

In this article, we studied how to create effective dashboards. We surveyed visualization design theory, including formatting, color, and visualization type rules, considered how to compromise between competing demands such as simplicity and robustness, and looked at ways to document a dashboard without giving up valuable screen space.


Elastic Stack Overview

Packt
10 Jan 2017
9 min read
In this article by Ravi Kumar Gupta and Yuvraj Gupta, from the book Mastering Elastic Stack, we will have an overview of the Elastic Stack. It's very easy to read a log file of a few MBs, or even a few hundred, and it's just as easy to keep data of this size in databases or files and still get sense out of it. But then a day comes when this data takes terabytes and petabytes, and even Notepad++ would refuse to open a data file of a few hundred MBs. Then we start to look for something for huge log management, or something that can index the data properly and make sense out of it. If you Google this, you will stumble upon the ELK Stack. Elasticsearch manages your data, Logstash reads the data from different sources, and Kibana makes a fine visualization of it. Recently, the ELK Stack has evolved into the Elastic Stack. We will get to know about it in this article. The following are the points that will be covered in this article:

- Introduction to ELK Stack
- The birth of Elastic Stack
- Who uses the Stack

Introduction to ELK Stack

It all began with Shay Banon, who started Elasticsearch as an open source project, the successor of Compass, which gained popularity to become one of the top open source database engines. Later, based on the distributed model of working, Kibana was introduced to visualize the data present in Elasticsearch. Earlier, to put data into Elasticsearch we had rivers, which provided us with a specific input via which we inserted data into Elasticsearch. However, with growing popularity, this setup required a tool via which we could insert data into Elasticsearch, have the flexibility to perform various transformations on the data to make unstructured data structured, and have full control over how to process the data. Based on this premise, Logstash was born, which was then incorporated into the Stack, and together these three tools, Elasticsearch, Logstash, and Kibana, were named the ELK Stack. The following diagram is a simple data pipeline using the ELK Stack.

As we can see from the preceding figure, data is read using Logstash and indexed to Elasticsearch. Later we can use Kibana to read the indices from Elasticsearch and visualize it using charts and lists. Let's understand these components separately and the role they play in the making of the Stack.

Logstash

As noted earlier, rivers were initially used to put data into Elasticsearch before the ELK Stack. For the ELK Stack, Logstash is the entry point for all types of data. Logstash has many input plugins to read data from a number of sources and many output plugins to submit data to a variety of destinations; one of those is the Elasticsearch output plugin, which helps to send data to Elasticsearch. After Logstash became popular, rivers eventually got deprecated, as they made the cluster unstable and also caused performance issues. Logstash does not simply ship data from one end to another; it helps us with collecting raw data and modifying/filtering it to convert it to something meaningful, formatted, and organized. The updated data is then sent to Elasticsearch. If there is no plugin available to support reading data from a specific source, or writing the data to a location, or modifying it in your way, Logstash is flexible enough to allow you to write your own plugins. Simply put, Logstash is open source, highly flexible, rich with plugins, can read your data from your choice of location, normalizes it as per your defined configurations, and sends it to a particular destination as per your requirements.
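The Logstash configuration language itself is beyond the scope of this overview, but the read-parse-ship flow it automates can be illustrated in a few lines of plain Python. The sketch below is not part of the original article and is not Logstash; it simply reads a hypothetical access.log file, parses each line with a regular expression, and posts the result to a local Elasticsearch node assumed to be listening at http://localhost:9200, using a made-up index name of logs-demo:

import re
import requests  # assumes: pip install requests

# A (hypothetical) Apache-style access log line pattern.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse(line):
    """Turn a raw log line into a structured document (Logstash's 'filter' role)."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

def ship(doc, index="logs-demo"):
    """Index the document into Elasticsearch (Logstash's 'output' role)."""
    resp = requests.post(f"http://localhost:9200/{index}/_doc", json=doc)
    resp.raise_for_status()

if __name__ == "__main__":
    with open("access.log") as f:  # Logstash's 'input' role
        for line in f:
            doc = parse(line)
            if doc is not None:
                ship(doc)

A real Logstash pipeline does the same three jobs through its input, filter, and output plugins, with far more robustness and far less custom code.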
Elasticsearch

All of the data read by Logstash is sent to Elasticsearch for indexing. There is a lot more to it than just indexing: Elasticsearch is not only used to index data, but is also a full-text search engine, highly scalable and distributed, and offers many more things. Elasticsearch manages and maintains your data in the form of indices and lets you query, access, and aggregate the data using its APIs. Elasticsearch is based on Lucene, thus providing you with all of the features that Lucene does.

Kibana

Kibana uses Elasticsearch APIs to read/query data from Elasticsearch indices to visualize and analyze it in the form of charts, graphs, and tables. Kibana is a web application, providing you with a highly configurable user interface that lets you query the data, create a number of charts to visualize it, and make actual sense out of the data stored.

After a robust ELK Stack was in place, as time passed, a few important and complex demands arose, such as authentication, security, notifications, and so on. These demands led to a few other tools, such as Watcher (providing alerting and notification based on changes in data), Shield (authentication and authorization for securing clusters), Marvel (monitoring statistics of the cluster), ES-Hadoop, Curator, and Graph, as the requirements arose.

The birth of Elastic Stack

All of the jobs of reading data were done using Logstash, but that is resource-consuming. Since Logstash runs on the JVM, it consumes a good amount of memory. The community realized the need for improvement and for making the pipelining process resource-friendly and lightweight. Back in 2015, Packetbeat was born, a project which was an effort to make a network packet analyzer that could read from different protocols, parse the data, and ship it to Elasticsearch. Being lightweight in nature did the trick, and a new concept of Beats was formed. Beats are written in the Go programming language. The project evolved a lot, and the ELK Stack was now more than just Elasticsearch, Logstash, and Kibana: Beats also became a significant component. The pipeline now looks as follows.

Beat

A Beat reads data, parses it, and can ship it to either Elasticsearch or Logstash. The difference is that Beats are lightweight, serve a specific purpose, and are installed as agents. There are a few Beats available, such as Topbeat, Filebeat, Packetbeat, and so on, which are supported and provided by Elastic.co, and a good number of Beats have already been written by the community. If you have a specific requirement, you can write your own Beat using the libbeat library. In simple words, Beats can be treated as very lightweight agents to ship data to either Logstash or Elasticsearch, and they offer you an infrastructure, using the libbeat library, to create your own Beats.

Together, Elasticsearch, Logstash, Kibana, and Beats become the Elastic Stack, formerly known as the ELK Stack. The Elastic Stack did not just add Beats to the team; all of the components will now always use the same version. The starting version of the Elastic Stack will be 5.0.0, and the same version will apply to all the components. This version and release method is not only for the Elastic Stack, but for other tools of the Elastic family as well. Previously, with so many tools, there was a problem of unification, wherein each tool had its own version and the versions were not always compatible with each other, which led to problems. To solve this, all of the tools will now be built, tested, and released together. All of these components play a significant role in creating a pipeline.
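The query and aggregation APIs mentioned above are worth a small illustration. The following snippet is not from the original article; it is a hedged Python sketch that uses the plain REST _search endpoint (assuming an Elasticsearch 7.x or later node at localhost:9200 and the hypothetical logs-demo index from the earlier sketch) to run a full-text query and a terms aggregation, which is essentially what Kibana does on your behalf when it renders a chart:

import requests  # assumes: pip install requests

search_body = {
    "size": 5,                                   # return at most five matching documents
    "query": {"match": {"path": "index.html"}},  # full-text match on the 'path' field
    "aggs": {                                    # bucket the matches by HTTP status code
        "by_status": {"terms": {"field": "status.keyword"}}
    },
}

resp = requests.post("http://localhost:9200/logs-demo/_search", json=search_body)
resp.raise_for_status()
result = resp.json()

print("total hits:", result["hits"]["total"]["value"])  # hits.total is an object in 7.x+
for bucket in result["aggregations"]["by_status"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])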
While Beats and Logstash are used to collect the data, parse it, and ship it, Elasticsearch creates indices, which are finally used by Kibana to make visualizations. While Elastic Stack helps with a pipeline, other tools add security, notifications, monitoring, and other such capabilities to the setup.

Who uses Elastic Stack?

In the past few years, implementations of Elastic Stack have been increasing very rapidly. In this section, we will consider a few case studies to understand how Elastic Stack has helped these organizations.

Salesforce

Salesforce developed a new plugin named ELF (Event Log Files) to collect Salesforce logged data to enable auditing of user activities. The purpose was to analyze the data to understand user behavior and trends in Salesforce. The plugin is available on GitHub at https://github.com/developerforce/elf_elk_docker. This plugin simplifies the Stack configuration and allows us to download event log files, get them indexed, and finally obtain sensible data that can be visualized using Kibana. This implementation utilizes Elasticsearch, Logstash, and Kibana.

CERN

There is not just one use case in which Elastic Stack helped CERN (the European Organization for Nuclear Research), but five. At CERN, Elastic Stack is used for the following:

- Messaging
- Data monitoring
- Cloud benchmarking
- Infrastructure monitoring
- Job monitoring

Multiple Kibana dashboards are used by CERN for a number of visualizations.

Green Man Gaming

This is an online gaming platform where game providers publish their games. The website wanted to make a difference by providing better gameplay. They started using Elastic Stack to do log analysis, search, and analysis of gameplay data. They began with setting up Kibana dashboards to gain insights about the counts of gamers by country and the currency used by gamers. That helped them understand and streamline support in order to provide an improved response.

Apart from these case studies, Elastic Stack is used by a number of other companies to gain insights from the data they own. Sometimes, not all of the components are used; that is, not every deployment uses a Beat or configures Logstash. Sometimes, only an Elasticsearch and Kibana combination is used. If we look at the users within an organization, all of the roles that are expected to do big data analysis, business intelligence, data visualization, log analysis, and so on can utilize Elastic Stack for their technical forte. A few of these roles are data scientists, DevOps engineers, and so on.

Stack competitors

Well, it would be wrong to speak of Elastic Stack competitors, because Elastic Stack has itself emerged as a strong competitor to many other tools in the market in recent years and is growing rapidly.
A few of these are:

Open source:
- Graylog: Visit https://www.graylog.org/ for more information
- InfluxDB: Visit https://influxdata.com/ for more information

Others:
- Logscape: Visit http://logscape.com/ for more information
- Logsene: Visit http://sematext.com/logsene/ for more information
- Splunk: Visit http://www.splunk.com/ for more information
- Sumo Logic: Visit https://www.sumologic.com/ for more information

Kibana competitors:
- Grafana: Visit http://grafana.org/ for more information
- Graphite: Visit https://graphiteapp.org/ for more information

Elasticsearch competitors:
- Lucene/Solr: Visit http://lucene.apache.org/solr/ or https://lucene.apache.org/ for more information
- Sphinx: Visit http://sphinxsearch.com/ for more information

Most of these compare with respect to log management, while Elastic Stack is much more than that. It offers you the ability to analyze any type of data, and not just logs.


Deep learning and regression analysis

Packt
09 Jan 2017
6 min read
In this article by Richard M. Reese and Jennifer L. Reese, authors of the book Java for Data Science, we will discuss how neural networks can be used to perform regression analysis. However, other techniques may offer a more effective solution. With regression analysis, we want to predict a result based on several input variables.

We can perform regression analysis using an output layer that consists of a single neuron that sums the weighted input plus bias of the previous hidden layer. Thus, the result is a single value representing the regression.

Preparing the data

We will use a car evaluation database to demonstrate how to predict the acceptability of a car based on a series of attributes. The file containing the data we will be using can be downloaded from: http://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data. It consists of car data such as price, number of passengers, and safety information, and an assessment of its overall quality. It is this latter element that we will try to predict. The comma-delimited values in each attribute are shown next, along with substitutions. The substitutions are needed because the model expects numeric data:

Attribute | Original value | Substituted value
Buying price | vhigh, high, med, low | 3, 2, 1, 0
Maintenance price | vhigh, high, med, low | 3, 2, 1, 0
Number of doors | 2, 3, 4, 5-more | 2, 3, 4, 5
Seating | 2, 4, more | 2, 4, 5
Cargo space | small, med, big | 0, 1, 2
Safety | low, med, high | 0, 1, 2

There are 1,728 instances in the file. The cars are marked with four classes:

Class | Number of instances | Percentage of instances | Original value | Substituted value
Unacceptable | 1210 | 70.023% | unacc | 0
Acceptable | 384 | 22.222% | acc | 1
Good | 69 | 3.99% | good | 2
Very good | 65 | 3.76% | v-good | 3

Setting up the class

We start with the definition of a CarRegressionExample class, as shown next:

public class CarRegressionExample {

    public CarRegressionExample() {
        try {
            ...
        } catch (IOException | InterruptedException ex) {
            // Handle exceptions
        }
    }

    public static void main(String[] args) {
        new CarRegressionExample();
    }
}

Reading and preparing the data

The first task is to read in the data. We will use the CSVRecordReader class to get the data:

RecordReader recordReader = new CSVRecordReader(0, ",");
recordReader.initialize(new FileSplit(new File("car.txt")));
DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader, 1728, 6, 4);

With this dataset, we will split the data into two sets. Sixty-five percent of the data is used for training and the rest for testing:

DataSet dataset = iterator.next();
dataset.shuffle();
SplitTestAndTrain testAndTrain = dataset.splitTestAndTrain(0.65);
DataSet trainingData = testAndTrain.getTrain();
DataSet testData = testAndTrain.getTest();

The data now needs to be normalized:

DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(trainingData);
normalizer.transform(trainingData);
normalizer.transform(testData);

We are now ready to build the model.

Building the model

A MultiLayerConfiguration instance is created using a series of NeuralNetConfiguration.Builder methods. The following is the code used. We will discuss the individual methods following the code. Note that this configuration uses two layers.
The last layer uses the softmax activation function, which is used for regression analysis:

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .iterations(1000)
    .activation("relu")
    .weightInit(WeightInit.XAVIER)
    .learningRate(0.4)
    .list()
    .layer(0, new DenseLayer.Builder()
        .nIn(6).nOut(3)
        .build())
    .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
        .activation("softmax")
        .nIn(3).nOut(4).build())
    .backprop(true).pretrain(false)
    .build();

Two layers are created. The first is the input layer. The DenseLayer.Builder class is used to create this layer. The DenseLayer class is a feed-forward and fully connected layer. The created layer uses the six car attributes as input. The output consists of three neurons that are fed into the output layer. The layer definition is duplicated here for your convenience:

.layer(0, new DenseLayer.Builder()
    .nIn(6).nOut(3)
    .build())

The second layer is the output layer created with the OutputLayer.Builder class. It uses a loss function as the argument of its constructor. The softmax activation function is used since we are performing regression, as shown here:

.layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
    .activation("softmax")
    .nIn(3).nOut(4).build())

Next, a MultiLayerNetwork instance is created using the configuration. The model is initialized, its listeners are set, and then the fit method is invoked to perform the actual training. The ScoreIterationListener instance will display information as the model trains, which we will see shortly in the output of this example. Its constructor argument specifies the frequency at which information is displayed:

MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
model.setListeners(new ScoreIterationListener(100));
model.fit(trainingData);

We are now ready to evaluate the model.

Evaluating the model

In the next sequence of code, we evaluate the model against the test dataset. An Evaluation instance is created using an argument specifying that there are four classes. The test data is fed into the model using the output method. The eval method takes the output of the model and compares it against the test data classes to generate statistics. The getLabels method returns the expected values:

Evaluation evaluation = new Evaluation(4);
INDArray output = model.output(testData.getFeatureMatrix());
evaluation.eval(testData.getLabels(), output);
out.println(evaluation.stats());

The output of the training follows, which is produced by the ScoreIterationListener class. However, the values you get may differ due to how the data is selected and analyzed.
Notice that the score improves with the iterations but levels out after about 500 iterations:

12:43:35.685 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 0 is 1.443480901811554
12:43:36.094 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 100 is 0.3259061845624861
12:43:36.390 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 200 is 0.2630572026049783
12:43:36.676 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 300 is 0.24061281470878784
12:43:36.977 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 400 is 0.22955121170274934
12:43:37.292 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 500 is 0.22249920540161677
12:43:37.575 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 600 is 0.2169898450109222
12:43:37.872 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 700 is 0.21271599814600958
12:43:38.161 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 800 is 0.2075677126088741
12:43:38.451 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 900 is 0.20047317735870715

This is followed by the results of the stats method as shown next. The first part reports on how examples are classified and the second part displays various statistics:

Examples labeled as 0 classified by model as 0: 397 times
Examples labeled as 0 classified by model as 1: 10 times
Examples labeled as 0 classified by model as 2: 1 times
Examples labeled as 1 classified by model as 0: 8 times
Examples labeled as 1 classified by model as 1: 113 times
Examples labeled as 1 classified by model as 2: 1 times
Examples labeled as 1 classified by model as 3: 1 times
Examples labeled as 2 classified by model as 1: 7 times
Examples labeled as 2 classified by model as 2: 21 times
Examples labeled as 2 classified by model as 3: 14 times
Examples labeled as 3 classified by model as 1: 2 times
Examples labeled as 3 classified by model as 3: 30 times

==========================Scores========================================
 Accuracy:  0.9273
 Precision: 0.854
 Recall:    0.8323
 F1 Score:  0.843
========================================================================

The regression model does a reasonable job with this dataset.

Summary

In this article, we examined deep learning and regression analysis. We showed how to prepare the data and class, build the model, and evaluate the model. We used sample data and displayed output statistics to demonstrate the relative effectiveness of our model.


Exploring Structure from Motion Using OpenCV

Packt
09 Jan 2017
20 min read
In this article by Roy Shilkrot, coauthor of the book Mastering OpenCV 3, we will discuss the notion of Structure from Motion (SfM), or better put, extracting geometric structures from images taken with a camera under motion, using OpenCV's API to help us. First, let's constrain the otherwise very broad approach to SfM using a single camera, usually called a monocular approach, and a discrete and sparse set of frames rather than a continuous video stream. These two constraints will greatly simplify the system we will sketch out in the coming pages and help us understand the fundamentals of any SfM method. In this article, we will cover the following:

- Structure from Motion concepts
- Estimating the camera motion from a pair of images

Throughout the article, we assume the use of a calibrated camera—one that was calibrated beforehand. Calibration is a ubiquitous operation in computer vision, fully supported in OpenCV using command-line tools. We, therefore, assume the existence of the camera's intrinsic parameters embodied in the K matrix and the distortion coefficients vector—the outputs from the calibration process. To make things clear in terms of language, from this point on, we will refer to a camera as a single view of the scene rather than to the optics and hardware taking the image. A camera has a position in space and a direction of view. Between two cameras, there is a translation element (movement through space) and a rotation of the direction of view. We will also unify the terms for the point in the scene, world, real, or 3D to be the same thing, a point that exists in our real world. The same goes for points in the image or 2D, which are points in the image coordinates of some real 3D point that was projected onto the camera sensor at that location and time.

Structure from Motion concepts

The first distinction we should make is the difference between stereo (or indeed any multiview) 3D reconstruction using calibrated rigs, and SfM. A rig of two or more cameras assumes we already know what the "motion" between the cameras is, while in SfM, we don't know what this motion is and we wish to find it. Calibrated rigs, from a simplistic point of view, allow a much more accurate reconstruction of 3D geometry because there is no error in estimating the distance and rotation between the cameras—it is already known. The first step in implementing an SfM system is finding the motion between the cameras. OpenCV may help us in a number of ways to obtain this motion, specifically using the findFundamentalMat and findEssentialMat functions. Let's think for one moment of the goal behind choosing an SfM algorithm. In most cases, we wish to obtain the geometry of the scene, for example, where objects are in relation to the camera and what their form is. Having found the motion between the cameras picturing the same scene, from a reasonably similar point of view, we would now like to reconstruct the geometry. In computer vision jargon, this is known as triangulation, and there are plenty of ways to go about it. It may be done by way of ray intersection, where we construct two rays: one from each camera's center of projection through the corresponding point on its image plane. Ideally, these rays will intersect at the one 3D point in the real world that was imaged in each camera, as shown in the following diagram. In reality, ray intersection is highly unreliable.
This is because the rays usually do not intersect, making us fall back to using the middle point of the shortest segment connecting the two rays. OpenCV contains a simple API for a more accurate form of triangulation, the triangulatePoints function, so this part we do not need to code on our own. After you have learned how to recover 3D geometry from two views, we will see how you can incorporate more views of the same scene to get an even richer reconstruction. At that point, most SfM methods try to optimize the bundle of estimated positions of our cameras and 3D points by means of Bundle Adjustment. OpenCV contains means for Bundle Adjustment in its new Image Stitching Toolbox. However, the beauty of working with OpenCV and C++ is the abundance of external tools that can be easily integrated into the pipeline. We will, therefore, see how to integrate an external bundle adjuster, the Ceres non-linear optimization package. Now that we have sketched an outline of our approach to SfM using OpenCV, we will see how each element can be implemented.

Estimating the camera motion from a pair of images

Before we set out to actually find the motion between two cameras, let's examine the inputs and the tools we have at hand to perform this operation. First, we have two images of the same scene from (hopefully not extremely) different positions in space. This is a powerful asset, and we will make sure that we use it. As for tools, we should take a look at mathematical objects that impose constraints over our images, cameras, and the scene. Two very useful mathematical objects are the fundamental matrix (denoted by F) and the essential matrix (denoted by E). They are mostly similar, except that the essential matrix assumes the usage of calibrated cameras; this is the case for us, so we will choose it. OpenCV allows us to find the fundamental matrix via the findFundamentalMat function and the essential matrix via the findEssentialMat function. Finding the essential matrix can be done as follows:

Mat E = findEssentialMat(leftPoints, rightPoints, focal, pp);

This function makes use of matching points in the "left" image, leftPoints, and "right" image, rightPoints, which we will discuss shortly, as well as two additional pieces of information from the camera's calibration: the focal length, focal, and principal point, pp. The essential matrix, E, is a 3 x 3 matrix, which imposes the following constraint on a point in one image and a point in the other image: x'^T K^-T E K^-1 x = 0, where x is a point in the first image, x' is the corresponding point in the second image, and K is the calibration matrix. This is extremely useful, as we are about to see. Another important fact we use is that the essential matrix is all we need in order to recover the two cameras' positions from our images, although only up to an arbitrary unit of scale. So, if we obtain the essential matrix, we know where each camera is positioned in space and where it is looking. We can easily calculate the matrix if we have enough of those constraint equations, simply because each equation can be used to solve for a small part of the matrix. In fact, OpenCV internally calculates it using just five point-pairs, but through the Random Sample Consensus algorithm (RANSAC), many more pairs can be used to make for a more robust solution.

Point matching using rich feature descriptors

Now we will make use of our constraint equations to calculate the essential matrix.
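As a brief aside, the constraint above is easy to verify numerically once E, K, and a few matched pixel coordinates are available. The following sketch is not part of the original article (whose code is C++); it is a small Python/NumPy check, and the inputs E, K, x_left, and x_right are assumed to come from earlier steps:

import numpy as np

def epipolar_residual(E, K, x_left, x_right):
    """Evaluate x'^T K^-T E K^-1 x for one pixel correspondence.

    E       - 3x3 essential matrix
    K       - 3x3 calibration matrix
    x_left  - (u, v) pixel in the first image  (x in the equation)
    x_right - (u, v) pixel in the second image (x' in the equation)
    """
    K_inv = np.linalg.inv(K)
    x = np.array([x_left[0], x_left[1], 1.0])        # homogeneous point in image one
    x_prime = np.array([x_right[0], x_right[1], 1.0]) # homogeneous point in image two
    return float(x_prime @ K_inv.T @ E @ K_inv @ x)

For true inlier correspondences the returned residual should be close to zero, while gross mismatches produce noticeably larger values.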
To get our constraints, remember that for each point in image A, we must find a corresponding point in image B. We can achieve such a matching using OpenCV's extensive 2D feature-matching framework, which has greatly matured in the past few years. Feature extraction and descriptor matching is an essential process in computer vision and is used in many methods to perform all sorts of operations, for example, detecting the position and orientation of an object in the image or searching a big database of images for similar images through a given query. In essence, feature extraction means selecting points in the image that would make for good features and computing a descriptor for them. A descriptor is a vector of numbers that describes the surrounding environment around a feature point in an image. Different methods have different lengths and data types for their descriptor vectors. Descriptor matching is the process of finding a corresponding feature from one set in another using its descriptor. OpenCV provides very easy and powerful methods to support feature extraction and matching. Let's examine a very simple feature extraction and matching scheme:

vector<KeyPoint> keypts1, keypts2;
Mat desc1, desc2;

// detect keypoints and extract ORB descriptors
Ptr<Feature2D> orb = ORB::create(2000);
orb->detectAndCompute(img1, noArray(), keypts1, desc1);
orb->detectAndCompute(img2, noArray(), keypts2, desc2);

// matching descriptors
Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");
vector<DMatch> matches;
matcher->match(desc1, desc2, matches);

You may have already seen similar OpenCV code, but let's review it quickly. Our goal is to obtain three elements: feature points for two images, descriptors for them, and a matching between the two sets of features. OpenCV provides a range of feature detectors, descriptor extractors, and matchers. In this simple example, we use the ORB class to get both the 2D location of Oriented BRIEF (ORB) feature points (where BRIEF stands for Binary Robust Independent Elementary Features) and their respective descriptors. We use a brute-force binary matcher to get the matching, which is the most straightforward way to match two feature sets by comparing each feature in the first set to each feature in the second set (hence the phrasing "brute-force"). In the following image, we will see a matching of feature points on two images from the Fountain-P11 sequence found at http://cvlab.epfl.ch/~strecha/multiview/denseMVS.html. Practically, raw matching like we just performed is good only up to a certain level, and many matches are probably erroneous. For that reason, most SfM methods perform some form of filtering on the matches to ensure correctness and reduce errors. One form of filtering, which is built into OpenCV's brute-force matcher, is cross-check filtering. That is, a match is considered true if a feature of the first image matches a feature of the second image, and the reverse check also matches the feature of the second image with the feature of the first image. Another common filtering mechanism, used in the provided code, is to filter based on the fact that the two images are of the same scene and have a certain stereo-view relationship between them. In practice, the filter tries to robustly calculate the fundamental or essential matrix and retain those feature pairs that correspond to this calculation with small errors. An alternative to using rich features, such as ORB, is to use optical flow.
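As a rough illustration of this alternative (the information box that follows explains the idea), here is a sketch of how the same point matching could be obtained with pyramidal Lucas-Kanade optical flow. It is not part of the original article, which uses C++; it goes through OpenCV's Python bindings, and the image file names are hypothetical:

import cv2
import numpy as np

# Two nearby frames of the same scene (hypothetical file names).
img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)

# Pick points worth tracking in the first image.
prev_pts = cv2.goodFeaturesToTrack(img1, maxCorners=2000, qualityLevel=0.01, minDistance=7)

# Track them into the second image with pyramidal Lucas-Kanade optical flow.
next_pts, status, err = cv2.calcOpticalFlowPyrLK(img1, img2, prev_pts, None)

# Keep only the successfully tracked pairs; these play the role of leftPts/rightPts.
status = status.ravel().astype(bool)
left_pts = prev_pts[status].reshape(-1, 2)
right_pts = next_pts[status].reshape(-1, 2)

The surviving left_pts/right_pts pairs can then feed findEssentialMat in exactly the same way as the descriptor-based matches do.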
The following information box provides a short overview of optical flow. It is possible to use optical flow instead of descriptor matching to find the required point matching between two images, while the rest of the SfM pipeline remains the same. OpenCV recently extended its API for getting the flow field from two images, and now it is faster and more powerful.

Optical flow

It is the process of matching selected points from one image to another, assuming that both images are part of a sequence and relatively close to one another. Most optical flow methods compare a small region, known as the search window or patch, around each point from image A to the same area in image B. Following a very common rule in computer vision, called the brightness constancy constraint (and other names), the small patches of the image will not change drastically from one image to the other, and therefore the magnitude of their subtraction should be close to zero. In addition to matching patches, newer methods of optical flow use a number of additional techniques to get better results. One is using image pyramids, which are smaller and smaller resized versions of the image, which allow for working from coarse to fine—a very well-used trick in computer vision. Another method is to define global constraints on the flow field, assuming that the points close to each other move together in the same direction.

Finding camera matrices

Now that we have obtained matches between keypoints, we can calculate the essential matrix. However, we must first align our matching points into two arrays, where an index in one array corresponds to the same index in the other. This is required by the findEssentialMat function, as we've seen in the Estimating the camera motion from a pair of images section. We would also need to convert the KeyPoint structure to a Point2f structure. We must pay special attention to the queryIdx and trainIdx member variables of DMatch, the OpenCV struct that holds a match between two keypoints, as they must align with the way we used the DescriptorMatcher::match() function. The following code section shows how to align a matching into two corresponding sets of 2D points, and how these can be used to find the essential matrix:

vector<KeyPoint> leftKpts, rightKpts;
// ... obtain keypoints using a feature extractor
vector<DMatch> matches;
// ... obtain matches using a descriptor matcher

//align left and right point sets
vector<Point2f> leftPts, rightPts;
for (size_t i = 0; i < matches.size(); i++) {
    // queryIdx is the "left" image
    leftPts.push_back(leftKpts[matches[i].queryIdx].pt);

    // trainIdx is the "right" image
    rightPts.push_back(rightKpts[matches[i].trainIdx].pt);
}

//robustly find the Essential Matrix
Mat status;
Mat E = findEssentialMat(
    leftPts,    //points from left image
    rightPts,   //points from right image
    focal,      //camera focal length factor
    pp,         //camera principal point
    cv::RANSAC, //use RANSAC for a robust solution
    0.999,      //desired solution confidence level
    1.0,        //point-to-epipolar-line threshold
    status);    //binary vector for inliers

We may later use the status binary vector to prune those points that do not align with the recovered essential matrix. Refer to the following image for an illustration of point matching after pruning. The red arrows mark feature matches that were removed in the process of finding the matrix, and the green arrows are feature matches that were kept. Now we are ready to find the camera matrices; however, the new OpenCV 3 API makes things very easy for us by introducing the recoverPose function.
First, we will briefly examine the structure of the camera matrix we will use: P = [R | t]. This is the model for our camera; it consists of two elements, rotation (denoted as R) and translation (denoted as t). The interesting thing about it is that it holds a very essential equation: x = PX, where x is a 2D point on the image and X is a 3D point in space. There is more to it, but this matrix gives us a very important relationship between the image points and the scene points. So, now that we have a motivation for finding the camera matrices, we will see how it can be done. The following code section shows how to decompose the essential matrix into the rotation and translation elements:

Mat E;
// ... find the essential matrix

Mat R, t; //placeholders for rotation and translation

//Find Pright camera matrix from the essential matrix
//Cheirality check is performed internally.
recoverPose(E, leftPts, rightPts, R, t, focal, pp, mask);

Very simple. Without going too deep into mathematical interpretation, this conversion of the essential matrix to rotation and translation is possible because the essential matrix was originally composed of these two elements. Strictly for satisfying our curiosity, we can look at the following equation for the essential matrix, which appears in the literature: E = [t]xR, where [t]x denotes the skew-symmetric matrix of the translation vector t. We see that it is composed of (some form of) a translation element, t, and a rotational element, R. Note that a cheirality check is internally performed in the recoverPose function. The cheirality check makes sure that all triangulated 3D points are in front of the reconstructed camera. Camera matrix recovery from the essential matrix has in fact four possible solutions, but the only correct solution is the one that will produce triangulated points in front of the camera, hence the need for a cheirality check. Note that what we just did only gives us one camera matrix, and for triangulation, we require two camera matrices. This operation assumes that one camera matrix is fixed and canonical (no rotation and no translation), that is, P = [I | 0]. The other camera that we recovered from the essential matrix has moved and rotated in relation to the fixed one. This also means that any of the 3D points that we recover from these two camera matrices will have the first camera at the world origin point (0, 0, 0). One more thing we can think of adding to our method is error checking. Many times, the calculation of an essential matrix from point matching is erroneous, and this affects the resulting camera matrices. Continuing to triangulate with faulty camera matrices is pointless. We can install a check to see if the rotation element is a valid rotation matrix. Keeping in mind that rotation matrices must have a determinant of 1 (or -1), we can simply do the following:

bool CheckCoherentRotation(const cv::Mat_<double>& R) {
    if (fabsf(determinant(R)) - 1.0 > 1e-07) {
        cerr << "rotation matrix is invalid" << endl;
        return false;
    }
    return true;
}

We can now see how all these elements combine into a function that recovers the P matrices.
First, we will introduce some convenience data structures and type short hands:

typedef std::vector<cv::KeyPoint> Keypoints;
typedef std::vector<cv::Point2f>  Points2f;
typedef std::vector<cv::Point3f>  Points3f;
typedef std::vector<cv::DMatch>   Matching;

struct Features { //2D features
    Keypoints keyPoints;
    Points2f  points;
    cv::Mat   descriptors;
};

struct Intrinsics { //camera intrinsic parameters
    cv::Mat K;
    cv::Mat Kinv;
    cv::Mat distortion;
};

Now, we can write the camera matrix finding function:

void findCameraMatricesFromMatch(
        const Intrinsics& intrin,
        const Matching&   matches,
        const Features&   featuresLeft,
        const Features&   featuresRight,
        cv::Matx34f&      Pleft,
        cv::Matx34f&      Pright) {
    //Note: assuming fx = fy
    const double focal = intrin.K.at<float>(0, 0);
    const cv::Point2d pp(intrin.K.at<float>(0, 2), intrin.K.at<float>(1, 2));

    //align left and right point sets using the matching
    Features left;
    Features right;
    GetAlignedPointsFromMatch(featuresLeft, featuresRight, matches, left, right);

    //find essential matrix
    Mat E, mask;
    E = findEssentialMat(left.points, right.points, focal, pp, RANSAC, 0.999, 1.0, mask);

    Mat_<double> R, t;

    //Find Pright camera matrix from the essential matrix
    recoverPose(E, left.points, right.points, R, t, focal, pp, mask);

    Pleft = Matx34f::eye();
    Pright = Matx34f(R(0,0), R(0,1), R(0,2), t(0),
                     R(1,0), R(1,1), R(1,2), t(1),
                     R(2,0), R(2,1), R(2,2), t(2));
}

At this point, we have the two cameras that we need in order to reconstruct the scene: the canonical first camera, in the Pleft variable, and the second camera we calculated from the essential matrix, in the Pright variable.
Choosing the image pair to use first
Given we have more than just two image views of the scene, we must choose which two views we will start the reconstruction from. In their paper, Snavely et al. suggest that we pick the two views that have the least number of homography inliers. A homography is a relationship between two images or sets of points that lie on a plane; the homography matrix defines the transformation from one plane to another. In case of an image or a set of 2D points, the homography matrix is of size 3 x 3. When Snavely et al. look for the lowest inlier ratio, they essentially suggest to calculate the homography matrix between all pairs of images and pick the pair whose points mostly do not correspond with the homography matrix. This means the geometry of the scene in these two views is not planar or at least not the same plane in both views, which helps when doing 3D reconstruction. For reconstruction, it is best to look at a complex scene with non-planar geometry, with things closer and farther away from the camera.
The following code snippet shows how to use OpenCV's findHomography function to count the number of inliers between two views whose features were already extracted and matched: int findHomographyInliers( const Features& left, const Features& right, const Matching& matches) { //Get aligned feature vectors Features alignedLeft; Features alignedRight; GetAlignedPointsFromMatch(left, right, matches, alignedLeft, alignedRight); //Calculate homography with at least 4 points Mat inlierMask; Mat homography; if(matches.size() >= 4) { homography = findHomography(alignedLeft.points, alignedRight.points, cv::RANSAC, RANSAC_THRESHOLD, inlierMask); } if(matches.size() < 4 or homography.empty()) { return 0; } return countNonZero(inlierMask); } The next step is to perform this operation on all pairs of image views in our bundle and sort them based on the ratio of homography inliers to outliers: //sort pairwise matches to find the lowest Homography inliers map<float, ImagePair> pairInliersCt; const size_t numImages = mImages.size(); //scan all possible image pairs (symmetric) for (size_t i = 0; i < numImages - 1; i++) { for (size_t j = i + 1; j < numImages; j++) { if (mFeatureMatchMatrix[i][j].size() < MIN_POINT_CT) { //Not enough points in matching pairInliersCt[1.0] = {i, j}; continue; } //Find number of homography inliers const int numInliers = findHomographyInliers( mImageFeatures[i], mImageFeatures[j], mFeatureMatchMatrix[i][j]); const float inliersRatio = (float)numInliers / (float)(mFeatureMatchMatrix[i][j].size()); pairInliersCt[inliersRatio] = {i, j}; } } Note that the std::map<float, ImagePair> will internally sort the pairs based on the map's key: the inliers ratio. We then simply need to traverse this map from the beginning to find the image pair with least inlier ratio, and if that pair cannot be used, we can easily skip ahead to the next pair. Summary In this article, we saw how OpenCV v3 can help us approach Structure from Motion in a manner that is both simple to code and to understand. OpenCV v3's new API contains a number of useful functions and data structures that make our lives easier and also assist in a cleaner implementation. However, the state-of-the-art SfM methods are far more complex. There are many issues we choose to disregard in favor of simplicity, and plenty more error examinations that are usually in place. Our chosen methods for the different elements of SfM can also be revisited. Some methods even use the N-view triangulation once they understand the relationship between the features in multiple images. If we would like to extend and deepen our familiarity with SfM, we will certainly benefit from looking at other open source SfM libraries. One particularly interesting project is libMV, which implements a vast array of SfM elements that may be interchanged to get the best results. There is a great body of work from University of Washington that provides tools for many flavors of SfM (Bundler and VisualSfM). This work inspired an online product from Microsoft, called PhotoSynth, and 123D Catch from Adobe. There are many more implementations of SfM readily available online, and one must only search to find quite a lot of them. Resources for Article: Further resources on this subject: Basics of Image Histograms in OpenCV [article] OpenCV: Image Processing using Morphological Filters [article] Face Detection and Tracking Using ROS, Open-CV and Dynamixel Servos [article]

TensorFlow

Packt
04 Jan 2017
17 min read
In this article by Nicholas McClure, the author of the book TensorFlow Machine Learning Cookbook, we will cover basic recipes in order to understand how TensorFlow works and how to access data for this book and additional resources: How TensorFlow works Declaring tensors Using placeholders and variables Working with matrices Declaring operations (For more resources related to this topic, see here.) Introduction Google's TensorFlow engine has a unique way of solving problems. This unique way allows us to solve machine learning problems very efficiently. We will cover the basic steps to understand how TensorFlow operates. This understanding is essential in understanding recipes for the rest of this book. How TensorFlow works At first, computation in TensorFlow may seem needlessly complicated. But there is a reason for it: because of how TensorFlow treats computation, developing more complicated algorithms is relatively easy. This recipe will talk you through the pseudo code of how a TensorFlow algorithm usually works. Getting ready Currently, TensorFlow is only supported on Mac and Linux distributions. Using TensorFlow on Windows requires the usage of a virtual machine. Throughout this book we will only concern ourselves with the Python library wrapper of TensorFlow. This book will use Python 3.4+ (https://www.python.org) and TensorFlow 0.7 (https://www.tensorflow.org). While TensorFlow can run on the CPU, it runs faster if it runs on the GPU, and it is supported on graphics cards with NVidia Compute Capability 3.0+. To run on a GPU, you will also need to download and install the NVidia Cuda Toolkit (https://developer.nvidia.com/cuda-downloads). Some of the recipes will rely on a current installation of the Python packages Scipy, Numpy, and Scikit-Learn as well. How to do it… Here we will introduce the general flow of TensorFlow algorithms. Most recipes will follow this outline: Import or generate data: All of our machine-learning algorithms will depend on data. In this book we will either generate data or use an outside source of data. Sometimes it is better to rely on generated data because we will want to know the expected outcome. Transform and normalize data: The data is usually not in the correct dimension or type that our TensorFlow algorithms expect. We will have to transform our data before we can use it. Most algorithms also expect normalized data and we will do this here as well. TensorFlow has built in functions that can normalize the data for you as follows: data = tf.nn.batch_norm_with_global_normalization(...) Set algorithm parameters: Our algorithms usually have a set of parameters that we hold constant throughout the procedure. For example, this can be the number of iterations, the learning rate, or other fixed parameters of our choosing. It is considered good form to initialize these together so the reader or user can easily find them, as follows: learning_rate = 0.01 iterations = 1000 Initialize variables and placeholders: TensorFlow depends on us telling it what it can and cannot modify. TensorFlow will modify the variables during optimization to minimize a loss function. To accomplish this, we feed in data through placeholders. We need to initialize both of these, variables and placeholders with size and type, so that TensorFlow knows what to expect. 
See the following code:

a_var = tf.constant(42)
x_input = tf.placeholder(tf.float32, [None, input_size])
y_input = tf.placeholder(tf.float32, [None, num_classes])

Define the model structure: After we have the data, and have initialized our variables and placeholders, we have to define the model. This is done by building a computational graph. We tell TensorFlow what operations must be done on the variables and placeholders to arrive at our model predictions: y_pred = tf.add(tf.mul(x_input, weight_matrix), b_matrix) Declare the loss functions: After defining the model, we must be able to evaluate the output. This is where we declare the loss function. The loss function is very important as it tells us how far off our predictions are from the actual values: loss = tf.reduce_mean(tf.square(y_actual - y_pred)) Initialize and train the model: Now that we have everything in place, we need to create an instance for our graph, feed in the data through the placeholders and let TensorFlow change the variables to better predict our training data. Here is one way to initialize the computational graph:

with tf.Session(graph=graph) as session:
    ...
    session.run(...)
    ...

Note that we can also initiate our graph with:

session = tf.Session(graph=graph)
session.run(...)

(Optional) Evaluate the model: Once we have built and trained the model, we should evaluate the model by looking at how well it does with new data through some specified criteria. (Optional) Predict new outcomes: It is also important to know how to make predictions on new, unseen data. We can do this with all of our models, once we have them trained. How it works… In TensorFlow, we have to set up the data, variables, placeholders, and model before we tell the program to train and change the variables to improve the predictions. TensorFlow accomplishes this through the computational graph. We tell it to minimize a loss function and TensorFlow does this by modifying the variables in the model. TensorFlow knows how to modify the variables because it keeps track of the computations in the model and automatically computes the gradients for every variable. Because of this, we can see how easy it can be to make changes and try different data sources. See also A great place to start is the official Python API TensorFlow documentation: https://www.tensorflow.org/versions/r0.7/api_docs/python/index.html There are also tutorials available: https://www.tensorflow.org/versions/r0.7/tutorials/index.html Declaring tensors Getting ready Tensors are the data structure that TensorFlow operates on in the computational graph. We can declare these tensors as variables or feed them in as placeholders. First we must know how to create tensors. When we create a tensor and declare it to be a variable, TensorFlow creates several graph structures in our computation graph. It is also important to point out that just by creating a tensor, TensorFlow is not adding anything to the computational graph. TensorFlow does this only after creating a variable out of the tensor. See the next section on variables and placeholders for more information. How to do it… Here we will cover the main ways to create tensors in TensorFlow. Fixed tensors: Creating a zero filled tensor. Use the following: zero_tsr = tf.zeros([row_dim, col_dim]) Creating a one filled tensor. Use the following: ones_tsr = tf.ones([row_dim, col_dim]) Creating a constant filled tensor.
Use the following: filled_tsr = tf.fill([row_dim, col_dim], 42) Creating a tensor out of an existing constant. Use the following: constant_tsr = tf.constant([1,2,3]) Note that the tf.constant() function can be used to broadcast a value into an array, mimicking the behavior of tf.fill() by writing tf.constant(42, shape=[row_dim, col_dim]) Tensors of similar shape: We can also initialize variables based on the shape of other tensors, as follows: zeros_similar = tf.zeros_like(constant_tsr) ones_similar = tf.ones_like(constant_tsr) Note that since these tensors depend on prior tensors, we must initialize them in order. Attempting to initialize all the tensors all at once will result in an error. Sequence tensors: TensorFlow allows us to specify tensors that contain defined intervals. The following functions behave very similarly to the range() outputs and numpy's linspace() outputs. See the following function: linear_tsr = tf.linspace(start=0.0, stop=1.0, num=3) The resulting tensor is the sequence [0.0, 0.5, 1.0]. Note that this function includes the specified stop value. See the following function integer_seq_tsr = tf.range(start=6, limit=15, delta=3) The result is the sequence [6, 9, 12]. Note that this function does not include the limit value. Random tensors: The following generated random numbers are from a uniform distribution: randunif_tsr = tf.random_uniform([row_dim, col_dim], minval=0, maxval=1) Know that this random uniform distribution draws from the interval that includes the minval but not the maxval ( minval<=x<maxval ). To get a tensor with random draws from a normal distribution, use the following: randnorm_tsr = tf.random_normal([row_dim, col_dim], mean=0.0, stddev=1.0) There are also times when we wish to generate normal random values that are assured within certain bounds. The truncated_normal() function always picks normal values within two standard deviations of the specified mean. See the following: truncnorm_tsr = tf.truncated_normal([row_dim, col_dim], mean=0.0, stddev=1.0) We might also be interested in randomizing entries of arrays. To accomplish this there are two functions that help us, random_shuffle() and random_crop(). See the following: shuffled_output = tf.random_shuffle(input_tensor) cropped_output = tf.random_crop(input_tensor, crop_size) Later on in this book, we will be interested in randomly cropping an image of size (height, width, 3) where there are three color spectrums. To fix a dimension in the cropped_output, you must give it the maximum size in that dimension: cropped_image = tf.random_crop(my_image, [height/2, width/2, 3]) How it works… Once we have decided on how to create the tensors, then we may also create the corresponding variables by wrapping the tensor in the Variable() function, as follows. More on this in the next section: my_var = tf.Variable(tf.zeros([row_dim, col_dim])) There's more… We are not limited to the built-in functions; we can convert any numpy array, Python list, or constant to a tensor using the function convert_to_tensor(). Know that this function also accepts tensors as an input in case we wish to generalize a computation inside a function. Using placeholders and variables Getting ready One of the most important distinctions to make with data is whether it is a placeholder or variable. Variables are the parameters of the algorithm and TensorFlow keeps track of how to change these to optimize the algorithm.
Placeholders are objects that allow you to feed in data of a specific type and shape or that depend on the results of the computational graph, like the expected outcome of a computation. How to do it… The main way to create a variable is by using the Variable() function, which takes a tensor as an input and outputs a variable. This is the declaration and we still need to initialize the variable. Initializing is what puts the variable with the corresponding methods on the computational graph. Here is an example of creating and initializing a variable: my_var = tf.Variable(tf.zeros([2,3])) sess = tf.Session() initialize_op = tf.initialize_all_variables() sess.run(initialize_op) To see whatthe computational graph looks like after creating and initializing a variable, see the next part in this section, How it works…, Figure 1. Placeholders are just holding the position for data to be fed into the graph. Placeholders get data from a feed_dict argument in the session. To put a placeholder in the graph, we must perform at least one operation on the placeholder. We initialize the graph, declare x to be a placeholder, and define y as the identity operation on x, which just returns x. We then create data to feed into the x placeholder and run the identity operation. It is worth noting that Tensorflow will not return a self-referenced placeholder in the feed dictionary. The code is shown below and the resulting graph is in the next section, How it works…: sess = tf.Session() x = tf.placeholder(tf.float32, shape=[2,2]) y = tf.identity(x) x_vals = np.random.rand(2,2) sess.run(y, feed_dict={x: x_vals}) # Note that sess.run(x, feed_dict={x: x_vals}) will result in a self-referencing error. How it works… The computational graph of initializing a variable as a tensor of zeros is seen in Figure 1to follow: Figure 1: Variable Figure 1: Here we can see what the computational graph looks like in detail with just one variable, initialized to all zeros. The grey shaded region is a very detailed view of the operations and constants involved. The main computational graph with less detail is the smaller graph outside of the grey region in the upper right. For more details on creating and visualizing graphs. Similarly, the computational graph of feeding a numpy array into a placeholder can be seen to follow, in Figure 2: Figure 2: Computational graph of an initialized placeholder Figure 2: Here is the computational graph of a placeholder initialized. The grey shaded region is a very detailed view of the operations and constants involved. The main computational graph with less detail is the smaller graph outside of the grey region in the upper right. There's more… During the run of the computational graph, we have to tell TensorFlow when to initialize the variables we have created. While each variable has an initializer method, the most common way to do this is with the helper function initialize_all_variables(). This function creates an operation in the graph that initializes all the variables we have created, as follows: initializer_op = tf.initialize_all_variables() But if we want to initialize a variable based on the results of initializing another variable, we have to initialize variables in the order we want, as follows: sess = tf.Session() first_var = tf.Variable(tf.zeros([2,3])) sess.run(first_var.initializer) second_var = tf.Variable(tf.zeros_like(first_var)) # Depends on first_var sess.run(second_var.initializer) Working with matrices Getting ready Many algorithms depend on matrix operations. 
TensorFlow gives us easy-to-use operations to perform such matrix calculations. For all of the following examples, we can create a graph session by running the following code: import tensorflow as tf sess = tf.Session() How to do it… Creating matrices: We can create two-dimensional matrices from numpy arrays or nested lists, as we described in the earlier section on tensors. We can also use the tensor creation functions and specify a two-dimensional shape for functions like zeros(), ones(), truncated_normal(), and so on: Tensorflow also allows us to create a diagonal matrix from a one dimensional array or list with the function diag(), as follows: identity_matrix = tf.diag([1.0, 1.0, 1.0]) # Identity matrix A = tf.truncated_normal([2, 3]) # 2x3 random normal matrix B = tf.fill([2,3], 5.0) # 2x3 constant matrix of 5's C = tf.random_uniform([3,2]) # 3x2 random uniform matrix D = tf.convert_to_tensor(np.array([[1., 2., 3.],[-3., -7., -1.],[0., 5., -2.]])) print(sess.run(identity_matrix)) [[ 1. 0. 0.] [ 0. 1. 0.] [ 0. 0. 1.]] print(sess.run(A)) [[ 0.96751703 0.11397751 -0.3438891 ] [-0.10132604 -0.8432678 0.29810596]] print(sess.run(B)) [[ 5. 5. 5.] [ 5. 5. 5.]] print(sess.run(C)) [[ 0.33184157 0.08907614] [ 0.53189191 0.67605299] [ 0.95889051 0.67061249]] print(sess.run(D)) [[ 1. 2. 3.] [-3. -7. -1.] [ 0. 5. -2.]] Note that if we were to run sess.run(C) again, we would reinitialize the random variables and end up with different random values. Addition and subtraction uses the following function: print(sess.run(A+B)) [[ 4.61596632 5.39771316 4.4325695 ] [ 3.26702736 5.14477345 4.98265553]] print(sess.run(B-B)) [[ 0. 0. 0.] [ 0. 0. 0.]] Multiplication print(sess.run(tf.matmul(B, identity_matrix))) [[ 5. 5. 5.] [ 5. 5. 5.]] Also, the function matmul() has arguments that specify whether or not to transpose the arguments before multiplication or whether each matrix is sparse. Transpose the arguments as follows: print(sess.run(tf.transpose(C))) [[ 0.67124544 0.26766731 0.99068872] [ 0.25006068 0.86560275 0.58411312]] Again, it is worth mentioning the reinitializing that gives us different values than before. Determinant, use the following: print(sess.run(tf.matrix_determinant(D))) -38.0 Inverse: print(sess.run(tf.matrix_inverse(D))) [[-0.5 -0.5 -0.5 ] [ 0.15789474 0.05263158 0.21052632] [ 0.39473684 0.13157895 0.02631579]] Note that the inverse method is based on the Cholesky decomposition if the matrix is symmetric positive definite or the LU decomposition otherwise. Decompositions: Cholesky decomposition, use the following: print(sess.run(tf.cholesky(identity_matrix))) [[ 1. 0. 1.] [ 0. 1. 0.] [ 0. 0. 1.]] Eigenvalues and Eigenvectors, use the following code: print(sess.run(tf.self_adjoint_eig(D)) [[-10.65907521 -0.22750691 2.88658212] [ 0.21749542 0.63250104 -0.74339638] [ 0.84526515 0.2587998 0.46749277] [ -0.4880805 0.73004459 0.47834331]] Note that the function self_adjoint_eig() outputs the eigen values in the first row and the subsequent vectors in the remaining vectors. In mathematics, this is called the eigen decomposition of a matrix. How it works… TensorFlow provides all the tools for us to get started with numerical computations and add such computations to our graphs. This notation might seem quite heavy for simple matrix operations. Remember that we are adding these operations to the graph and telling TensorFlow what tensors to run through those operations. 
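To see how these pieces fit together in a single graph, here is a short illustrative sketch (not one of the book's recipes) that chains several of the preceding operations, matmul(), transpose(), and matrix_inverse(), to solve an ordinary least-squares problem via the normal equations w = (A^T A)^(-1) A^T b; the data is synthetic and only serves to show the composition of operations:

import tensorflow as tf
import numpy as np

sess = tf.Session()

# A small synthetic linear system: b = A * [2, -1] plus a little noise
A_vals = np.random.rand(20, 2)
b_vals = A_vals.dot(np.array([[2.0], [-1.0]])) + 0.01 * np.random.randn(20, 1)

A = tf.constant(A_vals)
b = tf.constant(b_vals)

# Normal equations built from the matrix operations introduced above
AtA = tf.matmul(tf.transpose(A), A)
Atb = tf.matmul(tf.transpose(A), b)
w = tf.matmul(tf.matrix_inverse(AtA), Atb)

print(sess.run(w))  # should be close to [[2.], [-1.]]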
Declaring operations Getting ready Besides the standard arithmetic operations, TensorFlow provides us more operations that we should be aware of and how to use them before proceeding. Again, we can create a graph session by running the following code: import tensorflow as tf sess = tf.Session() How to do it… TensorFlow has the standard operations on tensors, add(), sub(), mul(), and div(). Note that all of these operations in this section will evaluate the inputs element-wise unless specified otherwise. TensorFlow provides some variations of div() and relevant functions. It is worth mentioning that div() returns the same type as the inputs. This means it really returns the floor of the division (akin to Python 2) if the inputs are integers. To return the Python 3 version, which casts integers into floats before dividing and always returns a float, TensorFlow provides the function truediv()shown as follows: print(sess.run(tf.div(3,4))) 0 print(sess.run(tf.truediv(3,4))) 0.75 If we have floats and want integer division, we can use the function floordiv(). Note that this will still return a float, but rounded down to the nearest integer. The function is shown as follows: print(sess.run(tf.floordiv(3.0,4.0))) 0.0 Another important function is mod(). This function returns the remainder after division.It is shown as follows: print(sess.run(tf.mod(22.0, 5.0))) 2.0 The cross product between two tensors is achieved by the cross() function. Remember that the cross product is only defined for two 3-dimensional vectors, so it only accepts two 3-dimensional tensors. The function is shown as follows: print(sess.run(tf.cross([1., 0., 0.], [0., 1., 0.]))) [ 0. 0. 1.0] Here is a compact list of the more common math functions. All of these functions operate element-wise: abs() Absolute value of one input tensor ceil() Ceiling function of one input tensor cos() Cosine function of one input tensor exp() Base e exponential of one input tensor floor() Floor function of one input tensor inv() Multiplicative inverse (1/x) of one input tensor log() Natural logarithm of one input tensor maximum() Element-wise max of two tensors minimum() Element-wise min of two tensors neg() Negative of one input tensor pow() The first tensor raised to the second tensor element-wise round() Rounds one input tensor rsqrt() One over the square root of one tensor sign() Returns -1, 0, or 1, depending on the sign of the tensor sin() Sine function of one input tensor sqrt() Square root of one input tensor square() Square of one input tensor Specialty mathematical functions: There are some special math functions that get used in machine learning that are worth mentioning and TensorFlow has built in functions for them. Again, these functions operate element-wise, unless specified otherwise: digamma() Psi function, the derivative of the lgamma() function erf() Gaussian error function, element-wise, of one tensor erfc() Complimentary error function of one tensor igamma() Lower regularized incomplete gamma function igammac() Upper regularized incomplete gamma function lbeta() Natural logarithm of the absolute value of the beta function lgamma() Natural logarithm of the absolute value of the gamma function squared_difference() Computes the square of the differences between two tensors How it works… It is important to know what functions are available to us to add to our computational graphs. Mostly we will be concerned with the preceding functions. 
We can also generate many different custom functions as compositions of the preceding, as follows: # Tangent function (tan(pi/4)=1) print(sess.run(tf.div(tf.sin(3.1416/4.), tf.cos(3.1416/4.)))) 1.0 There's more… If we wish to add other operations to our graphs that are not listed here, we must create our own from the preceding functions. Here is an example of an operation not listed above that we can add to our graph: # Define a custom polynomial function def custom_polynomial(value): # Return 3 * x^2 - x + 10 return(tf.sub(3 * tf.square(value), value) + 10) print(sess.run(custom_polynomial(11))) 362 Summary Thus in this article we have implemented some introductory recipes that will help us to learn the basics of TensorFlow. Resources for Article: Further resources on this subject: Data Clustering [article] The TensorFlow Toolbox [article] Implementing Artificial Neural Networks with TensorFlow [article]

Text Recognition

Packt
04 Jan 2017
7 min read
In this article by Fábio M. Soares and Alan M.F. Souza, the authors of the book Neural Network Programming with Java - Second Edition, we will cover pattern recognition, neural networks in pattern recognition, and text recognition (OCR). We all know that humans can read and recognize images faster than any supercomputer; however we have seen so far that neural networks show amazing capabilities of learning through data in both supervised and unsupervised way. In this article we present an additional case of pattern recognition involving an example of Optical Character Recognition (OCR). Neural networks can be trained to strictly recognize digits written in an image file. The topics of this article are: Pattern recognition Defined classes Undefined classes Neural networks in pattern recognition MLP Text recognition (OCR) Preprocessing and Classes definition (For more resources related to this topic, see here.) Pattern recognition Patterns are a bunch of data and elements that look similar to each other, in such a way that they can occur systematically and repeat from time to time. This is a task that can be solved mainly by unsupervised learning by clustering; however, when there are labelled data or defined classes of data, this task can be solved by supervised methods. We as humans perform this task more often than we can imagine. When we see objects and recognize them as belonging to a certain class, we are indeed recognizing a pattern. Also when we analyze charts discrete events and time series, we might find an evidence of some sequence of events that repeat systematically under certain conditions. In summary, patterns can be learned by data observations. Examples of pattern recognition tasks include, not liming to: Shapes recognition Objects classification Behavior clustering Voice recognition OCR Chemical reactions taxonomy Defined classes In the existence of a list of classes that has been predefined for a specific domain, then each class is considered to be a pattern, therefore every data record or occurrence is assigned one of these predefined classes. The predefinition of classes can usually be performed by an expert or based on a previous knowledge of the application domain. Also it is desirable to apply defined classes when we want the data to be classified strictly into one of the predefined classes. One illustrated example for pattern recognition using defined classes is animal recognition by image, shown in the next figure. The pattern recognizer however should be trained to catch all the characteristics that formally define the classes. In the example eight figures of animals are shown, belonging to two classes: mammals and birds. Since this is a supervised mode of learning, the neural network should be provided with a sufficient number of images that allow it to properly classify new images: Of course, sometimes the classification may fail, mainly due to similar hidden patterns in the images that neural networks may catch and also due to small nuances present in the shapes. For example, the dolphin has flippers but it is still a mammal. Sometimes in order to obtain a better classification, it is necessary to apply preprocessing and ensure that the neural network will receive the appropriate data that would allow for classification. Undefined classes When data are unlabeled and there is no predefined set of classes, it is a scenario for unsupervised learning. 
Shapes recognition are a good example since they may be flexible and have infinite number of edges, vertices or bindings: In the previous figure, we can see some sorts of shapes and we want to arrange them, whereby the similar ones can be grouped into the same cluster. Based on the shape information that is present in the images, it is likely for the pattern recognizer to classify the rectangle, the square and the rectangular triangle in into the same group. But if the information were presented to the pattern recognizer, not as an image, but as a graph with edges and vertices coordinates, the classification might change a little. In summary, the pattern recognition task may use both supervised and unsupervised mode of learning, basically depending of the objective of recognition. Neural networks in pattern recognition For pattern recognition the neural network architectures that can be applied are the MLPs (supervised) and the Kohonen network (unsupervised). In the first case, the problem should be set up as a classification problem, that is. the data should be transformed into the X-Y dataset, where for every data record in X there should be a corresponding class in Y. The output of the neural network for classification problems should have all of the possible classes, and this may require preprocessing of the output records. For the other case, the unsupervised learning, there is no need to apply labels on the output, however, the input data should be properly structured as well. To remind the reader the schema of both neural networks are shown in the next figure: Data pre-processing We have to deal with all possible types of data, that is., numerical (continuous and discrete) and categorical (ordinal or unscaled). But here we have the possibility to perform pattern recognition on multimedia content, such as images and videos. So how could multimedia be handled? The answer of this question lies in the way these contents are stored in files. Images, for example, are written with a representation of small colored points called pixels. Each color can be coded in an RGB notation where the intensity of red, green and blue define every color the human eye is able to see. Therefore an image of dimension 100x100 would have 10,000 pixels, each one having 3 values for red, green and blue, yielding a total of 30,000 points. That is the challenge for image processing in neural networks. Some methods, may reduce this huge number of dimensions. Afterwards an image can be treated as big matrix of numerical continuous values. For simplicity in this article we are applying only gray-scaled images with small dimension. Text recognition (OCR) Many documents are now being scanned and stored as images, making necessary the task of converting these documents back into text, for a computer to apply edition and text processing. However, this feature involves a number of challenges: Variety of text font Text size Image noise Manuscripts In the spite of that, humans can easily interpret and read even the texts written in a bad quality image. This can be explained by the fact that humans are already familiar with the text characters and the words in their language. Somehow the algorithm must become acquainted with these elements (characters, digits, signalization, and so on), in order to successfully recognize texts in images. Digits recognition Although there are a variety of tools available in the market for OCR, this remains still a big challenge for an algorithm to properly recognize texts in images. 
So we will be restricting our application in a smaller domain, so we could face simpler problems. Therefore, in this article we are going to implement a Neural Network to recognize digits from 0 to 9 represented on images. Also the images will have standardized and small dimensions, for simplicity purposes. Summary In this article we have covered pattern recognition, neural networks in pattern recognition, and text recognition (OCR). Resources for Article: Further resources on this subject: Training neural networks efficiently using Keras [article] Implementing Artificial Neural Networks with TensorFlow [article] Training and Visualizing a neural network with R [article]

Microsoft Cognitive Services

Packt
04 Jan 2017
16 min read
In this article by Leif Henning Larsen, author of the book Learning Microsoft Cognitive Services, we will look into what Microsoft Cognitive Services offer. You will then learn how to utilize one of the APIs by recognizing faces in images. Microsoft Cognitive Services give developers the possibilities of adding AI-like capabilities to their applications. Using a few lines of code, we can take advantage of powerful algorithms that would usually take a lot of time, effort, and hardware to do yourself. (For more resources related to this topic, see here.) Overview of Microsoft Cognitive Services Using Cognitive Services means you have 21 different APIs at your hand. These are in turn separated into 5 top-level domains according to what they do. They are vision, speech, language, knowledge, and search. Let's see more about them in the following sections. Vision APIs under the vision flags allows your apps to understand images and video content. It allows you to retrieve information about faces, feelings, and other visual content. You can stabilize videos and recognize celebrities. You can read text in images and generate thumbnails from videos and images. There are four APIs contained in the vision area, which we will see now. Computer Vision Using the Computer Vision API, you can retrieve actionable information from images. This means you can identify content (such as image format, image size, colors, faces, and more). You can detect whether an image is adult/racy. This API can recognize text in images and extract it to machine-readable words. It can detect celebrities from a variety of areas. Lastly, it can generate storage-efficient thumbnails with smart cropping functionality. Emotion The Emotion API allows you to recognize emotions, both in images and videos. This can allow for more personalized experiences in applications. The emotions that are detected are cross-cultural emotions: anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. Face We have already seen the very basic example of what the Face API can do. The rest of the API revolves around the same—to detect, identify, organize, and tag faces in photos. Apart from face detection, you can see how likely it is that two faces belong to the same person. You can identify faces and also find similar-looking faces. Video The Video API is about analyzing, editing, and processing videos in your app. If you have a video that is shaky, the API allows you to stabilize it. You can detect and track faces in videos. If a video contains a stationary background, you can detect motion. The API lets you to generate thumbnail summaries for videos, which allows users to see previews or snapshots quickly. Speech Adding one of the Speech APIs allows your application to hear and speak to your users. The APIs can filter noise and identify speakers. They can drive further actions in your application based on the recognized intent. Speech contains three APIs, which we will discuss now. Bing Speech Adding the Bing Speech API to your application allows you to convert speech to text and vice versa. You can convert spoken audio to text either by utilizing a microphone or other sources in real time or by converting audio from files. The API also offer speech intent recognition, which is trained by Language Understanding Intelligent Service to understand the intent. Speaker Recognition The Speaker Recognition API gives your application the ability to know who is talking. 
Using this API, you can use verify that someone speaking is who they claim to be. You can also determine who an unknown speaker is, based on a group of selected speakers. Custom Recognition To improve speech recognition, you can use the Custom Recognition API. This allows you to fine-tune speech recognition operations for anyone, anywhere. Using this API, the speech recognition model can be tailored to the vocabulary and speaking style of the user. In addition to this, the model can be customized to match the expected environment of the application. Language APIs related to language allow your application to process natural language and learn how to recognize what users want. You can add textual and linguistic analysis to your application as well as natural language understanding. The following five APIs can be found in the Language area. Bing Spell Check Bing Spell Check API allows you to add advanced spell checking to your application. Language Understanding Intelligent Service (LUIS) Language Understanding Intelligent Service, or LUIS, is an API that can help your application understand commands from your users. Using this API, you can create language models that understand intents. Using models from Bing and Cortana, you can make these models recognize common requests and entities (such as places, time, and numbers). You can add conversational intelligence to your applications. Linguistic Analysis Linguistic Analysis API lets you parse complex text to explore the structure of text. Using this API, you can find nouns, verbs, and more from text, which allows your application to understand who is doing what to whom. Text Analysis Text Analysis API will help you in extracting information from text. You can find the sentiment of a text (whether the text is positive or negative). You will be able to detect language, topic, and key phrases used throughout the text. Web Language Model Using the Web Language Model (WebLM) API, you are able to leverage the power of language models trained on web-scale data. You can use this API to predict which words or sequences follow a given sequence or word. Knowledge When talking about Knowledge APIs, we are talking about APIs that allow you to tap into rich knowledge. This may be knowledge from the Web, it may be academia, or it may be your own data. Using these APIs, you will be able to explore different nuances of knowledge. The following four APIs are contained in the Knowledge area. Academic Using the Academic API, you can explore relationships among academic papers, journals, and authors. This API allows you to interpret natural language user query strings, which allows your application to anticipate what the user is typing. It will evaluate the said expression and return academic knowledge entities. Entity Linking Entity Linking is the API you would use to extend knowledge of people, places, and events based on the context. As you may know, a single word may be used differently based on the context. Using this API allows you to recognize and identify each separate entity within a paragraph based on the context. Knowledge Exploration The Knowledge Exploration API will let you add the ability to use interactive search for structured data in your projects. It interprets natural language queries and offers auto-completions to minimize user effort. Based on the query expression received, it will retrieve detailed information about matching objects. 
Recommendations The Recommendations API allows you to provide personalized product recommendations for your customers. You can use this API to add frequently bought together functionality to your application. Another feature you can add is item-to-item recommendations, which allow customers to see what other customers who likes this also like. This API will also allow you to add recommendations based on the prior activity of the customer. Search Search APIs give you the ability to make your applications more intelligent with the power of Bing. Using these APIs, you can use a single call to access data from billions of web pages, images videos, and news. The following five APIs are in the search domain. Bing Web Search With Bing Web Search, you can search for details in billions of web documents indexed by Bing. All the results can be arranged and ordered according to the layout you specify, and the results are customized to the location of the end user. Bing Image Search Using Bing Image Search API, you can add advanced image and metadata search to your application. Results include URLs to images, thumbnails, and metadata. You will also be able to get machine-generated captions and similar images and more. This API allows you to filter the results based on image type, layout, freshness (how new is the image), and license. Bing Video Search Bing Video Search will allow you to search for videos and returns rich results. The results contain metadata from the videos, static- or motion- based thumbnails, and the video itself. You can add filters to the result based on freshness, video length, resolution, and price. Bing News Search If you add Bing News Search to your application, you can search for news articles. Results can include authoritative image, related news and categories, information on the provider, URL, and more. To be more specific, you can filter news based on topics. Bing Autosuggest Bing Autosuggest API is a small, but powerful one. It will allow your users to search faster with search suggestions, allowing you to connect powerful search to your apps. Detecting faces with the Face API We have seen what the different APIs can do. Now we will test the Face API. We will not be doing a whole lot, but we will see how simple it is to detect faces in images. The steps we need to cover to do this are as follows: Register for a free Face API preview subscription. Add necessary NuGet packages to our project. Add some UI to the test application. Detect faces on command. Head over to https://www.microsoft.com/cognitive-services/en-us/face-api to start the process of registering for a free subscription to the Face API. By clicking on the yellow button, stating Get started for free,you will be taken to a login page. Log in with your Microsoft account, or if you do not have one, register for one. Once logged in, you will need to verify that the Face API Preview has been selected in the list and accept the terms and conditions. With that out of the way, you will be presented with the following: You will need one of the two keys later, when we are accessing the API. In Visual Studio, create a new WPF application. Following the instructions at https://www.codeproject.com/articles/100175/model-view-viewmodel-mvvm-explained, create a base class that implements the INotifyPropertyChanged interface and a class implementing the ICommand interface. 
The first should be inherited by the ViewModel, the MainViewModel.cs file, while the latter should be used when creating properties to handle button commands. The Face API has a NuGet package, so we need to add that to our project. Head over to NuGet Package Manager for the project we created earlier. In the Browse tab, search for the Microsoft.ProjectOxford.Face package and install the it from Microsoft: As you will notice, another package will also be installed. This is the Newtonsoft.Json package, which is required by the Face API. The next step is to add some UI to our application. We will be adding this in the MainView.xaml file. First, we add a grid and define some rows for the grid: <Grid> <Grid.RowDefinitions> <RowDefinition Height="*" /> <RowDefinition Height="20" /> <RowDefinition Height="30" /> </Grid.RowDefinitions> Three rows are defined. The first is a row where we will have an image. The second is a line for status message, and the last is where we will place some buttons: Next, we add our image element: <Image x_Name="FaceImage" Stretch="Uniform" Source="{Binding ImageSource}" Grid.Row="0" /> We have given it a unique name. By setting the Stretch parameter to Uniform, we ensure that the image keeps its aspect ratio. Further on, we place this element in the first row. Last, we bind the image source to a BitmapImage interface in the ViewModel, which we will look at in a bit. The next row will contain a text block with some status text. The text property will be bound to a string property in the ViewModel: <TextBlock x_Name="StatusTextBlock" Text="{Binding StatusText}" Grid.Row="1" /> The last row will contain one button to browse for an image and one button to be able to detect faces. The command properties of both buttons will be bound to the DelegateCommand properties in the ViewModel: <Button x_Name="BrowseButton" Content="Browse" Height="20" Width="140" HorizontalAlignment="Left" Command="{Binding BrowseButtonCommand}" Margin="5, 0, 0, 5" Grid.Row="2" /> <Button x_Name="DetectFaceButton" Content="Detect face" Height="20" Width="140" HorizontalAlignment="Right" Command="{Binding DetectFaceCommand}" Margin="0, 0, 5, 5" Grid.Row="2"/> With the View in place, make sure that the code compiles and run it. This should present you with the following UI: The last part is to create the binding properties in our ViewModel and make the buttons execute something. Open the MainViewModel.cs file. First, we define two variables: private string _filePath; private IFaceServiceClient _faceServiceClient; The string variable will hold the path to our image, while the IFaceServiceClient variable is to interface the Face API. Next we define two properties: private BitmapImage _imageSource; public BitmapImage ImageSource { get { return _imageSource; } set { _imageSource = value; RaisePropertyChangedEvent("ImageSource"); } } private string _statusText; public string StatusText { get { return _statusText; } set { _statusText = value; RaisePropertyChangedEvent("StatusText"); } } What we have here is a property for the BitmapImage mapped to the Image element in the view. We also have a string property for the status text, mapped to the text block element in the view. As you also may notice, when either of the properties is set, we call the RaisePropertyChangedEvent method. This will ensure that the UI is updated when either of the properties has new values. 
Next, we define our two DelegateCommand objects and do some initialization through the constructor: public ICommand BrowseButtonCommand { get; private set; } public ICommand DetectFaceCommand { get; private set; } public MainViewModel() { StatusText = "Status: Waiting for image..."; _faceServiceClient = new FaceServiceClient("YOUR_API_KEY_HERE"); BrowseButtonCommand = new DelegateCommand(Browse); DetectFaceCommand = new DelegateCommand(DetectFace, CanDetectFace); } In our constructor, we start off by setting the status text. Next, we create an object of the Face API, which needs to be created with the API key we got earlier. At last, we create the DelegateCommand object for our command properties. Note how the browse command does not specify a predicate. This means it will always be possible to click on the corresponding button. To make this compile, we need to create the functions specified in the DelegateCommand constructors—the Browse, DetectFace, and CanDetectFace functions: private void Browse(object obj) { var openDialog = new Microsoft.Win32.OpenFileDialog(); openDialog.Filter = "JPEG Image(*.jpg)|*.jpg"; bool? result = openDialog.ShowDialog(); if (!(bool)result) return; We start the Browse function by creating an OpenFileDialog object. This dialog is assigned a filter for JPEG images, and in turn it is opened. When the dialog is closed, we check the result. If the dialog was cancelled, we simply stop further execution: _filePath = openDialog.FileName; Uri fileUri = new Uri(_filePath); With the dialog closed, we grab the filename of the file selected and create a new URI from it: BitmapImage image = new BitmapImage(fileUri); image.CacheOption = BitmapCacheOption.None; image.UriSource = fileUri; With the newly created URI, we want to create a new BitmapImage interface. We specify it to use no cache, and we set the URI source the URI we created: ImageSource = image; StatusText = "Status: Image loaded..."; } The last step we take is to assign the bitmap image to our BitmapImage property, so the image is shown in the UI. We also update the status text to let the user know the image has been loaded. Before we move on, it is time to make sure that the code compiles and that you are able to load an image into the View: private bool CanDetectFace(object obj) { return !string.IsNullOrEmpty(ImageSource?.UriSource.ToString()); } The CanDetectFace function checks whether or not the detect faces button should be enabled. In this case, it checks whether our image property actually has a URI. If it does, by extension that means we have an image, and we should be able to detect faces: private async void DetectFace(object obj) { FaceRectangle[] faceRects = await UploadAndDetectFacesAsync(); string textToSpeak = "No faces detected"; if (faceRects.Length == 1) textToSpeak = "1 face detected"; else if (faceRects.Length > 1) textToSpeak = $"{faceRects.Length} faces detected"; Debug.WriteLine(textToSpeak); } Our DetectFace method calls an async method to upload and detect faces. The return value contains an array of FaceRectangles. This array contains the rectangle area for all face positions in the given image. We will look into the function we call in a bit. 
After the call has finished executing, we print a line with the number of faces to the debug console window: private async Task<FaceRectangle[]> UploadAndDetectFacesAsync() { StatusText = "Status: Detecting faces..."; try { using (Stream imageFileStream = File.OpenRead(_filePath)) { In the UploadAndDetectFacesAsync function, we create a Stream object from the image. This stream will be used as input for the actual call to the Face API service: Face[] faces = await _faceServiceClient.DetectAsync(imageFileStream, true, true, new List<FaceAttributeType>() { FaceAttributeType.Age }); This line is the actual call to the detection endpoint for the Face API. The first parameter is the file stream we created in the previous step. The rest of the parameters are all optional. The second parameter should be true if you want to get a face ID. The next specifies if you want to receive face landmarks or not. The last parameter takes a list of facial attributes you may want to receive. In our case, we want the age parameter to be returned, so we need to specify that. The return type of this function call is an array of faces with all the parameters you have specified: List<double> ages = faces.Select(face => face.FaceAttributes.Age).ToList(); FaceRectangle[] faceRects = faces.Select(face => face.FaceRectangle).ToArray(); StatusText = "Status: Finished detecting faces..."; foreach(var age in ages) { Console.WriteLine(age); } return faceRects; } } The first line in the previous code iterates over all faces and retrieves the approximate age of all faces. This is later printed to the debug console window, in the following foreach loop. The second line iterates over all faces and retrieves the face rectangle with the rectangular location of all faces. This is the data we return to the calling function. Add a catch clause to finish the method. In case an exception is thrown, in our API call, we catch that. You want to show the error message and return an empty FaceRectangle array. With that code in place, you should now be able to run the full example. The end result will look like the following image: The resulting debug console window will print the following text: 1 face detected 23,7 Summary In this article, we looked at what Microsoft Cognitive Services offer. We got a brief description of all the APIs available. From there, we looked into the Face API, where we saw how to detect faces in images. Resources for Article: Further resources on this subject: Auditing and E-discovery [article] The Sales and Purchase Process [article] Manage Security in Excel [article]

What is an Artificial Neural Network?

Packt
04 Jan 2017
11 min read
In this article by Prateek Joshi, author of book Artificial Intelligence with Python, we are going to learn about artificial neural networks. We will start with an introduction to artificial neural networks and the installation of the relevant library. We will discuss perceptron and how to build a classifier based on that. We will learn about single layer neural networks and multilayer neural networks. (For more resources related to this topic, see here.) Introduction to artificial neural networks One of the fundamental premises of Artificial Intelligence is to build machines that can perform tasks that require human intelligence. The human brain is amazing at learning new things. Why not use the model of the human brain to build a machine? An artificial neural network is a model designed to simulate the learning process of the human brain. Artificial neural networks are designed such that they can identify the underlying patterns in data and learn from them. They can be used for various tasks such as classification, regression, segmentation, and so on. We need to convert any given data into the numerical form before feeding it into the neural network. For example, we deal with many different types of data including visual, textual, time-series, and so on. We need to figure out how to represent problems in a way that can be understood by artificial neural networks. Building a neural network The human learning process is hierarchical. We have various stages in our brain’s neural network and each stage corresponds to a different granularity. Some stages learn simple things and some stages learn more complex things. Let’s consider an example of visually recognizing an object. When we look at a box, the first stage identifies simple things like corners and edges. The next stage identifies the generic shape and the stage after that identifies what kind of object it is. This process differs for different tasks, but you get the idea! By building this hierarchy, our human brain quickly separates the concepts and identifies the given object. To simulate the learning process of the human brain, an artificial neural network is built using layers of neurons. These neurons are inspired from the biological neurons we discussed in the previous paragraph. Each layer in an artificial neural network is a set of independent neurons. Each neuron in a layer is connected to neurons in the adjacent layer. Training a neural network If we are dealing with N-dimensional input data, then the input layer will consist of N neurons. If we have M distinct classes in our training data, then the output layer will consist of M neurons. The layers between the input and output layers are called hidden layers. A simple neural network will consist of a couple of layers and a deep neural network will consist of many layers. Consider the case where we want to use a neural network to classify the given data. The first step is to collect the appropriate training data and label it. Each neuron acts as a simple function and the neural network trains itself until the error goes below a certain value. The error is basically the difference between the predicted output and the actual output. Based on how big the error is, the neural network adjusts itself and retrains until it gets closer to the solution. You can learn more about neural networks here: http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html. We will be using a library called NeuroLab . You can find more about it here: https://pythonhosted.org/neurolab. 
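To make the training idea described above concrete, here is a tiny illustrative sketch in plain numpy (not part of the book's code) of a single neuron whose weights are nudged, epoch after epoch, until the error between the predicted and actual output becomes small; the data and learning rate are arbitrary toy values:

import numpy as np

# Toy data: the neuron should learn the logical OR of its two inputs
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 1.])

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.1

for epoch in range(20):
    for xi, target in zip(X, y):
        predicted = 1.0 if xi.dot(weights) + bias > 0 else 0.0
        error = target - predicted             # difference between actual and predicted
        weights += learning_rate * error * xi  # adjust the weights to reduce the error
        bias += learning_rate * error

print(weights, bias)

NeuroLab wraps this kind of update loop (and far more capable variants of it) behind a simple API, which we will use next.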
You can install it by running the following command on your Terminal:

$ pip3 install neurolab

Once you have installed it, you can proceed to the next section.

Building a perceptron-based classifier

A perceptron is the building block of an artificial neural network. It is a single neuron that takes inputs, performs computation on them, and then produces an output. It uses a simple linear function to make the decision. Let's say we are dealing with an N-dimensional input datapoint. A perceptron computes the weighted summation of those N numbers and then adds a constant to produce the output. The constant is called the bias of the neuron. It is remarkable that these simple perceptrons are used to design very complex deep neural networks. Let's see how to build a perceptron-based classifier using NeuroLab. Create a new Python file and import the following packages:

import numpy as np
import matplotlib.pyplot as plt
import neurolab as nl

Load the input data from the text file data_perceptron.txt provided to you. Each line contains space-separated numbers, where the first two numbers are the features and the last number is the label:

# Load input data
text = np.loadtxt('data_perceptron.txt')

Separate the text into datapoints and labels:

# Separate datapoints and labels
data = text[:, :2]
labels = text[:, 2].reshape((text.shape[0], 1))

Plot the datapoints:

# Plot input data
plt.figure()
plt.scatter(data[:,0], data[:,1])
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.title('Input data')

Define the maximum and minimum values that each dimension can take:

# Define minimum and maximum values for each dimension
dim1_min, dim1_max, dim2_min, dim2_max = 0, 1, 0, 1

Since the data is separated into two classes, we just need one bit to represent the output, so the output layer will contain a single neuron:

# Number of neurons in the output layer
num_output = labels.shape[1]

We have a dataset where the datapoints are two-dimensional. Let's define a perceptron with 2 input neurons, where we assign one neuron for each dimension:

# Define a perceptron with 2 input neurons (because we
# have 2 dimensions in the input data)
dim1 = [dim1_min, dim1_max]
dim2 = [dim2_min, dim2_max]
perceptron = nl.net.newp([dim1, dim2], num_output)

Train the perceptron with the training data:

# Train the perceptron using the data
error_progress = perceptron.train(data, labels, epochs=100, show=20, lr=0.03)

Plot the training progress using the error metric:

# Plot the training progress
plt.figure()
plt.plot(error_progress)
plt.xlabel('Number of epochs')
plt.ylabel('Training error')
plt.title('Training error progress')
plt.grid()
plt.show()

The full code is given in the file perceptron_classifier.py. If you run the code, you will get two output figures. The first figure shows the input datapoints and the second figure shows the training progress using the error metric. As we can observe from the training progress figure, the error goes down to 0 at the end of the fourth epoch.

Constructing a single-layer neural network

A perceptron is a good start, but it cannot do much. The next step is to have a set of neurons act as a unit and see what we can achieve. Let's create a single-layer neural network that consists of independent neurons acting on input data to produce the output. Create a new Python file and import the following packages:

import numpy as np
import matplotlib.pyplot as plt
import neurolab as nl

Load the input data from the file data_simple_nn.txt provided to you. Each line in this file contains four numbers.
The first two numbers form the datapoint and the last two numbers are the labels. Why do we need two numbers for the labels? Because we have four distinct classes in our dataset, so we need two bits to represent them.

# Load input data
text = np.loadtxt('data_simple_nn.txt')

Separate the data into datapoints and labels:

# Separate it into datapoints and labels
data = text[:, 0:2]
labels = text[:, 2:]

Plot the input data:

# Plot input data
plt.figure()
plt.scatter(data[:,0], data[:,1])
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.title('Input data')

Extract the minimum and maximum values for each dimension (we don't need to hardcode them like we did in the previous section):

# Minimum and maximum values for each dimension
dim1_min, dim1_max = data[:,0].min(), data[:,0].max()
dim2_min, dim2_max = data[:,1].min(), data[:,1].max()

Define the number of neurons in the output layer:

# Define the number of neurons in the output layer
num_output = labels.shape[1]

Define a single-layer neural network using the above parameters:

# Define a single-layer neural network
dim1 = [dim1_min, dim1_max]
dim2 = [dim2_min, dim2_max]
nn = nl.net.newp([dim1, dim2], num_output)

Train the neural network using the training data:

# Train the neural network
error_progress = nn.train(data, labels, epochs=100, show=20, lr=0.03)

Plot the training progress:

# Plot the training progress
plt.figure()
plt.plot(error_progress)
plt.xlabel('Number of epochs')
plt.ylabel('Training error')
plt.title('Training error progress')
plt.grid()
plt.show()

Define some sample test datapoints and run the network on those points:

# Run the classifier on test datapoints
print('\nTest results:')
data_test = [[0.4, 4.3], [4.4, 0.6], [4.7, 8.1]]
for item in data_test:
    print(item, '-->', nn.sim([item])[0])

The full code is given in the file simple_neural_network.py. If you run the code, you will get two figures. The first figure shows the input datapoints and the second figure shows the training progress. The test results are printed on your Terminal. If you locate those test datapoints on a 2D graph, you can visually verify that the predicted outputs are correct.

Constructing a multilayer neural network

In order to achieve higher accuracy, we need to give more freedom to the neural network. This means that a neural network needs more than one layer to extract the underlying patterns in the training data. Let's create a multilayer neural network to achieve that. Create a new Python file and import the following packages:

import numpy as np
import matplotlib.pyplot as plt
import neurolab as nl

In the previous two sections, we saw how to use a neural network as a classifier. In this section, we will see how to use a multilayer neural network as a regressor. Generate some sample datapoints based on the equation y = 3x^2 + 5 and then normalize the points:

# Generate some training data
min_val = -15
max_val = 15
num_points = 130
x = np.linspace(min_val, max_val, num_points)
y = 3 * np.square(x) + 5
y /= np.linalg.norm(y)

Reshape the above variables to create a training dataset:

# Create data and labels
data = x.reshape(num_points, 1)
labels = y.reshape(num_points, 1)

Plot the input data:

# Plot input data
plt.figure()
plt.scatter(data, labels)
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.title('Input data')

Define a multilayer neural network with two hidden layers. You are free to design a neural network any way you want. For this case, let's have 10 neurons in the first layer and 6 neurons in the second layer.
Our task is to predict the value, so the output layer will contain a single neuron:

# Define a multilayer neural network with 2 hidden layers;
# First hidden layer consists of 10 neurons
# Second hidden layer consists of 6 neurons
# Output layer consists of 1 neuron
nn = nl.net.newff([[min_val, max_val]], [10, 6, 1])

Set the training algorithm to gradient descent:

# Set the training algorithm to gradient descent
nn.trainf = nl.train.train_gd

Train the neural network using the training data that was generated:

# Train the neural network
error_progress = nn.train(data, labels, epochs=2000, show=100, goal=0.01)

Run the neural network on the training datapoints:

# Run the neural network on training datapoints
output = nn.sim(data)
y_pred = output.reshape(num_points)

Plot the training progress:

# Plot training error
plt.figure()
plt.plot(error_progress)
plt.xlabel('Number of epochs')
plt.ylabel('Error')
plt.title('Training error progress')

Plot the predicted output:

# Plot the output
x_dense = np.linspace(min_val, max_val, num_points * 2)
y_dense_pred = nn.sim(x_dense.reshape(x_dense.size, 1)).reshape(x_dense.size)
plt.figure()
plt.plot(x_dense, y_dense_pred, '-', x, y, '.', x, y_pred, 'p')
plt.title('Actual vs predicted')
plt.show()

The full code is given in the file multilayer_neural_network.py. If you run the code, you will get three figures. The first figure shows the input data, the second figure shows the training progress, and the third figure shows the predicted output overlaid on top of the input data. The predicted output seems to follow the general trend. If you continue to train the network and reduce the error, you will see that the predicted output matches the input curve even more accurately. The training progress is also printed on your Terminal.

Summary

In this article, we learnt more about artificial neural networks. We discussed how to build and train neural networks. We also talked about the perceptron and built a classifier based on it, and we covered single-layer as well as multilayer neural networks.

Resources for Article:

Further resources on this subject:

Training and Visualizing a neural network with R [article]

Implementing Artificial Neural Networks with TensorFlow [article]

How to do Machine Learning with Python [article]


Introduction to Deep Learning

Packt
04 Jan 2017
19 min read
In this article by Dipayan Dev, the author of the book Deep Learning with Hadoop, we will see a brief introduction to concept of the deep learning and deep feed-forward networks. "By far the greatest danger of Artificial Intelligence is that people conclude too early that they understand it."                                                                                                                  - Eliezer Yudkowsky Ever thought, why it is often difficult to beat the computer in chess, even by the best players of the game? How Facebook is able to recognize your face among hundreds of millions photos? How your mobile phone can recognize your voice, and redirects the call to the correct person selecting from hundreds of contacts listed? The primary goal of this book is to deal with many of those queries, and to provide detailed solutions to the readers. This book can be used for a wide range of reasons by a variety of readers, however, we wrote the book with two main target audiences in mind. One of the primary target audiences are the undergraduate or graduate university students learning about deep learning and Artificial Intelligence; the second group of readers belongs to the software engineers who already have a knowledge of Big Data, deep learning, and statistical modeling, but want to rapidly gain the knowledge of how deep learning can be used for Big Data and vice versa. This article will mainly try to set the foundation of the readers by providing the basic concepts, terminologies, characteristics, and the major challenges of deep learning. The article will also put forward the classification of different deep network algorithms, which have been widely used by researchers in the last decade. Following are the main topics that this article will cover: Get started with deep learning Deep learning: A revolution in Artificial Intelligence Motivations for deep learning Classification of deep learning networks Ever since the dawn of civilization, people have always dreamt of building some artificial machines or robots which can behave and work exactly like human beings. From the Greek mythological characters to the ancient Hindu epics, there are numerous such examples, which clearly suggest people's interest and inclination towards creating and having an artificial life. During the initial computer generations, people had always wondered if the computer could ever become as intelligent as a human being! Going forward, even in medical science too, the need of automated machines became indispensable and almost unavoidable. With this need and constant research in the same field, Artificial Intelligence (AI) has turned out to be a flourishing technology with its various applications in several domains, such as image processing, video processing, and many other diagnosis tools in medical science too. Although there are many problems that are resolved by AI systems on a daily basis, nobody knows the specific rules for how an AI system is programmed! Few of the intuitive problems are as follows: Google search, which does a really good job of understanding what you type or speak As mentioned earlier, Facebook too, is somewhat good at recognizing your face, and hence, understanding your interests Moreover, with the integration of various other fields, for example, probability, linear algebra, statistics, machine learning, deep learning, and so on, AI has already gained a huge amount of popularity in the research field over the course of time. 
One of the key reasons for he early success of AI could be because it basically dealt with fundamental problems for which the computer did not require vast amount of knowledge. For example, in 1997, IBM's Deep Blue chess-playing system was able to defeat the world champion Garry Kasparov [1]. Although this kind of achievement at that time can be considered as substantial, however, chess, being limited by only a few number of rules, it was definitely not a burdensome task to train the computer with only those number of rules! Training a system with fixed and limited number of rules is termed as hard-coded knowledge of the computer. Many Artificial Intelligence projects have undergone this hard-coded knowledge about the various aspects of the world in many traditional languages. As time progresses, this hard-coded knowledge does not seem to work with systems dealing with huge amounts of data. Moreover, the number of rules that the data were following also kept changing in a frequent manner. Therefore, most of those projects following that concept failed to stand up to the height of expectation. The setbacks faced by this hard-coded knowledge implied that those artificial intelligent systems need some way of generalizing patterns and rules from the supplied raw data, without the need of external spoon-feeding. The proficiency of a system to do so is termed as machine learning. There are various successful machine learning implementations, which we use in our daily life. Few of the most common and important implementations are as follows: Spam detection: Given an e-mail in your inbox, the model can detect whether to put that e-mail in spam or in the inbox folder. A common naive Bayes model can distinguish between such e-mails. Credit card fraud detection: A model that can detect whether a number of transactions performed at a specific time interval are done by the original customer or not. One of the most popular machine learning model, given by Mor-Yosef et al. [1990], used logistic regression, which could recommend whether caesarean delivery is needed for the patient or not! There are many such models, which have been implemented with the help of machine learning techniques: The figure shows the example of different types of representation. Let's say we want to train the machine to detect some empty spaces in between the jelly beans. In the image on the right side, we have sparse jelly beans, and it would be easier for the AI system to determine the empty parts. However, in the image on the left side, we have extremely compact jelly beans, and hence, it will be an extremely difficult task for the machine to find the empty spaces. Images sourced from USC-SIPI image database. A large portion of performance of the machine learning systems depends on the data fed to the system. This is called representation of the data. All the information related to the representation is called feature of the data. For example, if logistic regression is used to detect a brain tumor in a patient, the AI system will not try to diagnose the patient directly! Rather, the concerned doctor will provide the necessary input to the systems according to the common symptoms of that patient. The AI system will then match those inputs with the already received past inputs which were used to train the system. Based on the predictive analysis of the system, it will provide its decision regarding the disease. 
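To make the idea of a model learning from supplied features a little more concrete, here is a minimal sketch (written in R purely for illustration) of a logistic regression classifier. The simulated data, feature names, and coefficients are invented for this example and have nothing to do with the studies cited above:

# Simulate a toy dataset: two numeric features and a binary outcome
set.seed(42)
n <- 200
feature1 <- rnorm(n)
feature2 <- rnorm(n)
outcome  <- rbinom(n, 1, plogis(1.5 * feature1 - 0.8 * feature2))
toy_data <- data.frame(feature1, feature2, outcome)

# Fit a logistic regression model on the supplied features
model <- glm(outcome ~ feature1 + feature2, data = toy_data, family = binomial)

# Predict the probability of the positive class for new observations
new_obs <- data.frame(feature1 = c(0.5, -1.2), feature2 = c(0.1, 0.7))
predict(model, newdata = new_obs, type = "response")

The model can only be as good as the features it is given: swap in features that are unrelated to the outcome and the predictions degrade, which is exactly the dependency on representation discussed next.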
Although logistic regression can learn and decide based on the features given, it cannot influence or modify the way features are defined. For example, if that model was provided with a cesarean patient's report instead of the brain tumor patient's report, it would surely fail to predict the outcome, as the given features would never match with the trained data. This dependency of the machine learning systems on the representation of the data is not really unknown to us! In fact, most of our computer theory performs better based on how the data is represented. For example, the quality of database is considered based on the schema design. The execution of any database query, even on a thousand of million lines of data, becomes extremely fast if the schema is indexed properly. Therefore, the dependency of data representation of the AI systems should not surprise us. There are many such daily life examples too, where the representation of the data decides our efficiency. To locate a person from among 20 people is obviously easier than to locate the same from a crowd of 500 people. A visual representation of two different types of data representation in shown in preceding figure. Therefore, if the AI systems are fed with the appropriate featured data, even the hardest problems could be resolved. However, collecting and feeding the desired data in the correct way to the system has been a serious impediment for the computer programmer. There can be numerous real-time scenarios, where extracting the features could be a cumbersome task. Therefore, the way the data are represented decides the prime factors in the intelligence of the system. Finding cats from among a group of humans and cats could be extremely complicated if the features are not appropriate. We know that cats have tails; therefore, we might like to detect the presence of tails as a prominent feature. However, given the different tail shapes and sizes, it is often difficult to describe exactly how a tail will look like in terms of pixel values! Moreover, tails could sometimes be confused with the hands of humans. Also, overlapping of some objects could omit the presence of a cat's tail, making the image even more complicated. From all the above discussions, it can really be concluded that the success of AI systems depends, mainly, on how the data is represented. Also, various representations can ensnare and cache the different explanatory factors of all the disparities behind the data. Representation learning is one of the most popular and widely practiced learning approaches used to cope with these specific problems. Learning the representations of the next layer from the existing representation of data can be defined as representation learning. Ideally, all representation learning algorithms have this advantage of learning representations, which capture the underlying factors, a subset that might be applicable for each particular sub-task. A simple illustration is given in the following figure: The figure illustrates of representation learning. The middle layers are able to discover the explanatory factors (hidden layers, in blue rectangular boxes). Some of the factors explain each task's target, whereas some explain the inputs. However, while dealing with extracting some high-level data and features from a huge amount of raw data, which requires some sort of human-level understanding, has shown its limitations. There can be many such following examples: Differentiating the cry of two similar age babies. 
Identifying the image of a cat's eye in both day and night times. This becomes clumsy, because a cat's eyes glow at night unlike during daytime. In all these preceding edge cases, representation learning does not appear to behave exceptionally, and shows deterrent behavior. Deep learning, a sub-field of machine learning can rectify this major problem of representation learning by building multiple levels of representations or learning a hierarchy of features from a series of other simple representations and features [2] [8]. The figure shows how a deep learning system can represent the human image through identifying various combinations such as corners, contours, which can be defined in terms of edges. The preceding figure shows an illustration of a deep learning model. It is generally a cumbersome task for the computer to decode the meaning of raw unstructured input data, as represented by this image, as a collection of different pixel values. A mapping function, which will convert the group of pixels to identify the image, is, ideally, difficult to achieve. Also, to directly train the computer for these kinds of mapping looks almost insuperable. For these types of tasks, deep learning resolves the difficulty by creating a series of subset of mappings to reach the desired output. Each subset of mapping corresponds to a different set of layer of the model. The input contains the variables that one can observe, and hence, represented in the visible layers. From the given input, we can incrementally extract the abstract features of the data. As these values are not available or visible in the given data, these layers are termed as hidden layers. In the image, from the first layer of data, the edges can easily be identified just by a comparative study of the neighboring pixels. The second hidden layer can distinguish the corners and contours from the first layer's description of the edges. From this second hidden layer, which describes the corners and contours, the third hidden layer can identify the different parts of the specific objects. Ultimately, the different objects present in the image can be distinctly detected from the third layer. Image reprinted with permission from Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, published by The MIT Press. Deep learning started its journey exclusively since 2006, Hinton et al. in 2006[2]; also Bengio et al. in 2007[3] initially focused on the MNIST digit classification problem. In the last few years, deep learning has seen major transitions from digits to object recognition in natural images. One of the major breakthroughs was achieved by Krizhevsky et al. in 2012 [4] using the ImageNet dataset 4. The scope of this book is mainly limited to deep learning, so before diving into it directly, the necessary definitions of deep learning should be provided. Many researchers have defined deep learning in many ways, and hence, in the last 10 years, it has gone through many explanations too! Following are few of the widely accepted definitions: As noted by GitHub, deep learning is a new area of machine learning research, which has been introduced with the objective of moving machine learning closer to one of its original goals: Artificial Intelligence. Deep learning is about learning multiple levels of representation and abstraction, which help to make sense of data such as images, sound, and text. 
As recently updated by Wikipedia, deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in the data by using a deep graph with multiple processing layers, composed of multiple linear and non-linear transformations. As the definitions suggest, deep learning can also be considered as a special type of machine learning. Deep learning has achieved immense popularity in the field of data science with its ability to learn complex representation from various simple features. To have an in-depth grip on deep learning, we have listed out a few terminologies. The next topic of this article will help the readers lay a foundation of deep learning by providing various terminologies and important networks used for deep learning. Getting started with deep learning To understand the journey of deep learning in this book, one must know all the terminologies and basic concepts of machine learning. However, if the reader has already got enough insight into machine learning and related terms, they should feel free to ignore this section and jump to the next topic of this article. The readers who are enthusiastic about data science, and want to learn machine learning thoroughly, can follow Machine Learning by Tom M. Mitchell (1997) [5] and Machine Learning: a Probabilistic Perspective (2012) [6]. Image shows the scattered data points of social network analysis. Image sourced from Wikipedia. Neural networks do not perform miracles. But if used sensibly, they can produce some amazing results. Deep feed-forward networks Neural networks can be recurrent as well as feed-forward. Feed-forward networks do not have any loop associated in their graph, and are arranged in a set of layers. A network with many layers is said to be a deep network. In simple words, any neural network with two or more layers (hidden) is defined as deep feed-forward network or feed-forward neural network. Figure 4 shows a generic representation of a deep feed-forward neural network. Deep feed-forward network works on the principle that with an increase in depth, the network can also execute more sequential instructions. Instructions in sequence can offer great power, as these instructions can point to the earlier instruction. The aim of a feed-forward network is to generalize some function f. For example, classifier y=f/(x) maps from input x to category y. A deep feed-forward network modified the mapping, y=f(x; α), and learns the value of the parameter α, which gives the most appropriate value of the function. The following figure shows a simple representation of the deep-forward network to provide the architectural difference with the traditional neural network. Deep neural network is feed-forward network with many hidden layers: Datasets are considered to be the building blocks of a learning process. A dataset can be defined as a collection of interrelated sets of data, which is comprised of separate entities, but which can be used as a single entity depending on the use-case. The individual data elements of a dataset are called data points. The preceding figure gives the visual representation of the following data points: Unlabeled data: This part of data consists of human-generated objects, which can be easily obtained from the surroundings. Some of the examples are X-rays, log file data, news articles, speech, videos, tweets, and so on. Labelled data: Labelled data are normalized data from a set of unlabeled data. 
These types of data are usually well formatted, classified, tagged, and easily understandable by human beings for further processing. From the top-level understanding, the machine learning techniques can be classified as supervised and unsupervised learning based on how their learning process is carried out. Unsupervised learning In unsupervised learning algorithms, there is no desired output from the given input datasets. The system learns meaningful properties and features from its experience during the analysis of the dataset. During deep learning, the system generally tries to learn from the whole probability distribution of the data points. There are various types of unsupervised learning algorithms too, which perform clustering, which means separating the data points among clusters of similar types of data. However, with this type of learning, there is no feedback based on the final output, that is, there won't be any teacher to correct you! Figure 6 shows a basic overview of unsupervised clustering. A real life example of an unsupervised clustering algorithm is Google News. When we open a topic under Google News, it shows us a number of hyper-links redirecting to several pages. Each of those topics can be considered as a cluster of hyper-links that point to independent links. Supervised learning In supervised learning, unlike unsupervised learning, there is an expected output associated with every step of the experience. The system is given a dataset, and it already knows what the desired output will look like, along with the correct relationship between the input and output of every associated layer. This type of learning is often used for classification problems. A visual representation is given in Figure 7. Real-life examples of supervised learning are face detection, face recognition, and so on. Although supervised and unsupervised learning look like different identities, they are often connected to each other by various means. Hence, that fine line between these two learnings is often hazy to the student fraternity. The preceding statement can be formulated with the following mathematical expression: The general product rule of probability states that for an n number of datasets n ε ℝk, the joint distribution can be given fragmented as follows: The distribution signifies that the appeared unsupervised problem can be resolved by k number of supervised problems. Apart from this, the conditional probability of p (k | n), which is a supervised problem, can be solved using unsupervised learning algorithms to experience the joint distribution of p (n, k): Although these two types are not completely separate identities, they often help to classify the machine learning and deep learning algorithms based on the operations performed. Generally speaking, cluster formation, identifying the density of a population based on similarity, and so on are termed as unsupervised learning, whereas, structured formatted output, regression, classification, and so on are recognized as supervised learning. Semi-supervised learning As the name suggests, in this type of learning, both labelled and unlabeled data are used during the training. It's a class of supervised learning, which uses a vast amount of unlabeled data during training. For example, semi-supervised learning is used in Deep belief network (explained network), a type of deep network, where some layers learn the structure of the data (unsupervised), whereas one layer learns how to classify the data (supervised learning). 
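As a hedged sketch of the expressions referred to above, written in our own notation (treat the exact symbols as assumptions rather than the book's), the product rule factorizes a joint distribution over a k-dimensional observation into a chain of conditionals:

p(x_1, \ldots, x_k) = p(x_1) \prod_{i=2}^{k} p(x_i \mid x_1, \ldots, x_{i-1})

and the conditional probability that links the supervised and unsupervised views is

p(k \mid n) = \frac{p(n, k)}{p(n)}

which is why an unsupervised estimate of the joint distribution can be used to answer a supervised question, and vice versa.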
In semi-supervised learning, unlabeled data from p (n) and labelled data from p (n, k) are used to predict the probability of k, given the probability of n, or p (k | n): Figure shows the impact of a large amount of unlabeled data during the semi-supervised learning technique. Art the top, it shows the decision boundary that the model puts after distinguishing the white and black circle. The figure at the bottom displays another decision boundary, which the model embraces. In that dataset, in addition to two different categories of circles, a collection of unlabeled data (grey circle) is also annexed. This type of training can be viewed as creating the cluster, and then marking those with the labelled data, which moves the decision boundary away from the high-density data region. Figure obtained from Wikipedia. Deep learning networks are all about representation of data. Therefore, semi-supervised learning is, generally, about learning a representation, whose objective function is given by the following: l = f (n) The objective of the equation is to determine the representation-based cluster. The preceding figure depicts the illustration of a semi-supervised learning. Readers can refer to Chapelle et al.'s book [7] to know more about semi-supervised learning methods. So, as we have already got a foundation of what Artificial Intelligence, machine learning, representation learning are, we can move our entire focus to elaborate on deep learning with further description. From the previously mentioned definition of deep learning, two major characteristics of deep learning can be pointed out as follows: A way of experiencing unsupervised and supervised learning of the feature representation through successive knowledge from subsequent abstract layers A model comprising of multiple abstract stages of non-linear information processing Summary In this article, we have explained most of these concepts in detail, and have also classified the various algorithms of deep learning.


Notes from the field

Packt
03 Jan 2017
7 min read
In this article by Donabel Santos author of the book Tableau 10 Business Intelligence Cookbook would like to offer you perhaps a personal, and maybe a not-so-conventional way to introduce Tableau. I’d like to highlight a few key concepts and tricks that I think would be useful to you as you go along. These are certainly points I highlight on the board whenever I do training on Tableau. If you feel like we are jumping too far ahead, please go ahead and start with the following section Tableau Primer. Come back to this section when you are ready for the tips and tricks. (For more resources related to this topic, see here.) Instead of thinking of Tableau as this software tool that has a steep learning curve, it is useful to think of it as a blank slate. You will draw on it, keep on adding things, removing things until something makes sense or something insightful pops out. After you work with Tableau for a while and get more comfortable with its functionalities, it might even feel like an extension of your brain to some degree. When you get access to data, you might automatically open Tableau to try and understand what’s in that data. Undo is your best friend Do not be afraid to make mistakes, and do not be afraid to explore in Tableau. Do not come in with strict prejudice – for example thinking that you can only use a time series graph when you have a measure and a date field. The best way to learn and explore how powerful Tableau is to try anything and everything. It’s one of the best tools to experiment. If you make a mistake, or if you don’t like what you see, no sweat. Just click on this friendly undo button and you are back to your previous view. If you are more of a shortcut person, it will be Ctrl + Z on a PC or Command + Z on a Mac. It doesn’t change your original data This is another common concern that comes up in my training sessions or whenever I talk to people about Tableau. No, Tableau does not write back to your data source. All the changes you make will be stored in Tableau like creating calculated fields, changing data types, editing aliases will be stored in your Tableau workbook or data source. Drag and drop Tableau is a highly drag and drop software. Although you can use the menu or a right click instead of a drag and drop for the same tasks, dragging and dropping is often faster. It also flows with your train of thought. Look for visual cues Tableau leverages its visual culture in your design area, so when you create views in Tableau, some of the visual cues and icons can help you along the way. A number of the visual cues have been discussed in this section. However, there may be some lesser known (or less noticeable) visual cues: Italicized field names mean they are Tableau-generated fields: Dual axis charts create fused pills. Notice the area when the two pills touch – they’re straight instead of curved: When you zoom in to maps, or when you search for a place, your map gets pinned (or fixed to this place) until you unpin it: Know the difference between blue (discrete) and green (continuous) Knowing the difference between blue and green will take you far in the Tableau world. The data type icons you will find beside your field names in the side bar are colored either blue or green. When you drag fields onto shelves and cards, the pills are also colored blue and green. Simply speaking, blue means discrete and green means continuous. Discrete means individual, separate, countable and finite. 
Continuous means range, and technically, there is an infinite number of values within this range. What’s more important is how these are manifested in Tableau. A blue discrete field will produce header, and a green continuous field will produce an axis. If dropped onto the Color shelf, for example, a blue discrete field will use individual, finite colors. A green continuous field will use a range (gradient) of colors. Some confusion also arises when we see that, by default, Tableau places numeric fields under Measures and are colored green, and categorical information under Dimensions are colored blue. These won’t always be the case. We can have numeric values that are discrete – for example an Order Number. We can also see non-numerical, discrete fields under Measures. Learn a few key shortcuts Shortcuts are great, but it’s typically faster to work when you know a few of them. Here are some of my favorite shortcuts: Shortcut What it does Right click + Drag Opens the Drop Field menu, which allows you to specify exactly which variation of the field you want to use Double click Adds the field to the view I particularly like this when creating text tables. After you place your first measure in Text, you can add more measures to your text table by double clicking on the succeeding measures Ctrl + Arrow Adjusts the height/width of the rows/columns in the view Ctrl + H Presentation mode You can find the complete list of shortcuts here: http://bit.ly/tableau-shortcuts Unpackage option The .twbx file is a Tableau packaged workbook, which means it packages local files with your Tableau workbook. When you right click a .twbx file in a machine that has Tableau Desktop installed in it, you will see a new option called Unpackage. When you unpack a .twbx file, you will get the .twb file and another folder that contains all the local files that were used in the original workbook: Just keep in mind that data (at least the file-based data sources and extracts) get packaged with your .twbx files. This is an important security and data governance consideration when you are deciding how to share your workbooks with others. Table calculations are calculations on your table. How you structure or lay out your table (or view) will affect your table calculations. Table calculations are highly influenced by: Layout Filters Scope and Direction Let’s say, for example, you are calculating Percent of Total in your view. If you swap the fields in your Rows and Columns, i.e. changing the layout, your numbers will change If you filter some of the products out, your numbers will change If you decide to compute Pane Down instead of Table Across, your numbers will change If you’re looking for the common use cases for table calculations, check out the Tableau article entitled Top 10 Tableau Table Calculations which can be found here: http://bit.ly/top10tablecalcs LODs Rock Many of the tasks that required complex table calculations or data blending have been greatly simplified by LODs (Level of Detail expressions). LODs allow us to have multiple levels of detail within a single view, and this increases the possibilities in Tableau. To learn more about Level of Detail expressions, I encourage you to check out the following: Understanding Level of Detail Expressions: http://bit.ly/UnderstandingLOD Top 15 LOD Expressions: http://bit.ly/top15LOD It is possible …. Another common question that comes up is can I do <this> or is it possible to do <this>. 
The answer to many of the questions is yes, and many will include calculations and/or parameters. However, not all solutions will be quick and straightforward. Some may require multiple calculated fields, table calculations, LOD expressions, regular expressions, R scripts etc. Summary In this article we have seen the basics of Tableau as this software tool that has a steep learning curve, it is useful to think of it as a blank slate. You will draw on it, keep on adding things, removing things until something makes sense or something insightful pops out. After you work with Tableau for a while and get more comfortable with its functionalities, it might even feel like an extension of your brain to some degree. When you get access to data, you might automatically open Tableau to try and understand what’s in that data. Resources for Article: Further resources on this subject: Say Hi to Tableau [article] Getting Started with Tableau Public [article] R and its Diverse Possibilities [article]

Dimensionality Reduction

Packt
03 Jan 2017
15 min read
In this article by Ashish Kumar and Avinash Paul, the authors of the book Mastering Text Mining with R, we will look at dimensionality reduction. Data volume and high dimensionality pose an astounding challenge in text mining tasks. The inherent noise and the computational cost of processing huge datasets make it even more arduous. The science of dimensionality reduction lies in the art of losing only a commensurately small amount of information while still reducing the high-dimensional space to a manageable proportion. (For more resources related to this topic, see here.) For classification and clustering techniques to be applied to text data, for different natural language processing activities, we need to reduce the dimensions and noise in the data so that each document can be represented using fewer dimensions, thus significantly reducing the noise that can hinder performance.

The curse of dimensionality

Topic modeling and document clustering are common text mining activities, but the text data can be very high-dimensional, which can cause a phenomenon called the curse of dimensionality. Some literature also calls it concentration of measure:

Distance is attributed to all the dimensions and assumes each of them to have the same effect on the distance. The higher the dimensions, the more similar things appear to each other.

The similarity measures do not take into account the association of attributes, which may result in inaccurate distance estimation.

The number of samples required per attribute increases exponentially with the increase in dimensions.

A lot of dimensions might be highly correlated with each other, thus causing multi-collinearity.

Extra dimensions cause a rapid volume increase that can result in high sparsity, which is a major issue in any method that requires statistical significance. It also causes huge variance in estimates, near duplicates, and poor predictors.

Distance concentration and computational infeasibility

Distance concentration is a phenomenon associated with high-dimensional space wherein pairwise distances or dissimilarities between points appear indistinguishable. All the vectors in high dimensions appear to be orthogonal to each other. The distances between each data point and its neighbors, farthest or nearest, become equal. This totally jeopardizes the utility of methods that use distance-based measures; a short R sketch at the end of this section illustrates the effect. Let's consider that the number of samples is n and the number of dimensions is d. If d is very large, the number of samples may prove to be insufficient to accurately estimate the parameters. For a dataset with d dimensions, the number of parameters in the covariance matrix will be d^2. In an ideal scenario, n should be much larger than d^2 to avoid overfitting. It may feel like a good idea to engineer more features when we are not able to solve a problem with a smaller number of features, but the computational cost and model complexity increase with the number of dimensions. For instance, if n samples are dense enough for a one-dimensional feature space, then n^k samples would be required to achieve the same density in a k-dimensional feature space.

Dimensionality reduction

The complex and noisy characteristics of high-dimensional textual data can be handled by dimensionality reduction techniques. These techniques reduce the dimension of the textual data while still preserving its underlying statistics.
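Here is the promised sketch of distance concentration, using only base R and randomly generated data (so the exact numbers will vary). It shows how the gap between the nearest and farthest pairwise distances shrinks as the number of dimensions grows:

set.seed(7)
n_points <- 100
for (d in c(2, 10, 100, 1000)) {
  # n_points random points in a d-dimensional unit hypercube
  x <- matrix(runif(n_points * d), nrow = n_points)
  pd <- as.vector(dist(x))   # all pairwise Euclidean distances
  # Relative contrast: how far apart are the extremes compared to the minimum?
  contrast <- (max(pd) - min(pd)) / min(pd)
  cat(sprintf("d = %4d  relative contrast = %.3f\n", d, contrast))
}

As d increases, the relative contrast drops towards zero, which is exactly why nearest-neighbor style reasoning becomes unreliable in very high-dimensional term spaces.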
Though the dimensions are reduced, it is important to preserve the inter-document relationships. The idea is to have the minimum number of dimensions that can preserve the intrinsic dimensionality of the data. A textual collection is mostly represented in the form of a term-document matrix wherein we have the importance of each term in a document. The dimensionality of such a collection increases with the number of unique terms. The simplest possible dimensionality reduction method would be to specify a limit or boundary on the distribution of different terms in the collection. Any term that occurs with a significantly high frequency is not going to be informative for us, and the barely present terms can undoubtedly be ignored and considered as noise. Words that generally occur with high frequency and have no particular meaning are referred to as stop words; some examples are is, was, then, and the. Words that occur just once or twice are more likely to be spelling errors or complicated words, and hence both these and stop words should not be considered for modeling the document in the Term Document Matrix (TDM). We will discuss a few dimensionality reduction techniques in brief and dive into their implementation using R.

Principal component analysis

Principal component analysis (PCA) reveals the internal structure of a dataset in a way that best explains the variance within the data. PCA identifies patterns to reduce the dimensions of the dataset without significant loss of information. The main aim of PCA is to project a high-dimensional feature space into a smaller subset to decrease computational cost. PCA helps in computing new features, which are called principal components; these principal components are uncorrelated linear combinations of the original features projected in the direction of higher variability. The important point is to map the set of features into a matrix, M, and compute the eigenvalues and eigenvectors. Eigenvectors provide simpler solutions to problems that can be modeled using linear transformations along axes by stretching, compressing, or flipping. Eigenvalues provide the length and magnitude of the eigenvectors where such transformations occur. Eigenvectors with greater eigenvalues are selected in the new feature space because they enclose more information than eigenvectors with lower eigenvalues for a data distribution. The first principal component has the greatest possible variance, that is, the largest eigenvalue; each subsequent principal component has the highest variance possible while being uncorrelated with the preceding components. The nth PC is the linear combination of maximum variance that is uncorrelated with all previous PCs. PCA comprises the following steps:

Compute the n-dimensional mean of the given dataset.

Compute the covariance matrix of the features.

Compute the eigenvectors and eigenvalues of the covariance matrix.

Rank/sort the eigenvectors by descending eigenvalue.

Choose x eigenvectors with the largest eigenvalues.

Eigenvector values represent the contribution of each variable to the principal component axis. Principal components are oriented in the direction of maximum variance in m-dimensional space. PCA is one of the most widely used multivariate methods for discovering meaningful, new, informative, and uncorrelated features. This methodology also reduces dimensionality by rejecting low-variance features and is useful in reducing the computational requirements for classification and regression analysis.
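Before turning to R's built-in functions, here is a minimal sketch of the steps listed above carried out manually in base R; the simulated matrix and the choice of three components are assumptions made only for illustration:

set.seed(123)
data_mat <- matrix(rnorm(1000 * 9), ncol = 9)   # 1,000 observations, 9 variables

# Steps 1-2: centre the data and compute the covariance matrix
centered <- scale(data_mat, center = TRUE, scale = FALSE)
cov_mat  <- cov(centered)

# Steps 3-4: eigen-decomposition; eigen() returns eigenvalues in descending order
eig <- eigen(cov_mat)

# Step 5: keep the eigenvectors with the largest eigenvalues and project the data
k <- 3
scores <- centered %*% eig$vectors[, 1:k]

# Proportion of variance explained by the first k components
sum(eig$values[1:k]) / sum(eig$values)

# The same projection via prcomp() should agree up to the sign of the axes
head(prcomp(data_mat)$x[, 1:k])

The manual route makes the mechanics explicit; in practice the built-in functions described next are preferred because they handle scaling, numerical accuracy, and reporting for you.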
Using R for PCA

R has two inbuilt functions for performing PCA: prcomp() and princomp(). These two functions expect the dataset to be organized with variables in columns and observations in rows, and to have a structure like a data frame. They also return the new data in the form of a data frame, with the principal components given in columns. prcomp() and princomp() are similar functions used for accomplishing PCA; they have slightly different implementations for computing it. Internally, the princomp() function performs PCA using eigenvectors. The prcomp() function uses a similar technique known as singular value decomposition (SVD). SVD has slightly better numerical accuracy, so prcomp() is generally the preferred function. princomp() fails in situations where the number of variables is larger than the number of observations. Each function returns a list whose class is prcomp or princomp. The information returned and the terminology are summarized in the following table:

prcomp()   princomp()   Explanation
sdev       sdev         Standard deviation of each column
rotation   loadings     Principal components
center     center       Subtracted value of each row or column to get the centered data
scale      scale        Scale factors used
x          scores       The rotated data
           n.obs        Number of observations of each variable
           call         The call to the function that created the object

Here's a list of the functions available in different R packages for performing PCA:

PCA(): FactoMineR package
acp(): amap package
prcomp(): stats package
princomp(): stats package
dudi.pca(): ade4 package
pcaMethods: this package from Bioconductor has various convenient methods to compute PCA

Understanding the FactoMineR package

FactoMineR is an R package that provides multiple functions for multivariate data analysis and dimensionality reduction. The functions provided in the package deal not only with quantitative data but also with categorical data. Apart from PCA, correspondence and multiple correspondence analyses can also be performed using this package:

library(FactoMineR)
data <- replicate(10, rnorm(1000))
result.pca = PCA(data[,1:9], scale.unit=TRUE, graph=T)
print(result.pca)

print(result.pca) reports the results of the principal component analysis: the analysis was performed on 1,000 individuals, described by nine variables, and the results are available in the following objects:

Name              Description
$eig              Eigenvalues
$var              Results for the variables
$var$coord        Coordinates for the variables
$var$cor          Correlations variables - dimensions
$var$cos2         cos2 for the variables
$var$contrib      Contributions of the variables
$ind              Results for the individuals
$ind$coord        Coordinates for the individuals
$ind$cos2         cos2 for the individuals
$ind$contrib      Contributions of the individuals
$call             Summary statistics
$call$centre      Mean of the variables
$call$ecart.type  Standard error of the variables
$call$row.w       Weights for the individuals
$call$col.w       Weights for the variables

The eigenvalues, percentage of variance, and cumulative percentage of variance per component look as follows:

comp 1  1.1573559  12.859510   12.85951
comp 2  1.0991481  12.212757   25.07227
comp 3  1.0553160  11.725734   36.79800
comp 4  1.0076069  11.195632   47.99363
comp 5  0.9841510  10.935011   58.92864
comp 6  0.9782554  10.869505   69.79815
comp 7  0.9466867  10.518741   80.31689
comp 8  0.9172075  10.191194   90.50808
comp 9  0.8542724   9.491916  100.00000

Amap package

amap is another package in the R environment that provides tools for clustering and PCA. It is an acronym for Another Multidimensional Analysis Package. One of the most widely used functions in this package is acp(), which performs PCA on a data frame.
This function is akin to princomp() and prcomp(), except that it has a slightly different graphic representation. For more intricate details, refer to the CRAN-R resource page: https://cran.r-project.org/web/packages/amap/amap.pdf

library(amap)
acp(data, center=TRUE, reduce=TRUE)

Additionally, weight vectors can also be provided as an argument. We can perform a robust PCA by using the acpgen function in the amap package:

acpgen(data, h1, h2, center=TRUE, reduce=TRUE, kernel="gaussien")
K(u, kernel="gaussien")
W(x, h, D=NULL, kernel="gaussien")
acprob(x, h, center=TRUE, reduce=TRUE, kernel="gaussien")

Proportion of variance

We look to construct components and to choose from them the minimum number of components that explains the variance of the data with high confidence. R has a prcomp() function in the base package to estimate principal components. Let's learn how to use this function to estimate the principal components and the proportion of variance they explain:

pca_base <- prcomp(data)
print(pca_base)

The pca_base object contains the standard deviations and rotations of the vectors. The rotations are also known as the principal components of the data. Let's find out the proportion of variance each component explains:

pr_variance <- (pca_base$sdev^2/sum(pca_base$sdev^2))*100
pr_variance
[1] 11.678126 11.301480 10.846161 10.482861 10.176036  9.605907  9.498072
[8]  9.218186  8.762572  8.430598

pr_variance signifies the proportion of variance explained by each component in descending order of magnitude. Let's calculate the cumulative proportion of variance for the components:

cumsum(pr_variance)
[1]  11.67813  22.97961  33.82577  44.30863  54.48467  64.09057  73.58864
[8]  82.80683  91.56940 100.00000

Components 1-8 explain about 82% of the variance in the data.

Singular value decomposition

Singular value decomposition (SVD) is a dimensionality reduction technique that gained a lot of popularity after the famous Netflix movie recommendation challenge. Since its inception, it has found usage in many applications in statistics, mathematics, and signal processing. It is primarily a technique to factorize any matrix, real or complex. A rectangular matrix can be factorized into two orthonormal matrices and a diagonal matrix of positive real values. An m*n matrix can be considered as m points in n-dimensional space; SVD attempts to find the best k-dimensional subspace that fits the data. SVD in R is used to compute approximations of singular values and singular vectors of large-scale data matrices. These approximations are made using memory-efficient algorithms, and IRLBA (the implicitly restarted Lanczos bi-diagonalization algorithm) is one of them. We shall be using the irlba package here in order to implement SVD.
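Before the irlba-based implementation that follows, here is a tiny base-R illustration of what an SVD gives us; the random matrix and the rank-3 truncation are assumptions made only for illustration:

set.seed(11)
m <- matrix(rnorm(100 * 10), nrow = 100)

# Full SVD with base R: m = U %*% diag(d) %*% t(V)
s <- svd(m)
s$d                # singular values, in decreasing order

# Rank-k reconstruction keeps only the k largest singular values
k <- 3
m_k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

# Frobenius-norm error of the low-rank approximation
sqrt(sum((m - m_k)^2))

The low-rank reconstruction is the essence of using SVD for dimensionality reduction: the k leading singular vectors capture most of the structure while discarding the noisier directions.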
Implementation of SVD using R

The following code shows an implementation of SVD using R:

# List of packages for the session
packages = c("foreach", "doParallel", "irlba")

# Install CRAN packages (if not already installed)
inst <- packages %in% installed.packages()
if(length(packages[!inst]) > 0) install.packages(packages[!inst])

# Load packages into session
lapply(packages, require, character.only=TRUE)

# Register the parallel session
registerDoParallel(cores=detectCores(all.tests=TRUE))

std_svd <- function(x, k, p=25, iter=1) {
  m1 <- as.matrix(x)
  r <- nrow(m1)
  c <- ncol(m1)
  p <- min(min(r,c)-k, p)
  z <- k+p
  m2 <- matrix(rnorm(z*c), nrow=c, ncol=z)
  y <- m1 %*% m2
  q <- qr.Q(qr(y))
  b <- t(q) %*% m1
  # Iterations
  b1 <- foreach(i=1:iter) %dopar% {
    y1 <- m1 %*% t(b)
    q1 <- qr.Q(qr(y1))
    b1 <- t(q1) %*% m1
  }
  b1 <- b1[[iter]]
  b2 <- b1 %*% t(b1)
  eigens <- eigen(b2, symmetric=T)
  result <- list()
  result$svalues <- sqrt(eigens$values)[1:k]
  u1 = eigens$vectors[1:k,1:k]
  result$u <- (q %*% eigens$vectors)[,1:k]
  result$v <- (t(b) %*% eigens$vectors %*% diag(1/eigens$values))[,1:k]
  return(result)
}

svd <- std_svd(x=data, k=5)

# Singular values
svd$svalues
[1] 35.37645 33.76244 32.93265 32.72369 31.46702

We obtain the following values after running SVD using the IRLBA algorithm:

d: approximate singular values
u: nu approximate left singular vectors
v: nv approximate right singular vectors
iter: number of IRLBA algorithm iterations
mprod: number of matrix vector products performed

These values can be used for obtaining the results of SVD and understanding the overall statistics about how the algorithm performed.

Latent factors:

# svd$u, svd$v
dim(svd$u)   # u value after running IRLBA
[1] 1000 5
dim(svd$v)   # v value after running IRLBA
[1] 10 5

A modified version of the previous function can be achieved by altering the power iterations for a robust implementation:

foreach(i = 1:iter) %dopar% {
  y1 <- m1 %*% t(b)
  y2 <- t(y1) %*% y1
  r2 <- chol(y2, pivot = T)
  q1 <- y2 %*% solve(r2)
  b1 <- t(q1) %*% m1
}
b2 <- b1 %*% t(b1)

Some other functions available in R packages are as follows:

Function     Package
svd()        svd
irlba()      irlba
svdImpute()  bcv

ISOMAP – moving towards non-linearity

ISOMAP is a nonlinear dimensionality reduction method and is representative of isometric mapping methods. ISOMAP is one of the approaches for manifold learning. ISOMAP finds the map that preserves the global, nonlinear geometry of the data by preserving the geodesic manifold inter-point distances. Like multi-dimensional scaling, ISOMAP creates a visual presentation of the distances between a number of objects. A geodesic is the shortest curve along the manifold connecting two points, induced by a neighborhood graph. Multi-dimensional scaling uses the Euclidean distance measure; since the data is in a nonlinear format, ISOMAP uses the geodesic distance instead. ISOMAP can be viewed as an extension of metric multi-dimensional scaling.
At a very high level, ISOMAP can be described in four steps:

Determine the neighbors of each point
Construct a neighborhood graph
Compute the shortest path distances between all pairs of points
Construct k-dimensional coordinate vectors by applying MDS

Geodesic distance approximation is basically calculated in three ways:

Neighboring points: input-space distance
Faraway points: a sequence of short hops between neighboring points
Method: finding shortest paths in a graph with edges connecting neighboring data points

The following example uses the RDRToolbox and vegan packages:

source("http://bioconductor.org/biocLite.R")
biocLite("RDRToolbox")
library('RDRToolbox')
library(rgl)   # provides open3d() and plot3d()

swiss_Data = SwissRoll(N = 1000, Plot=TRUE)
x = SwissRoll()
open3d()
plot3d(x, col=rainbow(1050)[-c(1:50)], box=FALSE, type="s", size=1)
simData_Iso = Isomap(data=swiss_Data, dims=1:10, k=10, plotResiduals=TRUE)

library(vegan)
data(BCI)
distance <- vegdist(BCI)
tree <- spantree(distance)
pl1 <- ordiplot(cmdscale(distance), main="cmdscale")
lines(tree, pl1, col="red")
z <- isomap(distance, k=3)
rgl.isomap(z, size=4, color="red")
pl2 <- plot(isomap(distance, epsilon=0.5), main="isomap epsilon=0.5")
pl3 <- plot(isomap(distance, k=5), main="isomap k=5")
pl4 <- plot(z, main="isomap k=3")

Summary
The idea of this article was to get you familiar with some of the generic dimensionality reduction methods and their implementation using the R language. We discussed a few packages that provide functions to perform these tasks, and we also covered a few custom functions that can be utilized to perform them. Kudos, you have completed the basics of text mining with R. You must be feeling confident about various data mining methods, text mining algorithms (related to natural language processing of texts) and, after reading this article, dimensionality reduction. If you feel a little low on confidence, do not be upset. Turn a few pages back and try implementing those tiny code snippets on your own dataset, and figure out how they help you understand your data. Remember this - to mine something, you have to get into it by yourself. This holds true for text as well.

Resources for Article:
Further resources on this subject:
Data Science with R [Article]
Machine Learning with R [Article]
Data mining [Article]

Recommendation Engines Explained

Packt
02 Jan 2017
10 min read
In this article written by Suresh Kumar Gorakala, author of the book Building Recommendation Engines we will learn how to build a basic recommender system using R. In this article we will learn about various types of recommender systems in detail. This article explains neighborhood-similarity-based recommendations, personalized recommendation engines, model-based recommender systems and hybrid recommendation engines. Following are the different subtypes of recommender systems covered in this article: Neighborhood-based recommendation engines User-based collaborative filtering Item-based collaborative filtering Personalized recommendation engines Content-based recommendation engines Context-aware recommendation engines (For more resources related to this topic, see here.) Neighborhood-based recommendation engines As the name suggests, neighborhood-based recommender systems considers the preferences or likes of other users in the neighborhood before making suggestions or recommendations to the active user. While considering the preferences or tastes of neighbors, we first calculate how similar the other users are to the active user and then new items from more similar users are recommended to the user. Here the active user is the person to whom the system is serving recommendations. Since similarity calculations are involved these recommender systems are also called similarity-based recommender systems. Also since preferences or tastes are considered collaboratively from a pool of users these recommender systems are also called as collaborative filtering recommender systems. In this type of systems the main actors are the users, products and users preference information such as rating/ranking/liking towards the products. Preceding image is an example from Amazon showing collaborative filtering The collaborative filtering systems come in two flavors, They are: User-based collaborative filtering Item-based collaborative filtering Collaborative filtering When we have only the users interaction data of the products such as ratings, like/unlike, view/not viewed and we have to recommend new products then we have to choose Collaborative filtering approach. User-based Collaborative Filtering The basic intuition behind user-based collaborative filtering systems is, people with similar tastes in the past like similar items in future as well. For example, if user A and user B have very similar purchase history and if user A buys a new book which user B has not yet seen then we can suggest this book to User B as they have similar tastes. Item-based Collaborative filtering In this type of recommender systems unlike user-based collaborative filtering, we use similarity between items instead of similarity between users. Basic intuition for item-based recommender systems is if a User likes item A in past he might like item B which is similar to item A. In this approach instead of calculating similarity between users we calculate similarity between items or products. The most common similarity measure used for this approach is cosine similarity. Like user-based collaborative approach, we project the data into vector space and similarity between items is calculated using cosine angle between items. Similar to user-based collaborative filtering approach there are two steps for item-based collaborative approach. They are: Calculating the similarity between items. 
Predicting the ratings for the non-rated items for an active user, by making use of the previous ratings given to other, similar items

Advantages of user-based collaborative filtering
Easy to implement
Neither the content information of the products nor the users' profile information is required for building recommendations
New items are recommended to users, giving a surprise factor to the users

Disadvantages of user-based collaborative filtering
This approach is computationally expensive, as all the user, product, and rating information is loaded into memory for the similarity calculations.
This approach fails for new users, where we do not have any information about the users. This problem is called the cold-start problem.
This approach performs very poorly if we have little data. Since we do not have content information about users or products, we cannot generate recommendations accurately based on rating information alone.

Content-based recommender systems
In collaborative filtering, recommendations are generated by considering only the rating or interaction information of the products by the users; that is, suggesting new items for the active user is based on the ratings given to those new items by users similar to the active user. Assume the case of a person who has given a 4-star rating to a movie. In a collaborative filtering approach, we only consider this rating information for generating recommendations. In reality, a person rates a movie based on the features or content of the movie, such as its genre, actor, director, story, and screenplay. The person also watches a movie based on his personal choices. When we are building a recommendation engine to target users at a personal level, the recommendations should not be based on the tastes of other similar people, but on the individual user's tastes and the contents of the products. Recommendations which are targeted at a personalized level, and which consider individual preferences and the contents of the products, are called content-based recommender systems. Another motivation for building content-based recommendation engines is that they solve the cold-start problem which new users face in the collaborative filtering approach. When a new user comes in, we can suggest new items which are similar to his tastes based on his preferences.

Building content-based recommender systems involves three main steps, as follows:
Generating content information for products.
Generating a user profile and preferences with respect to the features of the products.
Generating recommendations by predicting a list of items which the user might like.

Let us discuss each step in detail:
Content extraction: In this step, we extract the features that represent the product. Most commonly, the content of the products is represented in the vector space model, with product names as rows and features as columns.
User profile generation: In this step, we build the user profile or preference matrix, or a vector space model matching the products' content.
Generating recommendations: Now that we have generated the product content and user profile, the next step will be to generate the recommendations. Recommender systems using machine learning or any other mathematical or statistical models to generate recommendations are called model-based systems.

Cosine similarity
In this approach, we first represent the user profiles and product content in vector form, and then we take the cosine angle between each pair of vectors, as sketched in the short example below.
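As an illustration, here is a small, hypothetical sketch in R of the cosine calculation between a user profile vector and a few product content vectors; the feature names and values are made up purely for demonstration and are not taken from any real catalogue:

# Hypothetical content features (columns) for three movies (rows)
items <- matrix(c(1, 0, 1, 0,
                  0, 1, 0, 1,
                  1, 0, 0, 1),
                nrow=3, byrow=TRUE,
                dimnames=list(c("Movie A", "Movie B", "Movie C"),
                              c("action", "comedy", "drama", "romance")))

# Hypothetical user profile expressed over the same features
user_profile <- c(action=0.8, comedy=0.1, drama=0.7, romance=0.2)

# Cosine similarity between the user profile and each item
cosine <- function(a, b) sum(a*b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
similarities <- apply(items, 1, cosine, b=user_profile)

# Items forming the smallest angle with the profile come first
sort(similarities, decreasing=TRUE)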
The product which forms less angle with the user profile is considered as the most preferable item for the user. This approach is a standard approach while using neighborhood approach for Content based recommendations. Empirical studies shown that this approach gives more accurate results compared to other similarity measures. Classification-based approach Classification-based approaches fall under model-based recommender systems. In this approach, first we build a machine learning model by using the historical information, with user profile similar to the product content as input and the like/dislike of the product as output response classes. Supervised classification tasks such as logistic regression, KNN-classification methods, probabilistic methods and so on can be used. Advantages Content-based recommender systems are targeting at individual level Recommendations are generated using the user preferences alone unlike from user community as in collaborative filtering This approaches can be employed at real time as recommendation model doesn’t need to load the entire data for processing or generating recommendations Accuracy is high compared to collaborative approaches as they deal with the content of the products instead of rating information alone Cold start problem can be easily handled Disadvantages As the system is more personalized and the generated recommendations will become narrowed down to only user preferences with more and more user information comes into the system As a result, no new products that are not related to the user preferences will be shown to the user The user will not be able to look at what is happening around or what’s trending around Context-aware recommender Systems Over the years there has been evolution in recommender systems from neighborhood approaches to personalized recommender systems which are targeted to the individual users. These personalized recommender systems have become a huge success as this is useful at end user level and for organizations these systems become catalysts to increase their business. The personalized recommender systems, also called as content-based recommender systems are also getting evolved into Context aware recommender systems. Though the personalized recommender systems are targeted at individual user level and caters recommendations based on the personal preferences of the users, still there was scope to improve or refine the systems. Same person at different places might have different requirements. Likewise same person has different requirements at different times. Our intelligent recommender systems should be evolved enough to cater to the needs of the users for different places, at different times. Recommender System should be robust enough to suggest cotton shirts to a person during summer and suggesting Leather Jacket in winter. Similarly based on the time of the day suggesting Good restaurants serving a person’s personal choice breakfast and dinner would be very helpful. These kinds of recommender systems which considers location, time, mood, and so on that defines the context of user and suggests personalized recommendations are called context aware recommender systems. At broad level, context aware recommender systems are content-based recommenders with the inclusion of new dimension called context. In context aware systems, recommendations are generated in two steps: Generating list of recommendations of products for each user based on users’ preferences, that is content-based recommendations. 
Filtering out the recommendations that are specific to a current context. For example, based on past transaction history, interaction information, browsing patterns, ratings information on e-wallet mobile app, assume that User A is a movie lover, Sports lover, fitness freak. Using this information the content-based recommender systems generate recommendations of products such as Movie Tickets, 4G data offer for watching Football matches, Discount offers at GYM. Now based on the GPS co-ordinates of the mobile if the User A found to be at a 10K RUN marathon, then my Context aware recommendation engine will take this location information as the context and filters out the offers that are relevant to the current context and recommends Discount Offers at GYM to the user A. Most common approaches for building Context Aware Recommender systems are: Post filtering Approaches Pre-filtering approaches Pre-filtering approaches In pre-filtering approach, context information is applied to the User profile and product content. This step will filter out all the non relevant features and final personalized recommendations are generated on remaining feature set. Since filtering of features are made before generating personalized recommendations, these are called pre-filtering approaches. Post filtering approaches In post-filtering, firstly personalized recommendations are generated based on the user profile and product catalogue then the context information is applied for filtering out the relevant products to the user for the current context. Advantages Context aware systems are much advanced than the personalized content-based recommenders as these systems will be constantly in sync with user movements and generate recommendations as per current context. These systems are more real-time nature. Disadvantages Serendipity or surprise factor as in other personalized recommenders will be missing in this type of recommendations as well. Summary In this article, we have learned about popular recommendation engine techniques such as, collaborative filtering, content-based recommendations, context aware systems, hybrid recommendations, model-based recommendation systems with their advantages and disadvantages. Different similarity methods such as cosine similarity, Euclidean distance, Pearson-coefficient. Sub categories within each of the recommendations are also explained. Resources for Article: Further resources on this subject: Building a Recommendation Engine with Spark [article] Machine Learning Tasks [article] Machine Learning with R [article]

Implementing RethinkDB Query Language

Packt
23 Dec 2016
5 min read
In this article by Shahid Shaikh, the author of the book Mastering RethinkDB, we will cover how you will perform geospatial queries (such as finding all the documents with locations within 5km of a given point). (For more resources related to this topic, see here.) Performing MapReduce operations MapReduce is the programming model to perform operations (mainly aggregation) on distributed set of data across various clusters in different servers. This concept was coined by Google and been used in Google file system initially and later been adopted in open source Hadoop project. MapReduce works by processing the data on each server and then combine it together to form a result set. It actually divides into two operations namely map and reduce. Map: It performs the transformation of the elements in the group or individual sequence Reduce: It performs the aggregation and combine results from Map into meaningful result set In RethinkDB, MapReduce queries operate in three steps: Group operation: To process the data into groups. This step is optional Map operation: To transform the data or group of data into sequence Reduce operation: To aggregate the sequence data to form resultset So mainly it is Group Map Reduce (GMR) operation. RethinkDB spread the mapreduce query across various clusters in order to improve efficiency. There is specific command to perform this GMR operation; however RethinkDB already integrated them internally to some aggregate functions in order to simplify the process. Let us perform some aggregation operation in RethinkDB. Grouping the data To group the data on basis of field we can use group() ReQL function. Here is sample query on our users table to group the data on the basis of name: rethinkdb.table("users").group("name").run(connection,function(err,cursor) { if(err) { throw new Error(err); } cursor.toArray(function(err,data) { console.log(JSON.stringify(data)); }); }); Here is the output for the same: [ { "group":"John", "reduction":[ { "age":24, "id":"664fced5-c7d3-4f75-8086-7d6b6171dedb", "name":"John" }, { "address":{ "address1":"suite 300", "address2":"Broadway", "map":{ "latitude":"116.4194W", "longitude":"38.8026N" }, "state":"Navada", "street":"51/A" }, "age":24, "id":"f6f1f0ce-32dd-4bc6-885d-97fe07310845", "name":"John" } ] }, { "group":"Mary", "reduction":[ { "age":32, "id":"c8e12a8c-a717-4d3a-a057-dc90caa7cfcb", "name":"Mary" } ] }, { "group":"Michael", "reduction":[ { "age":28, "id":"4228f95d-8ee4-4cbd-a4a7-a503648d2170", "name":"Michael" } ] } ] If you observe the query response, data is group by the name and each group is associated with document. Every matching data for the group resides under reductionarray. In order to work on each reductionarray, you can use ungroup() ReQL function which in turns takes grouped streams of data and convert it into array of object. It's useful to perform the operations such as sorting and so on, on grouped values. Counting the data We can count the number of documents present in the table or a sub document of a document using count() method. Here is simple example: rethinkdb.table("users").count().run(connection,function(err,data) { if(err) { throw new Error(err); } console.log(data); });  It should return the number of documents present in the table. You can also use it count the sub document by nesting the fields and running count() function at the end. Sum We can perform the addition of the sequence of data. If value is passed as an expression then sums it up else searches in the field provided in the query. 
For example, to find the sum of all the users' ages:

rethinkdb.table("users")("age").sum().run(connection, function(err, data) {
  if(err) {
    throw new Error(err);
  }
  console.log(data);
});

You can of course use an expression to perform a math operation like this:

rethinkdb.expr([1,3,4,8]).sum().run(connection, function(err, data) {
  if(err) {
    throw new Error(err);
  }
  console.log(data);
});

This should return 16.

Avg
Performs the average of the values passed as an expression, or of the field provided in the query. For example:

rethinkdb.expr([1,3,4,8]).avg().run(connection, function(err, data) {
  if(err) {
    throw new Error(err);
  }
  console.log(data);
});

Min and Max
Finds the maximum and minimum values passed as an expression or taken from a field. For example, to find the oldest user in the database:

rethinkdb.table("users")("age").max().run(connection, function(err, data) {
  if(err) {
    throw new Error(err);
  }
  console.log(data);
});

In the same way, we can find the youngest user:

rethinkdb.table("users")("age").min().run(connection, function(err, data) {
  if(err) {
    throw new Error(err);
  }
  console.log(data);
});

Distinct
Distinct finds and removes duplicate elements from the sequence, just like its SQL counterpart. For example, to find the unique user names:

rethinkdb.table("users")("name").distinct().run(connection, function(err, data) {
  if(err) {
    throw new Error(err);
  }
  console.log(data);
});

It should return an array containing the names:

[ 'John', 'Mary', 'Michael' ]

Contains
Contains looks for a value in the field and returns a boolean response: true if the field contains the value, false otherwise. For example, to check whether a user's name contains John:

rethinkdb.table("users")("name").contains("John").run(connection, function(err, data) {
  if(err) {
    throw new Error(err);
  }
  console.log(data);
});

This should return true.

Map and reduce
Aggregate functions such as count() and sum() already make use of map and reduce internally, and, if required, group() too. You can of course use them explicitly in order to perform various operations.

Summary
In this article, we covered various parts of the RethinkDB query language, which should help readers get to grips with the basic concepts of RethinkDB.

Resources for Article:
Further resources on this subject:
Introducing RethinkDB [article]
Amazon DynamoDB - Modelling relationships, Error handling [article]
Oracle 12c SQL and PL/SQL New Features [article]

Say Hi to Tableau

Packt
21 Dec 2016
9 min read
In this article by Shweta Savale, the author of the book Tableau Cookbook- Recipes for Data Visualization, we will cover how you need to install My Tableau Repository and connecting to the sample data source. (For more resources related to this topic, see here.) Introduction to My Tableau Repository and connecting to the sample data source Tableau is a very versatile tool and it is used across various industries, businesses, and organizations, such as government and non-profit organizations, BFSI sector, consulting, construction, education, healthcare, manufacturing, retail, FMCG, software and technology, telecommunications, and many more. The good thing about Tableau is that it is industry and business vertical agnostic, and hence as long as we have data, we can analyze and visualize it. Tableau can connect to a wide variety of data sources and many of the data sources are implemented as native connections in Tableau. This ensures that the connections are as robust as possible. In order to view the comprehensive list of data sources that Tableau connects to, we can visit the technical specification page on the Tableau website by clicking on the following link: http://www.tableau.com/products/desktop?qt-product_tableau_desktop=1#qt-product_tableau_desktops. Getting ready Tableau provides some sample datasets with the Desktop edition. In this article, we will frequently be using the sample datasets that have been provided by Tableau. We can find these datasets in the Data sources directory in the My Tableau Repository folder, which gets created in our Documents folder when Tableau Desktop is installed on our machine. We can look for these data sources in the repository or we can quickly download it from the link mentioned and save it in a new folder called Tableau Cookbook data under Documents/My Tableau Repository/Datasources. The link for downloading the sample datasets is as follows: https://1drv.ms/f/s!Av5QCoyLTBpngihFyZaH55JpI5BN There are two files that have been uploaded. They are as follows: Microsoft Excel data called Sample - Superstore.xls Microsoft Access data called Sample - Coffee Chain.mdb In the following section, we will see how to connect to the sample data source. We will be connecting to the Excel data called Sample - Superstore.xls. This Excel file contains transactional data for a retail store. There are three worksheets in this Excel workbook. The first sheet, which is called the Orders sheet, contains the transaction details; the Returns sheet contains the status of returned orders, and the People sheet contains the region names and the names of managers associated with those regions. Refer to the following screenshot to get a glimpse of how the Excel data is structured: Now that we have taken a look at the Excel data, let us see how to connect to this Excel data in the following recipe. To begin with, we will work on the Orders sheet of the Sample - Superstore.xls data. This worksheet contains the order details in terms of the products purchased, the name of the customer, Sales, Profits, Discounts offered, day of purchase, order shipment date, among many other transactional details. How to do it… Let’s open Tableau Desktop by double-clicking on the Tableau 10.0 icon on our Desktop. We can also right-click on the icon and select Open. We will see the start page of Tableau, as shown in the following screenshot: We will select the Excel option from under the Connect header on the left-hand side of the screen. 
Once we do that, we will have to browse the Excel file called Sample - Superstore.xls, which is saved in Documents/My Tableau Repository/Datasources/Tableau Cookbook data. Once we are able to establish a connection to the referred Excel file, we will get a view as shown in the following screenshot: Annotation 1 in the preceding screenshot is the data that we have connected to, and annotation 2 is the list of worksheets/tables/views in our data. Double-click on the Orders sheet or drag and drop the Orders sheet from the left-hand side section into the blank space that says Drag sheets here. Refer to annotation 3 in the preceding screenshot. Once we have selected the Orders sheet, we will get to see the preview of our data, as highlighted in annotation 1 in the following screenshot. We will see the column headers, their data type (#, Abc, and so on), and the individual rows of data: While connecting to a data source, we can also read data from multiple tables/sheets from that data source. However, this is something that we will explore a little later. Further moving ahead, we will need to specify what type of connection we wish to maintain with the data source. Do we wish to connect to our data directly and maintain a Live connectivity with it, or do we wish to import the data into Tableau's data engine by creating an Extract? Refer to annotation 2 in the preceding screenshot. We will understand these options in detail in the next section. However, to begin with, we will select the Live option. Next, in order to get to our Tableau workspace where we can start building our visualizations, we will click on the Go to Worksheet option/ Sheet 1. Refer to annotation 3 in the preceding screenshot. This is how we can connect to data in Tableau. In case we have a database to connect to, then we can select the relevant data source from the list and fill in the necessary information in terms of server name, username, password, and so on. Refer to the following screenshot to see what options we get when we connect to Microsoft SQL Server: How it works… Before we connect to any data, we need to make sure that our data is clean and in the right format. The Excel file that we connected to was stored in a tabular format where the first row of the sheet contains all the column headers and every other row is basically a single transaction in the data. This is the ideal data structure for making the best use of Tableau. Typically, when we connect to databases, we would get columnar/tabular data. However, flat files such as Excel can have data even in cross-tab formats. Although Tableau can read cross-tab data, we may face certain limitations in terms of options for viewing, aggregating, and slicing and dicing our data in Tableau. Having said that, there may be situations where we have to deal with such cross-tab or pre-formatted Excel files. These files will essentially need cleaning up before we pull into Tableau. Refer to the following article to understand more about how we can clean up these files and make them Tableau ready: http://onlinehelp.tableau.com/current/pro/desktop/en-us/help.htm#data_tips.html In case it is a cross-tab file, then we will have to pivot it into normalized columns either at the data level or on the fly at Tableau level. We can do so by selecting multiple columns that we wish to pivot and then selecting the Pivot option from the dropdown that appears when we hover over any of the columns. 
Refer to the following screenshot: If the format of the data in our Excel file is not suitable for analysis in Tableau, then we can turn on the Data Interpreter option, which becomes available when Tableau detects any unique formatting or any extra information in our Excel file. For example, the Excel data may include some empty rows and columns, or extra headers and footers. Refer to the following screenshot: Data Interpreter can remove that extra information to help prepare our Tableau data source for analysis. Refer to the following screenshot: When we enable the Data Interpreter, the preceding view will change to what is shown in the following screenshot: This is how the Data Interpreter works in Tableau. Now many a times, there may also be situations where our data fields are compounded or clubbed in a single column. Refer to the following screenshot: In the preceding screenshot, the highlighted column is basically a concatenated field that has the Country, City, and State. For our analysis, we may want to break these and analyze each geographic level separately. In order to do so, we simply need to use the Split or Custom Split…option in Tableau. Refer to the following screenshot: Once we do that, our view would be as shown in the following screenshot: When preparing data for analysis, at times a list of fields may be easy to consume as against the preview of our data. The Metadata grid in Tableau allows us to do the same along with many other quick functions such as renaming fields, hiding columns, changing data types, changing aliases, creating calculations, splitting fields, merging fields, and also pivoting the data. Refer to the following screenshot: After having established the initial connectivity by pointing to the right data source, we need to specify as to how we wish to maintain that connectivity. We can choose between the Live option and Extract option. The Live option helps us connect to our data directly and maintains a live connection with the data source. Using this option allows Tableau to leverage the capabilities of our data source and in this case, the speed of our data source will determine the performance of our analysis. The Extract option on the other hand, helps us import the entire data source into Tableau's fast data engine as an extract. This option basically creates a .tde file, which stands for Tableau Data Extract. In case we wish to extract only a subset of our data, then we can select the Edit option, as highlighted in the following screenshot. The Add link in the right corner helps us add filters while fetching the data into Tableau. Refer to the following screenshot: A point to remember about Extract is that it is a snapshot of our data stored in a Tableau proprietary format and as opposed to a Live connection, the changes in the original data won't be reflected in our dashboard unless and until the extract is updated. Please note that we will have to decide between Live and Extract on a case to case basis. Please refer to the following article for more clarity: http://www.tableausoftware.com/learn/whitepapers/memory-or-live-data Summary This article thus helps us to install and connect to sample data sources which is very helpful to create effective dashboards in business environment for statistical purpose. Resources for Article: Further resources on this subject: Getting Started with Tableau Public [article] Data Modelling Challenges [article] Creating your first heat map in R [article]

R and its Diverse Possibilities

Packt
16 Dec 2016
11 min read
In this article by Jen Stirrup, the author of the book Advanced Analytics with R and Tableau, We will cover, with examples, the core essentials of R programming such as variables and data structures in R such as matrices, factors, vectors, and data frames. We will also focus on control mechanisms in R ( relational operators, logical operators, conditional statements, loops, functions, and apply) and how to execute these commands in R to get grips before proceeding to article that heavily rely on these concepts for scripting complex analytical operations. (For more resources related to this topic, see here.) Core essentials of R programming One of the reasons for R’s success is its use of variables. Variables are used in all aspects of R programming. For example, variables can hold data, strings to access a database, whole models, queries, and test results. Variables are a key part of the modeling process, and their selection has a fundamental impact on the usefulness of the models. Therefore, variables are an important place to start since they are at the heart of R programming. Variables In the following section we will deal with the variables—how to create variables and working with variables. Creating variables It is very simple to create variables in R, and to save values in them. To create a variable, you simply need to give the variable a name, and assign a value to it. In many other languages, such as SQL, it’s necessary to specify the type of value that the variable will hold. So, for example, if the variable is designed to hold an integer or a string, then this is specified at the point at which the variable is created. Unlike other programming languages, such as SQL, R does not require that you specify the type of the variable before it is created. Instead, R works out the type for itself, by looking at the data that is assigned to the variable. In R, we assign variables using an assignment variable, which is a less than sign (<) followed by a hyphen (-). Put together, the assignment variable looks like so: Working with variables It is important to understand what is contained in the variables. It is easy to check the content of the variables using the lscommand. If you need more details of the variables, then the ls.strcommand will provide you with more information. If you need to remove variables, then you can use the rm function. Data structures in R The power of R resides in its ability to analyze data, and this ability is largely derived from its powerful data types. Fundamentally, R is a vectorized programming language. Data structures in R are constructed from vectors that are foundational. This means that R’s operations are optimized to work with vectors. Vector The vector is a core component of R. It is a fundamental data type. Essentially, a vector is a data structure that contains an array where all of the values are the same type. For example, they could all be strings, or numbers. However, note that vectors cannot contain mixed data types. R uses the c() function to take a list of items and turns them into a vector. Lists R contains two types of lists: a basic list, and a named list. A basic list is created using the list() operator. In a named list, every item in the list has a name as well as a value. named lists are a good mapping structure to help map data between R and Tableau. In R, lists are mapped using the $ operator. Note, however, that the list label operators are case sensitive. Matrices Matrices are two-dimensional structures that have rows and columns. 
The matrices are lists of rows. It’s important to note that every cell in a matrix has the same type. Factors A factor is a list of all possible values of a variable in a string format. It is a special string type, which is chosen from a specified set of values known as levels. They are sometimes known as categorical variables. In dimensional modeling terminology, a factor is equivalent to a dimension, and the levels represent different attributes of the dimension. Note that factors are variables that can only contain a limited number of different values. Data frames The data frame is the main data structure in R. It’s possible to envisage the data frame as a table of data, with rows and columns. Unlike the list structure, the data frame can contain different types of data. In R, we use the data.frame() command in order to create a data frame. The data frame is extremely flexible for working with structured data, and it can ingest data from many different data types. Two main ways to ingest data into data frames involves the use of many data connectors, which connect to data sources such as databases, for example. There is also a command, read.table(), which takes in data. Data Frame Structure Here is an example, populated data frame. There are three columns, and two rows. The top of the data frame is the header. Each horizontal line afterwards holds a data row. This starts with the name of the row, and then followed by the data itself. Each data member of a row is called a cell. Here is an example data frame, populated with data: Example Data Frame Structure df = data.frame( Year=c(2013, 2013, 2013), Country=c("Arab World","Carribean States", "Central Europe"), LifeExpectancy=c(71, 72, 76)) As always, we should read out at least some of the data frame so we can double-check that it was set correctly. The data frame was set to the df variable, so we can read out the contents by simply typing in the variable name at the command prompt: To obtain the data held in a cell, we enter the row and column co-ordinates of the cell, and surround them by square brackets []. In this example, if we wanted to obtain the value of the second cell in the second row, then we would use the following: df[2, "Country"] We can also conduct summary statistics on our data frame. For example, if we use the following command: summary(df) Then we obtain the summary statistics of the data. The example output is as follows: You’ll notice that the summary command has summarized different values for each of the columns. It has identified Year as an integer, and produced the min, quartiles, mean, and max for year. The Country column has been listed, simply because it does not contain any numeric values. Life Expectancy is summarized correctly. We can change the Year column to a factor, using the following command: df$Year <- as.factor(df$Year) Then, we can rerun the summary command again: summary(df) On this occasion, the data frame now returns the correct results that we expect: As we proceed throughout this book, we will be building on more useful features that will help us to analyze data using data structures, and visualize the data in interesting ways using R. Control structures in R R has the appearance of a procedural programming language. However, it is built on another language, known as S. S leans towards functional programming. It also has some object-oriented characteristics. This means that there are many complexities in the way that R works. 
In this section, we will look at some of the fundamental building blocks that make up key control structures in R, and then we will move on to looping and vectorized operations.

Logical operators
Logical operators are binary operators that allow the comparison of values:

Operator     Description
<            less than
<=           less than or equal to
>            greater than
>=           greater than or equal to
==           exactly equal to
!=           not equal to
!x           not x
x | y        x OR y
x & y        x AND y
isTRUE(x)    test if x is TRUE

For loops and vectorization in R
Specifically, we will look at the constructs involved in loops. Note, however, that it is more efficient to use vectorized operations rather than loops, because R is vector-based. We investigate loops here because they are a good first step in understanding how R works, and then we can optimize this understanding by focusing on vectorized alternatives that are more efficient. More information about control flow can be obtained by executing the following command at the command line:

?Control

The control flow commands make decisions and choose between alternative actions. The main constructs are for, while, and repeat.

For loops
Let's look at a for loop in more detail. For this exercise, we will use the Fisher iris dataset, which is installed along with R by default. We are going to produce summary statistics for each species of iris in the dataset. You can see some of the iris data by typing the following command at the command prompt:

head(iris)

We can divide the iris dataset so that the data is split by species. To do this, we use the split command, and we assign the result to a variable called IrisBySpecies:

IrisBySpecies <- split(iris, iris$Species)

Now, we can use a for loop to process the data and summarize it by species. Firstly, we will set up a variable called output and set it to a list type. For each species held in the IrisBySpecies variable, we calculate the minimum, maximum, mean, and total number of cases. The results are then bound into a data frame called output.df, which is printed out to the screen:

output <- list()
for(n in names(IrisBySpecies)){
  ListData <- IrisBySpecies[[n]]
  output[[n]] <- data.frame(species=n,
                            MinPetalLength=min(ListData$Petal.Length),
                            MaxPetalLength=max(ListData$Petal.Length),
                            MeanPetalLength=mean(ListData$Petal.Length),
                            NumberofSamples=nrow(ListData))
  output.df <- do.call(rbind, output)
}
print(output.df)

The output is as follows:

We used a for loop here, but loops can be expensive in terms of processing. We can achieve the same end by using a vectorized function called tapply. Tapply processes data in groups. It has three parameters: the vector of data, the factor that defines the groups, and a function. It works by extracting each group and then applying the function to it; it then returns a vector with the results. We can see an example of tapply here, using the same dataset:

output <- data.frame(MinPetalLength=tapply(iris$Petal.Length, iris$Species, min),
                     MaxPetalLength=tapply(iris$Petal.Length, iris$Species, max),
                     MeanPetalLength=tapply(iris$Petal.Length, iris$Species, mean),
                     NumberofSamples=tapply(iris$Petal.Length, iris$Species, length))
print(output)

This time, we get the same output as previously. The only difference is that, by using a vectorized function, we have concise code that runs efficiently. To summarize, R is extremely flexible and it's possible to achieve the same objective in a number of different ways.
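For instance, the same grouped summary can also be produced with the aggregate() function from base R. The following is a small sketch of that alternative, using the same iris dataset; it is just one more equivalent approach, not a replacement for the versions above:

# aggregate() applies a function to Petal.Length within each Species group
agg_min  <- aggregate(Petal.Length ~ Species, data=iris, FUN=min)
agg_max  <- aggregate(Petal.Length ~ Species, data=iris, FUN=max)
agg_mean <- aggregate(Petal.Length ~ Species, data=iris, FUN=mean)

# Combine the pieces into one summary data frame
output_agg <- data.frame(species=agg_min$Species,
                         MinPetalLength=agg_min$Petal.Length,
                         MaxPetalLength=agg_max$Petal.Length,
                         MeanPetalLength=agg_mean$Petal.Length,
                         NumberofSamples=as.vector(table(iris$Species)))
print(output_agg)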
As we move forward through this book, we will make recommendations about the optimal method to select, and the reasons for the recommendation. Functions R has many functions that are included as part of the installation. In the first instance, let’s look to see how we can work smart by finding out what functions are available by default. In our last example, we used the split() function. To find out more about the split function, we can simply use the following command: ?split Or we can use: help(split) It’s possible to get an overview of the arguments required for a function. To do this, simply use the args command: args(split) Fortunately, it’s also possible to see examples of each function by using the following command: example(split) If you need more information than the documented help file about each function, you can use the following command. It will go and search through all the documentation for instances of the keyword: help.search("split") If you  want to search the R project site from within RStudio, you can use the RSiteSearch command. For example: RSiteSearch("split") Summary In this article, we have looked at various essential structures in working with R. We have looked at the data structures that are fundamental to using R optimally. We have also taken the view that structures such as for loops can often be done better as vectorized operations. Finally, we have looked at the ways in which R can be used to create functions in order to simply code. Resources for Article: Further resources on this subject: Getting Started with Tableau Public [article] Creating your first heat map in R [article] Data Modelling Challenges [article]