
Emotional AI: Detecting facial expressions and emotions using CoreML [Tutorial]

Savia Lobo
14 Sep 2018
11 min read
Computers are becoming more ubiquitous, more capable, and more ingrained in our daily lives, and they increasingly allow natural forms of interaction. They are becoming less like heartless, dumb tools and more like friends, able to entertain us, look out for us, and assist us with our work. This article is an excerpt from the book Machine Learning with Core ML, authored by Joshua Newnham.

With this shift comes a need for computers to understand our emotional state. For example, you don't want your social robot cracking a joke just after you arrive home having lost your job (to an AI bot!). This is the domain of affective computing (also referred to as artificial emotional intelligence or emotional AI), a field of computer science that studies systems able to recognize, interpret, process, and simulate human emotions. The first stage is being able to recognize the emotional state. In this article, we will create a model that can detect facial expressions and emotions using Core ML.

Input data and preprocessing

We will implement the preprocessing functionality required to transform images into something the model expects. We will build up this functionality in a playground project before migrating it across to our main project in the next section. If you haven't done so already, pull down the latest code from the accompanying repository: https://github.com/packtpublishing/machine-learning-with-core-ml. Once downloaded, navigate to the directory Chapter4/Start/ and open the playground project ExploringExpressionRecognition.playground. Once loaded, you will see the playground for this extract, as shown in the following screenshot.

Before starting, to avoid looking at images of me, please replace the test images with either personal photos of your own or royalty-free images from the internet, ideally a set expressing a range of emotions.
Along with the test images, this playground includes a compiled Core ML model (introduced in the previous image) with its generated set of wrappers for inputs, outputs, and the model itself. Also included are some extensions for UIImage, UIImageView, CGImagePropertyOrientation, and an empty CIImage extension, to which we will return later in this extract. The others provide utility functions to help us visualize the images as we work through this playground.

When developing machine learning applications, you have two broad paths. The first, which is becoming increasingly popular, is to use an end-to-end machine learning model capable of being fed the raw input and producing adequate results. One field that has had great success with end-to-end models is speech recognition. Prior to end-to-end deep learning, speech recognition systems were made up of many smaller modules, each one focusing on extracting specific pieces of data to feed into the next module, and each typically manually engineered. Modern speech recognition systems use end-to-end models that take the raw input and output the result. Both of the described approaches can be seen in the following diagram.

Obviously, this approach is not constrained to speech recognition, and we have seen it applied to image recognition tasks, too, along with many others. But there are two things that make our particular case different. The first is that we can simplify the problem by first extracting the face; this means our model has fewer features to learn, and we get a smaller, more specialized model that we can tune. The second, which is no doubt obvious, is that our training data consisted of only faces, not natural images.
So, we have no choice but to run our data through two models: the first to extract faces, and the second to perform expression recognition on the extracted faces, as shown in this diagram.

Luckily for us, Apple has mostly taken care of the first task, detecting faces, through the Vision framework it released with iOS 11. The Vision framework provides performant image analysis and computer vision tools, exposing them through a simple API. This allows for face detection, feature detection and tracking, and classification of scenes in images and video. The latter task, expression recognition, is something we will take care of using the Core ML model introduced earlier.

Prior to the introduction of the Vision framework, face detection would typically be performed using a Core Image filter. Going back further, you had to use something like OpenCV. You can learn more about Core Image here: https://developer.apple.com/library/content/documentation/GraphicsImaging/Conceptual/CoreImaging/ci_detect_faces/ci_detect_faces.html.

Now that we have a bird's-eye view of the work to be done, let's turn our attention to the editor and start putting all of this together. Start by loading the images; add the following snippet to your playground:

```swift
var images = [UIImage]()
for i in 1...3 {
    guard let image = UIImage(named: "images/joshua_newnham_\(i).jpg") else {
        fatalError("Failed to extract features")
    }
    images.append(image)
}

let faceIdx = 0
let imageView = UIImageView(image: images[faceIdx])
imageView.contentMode = .scaleAspectFit
```

In the preceding snippet, we simply load each of the images included in our resources' Images folder and add them to an array we can access conveniently throughout the playground. Once all the images are loaded, we set the constant faceIdx, which ensures that we access the same image throughout our experiments. Finally, we create a UIImageView to easily preview it.
Once it has finished running, click on the eye icon in the right-hand panel to preview the loaded image, as shown in the following screenshot.

Next, we will take advantage of the functionality available in the Vision framework to detect faces. The typical flow when working with the Vision framework is to define a request, which determines what analysis you want to perform, and a handler, which is responsible for executing the request and providing a means of obtaining the results (either through delegation or by being queried explicitly). The result of the analysis is a collection of observations that you need to cast to the appropriate observation type; concrete examples of each of these can be seen here.

As illustrated in the preceding diagram, the request determines what type of image analysis will be performed; the handler, using one or more requests and an image, performs the actual analysis and generates the results (also known as observations). These are accessible via a property, or via a delegate if one has been assigned. The type of observation depends on the request performed. It's worth highlighting that the Vision framework is tightly integrated with Core ML and provides another layer of abstraction and uniformity between you, the data, and the process. For example, using a classification Core ML model would return an observation of type VNClassificationObservation. This layer of abstraction not only simplifies things but also provides a consistent way of working with machine learning models.

In the previous figure, we showed a request handler specifically for static images. Vision also provides a specialized request handler for sequences of images, which is more appropriate when dealing with requests such as tracking. The following diagram illustrates some concrete examples of the types of requests and observations applicable to this use case. So, when do you use VNImageRequestHandler, and when VNSequenceRequestHandler?
Though the names provide clues as to when one should be used over the other, it's worth outlining some differences. The image request handler is for interactive exploration of an image; it holds a reference to the image for its life cycle and allows optimization across various request types. The sequence request handler is more appropriate for tasks such as tracking and does not optimize for multiple requests on an image.

Let's see how this all looks in code; add the following snippet to your playground:

```swift
let faceDetectionRequest = VNDetectFaceRectanglesRequest()
let faceDetectionRequestHandler = VNSequenceRequestHandler()
```

Here, we are simply creating the request and the handler; as discussed previously, the request encapsulates the type of image analysis, while the handler is responsible for executing the request. Next, we will get faceDetectionRequestHandler to run faceDetectionRequest; add the following code:

```swift
try? faceDetectionRequestHandler.perform(
    [faceDetectionRequest],
    on: images[faceIdx].cgImage!,
    orientation: CGImagePropertyOrientation(images[faceIdx].imageOrientation))
```

The perform function of the handler can throw an error if it fails; for this reason, we wrap the call with try? at the beginning of the statement, and we can interrogate the error property of the handler to identify the reason for failure. We pass the handler a list of requests (in this case, only our faceDetectionRequest), the image we want to perform the analysis on, and, finally, the orientation of the image, which can be used by the request during analysis. Once the analysis is done, we can inspect the observations through the results property of the request itself, as shown in the following code:

```swift
if let faceDetectionResults = faceDetectionRequest.results as? [VNFaceObservation] {
    for face in faceDetectionResults {
        // ADD THE NEXT SNIPPET OF CODE HERE
    }
}
```

The type of observation depends on the analysis; in this case, we're expecting a VNFaceObservation.
Hence, we cast it to the appropriate type and then iterate through all the observations. Next, we take each recognized face, extract its bounding box, and draw it on the image (using an extension method of UIImageView found within the UIImageViewExtension.swift file). Add the following block within the for loop shown in the preceding code:

```swift
if let currentImage = imageView.image {
    let bbox = face.boundingBox

    let imageSize = CGSize(
        width: currentImage.size.width,
        height: currentImage.size.height)

    let w = bbox.width * imageSize.width
    let h = bbox.height * imageSize.height
    let x = bbox.origin.x * imageSize.width
    let y = bbox.origin.y * imageSize.height

    let faceRect = CGRect(
        x: x, y: y, width: w, height: h)

    let invertedY = imageSize.height - (faceRect.origin.y + faceRect.height)
    let invertedFaceRect = CGRect(
        x: x, y: invertedY, width: w, height: h)

    imageView.drawRect(rect: invertedFaceRect)
}
```

We obtain the bounding box of each face via the boundingBox property; the result is normalized, so we need to scale it based on the dimensions of the image. For example, you can obtain the width by multiplying the bounding box's width by the width of the image: bbox.width * imageSize.width. Next, we invert the y axis, because the coordinate system of Quartz 2D is inverted with respect to UIKit's coordinate system, as shown in this diagram. We invert our coordinates by subtracting the bounding box's origin and height from the height of the image, and then pass the result to our UIImageView to render the rectangle.
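The denormalization and y-axis inversion performed here can be summarized compactly. With a normalized bounding box $(b_x, b_y, b_w, b_h)$ and an image of size $W \times H$, the scaled rectangle and its inverted origin are:

```latex
\begin{aligned}
x &= b_x W, & y &= b_y H, & w &= b_w W, & h &= b_h H \\
y_{\text{inverted}} &= H - (y + h)
\end{aligned}
```

This mirrors the code above: invertedY is the image height minus the rectangle's origin plus its height.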
Click on the eye icon in the right-hand panel, in line with the statement imageView.drawRect(rect: invertedFaceRect), to preview the results; if successful, you should see something like the following.

An alternative to inverting the face rectangle manually is to use a CGAffineTransform, such as:

```swift
var transform = CGAffineTransform(scaleX: 1, y: -1)
transform = transform.translatedBy(x: 0, y: -imageSize.height)
let invertedFaceRect = faceRect.applying(transform)
```

This approach requires less code and therefore leaves fewer chances for error, so it is the recommended approach; the longer version was shown to help illuminate the details.

As a designer and builder of intelligent systems, it is your task to interpret these results and present them to the user. Some questions you'll want to ask yourself are as follows:

  • What is an acceptable probability threshold before setting a class as true?
  • Can this threshold depend on the probabilities of other classes, to remove ambiguity? That is, if Sad and Happy both have a probability of 0.3, you can infer that the prediction is inaccurate, or at least not useful.
  • Is there a way to accept multiple probabilities?
  • Is it useful to expose the threshold to the user and let them set and/or tune it manually?

These are only a few of the questions you should ask; the specific questions and their answers will depend on your use case and users. At this point, we have everything we need to preprocess images and perform inference. We also briefly explored some use cases showing how emotion recognition could be applied. For a detailed overview of this experiment, check out the book Machine Learning with Core ML, which further implements Core ML for visual-based applications using the principles of transfer learning and neural networks.

Related reading:

  • Amazon Rekognition can now 'recognize' faces in a crowd in real time
  • 5 cool ways Transfer Learning is being used today
  • My friend, the robot: Artificial Intelligence needs Emotional Intelligence

AWS machine learning: Learning AWS CLI to execute a simple Amazon ML workflow [Tutorial]

Melisha Dsouza
13 Sep 2018
15 min read
Using the AWS web interface to manage and run your projects is time-consuming. We will, therefore, start running our projects from the command line with the AWS Command Line Interface (AWS CLI). With just one tool to download and configure, multiple AWS services can be controlled from the command line and automated through scripts.

The code files for this article are available on GitHub. This article is an excerpt from a book written by Alexis Perrier titled Effective Amazon Machine Learning.

Getting started and setting up

Creating a performant predictive model from raw data requires many trials and errors and much back and forth: creating new features, cleaning up data, and trying out new parameters for the model are all needed to ensure the robustness of the model. There is a constant back and forth between the data, the models, and the evaluations. Scripting this workflow via the AWS CLI will give us the ability to speed up the create, test, select loop.

Installing AWS CLI

In order to set up your CLI credentials, you need your access key ID and your secret access key. You can create them from the IAM console (https://console.aws.amazon.com/iam): navigate to Users, select your IAM user name, and click on the Security credentials tab. Choose Create Access Key and download the CSV file. Store the keys in a secure location; we will need them in a few minutes to set up the AWS CLI.

But first, we need to install the AWS CLI. Docker environment: this tutorial will help you use the AWS CLI within a Docker container: https://blog.flowlog-stats.com/2016/05/03/aws-cli-in-a-docker-container/. A Docker image for running the AWS CLI is available at https://hub.docker.com/r/fstab/aws-cli/. There is no need to rewrite the AWS documentation on how to install the AWS CLI; it is complete, up to date, and available at http://docs.aws.amazon.com/cli/latest/userguide/installing.html.
In a nutshell, installing the CLI requires you to have Python and pip already installed. Then, run the following:

```shell
$ pip install --upgrade --user awscli
```

Add AWS to your $PATH:

```shell
$ export PATH=~/.local/bin:$PATH
```

Reload the bash configuration file (this is for OSX):

```shell
$ source ~/.bash_profile
```

Check that everything works with the following command:

```shell
$ aws --version
```

You should see something similar to the following output:

```
aws-cli/1.11.47 Python/3.5.2 Darwin/15.6.0 botocore/1.5.10
```

Once installed, we need to configure the AWS CLI; type:

```shell
$ aws configure
```

Now input the access keys you just created:

```
AWS Access Key ID [None]: ABCDEF_THISISANEXAMPLE
AWS Secret Access Key [None]: abcdefghijk_THISISANEXAMPLE
Default region name [None]: us-west-2
Default output format [None]: json
```

Choose the region that is closest to you and the format you prefer (JSON, text, or table); JSON is the default format. The aws configure command creates two files: a config file and a credentials file. On OSX, the files are ~/.aws/config and ~/.aws/credentials. You can edit these files directly to change your access or configuration. You will need to create different profiles if you need to access multiple AWS accounts.
You can do so via the aws configure command:

```shell
$ aws configure --profile user2
```

You can also do so directly in the config and credentials files. In ~/.aws/config:

```
[default]
output = json
region = us-east-1

[profile user2]
output = text
region = us-west-2
```

And in ~/.aws/credentials:

```
[default]
aws_access_key_id = ABCDEF_THISISANEXAMPLE
aws_secret_access_key = abcdefghijk_THISISANEXAMPLE

[user2]
aws_access_key_id = ABCDEF_ANOTHERKEY
aws_secret_access_key = abcdefghijk_ANOTHERKEY
```

Refer to the AWS CLI setup page for more in-depth information: http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html

Picking up CLI syntax

The overall format of any AWS CLI command is as follows:

```shell
$ aws <service> [options] <command> <subcommand> [parameters]
```

Here the terms are as follows:

  • <service>: the name of the service you are managing: S3, machine learning, EC2, and so on
  • [options]: allows you to set the region, the profile, and the output of the command
  • <command> <subcommand>: the actual command you want to execute
  • [parameters]: the parameters for these commands

A simple example will help you understand the syntax better. To list the contents of an S3 bucket named aml.packt, the command is as follows:

```shell
$ aws s3 ls aml.packt
```

Here, s3 is the service, ls is the command, and aml.packt is the parameter. The aws help command will output a list of all available services. There are many more examples and explanations in the AWS documentation available at http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-using.html.

Passing parameters using JSON files

For some services and commands, the list of parameters can become long and difficult to check and maintain. For instance, in order to create an Amazon ML model via the CLI, you need to specify at least seven different elements: the model ID, name, and type, the model's parameters, the ID of the training datasource, and the recipe name and URI (see aws machinelearning create-ml-model help).
When possible, we will use the CLI's ability to read parameters from a JSON file instead of specifying them on the command line. The AWS CLI also offers a way to generate a JSON template, which you can then fill in with the right parameters. To generate that JSON parameter file model (the JSON skeleton), simply add --generate-cli-skeleton after the command name. For instance, to generate the JSON skeleton for the create model command of the machine learning service, write the following:

```shell
$ aws machinelearning create-ml-model --generate-cli-skeleton
```

This will give the following output:

```json
{
    "MLModelId": "",
    "MLModelName": "",
    "MLModelType": "",
    "Parameters": {
        "KeyName": ""
    },
    "TrainingDataSourceId": "",
    "Recipe": "",
    "RecipeUri": ""
}
```

You can then configure this to your liking. To have the skeleton command write a JSON file rather than simply print the skeleton to the terminal, add > filename.json:

```shell
$ aws machinelearning create-ml-model --generate-cli-skeleton > filename.json
```

This will create a filename.json file with the JSON template. Once all the required parameters are specified, you create the model with the following command (assuming filename.json is in the current folder):

```shell
$ aws machinelearning create-ml-model --cli-input-json file://filename.json
```

Before we dive further into the machine learning workflow via the CLI, we need to introduce the dataset we will be using in this chapter.

Introducing the Ames Housing dataset

We will use the Ames Housing dataset, compiled by Dean De Cock for use in data science education. It is a great alternative to the popular but older Boston Housing dataset. The Ames Housing dataset is used in the Advanced Regression Techniques challenge on the Kaggle website: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/. The original version of the dataset is available at http://www.amstat.org/publications/jse/v19n3/decock/AmesHousing.xls and in the GitHub repository for this chapter.
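One practical note on the JSON parameter files described above: a malformed file fails only when the CLI call is actually made, so it can save time to validate the file locally first. A minimal sketch, assuming a hypothetical parameter file (the file name and its contents below are illustrative, not from the tutorial):

```shell
# Write an illustrative parameter file (values are placeholders).
cat > filename.json <<'EOF'
{
    "MLModelId": "example_model_001",
    "MLModelName": "[MDL] Example 001",
    "MLModelType": "REGRESSION",
    "Parameters": { "sgd.maxPasses": "100" },
    "TrainingDataSourceId": "example_ds_001",
    "RecipeUri": "s3://example-bucket/recipe.json"
}
EOF

# python -m json.tool exits non-zero on malformed JSON, which catches
# quoting mistakes before the CLI call does.
python3 -m json.tool filename.json > /dev/null && echo "filename.json is valid JSON"
```

The same check applies to any of the datasource, model, or evaluation parameter files used later in this tutorial.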
For more information on the genesis of this dataset and an in-depth explanation of the different variables, read the paper by Dean De Cock, available as a PDF at https://ww2.amstat.org/publications/jse/v19n3/decock.pdf.

We will start by splitting the dataset into a train and a validate set and building a model on the train set. Both train and validate sets are available in the GitHub repository as ames_housing_training.csv and ames_housing_validate.csv; the entire dataset is in the ames_housing.csv file.

Splitting the dataset with shell commands

We will use shell commands to shuffle, split, and create training and validation subsets of the Ames Housing dataset.

First, extract the header line into a separate file, ames_housing_header.csv:

```shell
$ head -n 1 ames_housing.csv > ames_housing_header.csv
```

Then tail all the lines after the first one into a headerless file:

```shell
$ tail -n +2 ames_housing.csv > ames_housing_nohead.csv
```

Then randomly sort the rows in place. (gshuf is the OSX equivalent of the Linux shuf shell command; it can be installed via brew install coreutils):

```shell
$ gshuf ames_housing_nohead.csv -o ames_housing_nohead.csv
```

Extract the first 2,050 rows as the training file and the last 880 rows as the validation file:

```shell
$ head -n 2050 ames_housing_nohead.csv > ames_housing_training.csv
$ tail -n 880 ames_housing_nohead.csv > ames_housing_validate.csv
```

Finally, add the header back into both the training and validation files:

```shell
$ cat ames_housing_header.csv ames_housing_training.csv > tmp.csv
$ mv tmp.csv ames_housing_training.csv
$ cat ames_housing_header.csv ames_housing_validate.csv > tmp.csv
$ mv tmp.csv ames_housing_validate.csv
```

A simple project using AWS CLI

We are now ready to execute a simple Amazon ML workflow using the CLI.
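It is worth sanity-checking the split before uploading anything. The sketch below reproduces the same head/tail recipe on a small synthetic CSV (100 rows split 70/30; the data and row counts are made up for illustration, standing in for the real 2,050/880 split) and then verifies the line counts:

```shell
# Build a synthetic stand-in for ames_housing.csv: a header plus 100 data rows.
printf 'Id,SalePrice\n' > ames_housing_header.csv
seq 1 100 | sed 's/.*/&,100000/' > ames_housing_nohead.csv

# Split 70/30, mirroring the tutorial's head/tail approach.
head -n 70 ames_housing_nohead.csv > ames_housing_training.csv
tail -n 30 ames_housing_nohead.csv > ames_housing_validate.csv

# Re-attach the header to both subsets.
cat ames_housing_header.csv ames_housing_training.csv > tmp.csv
mv tmp.csv ames_housing_training.csv
cat ames_housing_header.csv ames_housing_validate.csv > tmp.csv
mv tmp.csv ames_housing_validate.csv

# Sanity check: each subset should contain its rows plus one header line,
# and together the data rows should cover the original file exactly once.
wc -l ames_housing_training.csv ames_housing_validate.csv
```

On the real dataset, the same check should report 2,051 and 881 lines (data rows plus header).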
This includes the following:

  • Uploading files to S3
  • Creating a datasource and the recipe
  • Creating a model
  • Creating an evaluation
  • Prediction, batch and real time

Let's start by uploading the training and validation files to S3. In the following lines, replace the bucket name aml.packt with your own bucket name. To upload the files to the S3 location s3://aml.packt/data/ch8/, run the following command lines:

```shell
$ aws s3 cp ./ames_housing_training.csv s3://aml.packt/data/ch8/
upload: ./ames_housing_training.csv to s3://aml.packt/data/ch8/ames_housing_training.csv

$ aws s3 cp ./ames_housing_validate.csv s3://aml.packt/data/ch8/
upload: ./ames_housing_validate.csv to s3://aml.packt/data/ch8/ames_housing_validate.csv
```

An overview of Amazon ML CLI commands

That's it for the S3 part. Now let's explore the CLI for Amazon's machine learning service. All Amazon ML CLI commands are listed at http://docs.aws.amazon.com/cli/latest/reference/machinelearning/. There are 30 commands, which can be grouped by object and action. You can perform the following actions:

  • create: creates the object
  • describe: searches objects given some parameters (location, dates, names, and so on)
  • get: given an object ID, returns information
  • update: given an object ID, updates the object
  • delete: deletes an object

These can be performed on the following elements:

  • datasource: create-data-source-from-rds, create-data-source-from-redshift, create-data-source-from-s3, describe-data-sources, delete-data-source, get-data-source, update-data-source
  • ml-model: create-ml-model, describe-ml-models, get-ml-model, delete-ml-model, update-ml-model
  • evaluation: create-evaluation, describe-evaluations, get-evaluation, delete-evaluation, update-evaluation
  • batch prediction: create-batch-prediction, describe-batch-predictions, get-batch-prediction, delete-batch-prediction, update-batch-prediction
  • real-time endpoint: create-realtime-endpoint, delete-realtime-endpoint, predict

You can also handle tags and set waiting times.
Note that the AWS CLI gives you the ability to create datasources from S3, Redshift, and RDS, while the web interface only allows datasources from S3 and Redshift.

Creating the datasource

We will start by creating the datasource. Let's first see what parameters are needed by generating the skeleton:

```shell
$ aws machinelearning create-data-source-from-s3 --generate-cli-skeleton
```

This generates the following JSON object:

```json
{
    "DataSourceId": "",
    "DataSourceName": "",
    "DataSpec": {
        "DataLocationS3": "",
        "DataRearrangement": "",
        "DataSchema": "",
        "DataSchemaLocationS3": ""
    },
    "ComputeStatistics": true
}
```

The different parameters are mostly self-explanatory, and further information can be found in the AWS documentation at http://docs.aws.amazon.com/cli/latest/reference/machinelearning/create-data-source-from-s3.html.

A word on the schema: when creating a datasource from the web interface, you can use a wizard that guides you through the creation of the schema. The wizard facilitates the process by guessing the type of the variables, thus making available a default schema that you can modify. There is no default schema available via the AWS CLI; you have to define the entire schema yourself, either as JSON in the DataSchema field or by uploading a schema file to S3 and specifying its location in the DataSchemaLocationS3 field. Since our dataset has many variables (79), we cheated and used the wizard to create a default schema, which we uploaded to S3. Throughout the rest of the chapter, we will specify the schema location, not its JSON definition.
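For readers who prefer to write the schema by hand rather than lift it from the wizard, the sketch below shows the general shape of an Amazon ML schema file, as we recall the format; it lists only three of the 79 Ames attributes for brevity, so treat the field names and the attribute list as illustrative and verify them against a wizard-generated schema:

```shell
# A minimal, illustrative Amazon ML schema file. Attribute types are
# typically NUMERIC, CATEGORICAL, TEXT, or BINARY.
cat > ames_housing.csv.schema <<'EOF'
{
    "version": "1.0",
    "targetAttributeName": "SalePrice",
    "dataFormat": "CSV",
    "dataFileContainsHeader": true,
    "attributes": [
        { "attributeName": "LotArea", "attributeType": "NUMERIC" },
        { "attributeName": "Neighborhood", "attributeType": "CATEGORICAL" },
        { "attributeName": "SalePrice", "attributeType": "NUMERIC" }
    ]
}
EOF

# Confirm the file is well-formed JSON before uploading it to S3.
python3 -m json.tool ames_housing.csv.schema > /dev/null && echo "schema parses"
```

In practice you would then upload this file to the S3 location referenced by DataSchemaLocationS3.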
In this example, we will create the following datasource parameter file, dsrc_ames_housing_001.json:

```json
{
    "DataSourceId": "ch8_ames_housing_001",
    "DataSourceName": "[DS] Ames Housing 001",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_training.csv",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

For the validation subset (save to dsrc_ames_housing_002.json):

```json
{
    "DataSourceId": "ch8_ames_housing_002",
    "DataSourceName": "[DS] Ames Housing 002",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_validate.csv",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

Since we have already split our data into a training and a validation set, there's no need to specify the DataRearrangement field. Alternatively, we could have avoided splitting our dataset and specified the following DataRearrangement on the original dataset, assuming it had already been shuffled (save to dsrc_ames_housing_003.json):

```json
{
    "DataSourceId": "ch8_ames_housing_003",
    "DataSourceName": "[DS] Ames Housing training 003",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_shuffled.csv",
        "DataRearrangement": "{\"splitting\":{\"percentBegin\":0,\"percentEnd\":70}}",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

For the validation set (save to dsrc_ames_housing_004.json):

```json
{
    "DataSourceId": "ch8_ames_housing_004",
    "DataSourceName": "[DS] Ames Housing validation 004",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_shuffled.csv",
        "DataRearrangement": "{\"splitting\":{\"percentBegin\":70,\"percentEnd\":100}}",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

Here, the ames_housing.csv file has previously been shuffled using the gshuf command line and uploaded to S3:

```shell
$ gshuf ames_housing_nohead.csv -o ames_housing_nohead.csv
$ cat ames_housing_header.csv ames_housing_nohead.csv > tmp.csv
$ mv tmp.csv ames_housing_shuffled.csv
$ aws s3 cp ./ames_housing_shuffled.csv s3://aml.packt/data/ch8/
```

Note that we don't need to create all four datasources; these are just examples of alternative ways to create them. We then create the first datasource by running the following:

```shell
$ aws machinelearning create-data-source-from-s3 --cli-input-json file://dsrc_ames_housing_001.json
```

In return, we get the datasource ID we had specified:

```json
{
    "DataSourceId": "ch8_ames_housing_001"
}
```

We can then obtain information on that datasource, and check whether its creation is still pending or has completed, with the following:

```shell
$ aws machinelearning get-data-source --data-source-id ch8_ames_housing_001
```

This returns the following:

```json
{
    "Status": "COMPLETED",
    "NumberOfFiles": 1,
    "CreatedByIamUser": "arn:aws:iam::178277xxxxxxx:user/alexperrier",
    "LastUpdatedAt": 1486834110.483,
    "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_training.csv",
    "ComputeStatistics": true,
    "StartedAt": 1486833867.707,
    "LogUri": "https://eml-prod-emr.s3.amazonaws.com/178277513911-ds-ch8_ames_housing_001/.....",
    "DataSourceId": "ch8_ames_housing_001",
    "CreatedAt": 1486030865.965,
    "ComputeTime": 880000,
    "DataSizeInBytes": 648150,
    "FinishedAt": 1486834110.483,
    "Name": "[DS] Ames Housing 001"
}
```

Note that we have access to the operation's log URI, which could be useful for analyzing the model training later on.
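A common stumbling block in the datasource files above is that DataRearrangement is JSON embedded inside a JSON string, so its inner quotes must be escaped. The snippet below round-trips the nested value to confirm the escaping is correct (the file name dsrc_check.json is illustrative):

```shell
# The value of DataRearrangement is itself a JSON document stored as a string;
# its inner quotes must be escaped with backslashes.
cat > dsrc_check.json <<'EOF'
{
    "DataRearrangement": "{\"splitting\":{\"percentBegin\":0,\"percentEnd\":70}}"
}
EOF

# Parse the outer document, then parse the embedded string; this succeeds
# only if the escaping is right. Prints 70.
python3 - <<'EOF'
import json
outer = json.load(open("dsrc_check.json"))
inner = json.loads(outer["DataRearrangement"])
print(inner["splitting"]["percentEnd"])
EOF
```

If the inner quotes are left unescaped, the outer file is invalid JSON and the CLI call fails.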
Creating the model

Creating the model with the create-ml-model command follows the same steps.

Generate the skeleton:

```shell
$ aws machinelearning create-ml-model --generate-cli-skeleton > mdl_ames_housing_001.json
```

Write the configuration file:

```json
{
    "MLModelId": "ch8_ames_housing_001",
    "MLModelName": "[MDL] Ames Housing 001",
    "MLModelType": "REGRESSION",
    "Parameters": {
        "sgd.shuffleType": "auto",
        "sgd.l2RegularizationAmount": "1.0E-06",
        "sgd.maxPasses": "100"
    },
    "TrainingDataSourceId": "ch8_ames_housing_001",
    "RecipeUri": "s3://aml.packt/data/ch8/recipe_ames_housing_001.json"
}
```

Note the parameters of the algorithm: here, we used mild L2 regularization and 100 passes.

Launch the model creation:

```shell
$ aws machinelearning create-ml-model --cli-input-json file://mdl_ames_housing_001.json
```

The model ID is returned:

```json
{
    "MLModelId": "ch8_ames_housing_001"
}
```

The get-ml-model command gives you a status update on the operation as well as the URL of the log:

```shell
$ aws machinelearning get-ml-model --ml-model-id ch8_ames_housing_001
```

The watch command allows you to repeat a shell command every n seconds. To get the status of the model creation every 10 seconds, just write the following:

```shell
$ watch -n 10 aws machinelearning get-ml-model --ml-model-id ch8_ames_housing_001
```

The output of get-ml-model will be refreshed every 10 seconds until you kill the command.

It is not possible to create the default recipe via the AWS CLI commands. You can always define a blank recipe that carries out no transformation on the data; however, the default recipe has been shown to positively impact model performance. To obtain this default recipe, we created it via the web interface and copied it into a file that we uploaded to S3. The resulting file, recipe_ames_housing_001.json, is available in our GitHub repository. Its content is quite long, as the dataset has 79 variables, and is not reproduced here for brevity.
Evaluating our model with create-evaluation

Our model is now trained, and we would like to evaluate it on the validation subset. For that, we will use the create-evaluation CLI command.

Generate the skeleton:

```shell
$ aws machinelearning create-evaluation --generate-cli-skeleton > eval_ames_housing_001.json
```

Configure the parameter file:

```json
{
    "EvaluationId": "ch8_ames_housing_001",
    "EvaluationName": "[EVL] Ames Housing 001",
    "MLModelId": "ch8_ames_housing_001",
    "EvaluationDataSourceId": "ch8_ames_housing_002"
}
```

Launch the evaluation creation:

```shell
$ aws machinelearning create-evaluation --cli-input-json file://eval_ames_housing_001.json
```

Get the evaluation information:

```shell
$ aws machinelearning get-evaluation --evaluation-id ch8_ames_housing_001
```

From that output, we get the performance of the model in the form of the RMSE:

```json
"PerformanceMetrics": {
    "Properties": {
        "RegressionRMSE": "29853.250469108018"
    }
}
```

The value may seem big, but it is relative to the range of the SalePrice variable, which has a mean of 181300.0 and a standard deviation of 79886.7. So an RMSE of 29853.2 is a decent score.

Note that you don't have to wait for the datasource creation to be completed in order to launch the model training; Amazon ML will simply wait for the parent operation to conclude before launching the dependent one. This makes chaining operations possible.

At this point, we have a trained and evaluated model. In this tutorial, we have seen the detailed steps for getting started with the CLI and implemented a simple project to get comfortable with it. To understand how to leverage Amazon's powerful platform for your predictive analytics needs, check out the book Effective Amazon Machine Learning.

Related reading:

  • Part 1: Learning AWS CLI
  • Part 2: ChatOps with Slack and AWS CLI
  • Automate tasks using Azure PowerShell and Azure CLI [Tutorial]
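When scripting many evaluations, it helps to pull the RMSE out of the get-evaluation response programmatically rather than reading it by eye. The sketch below parses a response saved to a file (the sample below reuses the values shown above; in practice you would redirect the aws machinelearning get-evaluation output into evaluation.json) and reports the RMSE relative to the target's standard deviation:

```shell
# Sample get-evaluation response saved to a file (values from this tutorial).
cat > evaluation.json <<'EOF'
{
    "PerformanceMetrics": {
        "Properties": {
            "RegressionRMSE": "29853.250469108018"
        }
    }
}
EOF

# Extract the RMSE and express it relative to SalePrice's standard
# deviation (79886.7) to judge the score at a glance.
python3 - <<'EOF'
import json
props = json.load(open("evaluation.json"))["PerformanceMetrics"]["Properties"]
rmse = float(props["RegressionRMSE"])
print(f"RMSE = {rmse:.1f}, RMSE/std = {rmse / 79886.7:.2f}")
EOF
```

A ratio well below 1 indicates the model explains a substantial part of the price variance; here the RMSE is roughly 0.37 standard deviations.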

Prasad Ramesh
12 Sep 2018
9 min read

How to predict viral content using random forest regression in Python [Tutorial]

Understanding sharing behavior is big business. As consumers become blind to traditional advertising, the push is to go beyond simple pitches to tell engaging stories. In this article, we will build a predictive content scoring model that predicts whether content will go viral, using random forest regression. This article is an excerpt from a book written by Alexander T. Combs titled Python Machine Learning Blueprints: Intuitive data projects you can relate to. You can download the code and other relevant files used in this article from this GitHub link.

What does research tell us about content virality?

Increasingly, the success of these endeavors is measured in social shares. Why go to so much trouble? Because as a brand, every share that I receive represents another consumer that I've reached—all without spending an additional cent. Due to this value, several researchers have examined sharing behavior in the hopes of understanding what motivates it. Among the reasons researchers have found:

To provide practical value to others (an altruistic motive)
To associate ourselves with certain ideas and concepts (an identity motive)
To bond with others around a common emotion (a communal motive)

With regard to the last motive, one particularly well-designed study looked at 7,000 pieces of content from the New York Times to examine the effect of emotion on sharing. The researchers found that simple emotional sentiment was not enough to explain sharing behavior, but when combined with emotional arousal, the explanatory power was greater. For example, while sadness has a strong negative valence, it is considered a low-arousal state. Anger, on the other hand, has a negative valence paired with a high-arousal state. As such, stories that sadden the reader tend to generate far fewer shares than anger-inducing stories.

Source: "What Makes Online Content Viral?" by Jonah Berger and Katherine L. Milkman

Building a predictive content scoring model

Let's create a model that can estimate the share counts for a given piece of content. Ideally, we would have a much larger sample of content, especially content that had more typical share counts. However, we'll make do with what we have here. We're going to use an algorithm called random forest regression. Here, we use regression and attempt to predict the share counts. We could bucket our share classes into ranges, but it is preferable to use regression when dealing with continuous variables.

To begin, we'll create a bare-bones model. We'll use the number of images, the site, and the word count. We'll train our model on the number of Facebook likes. We'll first import the scikit-learn library, then prepare our data by removing the rows with nulls, resetting our index, and finally splitting the frame into our training and testing sets:

import numpy as np
from sklearn.ensemble import RandomForestRegressor

all_data = dfc.dropna(subset=['img_count', 'word_count'])
all_data.reset_index(inplace=True, drop=True)
train_index = []
test_index = []
for i in all_data.index:
    result = np.random.choice(2, p=[.65,.35])
    if result == 1:
        test_index.append(i)
    else:
        train_index.append(i)

We used a random number generator with probabilities set at approximately 2/3 and 1/3 to determine which row items (based on their index) would be placed in each set. Setting the probabilities this way ensures that we get approximately twice the number of rows in our training set as compared to the test set. We see this, as follows:

print('test length:', len(test_index), '\ntrain length:', len(train_index))

The preceding code will generate the following output:

Now, we'll continue on with preparing our data. Next, we need to set up categorical encoding for our sites. Currently, our DataFrame object has the name for each site represented with a string. We need to use dummy encoding. This creates a column for each site.
If the row is for that particular site, then that column will be filled in with 1; all the other site columns will be filled in with 0. Let's do that now:

sites = pd.get_dummies(all_data['site'])
sites

The preceding code will generate the following output:

The dummy encoding can be seen in the preceding image. We'll now continue by splitting our data into training and test sets as follows:

y_train = all_data.iloc[train_index]['fb'].astype(int)
X_train_nosite = all_data.iloc[train_index][['img_count', 'word_count']]
X_train = pd.merge(X_train_nosite, sites.iloc[train_index], left_index=True, right_index=True)
y_test = all_data.iloc[test_index]['fb'].astype(int)
X_test_nosite = all_data.iloc[test_index][['img_count', 'word_count']]
X_test = pd.merge(X_test_nosite, sites.iloc[test_index], left_index=True, right_index=True)

With this, we've set up our X_test, X_train, y_test, and y_train variables. We'll use this now to build our model:

clf = RandomForestRegressor(n_estimators=1000)
clf.fit(X_train, y_train)

With these two lines of code, we have trained our model. Let's now use it to predict the Facebook likes for our testing set:

y_pred = clf.predict(X_test)
y_actual = y_test
deltas = pd.DataFrame(list(zip(y_pred, y_actual, (y_pred - y_actual)/(y_actual))), columns=['predicted', 'actual', 'delta'])
deltas

The preceding code will generate the following output:

Here we see the predicted value, the actual value, and the difference as a percentage. Let's take a look at the descriptive stats for this:

deltas['delta'].describe()

The preceding code will generate the following output:

Our median error is 0! Well, unfortunately, this isn't a particularly useful bit of information, as errors fall on both sides—positive and negative—and they tend to average out, which is what we see here. Let's now look at a more informative metric to evaluate our model. We're going to look at root mean square error as a percentage of the actual mean.
To first illustrate why this is more useful, let's run the following scenario on two sample series:

a = pd.Series([10,10,10,10])
b = pd.Series([12,8,8,12])
np.sqrt(np.mean((b-a)**2))/np.mean(a)

This results in the following output:

Now compare this to the mean:

(b-a).mean()

This results in the following output:

Clearly, the former is the more meaningful statistic. Let's now run this for our model:

np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual)

The preceding code will generate the following output:

Let's now add another feature, the counts of words in the title, and see if it helps our model. We'll use a count vectorizer to do this. Much like what we did with the site names, we'll transform individual words and n-grams into features:

from sklearn.feature_extraction.text import CountVectorizer

vect = CountVectorizer(ngram_range=(1,3))
X_titles_all = vect.fit_transform(all_data['title'])
X_titles_train = X_titles_all[train_index]
X_titles_test = X_titles_all[test_index]
X_test = pd.merge(X_test, pd.DataFrame(X_titles_test.toarray(), index=X_test.index), left_index=True, right_index=True)
X_train = pd.merge(X_train, pd.DataFrame(X_titles_train.toarray(), index=X_train.index), left_index=True, right_index=True)

In these lines, we joined our existing features to our new n-gram features. Let's now train our model and see if we have any improvement:

clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
deltas = pd.DataFrame(list(zip(y_pred, y_actual, (y_pred - y_actual)/(y_actual))), columns=['predicted', 'actual', 'delta'])
deltas

The preceding code will generate the following output:

While checking our errors again, we see the following:

np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual)

This code results in the following output:

So, it appears that we have a modestly improved model.
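To see what the count vectorizer is actually producing, it helps to enumerate the word n-grams for a single title by hand. A simplified, dependency-free sketch of what CountVectorizer(ngram_range=(1, 3)) tokenizes (the real vectorizer also lowercases and strips punctuation with its own tokenizer, which this sketch glosses over):

```python
def ngrams(text, n_min=1, n_max=3):
    """Enumerate word n-grams from n_min to n_max words long,
    mirroring the features CountVectorizer(ngram_range=(1, 3)) builds."""
    words = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams

print(ngrams("random forest regression"))
```

A three-word title yields three unigrams, two bigrams, and one trigram, so even short headlines contribute several columns to the feature matrix, which is why the title features noticeably widen the training data.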
Now, let's add another feature, the word count of the title, as follows:

all_data = all_data.assign(title_wc = all_data['title'].map(lambda x: len(x.split(' '))))
X_train = pd.merge(X_train, all_data[['title_wc']], left_index=True, right_index=True)
X_test = pd.merge(X_test, all_data[['title_wc']], left_index=True, right_index=True)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual)

The preceding code will generate the following output:

It appears that each feature has modestly improved our model. There are certainly more features that we could add. For example, we could add the day of the week and the hour of the posting, we could determine whether the article is a listicle by running a regex on the headline, or we could examine the sentiment of each article. This only begins to touch on the features that could be important to modeling virality. We would certainly need to go much further to continue reducing the error in our model.

We have performed only the most cursory testing of our model. Each measurement should be run multiple times to get a more accurate representation of the true error rate. It is possible that there is no statistically discernible difference between our last two models, as we only performed one test.

To summarize, we learned how to build a model to predict content virality using random forest regression. To know more about this and other machine learning projects in Python, check out Python Machine Learning Blueprints: Intuitive data projects you can relate to.

Writing web services with functional Python programming [Tutorial]
Visualizing data in R and Python using Anaconda [Tutorial]
Python 3.7 beta is available as the second generation Google App Engine standard runtime
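The title word-count feature added above is just a whitespace split wrapped in a lambda. A standalone sketch of that transformation (the sample titles are illustrative, not from the book's dataset):

```python
def title_word_count(title):
    """Count words in a headline by splitting on single spaces,
    as the lambda in the snippet above does."""
    return len(title.split(' '))

titles = [
    "What Makes Online Content Viral?",
    "How to predict viral content using random forest regression",
]
for t in titles:
    print(title_word_count(t), t)
```

Note that splitting on a single space counts "Viral?" as one word; a more careful version might use title.split() to collapse repeated whitespace, but the simple form matches the lambda used in the text.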

Natasha Mathur
11 Sep 2018
8 min read

Implementing an AI in Unreal Engine 4 with AI Perception components [Tutorial]

AI Perception is a system within Unreal Engine 4 that allows sources to register their senses to create stimuli, and other listeners are then periodically updated as sense stimuli are created within the system. This works wonders for creating a reusable system that can react to an array of customizable sensors. In this tutorial, we will explore the different components available within Unreal Engine 4 to enable artificial intelligence sensing within our games. We will do this by taking advantage of a system within Unreal Engine called AI Perception components. These components can be customized and even scripted to introduce new behavior by extending the current sensing interface. This tutorial is an excerpt taken from the book 'Unreal Engine 4 AI Programming Essentials' written by Peter L. Newton, Jie Feng. Let's now have a look at AI sensing.

Implementing AI sensing in Unreal Engine 4

Let's start by bringing up Unreal Engine 4 and opening our New Project window. Then, perform the following steps:

1. First, name our new project AI Sense and hit Create Project.
2. After it finishes loading, we want to start by creating a new AIController that will be responsible for sending our AI the appropriate instructions. Let's navigate to the Blueprint folder and create a new AIController class, naming it EnemyPatrol.
3. Now, to assign EnemyPatrol, we need to place a pawn into the world and then assign the controller to it. After placing the pawn, click on the Details tab within the editor.
4. Next, we want to search for AI Controller. By default, it is the parent class AI Controller, but we want this to be EnemyPatrol:
5. Next, we will create a new PlayerController named PlayerSense. Then, we need to introduce the AI Perception component to the actors we want to see or be seen by. Let's open the PlayerSense controller first and then add the necessary components.

Building AI Perception components

There are two components currently available within the Unreal Engine framework.
The first one is the AI Perception component, which listens for perception stimuli (sight, hearing, and so on). The other is the AIPerceptionStimuliSource component. It is used to easily register the pawn as a source of stimuli, allowing it to be detected by other AI Perception components. This comes in handy, particularly in our case. Now, follow these steps:

1. With PlayerSense open, let's add a new component called AIPerceptionStimuliSource.
2. Then, under the Details tab, let's select AutoRegister as Source.
3. Next, we want to add new senses to create a source for. So, looking at Register as Source for Senses, there is an AISense array. Populate this array with the AISense_Sight blueprint in order to be detected by sight by other AI Perception components. You will note that there are also other senses to choose from—for example, AISense_Hearing, AISense_Touch, and so on. The complete settings are shown in the following screenshot:

This was pretty straightforward, considering our next process. This allows our player pawn to be detected by enemy AI whenever we get within their sense's configured range. Next, let's open our EnemyPatrol class and add the other AI Perception component to our AI. This component is called AIPerception and contains many other configurations, allowing you to customize and tailor the AI for different scenarios:

1. Clicking on the AI Perception component, you will notice that under the AI section, everything is grayed out. This is because we have configurations specific to each sense. This also applies if you create your own AI Sense classes.
2. Let's focus on two sections within this component: the first is the AI Perception settings, and the other is the event provided with this component:

The AI Perception section should look similar to the same section on AIPerceptionStimuliSource. The differences are that you have to register your senses, and you can also specify a dominant sense.
The dominant sense takes precedence over other senses perceived at the same location. Let's look at the Senses configuration and add a new element. This will populate the array with a new sense configuration, which you can then modify. For now, let's select the AI Sight configuration, and we can leave the default values as they are. In the game, we are able to visualize the configurations, allowing us to have more control over our senses. There is another configuration that allows you to specify affiliation, but at the time of writing, these options aren't available. When you click on Detection by Affiliation, you must select Detect Neutrals to detect any pawn with a Sight Sense source.

Next, we need to be able to notify our AI of a new target. We will do this by utilizing the event we saw as part of the AI Perception component. By navigating there, we can see an event called OnPerceptionUpdated. This will be called when there are changes in the sensory state, which makes the tracking of senses easy and straightforward. Let's move to the OnPerceptionUpdated event and perform the following:

1. Click on OnPerceptionUpdated and create it within the EventGraph.
2. Now, within the EventGraph, whenever this event is called, changes will have been made to the senses, and it will return the available sensed actors, as shown in the following screenshot:

Now that we understand how we will obtain our referenced sensed actors, we should create a way for our pawn to maintain different states of being, similar to what we would do in a Behavior Tree. Let's first establish a home location for our pawn to run to when the player is no longer detected by the AI. In the same Blueprint folder, we will create a subclass of Target Point. Let's name this Waypoint and place it at an appropriate location within the world. Now, we need to open this Waypoint subclass and create additional variables to maintain traversable routes.
We can do this by defining the next waypoint within a waypoint, allowing us to create what programmers call a linked list. This results in the AI being able to continuously move to the next available route after reaching the destination of its current route:

1. With Waypoint open, add a new variable named NextWaypoint and make its type the same as that of the Waypoint class we created.
2. Navigate back to our Content Browser. Now, within our EnemyPatrol AIController, let's focus on Event Begin Play in the EventGraph. We have to grab the reference to the waypoint we created earlier and store it within our AIController. So, let's create a new variable of the Waypoint type and name it CurrentPoint.
3. Now, on Event Begin Play, the first thing we need is the AIController, which is the self-reference for this EventGraph because we are in the AIController class. So, let's grab our self-reference and check whether it is valid. Safety first!
4. Next, we will get our AIController from our self-reference. Then, again for safety, let's check whether our AIController is valid.

How does our AI sense?

1. Next, we want to create a Get all Actors Of Class node and set the Actor class to Waypoint.
2. Now, we need to convert a few instructions into a macro because we will use these instructions throughout the project. So, let's select the nodes shown as follows and hit convert to macro. Lastly, rename this macro getAIController. You can see the final nodes in the following screenshot:
This will automatically create the type dragged from thepin, and we want to rename this Current Point to understand why this variable exists. Then, from our getAIController macro, we want to assign the ReceiveMoveCompleted event. This is done so that when our AI successfully moves to the next route, we can update the information and tell our AI to move to the next route. We learned AI sensing in Unreal Engine 4 with the help of a system within Unreal Engine called AI Perception components. We also explored different components within that system. If you found this post useful, be sure to check out the book, ‘Unreal Engine 4 AI Programming Essentials’ for more concepts on AI sensing in Unreal Engine. Development Tricks with Unreal Engine 4 What’s new in Unreal Engine 4.19? Unreal Engine 4.20 released with focus on mobile and immersive (AR/VR/MR) devices    

Prasad Ramesh
10 Sep 2018
13 min read

Build a custom news feed with Python [Tutorial]

To create a custom news feed, we need data that a model can be trained on. This training data will be fed into a model in order to teach it to discriminate between the articles that we'd be interested in and the ones that we would not. This article is an excerpt from a book written by Alexander T. Combs titled Python Machine Learning Blueprints: Intuitive data projects you can relate to. In this article, we will learn to build a custom news corpus and annotate a large number of articles according to our interests. You can download the code and other relevant files used in this article from this GitHub link.

Creating a supervised training dataset

Before we can create a model of our taste in news articles, we need training data. This training data will be fed into our model in order to teach it to discriminate between the articles that we'd be interested in and the ones that we would not. To build this corpus, we will need to annotate a large number of articles that correspond to these interests. For each article, we'll label it either "y" or "n". This will indicate whether the article is one that we would want sent to us in our daily digest or not.

To simplify this process, we will use the Pocket app. Pocket is an application that allows you to save stories to read later. You simply install the browser extension, and then click on the Pocket icon in your browser's toolbar when you wish to save a story. The article is saved to your personal repository. One of the great features of Pocket for our purposes is its ability to save the article with a tag of your choosing. We'll use this feature to mark interesting articles as "y" and non-interesting articles as "n".

Installing the Pocket Chrome extension

We use Google Chrome here, but other browsers should work similarly.
For Chrome, go into the Google App Store and look for the Extensions section:

Image from https://chrome.google.com/webstore/search/pocket

Click on the blue Add to Chrome button. If you already have an account, log in, and if you do not have an account, go ahead and sign up (it's free). Once this is complete, you should see the Pocket icon in the upper right-hand corner of your browser. It will be greyed out, but once there is an article you wish to save, you can click on it. It will turn red once the article has been saved, as seen in the following images.

The greyed out icon can be seen in the upper right-hand corner.

Image from https://news.ycombinator.com

When the icon is clicked, it turns red to indicate the article has been saved.

Image from https://www.wsj.com

Now comes the fun part! Begin saving all articles that you come across. Tag the interesting ones with "y", and the non-interesting ones with "n". This is going to take some work. Your end results will only be as good as your training set, so you're going to need to do this for hundreds of articles. If you forget to tag an article when you save it, you can always go to the site, http://www.get.pocket.com, to tag it there.

Using the Pocket API to retrieve stories

Now that you've diligently saved your articles to Pocket, the next step is to retrieve them. To accomplish this, we'll use the Pocket API. You can sign up for an account at https://getpocket.com/developer/apps/new. Click on Create New App in the upper left-hand side and fill in the details to get your API key. Make sure to click all of the permissions so that you can add, change, and retrieve articles.

Image from https://getpocket.com/developer

Once you have filled this in and submitted it, you will receive your CONSUMER KEY. You can find this in the upper left-hand corner under My Apps.
This will look like the following screen, but obviously with a real key:

Image from https://getpocket.com/developer

Once this is set, you are ready to move on to the next step, which is to set up the authorizations. This requires that you input your consumer key and a redirect URL. The redirect URL can be anything. Here I have used my Twitter account:

import requests
auth_params = {'consumer_key': 'MY_CONSUMER_KEY', 'redirect_uri': 'https://www.twitter.com/acombs'}
tkn = requests.post('https://getpocket.com/v3/oauth/request', data=auth_params)
tkn.content

You will see the following output:

The output will have the code that you'll need for the next step. Place the following in your browser bar:

https://getpocket.com/auth/authorize?request_token=some_long_code&redirect_uri=https%3A//www.twitter.com/acombs

If you change the redirect URL to one of your own, make sure to URL encode it. There are a number of resources for this. One option is to use the Python library urllib; another is to use a free online source. At this point, you should be presented with an authorization screen. Go ahead and approve it, and we can move on to the next step:

usr_params = {'consumer_key':'my_consumer_key', 'code': 'some_long_code'}
usr = requests.post('https://getpocket.com/v3/oauth/authorize', data=usr_params)
usr.content

We'll use the following output code here to move on to retrieving the stories. First, we retrieve the stories tagged "n":

no_params = {'consumer_key':'my_consumer_key', 'access_token': 'some_super_long_code', 'tag': 'n'}
no_result = requests.post('https://getpocket.com/v3/get', data=no_params)
no_result.text

The preceding code generates the following output:

Note that we have a long JSON string on all the articles that we tagged "n". There are several keys in this, but we are really only interested in the URL at this point.
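The JSON string returned above has a "list" object keyed by item ID, with each item carrying a resolved_url field. Pulling the URLs out of that shape can be tried on a made-up stand-in payload (the data here is invented; only the field names follow the Pocket response used in this article):

```python
import json

# A miniature stand-in for the Pocket /v3/get response body,
# containing only the fields we actually read.
response_text = json.dumps({
    "list": {
        "101": {"resolved_url": "http://example.com/story-a"},
        "102": {"resolved_url": "http://example.com/story-b"},
    }
})

payload = json.loads(response_text)
urls = [item.get("resolved_url") for item in payload["list"].values()]
print(sorted(urls))
```

Using .get("resolved_url") rather than indexing means an item missing that field yields None instead of raising, which is why the DataFrame built from these URLs is followed by a dropna in the text.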
We'll go ahead and create a list of all the URLs from this:

import json

no_jf = json.loads(no_result.text)
no_jd = no_jf['list']
no_urls = []
for i in no_jd.values():
    no_urls.append(i.get('resolved_url'))
no_urls

The preceding code generates the following output:

This list contains all the URLs of stories that we aren't interested in. Now, let's put this in a DataFrame object and tag it as such:

import pandas as pd

no_uf = pd.DataFrame(no_urls, columns=['urls'])
no_uf = no_uf.assign(wanted = lambda x: 'n')
no_uf

The preceding code generates the following output:

Now, we're all set with the unwanted stories. Let's do the same thing with the stories that we are interested in:

yes_params = {'consumer_key': 'my_consumer_key', 'access_token': 'some_super_long_token', 'tag': 'y'}
yes_result = requests.post('https://getpocket.com/v3/get', data=yes_params)
yes_jf = json.loads(yes_result.text)
yes_jd = yes_jf['list']
yes_urls = []
for i in yes_jd.values():
    yes_urls.append(i.get('resolved_url'))
yes_uf = pd.DataFrame(yes_urls, columns=['urls'])
yes_uf = yes_uf.assign(wanted = lambda x: 'y')
yes_uf

The preceding code generates the following output:

Now that we have both types of stories for our training data, let's join them together into a single DataFrame:

df = pd.concat([yes_uf, no_uf])
df.dropna(inplace=1)
df

The preceding code generates the following output:

Now that we're set with all our URLs and their corresponding tags in a single frame, we'll move on to downloading the HTML for each article. We'll use another free service for this called embed.ly.

Using the embed.ly API to download story bodies

We have all the URLs for our stories, but unfortunately this isn't enough to train on. We'll need the full article body. By itself, this could become a huge challenge if we wanted to roll our own scraper, especially if we were going to be pulling stories from dozens of sites. We would need to write code to target the article body while carefully avoiding all the other site gunk that surrounds it.
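To get a feel for the markup-stripping part of that job without any external service or library, the standard library's HTMLParser can do a rough version of it. This is a simplified, dependency-free sketch, not a replacement for a real extraction service:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text nodes, discarding all tags —
    a rough stand-in for a full article-body extractor."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def get_text(self):
        return "".join(self.chunks)

def strip_html(html):
    parser = TextExtractor()
    parser.feed(html)
    return parser.get_text()

print(strip_html("<p>Hello <b>world</b></p>"))  # Hello world
```

This strips tags but makes no attempt to separate the article body from navigation, ads, and comments, which is exactly the hard part that a dedicated extraction service handles for us.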
Fortunately, there are a number of free services that will do this for us. We're going to use embed.ly to do this, but there are a number of other services that you could use. The first step is to sign up for embed.ly API access. You can do this at https://app.embed.ly/signup. This is a straightforward process. Once you confirm your registration, you will receive an API key. You then just use this key in your HTTP request. Let's do this now:

import urllib

def get_html(x):
    qurl = urllib.parse.quote(x)
    rhtml = requests.get('https://api.embedly.com/1/extract?url=' + qurl + '&key=some_api_key')
    ctnt = json.loads(rhtml.text).get('content')
    return ctnt

df.loc[:,'html'] = df['urls'].map(get_html)
df.dropna(inplace=1)
df

The preceding code generates the following output:

With that, we have the HTML of each story. As the content is embedded in HTML markup and we want to feed plain text into our model, we'll use a parser to strip out the markup tags:

from bs4 import BeautifulSoup

def get_text(x):
    soup = BeautifulSoup(x, 'lxml')
    text = soup.get_text()
    return text

df.loc[:,'text'] = df['html'].map(get_text)
df

The preceding code generates the following output:

With this, we have our training set ready. We can now move on to a discussion of how to transform our text into something that a model can work with.

Setting up your daily personal newsletter

In order to set up a personal e-mail with news stories, we're going to utilize IFTTT again. As in Chapter 3, Build an App to Find Cheap Airfares, we'll use the Maker Channel to send a POST request. However, this time the payload will be our news stories. If you haven't set up the Maker Channel, do this now. Instructions can be found in Chapter 3, Build an App to Find Cheap Airfares. You should also set up the Gmail channel. Once that is complete, we'll add a recipe to combine the two. First, click on Create a Recipe from the IFTTT home page.
Then, search for the Maker Channel:

Image from https://www.iftt.com

Select this, then select Receive a web request:

Image from https://www.iftt.com

Then, give the request a name. I'm using news_event:

Image from https://www.iftt.com

Finish by clicking on Create Trigger. Next, click on that to set up the e-mail piece. Search for Gmail and click on the icon seen as follows:

Image from https://www.iftt.com

Once you have clicked on Gmail, click on Send an e-mail. From here, you can customize your e-mail message.

Image from https://www.iftt.com

Input your e-mail address, a subject line, and finally, include Value1 in the e-mail body. We will pass our story title and link into this with our POST request. Click on Create Recipe to finalize this.

Now, we're ready to generate the script that will run on a schedule, automatically sending us articles of interest. We're going to create a separate script for this, but one last thing that we need to do in our existing code is serialize our vectorizer and our model:

import pickle
pickle.dump(model, open(r'/Users/alexcombs/Downloads/news_model_pickle.p', 'wb'))
pickle.dump(vect, open(r'/Users/alexcombs/Downloads/news_vect_pickle.p', 'wb'))

With this, we have saved everything that we need from our model. In our new script, we will read these in to generate our new predictions. We're going to use the same scheduling library to run the code that we used in Chapter 3, Build an App to Find Cheap Airfares. Putting it all together, we have the following script:

# get our imports.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
import schedule
import time
import pickle
import json
import gspread
import requests
from bs4 import BeautifulSoup
from oauth2client.client import SignedJwtAssertionCredentials

# create our fetching function
def fetch_news():
    try:
        vect = pickle.load(open(r'/Users/alexcombs/Downloads/news_vect_pickle.p', 'rb'))
        model = pickle.load(open(r'/Users/alexcombs/Downloads/news_model_pickle.p', 'rb'))
        json_key = json.load(open(r'/Users/alexcombs/Downloads/APIKEY.json'))
        scope = ['https://spreadsheets.google.com/feeds']
        credentials = SignedJwtAssertionCredentials(json_key['client_email'], json_key['private_key'].encode(), scope)
        gc = gspread.authorize(credentials)
        ws = gc.open("NewStories")
        sh = ws.sheet1
        zd = list(zip(sh.col_values(2), sh.col_values(3), sh.col_values(4)))
        zf = pd.DataFrame(zd, columns=['title', 'urls', 'html'])
        zf.replace('', pd.np.nan, inplace=True)
        zf.dropna(inplace=True)

        def get_text(x):
            soup = BeautifulSoup(x, 'lxml')
            text = soup.get_text()
            return text

        zf.loc[:, 'text'] = zf['html'].map(get_text)
        tv = vect.transform(zf['text'])
        res = model.predict(tv)
        rf = pd.DataFrame(res, columns=['wanted'])
        rez = pd.merge(rf, zf, left_index=True, right_index=True)
        news_str = ''
        for t, u in zip(rez[rez['wanted'] == 'y']['title'], rez[rez['wanted'] == 'y']['urls']):
            news_str = news_str + t + '\n' + u + '\n'
        payload = {"value1": news_str}
        r = requests.post('https://maker.ifttt.com/trigger/news_event/with/key/IFTTT_KEY', data=payload)
        # cleanup worksheet
        lenv = len(sh.col_values(1))
        cell_list = sh.range('A1:F' + str(lenv))
        for cell in cell_list:
            cell.value = ""
        sh.update_cells(cell_list)
        print(r.text)
    except:
        print('Failed')

schedule.every(480).minutes.do(fetch_news)

while 1:
    schedule.run_pending()
    time.sleep(1)

What this script will do is run every 480 minutes (8 hours), pull down the news stories from Google Sheets, run the stories through the model, generate an e-mail by sending a POST request to IFTTT for the stories that are predicted to be of interest, and then, finally, clear out the stories in the spreadsheet so that only new stories get sent in the next e-mail.

Congratulations! You now have your own personalized news feed!

In this tutorial, we learned how to create a custom news feed. To know more about setting it up and other intuitive Python projects, check out Python Machine Learning Blueprints: Intuitive data projects you can relate to.

Writing web services with functional Python programming [Tutorial]
Visualizing data in R and Python using Anaconda [Tutorial]
Python 3.7 beta is available as the second generation Google App Engine standard runtime
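The serialization step that links the two scripts is a plain pickle dump/load round-trip. A minimal sketch with a stand-in object in place of the fitted model (the file name echoes the one used above; the dictionary contents are invented):

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model; any picklable object round-trips the same way.
model = {"weights": [0.1, 0.2, 0.3], "vocab": ["news", "viral"]}

path = os.path.join(tempfile.gettempdir(), "news_model_pickle.p")

# What the training script does at the end
with open(path, "wb") as f:
    pickle.dump(model, f)

# What the scheduled script does at startup
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)  # True
```

Both the vectorizer and the model must be saved: the scheduled script needs the exact same vectorizer vocabulary to transform new titles into the feature space the model was trained on.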

Natasha Mathur
09 Sep 2018
10 min read

Implementing Dependency Injection in Google Guice [Tutorial]
Choosing a framework wisely is important when implementing Dependency Injection, as each framework has its own advantages and disadvantages. There are various Java-based dependency injection frameworks available in the open source community, such as Dagger, Google Guice, Spring DI, Java EE 8 DI, and PicoContainer. In this article, we will learn about Google Guice (pronounced "juice"), a lightweight DI framework that helps developers to modularize applications. Guice encapsulates the annotation and generics features introduced by Java 5 to make code type-safe. It enables objects to be wired together and tested with less effort. Annotations help you to write less error-prone and more reusable code. This tutorial is an excerpt taken from the book 'Java 9 Dependency Injection', written by Krunal Patel and Nilang Patel. In Guice, the new keyword is replaced with @Inject for injecting dependencies. It allows constructor, field, and method (any method, with any number of arguments) level injection. Using Guice, we can define custom scopes and handle circular dependencies. It also has features to integrate with Spring and AOP interception. Moreover, Guice implements Java Specification Request (JSR) 330 and uses the standard annotations provided by JSR-330. The first version of Guice was introduced by Google in 2007, and the latest version is Guice 4.1. Before we see how dependency injection is implemented in Guice, let's first set up Guice.

Guice setup

To keep our code simple, throughout this tutorial we are going to use a Maven project to understand Guice DI. Let's create a simple Maven project using the following parameters: groupId: com.packt.guice.di, artifactId: chapter4, and version: 0.0.1-SNAPSHOT.
By adding the Guice 4.1.0 dependency to the pom.xml file, our final pom.xml will look like this:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.packt.guice.di</groupId>
  <artifactId>chapter4</artifactId>
  <packaging>jar</packaging>
  <version>0.0.1-SNAPSHOT</version>
  <name>chapter4</name>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.google.inject</groupId>
      <artifactId>guice</artifactId>
      <version>4.1.0</version>
    </dependency>
  </dependencies>
  <build>
    <finalName>chapter4</finalName>
  </build>
</project>

For this tutorial, we have used JDK 9, but not as a modular project, because the Guice library is not available as a Java 9 modular jar.

Basic injection in Guice

We have set up Guice; now it is time to understand how injection works in Guice. Let's rewrite the example of a notification system using Guice, and along the way we will see several indispensable interfaces and classes in Guice. We have a base interface called NotificationService, which expects a message and recipient details as arguments:

public interface NotificationService {
  boolean sendNotification(String message, String recipient);
}

The SMSService concrete class is an implementation of the NotificationService interface. Here, we will apply the @Singleton annotation to the implementation class. Since service objects will be created through injector classes, this annotation tells them that the service class should be a singleton object.
Because of JSR-330 support in Guice, annotations from either the javax.inject or the com.google.inject package can be used:

import javax.inject.Singleton;
import com.packt.guice.di.service.NotificationService;

@Singleton
public class SMSService implements NotificationService {
  public boolean sendNotification(String message, String recipient) {
    // Write code for sending SMS
    System.out.println("SMS has been sent to " + recipient);
    return true;
  }
}

In the same way, we can also implement another service, such as sending notifications to a social media platform, by implementing the NotificationService interface. It's time to define the consumer class, where we can initialize the service class for the application. In Guice, the @Inject annotation is used to define setter-based as well as constructor-based dependency injection. An instance of this class is used to send notifications via the available communication services. Our AppConsumer class defines setter-based injection as follows:

import javax.inject.Inject;
import com.packt.guice.di.service.NotificationService;

public class AppConsumer {
  private NotificationService notificationService;

  // Setter-based DI
  @Inject
  public void setService(NotificationService service) {
    this.notificationService = service;
  }

  public boolean sendNotification(String message, String recipient) {
    // Business logic
    return notificationService.sendNotification(message, recipient);
  }
}

Guice needs to recognize which service implementation to use, so we configure it by extending the AbstractModule class and providing an implementation for the configure() method.
Here is an example of an injector configuration:

import com.google.inject.AbstractModule;
import com.packt.guice.di.impl.SMSService;
import com.packt.guice.di.service.NotificationService;

public class ApplicationModule extends AbstractModule {
  @Override
  protected void configure() {
    // bind service to implementation class
    bind(NotificationService.class).to(SMSService.class);
  }
}

In the preceding class, the module implementation specifies that an instance of SMSService is to be injected wherever a NotificationService variable is found. In the same way, we just need to define a binding for a new service implementation, if required. Binding in Guice is similar to wiring in Spring:

import com.google.inject.Guice;
import com.google.inject.Injector;
import com.packt.guice.di.consumer.AppConsumer;
import com.packt.guice.di.injector.ApplicationModule;

public class NotificationClient {
  public static void main(String[] args) {
    Injector injector = Guice.createInjector(new ApplicationModule());
    AppConsumer app = injector.getInstance(AppConsumer.class);
    app.sendNotification("Hello", "9999999999");
  }
}

In the preceding program, the Injector object is created using the Guice class's createInjector() method, by passing the ApplicationModule class's implementation object. By using the injector's getInstance() method, we can initialize the AppConsumer class. While creating the AppConsumer object, Guice injects the required service class implementation (SMSService, in our case). The following is the output of running the previous code:

SMS has been sent to 9999999999

So, this is how Guice dependency injection works compared to other DI frameworks. Guice has embraced a code-first technique for dependency injection, so management of numerous XML files is not required. Let's test our client application by writing a JUnit test case. We can simply mock the service implementation of SMSService, so there is no need to use the actual service.
The MockSMSService class looks like this:

import com.packt.guice.di.service.NotificationService;

public class MockSMSService implements NotificationService {
  public boolean sendNotification(String message, String recipient) {
    System.out.println("In Test Service :: " + message + " Recipient :: " + recipient);
    return true;
  }
}

The following is the JUnit 4 test case for the client application:

import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import com.google.inject.AbstractModule;
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.packt.guice.di.consumer.AppConsumer;
import com.packt.guice.di.impl.MockSMSService;
import com.packt.guice.di.service.NotificationService;

public class NotificationClientTest {
  private Injector injector;

  @Before
  public void setUp() throws Exception {
    injector = Guice.createInjector(new AbstractModule() {
      @Override
      protected void configure() {
        bind(NotificationService.class).to(MockSMSService.class);
      }
    });
  }

  @After
  public void tearDown() throws Exception {
    injector = null;
  }

  @Test
  public void test() {
    AppConsumer appTest = injector.getInstance(AppConsumer.class);
    Assert.assertEquals(true, appTest.sendNotification("Hello There", "9898989898"));
  }
}

Take note that we are binding the MockSMSService class to NotificationService through an anonymous class implementation of AbstractModule. This is done in the setUp() method, which runs before the test methods run.

Guice dependency injection

Now that we know what dependency injection is, let us explore how Google Guice provides injection. We have seen that the injector helps to resolve dependencies by reading configurations from modules, which are called bindings. The injector prepares an object graph for the requested objects.
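To see the bind-and-resolve idea in one place, here is a language-agnostic sketch, written in Python for brevity, of what `bind(NotificationService.class).to(SMSService.class)` followed by `injector.getInstance(...)` accomplishes. The `Injector` class below is a toy stand-in, not Guice's API:

```python
# Toy injector: maps an "interface" to an implementation class and constructs
# the bound implementation on request. Illustrative only -- not Guice.
class Injector:
    def __init__(self):
        self._bindings = {}

    def bind(self, interface, implementation):
        self._bindings[interface] = implementation

    def get_instance(self, interface):
        return self._bindings[interface]()  # construct the bound class

class NotificationService:                  # the service "interface"
    def send_notification(self, message, recipient):
        raise NotImplementedError

class SMSService(NotificationService):      # a concrete implementation
    def send_notification(self, message, recipient):
        return 'SMS has been sent to ' + recipient

injector = Injector()
injector.bind(NotificationService, SMSService)
service = injector.get_instance(NotificationService)
print(service.send_notification('Hello', '9999999999'))
```

Swapping the binding to a mock class is exactly how the JUnit test above substitutes MockSMSService without touching the consumer code.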
Dependency injection is managed by injectors using various types of injection:

Constructor injection
Method injection
Field injection
Optional injection
Static injection

Constructor injection

Constructor injection can be achieved by using the @Inject annotation at the constructor level. This constructor should accept class dependencies as arguments. The constructor will then assign the arguments to final fields:

public class AppConsumer {
  private NotificationService notificationService;

  // Constructor-level injection
  @Inject
  public AppConsumer(NotificationService service) {
    this.notificationService = service;
  }

  public boolean sendNotification(String message, String recipient) {
    // Business logic
    return notificationService.sendNotification(message, recipient);
  }
}

If our class does not have a constructor annotated with @Inject, a default constructor with no arguments is assumed. Constructor injection works perfectly when we have a single constructor through which the class accepts its dependencies, and it is helpful for unit testing. It is also simple because Java maintains the constructor invocation, so you don't have to worry about objects arriving in an uninitialized state.

Method injection

Guice allows us to define injection at the method level by annotating methods with the @Inject annotation. This is similar to the setter injection available in Spring. In this approach, dependencies are passed as parameters and are resolved by the injector before invocation of the method. The name of the method and the number of parameters do not affect method injection:

private NotificationService notificationService;

// Setter injection
@Inject
public void setService(NotificationService service) {
  this.notificationService = service;
}

This can be valuable when we don't want to control the instantiation of classes. We can also use it when we have a superclass that needs some dependencies.
(This is difficult to achieve with constructor injection.)

Field injection

Fields can be injected with the @Inject annotation in Guice. This is a simple and short form of injection, but it makes the field untestable if used with the private access modifier. It is advisable to avoid the following:

@Inject
private NotificationService notificationService;

Optional injection

Guice provides a way to declare an injection as optional. Method and field injections might be optional, which causes Guice to silently ignore them when the dependencies aren't available. Optional injection can be used by specifying the @Inject(optional=true) annotation:

public class AppConsumer {
  private static final String DEFAULT_MSG = "Hello";
  private String message = DEFAULT_MSG;

  @Inject(optional=true)
  public void setDefaultMessage(@Named("SMS") String message) {
    this.message = message;
  }
}

Static injection

Static injection is helpful when we have to migrate a static factory implementation to Guice. It makes it feasible for objects to partially participate in dependency injection by gaining access to injected types without being injected themselves. In a module, to specify classes to be injected on injector creation, use requestStaticInjection(). For example, NotificationUtil is a utility class that provides a static method, timeZoneFormat, which formats a date string in a given format and returns the date and timezone. The format string is hardcoded in NotificationUtil, and we will attempt to inject this utility class statically. Consider that we have one private static String variable, timeZoneFmt, with setter and getter methods. We will use @Inject for the setter injection, using the @Named parameter.
NotificationUtil will look like this:

@Inject static String timeZoneFmt = "yyyy-MM-dd'T'HH:mm:ss";

@Inject
public static void setTimeZoneFmt(@Named("timeZoneFmt") String timeZoneFmt) {
  NotificationUtil.timeZoneFmt = timeZoneFmt;
}

Now, SMSUtilModule should look like this:

class SMSUtilModule extends AbstractModule {
  @Override
  protected void configure() {
    bindConstant().annotatedWith(Names.named("timeZoneFmt")).to("yyyy-MM-dd'T'HH:mm:ss");
    requestStaticInjection(NotificationUtil.class);
  }
}

This API is not recommended for general use, since it suffers from many of the same problems as static factories: it is difficult to test, and it makes dependencies opaque. To sum up what we learned in this tutorial: we began with basic dependency injection, and then we learned how dependency injection works in Guice, with examples. If you found this post useful, be sure to check out the book 'Java 9 Dependency Injection' to learn more about Google Guice and other concepts in dependency injection.

Learning Dependency Injection (DI)
Angular 2 Dependency Injection: A powerful design pattern
Savia Lobo
08 Sep 2018
12 min read

Building Recommendation System with Scala and Apache Spark [Tutorial]
Recommendation systems can be defined as software applications that draw on and learn from data such as user preferences, actions (clicks, for example), and browsing history, and generate recommendations: products that the system determines will appeal to the user in the immediate future. In this tutorial, we will learn to build a recommendation system with Scala and Apache Spark. This article is an excerpt taken from Modern Scala Projects, written by Ilango Gurusamy.

What does a recommendation system look like

The following diagram is representative of a typical recommendation system:

Recommendation system

The preceding diagram can be thought of as a recommendation ecosystem, with the recommendation system at its heart. This system needs three entities:

Users
Products
Transactions between users and products, where transactions contain feedback from users about products

Implementation and deployment

Implementation is documented in the following subsections. All code is developed in the IntelliJ code editor. The very first step is to create an empty Scala project called Chapter7.

Step 1 – creating the Scala project

Let's create a Scala project called Chapter7 with the following artifacts:

RecommendationSystem.scala
RecommendationWrapper.scala

Let's break down the project's structure:

.idea: Generated IntelliJ configuration files.
project: Contains build.properties and plugins.sbt.
project/assembly.sbt: This file specifies the sbt-assembly plugin needed to build a fat JAR for deployment.
src/main/scala: A folder that houses Scala source files in the com.packt.modern.chapter7 package.
target: This is where artifacts of the compile process are stored. The generated assembly JAR file goes here.
build.sbt: The main SBT configuration file. Spark and its dependencies are specified here.

At this point, we will start developing code in the IntelliJ code editor.
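Before any Spark code, the three entities above can be pictured as plain data: users, products, and transactions that carry user feedback as (user, item, rating) triples. The following Python sketch is purely illustrative (the names and values are made up, not from the chapter):

```python
# Transactions between users and products, with feedback (a rating) attached.
transactions = [
    ('user1', 'itemA', 5.0),
    ('user1', 'itemB', 2.0),
    ('user2', 'itemA', 4.0),
]

def ratings_by_user(txns):
    """Group (user, item, rating) triples into {user: {item: rating}}."""
    out = {}
    for user, item, rating in txns:
        out.setdefault(user, {})[item] = rating
    return out

print(ratings_by_user(transactions))
```

This is exactly the shape of data the recommendation system consumes: the training code below builds the same triples from past sales orders.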
We will start with the RecommendationWrapper Scala file and end with the deployment of the final application JAR into Spark with spark-submit.

Step 2 – creating the RecWrapper definition

Let's create the trait definition. The trait will hold the SparkSession variable, schema definitions for the datasets, and methods to build a dataframe:

trait RecWrapper {
}

Next, let's create a schema for past weapon sales orders.

Step 3 – creating a weapon sales orders schema

Let's create a schema for the past sales order dataset:

val salesOrderSchema: StructType = StructType(Array(
  StructField("sCustomerId", IntegerType, false),
  StructField("sCustomerName", StringType, false),
  StructField("sItemId", IntegerType, true),
  StructField("sItemName", StringType, true),
  StructField("sItemUnitPrice", DoubleType, true),
  StructField("sOrderSize", DoubleType, true),
  StructField("sAmountPaid", DoubleType, true)
))

Next, let's create a schema for weapon sales leads.

Step 4 – creating a weapon sales leads schema

Here is a schema definition for the weapon sales lead dataset:

val salesLeadSchema: StructType = StructType(Array(
  StructField("sCustomerId", IntegerType, false),
  StructField("sCustomerName", StringType, false),
  StructField("sItemId", IntegerType, true),
  StructField("sItemName", StringType, true)
))

Next, let's build a weapon sales order dataframe.

Step 5 – building a weapon sales order dataframe

Let's invoke the read method on our SparkSession instance and cache it.
We will call this method later from the RecSystem object:

def buildSalesOrders(dataSet: String): DataFrame = {
  session.read
    .format("com.databricks.spark.csv")
    .option("header", true).schema(salesOrderSchema).option("nullValue", "")
    .option("treatEmptyValuesAsNulls", "true")
    .load(dataSet).cache()
}

Next up, let's build a sales leads dataframe:

def buildSalesLeads(dataSet: String): DataFrame = {
  session.read
    .format("com.databricks.spark.csv")
    .option("header", true).schema(salesLeadSchema).option("nullValue", "")
    .option("treatEmptyValuesAsNulls", "true")
    .load(dataSet).cache()
}

This completes the trait. Overall, it looks like this:

trait RecWrapper {
  // 1) Create a lazy SparkSession instance and call it session.
  // 2) Create a schema for the past sales orders dataset.
  // 3) Create a schema for the sales lead dataset.
  // 4) Write a method that creates a dataframe holding past sales order data.
  //    This method takes in the sales order dataset and returns a dataframe.
  // 5) Write a method that creates a dataframe holding sales lead data.
}

Bring in the following imports:

import org.apache.spark.mllib.recommendation.{ALS, Rating}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

Create a Scala object called RecSystem:

object RecSystem extends App with RecWrapper {
}

Before going any further, bring in the following imports:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

Inside this object, start by loading the past sales order data. This will be our training data. Load the sales order dataset, as follows:

val salesOrdersDf = buildSalesOrders("sales\\PastWeaponSalesOrders.csv")

Verify the schema.
This is what the schema looks like:

salesOrdersDf.printSchema()

root
 |-- sCustomerId: integer (nullable = true)
 |-- sCustomerName: string (nullable = true)
 |-- sItemId: integer (nullable = true)
 |-- sItemName: string (nullable = true)
 |-- sItemUnitPrice: double (nullable = true)
 |-- sOrderSize: double (nullable = true)
 |-- sAmountPaid: double (nullable = true)

Here is a partial view of a dataframe displaying past weapon sales order data:

Partial view of dataframe displaying past weapon sales order data

Now, we have what we need to create a dataframe of ratings:

val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder =>
  Rating(
    salesOrder.getInt(0),
    salesOrder.getInt(2),
    salesOrder.getDouble(6)
  )
).toDF("user", "item", "rating")

Save all and compile the project at the command line:

C:\Path\To\Your\Project\Chapter7>sbt compile

You are likely to run into the following error:

[error] C:\Path\To\Your\Project\Chapter7\src\main\scala\com\packt\modern\chapter7\RecSystem.scala:50:50: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
[error] val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder =>
[error]                                               ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed

To fix this, place the following import above the declaration of the ratings dataframe. It should look like this:

import session.implicits._

val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder =>
  Rating(
    salesOrder.getInt(0),
    salesOrder.getInt(2),
    salesOrder.getDouble(6)
  )
).toDF("user", "item", "rating")

Save and recompile the project. This time, it compiles just fine. Next, import the Rating class from the org.apache.spark.mllib.recommendation package.
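As an aside, the map above is just positional column extraction: each order row keeps column 0 (customer id), column 2 (item id), and column 6 (amount paid, used as the rating). In plain Python, with a hypothetical row tuple:

```python
# Mirror of salesOrder.getInt(0), getInt(2), getDouble(6): pick three columns
# from a sales-order row to form a (user, item, rating) triple. The sample row
# values are made up for illustration.
def to_rating(order_row):
    return (order_row[0], order_row[2], order_row[6])

row = (101, 'CustomerName', 7, 'ItemName', 9.5, 2.0, 19.0)
print(to_rating(row))  # (101, 7, 19.0)
```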
This transforms the ratings dataframe that we obtained previously into its RDD equivalent:

val ratings: RDD[Rating] = ratingsDf.rdd.map( row =>
  Rating(
    row.getInt(0),
    row.getInt(1),
    row.getDouble(2)
  )
)
println("Ratings RDD is: " + ratings.take(10).mkString(" "))

The following few lines of code are very important. We will be using the ALS algorithm from Spark MLlib to create and train a MatrixFactorizationModel, which takes an RDD[Rating] object as input. The ALS train method may require a combination of the following training hyperparameters:

numBlocks: Preset to -1 in an auto-configuration setting. This parameter is meant to parallelize computation.
custRank: The number of features, otherwise known as latent factors.
iterations: The number of iterations for ALS to execute. For a reasonable solution to converge, this algorithm needs roughly 20 iterations or fewer.
regParam: The regularization parameter.
implicitPrefs: A specifier that lets us use either explicit feedback or implicit feedback.
alpha: A hyperparameter connected to the implicit feedback variant of the ALS algorithm. Its role is to govern the baseline confidence in preference observations.

We just explained the role played by each parameter needed by the ALS algorithm's train method. Let's get started by bringing in the following import:

import org.apache.spark.mllib.recommendation.MatrixFactorizationModel

Now, let's get down to training the matrix factorization model using the ALS algorithm. Let's train a matrix factorization model given an RDD of ratings by customers (users) for certain items (products). Our train method on the ALS algorithm will take the following four parameters:

Ratings.
A rank.
A number of iterations.
A lambda value, or regularization parameter:

val ratingsModel: MatrixFactorizationModel = ALS.train(ratings,
  6,    /* the rank */
  10,   /* number of iterations */
  15.0  /* lambda, or regularization parameter */
)

Next, we load the sales lead file and convert it into a tuple format:

val weaponSalesLeadDf = buildSalesLeads("sales\\ItemSalesLeads.csv")

In the next section, we will display the new weapon sales lead dataframe.

Step 6 – displaying the weapons sales dataframe

First, we must invoke the show method:

println("Weapons Sales Lead dataframe is: ")
weaponSalesLeadDf.show

Here is a view of the weapon sales lead dataframe:

View of weapon sales lead dataframe

Next, create a version of the sales lead dataframe structured as (customer, item) tuples:

val customerWeaponsSystemPairDf: DataFrame = weaponSalesLeadDf.map(salesLead => (
  salesLead.getInt(0),
  salesLead.getInt(2)
)).toDF("user", "item")

In the next section, let's display the dataframe that we just created.

Step 7 – displaying the customer-weapons-system dataframe

Let's invoke the show method, as follows:

println("The Customer-Weapons System dataframe as tuple pairs looks like: ")
customerWeaponsSystemPairDf.show

Here is a screenshot of the new customer-weapons-system dataframe as tuple pairs:

New customer-weapons-system dataframe as tuple pairs

Next, we will convert the preceding dataframe into an RDD:

val customerWeaponsSystemPairRDD: RDD[(Int, Int)] = customerWeaponsSystemPairDf.rdd.map(row =>
  (row.getInt(0), row.getInt(1))
)
/* Note: as far as the algorithm is concerned, "user" corresponds to a customer,
   and "item" (or product) corresponds to a weapons system. */

We previously created a MatrixFactorizationModel that we trained with the weapons system sales orders dataset. We are now in a position to predict how each customer country may rate a weapon system in the future. In the next section, we will generate predictions.

Step 8 – generating predictions

Here is how we will generate predictions.
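As a conceptual aside before the Spark code: prediction from a factorized model is just the dot product of a learned user vector and item vector, each of length rank. The toy, pure-Python sketch below illustrates the rank/iterations/lambda hyperparameters discussed above; it uses stochastic gradient descent rather than alternating least squares, so it is an illustration of the idea, not Spark's actual solver:

```python
import random

def factorize(ratings, rank=2, iterations=500, lam=0.01, lr=0.02, seed=0):
    """Learn latent-factor vectors U[user], V[item] so that dot(U[u], V[i])
    approximates each observed rating r in (u, i, r) triples. Toy SGD, not ALS."""
    rng = random.Random(seed)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    U = {u: [rng.uniform(-0.1, 0.1) for _ in range(rank)] for u in users}
    V = {i: [rng.uniform(-0.1, 0.1) for _ in range(rank)] for i in items}
    for _ in range(iterations):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                U[u][k] += lr * (err * vk - lam * uk)  # regularized updates
                V[i][k] += lr * (err * uk - lam * vk)
    return U, V

def predict(U, V, user, item):
    return sum(a * b for a, b in zip(U[user], V[item]))

train = [(1, 10, 4.0), (1, 11, 1.0), (2, 10, 4.5), (2, 12, 2.0)]
U, V = factorize(train)
print(round(predict(U, V, 1, 10), 2))  # close to the observed 4.0
```

Once the factors are learned, predicting for an unseen (user, item) pair is the same dot product, which is what ratingsModel.predict does below at cluster scale.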
The predict method of our model is designed to do just that. It will generate a predictions RDD that we call weaponRecs. It represents the ratings of weapons systems that were not previously rated by the customer nations listed in the past sales order data:

val weaponRecs: RDD[Rating] = ratingsModel.predict(customerWeaponsSystemPairRDD).distinct()

Next up, we will display the final predictions.

Step 9 – displaying predictions

Here is how to display the predictions, lined up in tabular format:

println("Future ratings are: ")
weaponRecs.foreach(rating => {
  println("Customer: " + rating.user + " Product: " + rating.product + " Rating: " + rating.rating)
})

The following table displays how each nation is expected to rate a certain system in the future, that is, a weapon system that they did not rate earlier:

System rating by each nation

Our recommendation system proved itself capable of generating future predictions. Up until now, we did not say how all of the preceding code is compiled and deployed. We will look at this in the next section.

Compilation and deployment

Compiling the project

Invoke sbt compile at the root folder of your Chapter7 project. You should get the following output:

Output on compiling the project

Besides loading build.sbt, the compile task also loads settings from assembly.sbt, which we will create below.

What is an assembly.sbt file?

We have not yet talked about the assembly.sbt file. Our Scala-based Spark application is a Spark job that will be submitted to a (local) Spark cluster as a JAR file. This file, apart from the Spark libraries, also needs other dependencies in it for our recommendation system job to complete successfully. The name "fat JAR" comes from having all dependencies bundled in one JAR. To build such a fat JAR, we need the sbt-assembly plugin. This explains the need for creating a new assembly.sbt with the assembly plugin.
Creating assembly.sbt

Create a new assembly.sbt in your IntelliJ project view and save it under your project folder, as follows:

Creating assembly.sbt

Contents of assembly.sbt

Paste the following contents into the newly created assembly.sbt (under the project folder). The output should look like this:

Output on placing contents of assembly.sbt

The sbt-assembly plugin, version 0.14.7, gives us the ability to run an sbt assembly task. With that, we are one step closer to building a fat, or uber, JAR. This action is documented in the next step.

Running the sbt assembly task

Issue the sbt assembly command, as follows:

Running the sbt assembly command

This time, the assembly task loads the assembly plugin in assembly.sbt. However, assembly halts because of a common duplicate error. This error arises from duplicates: multiple copies of dependency files that need removal before the assembly task can complete successfully. To address this situation, build.sbt needs an upgrade.

Upgrading the build.sbt file

The following lines of code need to be added, as follows:

Code lines for upgrading the build.sbt file

To test the effect of your changes, save the file and go to the command line to reissue the sbt assembly task.

Rerunning the assembly command

Run the assembly task, as follows:

Rerunning the assembly task

This time, the settings in the assembly.sbt file are loaded and the task completes successfully. To verify, drill down to the target folder. If everything went well, you should see a fat JAR, as follows:

Output as a JAR file

The JAR file under the target folder is the recommendation system application's JAR file that needs to be deployed into Spark. This is documented in the next step.

Deploying the recommendation application

The spark-submit command is how we will deploy the application into Spark. Here are two formats for the spark-submit command.
The first one is a long one, which sets more parameters than the second:

spark-submit --class "com.packt.modern.chapter7.RecSystem" \
  --master local[2] \
  --deploy-mode client \
  --driver-memory 16g \
  --num-executors 2 \
  --executor-memory 2g \
  --executor-cores 2 <path-to-jar>

Leaning on the preceding format, let's submit our Spark job, supplying various parameters to it:

Parameters for Spark

The different parameters are explained as follows:

Tabular explanation of parameters for the Spark job

We used Spark's support for recommendations to build a prediction model that generated recommendations, and leveraged Spark's alternating least squares algorithm to implement our collaborative filtering recommendation system. If you've enjoyed reading this post, do check out the book Modern Scala Projects to gain insights into data that will help organizations gain a strategic and competitive advantage.

How to build a music recommendation system with the PageRank algorithm
Recommendation Systems
Building a Recommendation System with Azure

Bhagyashree R
07 Sep 2018
11 min read

Building a Twitter news bot using Twitter API [Tutorial]
This article is an excerpt from a book written by Srini Janarthanam titled Hands-On Chatbots and Conversational UI Development. In this article, we will explore the Twitter API and build core modules for tweeting, searching, and retweeting. We will further explore a data source for news around the globe and build a simple bot that tweets top news on its timeline.

Getting started with the Twitter app

To get started, let us explore the Twitter developer platform. Let us begin by building a Twitter app and later explore how we can tweet news articles to followers based on their interests:

1. Log on to Twitter. If you don't have an account on Twitter, create one.
2. Go to Twitter Apps, which is Twitter's application management dashboard.
3. Click the Create New App button.
4. Create an application by filling in the form, providing a name, a description, and a website (fully-qualified URL).
5. Read and agree to the Developer Agreement and hit Create your Twitter application.
6. You will now see your application dashboard. Explore the tabs.
7. Click Keys and Access Tokens.
8. Copy the consumer key and consumer secret and hang on to them.
9. Scroll down to Your Access Token.
10. Click Create my access token to create a new token for your app.
11. Copy the Access Token and Access Token Secret and hang on to them.

Now, we have all the keys and tokens we need to create a Twitter app.

Building your first Twitter bot

Let's build a simple Twitter bot. This bot will listen to tweets and pick out those that have a particular hashtag. All the tweets with a given hashtag will be printed on the console. This is a very simple bot to help us get started. In the following sections, we will explore more complex bots. To follow along, you can download the code from the book's GitHub repository.

1. Go to the root directory and create a new Node.js program using npm init.
2. Execute the npm install twitter --save command to install the Twitter Node.js library.
3. Run npm install request --save to install the Request library as well.
We will use the Request library later to make HTTP GET requests to a news data source.

4. Explore your package.json file in the root directory:

```json
{
  "name": "twitterbot",
  "version": "1.0.0",
  "description": "my news bot",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "request": "^2.81.0",
    "twitter": "^1.7.1"
  }
}
```

5. Create an index.js file with the following code:

```js
//index.js
var TwitterPackage = require('twitter');
var request = require('request');

console.log("Hello World! I am a twitter bot!");

var secret = {
  consumer_key: 'YOUR_CONSUMER_KEY',
  consumer_secret: 'YOUR_CONSUMER_SECRET',
  access_token_key: 'YOUR_ACCESS_TOKEN_KEY',
  access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET'
}
var Twitter = new TwitterPackage(secret);
```

In the preceding code, put the keys and tokens you saved into their appropriate variables. We don't need the request package just yet, but we will later.

6. Now let's create a hashtag listener to listen to the tweets on a specific hashtag:

```js
//Twitter stream
var hashtag = '#brexit'; //put any hashtag to listen to, e.g. #brexit
console.log('Listening to: ' + hashtag);
Twitter.stream('statuses/filter', {track: hashtag}, function(stream) {
  stream.on('data', function(tweet) {
    console.log('Tweet: @' + tweet.user.screen_name + '\t' + tweet.text);
    console.log('------');
  });
  stream.on('error', function(error) {
    console.log(error);
  });
});
```

Replace #brexit with the hashtag you want to listen to. Use a popular one so that you can see the code in action.

7. Run the index.js file with the node index.js command. You will see a stream of tweets from Twitter users all over the globe who used the hashtag.

Congratulations! You have built your first Twitter bot.

Exploring the Twitter SDK

In the previous section, we explored how to listen to tweets based on hashtags. Let's now explore the Twitter SDK to understand the capabilities we can bestow upon our Twitter bot.
Updating your status

You can update the status on your Twitter timeline by using the following status update module code:

```js
tweet('I am a Twitter Bot!', null, null);

function tweet(statusMsg, screen_name, status_id) {
  console.log('Sending tweet to: ' + screen_name);
  console.log('In response to: ' + status_id);
  var msg = statusMsg;
  if (screen_name != null) {
    msg = '@' + screen_name + ' ' + statusMsg;
  }
  console.log('Tweet: ' + msg);
  Twitter.post('statuses/update', { status: msg }, function(err, response) {
    // if there was an error while tweeting
    if (err) {
      console.log('Something went wrong while TWEETING...');
      console.log(err);
    } else if (response) {
      console.log('Tweeted!!!');
      console.log(response);
    }
  });
}
```

Comment out the hashtag listener code, add the preceding status update code instead, and run it. When run, your bot will post a tweet on your timeline.

In addition to tweeting on your timeline, you can also tweet in response to another tweet (or status update). The screen_name argument is used to create a response tweet; screen_name is the name of the user who posted the tweet being responded to. We will explore this a bit later.
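The message-composition logic inside tweet() can be factored into a small pure helper, which also guards against Twitter's 280-character status limit. The following is a sketch of our own, not part of the book's code; the composeTweet name and the truncation behavior are our additions:

```javascript
// Hypothetical helper (not from the book): compose a status message the same
// way tweet() does, and trim it so statuses/update doesn't reject it for
// exceeding Twitter's 280-character limit.
function composeTweet(statusMsg, screen_name) {
  var msg = statusMsg;
  if (screen_name != null) {
    // Prefix the target user's handle to make the tweet a reply/mention
    msg = '@' + screen_name + ' ' + statusMsg;
  }
  // Truncate over-long messages, keeping room for an ellipsis character
  if (msg.length > 280) {
    msg = msg.substring(0, 279) + '…';
  }
  return msg;
}

console.log(composeTweet('I am a Twitter Bot!', null)); // I am a Twitter Bot!
console.log(composeTweet('Hello!', 'jsmith'));          // @jsmith Hello!
```

A helper like this would be called just before Twitter.post('statuses/update', ...), keeping the network call and the string handling separate and testable.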
Retweeting to your followers

You can retweet a tweet to your followers using the following retweet status code:

```js
var retweetId = '899681279343570944';
retweet(retweetId);

function retweet(retweetId) {
  Twitter.post('statuses/retweet/', { id: retweetId }, function(err, response) {
    if (err) {
      console.log('Something went wrong while RETWEETING...');
      console.log(err);
    } else if (response) {
      console.log('Retweeted!!!');
      console.log(response);
    }
  });
}
```

Searching for tweets

You can also search for recent or popular tweets with hashtags using the following search code:

```js
search('#brexit', 'popular');

function search(hashtag, resultType) {
  var params = {
    q: hashtag, // REQUIRED
    result_type: resultType,
    lang: 'en'
  };
  Twitter.get('search/tweets', params, function(err, data) {
    if (!err) {
      console.log('Found tweets: ' + data.statuses.length);
      console.log('First one: ' + data.statuses[0].text);
    } else {
      console.log('Something went wrong while SEARCHING...');
    }
  });
}
```

Exploring a news data service

Let's now build a bot that will tweet news articles to its followers at regular intervals. We will then extend it to be personalized by its users through a conversation that happens over direct messaging with the bot. In order to build a news bot, we need a source from which we can get news articles. We are going to explore a news service called NewsAPI.org in this section. News API is a service that aggregates news articles from roughly 70 newspapers around the globe.

Setting up News API

Let us set up an account with the News API data service and get the API key:

1. Go to NewsAPI.org.
2. Click Get API key.
3. Register using your email.
4. Get your API key.
5. Explore the sources: https://newsapi.org/v1/sources?apiKey=YOUR_API_KEY.

There are about 70 sources from across the globe, including popular ones such as BBC News, Associated Press, Bloomberg, and CNN. You might notice that each source has a category tag attached.
The possible category options are: business, entertainment, gaming, general, music, politics, science-and-nature, sport, and technology. You might also notice that each source has language (en, de, fr) and country (au, de, gb, in, it, us) tags. The following is the information on the bbc-news source:

```json
{
  "id": "bbc-news",
  "name": "BBC News",
  "description": "Use BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provides trusted World and UK news as well as local and regional perspectives. Also entertainment, business, science, technology and health news.",
  "url": "http://www.bbc.co.uk/news",
  "category": "general",
  "language": "en",
  "country": "gb",
  "urlsToLogos": {
    "small": "",
    "medium": "",
    "large": ""
  },
  "sortBysAvailable": [
    "top"
  ]
}
```

You can get sources for a specific category, language, or country using:

https://newsapi.org/v1/sources?category=business&apiKey=YOUR_API_KEY

The following is part of the response to the preceding query asking for all sources under the business category:

```json
"sources": [
  {
    "id": "bloomberg",
    "name": "Bloomberg",
    "description": "Bloomberg delivers business and markets news, data, analysis, and video to the world, featuring stories from Businessweek and Bloomberg News.",
    "url": "http://www.bloomberg.com",
    "category": "business",
    "language": "en",
    "country": "us",
    "urlsToLogos": {
      "small": "",
      "medium": "",
      "large": ""
    },
    "sortBysAvailable": [
      "top"
    ]
  },
  {
    "id": "business-insider",
    "name": "Business Insider",
    "description": "Business Insider is a fast-growing business site with deep financial, media, tech, and other industry verticals. Launched in 2007, the site is now the largest business news site on the web.",
    "url": "http://www.businessinsider.com",
    "category": "business",
    "language": "en",
    "country": "us",
    "urlsToLogos": {
      "small": "",
      "medium": "",
      "large": ""
    },
    "sortBysAvailable": [
      "top",
      "latest"
    ]
  },
  ...
```
]

Explore the articles:

https://newsapi.org/v1/articles?source=bbc-news&apiKey=YOUR_API_KEY

The following is a sample response:

```json
"articles": [
  {
    "author": "BBC News",
    "title": "US Navy collision: Remains found in hunt for missing sailors",
    "description": "Ten US sailors have been missing since Monday's collision with a tanker near Singapore.",
    "url": "http://www.bbc.co.uk/news/world-us-canada-41013686",
    "urlToImage": "https://ichef1.bbci.co.uk/news/1024/cpsprodpb/80D9/production/_97458923_mediaitem97458918.jpg",
    "publishedAt": "2017-08-22T12:23:56Z"
  },
  {
    "author": "BBC News",
    "title": "Afghanistan hails Trump support in 'joint struggle'",
    "description": "President Ghani thanks Donald Trump for supporting Afghanistan's battle against the Taliban.",
    "url": "http://www.bbc.co.uk/news/world-asia-41012617",
    "urlToImage": "https://ichef.bbci.co.uk/images/ic/1024x576/p05d08pf.jpg",
    "publishedAt": "2017-08-22T11:45:49Z"
  },
  ...
]
```

For each article, the author, title, description, url, urlToImage, and publishedAt fields are provided. Now that we have explored a source of news data that provides up-to-date news stories under various categories, let us go on to build a news bot.

Building a Twitter news bot

Now that we have explored News API, a data source for the latest news updates, and a little bit of what the Twitter API can do, let us combine them both to build a bot that tweets interesting news stories, first on its own timeline and then specifically to each of its followers. Let's build a news tweeter module that tweets the top news article from a given source.
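Because each article carries a publishedAt ISO-8601 timestamp, you could, for instance, pick the most recently published story rather than always taking the first element of the array. The following is an illustrative sketch of our own (the newestArticle helper is not part of the book's code):

```javascript
// Hypothetical helper (not from the book): given the parsed "articles" array
// from News API, return the article with the latest publishedAt timestamp.
function newestArticle(articles) {
  return articles.reduce(function (latest, article) {
    // Date.parse handles the ISO-8601 strings News API returns
    return Date.parse(article.publishedAt) > Date.parse(latest.publishedAt)
      ? article
      : latest;
  });
}

var articles = [
  { title: 'Afghanistan hails Trump support', publishedAt: '2017-08-22T11:45:49Z' },
  { title: 'US Navy collision: Remains found', publishedAt: '2017-08-22T12:23:56Z' }
];
console.log(newestArticle(articles).title); // US Navy collision: Remains found
```

A helper like this could replace the articles[0] selection in the tweeting module below, if you want freshness rather than the API's default ordering.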
The following code uses the tweet() function we built earlier:

```js
topNewsTweeter('cnn', null);

function topNewsTweeter(newsSource, screen_name, status_id) {
  request({
    url: 'https://newsapi.org/v1/articles?source=' + newsSource + '&apiKey=YOUR_API_KEY',
    method: 'GET'
  }, function (error, response, body) {
    //response is from the news data source
    if (!error && response.statusCode == 200) {
      var botResponse = JSON.parse(body);
      console.log(botResponse);
      tweetTopArticle(botResponse.articles, screen_name);
    } else {
      console.log('Sorry. No news.');
    }
  });
}

function tweetTopArticle(articles, screen_name, status_id) {
  var article = articles[0];
  tweet(article.title + " " + article.url, screen_name);
}
```

Run the preceding program to fetch news from CNN and post the top article on Twitter.

Now, let us build a module that tweets news stories from a randomly-chosen source in a list of sources. Note that Math.random() returns a value in [0, 1), so the index must be computed with Math.floor(Math.random() * max), not max + 1, to stay within the bounds of the array:

```js
function tweetFromRandomSource(sources, screen_name, status_id) {
  var max = sources.length;
  var randomSource = sources[Math.floor(Math.random() * max)];
  topNewsTweeter(randomSource, screen_name, status_id);
}
```

Let's call the tweeting module after we acquire the list of sources:

```js
function getAllSourcesAndTweet() {
  var sources = [];
  console.log('getting sources...');
  request({
    url: 'https://newsapi.org/v1/sources?apiKey=YOUR_API_KEY',
    method: 'GET'
  }, function (error, response, body) {
    //response is from the news data source
    if (!error && response.statusCode == 200) {
      // Print out the response body
      var botResponse = JSON.parse(body);
      for (var i = 0; i < botResponse.sources.length; i++) {
        console.log('adding.. ' + botResponse.sources[i].id);
        sources.push(botResponse.sources[i].id);
      }
      tweetFromRandomSource(sources, null, null);
    } else {
      console.log('Sorry. No news sources!');
    }
  });
}
```

Finally, let's create a new JS file called tweeter.js.
In the tweeter.js file, call getAllSourcesAndTweet() to get the process started:

```js
//tweeter.js
var TwitterPackage = require('twitter');
var request = require('request');

console.log("Hello World! I am a twitter bot!");

var secret = {
  consumer_key: 'YOUR_CONSUMER_KEY',
  consumer_secret: 'YOUR_CONSUMER_SECRET',
  access_token_key: 'YOUR_ACCESS_TOKEN_KEY',
  access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET'
}
var Twitter = new TwitterPackage(secret);

getAllSourcesAndTweet();
```

Run the tweeter.js file on the console with node tweeter.js. The bot will tweet a news story every time it is run, choosing a top story at random from around 70 news sources.

Hurray! You have built your very own Twitter news bot. In this tutorial, we covered a lot: we started with the Twitter API and got a taste of how to automatically tweet, retweet, and search for tweets using hashtags. We then explored a news source API that provides news articles from about 70 different newspapers, and integrated it with our Twitter bot to create a news-tweeting bot.

If you found this post useful, do check out the book, Hands-On Chatbots and Conversational UI Development, which will help you explore the world of conversational user interfaces.

Build and train an RNN chatbot using TensorFlow [Tutorial]
Building a two-way interactive chatbot with Twilio: A step-by-step guide
How to create a conversational assistant or chatbot using Python

Classifying flowers in Iris Dataset using Scala [Tutorial]

Savia Lobo
06 Sep 2018
15 min read
The Iris dataset underlies one of the simplest, yet most famous, data analysis tasks in the ML space. In this article, you will build a solution for a data analysis and classification task on the Iris dataset using Scala. This article is an excerpt taken from Modern Scala Projects written by Ilango Gurusamy.

The following diagrams together help in understanding the different components of this project. The pipeline involves training (fitting), transformation, and validation operations. More than one model is trained, and the best model (or mapping function) is selected to give us an accurate approximation for predicting the species of an Iris flower (based on measurements of those flowers):

Project block diagram

A breakdown of the project block diagram is as follows:

- Spark, which represents the Spark cluster and its ecosystem
- Training dataset
- Model
- Dataset attributes or feature measurements
- An inference process that produces a prediction column

The following diagram gives a more detailed description of the different phases in terms of the functions performed in each phase. Later, we will visualize the pipeline in terms of its constituent stages. For now, the diagram depicts four stages, starting with a data pre-processing phase, which is deliberately considered separate from the numbered phases. Think of the pipeline as a two-step process:

1. A data cleansing, or pre-processing, phase. This important phase could include a subphase of Exploratory Data Analysis (EDA) (not explicitly depicted in the diagram).
2. A data analysis phase that begins with feature extraction, followed by model fitting and model validation, all the way to deployment of an Uber pipeline JAR into Spark:

Pipeline diagram

Referring to the preceding diagram, the first implementation objective is to set up Spark inside an SBT project. An SBT project is a self-contained application, which we can run on the command line to predict Iris labels.
In the SBT project, dependencies are specified in a build.sbt file, and our application code will create its own SparkSession and SparkContext. That brings us to a list of implementation objectives, which are as follows:

1. Get the Iris dataset from the UCI Machine Learning Repository.
2. Conduct preliminary EDA in the Spark shell.
3. Create a new Scala project in IntelliJ, and carry out all implementation steps, up to the evaluation of the Random Forest classifier.
4. Deploy the application to your local Spark cluster.

Step 1 - Getting the Iris dataset from the UCI Machine Learning Repository

Head over to the UCI Machine Learning Repository website at https://archive.ics.uci.edu/ml/datasets/iris and click on Download: Data Folder. Extract this folder someplace convenient and copy iris.csv into the root of your project folder. You may refer back to the project overview for an in-depth description of the Iris dataset. The contents of the iris.csv file are depicted here:

A snapshot of the Iris dataset with 150 rows

You may recall that iris.csv is a 150-row file with comma-separated values. Now that we have the dataset, the first step is performing EDA on it. The Iris dataset is multivariate, meaning there is more than one (independent) variable, so we will carry out a basic multivariate EDA on it. But we need a DataFrame to let us do that; how we create a DataFrame as a prelude to EDA is the goal of the next section.

Step 2 - Preliminary EDA

Before we get down to building the SBT pipeline project, we will conduct a preliminary EDA in spark-shell. The plan is to derive a DataFrame from the dataset and then calculate basic statistics on it. We have three tasks at hand for spark-shell:

1. Fire up spark-shell.
2. Load the iris.csv file and build a DataFrame.
3. Calculate the statistics.

We will then port that code over to a Scala file inside our SBT project.
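The three spark-shell tasks above can be sketched as follows. This is an illustration of our own rather than the book's exact session; the species column name is an assumption about the CSV header, and spark is the SparkSession that spark-shell provides:

```scala
// spark-shell EDA sketch (our own illustration, not the book's code).
// Load iris.csv into a DataFrame, inferring column types from the data.
val irisDf = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("iris.csv")

// Basic multivariate statistics: count, mean, stddev, min, max per column
irisDf.describe().show()

// Class balance across the three species (assumes a 'species' header)
irisDf.groupBy("species").count().show()
```

describe() is the quickest way to get per-column summary statistics for a multivariate EDA; the groupBy count confirms the dataset's 50/50/50 class balance before any modeling is attempted.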
That said, let's get down to loading the iris.csv file (inputting the data source) before eventually building the DataFrame.

Step 3 - Creating an SBT project

Lay out your SBT project in a folder of your choice and name it IrisPipeline, or any name that makes sense to you. This folder will hold all of the files needed to implement and run the pipeline on the Iris dataset. The structure of our SBT project looks like the following:

Project structure

We will list dependencies in the build.sbt file. As this is an SBT project, we bring in the following key libraries:

- Spark Core
- Spark MLlib
- Spark SQL

The following screenshot illustrates the build.sbt file:

The build.sbt file with Spark dependencies

The build.sbt file referenced in the preceding snapshot is readily available for you in the book's download bundle. Drill down to the Chapter01 code folder under ModernScalaProjects_Code and copy the folder over to a convenient location on your computer. Drop the iris.csv file that you downloaded in Step 1 into the root folder of our new SBT project. Refer to the earlier screenshot depicting the updated project structure with the iris.csv file inside it.

Step 4 - Creating Scala files in the SBT project

Step 4 is broken down into the following steps:

1. Create the Scala file iris.scala in the com.packt.modern.chapter1 package. Up until now, we relied on the SparkSession and SparkContext that spark-shell gave us. This time around, we need to create our own SparkSession, which will, in turn, give us a SparkContext.

2. In iris.scala, after the package statement, place the following import statement:

```scala
import org.apache.spark.sql.SparkSession
```

3. Create the SparkSession inside a trait, which we shall call IrisWrapper:

```scala
lazy val session: SparkSession = SparkSession.builder().getOrCreate()
```

Just one SparkSession is made available to all classes extending from IrisWrapper.
4. Create a val to hold the iris.csv file path:

```scala
val dataSetPath = "<<path to folder containing your iris.csv file>>\\iris.csv"
```

5. Create a method to build a DataFrame. This method takes the complete path to the Iris dataset as a String and returns a DataFrame:

```scala
def buildDataFrame(dataSet: String): DataFrame = {
  /*
  The following is an example of a dataSet parameter string:
  "C:\\Your\\Path\\To\\iris.csv"
  */
```

6. Import the DataFrame class by updating the previous import statement for SparkSession:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
```

7. Create a nested function inside buildDataFrame to process the raw dataset. Name this function getRows; it takes no parameters but returns Array[(Vector, String)]. The textFile method on the SparkContext processes iris.csv into an RDD[String]:

```scala
val result1: RDD[String] = session.sparkContext.textFile(<<path to iris.csv represented by the dataSetPath variable>>)
```

The resulting RDD contains two partitions. Each partition, in turn, contains rows of strings separated by the newline character '\n'. Each row in the RDD represents its original counterpart in the raw data.

In the next step, we will attempt several data transformation steps, starting by applying a flatMap operation over the RDD and culminating in the creation of the DataFrame. A DataFrame is a view over a Dataset, which happens to be the fundamental data abstraction unit in the Spark 2.0 line.

Step 5 - Preprocessing, data transformation, and DataFrame creation

We get started by invoking flatMap, passing a function block to it, and applying the successive transformations listed as follows, eventually resulting in Array[(org.apache.spark.ml.linalg.Vector, String)]. A Vector represents a row of feature measurements.
The Scala code to give us Array[(org.apache.spark.ml.linalg.Vector, String)] is as follows:

```scala
//Each line in the RDD is a row in the dataset represented by a String,
//which we can 'split' along the newline character
val result2: RDD[String] = result1.flatMap { partition =>
  partition.split("\n").toList
}

//The second transformation splits each line of the dataset along the commas
//separating its elements
val result3: RDD[Array[String]] = result2.map(_.split(","))
```

Next, drop the header column, but not before doing a collect, which returns an Array[Array[String]]:

```scala
val result4: Array[Array[String]] = result3.collect.drop(1)
```

The header column is gone; now import the Vectors class:

```scala
import org.apache.spark.ml.linalg.Vectors
```

Now, transform the Array[Array[String]] into an Array[(Vector, String)]:

```scala
val result5 = result4.map(row => (Vectors.dense(row(1).toDouble, row(2).toDouble, row(3).toDouble, row(4).toDouble), row(5)))
```

Step 6 - Creating training and testing data

Now, let's split our dataset in two by providing a random seed:

```scala
val splitDataSet: Array[org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]] =
  dataSet.randomSplit(Array(0.85, 0.15), 98765L)
```

Our new splitDataSet contains two datasets:

- The train dataset: a dataset containing Array[(Vector, iris-species-label-column: String)]
- The test dataset: a dataset containing Array[(Vector, iris-species-label-column: String)]

Confirm that the new dataset is of size 2:

```scala
splitDataSet.size
// res48: Int = 2
```

Assign the training dataset to a variable, trainDataSet:

```scala
val trainDataSet = splitDataSet(0)
// trainDataSet: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iris-features-column: vector, iris-species-label-column: string]
```

Assign the testing dataset to a variable, testDataSet:

```scala
val testDataSet = splitDataSet(1)
// testDataSet: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iris-features-column: vector, iris-species-label-column: string]
```

Count the number of rows in the training dataset:
```scala
trainDataSet.count
// res12: Long = 136
```

Count the number of rows in the testing dataset:

```scala
testDataSet.count
// res9: Long = 14
```

There are 150 rows in all; with the 0.85/0.15 split, the training set gets the larger share.

Step 7 - Creating a Random Forest classifier

Refer back to Step 5 - DataFrame creation: the DataFrame produced there contains the column names referenced below. The first step in creating a classifier is to pass (hyper)parameters into it. A fairly comprehensive list of parameters looks like this:

- From the DataFrame, the features column name: iris-features-column
- From the DataFrame, the indexed label column name: iris-species-label-column
- The sqrt setting for featureSubsetStrategy
- The number of features to be considered per split (we have 150 observations and four features, which makes our max_features value 2)
- Impurity settings, whose values can be gini and entropy
- The number of trees to train (since the number of trees is greater than one, we set a maximum tree depth), a number equal to the number of nodes
- The required minimum number of feature measurements (sampled observations), also known as the minimum instances per node

Look at the IrisPipeline.scala file for the values of each of these parameters. This time, we will employ an exhaustive grid-search-based model selection process based on combinations of parameters, where parameter value ranges are specified.

Create a RandomForestClassifier instance and set the features column and featureSubsetStrategy:

```scala
val randomForestClassifier = new RandomForestClassifier()
  .setFeaturesCol(irisFeatures_CategoryOrSpecies_IndexedLabel._1)
  .setFeatureSubsetStrategy("sqrt")
```

Start building the Pipeline, which has two stages, an indexer and a classifier:

```scala
val irisPipeline = new Pipeline().setStages(Array[PipelineStage](indexer) ++ Array[PipelineStage](randomForestClassifier))
```

Next, set the hyperparameters: num_trees (the number of trees) on the classifier to 15, a Max_Depth parameter, and an impurity with the two possible values of gini and entropy.
Build a parameter grid with all three hyperparameters:

```scala
val finalParamGrid: Array[ParamMap] = gridBuilder3.build()
```

Step 8 - Training the Random Forest classifier

Next, we want to split our training set into a validation set and a training set, using a TrainValidationSplit estimator:

```scala
val trainValidationSplit = new TrainValidationSplit()
```

On this estimator, set the seed, set the estimator param maps, set the estimator to irisPipeline, and set the training ratio to 0.8:

```scala
val trainValidationSplit = new TrainValidationSplit()
  .setSeed(1234567L)
  .setEstimator(irisPipeline)
```

Finally, do a fit with our training dataset and a transform with our testing dataset. Great! Now the classifier is trained. In the next step, we will apply this classifier to the testing data.

Step 9 - Applying the Random Forest classifier to test data

The purpose of our validation set is to let us make a choice between models; we want an evaluation metric and hyperparameter tuning. We complete the validation estimator, TrainValidationSplit, which splits the training set into a validation set and a training set, by setting an evaluator on it:

```scala
trainValidationSplit.setEvaluator(new MulticlassClassificationEvaluator())
```

Next, we fit this estimator over the training dataset to produce a model, a transformer that we then use to transform our testing dataset. Finally, we perform a validation for hyperparameter tuning by applying an evaluator for a metric.
The new validatedTestResults DataFrame should look something like this:

```
+--------------------+-------------------+-----+--------------+-------------+----------+
|iris-features-column|iris-species-column|label| rawPrediction|  probability|prediction|
+--------------------+-------------------+-----+--------------+-------------+----------+
|   [4.4,3.2,1.3,0.2]|        Iris-setosa|  0.0|[40.0,0.0,0.0]|[1.0,0.0,0.0]|       0.0|
|   [5.4,3.9,1.3,0.4]|        Iris-setosa|  0.0|[40.0,0.0,0.0]|[1.0,0.0,0.0]|       0.0|
|   [5.4,3.9,1.7,0.4]|        Iris-setosa|  0.0|[40.0,0.0,0.0]|[1.0,0.0,0.0]|       0.0|
+--------------------+-------------------+-----+--------------+-------------+----------+
```

Let's return a new dataset by passing in column expressions for prediction and label:

```scala
val validatedTestResultsDataset: DataFrame = validatedTestResults.select("prediction", "label")
```

In this line of code, we produced a new DataFrame with two columns:

- An input label column
- A predicted label column, which is compared with its corresponding value in the input label column

That brings us to the next step, an evaluation step. We want to know how well our model performed; that is the goal of the next step.

Step 10 - Evaluating the Random Forest classifier

In this section, we will test the accuracy of the model. Any ML process is incomplete without an evaluation of the classifier. That said, we perform the evaluation as a two-step process:

1. Evaluate the model output by passing in the hyperparameters:

```scala
val modelOutputAccuracy: Double = new MulticlassClassificationEvaluator()
```

Set the label column, a metric name, and the prediction column label, then invoke evaluation with the validatedTestResults dataset. Note the accuracy of the model output on the testing dataset from the modelOutputAccuracy variable.

2. The other metric to evaluate is how close the predicted label value in the prediction column is to the actual label value in the (indexed) label column. Next, we extract the metrics:

```scala
val multiClassMetrics = new MulticlassMetrics(validatedRDD2)
```

Our pipeline produced predictions, and as with any prediction, we need to have a healthy degree of skepticism.
Naturally, we want a sense of how our engineered prediction process performed; the algorithm did all the heavy lifting for us in this regard. Everything we did in this step was done for the purpose of evaluation: we wanted to know how close the predicted values were to the actual label values. To obtain that knowledge, we use the MulticlassMetrics class to evaluate metrics that give us a measure of the performance of the model via two methods:

- Accuracy
- Weighted precision

```scala
val accuracyMetrics = (multiClassMetrics.accuracy, multiClassMetrics.weightedPrecision)
val accuracy = accuracyMetrics._1
val weightedPrecision = accuracyMetrics._2
```

These metrics represent the evaluation results for our classifier, or classification model. In the next step, we will run the application as a packaged SBT application.

Step 11 - Running the pipeline as an SBT application

At the root of your project folder, issue the sbt console command, and in the Scala shell, import the IrisPipeline object and then invoke the main method of IrisPipeline with the argument iris:

```
sbt console
scala> import com.packt.modern.chapter1.IrisPipeline
scala> IrisPipeline.main(Array("iris"))
Accuracy (precision) is 0.9285714285714286
Weighted Precision is: 0.9428571428571428
```

Step 12 - Packaging the application

In the root folder of your SBT application, run:

```
sbt package
```

When SBT is done packaging, the Uber JAR can be deployed into our cluster using spark-submit; but since we are in standalone deploy mode, it will be deployed into [local]:

The application JAR file

The package command created a JAR file that is available under the target folder. In the next section, we will deploy the application into Spark.

Step 13 - Submitting the pipeline application to Spark local

At the root of the application folder, issue the spark-submit command with the class and JAR file path arguments, respectively.
If everything went well, the application does the following:

1. Loads up the data.
2. Performs EDA.
3. Creates the training, testing, and validation datasets.
4. Creates a Random Forest classifier model.
5. Trains the model.
6. Tests the accuracy of the model. This is the most important part: the ML classification task. To accomplish this, we apply our trained Random Forest classifier model to the test dataset, which consists of Iris flower data so far not seen by the model. Unseen data is nothing but Iris flowers picked in the wild. Applying the model to the test dataset results in a prediction about the species of an unseen (new) flower.
7. Runs an evaluation process, which essentially checks whether the model reports the correct species.
8. Lastly, reports back on how important a certain feature of the Iris flower turned out to be. As a matter of fact, the petal width turns out to be more important than the sepal width in carrying out the classification task.

Thus, we implemented an ML workflow, or ML pipeline. The pipeline combined several stages of data analysis into one workflow. We started by loading the data; from there on, we created training and test data, preprocessed the dataset, trained the RandomForestClassifier model, applied the classifier to the test data, evaluated the classifier, and computed a process that demonstrated the importance of each feature in the classification.

If you've enjoyed reading this post, visit the book, Modern Scala Projects, to build efficient data science projects that fulfill your software requirements.

Deep Learning Algorithms: How to classify Irises using multi-layer perceptrons
Introducing Android 9 Pie, filled with machine learning and baked-in UI features
Paper in Two minutes: A novel method for resource efficient image classification

Intelligent mobile projects with TensorFlow: Build your first Reinforcement Learning model on Raspberry Pi [Tutorial]

Bhagyashree R
05 Sep 2018
13 min read
OpenAI Gym (https://gym.openai.com) is an open source Python toolkit that offers many simulated environments to help you develop, compare, and train reinforcement learning algorithms, so you don't have to buy all the sensors and train your robot in the real environment, which can be costly in both time and money. In this article, we'll show you how to develop and train a reinforcement learning model on Raspberry Pi using TensorFlow in an OpenAI Gym simulated environment called CartPole (https://gym.openai.com/envs/CartPole-v0). This tutorial is an excerpt from a book written by Jeff Tang titled Intelligent Mobile Projects with TensorFlow.

To install OpenAI Gym, run the following commands:

```
git clone https://github.com/openai/gym.git
cd gym
sudo pip install -e .
```

You can verify that you have TensorFlow 1.6 and gym installed by running pip list:

```
pi@raspberrypi:~ $ pip list
gym (0.10.4, /home/pi/gym)
tensorflow (1.6.0)
```

Or you can start IPython and then import TensorFlow and gym:

```
pi@raspberrypi:~ $ ipython
Python 2.7.9 (default, Sep 17 2016, 20:26:04)
IPython 5.5.0 -- An enhanced Interactive Python.

In [1]: import tensorflow as tf
In [2]: import gym
In [3]: tf.__version__
Out[3]: '1.6.0'
In [4]: gym.__version__
Out[4]: '0.10.4'
```

We're now all set to use TensorFlow and gym to build some interesting reinforcement learning models running on Raspberry Pi.

Understanding the CartPole simulated environment

CartPole is an environment that can be used to train a robot to stay in balance. In the CartPole environment, a pole is attached to a cart, which moves horizontally along a track. You can take an action of 1 (accelerating right) or 0 (accelerating left) on the cart. The pole starts upright, and the goal is to prevent it from falling over. A reward of 1 is provided for every time step that the pole remains upright. An episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center.
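The termination rule just described can be expressed as a tiny predicate. This is a sketch of our own based on the thresholds quoted above; gym's actual implementation uses its own internal constants and radians rather than degrees:

```python
# Sketch of the episode-termination rule described in the text (our own
# illustration, not gym's source). An episode ends when the pole leans more
# than 15 degrees from vertical or the cart drifts more than 2.4 units
# from the center of the track.
def episode_done(cart_position, pole_angle_degrees):
    return abs(pole_angle_degrees) > 15.0 or abs(cart_position) > 2.4

print(episode_done(0.0, 3.0))    # False: pole nearly upright, cart centered
print(episode_done(0.0, 16.5))   # True: pole has fallen too far
print(episode_done(-2.5, 1.0))   # True: cart has run off the allowed track
```

Keeping this rule in mind helps interpret the done flag returned by env.step() in the experiments that follow.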
Let's play with the CartPole environment now. First, create a new environment and find out the possible actions an agent can take in the environment: env = gym.make("CartPole-v0") env.action_space # Discrete(2) env.action_space.sample() # 0 or 1 Every observation (state) consists of four values about the cart: its horizontal position, its velocity, its pole's angle, and its angular velocity: obs=env.reset() obs # array([ 0.04052535, 0.00829587, -0.03525301, -0.00400378]) Each step (action) in the environment will result in a new observation, a reward of the action, whether the episode is done (if it is then you can't take any further steps), and some additional information: obs, reward, done, info = env.step(1) obs # array([ 0.04069127, 0.2039052 , -0.03533309, -0.30759772]) Remember action (or step) 1 means moving right, and 0 left. To see how long an episode can last when you keep moving the cart right, run: while not done: obs, reward, done, info = env.step(1) print(obs) #[ 0.08048328 0.98696604 -0.09655727 -1.54009127] #[ 0.1002226 1.18310769 -0.12735909 -1.86127705] #[ 0.12388476 1.37937549 -0.16458463 -2.19063676] #[ 0.15147227 1.5756628 -0.20839737 -2.52925864] #[ 0.18298552 1.77178219 -0.25898254 -2.87789912] Let's now manually go through a series of actions from start to end and print out the observation's first value (the horizontal position) and third value (the pole's angle in degrees from vertical) as they're the two values that determine whether an episode is done. 
First, reset the environment and accelerate the cart right a few times: import numpy as np obs=env.reset() obs[0], obs[2]*360/np.pi # (0.008710582898326602, 1.4858315848689436) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.009525842685697472, 1.5936049816642313) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.014239775393474322, 1.040038643681757) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.0228521194217381, -0.17418034908781568) You can see that the cart's position value gets bigger and bigger as it's moved right, the pole's vertical degree gets smaller and smaller, and the last step shows a negative degree, meaning the pole is going to the left side of the center. All this makes sense, with just a little vivid picture in your mind of your favorite dog pushing a cart with a pole. Now change the action to accelerate the cart left (0) a few times: obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.03536432554326476, -2.0525933052704954) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04397450935915654, -3.261322987287562) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04868738508385764, -3.812330822419413) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04950617929263011, -3.7134404042580687) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04643238384389254, -2.968245724428785) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.039465670006712444, -1.5760901885345346) You may be surprised at first to see the 0 action causes the positions (obs[0]) to continue to get bigger for several times, but remember that the cart is moving at a velocity and one or several actions of moving the cart to the other direction won't decrease the position value immediately. But if you keep moving the cart to the left, you'll see that the cart's position starts becoming smaller (toward the left). 
Now continue the 0 action and you'll see the position gets smaller and smaller, with a negative value meaning the cart enters the left side of the center, while the pole's angle gets bigger and bigger:

obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (0.028603948219811447, 0.46789197320636305)
obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (0.013843572459953138, 3.1726728882727504)
obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (-0.00482029774222077, 6.551160678086707)
obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (-0.02739315127299434, 10.619948631208114)

For the CartPole environment, the reward value returned in each step call is always 1, and the info is always {}. So that's all there is to know about the CartPole simulated environment. Now that we understand how CartPole works, let's see what kinds of policies we can come up with so that at each state (observation), we can let the policy tell us which action (step) to take in order to keep the pole upright for as long as possible; in other words, to maximize our rewards.

Using neural networks to build a better policy

Let's first see how to build a random policy using a simple fully connected (dense) neural network, which takes the 4 values in an observation as input, uses a hidden layer of 4 neurons, and outputs the probability of the 0 action, based on which the agent can sample the next action between 0 and 1. To follow along, you can download the code files from the book's GitHub repository.
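For comparison, the "basic simple policy" that the training code further below uses as its target is a hand-written rule that is not shown in this excerpt; a common version simply pushes the cart toward the side the pole is leaning. A minimal sketch in plain Python (the function name is ours):

```python
def basic_policy(obs):
    # obs[2] is the pole angle: accelerate left (action 0) when the
    # pole leans left, right (action 1) when it leans right.
    return 0 if obs[2] < 0 else 1

print(basic_policy([0.04, 0.0, -0.03, 0.0]))  # 0 (pole leaning left)
print(basic_policy([0.04, 0.0, 0.02, 0.0]))   # 1 (pole leaning right)
```

Keeping the cart under the pole this way typically survives a few dozen steps, which gives a baseline for the neural network policies that follow.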
# nn_random_policy.py
import tensorflow as tf
import numpy as np
import gym

env = gym.make("CartPole-v0")
num_inputs = env.observation_space.shape[0]

inputs = tf.placeholder(tf.float32, shape=[None, num_inputs])
hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu)
outputs = tf.layers.dense(hidden, 1, activation=tf.nn.sigmoid)
action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    total_rewards = []
    for _ in range(1000):
        rewards = 0
        obs = env.reset()
        while True:
            a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)})
            obs, reward, done, info = env.step(a[0][0])
            rewards += reward
            if done:
                break
        total_rewards.append(rewards)
    print(np.mean(total_rewards))

Note that we use the tf.multinomial function to sample an action based on the probability distribution of actions 0 and 1, defined as outputs and 1-outputs, respectively (the sum of the two probabilities is 1). The mean of the total rewards will be around 20-something. This is a neural network that is generating a random policy, with no training at all.
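What tf.multinomial does here can be mimicked in plain Python to make the sampling step concrete (a sketch; the injectable rng argument is ours, added only so the behavior is easy to check):

```python
import random

def sample_action(prob_action_0, rng=random.random):
    # Draw action 0 with probability prob_action_0, otherwise action 1,
    # mirroring tf.multinomial over the distribution [p, 1 - p].
    return 0 if rng() < prob_action_0 else 1

print(sample_action(0.7, rng=lambda: 0.50))  # 0 (draw falls below p)
print(sample_action(0.7, rng=lambda: 0.95))  # 1 (draw falls above p)
```

Sampling, rather than always taking the more probable action, is what lets the untrained network explore both actions.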
To train the network, we use tf.nn.sigmoid_cross_entropy_with_logits to define the loss function between the network output and the desired y_target action, defined using the basic simple policy in the previous subsection, so we expect this neural network policy to achieve about the same rewards as the basic non-neural-network policy:

# nn_simple_policy.py
import tensorflow as tf
import numpy as np
import gym

env = gym.make("CartPole-v0")
num_inputs = env.observation_space.shape[0]

inputs = tf.placeholder(tf.float32, shape=[None, num_inputs])
y = tf.placeholder(tf.float32, shape=[None, 1])
hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 1)
outputs = tf.nn.sigmoid(logits)
action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1)
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)
optimizer = tf.train.AdamOptimizer(0.01)
training_op = optimizer.minimize(cross_entropy)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        obs = env.reset()
        while True:
            y_target = np.array([[1. if obs[2] < 0 else 0.]])
            a, _ = sess.run([action, training_op], feed_dict={inputs: obs.reshape(1, num_inputs), y: y_target})
            obs, reward, done, info = env.step(a[0][0])
            if done:
                break
    print("training done")

We define outputs as a sigmoid function of the logits net output, that is, the probability of action 0, and then use tf.multinomial to sample an action. Note that we use the standard tf.train.AdamOptimizer and its minimize method to train the network. To test and see how good the policy is, run the following code:

total_rewards = []
for _ in range(1000):
    rewards = 0
    obs = env.reset()
    while True:
        a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)})
        obs, reward, done, info = env.step(a[0][0])
        rewards += reward
        if done:
            break
    total_rewards.append(rewards)
print(np.mean(total_rewards))

We're now all set to explore how we can implement a policy gradient method on top of this to make our neural network perform much better, getting rewards several times larger. The basic idea of a policy gradient is that in order to train a neural network to generate a better policy, when all an agent knows from the environment is the rewards it can get when taking an action from any given state, we can adopt two new mechanisms:

Discounted rewards: Each action's value needs to consider its future action rewards. For example, an action that gets an immediate reward of 1 but ends the episode two actions (steps) later should have fewer long-term rewards than an action that gets an immediate reward of 1 but ends the episode 10 steps later.

Test runs with gradient updates: Test run the current policy and see which actions lead to higher discounted rewards, then update the current policy's gradients (of the loss for weights) with the discounted rewards, in such a way that an action with higher discounted rewards will, after the network update, have a higher probability of being chosen next time. Repeat such test runs and updates many times to train a neural network for a better policy.

Implementing a policy gradient in TensorFlow

Let's now see how to implement a policy gradient for our CartPole problem in TensorFlow.
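Before wiring discounting into TensorFlow, the idea itself fits in a few lines of plain Python (a sketch; the helper in the next listing additionally normalizes the result by its mean and standard deviation):

```python
def discounted(rewards, discount_rate=0.95):
    # Walk backwards: each step is worth its own reward plus the
    # discounted value of everything that follows it.
    out = [0.0] * len(rewards)
    running = 0.0
    for i in reversed(range(len(rewards))):
        running = rewards[i] + discount_rate * running
        out[i] = running
    return out

# Earlier steps accumulate more future value than later ones.
print(discounted([1, 1, 1], discount_rate=0.5))  # [1.75, 1.5, 1.0]
```

This is why an action taken early in a long episode ends up credited with more value than one taken just before a failure.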
First, import tensorflow, numpy, and gym, and define a helper method that calculates the normalized and discounted rewards: import tensorflow as tf import numpy as np import gym def normalized_discounted_rewards(rewards): dr = np.zeros(len(rewards)) dr[-1] = rewards[-1] for n in range(2, len(rewards)+1): dr[-n] = rewards[-n] + dr[-n+1] * discount_rate return (dr - dr.mean()) / dr.std() Next, create the CartPole gym environment, define the learning_rate and discount_rate hyper-parameters, and build the network with four input neurons, four hidden neurons, and one output neuron as before: env = gym.make("CartPole-v0") learning_rate = 0.05 discount_rate = 0.95 num_inputs = env.observation_space.shape[0] inputs = tf.placeholder(tf.float32, shape=[None, num_inputs]) hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu) logits = tf.layers.dense(hidden, 1) outputs = tf.nn.sigmoid(logits) action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1) prob_action_0 = tf.to_float(1-action) cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=prob_action_0) optimizer = tf.train.AdamOptimizer(learning_rate) To manually fine-tune the gradients to take into consideration the discounted rewards for each action we first use the compute_gradients method, then update the gradients the way we want, and finally call the apply_gradients method. 
So let's now  compute the gradients of the cross-entropy loss for the network parameters (weights and biases), and set up gradient placeholders, which are to be fed later with the values that consider both the computed gradients and the discounted rewards of the actions taken using the current policy during test run: gvs = optimizer.compute_gradients(cross_entropy) gvs = [(g, v) for g, v in gvs if g != None] gs = [g for g, _ in gvs] gps = [] gvs_feed = [] for g, v in gvs: gp = tf.placeholder(tf.float32, shape=g.get_shape()) gps.append(gp) gvs_feed.append((gp, v)) training_op = optimizer.apply_gradients(gvs_feed) The  gvs returned from optimizer.compute_gradients(cross_entropy) is a list of tuples, and each tuple consists of the gradient (of the cross_entropy for a trainable variable) and the trainable variable. If you run the script multiple times from IPython, the default graph of the tf object will contain trainable variables from previous runs, so unless you call tf.reset_default_graph(), you need to use gvs = [(g, v) for g, v in gvs if g != None] to remove those obsolete training variables, which would return None gradients. 
Now, play some games and save the rewards and gradient values: with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for _ in range(1000): rewards, grads = [], [] obs = env.reset() # using current policy to test play a game while True: a, gs_val = sess.run([action, gs], feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards.append(reward) grads.append(gs_val) if done: break After the test play of a game, update the gradients with discounted rewards and train the network (remember that training_op is defined as optimizer.apply_gradients(gvs_feed)): # update gradients and do the training nd_rewards = normalized_discounted_rewards(rewards) gp_val = {} for i, gp in enumerate(gps): gp_val[gp] = np.mean([grads[k][i] * reward for k, reward in enumerate(nd_rewards)], axis=0) sess.run(training_op, feed_dict=gp_val) Finally, after 1,000 iterations of test play and updates, we can test the trained model: total_rewards = [] for _ in range(100): rewards = 0 obs = env.reset() while True: a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards += reward if done: break total_rewards.append(rewards) print(np.mean(total_rewards)) Note that we now use the trained policy network and sess.run to get the next action with the current observation as input. The output mean of the total rewards will be about 200. 
You can also save a trained model after the training using tf.train.Saver:

saver = tf.train.Saver()
saver.save(sess, "./nnpg.ckpt")

Then you can reload it in a separate test program with:

with tf.Session() as sess:
    saver.restore(sess, "./nnpg.ckpt")

Now that you have a powerful neural-network-based policy model that can help your robot stay balanced, fully tested in a simulated environment, you can deploy it in a real physical environment, after replacing the simulated environment's API returns with real environment data, of course, but the code to build and train the neural network reinforcement learning model can certainly be easily reused. If you liked this tutorial and would like to learn more such techniques, pick up this book, Intelligent Mobile Projects with TensorFlow, authored by Jeff Tang.

Read Next

AI on mobile: How AI is taking over the mobile devices marketspace
Introducing Intelligent Apps
AI and the Raspberry Pi: Machine Learning and IoT, What's the Impact?
How to use artificial intelligence to create games with rich and interactive environments [Tutorial]

Sugandha Lahoti
04 Sep 2018
10 min read
Many of the most popular games on the planet have one thing in common: they all have rich, vivid worlds for the player to inhabit and interact with. This doesn't just mean a huge terrain or an extensive map (although it might do), it could simply be how things appear within the world. Similarly, it's not just about the environment - it's also about characters who are able to react in different ways according to the game. The only way to achieve an impressive level of 'realism' is through powerful artificial intelligence. This isn't easy, but it can be done. And learning how to do it will be well worth it, as it will create a much more engaging end product for players. This tutorial is taken from the book Practical Game AI Programming by Micael DaGraca. This book teaches you to create Game AI and implement cutting-edge AI algorithms from scratch. Let's take a look at how we can use AI to create rich environments.

Breaking down the game environment by area

When we create a map, often we have two or more different areas that could be used to change the gameplay, areas that could contain water, quicksand, flying zones, caves, and much more. If we wish to create an AI character that can be used in any level of our game, and anywhere, we need to take this into consideration and make the AI aware of the different zones of the map. Usually, that means that we need to input more information into the character's behavior, including how to react according to the position in which he is currently placed, or a situation where he can choose where to go. Should he avoid some areas? Should he prefer others? This type of information is relevant because it makes the character aware of the surroundings, choosing or adapting and taking into consideration his position. Not planning this correctly can lead to some unnatural decisions.
For example, in Elder Scrolls V: Skyrim developed by Bethesda Softworks studio, we can watch some AI characters of the game simply turning back when they do not have information about how they should behave in some parts of the map, especially on mountains or rivers. Depending on the zones that our character finds, he might react differently or update his behavior tree to adapt to his environment. The environment that surrounds our characters can redefine their priorities or completely change their behaviors. This is a little similar to what Jean-Jacques Rousseau said about humanity: "We are good by nature, but corrupted by society." As humans, we are a representation of the environment that surrounds us, and for that reason, artificial intelligence should follow the same principle. Let's pick a  soldier and update his code to work on a different scenario. We want to change his behavior according to three different zones, beach, river, and forest. So, we'll create three public static Boolean functions with the names Beach, Forest and River; then we define the zones on the map that will turn them on or off. public static bool Beach; public static bool River; public static bool Forest; Because in this example, just one of them can be true at a time, we'll add a simple line of code that disables the other options once one of them gets activated. if(Beach == true) { Forest = false; River = false; } if(Forest == true){ Beach = false; River = false; } if(River == true){ Forest = false; Beach = false; } Once we have that done, we can start defining the different behaviors for each zone. For example, in the beach zone, the characters don't have a place to get cover, so that option needs to be taken away and updated with a new one. The river zone can be used to get across to the other side, so the character can hide from the player and attack from that position. To conclude, we can define the character to be more careful and use the trees to get cover. 
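The three mutually exclusive booleans above amount to a single state variable. For illustration, the same design can be expressed with an enumeration, sketched here in Python rather than the article's C# (the names are ours), which makes it impossible for two zones to be active at once:

```python
from enum import Enum

class Zone(Enum):
    BEACH = "beach"
    RIVER = "river"
    FOREST = "forest"

# One variable holds the current zone, so the manual "turn the other
# flags off" bookkeeping from the boolean version disappears.
current_zone = Zone.FOREST
print(current_zone is Zone.FOREST)  # True
print(current_zone is Zone.BEACH)   # False
```

The equivalent in Unity C# would be an enum field on the character's script; either way, zone-specific behavior becomes a switch on one value instead of three interdependent flags.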
Depending on the zones, we can change the values to better adapt to the environment, or create new functions that would allow us to use some specific characteristics of that zone. if (Forest == true) {// The AI will remain passive until an interaction with the player occurs if (Health == 100 && triggerL == false && triggerR == false && triggerM == false) { statePassive = true; stateAggressive = false; stateDefensive = false; } // The AI will shift to the defensive mode if player comes from the right side or if the AI is below 20 HP if (Health <= 100 && triggerR == true || Health <= 20) { statePassive = false; stateAggressive = false; stateDefensive = true; } // The AI will shift to the aggressive mode if player comes from the left side or it's on the middle and AI is above 20HP if (Health > 20 && triggerL == true || Health > 20 && triggerM == true) { statePassive = false; stateAggressive = true; stateDefensive = false; } walk = speed * Time.deltaTime; walk = speedBack * Time.deltaTime; } Advanced environment interactions with AI As the video game industry and the technology associated with it kept evolving, new gameplay ideas appeared, and rapidly, the interaction between the characters of the game and the environment became even more interesting, especially when using physics. This means that the outcome of the environment could be completely random, where it was required for the AI characters to constantly adapt to different situations. One honorable mention on this subject is the video game Worms developed by Team17, where the map can be fully destroyed and the AI characters of the game are able to adapt and maintain smart decisions. The objective of this game is to destroy the opponent team by killing all their worms, the last man standing wins. From the start, the characters can find some extra health points or ammunition on the map and from time to time, it drops more points from the sky. 
So, there are two main objectives for the character: survive and kill. To survive, he needs to keep a decent amount of HP and stay away from the enemy; to kill, he needs to choose the best character to shoot and take as much health as possible from him. Meanwhile, the map gets destroyed by the bombs and all of the firepower used by the characters, making it a challenge for artificial intelligence.

Adapting to unstable terrain

Let's decompose this example and create a character that could be used in this game. We'll start by looking at the map. At the bottom, there's water that automatically kills the worms. Then, we have the terrain where the worms can walk, or destroy if needed. Finally, there's the absence of terrain, specifically, the empty space that cannot be walked on. Then we have the characters (worms): they are placed in random positions at the beginning of the game, and they can walk, jump, and shoot. The characters of the game should be able to constantly adapt to the instability of the terrain, so we need to use that and make it part of the behavior tree. As demonstrated in the diagram above, the character will need to understand the position where he is currently placed, as well as the opponent's position, health, and items. Because the terrain can be blocking them, the AI character has a chance of being in a situation where he cannot attack or obtain an item. So, we give him options on what to do in those situations and many others that he might find, but the most important thing is to define what happens if he cannot successfully accomplish any of them. Because the terrain can be shaped into different forms, there will be times during gameplay when it is nearly impossible to do anything, and that is why we need to provide options for those situations. For example, in a situation where the worm doesn't have enough free space to move, a close item to pick up, or an enemy that can be properly attacked, what should he do?
It's necessary to make information about the surroundings available to our character so he can make a good judgment for that situation. In this scenario, we have defined our character to shoot anyway against the closest enemy, or to stay close to a wall. Because he is too close to the explosion that would occur from attacking the closest enemy, he should decide to stay in a corner and wait there until the next turn.

Using raycast to evaluate decisions

Ideally, at the start of the turn, the character has two raycasts, one for his left side and another for the right side. These check whether there's a wall obstructing either of those directions, which can be used to determine what side the character should move toward if he wants to protect himself from being attacked. Then, we would use another raycast in the aim direction to see if there's something blocking the way when the character is preparing to shoot. If there's something in the middle, the character should calculate the distance between the two to determine if it's still safe to shoot. So, each character should have a shared list of all of the worms that are currently in the game; that way, they can compare the distances between them all, choose which of them is closest, and shoot it. Additionally, we add the two raycasts to check if there's something blocking the sides, and we have the basic information to make the character adapt to the constant modifications of the terrain.
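The "choose the closest worm from the shared list" step can be sketched independently of Unity (Python used for illustration; note that the C# listing that follows approximates proximity with a simple sum of coordinate deltas instead of a true distance):

```python
def closest_enemy(me, enemies):
    # me and each enemy are (x, y) positions; return the enemy at the
    # smallest straight-line distance so the worm knows whom to target.
    def dist(enemy):
        dx, dy = enemy[0] - me[0], enemy[1] - me[1]
        return (dx * dx + dy * dy) ** 0.5
    return min(enemies, key=dist)

print(closest_enemy((0, 0), [(3, 4), (1, 1), (10, 0)]))  # (1, 1)
```

Comparing squared distances would work just as well here and avoids the square root; the point is that every worm ranks the others by distance before deciding whether the nearest one is safe to attack.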
public int HP; public int Ammunition; public static List<GameObject> wormList = new List<GameObject>(); //creates a list with all the worms public static int wormCount; //Amount of worms in the game public int ID; //It's used to differentiate the worms private float proximityValueX; private float proximityValueY; private float nearValue; public float distanceValue; //how far the enemy should be private bool canAttack; void Awake () { wormList.Add(gameObject); //add this worm to the list wormCount++; //adds plus 1 to the amount of worms in the game } void Start () { HP = 100; distanceValue = 30f; } void Update () { proximityValueX = wormList[1].transform.position.x - this.transform.position.x; proximityValueY = wormList[1].transform.position.y - this.transform.position.y; nearValue = proximityValueX + proximityValueY; if(nearValue <= distanceValue) { canAttack = true; } else { canAttack = false; } Vector3 raycastRight = transform.TransformDirection(Vector3.forward); if (Physics.Raycast(transform.position, raycastRight, 10)) print("There is something blocking the Right side!"); Vector3 raycastLEft = transform.TransformDirection(Vector3.forward); if (Physics.Raycast(transform.position, raycastRight, -10)) print("There is something blocking the Left side!"); } In this post, we explored different ways to interact with the environment. First, we learned how to break down the game environment by area. Then we learned about the advanced environment interactions with AI. To learn about manipulating animation behavior with AI read our book  Practical Game AI Programming. Read Next Developing Games Using AI Techniques and Practices of Game AI Unite Berlin 2018 Keynote: Unity partners with Google, launches Ml-Agents ToolKit 0.4, Project MARS and more
Implementing cost-effective IoT analytics for predictive maintenance [Tutorial]

Prasad Ramesh
04 Sep 2018
10 min read
Predictive maintenance is a common value proposition cited for IoT analytics. In this tutorial, we will look at a value formula for net savings. Then we walk through an example as a way to highlight how to think financially about when it makes sense to implement a decision and when it does not. The economics of predictive maintenance may not be entirely obvious. Believe it or not, it does not always make sense, even if you can predict early failures accurately. In many cases, you will actually lose money by doing it. Even when it can save you money, there is an optimal point for when it should be used. The optimal point depends on the costs and the accuracy of the predictive model. This article is an excerpt from a book written by Andrew Minteer titled Analytics for the Internet of Things (IoT).

The value formula

A formula to guide decision making compares the cost of allowing a failure to occur versus the cost to proactively repair the component while considering the probability of predicting the failure:

Net Savings = (Cost of Failure * (Expected Number of Failures - Expected True Positive Predictions)) - (Proactive Repair Cost * (Expected True Positives + Expected False Positives))

If the cost of failure is the same as the proactive repair cost, even with a perfect prediction model, then there will be no savings. Make sure to include intangible costs in the cost of failure. Some examples of intangible costs include legal expenses, loss of brand equity, and even the customer's expenses. Predictive repair does make sense when there is a large spread between the cost of failure and the cost of proactive replacement, combined with a well-performing prediction model. For example, if the cost of a failure is a locomotive engine replacement at $1 million USD and the cost of a proactive repair is $200 USD, then the accuracy of the model does not even have to be all that great before a proactive replacement program makes financial sense.
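Measured against a do-nothing baseline, the pieces of this tradeoff can be turned into a small calculator. A sketch under stated assumptions: we read net savings as the failure costs avoided by true positives minus the cost of every proactive repair performed, the break-even helper applies the same logic per unit, and the prediction counts used below are illustrative, not taken from the article:

```python
def net_savings(cost_failure, cost_proactive, true_positives, false_positives):
    # Failures avoided (true positives) save the full failure cost;
    # every proactive repair (true or false positive) costs money.
    avoided = cost_failure * true_positives
    spent = cost_proactive * (true_positives + false_positives)
    return avoided - spent

def breakeven_probability(cost_failure, cost_proactive):
    # A single unit is worth repairing proactively only when its expected
    # failure cost (p * cost_failure) exceeds the certain repair cost.
    return cost_proactive / cost_failure

# Equal costs plus a perfect model (no false positives) break even,
# matching the observation above.
print(net_savings(253, 253, true_positives=100, false_positives=0))   # 0
print(net_savings(1000, 253, true_positives=75, false_positives=30))  # 48435
print(breakeven_probability(1000, 253))  # 0.253
```

With a large spread (the locomotive example), the break-even probability is tiny, so even a rough model pays off; with a narrow spread (the turbocharger example that follows, 350/400), it approaches 1 and demands a very accurate model.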
On the other hand, if the failure is a $400 USD automotive turbocharger replacement, and the proactive repair cost is $350 USD for a turbocharger actuator subcomponent replacement, the predictive model would need to be highly accurate for that to make financial sense.

An example of making a value decision

To illustrate the example, we will walk through a business situation and then some R code that simulates a cost-benefit curve for that decision. The code will use a fitted predictive model to calculate the net savings (or lack thereof) to generate a cost curve. The cost curve can then be used in a business decision on what proportion of units with predicted failures should have a proactive replacement. Imagine you work for a company that builds diesel-powered generators. There is a coolant control valve that normally lasts for 4,000 hours of operation until there is a planned replacement. From the analysis, your company has realized that the generators built two years prior are experiencing earlier-than-expected failure of the valve. When the valve fails, the engine overheats and several other components are damaged. The cost of failure, including labor rates for repair personnel and the cost to the customer for downtime, is an average of $1,000 USD. The cost of a proactive replacement of the valve is $253 USD. Should you replace all coolant valves in the population? It depends on how high a failure rate is expected. In this case, about 10% of the current non-failed units are expected to fail before the scheduled replacement. Also, importantly, it matters how well you can predict the failures. The following R code simulates this situation and uses a simple predictive model (logistic regression) to estimate a cost curve. The model has an AUC of close to 0.75.
This will vary as you run the code since the dataset is randomly simulated:

#make sure all needed packages are installed
if(!require(caret)){ install.packages("caret") }
if(!require(pROC)){ install.packages("pROC") }
if(!require(dplyr)){ install.packages("dplyr") }
if(!require(data.table)){ install.packages("data.table") }

#Load required libraries
library(caret)
library(pROC)
library(dplyr)
library(data.table)

#Generate sample data
simdata = function(N=1000) {
  #simulate 4 features
  X = data.frame(replicate(4,rnorm(N)))
  #create a hidden data structure to learn
  hidden = X[,1]^2+sin(X[,2]) + rnorm(N)*1
  #10% TRUE, 90% FALSE
  rare.class.probability = 0.1
  #simulate the true classification values
  y.class = factor(hidden<quantile(hidden,c(rare.class.probability)))
  return(data.frame(X,Class=y.class))
}

#make some data structure
model_data = simdata(N=50000)

#train a logistic regression model on the simulated data
training <- createDataPartition(model_data$Class, p = 0.6, list=FALSE)
trainData <- model_data[training,]
testData <- model_data[-training,]
glmModel <- glm(Class ~ ., data=trainData, family=binomial)
testData$predicted <- predict(glmModel, newdata=testData, type="response")

#calculate AUC
roc.glmModel <- pROC::roc(testData$Class, testData$predicted)
auc.glmModel <- pROC::auc(roc.glmModel)
print(auc.glmModel)

#Pull together test data and predictions
simModel <- data.frame(trueClass = testData$Class, predictedClass = testData$predicted)

# Reorder rows and columns
simModel <- simModel[order(simModel$predictedClass, decreasing = TRUE), ]
simModel <- select(simModel, trueClass, predictedClass)
simModel$rank <- 1:nrow(simModel)

#Assign costs for failures and proactive repairs
proactive_repair_cost <- 253 # Cost of proactively repairing a part
failure_repair_cost <- 1000 # Cost of a failure of the part (include all costs such as lost production, etc not just the repair cost)

# Define each predicted/actual combination
fp.cost <- proactive_repair_cost # The part was predicted to fail but did not (False Positive)
fn.cost <- failure_repair_cost # The part was not predicted to fail and it did (False Negative)
tp.cost <- (proactive_repair_cost - failure_repair_cost) # The part was predicted to fail and it did (True Positive). This will be negative for a savings.
tn.cost <- 0.0 # The part was not predicted to fail and it did not (True Negative) #incorporate probability of future failure simModel$future_failure_prob <- prob_failure #Function to assign costs for each instance assignCost <- function(pred, outcome, tn.cost, fn.cost, fp.cost, tp.cost, prob){ cost <- ifelse(pred == 0 & outcome == FALSE, tn.cost, # No cost since no action was taken and no failure ifelse(pred == 0 & outcome == TRUE, fn.cost, # The cost of no action and a repair resulted ifelse(pred == 1 & outcome == FALSE, fp.cost, # The cost of proactive repair which was not needed ifelse(pred == 1 & outcome == TRUE, tp.cost, 999999999)))) # The cost of proactive repair which avoided a failure return(cost) } # Initialize list to hold final output master <- vector(mode = "list", length = 100) #use the simulated model. In practice, this code can be adapted to compare multiple models test_model <- simModel # Create a loop to increment through dynamic threshold (starting at 1.0 [no proactive repairs] to 0.0 [all proactive repairs]) threshold <- 1.00 for (i in 1:101) { #Add predicted class with percentile ranking test_model$prob_ntile <- ntile(test_model$predictedClass, 100) / 100 # Dynamically determine if proactive repair would apply based on incrementing threshold test_model$glm_failure <- ifelse(test_model$prob_ntile >= threshold, 1, 0) test_model$threshold <- threshold # Compare to actual outcome to assign costs test_model$glm_impact <- assignCost(test_model$glm_failure, test_model$trueClass, tn.cost, fn.cost, fp.cost, tp.cost, test_model$future_failure_prob) # Compute cost for not doing any proactive repairs test_model$nochange_impact <- ifelse(test_model$trueClass == TRUE, fn.cost, tn.cost) # *test_model$future_failure_prob) # Running sum to produce the overall impact test_model$glm_cumul_impact <- cumsum(test_model$glm_impact) / nrow(test_model) test_model$nochange_cumul_impact <- cumsum(test_model$nochange_impact) / nrow(test_model) # Count the # of classified 
failures test_model$glm_failure_ct <- cumsum(test_model$glm_failure) # Create new object to house the one row per iteration output for the final plot master[[i]] <- test_model[nrow(test_model),] # Reduce the threshold by 1% and repeat to calculate new value threshold <- threshold - 0.01 } finalOutput <- rbindlist(master) finalOutput <- subset(finalOutput, select = c(threshold, glm_cumul_impact, glm_failure_ct, nochange_cumul_impact) ) # Set baseline to costs of not doing any proactive repairs baseline <- finalOutput$nochange_cumul_impact # Plot the cost curve par(mfrow = c(2,1)) plot(row(finalOutput)[,1], finalOutput$glm_cumul_impact, type = "l", lwd = 3, main = paste("Net Costs: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost, sep = ""), ylim = c(min(finalOutput$glm_cumul_impact) - 100, max(finalOutput$glm_cumul_impact) + 100), xlab = "Percent of Population", ylab = "Net Cost ($) / Unit") # Plot the cost difference of proactive repair program and a 'do nothing' approach plot(row(finalOutput)[,1], baseline - finalOutput$glm_cumul_impact, type = "l", lwd = 3, col = "black", main = paste("Savings: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost,sep = ""), ylim = c(min(baseline - finalOutput$glm_cumul_impact) - 100, max(baseline - finalOutput$glm_cumul_impact) + 100), xlab = "% of Population", ylab = "Savings ($) / Unit") abline(h=0,col="gray")   As seen in the resulting net cost and savings curves, based on the model's predictions, the optimal savings would be from a proactive repair program of the top 30 percentile units. The savings decreases after this, although you would still save money when replacing up to 75% of the population. After this point, you should expect to spend more than you save. 
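The same cost-assignment and threshold-sweep logic is easy to prototype outside of R. Here is a minimal Python sketch (not taken from the book; the scores and outcomes are made up) that assigns the four per-unit costs and sweeps a threshold over ranked failure scores to find the cheapest operating point:

```python
# Minimal sketch of the cost-curve idea: assign a cost to each
# (predicted, actual) pair, then sweep a threshold over ranked scores.
PROACTIVE = 253   # cost of a proactive repair
FAILURE = 1000    # cost of an in-service failure

def unit_cost(predicted_repair: bool, failed: bool) -> float:
    """Cost of one unit under a given decision/outcome pair."""
    if predicted_repair and failed:
        return PROACTIVE - FAILURE  # repair that avoided a failure (a saving)
    if predicted_repair:
        return PROACTIVE            # unnecessary proactive repair
    if failed:
        return FAILURE              # missed failure
    return 0.0                      # nothing done, nothing failed

def sweep(scores, outcomes, steps=101):
    """Return (threshold, average net cost) pairs for thresholds from 1.0 down to 0.0."""
    curve = []
    for i in range(steps):
        t = 1.0 - i / (steps - 1)
        total = sum(unit_cost(s >= t, o) for s, o in zip(scores, outcomes))
        curve.append((t, total / len(scores)))
    return curve

# Toy data: 10 units, scores roughly aligned with the true outcomes
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
outcomes = [True, True, False, True, False, False, False, False, False, False]
best_threshold, best_cost = min(sweep(scores, outcomes), key=lambda tc: tc[1])
```

With these toy numbers, the minimum average cost lands where the four highest-scoring units are proactively repaired, which mirrors the "repair the top N percent" reading of the R cost curve.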
The following set of charts is the output from the preceding R code:

Cost and savings curves for the proactive repair $253 and failure cost at $1,000 scenario

Note the changes in the following graph when the failure cost drops to $300 USD. At no point do you save money, as the proactive repair cost will always outweigh the reduced failure cost. This does not mean you should not do a proactive repair; you may still want to do so in order to satisfy your customers. Even in such a case, this cost curve method can help with decisions about how much you are willing to spend to address the problem. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 300 to generate the following charts:

Cost and savings curves for the proactive repair $253 and failure cost at $300 scenario

And finally, notice how the savings curve changes when the failure cost moves to $5,000. The spread between the proactive repair cost and the failure cost determines much of when a proactive repair makes business sense. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 5000 to generate the following charts:

Cost and savings curves for the proactive repair $253 and failure cost at $5,000 scenario

Ultimately, the decision is a business case based on the expected costs and benefits. ML modeling can help optimize savings under the right conditions, and cost curves help determine the expected costs and savings of proactive replacements.

In this tutorial, we looked at implementing economically cost-effective IoT analytics for predictive maintenance, with an example. To explore IoT analytics and the cloud further, check out the book Analytics for the Internet of Things (IoT).

AWS IoT Analytics: The easiest way to run analytics on IoT data, Amazon says
Build an IoT application with Azure IoT [Tutorial]
Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2

Build intelligent interfaces with CoreML using a CNN [Tutorial]

Savia Lobo
03 Sep 2018
19 min read
Core ML gives the potential for devices to better serve us rather than us serving them. This adheres to a rule stated by developer Eric Raymond that a computer should never ask the user for any information that it can auto-detect, copy, or deduce. This article is an excerpt taken from Machine Learning with Core ML, written by Joshua Newnham. In today's post, we will implement an application that attempts to guess what the user is trying to draw and offers pre-drawn images that the user can substitute their sketch with (image search). We will explore two techniques: first, using a convolutional neural network (CNN), which we are becoming familiar with, to make the prediction; then, applying a context-based similarity sorting strategy to better align the suggestions with what the user is trying to sketch.

Reviewing the training data and model

We will be using a slightly smaller set, with 205 out of the 250 categories; the exact categories can be found in the CSV file /Chapter7/Training/sketch_classes.csv, along with the Jupyter Notebooks used to prepare the data and train the model. The original sketches are available in SVG and PNG formats. Because we're using a CNN, rasterized images (PNG) were used, but rescaled from 1111 x 1111 to 256 x 256; this is the expected input of our model. The data was then split into a training and a validation set, using 80% (64 samples from each category) for training and 20% (17 samples from each category) for validation. After 68 iterations (epochs), the model was able to achieve an accuracy of approximately 65% on the validation data. Not exceptional, but if we consider the top two or three predictions, this accuracy increases to nearly 90%.
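That last figure is simply top-k accuracy: a prediction counts as correct if the true class appears anywhere among the model's k most probable classes. A small Python sketch of the metric (the labels and probabilities below are invented for illustration, not taken from the sketch model):

```python
def top_k_accuracy(predictions, true_labels, k=3):
    """predictions: list of dicts mapping label -> probability.
    A sample counts as correct if the true label is among the
    k highest-probability classes."""
    hits = 0
    for probs, truth in zip(predictions, true_labels):
        ranked = sorted(probs, key=probs.get, reverse=True)[:k]
        if truth in ranked:
            hits += 1
    return hits / len(true_labels)

# Two hypothetical predictions: the first is right on the top-1 guess,
# the second only within the top 2 guesses.
preds = [
    {"cat": 0.7, "dog": 0.2, "car": 0.1},
    {"house": 0.5, "barn": 0.3, "tent": 0.2},
]
truth = ["cat", "barn"]
```

Here top-1 accuracy is 0.5 while top-2 accuracy is 1.0, which is exactly the kind of gap the 65%-versus-90% comparison above describes.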
The following diagram shows plots comparing training and validation accuracy, and loss, during training:

With our model trained, our next step is to export it using the Core ML Tools made available by Apple (as discussed in previous chapters) and import it into our project.

Classifying sketches

Here we will walk through importing the Core ML model into our project and hooking it up, including using the model to perform inference on the user's sketch, as well as searching for and suggesting substitute images that the user can swap their sketch with. Let's get started with importing the Core ML model into our project.

Locate the model in the project repositories folder /CoreMLModels/Chapter7/cnnsketchclassifier.mlmodel; with the model selected, drag it into your Xcode project, leaving the defaults for the Import options. Once imported, select the model to inspect the details, which should look similar to the following screenshot:

As with all our models, we verify that the model is included in the target by checking that the appropriate Target Membership is ticked, and then we turn our attention to the inputs and outputs, which should be familiar by now. We can see that our model expects a single-channel (grayscale) 256 x 256 image and returns the dominant class via the classLabel property of the output, along with a dictionary of probabilities for all classes via the classLabelProbs property.

With our model now imported, let's discuss how we will integrate it into our project. Recall that our SketchView emits the events UIControlEvents.editingDidStart, UIControlEvents.editingChanged, and UIControlEvents.editingDidEnd as the user draws. If you inspect the SketchViewController, you will see that we have already registered to listen for the UIControlEvents.editingDidEnd event, as shown in the following code snippet:

override func viewDidLoad() {
    super.viewDidLoad()
    ...
    ...
    self.sketchView.addTarget(self,
        action: #selector(SketchViewController.onSketchViewEditingDidEnd),
        for: .editingDidEnd)

    queryFacade.delegate = self
}

Each time the user ends a stroke, we will start the process of trying to guess what the user is sketching and search for suitable substitutes. This functionality is triggered via the .editingDidEnd action method onSketchViewEditingDidEnd, but is delegated to the class QueryFacade, which is responsible for implementing it. This is where we will spend the majority of our time in this section and the next. It's also worth highlighting the statement queryFacade.delegate = self in the previous snippet: QueryFacade performs most of its work off the main thread and notifies this delegate of the status and results once finished, which we will get to in a short while.

Let's start by implementing the functionality of the onSketchViewEditingDidEnd method, before turning our attention to the QueryFacade class. Within the SketchViewController class, navigate to the onSketchViewEditingDidEnd method and append the following code:

guard self.sketchView.currentSketch != nil,
    let sketch = self.sketchView.currentSketch as? StrokeSketch else {
    return
}
queryFacade.asyncQuery(sketch: sketch)

Here, we get the current sketch, returning early if no sketch is available or if it's not a StrokeSketch; otherwise, we hand it over to our queryFacade (an instance of the QueryFacade class).

Let's now turn our attention to the QueryFacade class; select the QueryFacade.swift file from the left-hand panel within Xcode to bring it up in the editor area. A lot of plumbing has already been implemented to allow us to focus our attention on the core functionality of predicting, searching, and sorting.
Let's quickly discuss some of the details, starting with the properties:

let context = CIContext()
let queryQueue = DispatchQueue(label: "query_queue")
var targetSize = CGSize(width: 256, height: 256)
weak var delegate : QueryDelegate?

var currentSketch : Sketch? {
    didSet {
        self.newQueryWaiting = true
        self.queryCanceled = false
    }
}

fileprivate var queryCanceled : Bool = false
fileprivate var newQueryWaiting : Bool = false
fileprivate var processingQuery : Bool = false

var isProcessingQuery : Bool {
    get { return self.processingQuery }
}

var isInterrupted : Bool {
    get { return self.queryCanceled || self.newQueryWaiting }
}

QueryFacade is only concerned with the most recent sketch. Therefore, each time a new sketch is assigned via the currentSketch property, newQueryWaiting is set to true (and queryCanceled is reset), which makes isInterrupted report true for any in-flight query. During each task (such as performing prediction, searching, and downloading), we check the isInterrupted property; if it is true, we exit early and proceed to process the latest sketch.

When you pass a sketch to the asyncQuery method, the sketch is assigned to the currentSketch property, and queryCurrentSketch is then called to do the bulk of the work, unless a query is currently being processed:

func asyncQuery(sketch: Sketch) {
    self.currentSketch = sketch

    if !self.processingQuery {
        self.queryCurrentSketch()
    }
}

fileprivate func processNextQuery() {
    self.queryCanceled = false

    if self.newQueryWaiting && !self.processingQuery {
        self.queryCurrentSketch()
    }
}

fileprivate func queryCurrentSketch() {
    guard let sketch = self.currentSketch else {
        self.processingQuery = false
        self.newQueryWaiting = false
        return
    }

    self.processingQuery = true
    self.newQueryWaiting = false

    queryQueue.async {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(
                status: self.isInterrupted ? -1 : -1,
                result: nil)
            self.processNextQuery()
        }
    }
}

Let's work bottom-up by implementing all the supporting methods before we tie everything together within the queryCurrentSketch method.
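The interruption bookkeeping above is language-agnostic: a query runs off the main thread, and any newly assigned sketch marks in-flight work as stale so it can bail out early. Here is a condensed, single-threaded Python sketch of the same "latest request wins" idea (the names submit and run_pipeline are invented for illustration; the real class dispatches onto GCD queues):

```python
class QueryFacade:
    """Keeps only the most recent query; stale pipelines exit early."""

    def __init__(self):
        self.current = None
        self.stale = False
        self.results = []

    def submit(self, sketch):
        # A newer sketch makes any in-flight pipeline stale.
        self.stale = self.current is not None
        self.current = sketch

    def run_pipeline(self, on_step=None):
        """Run classify -> search -> download for the current sketch,
        checking for interruption between steps. on_step lets the caller
        simulate events (e.g. a new stroke) arriving mid-pipeline."""
        sketch = self.current
        self.stale = False
        for step in ("classify", "search", "download"):
            if on_step:
                on_step(step)
            if self.stale:
                return None  # abandoned: a newer sketch arrived
        self.results.append(sketch)
        return sketch

facade = QueryFacade()
facade.submit("cat")
first = facade.run_pipeline()

def new_stroke(step):
    # Simulates the user drawing again while the query is mid-flight
    if step == "search":
        facade.submit("house")

facade.submit("dog")
interrupted = facade.run_pipeline(on_step=new_stroke)  # abandoned for "house"
final = facade.run_pipeline()
```

The "dog" pipeline is abandoned as soon as "house" arrives, so only "cat" and "house" ever produce results, which is exactly the behaviour the isInterrupted checks buy us in the Swift class.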
Let's start by declaring an instance of our model; add the following variable near the top of the QueryFacade class:

let sketchClassifier = cnnsketchclassifier()

Now, with our model instantiated and ready, navigate to the classifySketch method of the QueryFacade class; it is here that we will use our imported model to perform inference, but let's first review what already exists:

func classifySketch(sketch: Sketch) -> [(key: String, value: Double)]? {
    if let img = sketch.exportSketch(size: nil)?
        .resize(size: self.targetSize).rescalePixels() {
        return self.classifySketch(image: img)
    }
    return nil
}

func classifySketch(image: CIImage) -> [(key: String, value: Double)]? {
    return nil
}

Here, we see that classifySketch is overloaded, with one method accepting a Sketch and the other a CIImage. The former, when called, obtains the rasterized version of the sketch using the exportSketch method. If successful, it resizes the rasterized image using the targetSize property, rescales the pixels, and then passes the prepared CIImage along to the alternative classifySketch method.

Pixel values are in the range of 0-255 (per channel; in this case, just a single channel). Typically, you try to avoid feeding large numbers into your network, because they make it more difficult for your model to learn (converge). It is somewhat analogous to trying to drive a car whose steering wheel can only be turned hard left or hard right: these extremes would cause a lot of over-steering and make navigating anywhere extremely difficult.

The second classifySketch method is responsible for performing the actual inference. Add the following code within the classifySketch(image: CIImage) method:

if let pixelBuffer = image.toPixelBuffer(context: self.context, gray: true) {
    let prediction = try? self.sketchClassifier.prediction(image: pixelBuffer)

    if let classPredictions = prediction?.classLabelProbs {
        let sortedClassPredictions = classPredictions.sorted(by: { (kvp1, kvp2) -> Bool in
            kvp1.value > kvp2.value
        })
        return sortedClassPredictions
    }
}
return nil

Here, we use the image's toPixelBuffer method, an extension we added to the CIImage class, to obtain a grayscale CVPixelBuffer representation of it. With a reference to the buffer, we pass it to the prediction method of our model instance, sketchClassifier, to obtain the probabilities for each label. Finally, we sort these probabilities from most likely to least likely before returning the sorted results to the caller.

Now, with some inkling as to what the user is trying to sketch, we can proceed to search for and download the candidates we are most confident about. The task of searching and downloading is the responsibility of the downloadImages method within the QueryFacade class. This method makes use of an existing BingService that exposes methods for searching and downloading images. Let's hook this up now; jump into the downloadImages method and append the following highlighted code to its body:

func downloadImages(searchTerms: [String],
                    searchTermsCount: Int = 4,
                    searchResultsCount: Int = 2) -> [CIImage]? {
    var bingResults = [BingServiceResult]()

    for i in 0..<min(searchTermsCount, searchTerms.count) {
        let results = BingService.sharedInstance.syncSearch(
            searchTerm: searchTerms[i], count: searchResultsCount)

        for bingResult in results {
            bingResults.append(bingResult)
        }

        if self.isInterrupted {
            return nil
        }
    }
}

The downloadImages method takes the arguments searchTerms, searchTermsCount, and searchResultsCount. searchTerms is a sorted list of labels returned by our classifySketch method; searchTermsCount determines how many of these search terms we use (defaulting to 4), and searchResultsCount limits the number of results returned for each search term.
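Independently of Swift, the "sorted probabilities become search terms" hand-off can be sketched in a few lines of Python (the labels and probabilities here are made up for illustration):

```python
def search_terms_from_prediction(class_probs, terms_count=4):
    """Sort label -> probability pairs (most likely first) and keep the
    top terms_count labels as image-search queries, mirroring how
    classifySketch's output feeds downloadImages."""
    ranked = sorted(class_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:terms_count]]

# Hypothetical model output for a sketch
probs = {"airplane": 0.05, "bird": 0.30, "kite": 0.45,
         "butterfly": 0.15, "hat": 0.05}
```

With the default terms_count of 4, the facade would issue image searches for the four most probable labels and drop the rest.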
The downloadImages code above performs a sequential search using the search terms passed into the method. As mentioned previously, we are using Microsoft's Bing Image Search API here, which requires registration, something we will return to shortly. After each search, we check the isInterrupted property to see whether we need to exit early; otherwise, we continue on to the next search.

Each result returned by the search includes a URL referencing an image; we will use this next to download the image for each of the results, before returning an array of CIImage to the caller. Let's add this now. Append the following code to the downloadImages method:

var images = [CIImage]()

for bingResult in bingResults {
    if let image = BingService.sharedInstance.syncDownloadImage(
        bingResult: bingResult) {
        images.append(image)
    }

    if self.isInterrupted {
        return nil
    }
}

return images

As before, the process is synchronous, and after each download we check the isInterrupted property to see if we need to exit early, otherwise returning the list of downloaded images to the caller.

So far, we have implemented the functionality to support prediction, searching, and downloading; our next task is to hook all of this up. Head back to the queryCurrentSketch method and add the following code within the queryQueue.async block.
Ensure that you replace the DispatchQueue.main.async block:

queryQueue.async {
    guard let predictions = self.classifySketch(sketch: sketch) else {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(status: -1, result: nil)
            self.processNextQuery()
        }
        return
    }

    let searchTerms = predictions.map({ (key, value) -> String in
        return key
    })

    guard let images = self.downloadImages(
        searchTerms: searchTerms,
        searchTermsCount: 4) else {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(status: -1, result: nil)
            self.processNextQuery()
        }
        return
    }

    guard let sortedImage = self.sortByVisualSimilarity(
        images: images,
        sketch: sketch) else {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(status: -1, result: nil)
            self.processNextQuery()
        }
        return
    }

    DispatchQueue.main.async {
        self.processingQuery = false
        self.delegate?.onQueryCompleted(
            status: self.isInterrupted ? -1 : 1,
            result: QueryResult(
                predictions: predictions,
                images: sortedImage))
        self.processNextQuery()
    }
}

It's a large block of code, but nothing complicated; let's quickly walk through it. We start by calling the classifySketch method we just implemented. As you may recall, this method returns a sorted list of label-probability pairs, unless interrupted, in which case nil is returned. We handle this by notifying the delegate before exiting the method early (a check we apply to all of our tasks).

Once we've obtained the list of sorted labels, we pass them to the downloadImages method to receive the associated images, which we then pass to the sortByVisualSimilarity method. This method currently returns just the list of images, but it's something we will get back to in the next section.
Finally, the method passes the status and sorted images, wrapped in a QueryResult instance, to the delegate via the main thread, before checking whether it needs to process a new sketch (by calling the processNextQuery method).

At this stage, we have implemented all the functionality required to download substitute images based on our guess of what the user is currently sketching. Now we just need to jump into the SketchViewController class to hook this up, but before doing so, we need to obtain a subscription key for Bing's Image Search. Within your browser, head to https://azure.microsoft.com/en-gb/services/cognitive-services/bing-image-search-api/ and click on Try Bing Image Search API, as shown in the following screenshot:

After clicking on Try Bing Image Search API, you will be presented with a series of dialogs; read, and once (if) agreed, sign in or register. Continue following the screens until you reach a page informing you that the Bing Search API has been successfully added to your subscription, as shown in the following screenshot:

On this page, scroll down until you come across the entry Bing Search APIs v7. If you inspect this block, you should see a list of Endpoints and Keys. Copy and paste one of these keys into the BingService.swift file, replacing the value of the constant subscriptionKey; the following screenshot shows the web page containing the service key:

Return to the SketchViewController by selecting the SketchViewController.swift file from the left-hand panel, and locate the method onQueryCompleted:

func onQueryCompleted(status: Int, result: QueryResult?) {
}

Recall that this is a method signature defined in the QueryDelegate protocol, which QueryFacade uses to notify the delegate when the query fails or completes. It is here that we will present the matching images found through the process we just implemented. We do this by first checking the status.
If it is deemed successful (greater than zero), we remove every item referenced in the queryImages array, which is the data source for the UICollectionView used to present suggested images to the user. Once emptied, we iterate through all the images referenced in the QueryResult instance, adding them to the queryImages array before asking the UICollectionView to reload its data. Add the following code to the body of the onQueryCompleted method:

guard status > 0 else {
    return
}

queryImages.removeAll()

if let result = result {
    for cimage in result.images {
        if let cgImage = self.ciContext.createCGImage(cimage, from: cimage.extent) {
            queryImages.append(UIImage(cgImage: cgImage))
        }
    }
}

toolBarLabel.isHidden = queryImages.count == 0
collectionView.reloadData()

There we have it; everything is in place to guess what the user draws and present possible suggestions. Now is a good time to build and run the application on either the simulator or a device to check whether everything is working correctly. If so, you should see something similar to the following:

There is one more thing left to do before finishing off this section. Remembering that our goal is to help the user quickly sketch out a scene or something similar, our hypothesis is that guessing what the user is drawing and suggesting ready-drawn images will help them achieve their task. So far, we have performed prediction and provided suggestions, but currently the user is unable to replace their sketch with any of the presented suggestions. Let's address this now.

Our SketchView currently only renders StrokeSketch (which encapsulates the metadata of the user's drawing). Because our suggestions are rasterized images, our choice is to either extend this class (to render both strokes and rasterized images) or create a new concrete implementation of the Sketch protocol.
In this example, we will opt for the latter and implement a new type of Sketch capable of rendering a rasterized image. Select the Sketch.swift file to bring it into focus in the editor area of Xcode, scroll to the bottom, and add the following code:

class ImageSketch : Sketch {
    var image : UIImage!
    var size : CGSize!
    var origin : CGPoint!
    var label : String!

    init(image: UIImage, origin: CGPoint, size: CGSize, label: String) {
        self.image = image
        self.size = size
        self.label = label
        self.origin = origin
    }
}

We have defined a simple class referencing an image, origin, size, and label. The origin determines the top-left position where the image should be rendered, while the size determines its, well, size! To satisfy the Sketch protocol, we must implement the properties center and boundingBox, along with the methods draw and exportSketch. Let's implement each of these in turn, starting with boundingBox.

The boundingBox property is a computed property derived from the origin and size properties. Add the following code to your ImageSketch class:

var boundingBox : CGRect {
    get {
        return CGRect(origin: self.origin, size: self.size)
    }
}

Similarly, center is another computed property derived from origin and size, simply translating the origin with respect to the size. Add the following code to your ImageSketch class:

var center : CGPoint {
    get {
        let bbox = self.boundingBox
        return CGPoint(x: bbox.origin.x + bbox.size.width/2,
                       y: bbox.origin.y + bbox.size.height/2)
    }
    set {
        self.origin = CGPoint(x: newValue.x - self.size.width/2,
                              y: newValue.y - self.size.height/2)
    }
}

The draw method simply uses the passed-in context to render the assigned image within the boundingBox; append the following code to your ImageSketch class:

func draw(context: CGContext) {
    self.image.draw(in: self.boundingBox)
}

Our last method, exportSketch, is also fairly straightforward. Here, we create an instance of CIImage, passing in the image (of type UIImage).
Then, we resize it using the extension method we implemented back in Chapter 3, Recognizing Objects in the World. Add the following code to finish off the ImageSketch class:

func exportSketch(size: CGSize?) -> CIImage? {
    guard let ciImage = CIImage(image: self.image) else {
        return nil
    }

    if self.image.size.width == self.size.width
        && self.image.size.height == self.size.height {
        return ciImage
    } else {
        return ciImage.resize(size: self.size)
    }
}

We now have an implementation of Sketch that can handle rendering of rasterized images (like those returned from our search). Our final task is to swap the user's sketch with the item the user selects from the UICollectionView.

Return to the SketchViewController class by selecting SketchViewController.swift from the left-hand-side panel in Xcode to bring it up in the editor area. Once loaded, navigate to the method collectionView(_ collectionView:, didSelectItemAt:); this should look familiar to most of you. It is the delegate method for handling cells selected from a UICollectionView, and it's where we will handle swapping the user's current sketch with the selected item.

Let's start by obtaining the current sketch and the image that was selected. Add the following code to the body of the collectionView(_ collectionView:, didSelectItemAt:) method:

guard let sketch = self.sketchView.currentSketch else {
    return
}

self.queryFacade.cancel()

let image = self.queryImages[indexPath.row]

Now, with a reference to the current sketch and image, we want to keep the size of the replacement roughly the same as the user's sketch. We will do this by obtaining the sketch's bounding box and scaling the dimensions to respect the aspect ratio of the selected image.
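This aspect-ratio arithmetic is easy to check in isolation. Here is a small Python sketch of the same math (hypothetical values; it mirrors the width/height branches and the centering step used when swapping in the replacement image):

```python
def fit_to_bbox(bbox_w, bbox_h, img_w, img_h):
    """Scale (img_w, img_h) so the result matches the bounding box's
    longer side while preserving the image's aspect ratio."""
    if bbox_w > bbox_h:
        ratio = img_h / img_w
        return bbox_w, bbox_w * ratio   # match the box width
    else:
        ratio = img_w / img_h
        return bbox_h * ratio, bbox_h   # match the box height

def centered_origin(center_x, center_y, w, h):
    """Top-left origin that keeps the replacement centred on the sketch."""
    return center_x - w / 2, center_y - h / 2
```

For example, fitting a 400 x 200 image into a wide 200 x 100 bounding box keeps the box width and scales the height by the image's 2:1 aspect ratio.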
Add the following code, which handles the aspect-ratio fit (note that we first need a reference to the sketch's bounding box, obtained here via sketch.boundingBox):

let bbox = sketch.boundingBox

var origin = CGPoint(x: 0, y: 0)
var size = CGSize(width: 0, height: 0)

if bbox.size.width > bbox.size.height {
    let ratio = image.size.height / image.size.width
    size.width = bbox.size.width
    size.height = bbox.size.width * ratio
} else {
    let ratio = image.size.width / image.size.height
    size.width = bbox.size.height * ratio
    size.height = bbox.size.height
}

Next, we obtain the origin (top left of the image) by taking the center of the sketch and offsetting it by half the width and height. Do this by appending the following code:

origin.x = sketch.center.x - size.width / 2
origin.y = sketch.center.y - size.height / 2

We can now use the image, size, and origin to create an ImageSketch and replace the current sketch with it, simply by assigning it to the currentSketch property of the SketchView instance. Add the following code to do just that:

self.sketchView.currentSketch = ImageSketch(image: image,
                                            origin: origin,
                                            size: size,
                                            label: "")

Finally, some housekeeping: we clear the UICollectionView by removing all images from the queryImages array (its data source) and ask it to reload itself. Add the following block to complete the collectionView(_ collectionView:, didSelectItemAt:) method:

self.queryImages.removeAll()
self.toolBarLabel.isHidden = queryImages.count == 0
self.collectionView.reloadData()

Now is a good time to build and run to ensure that everything is working as planned. If so, you should be able to swap out your sketch with one of the suggestions presented at the top, as shown in the following screenshot:

We learned how to build intelligent interfaces using Core ML. If you've enjoyed reading this post, do check out Machine Learning with Core ML to further implement Core ML for visual-based applications using the principles of transfer learning and neural networks.
Introducing Intelligent Apps
5 examples of Artificial Intelligence in Web apps
Voice, natural language, and conversations: Are they the next web UI?
Getting started with Amazon Machine Learning workflow [Tutorial]

Melisha Dsouza
02 Sep 2018
14 min read
Amazon Machine Learning is useful for building ML models and generating predictions. It also enables the development of robust and scalable smart applications. The process of building ML models with Amazon Machine Learning consists of three operations:

- data analysis
- model training
- evaluation

The code files for this article are available on GitHub. This tutorial is an excerpt from a book written by Alexis Perrier titled Effective Amazon Machine Learning. The Amazon Machine Learning service is available at https://console.aws.amazon.com/machinelearning/.

The Amazon ML workflow closely follows a standard data science workflow, with these steps:

1. Extract the data and clean it up.
2. Make it available to the algorithm.
3. Split the data into a training and validation set, typically a 70/30 split with equal distribution of the predictors in each part.
4. Select the best model by training several models on the training dataset and comparing their performances on the validation dataset.
5. Use the best model for predictions on new data.

As shown in the following Amazon ML menu, the service is built around four objects:

- Datasource
- ML model
- Evaluation
- Prediction

The Datasource and Model can also be configured and set up in the same flow by creating a new Datasource and ML model. Let us take a closer look at each one of these steps.

Understanding the dataset used

We will use the simple Predicting Weight by Height and Age dataset (from Lewis Taylor (1967)) with 237 samples of children's age, weight, height, and gender, which is available at https://v8doc.sas.com/sashtml/stat/chap55/sect51.htm.

This dataset is composed of 237 rows. Each row has the following predictors: sex (F, M), age (in months), and height (in inches), and we are trying to predict the weight (in lbs) of these children. There are no missing values and no outliers. The variables are close enough in range, and normalization is not required. We do not need to carry out any preprocessing or cleaning on the original dataset.
Age, height, and weight are numerical variables (real-valued), and sex is a categorical variable. We will randomly select 20% of the rows as the held-out subset to use for prediction on previously unseen data, and keep the other 80% as training and evaluation data. This split can be done in Excel or any other spreadsheet editor:

1. Create a new column with randomly generated numbers.
2. Sort the spreadsheet by that column.
3. Select 190 rows for training and 47 rows for prediction (roughly an 80/20 split).

Let us name the training set LT67_training.csv and the held-out set that we will use for prediction LT67_heldout.csv, where LT67 stands for Lewis and Taylor, the creators of this dataset in 1967. As with all datasets, scripts, and resources mentioned in this book, the training and holdout files are available in the GitHub repository at https://github.com/alexperrier/packt-aml.

It is important for the distributions of age, sex, height, and weight to be similar in both subsets. We want the data on which we will make predictions to show patterns that are similar to the data on which we will train and optimize our model.

Loading the data on S3

Follow these steps to load the training and held-out datasets on S3:

1. Go to your S3 console at https://console.aws.amazon.com/s3.
2. Create a bucket if you haven't done so already. Buckets are basically folders that are uniquely named across all of S3. We created a bucket named aml.packt. Since that name has now been taken, you will have to choose another bucket name if you are following along with this demonstration.
3. Click on the bucket name you created and upload both the LT67_training.csv and LT67_heldout.csv files by selecting Upload from the Actions drop-down menu.

Both files are small, only a few KB, and hosting costs should remain negligible for this exercise.
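The spreadsheet shuffle-and-cut described above can also be scripted. A minimal Python sketch, assuming the 237 data rows have already been read from the CSV (the `row_i` strings are placeholders for real lines):

```python
import random

random.seed(42)  # fixed seed so the split is reproducible

# Stand-ins for the 237 data rows (header excluded); in practice these
# would be the lines read from the original CSV file.
rows = [f"row_{i}" for i in range(237)]

# Same recipe as in the spreadsheet: attach a random number to every
# row, sort by it, then cut after row 190.
shuffled = sorted(rows, key=lambda _: random.random())
training, heldout = shuffled[:190], shuffled[190:]

print(len(training), len(heldout))  # 190 47
```

Writing `training` and `heldout` back out as LT67_training.csv and LT67_heldout.csv then reproduces the two files used in the rest of the walkthrough.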
Note that for each file, by selecting the Properties tab on the right, you can specify how your files are accessed: which user, role, group, or AWS service may download, read, write, and delete the files, and whether or not they should be accessible from the open web. When creating the datasource in Amazon ML, you will be prompted to grant Amazon ML access to your input data. You can specify the access rules to these files now in S3, or simply grant access later on.

Our data is now in the cloud, in an S3 bucket. We need to tell Amazon ML where to find that input data by creating a datasource. We will first create the datasource for the training file LT67_training.csv.

Declaring a datasource

Go to the Amazon ML dashboard, and click on Create new... | Datasource and ML model. We will use the faster flow available by default.

As shown in the following screenshot, you are asked to specify the path to the LT67_training.csv file: {S3://bucket}{path}{file}. Note that the S3 location field automatically populates with the bucket names and file names that are available to your user.

Specifying a Datasource name is useful for organizing your Amazon ML assets. By clicking on Verify, Amazon ML will make sure that it has the proper rights to access the file. In case it needs to be granted access to the file, you will be prompted to do so, as shown in the following screenshot. Just click on Yes to grant access. At this point, Amazon ML will validate the datasource and analyze its contents.

Creating the datasource

An Amazon ML datasource is composed of the following:

- The location of the data file: the data file is not duplicated or cloned in Amazon ML but accessed from S3.
- The schema, which contains information on the types of the variables contained in the CSV file:
  - Categorical
  - Text
  - Numeric (real-valued)
  - Binary

It is possible to supply Amazon ML with your own schema or modify the one created by Amazon ML.
At this point, Amazon ML has a pretty good idea of the type of data in your training dataset. It has identified the different types of variables and knows how many rows it has. Move on to the next step by clicking on Continue, and see what schema Amazon ML has inferred from the dataset, as shown in the next screenshot.

Amazon ML needs to know at this point which variable you are trying to predict. Be sure to tell Amazon ML the following:

- The first line in the CSV file contains the column names
- The target is the weight

We see here that Amazon ML has correctly inferred the following:

- sex is categorical
- age, height, and weight are numeric (continuous real values)

Since we chose a numeric variable as the target, Amazon ML will use linear regression as the predictive model. For binary or categorical targets, we would have used logistic regression. This means that Amazon ML will try to find the best a, b, and c coefficients so that the weight predicted by the following equation is as close as possible to the observed real weight present in the data:

predicted weight = a * age + b * height + c * sex

Amazon ML will then ask you if your data contains a row identifier. In our present case, it does not. Row identifiers are useful when you want to understand the prediction obtained for each row, or add an extra column to your dataset later on in your project. Row identifiers are for reference purposes only and are not used by the service to build the model.

You will be asked to review the datasource. You can go back to each one of the previous steps and edit the parameters for the schema, the target, and the input data. Now that the data is known to Amazon ML, the next step is to set up the parameters of the algorithm that will train the model.

Understanding the model

We select the default parameters for the training and evaluation settings.
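As an aside, the regression equation above can be written as a tiny function. The coefficients below are hypothetical placeholders chosen for illustration; the real a, b, and c are learned by Amazon ML during training:

```python
# Hypothetical coefficients, for illustration only -- the real a, b,
# and c are fitted by Amazon ML, not these values.
a, b, c = 0.2, 3.1, -1.5

def predict_weight(age_months, height_inches, sex_binary):
    """Linear model of the form Amazon ML fits for a numeric target."""
    return a * age_months + b * height_inches + c * sex_binary

# e.g. a 150-month-old, 62-inch child with sex encoded as 0
print(predict_weight(150, 62, 0))
```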
Amazon ML will do the following:

- Create a recipe for data transformation based on the statistical properties it has inferred from the dataset.
- Split the dataset (LT67_training.csv) into a training part and a validation part, with a 70/30 split. The split strategy assumes the data has already been shuffled and can be split sequentially.

The recipe will be used to transform the data in a similar way for the training and the validation datasets. The only transformation suggested by Amazon ML is to turn the categorical variable sex into a binary variable, with m = 0 and f = 1, for instance. No other transformation is needed. The default advanced settings for the model are shown in the following screenshot.

We see that Amazon ML will pass over the data 10 times, shuffle splitting the data each time. It will use an L2 regularization strategy, based on the sum of the squares of the coefficients of the regression, to prevent overfitting. We will evaluate the predictive power of the model using our LT67_heldout.csv dataset later on.

Regularization comes in three levels: a mild (10^-6), medium (10^-4), or aggressive (10^-2) setting, each value stronger than the previous one. The default setting is mild, the lowest, with a regularization constant of 0.000001 (10^-6), implying that Amazon ML does not anticipate much overfitting on this dataset. This makes sense when the number of predictors, three in our case, is much smaller than the number of samples (190 for the training set).

Clicking on the Create ML model button will launch the model creation. This takes a few minutes to resolve, depending on the size and complexity of your dataset. You can check its status by refreshing the model page. In the meantime, the model status remains Pending. At that point, Amazon ML will split our training dataset into two subsets: a training set and a validation set.
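That sequential 70/30 split can be sketched with simple list slicing. A rough illustration only; the `rearrange` function and the stand-in rows are ours, not Amazon ML's API:

```python
def rearrange(rows, percent_begin, percent_end):
    """Keep the sequential slice of rows between two percentage
    bounds -- no shuffling, mirroring the default split strategy."""
    n = len(rows)
    return rows[n * percent_begin // 100 : n * percent_end // 100]

rows = list(range(190))                # stand-ins for the training rows
training = rearrange(rows, 0, 70)      # first 70% of the samples
validation = rearrange(rows, 70, 100)  # the complementary 30%

print(len(training), len(validation))  # 133 57
```

Because the split is sequential, any ordering in the file (say, all girls first) would leak into the two parts, which is why the strategy assumes the data was shuffled beforehand.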
It will use the training portion of the data to train several settings of the algorithm and select the best one based on its performance on the training data. It will then apply the associated model to the validation set and return an evaluation score for that model. By default, Amazon ML will sequentially take the first 70% of the samples for training and the remaining 30% for validation.

It's worth noting that Amazon ML will not create two extra files and store them on S3, but will instead create two new datasources out of the initial datasource we have previously defined. Each new datasource is obtained from the original one via a data rearrangement JSON recipe such as the following:

{
  "splitting": {
    "percentBegin": 0,
    "percentEnd": 70
  }
}

You can see these two new datasources in the Datasource dashboard. Three datasources are now available where there was initially only one, as shown by the following screenshot.

While the model is being trained, Amazon ML runs the Stochastic Gradient Descent algorithm several times on the training data with different parameters:

- Varying the learning rate in increments of powers of 10: 0.01, 0.1, 1, 10, and 100.
- Making several passes over the training data, shuffling the samples before each pass.
- At each pass, calculating the prediction error, the Root Mean Squared Error (RMSE), to estimate how much of an improvement over the last pass was obtained. If the decrease in RMSE is not really significant, the algorithm is considered to have converged, and no further pass shall be made.

At the end of the passes, the setting that ends up with the lowest RMSE wins, and the associated model (the weights of the regression) is selected as the best version. Once the model has finished training, Amazon ML evaluates its performance on the validation datasource. Once the evaluation itself is ready, you have access to the model's evaluation.

Evaluating the model

Amazon ML uses the standard RMSE metric for linear regression.
RMSE is defined as the square root of the mean of the squared differences between the real values and the predicted values:

RMSE = sqrt( (1/n) * Σ (ŷᵢ − yᵢ)² )

Here, ŷ denotes the predicted values, and y the real values we want to predict (the weight of the children in our case). The closer the predictions are to the real values, the lower the RMSE is. A lower RMSE means a better, more accurate prediction.

Making batch predictions

We now have a model that has been properly trained and selected among other models. We can use it to make predictions on new data. A batch prediction consists of applying a model to a datasource in order to make predictions on that datasource. We need to tell Amazon ML which model we want to apply to which data.

Batch predictions are different from streaming predictions. With batch predictions, all the data is already available as a datasource, while for streaming predictions, the data is fed to the model as it becomes available; the dataset is not available beforehand in its entirety.

In the main menu, select Batch Predictions to access the predictions dashboard and click on Create a New Prediction. The first step is to select one of the models available in your model dashboard. You should choose the one that has the lowest RMSE.

The next step is to associate a datasource with the model you just selected. We had uploaded the held-out dataset to S3 at the beginning of this chapter (under the Loading the data on S3 section) but had not used it to create a datasource. We will do so now. When asked for a datasource in the next screen, make sure to check My data is in S3, and I need to create a datasource, and then select the held-out dataset that should already be present in your S3 bucket. Don't forget to tell Amazon ML that the first line of the file contains column names.

In our current project, the held-out dataset also contains the true values for the weight of the students. This would not be the case for "real" data in a real-world project, where the real values are truly unknown.
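Since our held-out file does keep the true weights, we can compute the score ourselves. The RMSE formula from earlier, as a minimal Python sketch (the numbers below are toy values, not taken from the dataset):

```python
from math import sqrt

def rmse(predicted, actual):
    """Root Mean Squared Error: square root of the mean squared
    difference between predictions and true values."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Toy values for illustration only
print(rmse([100.0, 95.0, 110.0], [98.0, 97.0, 105.0]))
```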
However, in our case, this will allow us to calculate the RMSE score of our predictions and assess their quality. The final step is to click on the Verify button and wait for a few minutes. Amazon ML will run the model on the new datasource and will generate predictions in the form of a CSV file. Contrary to the evaluation and model-building phase, we now have real predictions, and we are no longer given a score associated with these predictions.

After a few minutes, you will notice a new batch-prediction folder in your S3 bucket. This folder contains a manifest file and a results folder. The manifest file is a JSON file with the paths to the initial datasource and to the results file. The results folder contains a gzipped CSV file. Uncompressed, the CSV file contains two columns: trueLabel, the initial target from the held-out set, and score, which corresponds to the predicted values.

We can easily calculate the RMSE for those results directly in the spreadsheet through the following steps:

1. Create a new column that holds the square of the difference between the two columns.
2. Average that column over all the rows.
3. Take the square root of the result.

The following illustration shows how we create a third column, C, as the squared difference between the trueLabel column A and the score (or predicted value) column B. As shown in the following screenshot, averaging column C and taking the square root gives an RMSE of 11.96, which is significantly better than the RMSE we obtained during the evaluation phase (14.4).

The fact that the RMSE on the held-out set is better than the RMSE on the validation set means that our model did not overfit the training data, since it performed even better on new data than expected. Our model is robust.

The left side of the following graph shows the true (triangle) and predicted (circle) weight values for all the samples in the held-out set. The right side shows the histogram of the residuals.
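The three spreadsheet steps map directly onto a few lines of Python applied to the trueLabel/score columns. The rows below are toy stand-ins for the real results file, which has the same two columns but different values:

```python
import csv
import io
from math import sqrt

# Toy stand-in for the unzipped batch-prediction results file.
results_csv = """trueLabel,score
98.5,101.2
87.0,90.4
120.0,115.3
"""

reader = csv.DictReader(io.StringIO(results_csv))
# Step 1: squared difference per row; steps 2 and 3: average, then sqrt
sq_diffs = [(float(r["trueLabel"]) - float(r["score"])) ** 2 for r in reader]
rmse = sqrt(sum(sq_diffs) / len(sq_diffs))
print(round(rmse, 2))
```

Pointing the same snippet at the real downloaded file (after gunzipping it) should reproduce the 11.96 figure from the spreadsheet.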
Similar to the histogram of residuals we observed on the validation set, we see that the residuals are not centered on 0: our model has a tendency to overestimate the weight of the students.

In this tutorial, we have successfully loaded the data on S3 and let Amazon ML infer the schema and transform the data. We also created a model and evaluated its performance. Finally, we made a prediction on the held-out dataset. To understand how to leverage Amazon's powerful platform for your predictive analytics needs, check out the book Effective Amazon Machine Learning.
Sugandha Lahoti
01 Sep 2018
4 min read

Why use JavaScript for machine learning?

Python has always been and remains the language of choice for machine learning, in part due to the maturity of the language, in part due to the maturity of the ecosystem, and in part due to the positive feedback loop of early ML efforts in Python. Recent developments in the JavaScript world, however, are making JavaScript more attractive to ML projects. I think we will see a major ML renaissance in JavaScript within a few years, especially as laptops and mobile devices become ever more powerful and JavaScript itself surges in popularity.

This post is extracted from the book Hands-on Machine Learning with JavaScript by Burak Kanber. The book is a definitive guide to creating intelligent web applications with the best of machine learning and JavaScript.

Advantages and challenges of JavaScript

JavaScript, like any other tool, has its advantages and disadvantages. Much of the historical criticism of JavaScript has focused on a few common themes: strange behavior in type coercion, the prototypical object-oriented model, difficulty organizing large codebases, and managing deeply nested asynchronous function calls with what many developers call callback hell. Fortunately, most of these historic gripes have been resolved by the introduction of ES6, that is, ECMAScript 2015, a recent update to the JavaScript syntax.

Con: Immature ecosystem for machine learning development

Despite the recent language improvements, most developers would still advise against using JavaScript for ML for one reason: the ecosystem. The Python ecosystem for ML is so mature and rich that it's difficult to justify choosing any other ecosystem. But this logic is self-fulfilling and self-defeating; we need brave individuals to take the leap and work on real ML problems if we want JavaScript's ecosystem to mature. Fortunately, JavaScript has been the most popular programming language on GitHub for a few years running, and is growing in popularity by almost every metric.
Pro #1: JavaScript is the most popular web development language, with a maturing npm ecosystem

There are some advantages to using JavaScript for ML. Its popularity is one; while ML in JavaScript is not very popular at the moment, the language itself is. As demand for ML applications rises, and as hardware becomes faster and cheaper, it's only natural for ML to become more prevalent in the JavaScript world. There are tons of resources available for learning JavaScript in general, maintaining Node.js servers, and deploying JavaScript applications. The Node Package Manager (npm) ecosystem is also large and still growing, and while there aren't many very mature ML packages available, there are a number of well-built, useful tools out there that will come to maturity soon.

Pro #2: JavaScript is now a general-purpose, cross-platform programming language

Another advantage of using JavaScript is the universality of the language. The modern web browser is essentially a portable application platform that allows you to run your code, basically without modification, on nearly any device. Tools like Electron (while considered by many to be bloated) allow developers to quickly develop and deploy downloadable desktop applications to any operating system. Node.js lets you run your code in a server environment. React Native brings your JavaScript code to the native mobile application environment, and may eventually allow you to develop desktop applications as well. JavaScript is no longer confined to just dynamic web interactions; it's now a general-purpose, cross-platform programming language.

Pro #3: JavaScript makes machine learning accessible to web and frontend developers

Finally, using JavaScript makes ML accessible to web and frontend developers, a group that has historically been left out of the ML discussion. Server-side applications are typically preferred for ML tools, since the servers are where the computing power is.
That fact has historically made it difficult for web developers to get into the ML game, but as hardware improves, even complex ML models can be run on the client, whether it's the desktop or the mobile browser. If web developers, frontend developers, and JavaScript developers all start learning about ML today, that same community will be in a position to improve the ML tools available to us all tomorrow. If we take these technologies and democratize them, exposing as many people as possible to the concepts behind ML, we will ultimately elevate the community and seed the next generation of ML researchers.

Summary

In this article, we've discussed the important moments of JavaScript's history as applied to ML. We've covered some advantages of using JavaScript for machine learning, and also some of the challenges we're facing, particularly in terms of the machine learning ecosystem. To begin exploring and processing the data itself, read the book Hands-on Machine Learning with JavaScript.