
Emotional AI: Detecting facial expressions and emotions using CoreML [Tutorial]

Savia Lobo
14 Sep 2018
11 min read
Computers are becoming more ubiquitous, more capable, and more ingrained in our daily lives, and they increasingly allow natural forms of interaction. They are becoming less like heartless, dumb tools and more like friends, able to entertain us, look out for us, and assist us with our work. This article is an excerpt from the book Machine Learning with Core ML, authored by Joshua Newnham.

With this shift comes a need for computers to understand our emotional state. For example, you don't want your social robot cracking a joke just after you arrive home having lost your job (to an AI bot!). This is the domain of affective computing (also referred to as artificial emotional intelligence or emotional AI), a field of computer science that studies systems able to recognize, interpret, process, and simulate human emotions. The first stage is being able to recognize the emotional state. In this article, we will create a model that can detect facial expressions and emotions using Core ML.

Input data and preprocessing

We will implement the preprocessing functionality required to transform images into something the model expects. We will build up this functionality in a playground project before migrating it across to our main project in the next section. If you haven't done so already, pull down the latest code from the accompanying repository: https://github.com/packtpublishing/machine-learning-with-core-ml. Once downloaded, navigate to the directory Chapter4/Start/ and open the playground project ExploringExpressionRecognition.playground. Once loaded, you will see the playground for this extract, as shown in the following screenshot.

Before starting, to avoid looking at images of me, please replace the test images with either personal photos of your own or royalty-free images from the internet, ideally a set expressing a range of emotions.
Along with the test images, this playground includes a compiled Core ML model (introduced in the previous image) with its generated set of wrappers for inputs, outputs, and the model itself. Also included are some extensions for UIImage, UIImageView, CGImagePropertyOrientation, and an empty CIImage extension, to which we will return later in this extract. The others provide utility functions to help us visualize the images as we work through this playground.

When developing machine learning applications, you have two broad paths. The first, which is becoming increasingly popular, is to use an end-to-end machine learning model capable of being fed the raw input and producing adequate results. One field that has had great success with end-to-end models is speech recognition. Prior to end-to-end deep learning, speech recognition systems were made up of many smaller modules, each one focusing on extracting specific pieces of data to feed into the next module, and each typically manually engineered. Modern speech recognition systems use end-to-end models that take the raw input and output the result. Both of the described approaches can be seen in the following diagram.

Obviously, this approach is not constrained to speech recognition, and we have seen it applied to image recognition tasks, too, along with many others. But there are two things that make our particular case different. The first is that we can simplify the problem by first extracting the face; this means our model has fewer features to learn, and we get a smaller, more specialized model that we can tune. The second, which is no doubt obvious, is that our training data consisted of only faces, not natural images.
So, we have no choice but to run our data through two models: the first to extract faces, and the second to perform expression recognition on the extracted faces, as shown in this diagram.

Luckily for us, Apple has mostly taken care of the first task, detecting faces, through the Vision framework it released with iOS 11. The Vision framework provides performant image analysis and computer vision tools, exposing them through a simple API. This allows for face detection, feature detection and tracking, and classification of scenes in images and video. The latter task, expression recognition, is something we will take care of using the Core ML model introduced earlier.

Prior to the introduction of the Vision framework, face detection would typically be performed using a Core Image filter. Going back further, you had to use something like OpenCV. You can learn more about Core Image here: https://developer.apple.com/library/content/documentation/GraphicsImaging/Conceptual/CoreImaging/ci_detect_faces/ci_detect_faces.html.

Now that we have a bird's-eye view of the work to be done, let's turn our attention to the editor and start putting all of this together. Start by loading the images; add the following snippet to your playground:

```swift
var images = [UIImage]()
for i in 1...3 {
    guard let image = UIImage(named: "images/joshua_newnham_\(i).jpg") else {
        fatalError("Failed to extract features")
    }
    images.append(image)
}

let faceIdx = 0
let imageView = UIImageView(image: images[faceIdx])
imageView.contentMode = .scaleAspectFit
```

In the preceding snippet, we simply load each of the images included in our resources' Images folder and add them to an array we can access conveniently throughout the playground. Once all the images are loaded, we set the constant faceIdx, which ensures that we access the same image throughout our experiments. Finally, we create a UIImageView to easily preview it.
Once it has finished running, click on the eye icon in the right-hand panel to preview the loaded image, as shown in the following screenshot.

Next, we will take advantage of the functionality available in the Vision framework to detect faces. The typical flow when working with the Vision framework is to define a request, which determines what analysis you want to perform, and a handler, which is responsible for executing the request and providing a means of obtaining the results (either through delegation or by being queried explicitly). The result of the analysis is a collection of observations that you need to cast to the appropriate observation type; concrete examples of each of these can be seen here.

As illustrated in the preceding diagram, the request determines what type of image analysis will be performed; the handler, using one or more requests and an image, performs the actual analysis and generates the results (also known as observations). These are accessible via a property, or via a delegate if one has been assigned. The type of observation depends on the request performed. It's worth highlighting that the Vision framework is tightly integrated with Core ML and provides another layer of abstraction and uniformity between you, the data, and the process. For example, using a classification Core ML model would return an observation of type VNClassificationObservation. This layer of abstraction not only simplifies things but also provides a consistent way of working with machine learning models.

In the previous figure, we showed a request handler specifically for static images. Vision also provides a specialized request handler for sequences of images, which is more appropriate when dealing with requests such as tracking. The following diagram illustrates some concrete examples of the types of requests and observations applicable to this use case. So, when do you use VNImageRequestHandler, and when VNSequenceRequestHandler?
Though the names provide clues as to when one should be used over the other, it's worth outlining some differences. The image request handler is for interactive exploration of an image; it holds a reference to the image for its life cycle and allows optimization across various request types. The sequence request handler is more appropriate for tasks such as tracking and does not optimize for multiple requests on an image.

Let's see how this all looks in code; add the following snippet to your playground:

```swift
let faceDetectionRequest = VNDetectFaceRectanglesRequest()
let faceDetectionRequestHandler = VNSequenceRequestHandler()
```

Here, we are simply creating the request and the handler; as discussed previously, the request encapsulates the type of image analysis, while the handler is responsible for executing the request. Next, we will get faceDetectionRequestHandler to run faceDetectionRequest; add the following code:

```swift
try? faceDetectionRequestHandler.perform(
    [faceDetectionRequest],
    on: images[faceIdx].cgImage!,
    orientation: CGImagePropertyOrientation(images[faceIdx].imageOrientation))
```

The perform function of the handler can throw an error if it fails; for this reason, we wrap the call with try? at the beginning of the statement, and we can interrogate the error property of the handler to identify the reason for failure. We pass the handler a list of requests (in this case, only our faceDetectionRequest), the image we want to perform the analysis on, and, finally, the orientation of the image, which can be used by the request during analysis. Once the analysis is done, we can inspect the observations through the results property of the request itself, as shown in the following code:

```swift
if let faceDetectionResults = faceDetectionRequest.results as? [VNFaceObservation] {
    for face in faceDetectionResults {
        // ADD THE NEXT SNIPPET OF CODE HERE
    }
}
```

The type of observation depends on the analysis; in this case, we're expecting a VNFaceObservation.
Hence, we cast it to the appropriate type and then iterate through all the observations. Next, we take each recognized face, extract its bounding box, and draw it on the image (using an extension method of UIImageView found within the UIImageViewExtension.swift file). Add the following block within the for loop shown in the preceding code:

```swift
if let currentImage = imageView.image {
    let bbox = face.boundingBox

    let imageSize = CGSize(
        width: currentImage.size.width,
        height: currentImage.size.height)

    let w = bbox.width * imageSize.width
    let h = bbox.height * imageSize.height
    let x = bbox.origin.x * imageSize.width
    let y = bbox.origin.y * imageSize.height

    let faceRect = CGRect(
        x: x, y: y, width: w, height: h)

    let invertedY = imageSize.height - (faceRect.origin.y + faceRect.height)
    let invertedFaceRect = CGRect(
        x: x, y: invertedY, width: w, height: h)

    imageView.drawRect(rect: invertedFaceRect)
}
```

We obtain the bounding box of each face via the boundingBox property; the result is normalized, so we need to scale it based on the dimensions of the image. For example, you can obtain the width by multiplying the bounding box's width by the width of the image: bbox.width * imageSize.width. Next, we invert the y axis, because the coordinate system of Quartz 2D is inverted with respect to UIKit's coordinate system, as shown in this diagram. We invert our coordinates by subtracting the bounding box's origin and height from the height of the image, and then pass the result to our UIImageView to render the rectangle.
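The denormalization and y-axis inversion performed here can be summarized compactly. With a normalized bounding box $(b_x, b_y, b_w, b_h)$ and an image of size $W \times H$, the scaled rectangle and its inverted origin are:

```latex
\begin{aligned}
x &= b_x W, & y &= b_y H, & w &= b_w W, & h &= b_h H \\
y_{\text{inverted}} &= H - (y + h)
\end{aligned}
```

This mirrors the code above: invertedY is the image height minus the rectangle's origin plus its height.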
Click on the eye icon in the right-hand panel, in line with the statement imageView.drawRect(rect: invertedFaceRect), to preview the results; if successful, you should see something like the following.

An alternative to inverting the face rectangle manually is to use a CGAffineTransform, such as:

```swift
var transform = CGAffineTransform(scaleX: 1, y: -1)
transform = transform.translatedBy(x: 0, y: -imageSize.height)
let invertedFaceRect = faceRect.applying(transform)
```

This approach requires less code and therefore leaves fewer chances for error, so it is the recommended approach; the longer version was shown to help illuminate the details.

As a designer and builder of intelligent systems, it is your task to interpret these results and present them to the user. Some questions you'll want to ask yourself are as follows:

  • What is an acceptable probability threshold before setting a class as true?
  • Can this threshold depend on the probabilities of other classes, to remove ambiguity? That is, if Sad and Happy both have a probability of 0.3, you can infer that the prediction is inaccurate, or at least not useful.
  • Is there a way to accept multiple probabilities?
  • Is it useful to expose the threshold to the user and let them set and/or tune it manually?

These are only a few of the questions you should ask; the specific questions and their answers will depend on your use case and users. At this point, we have everything we need to preprocess images and perform inference. We also briefly explored some use cases showing how emotion recognition could be applied. For a detailed overview of this experiment, check out the book Machine Learning with Core ML, which further implements Core ML for visual-based applications using the principles of transfer learning and neural networks.

Related reading:

  • Amazon Rekognition can now 'recognize' faces in a crowd in real time
  • 5 cool ways Transfer Learning is being used today
  • My friend, the robot: Artificial Intelligence needs Emotional Intelligence

AWS machine learning: Learning AWS CLI to execute a simple Amazon ML workflow [Tutorial]

Melisha Dsouza
13 Sep 2018
15 min read
Using the AWS web interface to manage and run your projects is time-consuming. We will, therefore, start running our projects from the command line with the AWS Command Line Interface (AWS CLI). With just one tool to download and configure, multiple AWS services can be controlled from the command line and automated through scripts.

The code files for this article are available on GitHub. This article is an excerpt from a book written by Alexis Perrier titled Effective Amazon Machine Learning.

Getting started and setting up

Creating a performant predictive model from raw data requires many trials and errors and much back and forth: creating new features, cleaning up data, and trying out new parameters for the model are all needed to ensure the robustness of the model. There is a constant back and forth between the data, the models, and the evaluations. Scripting this workflow via the AWS CLI will give us the ability to speed up the create, test, select loop.

Installing AWS CLI

In order to set up your CLI credentials, you need your access key ID and your secret access key. You can create them from the IAM console (https://console.aws.amazon.com/iam): navigate to Users, select your IAM user name, and click on the Security credentials tab. Choose Create Access Key and download the CSV file. Store the keys in a secure location; we will need them in a few minutes to set up the AWS CLI.

But first, we need to install the AWS CLI. Docker environment: this tutorial will help you use the AWS CLI within a Docker container: https://blog.flowlog-stats.com/2016/05/03/aws-cli-in-a-docker-container/. A Docker image for running the AWS CLI is available at https://hub.docker.com/r/fstab/aws-cli/. There is no need to rewrite the AWS documentation on how to install the AWS CLI; it is complete, up to date, and available at http://docs.aws.amazon.com/cli/latest/userguide/installing.html.
In a nutshell, installing the CLI requires you to have Python and pip already installed. Then, run the following:

```shell
$ pip install --upgrade --user awscli
```

Add AWS to your $PATH:

```shell
$ export PATH=~/.local/bin:$PATH
```

Reload the bash configuration file (this is for OSX):

```shell
$ source ~/.bash_profile
```

Check that everything works with the following command:

```shell
$ aws --version
```

You should see something similar to the following output:

```
aws-cli/1.11.47 Python/3.5.2 Darwin/15.6.0 botocore/1.5.10
```

Once installed, we need to configure the AWS CLI; type:

```shell
$ aws configure
```

Now input the access keys you just created:

```
AWS Access Key ID [None]: ABCDEF_THISISANEXAMPLE
AWS Secret Access Key [None]: abcdefghijk_THISISANEXAMPLE
Default region name [None]: us-west-2
Default output format [None]: json
```

Choose the region that is closest to you and the format you prefer (JSON, text, or table); JSON is the default format. The aws configure command creates two files: a config file and a credentials file. On OSX, the files are ~/.aws/config and ~/.aws/credentials. You can edit these files directly to change your access or configuration. You will need to create different profiles if you need to access multiple AWS accounts.
You can do so via the aws configure command:

```shell
$ aws configure --profile user2
```

You can also do so directly in the config and credentials files. In ~/.aws/config:

```
[default]
output = json
region = us-east-1

[profile user2]
output = text
region = us-west-2
```

And in ~/.aws/credentials:

```
[default]
aws_access_key_id = ABCDEF_THISISANEXAMPLE
aws_secret_access_key = abcdefghijk_THISISANEXAMPLE

[user2]
aws_access_key_id = ABCDEF_ANOTHERKEY
aws_secret_access_key = abcdefghijk_ANOTHERKEY
```

Refer to the AWS CLI setup page for more in-depth information: http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html

Picking up CLI syntax

The overall format of any AWS CLI command is as follows:

```shell
$ aws <service> [options] <command> <subcommand> [parameters]
```

Here the terms are as follows:

  • <service>: the name of the service you are managing: S3, machine learning, EC2, and so on
  • [options]: allows you to set the region, the profile, and the output of the command
  • <command> <subcommand>: the actual command you want to execute
  • [parameters]: the parameters for these commands

A simple example will help you understand the syntax better. To list the contents of an S3 bucket named aml.packt, the command is as follows:

```shell
$ aws s3 ls aml.packt
```

Here, s3 is the service, ls is the command, and aml.packt is the parameter. The aws help command will output a list of all available services. There are many more examples and explanations in the AWS documentation available at http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-using.html.

Passing parameters using JSON files

For some services and commands, the list of parameters can become long and difficult to check and maintain. For instance, in order to create an Amazon ML model via the CLI, you need to specify at least seven different elements: the model ID, name, and type, the model's parameters, the ID of the training datasource, and the recipe name and URI (see aws machinelearning create-ml-model help).
When possible, we will use the CLI's ability to read parameters from a JSON file instead of specifying them on the command line. The AWS CLI also offers a way to generate a JSON template, which you can then fill in with the right parameters. To generate that JSON parameter file model (the JSON skeleton), simply add --generate-cli-skeleton after the command name. For instance, to generate the JSON skeleton for the create model command of the machine learning service, write the following:

```shell
$ aws machinelearning create-ml-model --generate-cli-skeleton
```

This will give the following output:

```json
{
    "MLModelId": "",
    "MLModelName": "",
    "MLModelType": "",
    "Parameters": {
        "KeyName": ""
    },
    "TrainingDataSourceId": "",
    "Recipe": "",
    "RecipeUri": ""
}
```

You can then configure this to your liking. To have the skeleton command write a JSON file rather than simply print the skeleton to the terminal, add > filename.json:

```shell
$ aws machinelearning create-ml-model --generate-cli-skeleton > filename.json
```

This will create a filename.json file with the JSON template. Once all the required parameters are specified, you create the model with the following command (assuming filename.json is in the current folder):

```shell
$ aws machinelearning create-ml-model --cli-input-json file://filename.json
```

Before we dive further into the machine learning workflow via the CLI, we need to introduce the dataset we will be using in this chapter.

Introducing the Ames Housing dataset

We will use the Ames Housing dataset, compiled by Dean De Cock for use in data science education. It is a great alternative to the popular but older Boston Housing dataset. The Ames Housing dataset is used in the Advanced Regression Techniques challenge on the Kaggle website: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/. The original version of the dataset is available at http://www.amstat.org/publications/jse/v19n3/decock/AmesHousing.xls and in the GitHub repository for this chapter.
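One practical note on the JSON parameter files described above: a malformed file fails only when the CLI call is actually made, so it can save time to validate the file locally first. A minimal sketch, assuming a hypothetical parameter file (the file name and its contents below are illustrative, not from the tutorial):

```shell
# Write an illustrative parameter file (values are placeholders).
cat > filename.json <<'EOF'
{
    "MLModelId": "example_model_001",
    "MLModelName": "[MDL] Example 001",
    "MLModelType": "REGRESSION",
    "Parameters": { "sgd.maxPasses": "100" },
    "TrainingDataSourceId": "example_ds_001",
    "RecipeUri": "s3://example-bucket/recipe.json"
}
EOF

# python -m json.tool exits non-zero on malformed JSON, which catches
# quoting mistakes before the CLI call does.
python3 -m json.tool filename.json > /dev/null && echo "filename.json is valid JSON"
```

The same check applies to any of the datasource, model, or evaluation parameter files used later in this tutorial.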
For more information on the genesis of this dataset and an in-depth explanation of the different variables, read the paper by Dean De Cock, available as a PDF at https://ww2.amstat.org/publications/jse/v19n3/decock.pdf.

We will start by splitting the dataset into a train and a validate set and building a model on the train set. Both train and validate sets are available in the GitHub repository as ames_housing_training.csv and ames_housing_validate.csv; the entire dataset is in the ames_housing.csv file.

Splitting the dataset with shell commands

We will use shell commands to shuffle, split, and create training and validation subsets of the Ames Housing dataset.

First, extract the header line into a separate file, ames_housing_header.csv:

```shell
$ head -n 1 ames_housing.csv > ames_housing_header.csv
```

Then tail all the lines after the first one into a headerless file:

```shell
$ tail -n +2 ames_housing.csv > ames_housing_nohead.csv
```

Then randomly sort the rows in place. (gshuf is the OSX equivalent of the Linux shuf shell command; it can be installed via brew install coreutils):

```shell
$ gshuf ames_housing_nohead.csv -o ames_housing_nohead.csv
```

Extract the first 2,050 rows as the training file and the last 880 rows as the validation file:

```shell
$ head -n 2050 ames_housing_nohead.csv > ames_housing_training.csv
$ tail -n 880 ames_housing_nohead.csv > ames_housing_validate.csv
```

Finally, add the header back into both the training and validation files:

```shell
$ cat ames_housing_header.csv ames_housing_training.csv > tmp.csv
$ mv tmp.csv ames_housing_training.csv
$ cat ames_housing_header.csv ames_housing_validate.csv > tmp.csv
$ mv tmp.csv ames_housing_validate.csv
```

A simple project using AWS CLI

We are now ready to execute a simple Amazon ML workflow using the CLI.
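It is worth sanity-checking the split before uploading anything. The sketch below reproduces the same head/tail recipe on a small synthetic CSV (100 rows split 70/30; the data and row counts are made up for illustration, standing in for the real 2,050/880 split) and then verifies the line counts:

```shell
# Build a synthetic stand-in for ames_housing.csv: a header plus 100 data rows.
printf 'Id,SalePrice\n' > ames_housing_header.csv
seq 1 100 | sed 's/.*/&,100000/' > ames_housing_nohead.csv

# Split 70/30, mirroring the tutorial's head/tail approach.
head -n 70 ames_housing_nohead.csv > ames_housing_training.csv
tail -n 30 ames_housing_nohead.csv > ames_housing_validate.csv

# Re-attach the header to both subsets.
cat ames_housing_header.csv ames_housing_training.csv > tmp.csv
mv tmp.csv ames_housing_training.csv
cat ames_housing_header.csv ames_housing_validate.csv > tmp.csv
mv tmp.csv ames_housing_validate.csv

# Sanity check: each subset should contain its rows plus one header line,
# and together the data rows should cover the original file exactly once.
wc -l ames_housing_training.csv ames_housing_validate.csv
```

On the real dataset, the same check should report 2,051 and 881 lines (data rows plus header).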
This includes the following:

  • Uploading files to S3
  • Creating a datasource and the recipe
  • Creating a model
  • Creating an evaluation
  • Prediction, batch and real time

Let's start by uploading the training and validation files to S3. In the following lines, replace the bucket name aml.packt with your own bucket name. To upload the files to the S3 location s3://aml.packt/data/ch8/, run the following command lines:

```shell
$ aws s3 cp ./ames_housing_training.csv s3://aml.packt/data/ch8/
upload: ./ames_housing_training.csv to s3://aml.packt/data/ch8/ames_housing_training.csv

$ aws s3 cp ./ames_housing_validate.csv s3://aml.packt/data/ch8/
upload: ./ames_housing_validate.csv to s3://aml.packt/data/ch8/ames_housing_validate.csv
```

An overview of Amazon ML CLI commands

That's it for the S3 part. Now let's explore the CLI for Amazon's machine learning service. All Amazon ML CLI commands are listed at http://docs.aws.amazon.com/cli/latest/reference/machinelearning/. There are 30 commands, which can be grouped by object and action. You can perform the following actions:

  • create: creates the object
  • describe: searches objects given some parameters (location, dates, names, and so on)
  • get: given an object ID, returns information
  • update: given an object ID, updates the object
  • delete: deletes an object

These can be performed on the following elements:

  • datasource: create-data-source-from-rds, create-data-source-from-redshift, create-data-source-from-s3, describe-data-sources, delete-data-source, get-data-source, update-data-source
  • ml-model: create-ml-model, describe-ml-models, get-ml-model, delete-ml-model, update-ml-model
  • evaluation: create-evaluation, describe-evaluations, get-evaluation, delete-evaluation, update-evaluation
  • batch prediction: create-batch-prediction, describe-batch-predictions, get-batch-prediction, delete-batch-prediction, update-batch-prediction
  • real-time endpoint: create-realtime-endpoint, delete-realtime-endpoint, predict

You can also handle tags and set waiting times.
Note that the AWS CLI gives you the ability to create datasources from S3, Redshift, and RDS, while the web interface only allows datasources from S3 and Redshift.

Creating the datasource

We will start by creating the datasource. Let's first see what parameters are needed by generating the skeleton:

```shell
$ aws machinelearning create-data-source-from-s3 --generate-cli-skeleton
```

This generates the following JSON object:

```json
{
    "DataSourceId": "",
    "DataSourceName": "",
    "DataSpec": {
        "DataLocationS3": "",
        "DataRearrangement": "",
        "DataSchema": "",
        "DataSchemaLocationS3": ""
    },
    "ComputeStatistics": true
}
```

The different parameters are mostly self-explanatory, and further information can be found in the AWS documentation at http://docs.aws.amazon.com/cli/latest/reference/machinelearning/create-data-source-from-s3.html.

A word on the schema: when creating a datasource from the web interface, you can use a wizard that guides you through the creation of the schema. The wizard facilitates the process by guessing the type of the variables, thus making available a default schema that you can modify. There is no default schema available via the AWS CLI; you have to define the entire schema yourself, either as JSON in the DataSchema field or by uploading a schema file to S3 and specifying its location in the DataSchemaLocationS3 field. Since our dataset has many variables (79), we cheated and used the wizard to create a default schema, which we uploaded to S3. Throughout the rest of the chapter, we will specify the schema location, not its JSON definition.
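For readers who prefer to write the schema by hand rather than lift it from the wizard, the sketch below shows the general shape of an Amazon ML schema file, as we recall the format; it lists only three of the 79 Ames attributes for brevity, so treat the field names and the attribute list as illustrative and verify them against a wizard-generated schema:

```shell
# A minimal, illustrative Amazon ML schema file. Attribute types are
# typically NUMERIC, CATEGORICAL, TEXT, or BINARY.
cat > ames_housing.csv.schema <<'EOF'
{
    "version": "1.0",
    "targetAttributeName": "SalePrice",
    "dataFormat": "CSV",
    "dataFileContainsHeader": true,
    "attributes": [
        { "attributeName": "LotArea", "attributeType": "NUMERIC" },
        { "attributeName": "Neighborhood", "attributeType": "CATEGORICAL" },
        { "attributeName": "SalePrice", "attributeType": "NUMERIC" }
    ]
}
EOF

# Confirm the file is well-formed JSON before uploading it to S3.
python3 -m json.tool ames_housing.csv.schema > /dev/null && echo "schema parses"
```

In practice you would then upload this file to the S3 location referenced by DataSchemaLocationS3.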
In this example, we will create the following datasource parameter file, dsrc_ames_housing_001.json:

```json
{
    "DataSourceId": "ch8_ames_housing_001",
    "DataSourceName": "[DS] Ames Housing 001",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_training.csv",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

For the validation subset (save to dsrc_ames_housing_002.json):

```json
{
    "DataSourceId": "ch8_ames_housing_002",
    "DataSourceName": "[DS] Ames Housing 002",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_validate.csv",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

Since we have already split our data into a training and a validation set, there's no need to specify the DataRearrangement field. Alternatively, we could have avoided splitting our dataset and specified the following DataRearrangement on the original dataset, assuming it had already been shuffled (save to dsrc_ames_housing_003.json):

```json
{
    "DataSourceId": "ch8_ames_housing_003",
    "DataSourceName": "[DS] Ames Housing training 003",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_shuffled.csv",
        "DataRearrangement": "{\"splitting\":{\"percentBegin\":0,\"percentEnd\":70}}",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

For the validation set (save to dsrc_ames_housing_004.json):

```json
{
    "DataSourceId": "ch8_ames_housing_004",
    "DataSourceName": "[DS] Ames Housing validation 004",
    "DataSpec": {
        "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_shuffled.csv",
        "DataRearrangement": "{\"splitting\":{\"percentBegin\":70,\"percentEnd\":100}}",
        "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema"
    },
    "ComputeStatistics": true
}
```

Here, the ames_housing.csv file has previously been shuffled using the gshuf command line and uploaded to S3:

```shell
$ gshuf ames_housing_nohead.csv -o ames_housing_nohead.csv
$ cat ames_housing_header.csv ames_housing_nohead.csv > tmp.csv
$ mv tmp.csv ames_housing_shuffled.csv
$ aws s3 cp ./ames_housing_shuffled.csv s3://aml.packt/data/ch8/
```

Note that we don't need to create all four datasources; these are just examples of alternative ways to create them. We then create the first datasource by running the following:

```shell
$ aws machinelearning create-data-source-from-s3 --cli-input-json file://dsrc_ames_housing_001.json
```

In return, we get the datasource ID we had specified:

```json
{
    "DataSourceId": "ch8_ames_housing_001"
}
```

We can then obtain information on that datasource, and check whether its creation is still pending or has completed, with the following:

```shell
$ aws machinelearning get-data-source --data-source-id ch8_ames_housing_001
```

This returns the following:

```json
{
    "Status": "COMPLETED",
    "NumberOfFiles": 1,
    "CreatedByIamUser": "arn:aws:iam::178277xxxxxxx:user/alexperrier",
    "LastUpdatedAt": 1486834110.483,
    "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_training.csv",
    "ComputeStatistics": true,
    "StartedAt": 1486833867.707,
    "LogUri": "https://eml-prod-emr.s3.amazonaws.com/178277513911-ds-ch8_ames_housing_001/.....",
    "DataSourceId": "ch8_ames_housing_001",
    "CreatedAt": 1486030865.965,
    "ComputeTime": 880000,
    "DataSizeInBytes": 648150,
    "FinishedAt": 1486834110.483,
    "Name": "[DS] Ames Housing 001"
}
```

Note that we have access to the operation's log URI, which could be useful for analyzing the model training later on.
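A common stumbling block in the datasource files above is that DataRearrangement is JSON embedded inside a JSON string, so its inner quotes must be escaped. The snippet below round-trips the nested value to confirm the escaping is correct (the file name dsrc_check.json is illustrative):

```shell
# The value of DataRearrangement is itself a JSON document stored as a string;
# its inner quotes must be escaped with backslashes.
cat > dsrc_check.json <<'EOF'
{
    "DataRearrangement": "{\"splitting\":{\"percentBegin\":0,\"percentEnd\":70}}"
}
EOF

# Parse the outer document, then parse the embedded string; this succeeds
# only if the escaping is right. Prints 70.
python3 - <<'EOF'
import json
outer = json.load(open("dsrc_check.json"))
inner = json.loads(outer["DataRearrangement"])
print(inner["splitting"]["percentEnd"])
EOF
```

If the inner quotes are left unescaped, the outer file is invalid JSON and the CLI call fails.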
Creating the model

Creating the model with the create-ml-model command follows the same steps.

Generate the skeleton:

```shell
$ aws machinelearning create-ml-model --generate-cli-skeleton > mdl_ames_housing_001.json
```

Write the configuration file:

```json
{
    "MLModelId": "ch8_ames_housing_001",
    "MLModelName": "[MDL] Ames Housing 001",
    "MLModelType": "REGRESSION",
    "Parameters": {
        "sgd.shuffleType": "auto",
        "sgd.l2RegularizationAmount": "1.0E-06",
        "sgd.maxPasses": "100"
    },
    "TrainingDataSourceId": "ch8_ames_housing_001",
    "RecipeUri": "s3://aml.packt/data/ch8/recipe_ames_housing_001.json"
}
```

Note the parameters of the algorithm: here, we used mild L2 regularization and 100 passes.

Launch the model creation:

```shell
$ aws machinelearning create-ml-model --cli-input-json file://mdl_ames_housing_001.json
```

The model ID is returned:

```json
{
    "MLModelId": "ch8_ames_housing_001"
}
```

The get-ml-model command gives you a status update on the operation as well as the URL of the log:

```shell
$ aws machinelearning get-ml-model --ml-model-id ch8_ames_housing_001
```

The watch command allows you to repeat a shell command every n seconds. To get the status of the model creation every 10 seconds, just write the following:

```shell
$ watch -n 10 aws machinelearning get-ml-model --ml-model-id ch8_ames_housing_001
```

The output of get-ml-model will be refreshed every 10 seconds until you kill the command.

It is not possible to create the default recipe via the AWS CLI commands. You can always define a blank recipe that carries out no transformation on the data; however, the default recipe has been shown to positively impact model performance. To obtain this default recipe, we created it via the web interface and copied it into a file that we uploaded to S3. The resulting file, recipe_ames_housing_001.json, is available in our GitHub repository. Its content is quite long, as the dataset has 79 variables, and is not reproduced here for brevity.
Evaluating our model with create-evaluation

Our model is now trained, and we would like to evaluate it on the validation subset. For that, we will use the create-evaluation CLI command.

Generate the skeleton:

```shell
$ aws machinelearning create-evaluation --generate-cli-skeleton > eval_ames_housing_001.json
```

Configure the parameter file:

```json
{
    "EvaluationId": "ch8_ames_housing_001",
    "EvaluationName": "[EVL] Ames Housing 001",
    "MLModelId": "ch8_ames_housing_001",
    "EvaluationDataSourceId": "ch8_ames_housing_002"
}
```

Launch the evaluation creation:

```shell
$ aws machinelearning create-evaluation --cli-input-json file://eval_ames_housing_001.json
```

Get the evaluation information:

```shell
$ aws machinelearning get-evaluation --evaluation-id ch8_ames_housing_001
```

From that output, we get the performance of the model in the form of the RMSE:

```json
"PerformanceMetrics": {
    "Properties": {
        "RegressionRMSE": "29853.250469108018"
    }
}
```

The value may seem big, but it is relative to the range of the SalePrice variable, which has a mean of 181300.0 and a standard deviation of 79886.7. So an RMSE of 29853.2 is a decent score.

Note that you don't have to wait for the datasource creation to be completed in order to launch the model training; Amazon ML will simply wait for the parent operation to conclude before launching the dependent one. This makes chaining operations possible.

At this point, we have a trained and evaluated model. In this tutorial, we have seen the detailed steps for getting started with the CLI and implemented a simple project to get comfortable with it. To understand how to leverage Amazon's powerful platform for your predictive analytics needs, check out the book Effective Amazon Machine Learning.

Related reading:

  • Part 1: Learning AWS CLI
  • Part 2: ChatOps with Slack and AWS CLI
  • Automate tasks using Azure PowerShell and Azure CLI [Tutorial]
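When scripting many evaluations, it helps to pull the RMSE out of the get-evaluation response programmatically rather than reading it by eye. The sketch below parses a response saved to a file (the sample below reuses the values shown above; in practice you would redirect the aws machinelearning get-evaluation output into evaluation.json) and reports the RMSE relative to the target's standard deviation:

```shell
# Sample get-evaluation response saved to a file (values from this tutorial).
cat > evaluation.json <<'EOF'
{
    "PerformanceMetrics": {
        "Properties": {
            "RegressionRMSE": "29853.250469108018"
        }
    }
}
EOF

# Extract the RMSE and express it relative to SalePrice's standard
# deviation (79886.7) to judge the score at a glance.
python3 - <<'EOF'
import json
props = json.load(open("evaluation.json"))["PerformanceMetrics"]["Properties"]
rmse = float(props["RegressionRMSE"])
print(f"RMSE = {rmse:.1f}, RMSE/std = {rmse / 79886.7:.2f}")
EOF
```

A ratio well below 1 indicates the model explains a substantial part of the price variance; here the RMSE is roughly 0.37 standard deviations.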

Prasad Ramesh
12 Sep 2018
9 min read

How to predict viral content using random forest regression in Python [Tutorial]

Understanding sharing behavior is big business. As consumers become blind to traditional advertising, the push is to go beyond simple pitches to tell engaging stories. In this article, we will build a predictive content scoring model that predicts whether content will go viral, using random forest regression. This article is an excerpt from a book written by Alexander T. Combs titled Python Machine Learning Blueprints: Intuitive data projects you can relate to. You can download the code and other relevant files used in this article from this GitHub link.

What does research tell us about content virality?

Increasingly, the success of these endeavors is measured in social shares. Why go to so much trouble? Because as a brand, every share that I receive represents another consumer that I've reached—all without spending an additional cent. Due to this value, several researchers have examined sharing behavior in the hopes of understanding what motivates it. Among the reasons researchers have found:

To provide practical value to others (an altruistic motive)
To associate ourselves with certain ideas and concepts (an identity motive)
To bond with others around a common emotion (a communal motive)

With regard to the last motive, one particularly well-designed study looked at 7,000 pieces of content from the New York Times to examine the effect of emotion on sharing. The researchers found that simple emotional sentiment was not enough to explain sharing behavior, but when combined with emotional arousal, the explanatory power was greater. For example, while sadness has a strong negative valence, it is considered a low-arousal state. Anger, on the other hand, has a negative valence paired with a high-arousal state. As such, stories that sadden the reader tend to generate far fewer shares than anger-inducing stories.

Source: "What Makes Online Content Viral?" by Jonah Berger and Katherine L. Milkman

Building a predictive content scoring model

Let's create a model that can estimate the share counts for a given piece of content. Ideally, we would have a much larger sample of content, especially content that had more typical share counts. However, we'll make do with what we have here. We're going to use an algorithm called random forest regression. Here, we use regression and attempt to predict the share counts. We could bucket our share classes into ranges, but it is preferable to use regression when dealing with continuous variables.

To begin, we'll create a bare-bones model. We'll use the number of images, the site, and the word count. We'll train our model on the number of Facebook likes. We'll first import the scikit-learn library, then prepare our data by removing the rows with nulls, resetting our index, and finally splitting the frame into our training and testing sets:

import numpy as np
from sklearn.ensemble import RandomForestRegressor

all_data = dfc.dropna(subset=['img_count', 'word_count'])
all_data.reset_index(inplace=True, drop=True)
train_index = []
test_index = []
for i in all_data.index:
    result = np.random.choice(2, p=[.65,.35])
    if result == 1:
        test_index.append(i)
    else:
        train_index.append(i)

We used a random number generator with probabilities set at approximately 2/3 and 1/3 to determine which row items (based on their index) would be placed in each set. Setting the probabilities this way ensures that we get approximately twice the number of rows in our training set as compared to the test set. We see this, as follows:

print('test length:', len(test_index), '\ntrain length:', len(train_index))

The preceding code will generate the following output:

Now, we'll continue on with preparing our data. Next, we need to set up categorical encoding for our sites. Currently, our DataFrame object has the name for each site represented with a string. We need to use dummy encoding. This creates a column for each site.
If the row is for that particular site, then that column will be filled in with 1; all the other site columns will be filled in with 0. Let's do that now:

sites = pd.get_dummies(all_data['site'])
sites

The preceding code will generate the following output:

The dummy encoding can be seen in the preceding image. We'll now continue by splitting our data into training and test sets as follows:

y_train = all_data.iloc[train_index]['fb'].astype(int)
X_train_nosite = all_data.iloc[train_index][['img_count', 'word_count']]
X_train = pd.merge(X_train_nosite, sites.iloc[train_index], left_index=True, right_index=True)
y_test = all_data.iloc[test_index]['fb'].astype(int)
X_test_nosite = all_data.iloc[test_index][['img_count', 'word_count']]
X_test = pd.merge(X_test_nosite, sites.iloc[test_index], left_index=True, right_index=True)

With this, we've set up our X_test, X_train, y_test, and y_train variables. We'll use this now to build our model:

clf = RandomForestRegressor(n_estimators=1000)
clf.fit(X_train, y_train)

With these two lines of code, we have trained our model. Let's now use it to predict the Facebook likes for our testing set:

y_pred = clf.predict(X_test)
y_actual = y_test
deltas = pd.DataFrame(list(zip(y_pred, y_actual, (y_pred - y_actual)/(y_actual))), columns=['predicted', 'actual', 'delta'])
deltas

The preceding code will generate the following output:

Here we see the predicted value, the actual value, and the difference as a percentage. Let's take a look at the descriptive stats for this:

deltas['delta'].describe()

The preceding code will generate the following output:

Our median error is 0! Well, unfortunately, this isn't a particularly useful bit of information, as errors fall on both sides—positive and negative—and they tend to average out, which is what we see here. Let's now look at a more informative metric to evaluate our model. We're going to look at root mean square error as a percentage of the actual mean.
To first illustrate why this is more useful, let's run the following scenario on two sample series:

a = pd.Series([10,10,10,10])
b = pd.Series([12,8,8,12])
np.sqrt(np.mean((b-a)**2))/np.mean(a)

This results in the following output:

Now compare this to the mean:

(b-a).mean()

This results in the following output:

Clearly, the former is the more meaningful statistic. Let's now run this for our model:

np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual)

The preceding code will generate the following output:

Let's now add another feature, the counts of words in the title, and see if it helps our model. We'll use a count vectorizer to do this. Much like what we did with the site names, we'll transform individual words and n-grams into features:

from sklearn.feature_extraction.text import CountVectorizer

vect = CountVectorizer(ngram_range=(1,3))
X_titles_all = vect.fit_transform(all_data['title'])
X_titles_train = X_titles_all[train_index]
X_titles_test = X_titles_all[test_index]
X_test = pd.merge(X_test, pd.DataFrame(X_titles_test.toarray(), index=X_test.index), left_index=True, right_index=True)
X_train = pd.merge(X_train, pd.DataFrame(X_titles_train.toarray(), index=X_train.index), left_index=True, right_index=True)

In these lines, we joined our existing features to our new n-gram features. Let's now train our model and see if we have any improvement:

clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
deltas = pd.DataFrame(list(zip(y_pred, y_actual, (y_pred - y_actual)/(y_actual))), columns=['predicted', 'actual', 'delta'])
deltas

The preceding code will generate the following output:

While checking our errors again, we see the following:

np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual)

This code results in the following output:

So, it appears that we have a modestly improved model.
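To see what the count vectorizer is actually producing, it helps to enumerate the word n-grams for a single title by hand. A simplified, dependency-free sketch of what CountVectorizer(ngram_range=(1, 3)) tokenizes (the real vectorizer also lowercases and strips punctuation with its own tokenizer, which this sketch glosses over):

```python
def ngrams(text, n_min=1, n_max=3):
    """Enumerate word n-grams from n_min to n_max words long,
    mirroring the features CountVectorizer(ngram_range=(1, 3)) builds."""
    words = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams

print(ngrams("random forest regression"))
```

A three-word title yields three unigrams, two bigrams, and one trigram, so even short headlines contribute several columns to the feature matrix, which is why the title features noticeably widen the training data.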
Now, let's add another feature, the word count of the title, as follows:

all_data = all_data.assign(title_wc = all_data['title'].map(lambda x: len(x.split(' '))))
X_train = pd.merge(X_train, all_data[['title_wc']], left_index=True, right_index=True)
X_test = pd.merge(X_test, all_data[['title_wc']], left_index=True, right_index=True)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual)

The preceding code will generate the following output:

It appears that each feature has modestly improved our model. There are certainly more features that we could add. For example, we could add the day of the week and the hour of the posting, we could determine whether the article is a listicle by running a regex on the headline, or we could examine the sentiment of each article. This only begins to touch on the features that could be important to modeling virality. We would certainly need to go much further to continue reducing the error in our model.

We have performed only the most cursory testing of our model. Each measurement should be run multiple times to get a more accurate representation of the true error rate. It is possible that there is no statistically discernible difference between our last two models, as we only performed one test.

To summarize, we learned how to build a model to predict content virality using random forest regression. To know more about this and other machine learning projects in Python, check out Python Machine Learning Blueprints: Intuitive data projects you can relate to.

Writing web services with functional Python programming [Tutorial]
Visualizing data in R and Python using Anaconda [Tutorial]
Python 3.7 beta is available as the second generation Google App Engine standard runtime
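The title word-count feature added above is just a whitespace split wrapped in a lambda. A standalone sketch of that transformation (the sample titles are illustrative, not from the book's dataset):

```python
def title_word_count(title):
    """Count words in a headline by splitting on single spaces,
    as the lambda in the snippet above does."""
    return len(title.split(' '))

titles = [
    "What Makes Online Content Viral?",
    "How to predict viral content using random forest regression",
]
for t in titles:
    print(title_word_count(t), t)
```

Note that splitting on a single space counts "Viral?" as one word; a more careful version might use title.split() to collapse repeated whitespace, but the simple form matches the lambda used in the text.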

Natasha Mathur
11 Sep 2018
8 min read

Implementing an AI in Unreal Engine 4 with AI Perception components [Tutorial]

AI Perception is a system within Unreal Engine 4 that allows sources to register their senses to create stimuli, and other listeners are then periodically updated as sense stimuli are created within the system. This works wonders for creating a reusable system that can react to an array of customizable sensors. In this tutorial, we will explore the different components available within Unreal Engine 4 to enable artificial intelligence sensing within our games. We will do this by taking advantage of a system within Unreal Engine called AI Perception components. These components can be customized and even scripted to introduce new behavior by extending the current sensing interface. This tutorial is an excerpt taken from the book 'Unreal Engine 4 AI Programming Essentials' written by Peter L. Newton, Jie Feng. Let's now have a look at AI sensing.

Implementing AI sensing in Unreal Engine 4

Let's start by bringing up Unreal Engine 4 and opening our New Project window. Then, perform the following steps:

1. First, name our new project AI Sense and hit Create Project.
2. After it finishes loading, we want to start by creating a new AIController that will be responsible for sending our AI the appropriate instructions. Let's navigate to the Blueprint folder and create a new AIController class, naming it EnemyPatrol.
3. Now, to assign EnemyPatrol, we need to place a pawn into the world and then assign the controller to it. After placing the pawn, click on the Details tab within the editor.
4. Next, we want to search for AI Controller. By default, it is the parent class AI Controller, but we want this to be EnemyPatrol:
5. Next, we will create a new PlayerController named PlayerSense. Then, we need to introduce the AI Perception component to the actors we want to see or be seen by. Let's open the PlayerSense controller first and then add the necessary components.

Building AI Perception components

There are two components currently available within the Unreal Engine framework.
The first one is the AI Perception component, which listens for perception stimuli (sight, hearing, and so on). The other is the AIPerceptionStimuliSource component. It is used to easily register the pawn as a source of stimuli, allowing it to be detected by other AI Perception components. This comes in handy, particularly in our case. Now, follow these steps:

1. With PlayerSense open, let's add a new component called AIPerceptionStimuliSource.
2. Then, under the Details tab, let's select AutoRegister as Source.
3. Next, we want to add new senses to create a source for. So, looking at Register as Source for Senses, there is an AISense array. Populate this array with the AISense_Sight blueprint in order to be detected by sight by other AI Perception components. You will note that there are also other senses to choose from—for example, AISense_Hearing, AISense_Touch, and so on. The complete settings are shown in the following screenshot:

This was pretty straightforward, considering our next process. This allows our player pawn to be detected by enemy AI whenever we get within their sense's configured range. Next, let's open our EnemyPatrol class and add the other AI Perception component to our AI. This component is called AIPerception and contains many other configurations, allowing you to customize and tailor the AI for different scenarios:

1. Clicking on the AI Perception component, you will notice that under the AI section, everything is grayed out. This is because we have configurations specific to each sense. This also applies if you create your own AI Sense classes.
2. Let's focus on two sections within this component: the first is the AI Perception settings, and the other is the event provided with this component:

The AI Perception section should look similar to the same section on AIPerceptionStimuliSource. The differences are that you have to register your senses, and you can also specify a dominant sense.
The dominant sense takes precedence over other senses perceived at the same location. Let's look at the Senses configuration and add a new element. This will populate the array with a new sense configuration, which you can then modify. For now, let's select the AI Sight configuration, and we can leave the default values as they are. In the game, we are able to visualize the configurations, allowing us to have more control over our senses. There is another configuration that allows you to specify affiliation, but at the time of writing, these options aren't available. When you click on Detection by Affiliation, you must select Detect Neutrals to detect any pawn with a Sight Sense source.

Next, we need to be able to notify our AI of a new target. We will do this by utilizing the event we saw as part of the AI Perception component. By navigating there, we can see an event called OnPerceptionUpdated. This will be called when there are changes in the sensory state, which makes the tracking of senses easy and straightforward. Let's move to the OnPerceptionUpdated event and perform the following:

1. Click on OnPerceptionUpdated and create it within the EventGraph.
2. Now, within the EventGraph, whenever this event is called, changes will have been made to the senses, and it will return the available sensed actors, as shown in the following screenshot:

Now that we understand how we will obtain our referenced sensed actors, we should create a way for our pawn to maintain different states of being, similar to what we would do in a Behavior Tree. Let's first establish a home location for our pawn to run to when the player is no longer detected by the AI. In the same Blueprint folder, we will create a subclass of Target Point. Let's name this Waypoint and place it at an appropriate location within the world. Now, we need to open this Waypoint subclass and create additional variables to maintain traversable routes.
We can do this by defining the next waypoint within a waypoint, allowing us to create what programmers call a linked list. This results in the AI being able to continuously move to the next available route after reaching the destination of its current route:

1. With Waypoint open, add a new variable named NextWaypoint and make its type the same as that of the Waypoint class we created.
2. Navigate back to our Content Browser. Now, within our EnemyPatrol AIController, let's focus on Event Begin Play in the EventGraph. We have to grab the reference to the waypoint we created earlier and store it within our AIController. So, let's create a new variable of the Waypoint type and name it CurrentPoint.
3. Now, on Event Begin Play, the first thing we need is the AIController, which is the self-reference for this EventGraph because we are in the AIController class. So, let's grab our self-reference and check whether it is valid. Safety first!
4. Next, we will get our AIController from our self-reference. Then, again for safety, let's check whether our AIController is valid.

How does our AI sense?

1. Next, we want to create a Get all Actors Of Class node and set the Actor class to Waypoint.
2. Now, we need to convert a few instructions into a macro because we will use these instructions throughout the project. So, let's select the nodes shown as follows and hit convert to macro. Lastly, rename this macro getAIController. You can see the final nodes in the following screenshot:
This will automatically create the type dragged from thepin, and we want to rename this Current Point to understand why this variable exists. Then, from our getAIController macro, we want to assign the ReceiveMoveCompleted event. This is done so that when our AI successfully moves to the next route, we can update the information and tell our AI to move to the next route. We learned AI sensing in Unreal Engine 4 with the help of a system within Unreal Engine called AI Perception components. We also explored different components within that system. If you found this post useful, be sure to check out the book, ‘Unreal Engine 4 AI Programming Essentials’ for more concepts on AI sensing in Unreal Engine. Development Tricks with Unreal Engine 4 What’s new in Unreal Engine 4.19? Unreal Engine 4.20 released with focus on mobile and immersive (AR/VR/MR) devices    

Prasad Ramesh
10 Sep 2018
13 min read

Build a custom news feed with Python [Tutorial]

To create a custom news feed, we need data that a model can be trained on. This training data will be fed into a model in order to teach it to discriminate between the articles that we'd be interested in and the ones that we would not. This article is an excerpt from a book written by Alexander T. Combs titled Python Machine Learning Blueprints: Intuitive data projects you can relate to. In this article, we will learn to build a custom news corpus and annotate a large number of articles according to our interests. You can download the code and other relevant files used in this article from this GitHub link.

Creating a supervised training dataset

Before we can create a model of our taste in news articles, we need training data. This training data will be fed into our model in order to teach it to discriminate between the articles that we'd be interested in and the ones that we would not. To build this corpus, we will need to annotate a large number of articles that correspond to these interests. For each article, we'll label it either "y" or "n". This will indicate whether the article is one that we would want sent to us in our daily digest or not.

To simplify this process, we will use the Pocket app. Pocket is an application that allows you to save stories to read later. You simply install the browser extension, and then click on the Pocket icon in your browser's toolbar when you wish to save a story. The article is saved to your personal repository. One of the great features of Pocket for our purposes is its ability to save the article with a tag of your choosing. We'll use this feature to mark interesting articles as "y" and non-interesting articles as "n".

Installing the Pocket Chrome extension

We use Google Chrome here, but other browsers should work similarly.
For Chrome, go into the Google App Store and look for the Extensions section:

Image from https://chrome.google.com/webstore/search/pocket

Click on the blue Add to Chrome button. If you already have an account, log in, and if you do not have an account, go ahead and sign up (it's free). Once this is complete, you should see the Pocket icon in the upper right-hand corner of your browser. It will be greyed out, but once there is an article you wish to save, you can click on it. It will turn red once the article has been saved, as seen in the following images.

The greyed out icon can be seen in the upper right-hand corner.

Image from https://news.ycombinator.com

When the icon is clicked, it turns red to indicate the article has been saved.

Image from https://www.wsj.com

Now comes the fun part! Begin saving all articles that you come across. Tag the interesting ones with "y", and the non-interesting ones with "n". This is going to take some work. Your end results will only be as good as your training set, so you're going to need to do this for hundreds of articles. If you forget to tag an article when you save it, you can always go to the site, http://www.get.pocket.com, to tag it there.

Using the Pocket API to retrieve stories

Now that you've diligently saved your articles to Pocket, the next step is to retrieve them. To accomplish this, we'll use the Pocket API. You can sign up for an account at https://getpocket.com/developer/apps/new. Click on Create New App in the upper left-hand side and fill in the details to get your API key. Make sure to click all of the permissions so that you can add, change, and retrieve articles.

Image from https://getpocket.com/developer

Once you have filled this in and submitted it, you will receive your CONSUMER KEY. You can find this in the upper left-hand corner under My Apps.
This will look like the following screen, but obviously with a real key:

Image from https://getpocket.com/developer

Once this is set, you are ready to move on to the next step, which is to set up the authorizations. This requires that you input your consumer key and a redirect URL. The redirect URL can be anything. Here I have used my Twitter account:

import requests
auth_params = {'consumer_key': 'MY_CONSUMER_KEY', 'redirect_uri': 'https://www.twitter.com/acombs'}
tkn = requests.post('https://getpocket.com/v3/oauth/request', data=auth_params)
tkn.content

You will see the following output:

The output will have the code that you'll need for the next step. Place the following in your browser bar:

https://getpocket.com/auth/authorize?request_token=some_long_code&redirect_uri=https%3A//www.twitter.com/acombs

If you change the redirect URL to one of your own, make sure to URL encode it. There are a number of resources for this. One option is to use the Python library urllib; another is to use a free online source. At this point, you should be presented with an authorization screen. Go ahead and approve it, and we can move on to the next step:

usr_params = {'consumer_key':'my_consumer_key', 'code': 'some_long_code'}
usr = requests.post('https://getpocket.com/v3/oauth/authorize', data=usr_params)
usr.content

We'll use the following output code here to move on to retrieving the stories. First, we retrieve the stories tagged "n":

no_params = {'consumer_key':'my_consumer_key', 'access_token': 'some_super_long_code', 'tag': 'n'}
no_result = requests.post('https://getpocket.com/v3/get', data=no_params)
no_result.text

The preceding code generates the following output:

Note that we have a long JSON string on all the articles that we tagged "n". There are several keys in this, but we are really only interested in the URL at this point.
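The JSON string returned above has a "list" object keyed by item ID, with each item carrying a resolved_url field. Pulling the URLs out of that shape can be tried on a made-up stand-in payload (the data here is invented; only the field names follow the Pocket response used in this article):

```python
import json

# A miniature stand-in for the Pocket /v3/get response body,
# containing only the fields we actually read.
response_text = json.dumps({
    "list": {
        "101": {"resolved_url": "http://example.com/story-a"},
        "102": {"resolved_url": "http://example.com/story-b"},
    }
})

payload = json.loads(response_text)
urls = [item.get("resolved_url") for item in payload["list"].values()]
print(sorted(urls))
```

Using .get("resolved_url") rather than indexing means an item missing that field yields None instead of raising, which is why the DataFrame built from these URLs is followed by a dropna in the text.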
We'll go ahead and create a list of all the URLs from this:

import json

no_jf = json.loads(no_result.text)
no_jd = no_jf['list']
no_urls = []
for i in no_jd.values():
    no_urls.append(i.get('resolved_url'))
no_urls

The preceding code generates the following output:

This list contains all the URLs of stories that we aren't interested in. Now, let's put this in a DataFrame object and tag it as such:

import pandas as pd

no_uf = pd.DataFrame(no_urls, columns=['urls'])
no_uf = no_uf.assign(wanted = lambda x: 'n')
no_uf

The preceding code generates the following output:

Now, we're all set with the unwanted stories. Let's do the same thing with the stories that we are interested in:

yes_params = {'consumer_key': 'my_consumer_key', 'access_token': 'some_super_long_token', 'tag': 'y'}
yes_result = requests.post('https://getpocket.com/v3/get', data=yes_params)
yes_jf = json.loads(yes_result.text)
yes_jd = yes_jf['list']
yes_urls = []
for i in yes_jd.values():
    yes_urls.append(i.get('resolved_url'))
yes_uf = pd.DataFrame(yes_urls, columns=['urls'])
yes_uf = yes_uf.assign(wanted = lambda x: 'y')
yes_uf

The preceding code generates the following output:

Now that we have both types of stories for our training data, let's join them together into a single DataFrame:

df = pd.concat([yes_uf, no_uf])
df.dropna(inplace=1)
df

The preceding code generates the following output:

Now that we're set with all our URLs and their corresponding tags in a single frame, we'll move on to downloading the HTML for each article. We'll use another free service for this called embed.ly.

Using the embed.ly API to download story bodies

We have all the URLs for our stories, but unfortunately this isn't enough to train on. We'll need the full article body. By itself, this could become a huge challenge if we wanted to roll our own scraper, especially if we were going to be pulling stories from dozens of sites. We would need to write code to target the article body while carefully avoiding all the other site gunk that surrounds it.
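To get a feel for the markup-stripping part of that job without any external service or library, the standard library's HTMLParser can do a rough version of it. This is a simplified, dependency-free sketch, not a replacement for a real extraction service:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text nodes, discarding all tags —
    a rough stand-in for a full article-body extractor."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def get_text(self):
        return "".join(self.chunks)

def strip_html(html):
    parser = TextExtractor()
    parser.feed(html)
    return parser.get_text()

print(strip_html("<p>Hello <b>world</b></p>"))  # Hello world
```

This strips tags but makes no attempt to separate the article body from navigation, ads, and comments, which is exactly the hard part that a dedicated extraction service handles for us.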
Fortunately, there are a number of free services that will do this for us. We're going to use embed.ly to do this, but there are a number of other services that you could use. The first step is to sign up for embed.ly API access. You can do this at https://app.embed.ly/signup. This is a straightforward process. Once you confirm your registration, you will receive an API key. You then just use this key in your HTTP request. Let's do this now:

import urllib

def get_html(x):
    qurl = urllib.parse.quote(x)
    rhtml = requests.get('https://api.embedly.com/1/extract?url=' + qurl + '&key=some_api_key')
    ctnt = json.loads(rhtml.text).get('content')
    return ctnt

df.loc[:,'html'] = df['urls'].map(get_html)
df.dropna(inplace=1)
df

The preceding code generates the following output:

With that, we have the HTML of each story. As the content is embedded in HTML markup and we want to feed plain text into our model, we'll use a parser to strip out the markup tags:

from bs4 import BeautifulSoup

def get_text(x):
    soup = BeautifulSoup(x, 'lxml')
    text = soup.get_text()
    return text

df.loc[:,'text'] = df['html'].map(get_text)
df

The preceding code generates the following output:

With this, we have our training set ready. We can now move on to a discussion of how to transform our text into something that a model can work with.

Setting up your daily personal newsletter

In order to set up a personal e-mail with news stories, we're going to utilize IFTTT again. As in Chapter 3, Build an App to Find Cheap Airfares, we'll use the Maker Channel to send a POST request. However, this time the payload will be our news stories. If you haven't set up the Maker Channel, do this now. Instructions can be found in Chapter 3, Build an App to Find Cheap Airfares. You should also set up the Gmail channel. Once that is complete, we'll add a recipe to combine the two. First, click on Create a Recipe from the IFTTT home page.
Then, search for the Maker Channel:

Image from https://www.iftt.com

Select this, then select Receive a web request:

Image from https://www.iftt.com

Then, give the request a name. I'm using news_event:

Image from https://www.iftt.com

Finish by clicking on Create Trigger. Next, click on that to set up the e-mail piece. Search for Gmail and click on the icon seen as follows:

Image from https://www.iftt.com

Once you have clicked on Gmail, click on Send an e-mail. From here, you can customize your e-mail message.

Image from https://www.iftt.com

Input your e-mail address, a subject line, and finally, include Value1 in the e-mail body. We will pass our story title and link into this with our POST request. Click on Create Recipe to finalize this.

Now, we're ready to generate the script that will run on a schedule, automatically sending us articles of interest. We're going to create a separate script for this, but one last thing that we need to do in our existing code is serialize our vectorizer and our model:

import pickle
pickle.dump(model, open(r'/Users/alexcombs/Downloads/news_model_pickle.p', 'wb'))
pickle.dump(vect, open(r'/Users/alexcombs/Downloads/news_vect_pickle.p', 'wb'))

With this, we have saved everything that we need from our model. In our new script, we will read these in to generate our new predictions. We're going to use the same scheduling library to run the code that we used in Chapter 3, Build an App to Find Cheap Airfares. Putting it all together, we have the following script:

# get our imports.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
import schedule
import time
import pickle
import json
import gspread
import requests
from bs4 import BeautifulSoup
from oauth2client.client import SignedJwtAssertionCredentials

# create our fetching function
def fetch_news():
    try:
        vect = pickle.load(open(r'/Users/alexcombs/Downloads/news_vect_pickle.p', 'rb'))
        model = pickle.load(open(r'/Users/alexcombs/Downloads/news_model_pickle.p', 'rb'))
        json_key = json.load(open(r'/Users/alexcombs/Downloads/APIKEY.json'))
        scope = ['https://spreadsheets.google.com/feeds']
        credentials = SignedJwtAssertionCredentials(json_key['client_email'], json_key['private_key'].encode(), scope)
        gc = gspread.authorize(credentials)
        ws = gc.open("NewStories")
        sh = ws.sheet1
        zd = list(zip(sh.col_values(2), sh.col_values(3), sh.col_values(4)))
        zf = pd.DataFrame(zd, columns=['title', 'urls', 'html'])
        zf.replace('', pd.np.nan, inplace=True)
        zf.dropna(inplace=True)

        def get_text(x):
            soup = BeautifulSoup(x, 'lxml')
            text = soup.get_text()
            return text

        zf.loc[:, 'text'] = zf['html'].map(get_text)
        tv = vect.transform(zf['text'])
        res = model.predict(tv)
        rf = pd.DataFrame(res, columns=['wanted'])
        rez = pd.merge(rf, zf, left_index=True, right_index=True)
        news_str = ''
        for t, u in zip(rez[rez['wanted'] == 'y']['title'], rez[rez['wanted'] == 'y']['urls']):
            news_str = news_str + t + '\n' + u + '\n'
        payload = {"value1": news_str}
        r = requests.post('https://maker.ifttt.com/trigger/news_event/with/key/IFTTT_KEY', data=payload)
        # cleanup worksheet
        lenv = len(sh.col_values(1))
        cell_list = sh.range('A1:F' + str(lenv))
        for cell in cell_list:
            cell.value = ""
        sh.update_cells(cell_list)
        print(r.text)
    except:
        print('Failed')

schedule.every(480).minutes.do(fetch_news)

while 1:
    schedule.run_pending()
    time.sleep(1)

What this script will do is run every 480 minutes (8 hours), pull down the news stories from Google Sheets, run the stories through the model, generate an e-mail by sending a POST request to IFTTT for the stories that are predicted to be of interest, and then, finally, clear out the stories in the spreadsheet so that only new stories get sent in the next e-mail.

Congratulations! You now have your own personalized news feed!

In this tutorial, we learned how to create a custom news feed. To know more about setting it up and other intuitive Python projects, check out Python Machine Learning Blueprints: Intuitive data projects you can relate to.

Writing web services with functional Python programming [Tutorial]
Visualizing data in R and Python using Anaconda [Tutorial]
Python 3.7 beta is available as the second generation Google App Engine standard runtime
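The serialization step that links the two scripts is a plain pickle dump/load round-trip. A minimal sketch with a stand-in object in place of the fitted model (the file name echoes the one used above; the dictionary contents are invented):

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model; any picklable object round-trips the same way.
model = {"weights": [0.1, 0.2, 0.3], "vocab": ["news", "viral"]}

path = os.path.join(tempfile.gettempdir(), "news_model_pickle.p")

# What the training script does at the end
with open(path, "wb") as f:
    pickle.dump(model, f)

# What the scheduled script does at startup
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)  # True
```

Both the vectorizer and the model must be saved: the scheduled script needs the exact same vectorizer vocabulary to transform new titles into the feature space the model was trained on.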

Natasha Mathur
09 Sep 2018
10 min read

Implementing Dependency Injection in Google Guice [Tutorial]
Choosing a framework wisely is important when implementing Dependency Injection, as each framework has its own advantages and disadvantages. There are various Java-based dependency injection frameworks available in the open source community, such as Dagger, Google Guice, Spring DI, Java EE 8 DI, and PicoContainer. In this article, we will learn about Google Guice (pronounced "juice"), a lightweight DI framework that helps developers to modularize applications. Guice encapsulates the annotation and generics features introduced by Java 5 to make code type-safe. It enables objects to be wired together and tested with less effort. Annotations help you to write less error-prone and more reusable code. This tutorial is an excerpt taken from the book 'Java 9 Dependency Injection', written by Krunal Patel and Nilang Patel. In Guice, the new keyword is replaced with @Inject for injecting dependencies. It allows constructor, field, and method (any method, with any number of arguments) level injection. Using Guice, we can define custom scopes and handle circular dependencies. It also has features to integrate with Spring and AOP interception. Moreover, Guice implements Java Specification Request (JSR) 330 and uses the standard annotations provided by JSR-330. The first version of Guice was introduced by Google in 2007, and the latest version is Guice 4.1. Before we see how dependency injection is implemented in Guice, let's first set up Guice.

Guice setup

To keep our code simple, throughout this tutorial we are going to use a Maven project to understand Guice DI. Let's create a simple Maven project using the following parameters: groupId: com.packt.guice.di, artifactId: chapter4, and version: 0.0.1-SNAPSHOT.
By adding the Guice 4.1.0 dependency to the pom.xml file, our final pom.xml will look like this:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.packt.guice.di</groupId>
  <artifactId>chapter4</artifactId>
  <packaging>jar</packaging>
  <version>0.0.1-SNAPSHOT</version>
  <name>chapter4</name>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.google.inject</groupId>
      <artifactId>guice</artifactId>
      <version>4.1.0</version>
    </dependency>
  </dependencies>
  <build>
    <finalName>chapter4</finalName>
  </build>
</project>

For this tutorial, we have used JDK 9, but not as a modular project, because the Guice library is not available as a Java 9 modular jar.

Basic injection in Guice

We have set up Guice; now it is time to understand how injection works in Guice. Let's rewrite the example of a notification system using Guice, and along the way we will see several indispensable interfaces and classes in Guice. We have a base interface called NotificationService, which expects a message and recipient details as arguments:

public interface NotificationService {
  boolean sendNotification(String message, String recipient);
}

The SMSService concrete class is an implementation of the NotificationService interface. Here, we will apply the @Singleton annotation to the implementation class. Since service objects will be created through injector classes, this annotation tells them that the service class should be a singleton object.
Because of JSR-330 support in Guice, annotations from either the javax.inject or the com.google.inject package can be used:

import javax.inject.Singleton;
import com.packt.guice.di.service.NotificationService;

@Singleton
public class SMSService implements NotificationService {
  public boolean sendNotification(String message, String recipient) {
    // Write code for sending SMS
    System.out.println("SMS has been sent to " + recipient);
    return true;
  }
}

In the same way, we can also implement another service, such as sending notifications to a social media platform, by implementing the NotificationService interface. It's time to define the consumer class, where we can initialize the service class for the application. In Guice, the @Inject annotation is used to define setter-based as well as constructor-based dependency injection. An instance of this class is used to send notifications via the available communication services. Our AppConsumer class defines setter-based injection as follows:

import javax.inject.Inject;
import com.packt.guice.di.service.NotificationService;

public class AppConsumer {
  private NotificationService notificationService;

  // Setter-based DI
  @Inject
  public void setService(NotificationService service) {
    this.notificationService = service;
  }

  public boolean sendNotification(String message, String recipient) {
    // Business logic
    return notificationService.sendNotification(message, recipient);
  }
}

Guice needs to recognize which service implementation to use, so we configure it by extending the AbstractModule class and providing an implementation for the configure() method.
Here is an example of an injector configuration:

import com.google.inject.AbstractModule;
import com.packt.guice.di.impl.SMSService;
import com.packt.guice.di.service.NotificationService;

public class ApplicationModule extends AbstractModule {
  @Override
  protected void configure() {
    // bind service to implementation class
    bind(NotificationService.class).to(SMSService.class);
  }
}

In the preceding class, the module implementation specifies that an instance of SMSService is to be injected wherever a NotificationService variable is found. In the same way, we just need to define a binding for a new service implementation, if required. Binding in Guice is similar to wiring in Spring:

import com.google.inject.Guice;
import com.google.inject.Injector;
import com.packt.guice.di.consumer.AppConsumer;
import com.packt.guice.di.injector.ApplicationModule;

public class NotificationClient {
  public static void main(String[] args) {
    Injector injector = Guice.createInjector(new ApplicationModule());
    AppConsumer app = injector.getInstance(AppConsumer.class);
    app.sendNotification("Hello", "9999999999");
  }
}

In the preceding program, the Injector object is created using the Guice class's createInjector() method, by passing the ApplicationModule class's implementation object. By using the injector's getInstance() method, we can initialize the AppConsumer class. While creating the AppConsumer object, Guice injects the required service class implementation (SMSService, in our case). The following is the output of running the previous code:

SMS has been sent to 9999999999

So, this is how Guice dependency injection works compared to other DI frameworks. Guice has embraced a code-first technique for dependency injection, so management of numerous XML files is not required. Let's test our client application by writing a JUnit test case. We can simply mock the service implementation of SMSService, so there is no need to use the actual service.
The MockSMSService class looks like this:

import com.packt.guice.di.service.NotificationService;

public class MockSMSService implements NotificationService {
  public boolean sendNotification(String message, String recipient) {
    System.out.println("In Test Service :: " + message + " Recipient :: " + recipient);
    return true;
  }
}

The following is the JUnit 4 test case for the client application:

import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import com.google.inject.AbstractModule;
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.packt.guice.di.consumer.AppConsumer;
import com.packt.guice.di.impl.MockSMSService;
import com.packt.guice.di.service.NotificationService;

public class NotificationClientTest {
  private Injector injector;

  @Before
  public void setUp() throws Exception {
    injector = Guice.createInjector(new AbstractModule() {
      @Override
      protected void configure() {
        bind(NotificationService.class).to(MockSMSService.class);
      }
    });
  }

  @After
  public void tearDown() throws Exception {
    injector = null;
  }

  @Test
  public void test() {
    AppConsumer appTest = injector.getInstance(AppConsumer.class);
    Assert.assertEquals(true, appTest.sendNotification("Hello There", "9898989898"));
  }
}

Take note that we are binding the MockSMSService class to NotificationService through an anonymous class implementation of AbstractModule. This is done in the setUp() method, which runs before the test methods run.

Guice dependency injection

Now that we know what dependency injection is, let us explore how Google Guice provides injection. We have seen that the injector helps to resolve dependencies by reading configurations from modules, which are called bindings. The injector prepares an object graph for the requested objects.
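To see the bind-and-resolve idea in one place, here is a language-agnostic sketch, written in Python for brevity, of what `bind(NotificationService.class).to(SMSService.class)` followed by `injector.getInstance(...)` accomplishes. The `Injector` class below is a toy stand-in, not Guice's API:

```python
# Toy injector: maps an "interface" to an implementation class and constructs
# the bound implementation on request. Illustrative only -- not Guice.
class Injector:
    def __init__(self):
        self._bindings = {}

    def bind(self, interface, implementation):
        self._bindings[interface] = implementation

    def get_instance(self, interface):
        return self._bindings[interface]()  # construct the bound class

class NotificationService:                  # the service "interface"
    def send_notification(self, message, recipient):
        raise NotImplementedError

class SMSService(NotificationService):      # a concrete implementation
    def send_notification(self, message, recipient):
        return 'SMS has been sent to ' + recipient

injector = Injector()
injector.bind(NotificationService, SMSService)
service = injector.get_instance(NotificationService)
print(service.send_notification('Hello', '9999999999'))
```

Swapping the binding to a mock class is exactly how the JUnit test above substitutes MockSMSService without touching the consumer code.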
Dependency injection is managed by injectors using various types of injection:

Constructor injection
Method injection
Field injection
Optional injection
Static injection

Constructor injection

Constructor injection can be achieved by using the @Inject annotation at the constructor level. This constructor should accept class dependencies as arguments. The constructor will then assign the arguments to final fields:

public class AppConsumer {
  private NotificationService notificationService;

  // Constructor-level injection
  @Inject
  public AppConsumer(NotificationService service) {
    this.notificationService = service;
  }

  public boolean sendNotification(String message, String recipient) {
    // Business logic
    return notificationService.sendNotification(message, recipient);
  }
}

If our class does not have a constructor annotated with @Inject, a default constructor with no arguments is assumed. Constructor injection works perfectly when we have a single constructor through which the class accepts its dependencies, and it is helpful for unit testing. It is also simple because Java maintains the constructor invocation, so you don't have to worry about objects arriving in an uninitialized state.

Method injection

Guice allows us to define injection at the method level by annotating methods with the @Inject annotation. This is similar to the setter injection available in Spring. In this approach, dependencies are passed as parameters and are resolved by the injector before invocation of the method. The name of the method and the number of parameters do not affect method injection:

private NotificationService notificationService;

// Setter injection
@Inject
public void setService(NotificationService service) {
  this.notificationService = service;
}

This can be valuable when we don't want to control the instantiation of classes. We can also use it when we have a superclass that needs some dependencies.
(This is difficult to achieve with constructor injection.)

Field injection

Fields can be injected with the @Inject annotation in Guice. This is a simple and short form of injection, but it makes the field untestable if used with the private access modifier. It is advisable to avoid the following:

@Inject
private NotificationService notificationService;

Optional injection

Guice provides a way to declare an injection as optional. Method and field injections might be optional, which causes Guice to silently ignore them when the dependencies aren't available. Optional injection can be used by specifying the @Inject(optional=true) annotation:

public class AppConsumer {
  private static final String DEFAULT_MSG = "Hello";
  private String message = DEFAULT_MSG;

  @Inject(optional=true)
  public void setDefaultMessage(@Named("SMS") String message) {
    this.message = message;
  }
}

Static injection

Static injection is helpful when we have to migrate a static factory implementation to Guice. It makes it feasible for objects to partially participate in dependency injection by gaining access to injected types without being injected themselves. In a module, to specify classes to be injected on injector creation, use requestStaticInjection(). For example, NotificationUtil is a utility class that provides a static method, timeZoneFormat, which formats a date string in a given format and returns the date and timezone. The format string is hardcoded in NotificationUtil, and we will attempt to inject this utility class statically. Consider that we have one private static String variable, timeZoneFmt, with setter and getter methods. We will use @Inject for the setter injection, using the @Named parameter.
NotificationUtil will look like this:

@Inject static String timeZoneFmt = "yyyy-MM-dd'T'HH:mm:ss";

@Inject
public static void setTimeZoneFmt(@Named("timeZoneFmt") String timeZoneFmt) {
  NotificationUtil.timeZoneFmt = timeZoneFmt;
}

Now, SMSUtilModule should look like this:

class SMSUtilModule extends AbstractModule {
  @Override
  protected void configure() {
    bindConstant().annotatedWith(Names.named("timeZoneFmt")).to("yyyy-MM-dd'T'HH:mm:ss");
    requestStaticInjection(NotificationUtil.class);
  }
}

This API is not recommended for general use, since it suffers from many of the same problems as static factories: it is difficult to test, and it makes dependencies opaque. To sum up what we learned in this tutorial: we began with basic dependency injection, and then we learned how dependency injection works in Guice, with examples. If you found this post useful, be sure to check out the book 'Java 9 Dependency Injection' to learn more about Google Guice and other concepts in dependency injection.

Learning Dependency Injection (DI)
Angular 2 Dependency Injection: A powerful design pattern
Savia Lobo
08 Sep 2018
12 min read

Building Recommendation System with Scala and Apache Spark [Tutorial]
Recommendation systems can be defined as software applications that draw on and learn from data such as user preferences, actions (clicks, for example), and browsing history, and generate recommendations: products that the system determines will appeal to the user in the immediate future. In this tutorial, we will learn to build a recommendation system with Scala and Apache Spark. This article is an excerpt taken from Modern Scala Projects, written by Ilango Gurusamy.

What does a recommendation system look like

The following diagram is representative of a typical recommendation system:

Recommendation system

The preceding diagram can be thought of as a recommendation ecosystem, with the recommendation system at its heart. This system needs three entities:

Users
Products
Transactions between users and products, where transactions contain feedback from users about products

Implementation and deployment

Implementation is documented in the following subsections. All code is developed in the IntelliJ code editor. The very first step is to create an empty Scala project called Chapter7.

Step 1 – creating the Scala project

Let's create a Scala project called Chapter7 with the following artifacts:

RecommendationSystem.scala
RecommendationWrapper.scala

Let's break down the project's structure:

.idea: Generated IntelliJ configuration files.
project: Contains build.properties and plugins.sbt.
project/assembly.sbt: This file specifies the sbt-assembly plugin needed to build a fat JAR for deployment.
src/main/scala: A folder that houses Scala source files in the com.packt.modern.chapter7 package.
target: This is where artifacts of the compile process are stored. The generated assembly JAR file goes here.
build.sbt: The main SBT configuration file. Spark and its dependencies are specified here.

At this point, we will start developing code in the IntelliJ code editor.
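Before any Spark code, the three entities above can be pictured as plain data: users, products, and transactions that carry user feedback as (user, item, rating) triples. The following Python sketch is purely illustrative (the names and values are made up, not from the chapter):

```python
# Transactions between users and products, with feedback (a rating) attached.
transactions = [
    ('user1', 'itemA', 5.0),
    ('user1', 'itemB', 2.0),
    ('user2', 'itemA', 4.0),
]

def ratings_by_user(txns):
    """Group (user, item, rating) triples into {user: {item: rating}}."""
    out = {}
    for user, item, rating in txns:
        out.setdefault(user, {})[item] = rating
    return out

print(ratings_by_user(transactions))
```

This is exactly the shape of data the recommendation system consumes: the training code below builds the same triples from past sales orders.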
We will start with the RecommendationWrapper Scala file and end with the deployment of the final application JAR into Spark with spark-submit.

Step 2 – creating the RecWrapper definition

Let's create the trait definition. The trait will hold the SparkSession variable, schema definitions for the datasets, and methods to build a dataframe:

trait RecWrapper {
}

Next, let's create a schema for past weapon sales orders.

Step 3 – creating a weapon sales orders schema

Let's create a schema for the past sales order dataset:

val salesOrderSchema: StructType = StructType(Array(
  StructField("sCustomerId", IntegerType, false),
  StructField("sCustomerName", StringType, false),
  StructField("sItemId", IntegerType, true),
  StructField("sItemName", StringType, true),
  StructField("sItemUnitPrice", DoubleType, true),
  StructField("sOrderSize", DoubleType, true),
  StructField("sAmountPaid", DoubleType, true)
))

Next, let's create a schema for weapon sales leads.

Step 4 – creating a weapon sales leads schema

Here is a schema definition for the weapon sales lead dataset:

val salesLeadSchema: StructType = StructType(Array(
  StructField("sCustomerId", IntegerType, false),
  StructField("sCustomerName", StringType, false),
  StructField("sItemId", IntegerType, true),
  StructField("sItemName", StringType, true)
))

Next, let's build a weapon sales order dataframe.

Step 5 – building a weapon sales order dataframe

Let's invoke the read method on our SparkSession instance and cache it.
We will call this method later from the RecSystem object:

def buildSalesOrders(dataSet: String): DataFrame = {
  session.read
    .format("com.databricks.spark.csv")
    .option("header", true).schema(salesOrderSchema).option("nullValue", "")
    .option("treatEmptyValuesAsNulls", "true")
    .load(dataSet).cache()
}

Next up, let's build a sales leads dataframe:

def buildSalesLeads(dataSet: String): DataFrame = {
  session.read
    .format("com.databricks.spark.csv")
    .option("header", true).schema(salesLeadSchema).option("nullValue", "")
    .option("treatEmptyValuesAsNulls", "true")
    .load(dataSet).cache()
}

This completes the trait. Overall, it looks like this:

trait RecWrapper {
  // 1) Create a lazy SparkSession instance and call it session.
  // 2) Create a schema for the past sales orders dataset.
  // 3) Create a schema for the sales lead dataset.
  // 4) Write a method that creates a dataframe holding past sales order data.
  //    This method takes in the sales order dataset and returns a dataframe.
  // 5) Write a method that creates a dataframe holding sales lead data.
}

Bring in the following imports:

import org.apache.spark.mllib.recommendation.{ALS, Rating}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

Create a Scala object called RecSystem:

object RecSystem extends App with RecWrapper {
}

Before going any further, bring in the following imports:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

Inside this object, start by loading the past sales order data. This will be our training data. Load the sales order dataset, as follows:

val salesOrdersDf = buildSalesOrders("sales\\PastWeaponSalesOrders.csv")

Verify the schema.
This is what the schema looks like:

salesOrdersDf.printSchema()

root
 |-- sCustomerId: integer (nullable = true)
 |-- sCustomerName: string (nullable = true)
 |-- sItemId: integer (nullable = true)
 |-- sItemName: string (nullable = true)
 |-- sItemUnitPrice: double (nullable = true)
 |-- sOrderSize: double (nullable = true)
 |-- sAmountPaid: double (nullable = true)

Here is a partial view of a dataframe displaying past weapon sales order data:

Partial view of dataframe displaying past weapon sales order data

Now, we have what we need to create a dataframe of ratings:

val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder =>
  Rating(
    salesOrder.getInt(0),
    salesOrder.getInt(2),
    salesOrder.getDouble(6)
  )
).toDF("user", "item", "rating")

Save all and compile the project at the command line:

C:\Path\To\Your\Project\Chapter7>sbt compile

You are likely to run into the following error:

[error] C:\Path\To\Your\Project\Chapter7\src\main\scala\com\packt\modern\chapter7\RecSystem.scala:50:50: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
[error] val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder =>
[error]                                               ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed

To fix this, place the following import above the declaration of the ratings dataframe. It should look like this:

import session.implicits._

val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder =>
  Rating(
    salesOrder.getInt(0),
    salesOrder.getInt(2),
    salesOrder.getDouble(6)
  )
).toDF("user", "item", "rating")

Save and recompile the project. This time, it compiles just fine. Next, import the Rating class from the org.apache.spark.mllib.recommendation package.
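As an aside, the map above is just positional column extraction: each order row keeps column 0 (customer id), column 2 (item id), and column 6 (amount paid, used as the rating). In plain Python, with a hypothetical row tuple:

```python
# Mirror of salesOrder.getInt(0), getInt(2), getDouble(6): pick three columns
# from a sales-order row to form a (user, item, rating) triple. The sample row
# values are made up for illustration.
def to_rating(order_row):
    return (order_row[0], order_row[2], order_row[6])

row = (101, 'CustomerName', 7, 'ItemName', 9.5, 2.0, 19.0)
print(to_rating(row))  # (101, 7, 19.0)
```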
This transforms the ratings dataframe that we obtained previously into its RDD equivalent:

val ratings: RDD[Rating] = ratingsDf.rdd.map( row =>
  Rating(
    row.getInt(0),
    row.getInt(1),
    row.getDouble(2)
  )
)
println("Ratings RDD is: " + ratings.take(10).mkString(" "))

The following few lines of code are very important. We will be using the ALS algorithm from Spark MLlib to create and train a MatrixFactorizationModel, which takes an RDD[Rating] object as input. The ALS train method may require a combination of the following training hyperparameters:

numBlocks: Preset to -1 in an auto-configuration setting. This parameter is meant to parallelize computation.
custRank: The number of features, otherwise known as latent factors.
iterations: The number of iterations for ALS to execute. For a reasonable solution to converge, this algorithm needs roughly 20 iterations or fewer.
regParam: The regularization parameter.
implicitPrefs: A specifier that lets us use either explicit feedback or implicit feedback.
alpha: A hyperparameter connected to the implicit feedback variant of the ALS algorithm. Its role is to govern the baseline confidence in preference observations.

We just explained the role played by each parameter needed by the ALS algorithm's train method. Let's get started by bringing in the following import:

import org.apache.spark.mllib.recommendation.MatrixFactorizationModel

Now, let's get down to training the matrix factorization model using the ALS algorithm. Let's train a matrix factorization model given an RDD of ratings by customers (users) for certain items (products). Our train method on the ALS algorithm will take the following four parameters:

Ratings.
A rank.
A number of iterations.
A lambda value, or regularization parameter:

val ratingsModel: MatrixFactorizationModel = ALS.train(ratings,
  6,    /* the rank */
  10,   /* number of iterations */
  15.0  /* lambda, or regularization parameter */
)

Next, we load the sales lead file and convert it into a tuple format:

val weaponSalesLeadDf = buildSalesLeads("sales\\ItemSalesLeads.csv")

In the next section, we will display the new weapon sales lead dataframe.

Step 6 – displaying the weapons sales dataframe

First, we must invoke the show method:

println("Weapons Sales Lead dataframe is: ")
weaponSalesLeadDf.show

Here is a view of the weapon sales lead dataframe:

View of weapon sales lead dataframe

Next, create a version of the sales lead dataframe structured as (customer, item) tuples:

val customerWeaponsSystemPairDf: DataFrame = weaponSalesLeadDf.map(salesLead => (
  salesLead.getInt(0),
  salesLead.getInt(2)
)).toDF("user", "item")

In the next section, let's display the dataframe that we just created.

Step 7 – displaying the customer-weapons-system dataframe

Let's invoke the show method, as follows:

println("The Customer-Weapons System dataframe as tuple pairs looks like: ")
customerWeaponsSystemPairDf.show

Here is a screenshot of the new customer-weapons-system dataframe as tuple pairs:

New customer-weapons-system dataframe as tuple pairs

Next, we will convert the preceding dataframe into an RDD:

val customerWeaponsSystemPairRDD: RDD[(Int, Int)] = customerWeaponsSystemPairDf.rdd.map(row =>
  (row.getInt(0), row.getInt(1))
)
/* Note: as far as the algorithm is concerned, "user" corresponds to a customer,
   and "item" (or product) corresponds to a weapons system. */

We previously created a MatrixFactorizationModel that we trained with the weapons system sales orders dataset. We are now in a position to predict how each customer country may rate a weapon system in the future. In the next section, we will generate predictions.

Step 8 – generating predictions

Here is how we will generate predictions.
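As a conceptual aside before the Spark code: prediction from a factorized model is just the dot product of a learned user vector and item vector, each of length rank. The toy, pure-Python sketch below illustrates the rank/iterations/lambda hyperparameters discussed above; it uses stochastic gradient descent rather than alternating least squares, so it is an illustration of the idea, not Spark's actual solver:

```python
import random

def factorize(ratings, rank=2, iterations=500, lam=0.01, lr=0.02, seed=0):
    """Learn latent-factor vectors U[user], V[item] so that dot(U[u], V[i])
    approximates each observed rating r in (u, i, r) triples. Toy SGD, not ALS."""
    rng = random.Random(seed)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    U = {u: [rng.uniform(-0.1, 0.1) for _ in range(rank)] for u in users}
    V = {i: [rng.uniform(-0.1, 0.1) for _ in range(rank)] for i in items}
    for _ in range(iterations):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                U[u][k] += lr * (err * vk - lam * uk)  # regularized updates
                V[i][k] += lr * (err * uk - lam * vk)
    return U, V

def predict(U, V, user, item):
    return sum(a * b for a, b in zip(U[user], V[item]))

train = [(1, 10, 4.0), (1, 11, 1.0), (2, 10, 4.5), (2, 12, 2.0)]
U, V = factorize(train)
print(round(predict(U, V, 1, 10), 2))  # close to the observed 4.0
```

Once the factors are learned, predicting for an unseen (user, item) pair is the same dot product, which is what ratingsModel.predict does below at cluster scale.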
The predict method of our model is designed to do just that. It will generate a predictions RDD that we call weaponRecs. It represents the ratings of weapons systems that were not previously rated by the customer nations listed in the past sales order data:

val weaponRecs: RDD[Rating] = ratingsModel.predict(customerWeaponsSystemPairRDD).distinct()

Next up, we will display the final predictions.

Step 9 – displaying predictions

Here is how to display the predictions, lined up in tabular format:

println("Future ratings are: ")
weaponRecs.foreach(rating => {
  println("Customer: " + rating.user + " Product: " + rating.product + " Rating: " + rating.rating)
})

The following table displays how each nation is expected to rate a certain system in the future, that is, a weapon system that they did not rate earlier:

System rating by each nation

Our recommendation system proved itself capable of generating future predictions. Up until now, we did not say how all of the preceding code is compiled and deployed. We will look at this in the next section.

Compilation and deployment

Compiling the project

Invoke sbt compile at the root folder of your Chapter7 project. You should get the following output:

Output on compiling the project

Besides loading build.sbt, the compile task also loads settings from assembly.sbt, which we will create below.

What is an assembly.sbt file?

We have not yet talked about the assembly.sbt file. Our Scala-based Spark application is a Spark job that will be submitted to a (local) Spark cluster as a JAR file. This file, apart from the Spark libraries, also needs other dependencies in it for our recommendation system job to complete successfully. The name "fat JAR" comes from having all dependencies bundled in one JAR. To build such a fat JAR, we need the sbt-assembly plugin. This explains the need for creating a new assembly.sbt with the assembly plugin.
Creating assembly.sbt

Create a new assembly.sbt in your IntelliJ project view and save it under your project folder, as follows:

Creating assembly.sbt

Contents of assembly.sbt

Paste the following contents into the newly created assembly.sbt (under the project folder). The output should look like this:

Output on placing contents of assembly.sbt

The sbt-assembly plugin, version 0.14.7, gives us the ability to run an sbt assembly task. With that, we are one step closer to building a fat, or uber, JAR. This action is documented in the next step.

Running the sbt assembly task

Issue the sbt assembly command, as follows:

Running the sbt assembly command

This time, the assembly task loads the assembly plugin in assembly.sbt. However, assembly halts because of a common duplicate error. This error arises from duplicates: multiple copies of dependency files that need removal before the assembly task can complete successfully. To address this situation, build.sbt needs an upgrade.

Upgrading the build.sbt file

The following lines of code need to be added, as follows:

Code lines for upgrading the build.sbt file

To test the effect of your changes, save the file and go to the command line to reissue the sbt assembly task.

Rerunning the assembly command

Run the assembly task, as follows:

Rerunning the assembly task

This time, the settings in the assembly.sbt file are loaded and the task completes successfully. To verify, drill down to the target folder. If everything went well, you should see a fat JAR, as follows:

Output as a JAR file

The JAR file under the target folder is the recommendation system application's JAR file that needs to be deployed into Spark. This is documented in the next step.

Deploying the recommendation application

The spark-submit command is how we will deploy the application into Spark. Here are two formats for the spark-submit command.
The first one is a long one, which sets more parameters than the second:

spark-submit --class "com.packt.modern.chapter7.RecSystem" \
  --master local[2] \
  --deploy-mode client \
  --driver-memory 16g \
  --num-executors 2 \
  --executor-memory 2g \
  --executor-cores 2 <path-to-jar>

Leaning on the preceding format, let's submit our Spark job, supplying various parameters to it:

Parameters for Spark

The different parameters are explained as follows:

Tabular explanation of parameters for the Spark job

We used Spark's support for recommendations to build a prediction model that generated recommendations, and leveraged Spark's alternating least squares algorithm to implement our collaborative filtering recommendation system. If you've enjoyed reading this post, do check out the book Modern Scala Projects to gain insights into data that will help organizations gain a strategic and competitive advantage.

How to build a music recommendation system with the PageRank algorithm
Recommendation Systems
Building a Recommendation System with Azure

Bhagyashree R
07 Sep 2018
11 min read

Building a Twitter news bot using Twitter API [Tutorial]
This article is an excerpt from a book written by Srini Janarthanam titled Hands-On Chatbots and Conversational UI Development. In this article, we will explore the Twitter API and build core modules for tweeting, searching, and retweeting. We will further explore a data source for news around the globe and build a simple bot that tweets top news on its timeline.

Getting started with the Twitter app

To get started, let us explore the Twitter developer platform. Let us begin by building a Twitter app and later explore how we can tweet news articles to followers based on their interests:

1. Log on to Twitter. If you don't have an account on Twitter, create one.
2. Go to Twitter Apps, which is Twitter's application management dashboard.
3. Click the Create New App button.
4. Create an application by filling in the form, providing a name, a description, and a website (fully-qualified URL).
5. Read and agree to the Developer Agreement and hit Create your Twitter application.
6. You will now see your application dashboard. Explore the tabs.
7. Click Keys and Access Tokens.
8. Copy the consumer key and consumer secret and hang on to them.
9. Scroll down to Your Access Token.
10. Click Create my access token to create a new token for your app.
11. Copy the Access Token and Access Token Secret and hang on to them.

Now, we have all the keys and tokens we need to create a Twitter app.

Building your first Twitter bot

Let's build a simple Twitter bot. This bot will listen to tweets and pick out those that have a particular hashtag. All the tweets with a given hashtag will be printed on the console. This is a very simple bot to help us get started. In the following sections, we will explore more complex bots. To follow along, you can download the code from the book's GitHub repository.

1. Go to the root directory and create a new Node.js program using npm init.
2. Execute the npm install twitter --save command to install the Twitter Node.js library.
3. Run npm install request --save to install the Request library as well.
We will use the Request library later to make HTTP GET requests to a news data source.

4. Explore your package.json file in the root directory:

```json
{
  "name": "twitterbot",
  "version": "1.0.0",
  "description": "my news bot",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "request": "^2.81.0",
    "twitter": "^1.7.1"
  }
}
```

5. Create an index.js file with the following code:

```js
//index.js
var TwitterPackage = require('twitter');
var request = require('request');

console.log("Hello World! I am a twitter bot!");

var secret = {
  consumer_key: 'YOUR_CONSUMER_KEY',
  consumer_secret: 'YOUR_CONSUMER_SECRET',
  access_token_key: 'YOUR_ACCESS_TOKEN_KEY',
  access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET'
}
var Twitter = new TwitterPackage(secret);
```

In the preceding code, put the keys and tokens you saved into their appropriate variables. We don't need the request package just yet, but we will later.

6. Now let's create a hashtag listener to listen to the tweets on a specific hashtag:

```js
//Twitter stream
var hashtag = '#brexit'; //put any hashtag to listen to, e.g. #brexit
console.log('Listening to: ' + hashtag);
Twitter.stream('statuses/filter', {track: hashtag}, function(stream) {
  stream.on('data', function(tweet) {
    console.log('Tweet: @' + tweet.user.screen_name + '\t' + tweet.text);
    console.log('------');
  });
  stream.on('error', function(error) {
    console.log(error);
  });
});
```

Replace #brexit with the hashtag you want to listen to. Use a popular one so that you can see the code in action.

7. Run the index.js file with the node index.js command. You will see a stream of tweets from Twitter users all over the globe who used the hashtag.

Congratulations! You have built your first Twitter bot.

Exploring the Twitter SDK

In the previous section, we explored how to listen to tweets based on hashtags. Let's now explore the Twitter SDK to understand the capabilities we can bestow upon our Twitter bot.
Updating your status

You can update the status on your Twitter timeline by using the following status update module code:

```js
tweet('I am a Twitter Bot!', null, null);

function tweet(statusMsg, screen_name, status_id) {
  console.log('Sending tweet to: ' + screen_name);
  console.log('In response to: ' + status_id);
  var msg = statusMsg;
  if (screen_name != null) {
    msg = '@' + screen_name + ' ' + statusMsg;
  }
  console.log('Tweet: ' + msg);
  Twitter.post('statuses/update', { status: msg }, function(err, response) {
    // if there was an error while tweeting
    if (err) {
      console.log('Something went wrong while TWEETING...');
      console.log(err);
    } else if (response) {
      console.log('Tweeted!!!');
      console.log(response);
    }
  });
}
```

Comment out the hashtag listener code, add the preceding status update code instead, and run it. When run, your bot will post a tweet on your timeline.

In addition to tweeting on your timeline, you can also tweet in response to another tweet (or status update). The screen_name argument is used to create a response tweet; screen_name is the name of the user who posted the tweet being responded to. We will explore this a bit later.
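The message-composition logic inside tweet() can be factored into a small pure helper, which also guards against Twitter's 280-character status limit. The following is a sketch of our own, not part of the book's code; the composeTweet name and the truncation behavior are our additions:

```javascript
// Hypothetical helper (not from the book): compose a status message the same
// way tweet() does, and trim it so statuses/update doesn't reject it for
// exceeding Twitter's 280-character limit.
function composeTweet(statusMsg, screen_name) {
  var msg = statusMsg;
  if (screen_name != null) {
    // Prefix the target user's handle to make the tweet a reply/mention
    msg = '@' + screen_name + ' ' + statusMsg;
  }
  // Truncate over-long messages, keeping room for an ellipsis character
  if (msg.length > 280) {
    msg = msg.substring(0, 279) + '…';
  }
  return msg;
}

console.log(composeTweet('I am a Twitter Bot!', null)); // I am a Twitter Bot!
console.log(composeTweet('Hello!', 'jsmith'));          // @jsmith Hello!
```

A helper like this would be called just before Twitter.post('statuses/update', ...), keeping the network call and the string handling separate and testable.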
Retweeting to your followers

You can retweet a tweet to your followers using the following retweet status code:

```js
var retweetId = '899681279343570944';
retweet(retweetId);

function retweet(retweetId) {
  Twitter.post('statuses/retweet/', { id: retweetId }, function(err, response) {
    if (err) {
      console.log('Something went wrong while RETWEETING...');
      console.log(err);
    } else if (response) {
      console.log('Retweeted!!!');
      console.log(response);
    }
  });
}
```

Searching for tweets

You can also search for recent or popular tweets with hashtags using the following search code:

```js
search('#brexit', 'popular');

function search(hashtag, resultType) {
  var params = {
    q: hashtag, // REQUIRED
    result_type: resultType,
    lang: 'en'
  };
  Twitter.get('search/tweets', params, function(err, data) {
    if (!err) {
      console.log('Found tweets: ' + data.statuses.length);
      console.log('First one: ' + data.statuses[0].text);
    } else {
      console.log('Something went wrong while SEARCHING...');
    }
  });
}
```

Exploring a news data service

Let's now build a bot that will tweet news articles to its followers at regular intervals. We will then extend it to be personalized by its users through a conversation that happens over direct messaging with the bot. In order to build a news bot, we need a source from which we can get news articles. We are going to explore a news service called NewsAPI.org in this section. News API is a service that aggregates news articles from roughly 70 newspapers around the globe.

Setting up News API

Let us set up an account with the News API data service and get the API key:

1. Go to NewsAPI.org.
2. Click Get API key.
3. Register using your email.
4. Get your API key.
5. Explore the sources: https://newsapi.org/v1/sources?apiKey=YOUR_API_KEY.

There are about 70 sources from across the globe, including popular ones such as BBC News, Associated Press, Bloomberg, and CNN. You might notice that each source has a category tag attached.
The possible category options are: business, entertainment, gaming, general, music, politics, science-and-nature, sport, and technology. You might also notice that each source has language (en, de, fr) and country (au, de, gb, in, it, us) tags. The following is the information on the bbc-news source:

```json
{
  "id": "bbc-news",
  "name": "BBC News",
  "description": "Use BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provides trusted World and UK news as well as local and regional perspectives. Also entertainment, business, science, technology and health news.",
  "url": "http://www.bbc.co.uk/news",
  "category": "general",
  "language": "en",
  "country": "gb",
  "urlsToLogos": {
    "small": "",
    "medium": "",
    "large": ""
  },
  "sortBysAvailable": [
    "top"
  ]
}
```

You can get sources for a specific category, language, or country using:

https://newsapi.org/v1/sources?category=business&apiKey=YOUR_API_KEY

The following is part of the response to the preceding query asking for all sources under the business category:

```json
"sources": [
  {
    "id": "bloomberg",
    "name": "Bloomberg",
    "description": "Bloomberg delivers business and markets news, data, analysis, and video to the world, featuring stories from Businessweek and Bloomberg News.",
    "url": "http://www.bloomberg.com",
    "category": "business",
    "language": "en",
    "country": "us",
    "urlsToLogos": {
      "small": "",
      "medium": "",
      "large": ""
    },
    "sortBysAvailable": [
      "top"
    ]
  },
  {
    "id": "business-insider",
    "name": "Business Insider",
    "description": "Business Insider is a fast-growing business site with deep financial, media, tech, and other industry verticals. Launched in 2007, the site is now the largest business news site on the web.",
    "url": "http://www.businessinsider.com",
    "category": "business",
    "language": "en",
    "country": "us",
    "urlsToLogos": {
      "small": "",
      "medium": "",
      "large": ""
    },
    "sortBysAvailable": [
      "top",
      "latest"
    ]
  },
  ...
```
]

Explore the articles:

https://newsapi.org/v1/articles?source=bbc-news&apiKey=YOUR_API_KEY

The following is a sample response:

```json
"articles": [
  {
    "author": "BBC News",
    "title": "US Navy collision: Remains found in hunt for missing sailors",
    "description": "Ten US sailors have been missing since Monday's collision with a tanker near Singapore.",
    "url": "http://www.bbc.co.uk/news/world-us-canada-41013686",
    "urlToImage": "https://ichef1.bbci.co.uk/news/1024/cpsprodpb/80D9/production/_97458923_mediaitem97458918.jpg",
    "publishedAt": "2017-08-22T12:23:56Z"
  },
  {
    "author": "BBC News",
    "title": "Afghanistan hails Trump support in 'joint struggle'",
    "description": "President Ghani thanks Donald Trump for supporting Afghanistan's battle against the Taliban.",
    "url": "http://www.bbc.co.uk/news/world-asia-41012617",
    "urlToImage": "https://ichef.bbci.co.uk/images/ic/1024x576/p05d08pf.jpg",
    "publishedAt": "2017-08-22T11:45:49Z"
  },
  ...
]
```

For each article, the author, title, description, url, urlToImage, and publishedAt fields are provided. Now that we have explored a source of news data that provides up-to-date news stories under various categories, let us go on to build a news bot.

Building a Twitter news bot

Now that we have explored News API, a data source for the latest news updates, and a little bit of what the Twitter API can do, let us combine them both to build a bot that tweets interesting news stories, first on its own timeline and then specifically to each of its followers. Let's build a news tweeter module that tweets the top news article from a given source.
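Because each article carries a publishedAt ISO-8601 timestamp, you could, for instance, pick the most recently published story rather than always taking the first element of the array. The following is an illustrative sketch of our own (the newestArticle helper is not part of the book's code):

```javascript
// Hypothetical helper (not from the book): given the parsed "articles" array
// from News API, return the article with the latest publishedAt timestamp.
function newestArticle(articles) {
  return articles.reduce(function (latest, article) {
    // Date.parse handles the ISO-8601 strings News API returns
    return Date.parse(article.publishedAt) > Date.parse(latest.publishedAt)
      ? article
      : latest;
  });
}

var articles = [
  { title: 'Afghanistan hails Trump support', publishedAt: '2017-08-22T11:45:49Z' },
  { title: 'US Navy collision: Remains found', publishedAt: '2017-08-22T12:23:56Z' }
];
console.log(newestArticle(articles).title); // US Navy collision: Remains found
```

A helper like this could replace the articles[0] selection in the tweeting module below, if you want freshness rather than the API's default ordering.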
The following code uses the tweet() function we built earlier:

```js
topNewsTweeter('cnn', null);

function topNewsTweeter(newsSource, screen_name, status_id) {
  request({
    url: 'https://newsapi.org/v1/articles?source=' + newsSource + '&apiKey=YOUR_API_KEY',
    method: 'GET'
  }, function (error, response, body) {
    //response is from the news data source
    if (!error && response.statusCode == 200) {
      var botResponse = JSON.parse(body);
      console.log(botResponse);
      tweetTopArticle(botResponse.articles, screen_name);
    } else {
      console.log('Sorry. No news.');
    }
  });
}

function tweetTopArticle(articles, screen_name, status_id) {
  var article = articles[0];
  tweet(article.title + " " + article.url, screen_name);
}
```

Run the preceding program to fetch news from CNN and post the top article on Twitter.

Now, let us build a module that tweets news stories from a randomly-chosen source in a list of sources. Note that Math.random() returns a value in [0, 1), so the index must be computed with Math.floor(Math.random() * max), not max + 1, to stay within the bounds of the array:

```js
function tweetFromRandomSource(sources, screen_name, status_id) {
  var max = sources.length;
  var randomSource = sources[Math.floor(Math.random() * max)];
  topNewsTweeter(randomSource, screen_name, status_id);
}
```

Let's call the tweeting module after we acquire the list of sources:

```js
function getAllSourcesAndTweet() {
  var sources = [];
  console.log('getting sources...');
  request({
    url: 'https://newsapi.org/v1/sources?apiKey=YOUR_API_KEY',
    method: 'GET'
  }, function (error, response, body) {
    //response is from the news data source
    if (!error && response.statusCode == 200) {
      // Print out the response body
      var botResponse = JSON.parse(body);
      for (var i = 0; i < botResponse.sources.length; i++) {
        console.log('adding.. ' + botResponse.sources[i].id);
        sources.push(botResponse.sources[i].id);
      }
      tweetFromRandomSource(sources, null, null);
    } else {
      console.log('Sorry. No news sources!');
    }
  });
}
```

Finally, let's create a new JS file called tweeter.js.
In the tweeter.js file, call getAllSourcesAndTweet() to get the process started:

```js
//tweeter.js
var TwitterPackage = require('twitter');
var request = require('request');

console.log("Hello World! I am a twitter bot!");

var secret = {
  consumer_key: 'YOUR_CONSUMER_KEY',
  consumer_secret: 'YOUR_CONSUMER_SECRET',
  access_token_key: 'YOUR_ACCESS_TOKEN_KEY',
  access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET'
}
var Twitter = new TwitterPackage(secret);

getAllSourcesAndTweet();
```

Run the tweeter.js file on the console with node tweeter.js. The bot will tweet a news story every time it is run, choosing a top story at random from around 70 news sources.

Hurray! You have built your very own Twitter news bot. In this tutorial, we covered a lot: we started with the Twitter API and got a taste of how to automatically tweet, retweet, and search for tweets using hashtags. We then explored a news source API that provides news articles from about 70 different newspapers, and integrated it with our Twitter bot to create a news-tweeting bot.

If you found this post useful, do check out the book, Hands-On Chatbots and Conversational UI Development, which will help you explore the world of conversational user interfaces.

Build and train an RNN chatbot using TensorFlow [Tutorial]
Building a two-way interactive chatbot with Twilio: A step-by-step guide
How to create a conversational assistant or chatbot using Python

Classifying flowers in Iris Dataset using Scala [Tutorial]

Savia Lobo
06 Sep 2018
15 min read
The Iris dataset underlies one of the simplest, yet most famous, data analysis tasks in the ML space. In this article, you will build a solution for a data analysis and classification task on the Iris dataset using Scala. This article is an excerpt taken from Modern Scala Projects written by Ilango Gurusamy.

The following diagrams together help in understanding the different components of this project. The pipeline involves training (fitting), transformation, and validation operations. More than one model is trained, and the best model (or mapping function) is selected to give us an accurate approximation for predicting the species of an Iris flower (based on measurements of those flowers):

Project block diagram

A breakdown of the project block diagram is as follows:

- Spark, which represents the Spark cluster and its ecosystem
- Training dataset
- Model
- Dataset attributes or feature measurements
- An inference process that produces a prediction column

The following diagram gives a more detailed description of the different phases in terms of the functions performed in each phase. Later, we will visualize the pipeline in terms of its constituent stages. For now, the diagram depicts four stages, starting with a data pre-processing phase, which is deliberately considered separate from the numbered phases. Think of the pipeline as a two-step process:

1. A data cleansing, or pre-processing, phase. This important phase could include a subphase of Exploratory Data Analysis (EDA) (not explicitly depicted in the diagram).
2. A data analysis phase that begins with feature extraction, followed by model fitting and model validation, all the way to deployment of an Uber pipeline JAR into Spark:

Pipeline diagram

Referring to the preceding diagram, the first implementation objective is to set up Spark inside an SBT project. An SBT project is a self-contained application, which we can run on the command line to predict Iris labels.
In the SBT project, dependencies are specified in a build.sbt file, and our application code will create its own SparkSession and SparkContext. That brings us to a list of implementation objectives, which are as follows:

1. Get the Iris dataset from the UCI Machine Learning Repository.
2. Conduct preliminary EDA in the Spark shell.
3. Create a new Scala project in IntelliJ, and carry out all implementation steps, up to the evaluation of the Random Forest classifier.
4. Deploy the application to your local Spark cluster.

Step 1 - Getting the Iris dataset from the UCI Machine Learning Repository

Head over to the UCI Machine Learning Repository website at https://archive.ics.uci.edu/ml/datasets/iris and click on Download: Data Folder. Extract this folder someplace convenient and copy iris.csv into the root of your project folder. You may refer back to the project overview for an in-depth description of the Iris dataset. The contents of the iris.csv file are depicted here:

A snapshot of the Iris dataset with 150 rows

You may recall that iris.csv is a 150-row file with comma-separated values. Now that we have the dataset, the first step is performing EDA on it. The Iris dataset is multivariate, meaning there is more than one (independent) variable, so we will carry out a basic multivariate EDA on it. But we need a DataFrame to let us do that; how we create a DataFrame as a prelude to EDA is the goal of the next section.

Step 2 - Preliminary EDA

Before we get down to building the SBT pipeline project, we will conduct a preliminary EDA in spark-shell. The plan is to derive a DataFrame from the dataset and then calculate basic statistics on it. We have three tasks at hand for spark-shell:

1. Fire up spark-shell.
2. Load the iris.csv file and build a DataFrame.
3. Calculate the statistics.

We will then port that code over to a Scala file inside our SBT project.
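The three spark-shell tasks above can be sketched as follows. This is an illustration of our own rather than the book's exact session; the species column name is an assumption about the CSV header, and spark is the SparkSession that spark-shell provides:

```scala
// spark-shell EDA sketch (our own illustration, not the book's code).
// Load iris.csv into a DataFrame, inferring column types from the data.
val irisDf = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("iris.csv")

// Basic multivariate statistics: count, mean, stddev, min, max per column
irisDf.describe().show()

// Class balance across the three species (assumes a 'species' header)
irisDf.groupBy("species").count().show()
```

describe() is the quickest way to get per-column summary statistics for a multivariate EDA; the groupBy count confirms the dataset's 50/50/50 class balance before any modeling is attempted.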
That said, let's get down to loading the iris.csv file (inputting the data source) before eventually building the DataFrame.

Step 3 - Creating an SBT project

Lay out your SBT project in a folder of your choice and name it IrisPipeline, or any name that makes sense to you. This folder will hold all of the files needed to implement and run the pipeline on the Iris dataset. The structure of our SBT project looks like the following:

Project structure

We will list dependencies in the build.sbt file. As this is an SBT project, we bring in the following key libraries:

- Spark Core
- Spark MLlib
- Spark SQL

The following screenshot illustrates the build.sbt file:

The build.sbt file with Spark dependencies

The build.sbt file referenced in the preceding snapshot is readily available for you in the book's download bundle. Drill down to the Chapter01 code folder under ModernScalaProjects_Code and copy the folder over to a convenient location on your computer. Drop the iris.csv file that you downloaded in Step 1 into the root folder of our new SBT project. Refer to the earlier screenshot depicting the updated project structure with the iris.csv file inside it.

Step 4 - Creating Scala files in the SBT project

Step 4 is broken down into the following steps:

1. Create the Scala file iris.scala in the com.packt.modern.chapter1 package. Up until now, we relied on the SparkSession and SparkContext that spark-shell gave us. This time around, we need to create our own SparkSession, which will, in turn, give us a SparkContext.

2. In iris.scala, after the package statement, place the following import statement:

```scala
import org.apache.spark.sql.SparkSession
```

3. Create the SparkSession inside a trait, which we shall call IrisWrapper:

```scala
lazy val session: SparkSession = SparkSession.builder().getOrCreate()
```

Just one SparkSession is made available to all classes extending from IrisWrapper.
4. Create a val to hold the iris.csv file path:

```scala
val dataSetPath = "<<path to folder containing your iris.csv file>>\\iris.csv"
```

5. Create a method to build a DataFrame. This method takes the complete path to the Iris dataset as a String and returns a DataFrame:

```scala
def buildDataFrame(dataSet: String): DataFrame = {
  /*
  The following is an example of a dataSet parameter string:
  "C:\\Your\\Path\\To\\iris.csv"
  */
```

6. Import the DataFrame class by updating the previous import statement for SparkSession:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
```

7. Create a nested function inside buildDataFrame to process the raw dataset. Name this function getRows; it takes no parameters but returns Array[(Vector, String)]. The textFile method on the SparkContext processes iris.csv into an RDD[String]:

```scala
val result1: RDD[String] = session.sparkContext.textFile(<<path to iris.csv represented by the dataSetPath variable>>)
```

The resulting RDD contains two partitions. Each partition, in turn, contains rows of strings separated by the newline character '\n'. Each row in the RDD represents its original counterpart in the raw data.

In the next step, we will attempt several data transformation steps, starting by applying a flatMap operation over the RDD and culminating in the creation of the DataFrame. A DataFrame is a view over a Dataset, which happens to be the fundamental data abstraction unit in the Spark 2.0 line.

Step 5 - Preprocessing, data transformation, and DataFrame creation

We get started by invoking flatMap, passing a function block to it, and applying the successive transformations listed as follows, eventually resulting in Array[(org.apache.spark.ml.linalg.Vector, String)]. A Vector represents a row of feature measurements.
The Scala code to give us Array[(org.apache.spark.ml.linalg.Vector, String)] is as follows:

```scala
//Each line in the RDD is a row in the dataset represented by a String,
//which we can 'split' along the newline character
val result2: RDD[String] = result1.flatMap { partition =>
  partition.split("\n").toList
}

//The second transformation splits each line of the dataset along the commas
//separating its elements
val result3: RDD[Array[String]] = result2.map(_.split(","))
```

Next, drop the header column, but not before doing a collect, which returns an Array[Array[String]]:

```scala
val result4: Array[Array[String]] = result3.collect.drop(1)
```

The header column is gone; now import the Vectors class:

```scala
import org.apache.spark.ml.linalg.Vectors
```

Now, transform the Array[Array[String]] into an Array[(Vector, String)]:

```scala
val result5 = result4.map(row => (Vectors.dense(row(1).toDouble, row(2).toDouble, row(3).toDouble, row(4).toDouble), row(5)))
```

Step 6 - Creating training and testing data

Now, let's split our dataset in two by providing a random seed:

```scala
val splitDataSet: Array[org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]] =
  dataSet.randomSplit(Array(0.85, 0.15), 98765L)
```

Our new splitDataSet contains two datasets:

- The train dataset: a dataset containing Array[(Vector, iris-species-label-column: String)]
- The test dataset: a dataset containing Array[(Vector, iris-species-label-column: String)]

Confirm that the new dataset is of size 2:

```scala
splitDataSet.size
// res48: Int = 2
```

Assign the training dataset to a variable, trainDataSet:

```scala
val trainDataSet = splitDataSet(0)
// trainDataSet: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iris-features-column: vector, iris-species-label-column: string]
```

Assign the testing dataset to a variable, testDataSet:

```scala
val testDataSet = splitDataSet(1)
// testDataSet: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iris-features-column: vector, iris-species-label-column: string]
```

Count the number of rows in the training dataset:
```scala
trainDataSet.count
// res12: Long = 136
```

Count the number of rows in the testing dataset:

```scala
testDataSet.count
// res9: Long = 14
```

There are 150 rows in all; with the 0.85/0.15 split, the training set gets the larger share.

Step 7 - Creating a Random Forest classifier

Refer back to Step 5 - DataFrame creation: the DataFrame produced there contains the column names referenced below. The first step in creating a classifier is to pass (hyper)parameters into it. A fairly comprehensive list of parameters looks like this:

- From the DataFrame, the features column name: iris-features-column
- From the DataFrame, the indexed label column name: iris-species-label-column
- The sqrt setting for featureSubsetStrategy
- The number of features to be considered per split (we have 150 observations and four features, which makes our max_features value 2)
- Impurity settings, whose values can be gini and entropy
- The number of trees to train (since the number of trees is greater than one, we set a maximum tree depth), a number equal to the number of nodes
- The required minimum number of feature measurements (sampled observations), also known as the minimum instances per node

Look at the IrisPipeline.scala file for the values of each of these parameters. This time, we will employ an exhaustive grid-search-based model selection process based on combinations of parameters, where parameter value ranges are specified.

Create a RandomForestClassifier instance and set the features column and featureSubsetStrategy:

```scala
val randomForestClassifier = new RandomForestClassifier()
  .setFeaturesCol(irisFeatures_CategoryOrSpecies_IndexedLabel._1)
  .setFeatureSubsetStrategy("sqrt")
```

Start building the Pipeline, which has two stages, an indexer and a classifier:

```scala
val irisPipeline = new Pipeline().setStages(Array[PipelineStage](indexer) ++ Array[PipelineStage](randomForestClassifier))
```

Next, set the hyperparameters: num_trees (the number of trees) on the classifier to 15, a Max_Depth parameter, and an impurity with the two possible values of gini and entropy.
Build a parameter grid with all three hyperparameters:

```scala
val finalParamGrid: Array[ParamMap] = gridBuilder3.build()
```

Step 8 - Training the Random Forest classifier

Next, we want to split our training set into a validation set and a training set, using a TrainValidationSplit estimator:

```scala
val trainValidationSplit = new TrainValidationSplit()
```

On this estimator, set the seed, set the estimator param maps, set the estimator to irisPipeline, and set the training ratio to 0.8:

```scala
val trainValidationSplit = new TrainValidationSplit()
  .setSeed(1234567L)
  .setEstimator(irisPipeline)
```

Finally, do a fit with our training dataset and a transform with our testing dataset. Great! Now the classifier is trained. In the next step, we will apply this classifier to the testing data.

Step 9 - Applying the Random Forest classifier to test data

The purpose of our validation set is to let us make a choice between models; we want an evaluation metric and hyperparameter tuning. We complete the validation estimator, TrainValidationSplit, which splits the training set into a validation set and a training set, by setting an evaluator on it:

```scala
trainValidationSplit.setEvaluator(new MulticlassClassificationEvaluator())
```

Next, we fit this estimator over the training dataset to produce a model, a transformer that we then use to transform our testing dataset. Finally, we perform a validation for hyperparameter tuning by applying an evaluator for a metric.
The new validatedTestResults DataFrame should look something like this:

```
+--------------------+-------------------+-----+--------------+-------------+----------+
|iris-features-column|iris-species-column|label| rawPrediction|  probability|prediction|
+--------------------+-------------------+-----+--------------+-------------+----------+
|   [4.4,3.2,1.3,0.2]|        Iris-setosa|  0.0|[40.0,0.0,0.0]|[1.0,0.0,0.0]|       0.0|
|   [5.4,3.9,1.3,0.4]|        Iris-setosa|  0.0|[40.0,0.0,0.0]|[1.0,0.0,0.0]|       0.0|
|   [5.4,3.9,1.7,0.4]|        Iris-setosa|  0.0|[40.0,0.0,0.0]|[1.0,0.0,0.0]|       0.0|
+--------------------+-------------------+-----+--------------+-------------+----------+
```

Let's return a new dataset by passing in column expressions for prediction and label:

```scala
val validatedTestResultsDataset: DataFrame = validatedTestResults.select("prediction", "label")
```

In this line of code, we produced a new DataFrame with two columns:

- An input label column
- A predicted label column, which is compared with its corresponding value in the input label column

That brings us to the next step, an evaluation step. We want to know how well our model performed; that is the goal of the next step.

Step 10 - Evaluating the Random Forest classifier

In this section, we will test the accuracy of the model. Any ML process is incomplete without an evaluation of the classifier. That said, we perform the evaluation as a two-step process:

1. Evaluate the model output by passing in the hyperparameters:

```scala
val modelOutputAccuracy: Double = new MulticlassClassificationEvaluator()
```

Set the label column, a metric name, and the prediction column label, then invoke evaluation with the validatedTestResults dataset. Note the accuracy of the model output on the testing dataset from the modelOutputAccuracy variable.

2. The other metric to evaluate is how close the predicted label value in the prediction column is to the actual label value in the (indexed) label column. Next, we extract the metrics:

```scala
val multiClassMetrics = new MulticlassMetrics(validatedRDD2)
```

Our pipeline produced predictions, and as with any prediction, we need to have a healthy degree of skepticism.
Naturally, we want a sense of how our engineered prediction process performed; the algorithm did all the heavy lifting for us in this regard. Everything we did in this step was done for the purpose of evaluation: we wanted to know how close the predicted values were to the actual label values. To obtain that knowledge, we use the MulticlassMetrics class to evaluate metrics that give us a measure of the performance of the model via two methods:

- Accuracy
- Weighted precision

```scala
val accuracyMetrics = (multiClassMetrics.accuracy, multiClassMetrics.weightedPrecision)
val accuracy = accuracyMetrics._1
val weightedPrecision = accuracyMetrics._2
```

These metrics represent the evaluation results for our classifier, or classification model. In the next step, we will run the application as a packaged SBT application.

Step 11 - Running the pipeline as an SBT application

At the root of your project folder, issue the sbt console command, and in the Scala shell, import the IrisPipeline object and then invoke the main method of IrisPipeline with the argument iris:

```
sbt console
scala> import com.packt.modern.chapter1.IrisPipeline
scala> IrisPipeline.main(Array("iris"))
Accuracy (precision) is 0.9285714285714286
Weighted Precision is: 0.9428571428571428
```

Step 12 - Packaging the application

In the root folder of your SBT application, run:

```
sbt package
```

When SBT is done packaging, the Uber JAR can be deployed into our cluster using spark-submit; but since we are in standalone deploy mode, it will be deployed into [local]:

The application JAR file

The package command created a JAR file that is available under the target folder. In the next section, we will deploy the application into Spark.

Step 13 - Submitting the pipeline application to Spark local

At the root of the application folder, issue the spark-submit command with the class and JAR file path arguments, respectively.
If everything went well, the application does the following:

1. Loads up the data.
2. Performs EDA.
3. Creates the training, testing, and validation datasets.
4. Creates a Random Forest classifier model.
5. Trains the model.
6. Tests the accuracy of the model. This is the most important part: the ML classification task. To accomplish this, we apply our trained Random Forest classifier model to the test dataset, which consists of Iris flower data so far not seen by the model. Unseen data is nothing but Iris flowers picked in the wild. Applying the model to the test dataset results in a prediction about the species of an unseen (new) flower.
7. Runs an evaluation process, which essentially checks whether the model reports the correct species.
8. Lastly, reports back on how important a certain feature of the Iris flower turned out to be. As a matter of fact, the petal width turns out to be more important than the sepal width in carrying out the classification task.

Thus, we implemented an ML workflow, or ML pipeline. The pipeline combined several stages of data analysis into one workflow. We started by loading the data; from there on, we created training and test data, preprocessed the dataset, trained the RandomForestClassifier model, applied the classifier to the test data, evaluated the classifier, and computed a process that demonstrated the importance of each feature in the classification.

If you've enjoyed reading this post, visit the book, Modern Scala Projects, to build efficient data science projects that fulfill your software requirements.

Deep Learning Algorithms: How to classify Irises using multi-layer perceptrons
Introducing Android 9 Pie, filled with machine learning and baked-in UI features
Paper in Two minutes: A novel method for resource efficient image classification

Intelligent mobile projects with TensorFlow: Build your first Reinforcement Learning model on Raspberry Pi [Tutorial]

Bhagyashree R
05 Sep 2018
13 min read
OpenAI Gym (https://gym.openai.com) is an open source Python toolkit that offers many simulated environments to help you develop, compare, and train reinforcement learning algorithms, so you don't have to buy all the sensors and train your robot in the real environment, which can be costly in both time and money. In this article, we'll show you how to develop and train a reinforcement learning model on Raspberry Pi using TensorFlow in an OpenAI Gym simulated environment called CartPole (https://gym.openai.com/envs/CartPole-v0). This tutorial is an excerpt from a book written by Jeff Tang titled Intelligent Mobile Projects with TensorFlow.

To install OpenAI Gym, run the following commands:

```
git clone https://github.com/openai/gym.git
cd gym
sudo pip install -e .
```

You can verify that you have TensorFlow 1.6 and gym installed by running pip list:

```
pi@raspberrypi:~ $ pip list
gym (0.10.4, /home/pi/gym)
tensorflow (1.6.0)
```

Or you can start IPython and then import TensorFlow and gym:

```
pi@raspberrypi:~ $ ipython
Python 2.7.9 (default, Sep 17 2016, 20:26:04)
IPython 5.5.0 -- An enhanced Interactive Python.

In [1]: import tensorflow as tf
In [2]: import gym
In [3]: tf.__version__
Out[3]: '1.6.0'
In [4]: gym.__version__
Out[4]: '0.10.4'
```

We're now all set to use TensorFlow and gym to build some interesting reinforcement learning models running on Raspberry Pi.

Understanding the CartPole simulated environment

CartPole is an environment that can be used to train a robot to stay in balance. In the CartPole environment, a pole is attached to a cart, which moves horizontally along a track. You can take an action of 1 (accelerating right) or 0 (accelerating left) on the cart. The pole starts upright, and the goal is to prevent it from falling over. A reward of 1 is provided for every time step that the pole remains upright. An episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center.
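The termination rule just described can be expressed as a tiny predicate. This is a sketch of our own based on the thresholds quoted above; gym's actual implementation uses its own internal constants and radians rather than degrees:

```python
# Sketch of the episode-termination rule described in the text (our own
# illustration, not gym's source). An episode ends when the pole leans more
# than 15 degrees from vertical or the cart drifts more than 2.4 units
# from the center of the track.
def episode_done(cart_position, pole_angle_degrees):
    return abs(pole_angle_degrees) > 15.0 or abs(cart_position) > 2.4

print(episode_done(0.0, 3.0))    # False: pole nearly upright, cart centered
print(episode_done(0.0, 16.5))   # True: pole has fallen too far
print(episode_done(-2.5, 1.0))   # True: cart has run off the allowed track
```

Keeping this rule in mind helps interpret the done flag returned by env.step() in the experiments that follow.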
Let's play with the CartPole environment now. First, create a new environment and find out the possible actions an agent can take in the environment: env = gym.make("CartPole-v0") env.action_space # Discrete(2) env.action_space.sample() # 0 or 1 Every observation (state) consists of four values about the cart: its horizontal position, its velocity, its pole's angle, and its angular velocity: obs=env.reset() obs # array([ 0.04052535, 0.00829587, -0.03525301, -0.00400378]) Each step (action) in the environment will result in a new observation, a reward of the action, whether the episode is done (if it is then you can't take any further steps), and some additional information: obs, reward, done, info = env.step(1) obs # array([ 0.04069127, 0.2039052 , -0.03533309, -0.30759772]) Remember action (or step) 1 means moving right, and 0 left. To see how long an episode can last when you keep moving the cart right, run: while not done: obs, reward, done, info = env.step(1) print(obs) #[ 0.08048328 0.98696604 -0.09655727 -1.54009127] #[ 0.1002226 1.18310769 -0.12735909 -1.86127705] #[ 0.12388476 1.37937549 -0.16458463 -2.19063676] #[ 0.15147227 1.5756628 -0.20839737 -2.52925864] #[ 0.18298552 1.77178219 -0.25898254 -2.87789912] Let's now manually go through a series of actions from start to end and print out the observation's first value (the horizontal position) and third value (the pole's angle in degrees from vertical) as they're the two values that determine whether an episode is done. 
First, reset the environment and accelerate the cart right a few times: import numpy as np obs=env.reset() obs[0], obs[2]*360/np.pi # (0.008710582898326602, 1.4858315848689436) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.009525842685697472, 1.5936049816642313) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.014239775393474322, 1.040038643681757) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.0228521194217381, -0.17418034908781568) You can see that the cart's position value gets bigger and bigger as it's moved right, the pole's vertical degree gets smaller and smaller, and the last step shows a negative degree, meaning the pole is going to the left side of the center. All this makes sense, with just a little vivid picture in your mind of your favorite dog pushing a cart with a pole. Now change the action to accelerate the cart left (0) a few times: obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.03536432554326476, -2.0525933052704954) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04397450935915654, -3.261322987287562) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04868738508385764, -3.812330822419413) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04950617929263011, -3.7134404042580687) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04643238384389254, -2.968245724428785) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.039465670006712444, -1.5760901885345346) You may be surprised at first to see the 0 action causes the positions (obs[0]) to continue to get bigger for several times, but remember that the cart is moving at a velocity and one or several actions of moving the cart to the other direction won't decrease the position value immediately. But if you keep moving the cart to the left, you'll see that the cart's position starts becoming smaller (toward the left). 
Now continue the 0 action and you'll see the position gets smaller and smaller, with a negative value meaning the cart enters the left side of the center, while the pole's angle gets bigger and bigger:

obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (0.028603948219811447, 0.46789197320636305)
obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (0.013843572459953138, 3.1726728882727504)
obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (-0.00482029774222077, 6.551160678086707)
obs, reward, done, info = env.step(0)
obs[0], obs[2]*360/np.pi
# (-0.02739315127299434, 10.619948631208114)

For the CartPole environment, the reward value returned in each step call is always 1, and the info is always {}. So that's all there is to know about the CartPole simulated environment. Now that we understand how CartPole works, let's see what kinds of policies we can come up with so that at each state (observation), we can let the policy tell us which action (step) to take in order to keep the pole upright for as long as possible; in other words, to maximize our rewards.

Using neural networks to build a better policy

Let's first see how to build a random policy using a simple fully connected (dense) neural network, which takes the 4 values in an observation as input, uses a hidden layer of 4 neurons, and outputs the probability of the 0 action, based on which the agent can sample the next action between 0 and 1. To follow along, you can download the code files from the book's GitHub repository.
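For comparison, the "basic simple policy" that the training code further below uses as its target is a hand-written rule that is not shown in this excerpt; a common version simply pushes the cart toward the side the pole is leaning. A minimal sketch in plain Python (the function name is ours):

```python
def basic_policy(obs):
    # obs[2] is the pole angle: accelerate left (action 0) when the
    # pole leans left, right (action 1) when it leans right.
    return 0 if obs[2] < 0 else 1

print(basic_policy([0.04, 0.0, -0.03, 0.0]))  # 0 (pole leaning left)
print(basic_policy([0.04, 0.0, 0.02, 0.0]))   # 1 (pole leaning right)
```

Keeping the cart under the pole this way typically survives a few dozen steps, which gives a baseline for the neural network policies that follow.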
# nn_random_policy.py
import tensorflow as tf
import numpy as np
import gym

env = gym.make("CartPole-v0")
num_inputs = env.observation_space.shape[0]

inputs = tf.placeholder(tf.float32, shape=[None, num_inputs])
hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu)
outputs = tf.layers.dense(hidden, 1, activation=tf.nn.sigmoid)
action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    total_rewards = []
    for _ in range(1000):
        rewards = 0
        obs = env.reset()
        while True:
            a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)})
            obs, reward, done, info = env.step(a[0][0])
            rewards += reward
            if done:
                break
        total_rewards.append(rewards)
    print(np.mean(total_rewards))

Note that we use the tf.multinomial function to sample an action based on the probability distribution of actions 0 and 1, defined as outputs and 1-outputs, respectively (the sum of the two probabilities is 1). The mean of the total rewards will be around 20-something. This is a neural network that is generating a random policy, with no training at all.
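What tf.multinomial does here can be mimicked in plain Python to make the sampling step concrete (a sketch; the injectable rng argument is ours, added only so the behavior is easy to check):

```python
import random

def sample_action(prob_action_0, rng=random.random):
    # Draw action 0 with probability prob_action_0, otherwise action 1,
    # mirroring tf.multinomial over the distribution [p, 1 - p].
    return 0 if rng() < prob_action_0 else 1

print(sample_action(0.7, rng=lambda: 0.50))  # 0 (draw falls below p)
print(sample_action(0.7, rng=lambda: 0.95))  # 1 (draw falls above p)
```

Sampling, rather than always taking the more probable action, is what lets the untrained network explore both actions.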
To train the network, we use tf.nn.sigmoid_cross_entropy_with_logits to define the loss function between the network output and the desired y_target action, defined using the basic simple policy in the previous subsection, so we expect this neural network policy to achieve about the same rewards as the basic non-neural-network policy:

# nn_simple_policy.py
import tensorflow as tf
import numpy as np
import gym

env = gym.make("CartPole-v0")
num_inputs = env.observation_space.shape[0]

inputs = tf.placeholder(tf.float32, shape=[None, num_inputs])
y = tf.placeholder(tf.float32, shape=[None, 1])
hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 1)
outputs = tf.nn.sigmoid(logits)
action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1)
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)
optimizer = tf.train.AdamOptimizer(0.01)
training_op = optimizer.minimize(cross_entropy)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        obs = env.reset()
        while True:
            y_target = np.array([[1. if obs[2] < 0 else 0.]])
            a, _ = sess.run([action, training_op], feed_dict={inputs: obs.reshape(1, num_inputs), y: y_target})
            obs, reward, done, info = env.step(a[0][0])
            if done:
                break
    print("training done")

We define outputs as a sigmoid function of the logits net output, that is, the probability of action 0, and then use tf.multinomial to sample an action. Note that we use the standard tf.train.AdamOptimizer and its minimize method to train the network. To test and see how good the policy is, run the following code:

total_rewards = []
for _ in range(1000):
    rewards = 0
    obs = env.reset()
    while True:
        a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)})
        obs, reward, done, info = env.step(a[0][0])
        rewards += reward
        if done:
            break
    total_rewards.append(rewards)
print(np.mean(total_rewards))

We're now all set to explore how we can implement a policy gradient method on top of this to make our neural network perform much better, getting rewards several times larger. The basic idea of a policy gradient is that in order to train a neural network to generate a better policy, when all an agent knows from the environment is the rewards it can get when taking an action from any given state, we can adopt two new mechanisms:

Discounted rewards: Each action's value needs to consider its future action rewards. For example, an action that gets an immediate reward of 1 but ends the episode two actions (steps) later should have fewer long-term rewards than an action that gets an immediate reward of 1 but ends the episode 10 steps later.

Test runs with gradient updates: Test run the current policy and see which actions lead to higher discounted rewards, then update the current policy's gradients (of the loss for weights) with the discounted rewards, in such a way that an action with higher discounted rewards will, after the network update, have a higher probability of being chosen next time. Repeat such test runs and updates many times to train a neural network for a better policy.

Implementing a policy gradient in TensorFlow

Let's now see how to implement a policy gradient for our CartPole problem in TensorFlow.
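Before wiring discounting into TensorFlow, the idea itself fits in a few lines of plain Python (a sketch; the helper in the next listing additionally normalizes the result by its mean and standard deviation):

```python
def discounted(rewards, discount_rate=0.95):
    # Walk backwards: each step is worth its own reward plus the
    # discounted value of everything that follows it.
    out = [0.0] * len(rewards)
    running = 0.0
    for i in reversed(range(len(rewards))):
        running = rewards[i] + discount_rate * running
        out[i] = running
    return out

# Earlier steps accumulate more future value than later ones.
print(discounted([1, 1, 1], discount_rate=0.5))  # [1.75, 1.5, 1.0]
```

This is why an action taken early in a long episode ends up credited with more value than one taken just before a failure.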
First, import tensorflow, numpy, and gym, and define a helper method that calculates the normalized and discounted rewards: import tensorflow as tf import numpy as np import gym def normalized_discounted_rewards(rewards): dr = np.zeros(len(rewards)) dr[-1] = rewards[-1] for n in range(2, len(rewards)+1): dr[-n] = rewards[-n] + dr[-n+1] * discount_rate return (dr - dr.mean()) / dr.std() Next, create the CartPole gym environment, define the learning_rate and discount_rate hyper-parameters, and build the network with four input neurons, four hidden neurons, and one output neuron as before: env = gym.make("CartPole-v0") learning_rate = 0.05 discount_rate = 0.95 num_inputs = env.observation_space.shape[0] inputs = tf.placeholder(tf.float32, shape=[None, num_inputs]) hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu) logits = tf.layers.dense(hidden, 1) outputs = tf.nn.sigmoid(logits) action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1) prob_action_0 = tf.to_float(1-action) cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=prob_action_0) optimizer = tf.train.AdamOptimizer(learning_rate) To manually fine-tune the gradients to take into consideration the discounted rewards for each action we first use the compute_gradients method, then update the gradients the way we want, and finally call the apply_gradients method. 
So let's now  compute the gradients of the cross-entropy loss for the network parameters (weights and biases), and set up gradient placeholders, which are to be fed later with the values that consider both the computed gradients and the discounted rewards of the actions taken using the current policy during test run: gvs = optimizer.compute_gradients(cross_entropy) gvs = [(g, v) for g, v in gvs if g != None] gs = [g for g, _ in gvs] gps = [] gvs_feed = [] for g, v in gvs: gp = tf.placeholder(tf.float32, shape=g.get_shape()) gps.append(gp) gvs_feed.append((gp, v)) training_op = optimizer.apply_gradients(gvs_feed) The  gvs returned from optimizer.compute_gradients(cross_entropy) is a list of tuples, and each tuple consists of the gradient (of the cross_entropy for a trainable variable) and the trainable variable. If you run the script multiple times from IPython, the default graph of the tf object will contain trainable variables from previous runs, so unless you call tf.reset_default_graph(), you need to use gvs = [(g, v) for g, v in gvs if g != None] to remove those obsolete training variables, which would return None gradients. 
Now, play some games and save the rewards and gradient values: with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for _ in range(1000): rewards, grads = [], [] obs = env.reset() # using current policy to test play a game while True: a, gs_val = sess.run([action, gs], feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards.append(reward) grads.append(gs_val) if done: break After the test play of a game, update the gradients with discounted rewards and train the network (remember that training_op is defined as optimizer.apply_gradients(gvs_feed)): # update gradients and do the training nd_rewards = normalized_discounted_rewards(rewards) gp_val = {} for i, gp in enumerate(gps): gp_val[gp] = np.mean([grads[k][i] * reward for k, reward in enumerate(nd_rewards)], axis=0) sess.run(training_op, feed_dict=gp_val) Finally, after 1,000 iterations of test play and updates, we can test the trained model: total_rewards = [] for _ in range(100): rewards = 0 obs = env.reset() while True: a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards += reward if done: break total_rewards.append(rewards) print(np.mean(total_rewards)) Note that we now use the trained policy network and sess.run to get the next action with the current observation as input. The output mean of the total rewards will be about 200. 
You can also save a trained model after the training using tf.train.Saver:

saver = tf.train.Saver()
saver.save(sess, "./nnpg.ckpt")

Then you can reload it in a separate test program with:

with tf.Session() as sess:
    saver.restore(sess, "./nnpg.ckpt")

Now that you have a powerful neural-network-based policy model that can help your robot stay balanced, fully tested in a simulated environment, you can deploy it in a real physical environment, after replacing the simulated environment's API returns with real environment data, of course, but the code to build and train the neural network reinforcement learning model can certainly be easily reused. If you liked this tutorial and would like to learn more such techniques, pick up this book, Intelligent Mobile Projects with TensorFlow, authored by Jeff Tang.

Read Next

AI on mobile: How AI is taking over the mobile devices marketspace
Introducing Intelligent Apps
AI and the Raspberry Pi: Machine Learning and IoT, What's the Impact?
How to use artificial intelligence to create games with rich and interactive environments [Tutorial]

Sugandha Lahoti
04 Sep 2018
10 min read
Many of the most popular games on the planet have one thing in common: they all have rich, vivid worlds for the player to inhabit and interact with. This doesn't just mean a huge terrain or an extensive map (although it might do), it could simply be how things appear within the world. Similarly, it's not just about the environment - it's also about characters who are able to react in different ways according to the game. The only way to achieve an impressive level of 'realism' is through powerful artificial intelligence. This isn't easy, but it can be done. And learning how to do it will be well worth it, as it will create a much more engaging end product for players. This tutorial is taken from the book Practical Game AI Programming by Micael DaGraca. This book teaches you to create Game AI and implement cutting-edge AI algorithms from scratch. Let's take a look at how we can use AI to create rich environments.

Breaking down the game environment by area

When we create a map, often we have two or more different areas that could be used to change the gameplay, areas that could contain water, quicksand, flying zones, caves, and much more. If we wish to create an AI character that can be used in any level of our game, and anywhere, we need to take this into consideration and make the AI aware of the different zones of the map. Usually, that means that we need to input more information into the character's behavior, including how to react according to the position in which he is currently placed, or a situation where he can choose where to go. Should he avoid some areas? Should he prefer others? This type of information is relevant because it makes the character aware of the surroundings, choosing or adapting and taking into consideration his position. Not planning this correctly can lead to some unnatural decisions.
For example, in Elder Scrolls V: Skyrim developed by Bethesda Softworks studio, we can watch some AI characters of the game simply turning back when they do not have information about how they should behave in some parts of the map, especially on mountains or rivers. Depending on the zones that our character finds, he might react differently or update his behavior tree to adapt to his environment. The environment that surrounds our characters can redefine their priorities or completely change their behaviors. This is a little similar to what Jean-Jacques Rousseau said about humanity: "We are good by nature, but corrupted by society." As humans, we are a representation of the environment that surrounds us, and for that reason, artificial intelligence should follow the same principle. Let's pick a  soldier and update his code to work on a different scenario. We want to change his behavior according to three different zones, beach, river, and forest. So, we'll create three public static Boolean functions with the names Beach, Forest and River; then we define the zones on the map that will turn them on or off. public static bool Beach; public static bool River; public static bool Forest; Because in this example, just one of them can be true at a time, we'll add a simple line of code that disables the other options once one of them gets activated. if(Beach == true) { Forest = false; River = false; } if(Forest == true){ Beach = false; River = false; } if(River == true){ Forest = false; Beach = false; } Once we have that done, we can start defining the different behaviors for each zone. For example, in the beach zone, the characters don't have a place to get cover, so that option needs to be taken away and updated with a new one. The river zone can be used to get across to the other side, so the character can hide from the player and attack from that position. To conclude, we can define the character to be more careful and use the trees to get cover. 
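The three mutually exclusive booleans above amount to a single state variable. For illustration, the same design can be expressed with an enumeration, sketched here in Python rather than the article's C# (the names are ours), which makes it impossible for two zones to be active at once:

```python
from enum import Enum

class Zone(Enum):
    BEACH = "beach"
    RIVER = "river"
    FOREST = "forest"

# One variable holds the current zone, so the manual "turn the other
# flags off" bookkeeping from the boolean version disappears.
current_zone = Zone.FOREST
print(current_zone is Zone.FOREST)  # True
print(current_zone is Zone.BEACH)   # False
```

The equivalent in Unity C# would be an enum field on the character's script; either way, zone-specific behavior becomes a switch on one value instead of three interdependent flags.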
Depending on the zones, we can change the values to better adapt to the environment, or create new functions that would allow us to use some specific characteristics of that zone. if (Forest == true) {// The AI will remain passive until an interaction with the player occurs if (Health == 100 && triggerL == false && triggerR == false && triggerM == false) { statePassive = true; stateAggressive = false; stateDefensive = false; } // The AI will shift to the defensive mode if player comes from the right side or if the AI is below 20 HP if (Health <= 100 && triggerR == true || Health <= 20) { statePassive = false; stateAggressive = false; stateDefensive = true; } // The AI will shift to the aggressive mode if player comes from the left side or it's on the middle and AI is above 20HP if (Health > 20 && triggerL == true || Health > 20 && triggerM == true) { statePassive = false; stateAggressive = true; stateDefensive = false; } walk = speed * Time.deltaTime; walk = speedBack * Time.deltaTime; } Advanced environment interactions with AI As the video game industry and the technology associated with it kept evolving, new gameplay ideas appeared, and rapidly, the interaction between the characters of the game and the environment became even more interesting, especially when using physics. This means that the outcome of the environment could be completely random, where it was required for the AI characters to constantly adapt to different situations. One honorable mention on this subject is the video game Worms developed by Team17, where the map can be fully destroyed and the AI characters of the game are able to adapt and maintain smart decisions. The objective of this game is to destroy the opponent team by killing all their worms, the last man standing wins. From the start, the characters can find some extra health points or ammunition on the map and from time to time, it drops more points from the sky. 
So, there are two main objectives for the character: survive and kill. To survive, he needs to keep a decent amount of HP and stay away from the enemy; to kill, he needs to choose the best character to shoot and take as much health as possible from him. Meanwhile, the map gets destroyed by the bombs and all of the firepower used by the characters, making it a challenge for artificial intelligence.

Adapting to unstable terrain

Let's decompose this example and create a character that could be used in this game. We'll start by looking at the map. At the bottom, there's water that automatically kills the worms. Then, we have the terrain where the worms can walk, or destroy if needed. Finally, there's the absence of terrain, specifically, the empty space that cannot be walked on. Then we have the characters (worms): they are placed in random positions at the beginning of the game, and they can walk, jump, and shoot. The characters of the game should be able to constantly adapt to the instability of the terrain, so we need to use that and make it part of the behavior tree. As demonstrated in the diagram above, the character will need to understand the position where he is currently placed, as well as the opponent's position, health, and items. Because the terrain can be blocking them, the AI character has a chance of being in a situation where he cannot attack or obtain an item. So, we give him options on what to do in those situations and many others that he might find, but the most important thing is to define what happens if he cannot successfully accomplish any of them. Because the terrain can be shaped into different forms, there will be times during gameplay when it is nearly impossible to do anything, and that is why we need to provide options for those situations. For example, in a situation where the worm doesn't have enough free space to move, a close item to pick up, or an enemy that can be properly attacked, what should he do?
It's necessary to make information about the surroundings available to our character so he can make a good judgment for that situation. In this scenario, we have defined our character to shoot anyway against the closest enemy, or to stay close to a wall. Because he is too close to the explosion that would occur from attacking the closest enemy, he should decide to stay in a corner and wait there until the next turn.

Using raycast to evaluate decisions

Ideally, at the start of the turn, the character has two raycasts, one for his left side and another for the right side. These check whether there's a wall obstructing either of those directions, which can be used to determine what side the character should move toward if he wants to protect himself from being attacked. Then, we would use another raycast in the aim direction to see if there's something blocking the way when the character is preparing to shoot. If there's something in the middle, the character should calculate the distance between the two to determine if it's still safe to shoot. So, each character should have a shared list of all of the worms that are currently in the game; that way, they can compare the distances between them all, choose which of them is closest, and shoot it. Additionally, we add the two raycasts to check if there's something blocking the sides, and we have the basic information to make the character adapt to the constant modifications of the terrain.
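The "choose the closest worm from the shared list" step can be sketched independently of Unity (Python used for illustration; note that the C# listing that follows approximates proximity with a simple sum of coordinate deltas instead of a true distance):

```python
def closest_enemy(me, enemies):
    # me and each enemy are (x, y) positions; return the enemy at the
    # smallest straight-line distance so the worm knows whom to target.
    def dist(enemy):
        dx, dy = enemy[0] - me[0], enemy[1] - me[1]
        return (dx * dx + dy * dy) ** 0.5
    return min(enemies, key=dist)

print(closest_enemy((0, 0), [(3, 4), (1, 1), (10, 0)]))  # (1, 1)
```

Comparing squared distances would work just as well here and avoids the square root; the point is that every worm ranks the others by distance before deciding whether the nearest one is safe to attack.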
public int HP; public int Ammunition; public static List<GameObject> wormList = new List<GameObject>(); //creates a list with all the worms public static int wormCount; //Amount of worms in the game public int ID; //It's used to differentiate the worms private float proximityValueX; private float proximityValueY; private float nearValue; public float distanceValue; //how far the enemy should be private bool canAttack; void Awake () { wormList.Add(gameObject); //add this worm to the list wormCount++; //adds plus 1 to the amount of worms in the game } void Start () { HP = 100; distanceValue = 30f; } void Update () { proximityValueX = wormList[1].transform.position.x - this.transform.position.x; proximityValueY = wormList[1].transform.position.y - this.transform.position.y; nearValue = proximityValueX + proximityValueY; if(nearValue <= distanceValue) { canAttack = true; } else { canAttack = false; } Vector3 raycastRight = transform.TransformDirection(Vector3.forward); if (Physics.Raycast(transform.position, raycastRight, 10)) print("There is something blocking the Right side!"); Vector3 raycastLEft = transform.TransformDirection(Vector3.forward); if (Physics.Raycast(transform.position, raycastRight, -10)) print("There is something blocking the Left side!"); } In this post, we explored different ways to interact with the environment. First, we learned how to break down the game environment by area. Then we learned about the advanced environment interactions with AI. To learn about manipulating animation behavior with AI read our book  Practical Game AI Programming. Read Next Developing Games Using AI Techniques and Practices of Game AI Unite Berlin 2018 Keynote: Unity partners with Google, launches Ml-Agents ToolKit 0.4, Project MARS and more
Implementing cost-effective IoT analytics for predictive maintenance [Tutorial]

Prasad Ramesh
04 Sep 2018
10 min read
Predictive maintenance is a common value proposition cited for IoT analytics. In this tutorial, we will look at a value formula for net savings. Then we walk through an example as a way to highlight how to think financially about when it makes sense to implement a decision and when it does not. The economics of predictive maintenance may not be entirely obvious. Believe it or not, it does not always make sense, even if you can predict early failures accurately. In many cases, you will actually lose money by doing it. Even when it can save you money, there is an optimal point for when it should be used. The optimal point depends on the costs and the accuracy of the predictive model. This article is an excerpt from a book written by Andrew Minteer titled Analytics for the Internet of Things (IoT).

The value formula

A formula to guide decision making compares the cost of allowing a failure to occur versus the cost to proactively repair the component while considering the probability of predicting the failure:

Net Savings = (Cost of Failure * (Expected Number of Failures - Expected True Positive Predictions)) - (Proactive Repair Cost * (Expected True Positives + Expected False Positives))

If the cost of failure is the same as the proactive repair cost, even with a perfect prediction model, then there will be no savings. Make sure to include intangible costs in the cost of failure. Some examples of intangible costs include legal expenses, loss of brand equity, and even the customer's expenses. Predictive repair does make sense when there is a large spread between the cost of failure and the cost of proactive replacement, combined with a well-performing prediction model. For example, if the cost of a failure is a locomotive engine replacement at $1 million USD and the cost of a proactive repair is $200 USD, then the accuracy of the model does not even have to be all that great before a proactive replacement program makes financial sense.
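Measured against a do-nothing baseline, the pieces of this tradeoff can be turned into a small calculator. A sketch under stated assumptions: we read net savings as the failure costs avoided by true positives minus the cost of every proactive repair performed, the break-even helper applies the same logic per unit, and the prediction counts used below are illustrative, not taken from the article:

```python
def net_savings(cost_failure, cost_proactive, true_positives, false_positives):
    # Failures avoided (true positives) save the full failure cost;
    # every proactive repair (true or false positive) costs money.
    avoided = cost_failure * true_positives
    spent = cost_proactive * (true_positives + false_positives)
    return avoided - spent

def breakeven_probability(cost_failure, cost_proactive):
    # A single unit is worth repairing proactively only when its expected
    # failure cost (p * cost_failure) exceeds the certain repair cost.
    return cost_proactive / cost_failure

# Equal costs plus a perfect model (no false positives) break even,
# matching the observation above.
print(net_savings(253, 253, true_positives=100, false_positives=0))   # 0
print(net_savings(1000, 253, true_positives=75, false_positives=30))  # 48435
print(breakeven_probability(1000, 253))  # 0.253
```

With a large spread (the locomotive example), the break-even probability is tiny, so even a rough model pays off; with a narrow spread (the turbocharger example that follows, 350/400), it approaches 1 and demands a very accurate model.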
On the other hand, if the failure is a $400 USD automotive turbocharger replacement, and the proactive repair cost is $350 USD for a turbocharger actuator subcomponent replacement, the predictive model would need to be highly accurate for that to make financial sense.

An example of making a value decision

To illustrate the example, we will walk through a business situation and then some R code that simulates a cost-benefit curve for that decision. The code will use a fitted predictive model to calculate the net savings (or lack thereof) to generate a cost curve. The cost curve can then be used in a business decision on what proportion of units with predicted failures should have a proactive replacement. Imagine you work for a company that builds diesel-powered generators. There is a coolant control valve that normally lasts for 4,000 hours of operation until there is a planned replacement. From the analysis, your company has realized that the generators built two years prior are experiencing earlier-than-expected failure of the valve. When the valve fails, the engine overheats and several other components are damaged. The cost of failure, including labor rates for repair personnel and the cost to the customer for downtime, is an average of $1,000 USD. The cost of a proactive replacement of the valve is $253 USD. Should you replace all coolant valves in the population? It depends on how high a failure rate is expected. In this case, about 10% of the current non-failed units are expected to fail before the scheduled replacement. Also, importantly, it matters how well you can predict the failures. The following R code simulates this situation and uses a simple predictive model (logistic regression) to estimate a cost curve. The model has an AUC of close to 0.75.
This will vary as you run the code since the dataset is randomly simulated:

#make sure all needed packages are installed
if(!require(caret)){ install.packages("caret") }
if(!require(pROC)){ install.packages("pROC") }
if(!require(dplyr)){ install.packages("dplyr") }
if(!require(data.table)){ install.packages("data.table") }

#Load required libraries
library(caret)
library(pROC)
library(dplyr)
library(data.table)

#Generate sample data
simdata = function(N=1000) {
  #simulate 4 features
  X = data.frame(replicate(4,rnorm(N)))
  #create a hidden data structure to learn
  hidden = X[,1]^2+sin(X[,2]) + rnorm(N)*1
  #10% TRUE, 90% FALSE
  rare.class.probability = 0.1
  #simulate the true classification values
  y.class = factor(hidden<quantile(hidden,c(rare.class.probability)))
  return(data.frame(X,Class=y.class))
}

#make some data structure
model_data = simdata(N=50000)

#train a logistic regression model on the simulated data
training <- createDataPartition(model_data$Class, p = 0.6, list=FALSE)
trainData <- model_data[training,]
testData <- model_data[-training,]
glmModel <- glm(Class ~ ., data=trainData, family=binomial)
testData$predicted <- predict(glmModel, newdata=testData, type="response")

#calculate AUC
roc.glmModel <- pROC::roc(testData$Class, testData$predicted)
auc.glmModel <- pROC::auc(roc.glmModel)
print(auc.glmModel)

#Pull together test data and predictions
simModel <- data.frame(trueClass = testData$Class, predictedClass = testData$predicted)

# Reorder rows and columns
simModel <- simModel[order(simModel$predictedClass, decreasing = TRUE), ]
simModel <- select(simModel, trueClass, predictedClass)
simModel$rank <- 1:nrow(simModel)

#Assign costs for failures and proactive repairs
proactive_repair_cost <- 253 # Cost of proactively repairing a part
failure_repair_cost <- 1000 # Cost of a failure of the part (include all costs such as lost production, etc not just the repair cost)

# Define each predicted/actual combination
fp.cost <- proactive_repair_cost # The part was predicted to fail but did not (False Positive)
fn.cost <- failure_repair_cost # The part was not predicted to fail and it did (False Negative)
tp.cost <- (proactive_repair_cost - failure_repair_cost) # The part was predicted to fail and it did (True Positive). This will be negative for a savings.
tn.cost <- 0.0 # The part was not predicted to fail and it did not (True Negative) #incorporate probability of future failure simModel$future_failure_prob <- prob_failure #Function to assign costs for each instance assignCost <- function(pred, outcome, tn.cost, fn.cost, fp.cost, tp.cost, prob){ cost <- ifelse(pred == 0 & outcome == FALSE, tn.cost, # No cost since no action was taken and no failure ifelse(pred == 0 & outcome == TRUE, fn.cost, # The cost of no action and a repair resulted ifelse(pred == 1 & outcome == FALSE, fp.cost, # The cost of proactive repair which was not needed ifelse(pred == 1 & outcome == TRUE, tp.cost, 999999999)))) # The cost of proactive repair which avoided a failure return(cost) } # Initialize list to hold final output master <- vector(mode = "list", length = 100) #use the simulated model. In practice, this code can be adapted to compare multiple models test_model <- simModel # Create a loop to increment through dynamic threshold (starting at 1.0 [no proactive repairs] to 0.0 [all proactive repairs]) threshold <- 1.00 for (i in 1:101) { #Add predicted class with percentile ranking test_model$prob_ntile <- ntile(test_model$predictedClass, 100) / 100 # Dynamically determine if proactive repair would apply based on incrementing threshold test_model$glm_failure <- ifelse(test_model$prob_ntile >= threshold, 1, 0) test_model$threshold <- threshold # Compare to actual outcome to assign costs test_model$glm_impact <- assignCost(test_model$glm_failure, test_model$trueClass, tn.cost, fn.cost, fp.cost, tp.cost, test_model$future_failure_prob) # Compute cost for not doing any proactive repairs test_model$nochange_impact <- ifelse(test_model$trueClass == TRUE, fn.cost, tn.cost) # *test_model$future_failure_prob) # Running sum to produce the overall impact test_model$glm_cumul_impact <- cumsum(test_model$glm_impact) / nrow(test_model) test_model$nochange_cumul_impact <- cumsum(test_model$nochange_impact) / nrow(test_model) # Count the # of classified 
failures test_model$glm_failure_ct <- cumsum(test_model$glm_failure) # Create new object to house the one row per iteration output for the final plot master[[i]] <- test_model[nrow(test_model),] # Reduce the threshold by 1% and repeat to calculate new value threshold <- threshold - 0.01 } finalOutput <- rbindlist(master) finalOutput <- subset(finalOutput, select = c(threshold, glm_cumul_impact, glm_failure_ct, nochange_cumul_impact) ) # Set baseline to costs of not doing any proactive repairs baseline <- finalOutput$nochange_cumul_impact # Plot the cost curve par(mfrow = c(2,1)) plot(row(finalOutput)[,1], finalOutput$glm_cumul_impact, type = "l", lwd = 3, main = paste("Net Costs: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost, sep = ""), ylim = c(min(finalOutput$glm_cumul_impact) - 100, max(finalOutput$glm_cumul_impact) + 100), xlab = "Percent of Population", ylab = "Net Cost ($) / Unit") # Plot the cost difference of proactive repair program and a 'do nothing' approach plot(row(finalOutput)[,1], baseline - finalOutput$glm_cumul_impact, type = "l", lwd = 3, col = "black", main = paste("Savings: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost,sep = ""), ylim = c(min(baseline - finalOutput$glm_cumul_impact) - 100, max(baseline - finalOutput$glm_cumul_impact) + 100), xlab = "% of Population", ylab = "Savings ($) / Unit") abline(h=0,col="gray")   As seen in the resulting net cost and savings curves, based on the model's predictions, the optimal savings would be from a proactive repair program of the top 30 percentile units. The savings decreases after this, although you would still save money when replacing up to 75% of the population. After this point, you should expect to spend more than you save. 
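The same cost-assignment and threshold-sweep logic is easy to prototype outside of R. Here is a minimal Python sketch (not taken from the book; the scores and outcomes are made up) that assigns the four per-unit costs and sweeps a threshold over ranked failure scores to find the cheapest operating point:

```python
# Minimal sketch of the cost-curve idea: assign a cost to each
# (predicted, actual) pair, then sweep a threshold over ranked scores.
PROACTIVE = 253   # cost of a proactive repair
FAILURE = 1000    # cost of an in-service failure

def unit_cost(predicted_repair: bool, failed: bool) -> float:
    """Cost of one unit under a given decision/outcome pair."""
    if predicted_repair and failed:
        return PROACTIVE - FAILURE  # repair that avoided a failure (a saving)
    if predicted_repair:
        return PROACTIVE            # unnecessary proactive repair
    if failed:
        return FAILURE              # missed failure
    return 0.0                      # nothing done, nothing failed

def sweep(scores, outcomes, steps=101):
    """Return (threshold, average net cost) pairs for thresholds from 1.0 down to 0.0."""
    curve = []
    for i in range(steps):
        t = 1.0 - i / (steps - 1)
        total = sum(unit_cost(s >= t, o) for s, o in zip(scores, outcomes))
        curve.append((t, total / len(scores)))
    return curve

# Toy data: 10 units, scores roughly aligned with the true outcomes
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
outcomes = [True, True, False, True, False, False, False, False, False, False]
best_threshold, best_cost = min(sweep(scores, outcomes), key=lambda tc: tc[1])
```

With these toy numbers, the minimum average cost lands where the four highest-scoring units are proactively repaired, which mirrors the "repair the top N percent" reading of the R cost curve.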
The following set of charts is the output from the preceding R code:

Cost and savings curves for the proactive repair $253 and failure cost at $1,000 scenario

Note the changes in the following graph when the failure cost drops to $300 USD. At no point do you save money, as the proactive repair cost will always outweigh the reduced failure cost. This does not mean you should not do a proactive repair; you may still want to do so in order to satisfy your customers. Even in such a case, this cost curve method can help with decisions about how much you are willing to spend to address the problem. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 300 to generate the following charts:

Cost and savings curves for the proactive repair $253 and failure cost at $300 scenario

And finally, notice how the savings curve changes when the failure cost moves to $5,000. The spread between the proactive repair cost and the failure cost determines much of when a proactive repair makes business sense. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 5000 to generate the following charts:

Cost and savings curves for the proactive repair $253 and failure cost at $5,000 scenario

Ultimately, the decision is a business case based on the expected costs and benefits. ML modeling can help optimize savings under the right conditions, and cost curves help determine the expected costs and savings of proactive replacements.

In this tutorial, we looked at implementing economically cost-effective IoT analytics for predictive maintenance, with an example. To explore IoT analytics and the cloud further, check out the book Analytics for the Internet of Things (IoT).

AWS IoT Analytics: The easiest way to run analytics on IoT data, Amazon says
Build an IoT application with Azure IoT [Tutorial]
Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2

Build intelligent interfaces with CoreML using a CNN [Tutorial]

Savia Lobo
03 Sep 2018
19 min read
Core ML gives the potential for devices to better serve us rather than us serving them. This adheres to a rule stated by developer Eric Raymond that a computer should never ask the user for any information that it can auto-detect, copy, or deduce. This article is an excerpt taken from Machine Learning with Core ML, written by Joshua Newnham. In today's post, we will implement an application that attempts to guess what the user is trying to draw and offers pre-drawn images that the user can substitute their sketch with (image search). We will explore two techniques: first, using a convolutional neural network (CNN), which we are becoming familiar with, to make the prediction; then, applying a context-based similarity sorting strategy to better align the suggestions with what the user is trying to sketch.

Reviewing the training data and model

We will be using a slightly smaller set, with 205 out of the 250 categories; the exact categories can be found in the CSV file /Chapter7/Training/sketch_classes.csv, along with the Jupyter Notebooks used to prepare the data and train the model. The original sketches are available in SVG and PNG formats. Because we're using a CNN, rasterized images (PNG) were used, but rescaled from 1111 x 1111 to 256 x 256; this is the expected input of our model. The data was then split into a training and a validation set, using 80% (64 samples from each category) for training and 20% (17 samples from each category) for validation. After 68 iterations (epochs), the model was able to achieve an accuracy of approximately 65% on the validation data. Not exceptional, but if we consider the top two or three predictions, this accuracy increases to nearly 90%.
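That last figure is simply top-k accuracy: a prediction counts as correct if the true class appears anywhere among the model's k most probable classes. A small Python sketch of the metric (the labels and probabilities below are invented for illustration, not taken from the sketch model):

```python
def top_k_accuracy(predictions, true_labels, k=3):
    """predictions: list of dicts mapping label -> probability.
    A sample counts as correct if the true label is among the
    k highest-probability classes."""
    hits = 0
    for probs, truth in zip(predictions, true_labels):
        ranked = sorted(probs, key=probs.get, reverse=True)[:k]
        if truth in ranked:
            hits += 1
    return hits / len(true_labels)

# Two hypothetical predictions: the first is right on the top-1 guess,
# the second only within the top 2 guesses.
preds = [
    {"cat": 0.7, "dog": 0.2, "car": 0.1},
    {"house": 0.5, "barn": 0.3, "tent": 0.2},
]
truth = ["cat", "barn"]
```

Here top-1 accuracy is 0.5 while top-2 accuracy is 1.0, which is exactly the kind of gap the 65%-versus-90% comparison above describes.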
The following diagram shows plots comparing training and validation accuracy, and loss, during training:

With our model trained, our next step is to export it using the Core ML Tools made available by Apple (as discussed in previous chapters) and import it into our project.

Classifying sketches

Here we will walk through importing the Core ML model into our project and hooking it up, including using the model to perform inference on the user's sketch, as well as searching for and suggesting substitute images that the user can swap their sketch with. Let's get started with importing the Core ML model into our project.

Locate the model in the project repositories folder /CoreMLModels/Chapter7/cnnsketchclassifier.mlmodel; with the model selected, drag it into your Xcode project, leaving the defaults for the Import options. Once imported, select the model to inspect the details, which should look similar to the following screenshot:

As with all our models, we verify that the model is included in the target by checking that the appropriate Target Membership is ticked, and then we turn our attention to the inputs and outputs, which should be familiar by now. We can see that our model expects a single-channel (grayscale) 256 x 256 image and returns the dominant class via the classLabel property of the output, along with a dictionary of probabilities for all classes via the classLabelProbs property.

With our model now imported, let's discuss how we will integrate it into our project. Recall that our SketchView emits the events UIControlEvents.editingDidStart, UIControlEvents.editingChanged, and UIControlEvents.editingDidEnd as the user draws. If you inspect the SketchViewController, you will see that we have already registered to listen for the UIControlEvents.editingDidEnd event, as shown in the following code snippet:

override func viewDidLoad() {
    super.viewDidLoad()
    ...
    ...
    self.sketchView.addTarget(self,
        action: #selector(SketchViewController.onSketchViewEditingDidEnd),
        for: .editingDidEnd)

    queryFacade.delegate = self
}

Each time the user ends a stroke, we will start the process of trying to guess what the user is sketching and search for suitable substitutes. This functionality is triggered via the .editingDidEnd action method onSketchViewEditingDidEnd, but is delegated to the class QueryFacade, which is responsible for implementing it. This is where we will spend the majority of our time in this section and the next. It's also worth highlighting the statement queryFacade.delegate = self in the previous snippet: QueryFacade performs most of its work off the main thread and notifies this delegate of the status and results once finished, which we will get to in a short while.

Let's start by implementing the functionality of the onSketchViewEditingDidEnd method, before turning our attention to the QueryFacade class. Within the SketchViewController class, navigate to the onSketchViewEditingDidEnd method and append the following code:

guard self.sketchView.currentSketch != nil,
    let sketch = self.sketchView.currentSketch as? StrokeSketch else {
    return
}
queryFacade.asyncQuery(sketch: sketch)

Here, we get the current sketch, returning early if no sketch is available or if it's not a StrokeSketch; otherwise, we hand it over to our queryFacade (an instance of the QueryFacade class).

Let's now turn our attention to the QueryFacade class; select the QueryFacade.swift file from the left-hand panel within Xcode to bring it up in the editor area. A lot of plumbing has already been implemented to allow us to focus our attention on the core functionality of predicting, searching, and sorting.
Let's quickly discuss some of the details, starting with the properties:

let context = CIContext()
let queryQueue = DispatchQueue(label: "query_queue")
var targetSize = CGSize(width: 256, height: 256)
weak var delegate : QueryDelegate?

var currentSketch : Sketch? {
    didSet {
        self.newQueryWaiting = true
        self.queryCanceled = false
    }
}

fileprivate var queryCanceled : Bool = false
fileprivate var newQueryWaiting : Bool = false
fileprivate var processingQuery : Bool = false

var isProcessingQuery : Bool {
    get { return self.processingQuery }
}

var isInterrupted : Bool {
    get { return self.queryCanceled || self.newQueryWaiting }
}

QueryFacade is only concerned with the most recent sketch. Therefore, each time a new sketch is assigned via the currentSketch property, newQueryWaiting is set to true (and queryCanceled is reset), which makes isInterrupted report true for any in-flight query. During each task (such as performing prediction, searching, and downloading), we check the isInterrupted property; if it is true, we exit early and proceed to process the latest sketch.

When you pass a sketch to the asyncQuery method, the sketch is assigned to the currentSketch property, and queryCurrentSketch is then called to do the bulk of the work, unless a query is currently being processed:

func asyncQuery(sketch: Sketch) {
    self.currentSketch = sketch

    if !self.processingQuery {
        self.queryCurrentSketch()
    }
}

fileprivate func processNextQuery() {
    self.queryCanceled = false

    if self.newQueryWaiting && !self.processingQuery {
        self.queryCurrentSketch()
    }
}

fileprivate func queryCurrentSketch() {
    guard let sketch = self.currentSketch else {
        self.processingQuery = false
        self.newQueryWaiting = false
        return
    }

    self.processingQuery = true
    self.newQueryWaiting = false

    queryQueue.async {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(
                status: self.isInterrupted ? -1 : -1,
                result: nil)
            self.processNextQuery()
        }
    }
}

Let's work bottom-up by implementing all the supporting methods before we tie everything together within the queryCurrentSketch method.
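The interruption bookkeeping above is language-agnostic: a query runs off the main thread, and any newly assigned sketch marks in-flight work as stale so it can bail out early. Here is a condensed, single-threaded Python sketch of the same "latest request wins" idea (the names submit and run_pipeline are invented for illustration; the real class dispatches onto GCD queues):

```python
class QueryFacade:
    """Keeps only the most recent query; stale pipelines exit early."""

    def __init__(self):
        self.current = None
        self.stale = False
        self.results = []

    def submit(self, sketch):
        # A newer sketch makes any in-flight pipeline stale.
        self.stale = self.current is not None
        self.current = sketch

    def run_pipeline(self, on_step=None):
        """Run classify -> search -> download for the current sketch,
        checking for interruption between steps. on_step lets the caller
        simulate events (e.g. a new stroke) arriving mid-pipeline."""
        sketch = self.current
        self.stale = False
        for step in ("classify", "search", "download"):
            if on_step:
                on_step(step)
            if self.stale:
                return None  # abandoned: a newer sketch arrived
        self.results.append(sketch)
        return sketch

facade = QueryFacade()
facade.submit("cat")
first = facade.run_pipeline()

def new_stroke(step):
    # Simulates the user drawing again while the query is mid-flight
    if step == "search":
        facade.submit("house")

facade.submit("dog")
interrupted = facade.run_pipeline(on_step=new_stroke)  # abandoned for "house"
final = facade.run_pipeline()
```

The "dog" pipeline is abandoned as soon as "house" arrives, so only "cat" and "house" ever produce results, which is exactly the behaviour the isInterrupted checks buy us in the Swift class.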
Let's start by declaring an instance of our model; add the following variable near the top of the QueryFacade class:

let sketchClassifier = cnnsketchclassifier()

Now, with our model instantiated and ready, navigate to the classifySketch method of the QueryFacade class; it is here that we will use our imported model to perform inference, but let's first review what already exists:

func classifySketch(sketch: Sketch) -> [(key: String, value: Double)]? {
    if let img = sketch.exportSketch(size: nil)?
        .resize(size: self.targetSize).rescalePixels() {
        return self.classifySketch(image: img)
    }
    return nil
}

func classifySketch(image: CIImage) -> [(key: String, value: Double)]? {
    return nil
}

Here, we see that classifySketch is overloaded, with one method accepting a Sketch and the other a CIImage. The former, when called, obtains the rasterized version of the sketch using the exportSketch method. If successful, it resizes the rasterized image using the targetSize property, rescales the pixels, and then passes the prepared CIImage along to the alternative classifySketch method.

Pixel values are in the range of 0-255 (per channel; in this case, just a single channel). Typically, you try to avoid feeding large numbers into your network, because they make it more difficult for your model to learn (converge). It is somewhat analogous to trying to drive a car whose steering wheel can only be turned hard left or hard right: these extremes would cause a lot of over-steering and make navigating anywhere extremely difficult.

The second classifySketch method is responsible for performing the actual inference. Add the following code within the classifySketch(image: CIImage) method:

if let pixelBuffer = image.toPixelBuffer(context: self.context, gray: true) {
    let prediction = try? self.sketchClassifier.prediction(image: pixelBuffer)

    if let classPredictions = prediction?.classLabelProbs {
        let sortedClassPredictions = classPredictions.sorted(by: { (kvp1, kvp2) -> Bool in
            kvp1.value > kvp2.value
        })
        return sortedClassPredictions
    }
}
return nil

Here, we use the image's toPixelBuffer method, an extension we added to the CIImage class, to obtain a grayscale CVPixelBuffer representation of it. With a reference to the buffer, we pass it to the prediction method of our model instance, sketchClassifier, to obtain the probabilities for each label. Finally, we sort these probabilities from most likely to least likely before returning the sorted results to the caller.

Now, with some inkling as to what the user is trying to sketch, we can proceed to search for and download the candidates we are most confident about. The task of searching and downloading is the responsibility of the downloadImages method within the QueryFacade class. This method makes use of an existing BingService that exposes methods for searching and downloading images. Let's hook this up now; jump into the downloadImages method and append the following highlighted code to its body:

func downloadImages(searchTerms: [String],
                    searchTermsCount: Int = 4,
                    searchResultsCount: Int = 2) -> [CIImage]? {
    var bingResults = [BingServiceResult]()

    for i in 0..<min(searchTermsCount, searchTerms.count) {
        let results = BingService.sharedInstance.syncSearch(
            searchTerm: searchTerms[i], count: searchResultsCount)

        for bingResult in results {
            bingResults.append(bingResult)
        }

        if self.isInterrupted {
            return nil
        }
    }
}

The downloadImages method takes the arguments searchTerms, searchTermsCount, and searchResultsCount. searchTerms is a sorted list of labels returned by our classifySketch method; searchTermsCount determines how many of these search terms we use (defaulting to 4), and searchResultsCount limits the number of results returned for each search term.
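Independently of Swift, the "sorted probabilities become search terms" hand-off can be sketched in a few lines of Python (the labels and probabilities here are made up for illustration):

```python
def search_terms_from_prediction(class_probs, terms_count=4):
    """Sort label -> probability pairs (most likely first) and keep the
    top terms_count labels as image-search queries, mirroring how
    classifySketch's output feeds downloadImages."""
    ranked = sorted(class_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:terms_count]]

# Hypothetical model output for a sketch
probs = {"airplane": 0.05, "bird": 0.30, "kite": 0.45,
         "butterfly": 0.15, "hat": 0.05}
```

With the default terms_count of 4, the facade would issue image searches for the four most probable labels and drop the rest.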
The downloadImages code above performs a sequential search using the search terms passed into the method. As mentioned previously, we are using Microsoft's Bing Image Search API here, which requires registration, something we will return to shortly. After each search, we check the isInterrupted property to see whether we need to exit early; otherwise, we continue on to the next search.

Each result returned by the search includes a URL referencing an image; we will use this next to download the image for each of the results, before returning an array of CIImage to the caller. Let's add this now. Append the following code to the downloadImages method:

var images = [CIImage]()

for bingResult in bingResults {
    if let image = BingService.sharedInstance.syncDownloadImage(
        bingResult: bingResult) {
        images.append(image)
    }

    if self.isInterrupted {
        return nil
    }
}

return images

As before, the process is synchronous, and after each download we check the isInterrupted property to see if we need to exit early, otherwise returning the list of downloaded images to the caller.

So far, we have implemented the functionality to support prediction, searching, and downloading; our next task is to hook all of this up. Head back to the queryCurrentSketch method and add the following code within the queryQueue.async block.
Ensure that you replace the DispatchQueue.main.async block:

queryQueue.async {
    guard let predictions = self.classifySketch(sketch: sketch) else {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(status: -1, result: nil)
            self.processNextQuery()
        }
        return
    }

    let searchTerms = predictions.map({ (key, value) -> String in
        return key
    })

    guard let images = self.downloadImages(
        searchTerms: searchTerms,
        searchTermsCount: 4) else {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(status: -1, result: nil)
            self.processNextQuery()
        }
        return
    }

    guard let sortedImage = self.sortByVisualSimilarity(
        images: images,
        sketch: sketch) else {
        DispatchQueue.main.async {
            self.processingQuery = false
            self.delegate?.onQueryCompleted(status: -1, result: nil)
            self.processNextQuery()
        }
        return
    }

    DispatchQueue.main.async {
        self.processingQuery = false
        self.delegate?.onQueryCompleted(
            status: self.isInterrupted ? -1 : 1,
            result: QueryResult(
                predictions: predictions,
                images: sortedImage))
        self.processNextQuery()
    }
}

It's a large block of code, but nothing complicated; let's quickly walk through it. We start by calling the classifySketch method we just implemented. As you may recall, this method returns a sorted list of label-probability pairs, unless interrupted, in which case nil is returned. We handle this by notifying the delegate before exiting the method early (a check we apply to all of our tasks).

Once we've obtained the list of sorted labels, we pass them to the downloadImages method to receive the associated images, which we then pass to the sortByVisualSimilarity method. This method currently returns just the list of images, but it's something we will get back to in the next section.
Finally, the method passes the status and sorted images, wrapped in a QueryResult instance, to the delegate via the main thread, before checking whether it needs to process a new sketch (by calling the processNextQuery method).

At this stage, we have implemented all the functionality required to download substitute images based on our guess of what the user is currently sketching. Now we just need to jump into the SketchViewController class to hook this up, but before doing so, we need to obtain a subscription key for Bing's Image Search. Within your browser, head to https://azure.microsoft.com/en-gb/services/cognitive-services/bing-image-search-api/ and click on Try Bing Image Search API, as shown in the following screenshot:

After clicking on Try Bing Image Search API, you will be presented with a series of dialogs; read, and once (if) agreed, sign in or register. Continue following the screens until you reach a page informing you that the Bing Search API has been successfully added to your subscription, as shown in the following screenshot:

On this page, scroll down until you come across the entry Bing Search APIs v7. If you inspect this block, you should see a list of Endpoints and Keys. Copy and paste one of these keys into the BingService.swift file, replacing the value of the constant subscriptionKey; the following screenshot shows the web page containing the service key:

Return to the SketchViewController by selecting the SketchViewController.swift file from the left-hand panel, and locate the method onQueryCompleted:

func onQueryCompleted(status: Int, result: QueryResult?) {
}

Recall that this is a method signature defined in the QueryDelegate protocol, which QueryFacade uses to notify the delegate when the query fails or completes. It is here that we will present the matching images found through the process we just implemented. We do this by first checking the status.
If it is deemed successful (greater than zero), we remove every item referenced in the queryImages array, which is the data source for the UICollectionView used to present suggested images to the user. Once emptied, we iterate through all the images referenced in the QueryResult instance, adding them to the queryImages array before asking the UICollectionView to reload its data. Add the following code to the body of the onQueryCompleted method:

guard status > 0 else {
    return
}

queryImages.removeAll()

if let result = result {
    for cimage in result.images {
        if let cgImage = self.ciContext.createCGImage(cimage, from: cimage.extent) {
            queryImages.append(UIImage(cgImage: cgImage))
        }
    }
}

toolBarLabel.isHidden = queryImages.count == 0
collectionView.reloadData()

There we have it; everything is in place to guess what the user draws and present possible suggestions. Now is a good time to build and run the application on either the simulator or a device to check whether everything is working correctly. If so, you should see something similar to the following:

There is one more thing left to do before finishing off this section. Remembering that our goal is to help the user quickly sketch out a scene or something similar, our hypothesis is that guessing what the user is drawing and suggesting ready-drawn images will help them achieve their task. So far, we have performed prediction and provided suggestions, but currently the user is unable to replace their sketch with any of the presented suggestions. Let's address this now.

Our SketchView currently only renders StrokeSketch (which encapsulates the metadata of the user's drawing). Because our suggestions are rasterized images, our choice is to either extend this class (to render both strokes and rasterized images) or create a new concrete implementation of the Sketch protocol.
In this example, we will opt for the latter and implement a new type of Sketch capable of rendering a rasterized image. Select the Sketch.swift file to bring it into focus in the editor area of Xcode, scroll to the bottom, and add the following code:

class ImageSketch : Sketch {
    var image : UIImage!
    var size : CGSize!
    var origin : CGPoint!
    var label : String!

    init(image: UIImage, origin: CGPoint, size: CGSize, label: String) {
        self.image = image
        self.size = size
        self.label = label
        self.origin = origin
    }
}

We have defined a simple class referencing an image, origin, size, and label. The origin determines the top-left position where the image should be rendered, while the size determines its, well, size! To satisfy the Sketch protocol, we must implement the properties center and boundingBox, along with the methods draw and exportSketch. Let's implement each of these in turn, starting with boundingBox.

The boundingBox property is a computed property derived from the origin and size properties. Add the following code to your ImageSketch class:

var boundingBox : CGRect {
    get {
        return CGRect(origin: self.origin, size: self.size)
    }
}

Similarly, center is another computed property derived from origin and size, simply translating the origin with respect to the size. Add the following code to your ImageSketch class:

var center : CGPoint {
    get {
        let bbox = self.boundingBox
        return CGPoint(x: bbox.origin.x + bbox.size.width/2,
                       y: bbox.origin.y + bbox.size.height/2)
    }
    set {
        self.origin = CGPoint(x: newValue.x - self.size.width/2,
                              y: newValue.y - self.size.height/2)
    }
}

The draw method simply uses the passed-in context to render the assigned image within the boundingBox; append the following code to your ImageSketch class:

func draw(context: CGContext) {
    self.image.draw(in: self.boundingBox)
}

Our last method, exportSketch, is also fairly straightforward. Here, we create an instance of CIImage, passing in the image (of type UIImage).
Then, we resize it using the extension method we implemented back in Chapter 3, Recognizing Objects in the World. Add the following code to finish off the ImageSketch class:

func exportSketch(size: CGSize?) -> CIImage? {
    guard let ciImage = CIImage(image: self.image) else {
        return nil
    }

    if self.image.size.width == self.size.width
        && self.image.size.height == self.size.height {
        return ciImage
    } else {
        return ciImage.resize(size: self.size)
    }
}

We now have an implementation of Sketch that can handle rendering of rasterized images (like those returned from our search). Our final task is to swap the user's sketch with the item the user selects from the UICollectionView.

Return to the SketchViewController class by selecting SketchViewController.swift from the left-hand-side panel in Xcode to bring it up in the editor area. Once loaded, navigate to the method collectionView(_ collectionView:, didSelectItemAt:); this should look familiar to most of you. It is the delegate method for handling cells selected from a UICollectionView, and it's where we will handle swapping the user's current sketch with the selected item.

Let's start by obtaining the current sketch and the image that was selected. Add the following code to the body of the collectionView(_ collectionView:, didSelectItemAt:) method:

guard let sketch = self.sketchView.currentSketch else {
    return
}

self.queryFacade.cancel()

let image = self.queryImages[indexPath.row]

Now, with a reference to the current sketch and image, we want to keep the size of the replacement roughly the same as the user's sketch. We will do this by obtaining the sketch's bounding box and scaling the dimensions to respect the aspect ratio of the selected image.
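This aspect-ratio arithmetic is easy to check in isolation. Here is a small Python sketch of the same math (hypothetical values; it mirrors the width/height branches and the centering step used when swapping in the replacement image):

```python
def fit_to_bbox(bbox_w, bbox_h, img_w, img_h):
    """Scale (img_w, img_h) so the result matches the bounding box's
    longer side while preserving the image's aspect ratio."""
    if bbox_w > bbox_h:
        ratio = img_h / img_w
        return bbox_w, bbox_w * ratio   # match the box width
    else:
        ratio = img_w / img_h
        return bbox_h * ratio, bbox_h   # match the box height

def centered_origin(center_x, center_y, w, h):
    """Top-left origin that keeps the replacement centred on the sketch."""
    return center_x - w / 2, center_y - h / 2
```

For example, fitting a 400 x 200 image into a wide 200 x 100 bounding box keeps the box width and scales the height by the image's 2:1 aspect ratio.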
Add the following code, which handles the aspect-ratio fit (note that we first need a reference to the sketch's bounding box, obtained here via sketch.boundingBox):

let bbox = sketch.boundingBox

var origin = CGPoint(x: 0, y: 0)
var size = CGSize(width: 0, height: 0)

if bbox.size.width > bbox.size.height {
    let ratio = image.size.height / image.size.width
    size.width = bbox.size.width
    size.height = bbox.size.width * ratio
} else {
    let ratio = image.size.width / image.size.height
    size.width = bbox.size.height * ratio
    size.height = bbox.size.height
}

Next, we obtain the origin (top left of the image) by taking the center of the sketch and offsetting it by half the width and height. Do this by appending the following code:

origin.x = sketch.center.x - size.width / 2
origin.y = sketch.center.y - size.height / 2

We can now use the image, size, and origin to create an ImageSketch and replace the current sketch with it, simply by assigning it to the currentSketch property of the SketchView instance. Add the following code to do just that:

self.sketchView.currentSketch = ImageSketch(image: image,
                                            origin: origin,
                                            size: size,
                                            label: "")

Finally, some housekeeping: we clear the UICollectionView by removing all images from the queryImages array (its data source) and ask it to reload itself. Add the following block to complete the collectionView(_ collectionView:, didSelectItemAt:) method:

self.queryImages.removeAll()
self.toolBarLabel.isHidden = queryImages.count == 0
self.collectionView.reloadData()

Now is a good time to build and run to ensure that everything is working as planned. If so, you should be able to swap out your sketch with one of the suggestions presented at the top, as shown in the following screenshot:

We learned how to build intelligent interfaces using Core ML. If you've enjoyed reading this post, do check out Machine Learning with Core ML to further implement Core ML for visual-based applications using the principles of transfer learning and neural networks.
Introducing Intelligent Apps
5 examples of Artificial Intelligence in Web apps
Voice, natural language, and conversations: Are they the next web UI?
Getting started with Amazon Machine Learning workflow [Tutorial]

Melisha Dsouza
02 Sep 2018
14 min read
Amazon Machine Learning is useful for building ML models and generating predictions. It also enables the development of robust and scalable smart applications. The process of building ML models with Amazon Machine Learning consists of three operations:

- data analysis
- model training
- evaluation

The code files for this article are available on GitHub. This tutorial is an excerpt from a book written by Alexis Perrier titled Effective Amazon Machine Learning. The Amazon Machine Learning service is available at https://console.aws.amazon.com/machinelearning/.

The Amazon ML workflow closely follows a standard data science workflow, with these steps:

1. Extract the data and clean it up.
2. Make it available to the algorithm.
3. Split the data into a training and validation set, typically a 70/30 split with equal distribution of the predictors in each part.
4. Select the best model by training several models on the training dataset and comparing their performances on the validation dataset.
5. Use the best model for predictions on new data.

As shown in the following Amazon ML menu, the service is built around four objects:

- Datasource
- ML model
- Evaluation
- Prediction

The Datasource and Model can also be configured and set up in the same flow by creating a new Datasource and ML model. Let us take a closer look at each one of these steps.

Understanding the dataset used

We will use the simple Predicting Weight by Height and Age dataset (from Lewis Taylor (1967)) with 237 samples of children's age, weight, height, and gender, which is available at https://v8doc.sas.com/sashtml/stat/chap55/sect51.htm.

This dataset is composed of 237 rows. Each row has the following predictors: sex (F, M), age (in months), and height (in inches), and we are trying to predict the weight (in lbs) of these children. There are no missing values and no outliers. The variables are close enough in range, and normalization is not required. We do not need to carry out any preprocessing or cleaning on the original dataset.
Age, height, and weight are numerical variables (real-valued), and sex is a categorical variable. We will randomly select 20% of the rows as the held-out subset to use for prediction on previously unseen data, and keep the other 80% as training and evaluation data. This split can be done in Excel or any other spreadsheet editor:

1. Create a new column with randomly generated numbers.
2. Sort the spreadsheet by that column.
3. Select 190 rows for training and 47 rows for prediction (roughly an 80/20 split).

Let us name the training set LT67_training.csv and the held-out set that we will use for prediction LT67_heldout.csv, where LT67 stands for Lewis and Taylor, the creators of this dataset in 1967. As with all datasets, scripts, and resources mentioned in this book, the training and holdout files are available in the GitHub repository at https://github.com/alexperrier/packt-aml.

It is important for the distributions of age, sex, height, and weight to be similar in both subsets. We want the data on which we will make predictions to show patterns that are similar to the data on which we will train and optimize our model.

Loading the data on S3

Follow these steps to load the training and held-out datasets on S3:

1. Go to your S3 console at https://console.aws.amazon.com/s3.
2. Create a bucket if you haven't done so already. Buckets are basically folders that are uniquely named across all of S3. We created a bucket named aml.packt. Since that name has now been taken, you will have to choose another bucket name if you are following along with this demonstration.
3. Click on the bucket name you created and upload both the LT67_training.csv and LT67_heldout.csv files by selecting Upload from the Actions drop-down menu.

Both files are small, only a few KB, and hosting costs should remain negligible for this exercise.
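The spreadsheet shuffle-and-cut described above can also be scripted. A minimal Python sketch, assuming the 237 data rows have already been read from the CSV (the `row_i` strings are placeholders for real lines):

```python
import random

random.seed(42)  # fixed seed so the split is reproducible

# Stand-ins for the 237 data rows (header excluded); in practice these
# would be the lines read from the original CSV file.
rows = [f"row_{i}" for i in range(237)]

# Same recipe as in the spreadsheet: attach a random number to every
# row, sort by it, then cut after row 190.
shuffled = sorted(rows, key=lambda _: random.random())
training, heldout = shuffled[:190], shuffled[190:]

print(len(training), len(heldout))  # 190 47
```

Writing `training` and `heldout` back out as LT67_training.csv and LT67_heldout.csv then reproduces the two files used in the rest of the walkthrough.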
Note that for each file, by selecting the Properties tab on the right, you can specify how your files are accessed: which user, role, group, or AWS service may download, read, write, and delete the files, and whether or not they should be accessible from the open web. When creating the datasource in Amazon ML, you will be prompted to grant Amazon ML access to your input data. You can specify the access rules to these files now in S3, or simply grant access later on.

Our data is now in the cloud, in an S3 bucket. We need to tell Amazon ML where to find that input data by creating a datasource. We will first create the datasource for the training file LT67_training.csv.

Declaring a datasource

Go to the Amazon ML dashboard, and click on Create new... | Datasource and ML model. We will use the faster flow available by default.

As shown in the following screenshot, you are asked to specify the path to the LT67_training.csv file: {S3://bucket}{path}{file}. Note that the S3 location field automatically populates with the bucket names and file names that are available to your user.

Specifying a Datasource name is useful for organizing your Amazon ML assets. By clicking on Verify, Amazon ML will make sure that it has the proper rights to access the file. In case it needs to be granted access to the file, you will be prompted to do so, as shown in the following screenshot. Just click on Yes to grant access. At this point, Amazon ML will validate the datasource and analyze its contents.

Creating the datasource

An Amazon ML datasource is composed of the following:

- The location of the data file: the data file is not duplicated or cloned in Amazon ML but accessed from S3.
- The schema, which contains information on the types of the variables contained in the CSV file:
  - Categorical
  - Text
  - Numeric (real-valued)
  - Binary

It is possible to supply Amazon ML with your own schema or modify the one created by Amazon ML.
At this point, Amazon ML has a pretty good idea of the type of data in your training dataset. It has identified the different types of variables and knows how many rows it has. Move on to the next step by clicking on Continue, and see what schema Amazon ML has inferred from the dataset, as shown in the next screenshot.

Amazon ML needs to know at this point which variable you are trying to predict. Be sure to tell Amazon ML the following:

- The first line in the CSV file contains the column names
- The target is the weight

We see here that Amazon ML has correctly inferred the following:

- sex is categorical
- age, height, and weight are numeric (continuous real values)

Since we chose a numeric variable as the target, Amazon ML will use linear regression as the predictive model. For binary or categorical targets, we would have used logistic regression. This means that Amazon ML will try to find the best a, b, and c coefficients so that the weight predicted by the following equation is as close as possible to the observed real weight present in the data:

predicted weight = a * age + b * height + c * sex

Amazon ML will then ask you if your data contains a row identifier. In our present case, it does not. Row identifiers are useful when you want to understand the prediction obtained for each row, or add an extra column to your dataset later on in your project. Row identifiers are for reference purposes only and are not used by the service to build the model.

You will be asked to review the datasource. You can go back to each one of the previous steps and edit the parameters for the schema, the target, and the input data. Now that the data is known to Amazon ML, the next step is to set up the parameters of the algorithm that will train the model.

Understanding the model

We select the default parameters for the training and evaluation settings.
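As an aside, the regression equation above can be written as a tiny function. The coefficients below are hypothetical placeholders chosen for illustration; the real a, b, and c are learned by Amazon ML during training:

```python
# Hypothetical coefficients, for illustration only -- the real a, b,
# and c are fitted by Amazon ML, not these values.
a, b, c = 0.2, 3.1, -1.5

def predict_weight(age_months, height_inches, sex_binary):
    """Linear model of the form Amazon ML fits for a numeric target."""
    return a * age_months + b * height_inches + c * sex_binary

# e.g. a 150-month-old, 62-inch child with sex encoded as 0
print(predict_weight(150, 62, 0))
```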
Amazon ML will do the following:

- Create a recipe for data transformation based on the statistical properties it has inferred from the dataset.
- Split the dataset (LT67_training.csv) into a training part and a validation part, with a 70/30 split. The split strategy assumes the data has already been shuffled and can be split sequentially.

The recipe will be used to transform the data in a similar way for the training and the validation datasets. The only transformation suggested by Amazon ML is to turn the categorical variable sex into a binary variable, with m = 0 and f = 1, for instance. No other transformation is needed. The default advanced settings for the model are shown in the following screenshot.

We see that Amazon ML will pass over the data 10 times, shuffle splitting the data each time. It will use an L2 regularization strategy, based on the sum of the squares of the coefficients of the regression, to prevent overfitting. We will evaluate the predictive power of the model using our LT67_heldout.csv dataset later on.

Regularization comes in three levels: a mild (10^-6), medium (10^-4), or aggressive (10^-2) setting, each value stronger than the previous one. The default setting is mild, the lowest, with a regularization constant of 0.000001 (10^-6), implying that Amazon ML does not anticipate much overfitting on this dataset. This makes sense when the number of predictors, three in our case, is much smaller than the number of samples (190 for the training set).

Clicking on the Create ML model button will launch the model creation. This takes a few minutes to resolve, depending on the size and complexity of your dataset. You can check its status by refreshing the model page. In the meantime, the model status remains Pending. At that point, Amazon ML will split our training dataset into two subsets: a training set and a validation set.
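That sequential 70/30 split can be sketched with simple list slicing. A rough illustration only; the `rearrange` function and the stand-in rows are ours, not Amazon ML's API:

```python
def rearrange(rows, percent_begin, percent_end):
    """Keep the sequential slice of rows between two percentage
    bounds -- no shuffling, mirroring the default split strategy."""
    n = len(rows)
    return rows[n * percent_begin // 100 : n * percent_end // 100]

rows = list(range(190))                # stand-ins for the training rows
training = rearrange(rows, 0, 70)      # first 70% of the samples
validation = rearrange(rows, 70, 100)  # the complementary 30%

print(len(training), len(validation))  # 133 57
```

Because the split is sequential, any ordering in the file (say, all girls first) would leak into the two parts, which is why the strategy assumes the data was shuffled beforehand.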
It will use the training portion of the data to train several settings of the algorithm and select the best one based on its performance on the training data. It will then apply the associated model to the validation set and return an evaluation score for that model. By default, Amazon ML will sequentially take the first 70% of the samples for training and the remaining 30% for validation.

It's worth noting that Amazon ML will not create two extra files and store them on S3, but will instead create two new datasources out of the initial datasource we have previously defined. Each new datasource is obtained from the original one via a data rearrangement JSON recipe such as the following:

{
  "splitting": {
    "percentBegin": 0,
    "percentEnd": 70
  }
}

You can see these two new datasources in the Datasource dashboard. Three datasources are now available where there was initially only one, as shown by the following screenshot.

While the model is being trained, Amazon ML runs the Stochastic Gradient Descent algorithm several times on the training data with different parameters:

- Varying the learning rate in increments of powers of 10: 0.01, 0.1, 1, 10, and 100.
- Making several passes over the training data, shuffling the samples before each pass.
- At each pass, calculating the prediction error, the Root Mean Squared Error (RMSE), to estimate how much of an improvement over the last pass was obtained. If the decrease in RMSE is not really significant, the algorithm is considered to have converged, and no further pass shall be made.

At the end of the passes, the setting that ends up with the lowest RMSE wins, and the associated model (the weights of the regression) is selected as the best version. Once the model has finished training, Amazon ML evaluates its performance on the validation datasource. Once the evaluation itself is ready, you have access to the model's evaluation.

Evaluating the model

Amazon ML uses the standard RMSE metric for linear regression.
RMSE is defined as the square root of the mean of the squared differences between the real values and the predicted values:

RMSE = sqrt( (1/n) * Σ (ŷᵢ − yᵢ)² )

Here, ŷ denotes the predicted values, and y the real values we want to predict (the weight of the children in our case). The closer the predictions are to the real values, the lower the RMSE is. A lower RMSE means a better, more accurate prediction.

Making batch predictions

We now have a model that has been properly trained and selected among other models. We can use it to make predictions on new data. A batch prediction consists of applying a model to a datasource in order to make predictions on that datasource. We need to tell Amazon ML which model we want to apply to which data.

Batch predictions are different from streaming predictions. With batch predictions, all the data is already available as a datasource, while for streaming predictions, the data is fed to the model as it becomes available; the dataset is not available beforehand in its entirety.

In the main menu, select Batch Predictions to access the predictions dashboard and click on Create a New Prediction. The first step is to select one of the models available in your model dashboard. You should choose the one that has the lowest RMSE.

The next step is to associate a datasource with the model you just selected. We had uploaded the held-out dataset to S3 at the beginning of this chapter (under the Loading the data on S3 section) but had not used it to create a datasource. We will do so now. When asked for a datasource in the next screen, make sure to check My data is in S3, and I need to create a datasource, and then select the held-out dataset that should already be present in your S3 bucket. Don't forget to tell Amazon ML that the first line of the file contains column names.

In our current project, the held-out dataset also contains the true values for the weight of the students. This would not be the case for "real" data in a real-world project, where the real values are truly unknown.
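Since our held-out file does keep the true weights, we can compute the score ourselves. The RMSE formula from earlier, as a minimal Python sketch (the numbers below are toy values, not taken from the dataset):

```python
from math import sqrt

def rmse(predicted, actual):
    """Root Mean Squared Error: square root of the mean squared
    difference between predictions and true values."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Toy values for illustration only
print(rmse([100.0, 95.0, 110.0], [98.0, 97.0, 105.0]))
```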
However, in our case, this will allow us to calculate the RMSE score of our predictions and assess their quality. The final step is to click on the Verify button and wait for a few minutes. Amazon ML will run the model on the new datasource and will generate predictions in the form of a CSV file. Contrary to the evaluation and model-building phase, we now have real predictions, and we are no longer given a score associated with these predictions.

After a few minutes, you will notice a new batch-prediction folder in your S3 bucket. This folder contains a manifest file and a results folder. The manifest file is a JSON file with the paths to the initial datasource and to the results file. The results folder contains a gzipped CSV file. Uncompressed, the CSV file contains two columns: trueLabel, the initial target from the held-out set, and score, which corresponds to the predicted values.

We can easily calculate the RMSE for those results directly in the spreadsheet through the following steps:

1. Create a new column that holds the square of the difference between the two columns.
2. Average that column over all the rows.
3. Take the square root of the result.

The following illustration shows how we create a third column, C, as the squared difference between the trueLabel column A and the score (or predicted value) column B. As shown in the following screenshot, averaging column C and taking the square root gives an RMSE of 11.96, which is significantly better than the RMSE we obtained during the evaluation phase (14.4).

The fact that the RMSE on the held-out set is better than the RMSE on the validation set means that our model did not overfit the training data, since it performed even better on new data than expected. Our model is robust.

The left side of the following graph shows the true (triangle) and predicted (circle) weight values for all the samples in the held-out set. The right side shows the histogram of the residuals.
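The three spreadsheet steps map directly onto a few lines of Python applied to the trueLabel/score columns. The rows below are toy stand-ins for the real results file, which has the same two columns but different values:

```python
import csv
import io
from math import sqrt

# Toy stand-in for the unzipped batch-prediction results file.
results_csv = """trueLabel,score
98.5,101.2
87.0,90.4
120.0,115.3
"""

reader = csv.DictReader(io.StringIO(results_csv))
# Step 1: squared difference per row; steps 2 and 3: average, then sqrt
sq_diffs = [(float(r["trueLabel"]) - float(r["score"])) ** 2 for r in reader]
rmse = sqrt(sum(sq_diffs) / len(sq_diffs))
print(round(rmse, 2))
```

Pointing the same snippet at the real downloaded file (after gunzipping it) should reproduce the 11.96 figure from the spreadsheet.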
Similar to the histogram of residuals we observed on the validation set, we see that the residuals are not centered on 0: our model has a tendency to overestimate the weight of the students.

In this tutorial, we have successfully loaded the data on S3 and let Amazon ML infer the schema and transform the data. We also created a model and evaluated its performance. Finally, we made a prediction on the held-out dataset. To understand how to leverage Amazon's powerful platform for your predictive analytics needs, check out the book Effective Amazon Machine Learning.
Sugandha Lahoti
01 Sep 2018
4 min read

Why use JavaScript for machine learning?

Python has always been and remains the language of choice for machine learning, in part due to the maturity of the language, in part due to the maturity of the ecosystem, and in part due to the positive feedback loop of early ML efforts in Python. Recent developments in the JavaScript world, however, are making JavaScript more attractive to ML projects. I think we will see a major ML renaissance in JavaScript within a few years, especially as laptops and mobile devices become ever more powerful and JavaScript itself surges in popularity.

This post is extracted from the book Hands-on Machine Learning with JavaScript by Burak Kanber. The book is a definitive guide to creating intelligent web applications with the best of machine learning and JavaScript.

Advantages and challenges of JavaScript

JavaScript, like any other tool, has its advantages and disadvantages. Much of the historical criticism of JavaScript has focused on a few common themes: strange behavior in type coercion, the prototypical object-oriented model, difficulty organizing large codebases, and managing deeply nested asynchronous function calls with what many developers call callback hell. Fortunately, most of these historic gripes have been resolved by the introduction of ES6, that is, ECMAScript 2015, a recent update to the JavaScript syntax.

Con: Immature ecosystem for machine learning development

Despite the recent language improvements, most developers would still advise against using JavaScript for ML for one reason: the ecosystem. The Python ecosystem for ML is so mature and rich that it's difficult to justify choosing any other ecosystem. But this logic is self-fulfilling and self-defeating; we need brave individuals to take the leap and work on real ML problems if we want JavaScript's ecosystem to mature. Fortunately, JavaScript has been the most popular programming language on GitHub for a few years running, and is growing in popularity by almost every metric.
Pro #1: JavaScript is the most popular web development language, with a maturing npm ecosystem

There are some advantages to using JavaScript for ML. Its popularity is one; while ML in JavaScript is not very popular at the moment, the language itself is. As demand for ML applications rises, and as hardware becomes faster and cheaper, it's only natural for ML to become more prevalent in the JavaScript world. There are tons of resources available for learning JavaScript in general, maintaining Node.js servers, and deploying JavaScript applications. The Node Package Manager (npm) ecosystem is also large and still growing, and while there aren't many very mature ML packages available, there are a number of well-built, useful tools out there that will come to maturity soon.

Pro #2: JavaScript is now a general-purpose, cross-platform programming language

Another advantage of using JavaScript is the universality of the language. The modern web browser is essentially a portable application platform that allows you to run your code, basically without modification, on nearly any device. Tools like Electron (while considered by many to be bloated) allow developers to quickly develop and deploy downloadable desktop applications to any operating system. Node.js lets you run your code in a server environment. React Native brings your JavaScript code to the native mobile application environment, and may eventually allow you to develop desktop applications as well. JavaScript is no longer confined to just dynamic web interactions; it's now a general-purpose, cross-platform programming language.

Pro #3: JavaScript makes machine learning accessible to web and frontend developers

Finally, using JavaScript makes ML accessible to web and frontend developers, a group that has historically been left out of the ML discussion. Server-side applications are typically preferred for ML tools, since the servers are where the computing power is.
That fact has historically made it difficult for web developers to get into the ML game, but as hardware improves, even complex ML models can be run on the client, whether it's the desktop or the mobile browser. If web developers, frontend developers, and JavaScript developers all start learning about ML today, that same community will be in a position to improve the ML tools available to us all tomorrow. If we take these technologies and democratize them, exposing as many people as possible to the concepts behind ML, we will ultimately elevate the community and seed the next generation of ML researchers.

Summary

In this article, we've discussed the important moments of JavaScript's history as applied to ML. We've covered some advantages of using JavaScript for machine learning, and also some of the challenges we're facing, particularly in terms of the machine learning ecosystem. To begin exploring and processing the data itself, read the book Hands-on Machine Learning with JavaScript.