How-To Tutorials


Understanding the TensorFlow data model [Tutorial]

Sugandha Lahoti
16 Sep 2018
12 min read
TensorFlow is a mathematical software and an open source framework for deep learning developed by the Google Brain Team in 2011. Nevertheless, it can be used to help us analyze data in order to predict an effective business outcome. Although the initial target of TensorFlow was to conduct research in ML and in Deep Neural Networks (DNNs), the system is general enough to be applicable to a wide variety of classical machine learning algorithm such as Support Vector Machine (SVM), logistic regression, decision trees, random forest and so on. In this article we will talk about data model in TensorFlow. The data model in TensorFlow is represented by tensors. Without using complex mathematical definitions, we can say that a tensor (in TensorFlow) identifies a multidimensional numerical array. We will see more details on tensors in the next subsection. This article is taken from the book Deep Learning with TensorFlow - Second Edition by Giancarlo Zaccone and Md. Rezaul Karim. In this book, we will delve into neural networks, implement deep learning algorithms, and explore layers of data abstraction with the help of TensorFlow. Tensors in a data model Let's see the formal definition of tensor on Wikipedia, as follows: "Tensors are geometric objects that describe linear relations between geometric vectors, scalars, and other tensors. Elementary examples of such relations include the dot product, the cross product, and linear maps. Geometric vectors, often used in physics and engineering applications, and scalars themselves are also tensors." This data structure is characterized by three parameters: rank, shape, and type, as shown in the following figure: Figure 6: Tensors are nothing but geometric objects with a shape, rank, and type, used to hold a multidimensional array A tensor can thus be thought of as the generalization of a matrix that specifies an element with an arbitrary number of indices. The syntax for tensors is more or less the same as nested vectors. [box type="shadow" align="" class="" width=""]Tensors just define the type of this value and the means by which this value should be calculated during the session. Therefore, they do not represent or hold any value produced by an operation.[/box] Some people love to compare NumPy and TensorFlow. However, in reality, TensorFlow and NumPy are quite similar in the sense that both are N-d array libraries! Well, it's true that NumPy has n-dimensional array support, but it doesn't offer methods to create tensor functions and automatically compute derivatives (and it has no GPU support). The following figure is a short and one-to-one comparison of NumPy and TensorFlow: Figure 7: NumPy versus TensorFlow: a one-to-one comparison Now let's see an alternative way of creating tensors before they could be fed (we will see other feeding mechanisms later on) by the TensorFlow graph: >>> X = [[2.0, 4.0],        [6.0, 8.0]] # X is a list of lists >>> Y = np.array([[2.0, 4.0],              [6.0, 6.0]], dtype=np.float32)#Y is a Numpy array >>> Z = tf.constant([[2.0, 4.0],                 [6.0, 8.0]]) # Z is a tensor Here, X is a list, Y is an n-dimensional array from the NumPy library, and Z is a TensorFlow tensor object. Now let's see their types: >>> print(type(X)) >>> print(type(Y)) >>> print(type(Z)) #Output <class 'list'> <class 'numpy.ndarray'> <class 'tensorflow.python.framework.ops.Tensor'> Well, their types are printed correctly. 
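As the note above says, a tensor such as Z defines only the type of value and how it will be computed during a session; it holds no values on its own. A minimal sketch, using the TF 1.x Session API assumed throughout this article, makes this visible:

import tensorflow as tf

Z = tf.constant([[2.0, 4.0], [6.0, 8.0]])  # the same Z as above: a symbolic tensor

print(Z)  # prints the tensor's name, shape, and dtype, but no values
with tf.Session() as sess:
    print(sess.run(Z))  # only here are the values [[2. 4.] [6. 8.]] actually produced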
However, a more convenient way to make sure that we are formally dealing with tensors, as opposed to the other types, is the tf.convert_to_tensor() function:

t1 = tf.convert_to_tensor(X, dtype=tf.float32)
t2 = tf.convert_to_tensor(Y, dtype=tf.float32)

Now let's see their types using the following code:

>>> print(type(t1))
>>> print(type(t2))

#Output:
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.framework.ops.Tensor'>

Fantastic! That's enough discussion about tensors for now. So, we can think about the structure that is characterized by the term rank.

Rank and shape of Tensors

Each tensor is described by a unit of dimensionality called rank, which identifies the number of dimensions of the tensor. For this reason, rank is also known as the order or n-dimensions of a tensor. A rank zero tensor is a scalar, a rank one tensor is a vector, and a rank two tensor is a matrix. The following code defines a TensorFlow scalar, a vector, a matrix, and a cube_matrix, and shows how rank works:

import tensorflow as tf

scalar = tf.constant(100)
vector = tf.constant([1,2,3,4,5])
matrix = tf.constant([[1,2,3],[4,5,6]])
cube_matrix = tf.constant([[[1],[2],[3]],[[4],[5],[6]],[[7],[8],[9]]])

print(scalar.get_shape())
print(vector.get_shape())
print(matrix.get_shape())
print(cube_matrix.get_shape())

The results are printed here:

>>>
()
(5,)
(2, 3)
(3, 3, 1)
>>>

The shape of a tensor is the number of elements it has in each dimension (for a matrix, the number of rows and columns). Now we will see how to relate the shape of a tensor to its rank:

>>> scalar.get_shape()
TensorShape([])
>>> vector.get_shape()
TensorShape([Dimension(5)])
>>> matrix.get_shape()
TensorShape([Dimension(2), Dimension(3)])
>>> cube_matrix.get_shape()
TensorShape([Dimension(3), Dimension(3), Dimension(1)])

Data type of Tensors

In addition to rank and shape, tensors have a data type. Here is a list of the data types:

Data type      Python type     Description
DT_FLOAT       tf.float32      32-bit floating point
DT_DOUBLE      tf.float64      64-bit floating point
DT_INT8        tf.int8         8-bit signed integer
DT_INT16       tf.int16        16-bit signed integer
DT_INT32       tf.int32        32-bit signed integer
DT_INT64       tf.int64        64-bit signed integer
DT_UINT8       tf.uint8        8-bit unsigned integer
DT_STRING      tf.string       Variable-length byte array; each element of a tensor is a byte array
DT_BOOL        tf.bool         Boolean
DT_COMPLEX64   tf.complex64    Complex number made of two 32-bit floating points: real and imaginary parts
DT_COMPLEX128  tf.complex128   Complex number made of two 64-bit floating points: real and imaginary parts
DT_QINT8       tf.qint8        8-bit signed integer used in quantized ops
DT_QINT32      tf.qint32       32-bit signed integer used in quantized ops
DT_QUINT8      tf.quint8       8-bit unsigned integer used in quantized ops

The preceding table is self-explanatory, so we have not provided a detailed discussion of the data types. The TensorFlow APIs are implemented to manage data to and from NumPy arrays. Thus, to build a tensor with a constant value, pass a NumPy array to the tf.constant() operator, and the result will be a tensor with that value:

import tensorflow as tf
import numpy as np

array_1d = np.array([1,2,3,4,5,6,7,8,9,10])
tensor_1d = tf.constant(array_1d)

with tf.Session() as sess:
    print(tensor_1d.get_shape())
    print(sess.run(tensor_1d))

# Close the TensorFlow session when you're done
sess.close()

Running the example, we obtain the following:

>>>
(10,)
[ 1  2  3  4  5  6  7  8  9 10]

To build a tensor with variable values, use a NumPy array and pass it to the tf.Variable constructor.
The result will be a variable tensor with that initial value:

import tensorflow as tf
import numpy as np

# Create a sample NumPy array
array_2d = np.array([(1,2,3),(4,5,6),(7,8,9)])

# Now pass the preceding array to tf.Variable()
tensor_2d = tf.Variable(array_2d)

# Execute the preceding op under an active session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(tensor_2d.get_shape())
    print(sess.run(tensor_2d))

# Finally, close the TensorFlow session when you're done
sess.close()

In the preceding code block, tf.global_variables_initializer() is used to initialize all the variables we created before. If you need to create a variable with an initial value that depends on another variable, use the other variable's initialized_value(). This ensures that variables are initialized in the right order. The result is as follows:

>>>
(3, 3)
[[1 2 3]
 [4 5 6]
 [7 8 9]]

For ease of use in interactive Python environments, we can use the InteractiveSession class, and then use that session for all Tensor.eval() and Operation.run() calls:

import tensorflow as tf  # Import TensorFlow
import numpy as np       # Import NumPy

# Create an interactive TensorFlow session
interactive_session = tf.InteractiveSession()

# Create a 1d NumPy array
array1 = np.array([1,2,3,4,5])  # An array

# Then convert the preceding array into a tensor
tensor = tf.constant(array1)  # convert to tensor
print(tensor.eval())          # evaluate the tensor op
interactive_session.close()   # close the session

Note: tf.InteractiveSession() is just convenient syntactic sugar for keeping a default session open in IPython.

The result is as follows:

>>>
[1 2 3 4 5]

This can be easier in an interactive setting, such as the shell or an IPython Notebook, as it can be tedious to pass around a Session object everywhere.

Note: The IPython Notebook is now known as the Jupyter Notebook. It is an interactive computational environment in which you can combine code execution, rich text, mathematics, plots, and rich media. For more information, interested readers should refer to https://ipython.org/notebook.html.

Another way to define a tensor is to use the tf.convert_to_tensor statement:

import tensorflow as tf
import numpy as np

tensor_3d = np.array([[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
                      [[9, 10, 11], [12, 13, 14], [15, 16, 17]],
                      [[18, 19, 20], [21, 22, 23], [24, 25, 26]]])
tensor_3d = tf.convert_to_tensor(tensor_3d, dtype=tf.float64)

with tf.Session() as sess:
    print(tensor_3d.get_shape())
    print(sess.run(tensor_3d))

# Finally, close the TensorFlow session when you're done
sess.close()

Following is the output of the preceding code:

>>>
(3, 3, 3)
[[[  0.   1.   2.]
  [  3.   4.   5.]
  [  6.   7.   8.]]

 [[  9.  10.  11.]
  [ 12.  13.  14.]
  [ 15.  16.  17.]]

 [[ 18.  19.  20.]
  [ 21.  22.  23.]
  [ 24.  25.  26.]]]

Variables

Variables are TensorFlow objects used to hold and update parameters. A variable must be initialized so that you can save and restore it to analyze your code later on. Variables are created using either tf.Variable() or tf.get_variable(); tf.get_variable() is the recommended approach, while tf.Variable() is a lower-level abstraction.
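The initialized_value() tip mentioned above can be made concrete with a short, illustrative sketch (the variable names here are not from the book):

import tensorflow as tf

# A variable with its own initializer
weights = tf.Variable(tf.random_normal([2, 2]), name="weights")

# A second variable whose initial value depends on the first one;
# initialized_value() guarantees that 'weights' is initialized first
scaled = tf.Variable(weights.initialized_value() * 2.0, name="scaled")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([weights, scaled]))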
In the following example, we want to count the numbers from 1 to 10, but let's import TensorFlow first: import tensorflow as tf We created a variable that will be initialized to the scalar value 0: value = tf.get_variable("value", shape=[], dtype=tf.int32, initializer=None, regularizer=None, trainable=True, collections=None) The assign() and add() operators are just nodes of the computation graph, so they do not execute the assignment until the session is run: one = tf.constant(1) update_value = tf.assign_add(value, one) initialize_var = tf.global_variables_initializer() We can instantiate the computation graph: with tf.Session() as sess:    sess.run(initialize_var)    print(sess.run(value))    for _ in range(5):        sess.run(update_value)        print(sess.run(value)) # Close the session sess.close() Let's recall that a tensor object is a symbolic handle to the result of an operation, but it does not actually hold the values of the operation's output: >>> 0 1 2 3 4 5 Fetches To fetch the output of an operation, the graph can be executed by calling run() on the session object and passing in the tensors. Apart from fetching a single tensor node, you can also fetch multiple tensors. In the following example, the sum and multiply tensors are fetched together using the run() call: import tensorflow as tf constant_A = tf.constant([100.0]) constant_B = tf.constant([300.0]) constant_C = tf.constant([3.0]) sum_ = tf.add(constant_A,constant_B) mul_ = tf.multiply(constant_A,constant_C) with tf.Session() as sess:    result = sess.run([sum_,mul_])# _ means throw away afterwards    print(result) # Finally, close the TensorFlow session when you're done: sess.close() The output is as follows: >>> [array(400.],dtype=float32),array([ 300.],dtype=float32)] It should be noted that all the ops that need to be executed (that is, in order to produce tensor values) are run once (not once per requested tensor). Feeds and placeholders There are four methods of getting data into a TensorFlow program (for more information, see https://www.tensorflow.org/api_guides/python/reading_data): The Dataset API: This enables you to build complex input pipelines from simple and reusable pieces of distributed filesystems and perform complex operations. Using the Dataset API is recommended if you are dealing with large amounts of data in different data formats. The Dataset API introduces two new abstractions to TensorFlow for creating a feedable dataset: tf.contrib.data.Dataset (by creating a source or applying transformation operations) and tf.contrib.data.Iterator. Feeding: This allows us to inject data into any tensor in a computation graph. Reading from files: This allows us to develop an input pipeline using Python's built-in mechanism for reading data from data files at the beginning of the graph. Preloaded data: For a small dataset, we can use either constants or variables in the TensorFlow graph to hold all the data. In this section, we will see an example of a feeding mechanism. TensorFlow provides a feed mechanism that allows us to inject data into any tensor in a computation graph. You can provide the feed data through the feed_dict argument to a run() or eval() invocation that initiates the computation. [box type="shadow" align="" class="" width=""]Feeding using feed_dict argument is the least efficient way to feed data into a TensorFlow execution graph and should only be used for small experiments needing small dataset. 
It can also be used for debugging.[/box] We can also replace any tensor with feed data (that is, variables and constants). Best practice is to use a TensorFlow placeholder node using tf.placeholder() (https://www.tensorflow.org/api_docs/python/tf/placeholder). A placeholder exists exclusively to serve as the target of feeds. An empty placeholder is not initialized, so it does not contain any data. Therefore, it will always generate an error if it is executed without a feed, so you won't forget to feed it. The following example shows how to feed data to build a random 2×3 matrix: import tensorflow as tf import numpy as np a = 3 b = 2 x = tf.placeholder(tf.float32,shape=(a,b)) y = tf.add(x,x) data = np.random.rand(a,b) sess = tf.Session() print(sess.run(y,feed_dict={x:data})) sess.close()# close the session The output is as follows: >>> [[ 1.78602004  1.64606333] [ 1.03966308  0.99269408] [ 0.98822606  1.50157797]] >>> We understood the data model in TensorFlow. To understand the TensorFlow computational graph and the TensorFlow code structure, read our book Deep Learning with TensorFlow - Second Edition. Why TensorFlow always tops machine learning and artificial intelligence tool surveys. TensorFlow 2.0 is coming. Here’s what we can expect. Getting to know and manipulate Tensors in TensorFlow.


How to perform sentiment analysis using Python [Tutorial]

Sugandha Lahoti
15 Sep 2018
4 min read
Sentiment analysis is one of the most popular applications of NLP. Sentiment analysis refers to the process of determining whether a given piece of text is positive or negative. In some variations, we consider "neutral" as a third option. This technique is commonly used to discover how people feel about a particular topic. This is used to analyze the sentiments of users in various forms, such as marketing campaigns, social media, e-commerce customers, and so on. In this article, we will perform sentiment analysis using Python. This extract is taken from Python Machine Learning Cookbook by Prateek Joshi. This book contains 100 recipes that teach you how to perform various machine learning tasks in the real world. How to Perform Sentiment Analysis in Python Step 1: Create a new Python file, and import the following packages: import nltk.classify.util from nltk.classify import NaiveBayesClassifier from nltk.corpus import movie_reviews Step 2: Define a function to extract features: def extract_features(word_list): return dict([(word, True) for word in word_list]) Step 3: We need training data for this, so we will use movie reviews in NLTK: if __name__=='__main__':    # Load positive and negative reviews      positive_fileids = movie_reviews.fileids('pos')    negative_fileids = movie_reviews.fileids('neg') Step 4: Let's separate these into positive and negative reviews: features_positive = [(extract_features(movie_reviews.words(fileids=[f])),            'Positive') for f in positive_fileids]    features_negative = [(extract_features(movie_reviews.words(fileids=[f])),            'Negative') for f in negative_fileids] Step 5: Divide the data into training and testing datasets: # Split the data into train and test (80/20)    threshold_factor = 0.8    threshold_positive = int(threshold_factor * len(features_positive))    threshold_negative = int(threshold_factor * len(features_negative)) Step 6: Extract the features: features_train = features_positive[:threshold_positive] + features_negative[:threshold_negative]    features_test = features_positive[threshold_positive:] + features_negative[threshold_negative:]      print "\nNumber of training datapoints:", len(features_train)    print "Number of test datapoints:", len(features_test) Step 7: We will use a Naive Bayes classifier. Define the object and train it: # Train a Naive Bayes classifier    classifier = NaiveBayesClassifier.train(features_train)    print "\nAccuracy of the classifier:", nltk.classify.util.accuracy(classifier, features_test) Step 8: The classifier object contains the most informative words that it obtained during analysis. These words basically have a strong say in what's classified as a positive or a negative review. Let's print them out: print "\nTop 10 most informative words:"    for item in classifier.most_informative_features()[:10]:        print item[0] Step 9: Create a couple of random input sentences: # Sample input reviews    input_reviews = [        "It is an amazing movie",        "This is a dull movie. 
I would never recommend it to anyone.",        "The cinematography is pretty great in this movie",        "The direction was terrible and the story was all over the place"    ] Step 10: Run the classifier on those input sentences and obtain the predictions: print "\nPredictions:"    for review in input_reviews:        print "\nReview:", review        probdist = classifier.prob_classify(extract_features(review.split()))        pred_sentiment = probdist.max() Step 11: Print the output: print "Predicted sentiment:", pred_sentiment        print "Probability:", round(probdist.prob(pred_sentiment), 2) If you run this code, you will see three main things printed on the Terminal. The first is the accuracy, as shown in the following image: The next is a list of most informative words: The last is the list of predictions, which are based on the input sentences: How does the Code work? We use NLTK's Naive Bayes classifier for our task here. In the feature extractor function, we basically extract all the unique words. However, the NLTK classifier needs the data to be arranged in the form of a dictionary. Hence, we arranged it in such a way that the NLTK classifier object can ingest it. Once we divide the data into training and testing datasets, we train the classifier to categorize the sentences into positive and negative. If you look at the top informative words, you can see that we have words such as "outstanding" to indicate positive reviews and words such as "insulting" to indicate negative reviews. This is interesting information because it tells us what words are being used to indicate strong reactions. Thus we learn how to perform Sentiment Analysis in Python. For more interesting machine learning recipes read our book, Python Machine Learning Cookbook. Understanding Sentiment Analysis and other key NLP concepts. Twitter Sentiment Analysis. Sentiment Analysis of the 2017 US elections on Twitter.
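One practical note on the code above: this recipe uses Python 2 print statements. If you are following along on Python 3, the prediction loop (steps 10 and 11) needs parentheses around print; a minimal adapted sketch, assuming the classifier and extract_features defined in the steps above, looks like this:

# Python 3 adaptation of steps 10 and 11; `classifier` and `extract_features`
# are the objects defined in the steps above
input_reviews = [
    "It is an amazing movie",
    "The direction was terrible and the story was all over the place"
]

print("\nPredictions:")
for review in input_reviews:
    print("\nReview:", review)
    # extract_features turns the word list into the dict form NLTK expects,
    # for example {'amazing': True, 'movie': True, ...}
    probdist = classifier.prob_classify(extract_features(review.split()))
    pred_sentiment = probdist.max()
    print("Predicted sentiment:", pred_sentiment)
    print("Probability:", round(probdist.prob(pred_sentiment), 2))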


How Facebook is advancing artificial intelligence [Video]

Richard Gall
14 Sep 2018
4 min read
Facebook is playing a huge role in artificial intelligence research. It’s not only a core part of the Facebook platform, it’s central to how the organization works. The company launched its AI research lab - FAIR - back in 2013. Today, led by some of the best minds in the field, it's not only helping Facebook to leverage artificial intelligence, it's also making it more accessible to researchers and engineers around the world. Let’s take a look at some of the tools built by Facebook that are doing just that. PyTorch: Facebook's leading artificial intelligence tool PyTorch is a hugely popular deep learning framework (rivalling Google's TensorFlow) that, by combining flexiblity and dynamism with stability, bridges the gap between research and production. Using a tape-based auto-differentiation system, PyTorch can be modified and changed by engineers without losing speed. That’s good news for everyone. Although PyTorch steals the headlines, there are a range of supporting tools that are making artificial intelligence and deep learning more accessible and achievable for other engineers. Read next: Is PyTorch better than Google’s TensorFlow? Find PyTorch eBooks and videos on the Packt website.  Facebook's computer vision tools Another field that Facebook has revolutionized is computer vision and image processing. Detectron, Facebook’s state-of-the-art object detection software system, has powered many research projects including Mask R-CNN - a simple and flexible way of developing Convolution Neural Networks for image processing. Mask R-CNN has also helped to power DensePose, a tool that map all human pixels of an RGB image to a 3D surface-based representation of the human body. Facebook has also heavily contributed to research in detecting and recognizing Human-Object interactions as well. Their contribution to the field of generative modeling is equally very important, with tasks such as minimizing variations in the quality of images, JPEG compression as well as image quantization now becoming easier and more accessible. Facebook, language and artificial intelligence We share updates, we send messages - language is a cornerstone of Facebook. This is why it's such an important area for Facebook’s AI researchers. There are a whole host of libraries and tools that are built for language problems. FastText is a library for text representation and classification, while ParlAI is a platform pushing the boundaries of dialog research. The platform is focused on tackling 5 key AI tasks: question answering, sentence completion, goal-oriented dialog, chit-chat dialog, and visual dialog. The ultimate aim for ParlAI is to develop a general dialog AI. There are also a few more language tools in Facebook’s AI toolkit - Fairseq and Translate are helping with translation and text generation, while Wav2Letter is an Automatic Speech Recognition system that can be used for transcription tasks. Rational artificial intelligence for gaming and smart decision making Although Facebook isn’t known for gaming, its interest in developing artificial intelligence that can reason could have an impact on the way games are built in the future. ELF is a tool developed by Facebook that allows game developers to train and test AI algorithms in a gaming environment. ELF was used by Facebook researchers to recreate DeepMind’s AlphaGo Zero, the AI bot that has defeated Go champions. Running on a single GPU, the ELF OpenGo bot defeated four professional Go players 14-0. Impressive, right? 
There are other tools built by Facebook that aim to bring AI into game reasoning. TorchCraft is probably the most notable example: it's a library that makes AI research on StarCraft, a strategy game, accessible to game developers and AI specialists alike.

Facebook is defining the future of artificial intelligence

As you can see, Facebook is doing a lot to push the boundaries of artificial intelligence. However, it's not just keeping these tools to itself: all of them are open source, which means they can be used by anyone.
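As a closing aside on the tape-based auto-differentiation mentioned earlier for PyTorch, here is a minimal sketch, purely illustrative and not from the video, of how recorded operations are replayed to compute gradients:

import torch

# requires_grad=True tells PyTorch to record every operation on x on the "tape"
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()  # y = x1^2 + x2^2, recorded as it runs

y.backward()   # replay the tape backwards to compute dy/dx
print(x.grad)  # tensor([4., 6.]), that is, 2 * x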


Emotional AI: Detecting facial expressions and emotions using CoreML [Tutorial]

Savia Lobo
14 Sep 2018
11 min read
Recently we see computers allow natural forms of interaction and are becoming more ubiquitous, more capable, and more ingrained in our daily lives. They are becoming less like heartless dumb tools and more like friends, able to entertain us, look out for us, and assist us with our work. This article is an excerpt taken from the book Machine Learning with Core ML authored by Joshua Newnham. With this shift comes a need for computers to be able to understand our emotional state. For example, you don't want your social robot cracking a joke after you arrive back from work having lost your job (to an AI bot!). This is a field of computer science known as affective computing (also referred to as artificial emotional intelligence or emotional AI), a field that studies systems that can recognize, interpret, process, and simulate human emotions. The first stage of this is being able to recognize the emotional state. In this article, we will be creating a model that can detect the exact face expression or emotion using CoreML. Input data and preprocessing We will implement the preprocessing functionality required to transform images into something the model is expecting. We will build up this functionality in a playground project before migrating it across to our project in the next section. If you haven't done so already, pull down the latest code from the accompanying repository: https://github.com/packtpublishing/machine-learning-with-core-ml. Once downloaded, navigate to the directory Chapter4/Start/ and open the Playground project ExploringExpressionRecognition.playground. Once loaded, you will see the playground for this extract, as shown in the following screenshot: Before starting, to avoid looking at images of me, please replace the test images with either personal photos of your own or royalty free images from the internet, ideally a set expressing a range of emotions. Along with the test images, this playground includes a compiled Core ML model (we introduced it in the previous image) with its generated set of wrappers for inputs, outputs, and the model itself. Also included are some extensions for UIImage, UIImageView, CGImagePropertyOrientation, and an empty CIImage extension, to which we will return later in the extract. The others provide utility functions to help us visualize the images as we work through this playground. When developing machine learning applications, you have two broad paths. The first, which is becoming increasingly popular, is to use an end-to-end machine learning model capable of just being fed the raw input and producing adequate results. One particular field that has had great success with end-to-end models is speech recognition. Prior to end-to-end deep learning, speech recognition systems were made up of many smaller modules, each one focusing on extracting specific pieces of data to feed into the next module, which was typically manually engineered. Modern speech recognition systems use end-to-end models that take the raw input and output the result. Both of the described approaches can been seen in the following diagram: Obviously, this approach is not constrained to speech recognition and we have seen it applied to image recognition tasks, too, along with many others. But there are two things that make this particular case different; the first is that we can simplify the problem by first extracting the face. This means our model has less features to learn and offers a smaller, more specialized model that we can tune. 
The second thing, which is no doubt obvious, is that our training data consisted of only faces and not natural images. So, we have no other choice but to run our data through two models, the first to extract faces and the second to perform expression recognition on the extracted faces, as shown in this diagram: Luckily for us, Apple has mostly taken care of our first task of detecting faces through the Vision framework it released with iOS 11. The Vision framework provides performant image analysis and computer vision tools, exposing them through a simple API. This allows for face detection, feature detection and tracking, and classification of scenes in images and video. The latter (expression recognition) is something we will take care of using the Core ML model introduced earlier. Prior to the introduction of the Vision framework, face detection would typically be performed using the Core Image filter. Going back further, you had to use something like OpenCV. You can learn more about Core Image here: https://developer.apple.com/library/content/documentation/GraphicsImaging/Conceptual/CoreImaging/ci_detect_faces/ci_detect_faces.html. Now that we have got a bird's-eye view of the work that needs to be done, let's turn our attention to the editor and start putting all of this together. Start by loading the images; add the following snippet to your playground: var images = [UIImage]() for i in 1...3{ guard let image = UIImage(named:"images/joshua_newnham_\(i).jpg") else{ fatalError("Failed to extract features") } images.append(image) } let faceIdx = 0 let imageView = UIImageView(image: images[faceIdx]) imageView.contentMode = .scaleAspectFit In the preceding snippet, we are simply loading each of the images we have included in our resources' Images folder and adding them to an array we can access conveniently throughout the playground. Once all the images are loaded, we set the constant faceIdx, which will ensure that we access the same images throughout our experiments. Finally, we create an ImageView to easily preview it. Once it has finished running, click on the eye icon in the right-hand panel to preview the loaded image, as shown in the following screenshot: Next, we will take advantage of the functionality available in the Vision framework to detect faces. The typical flow when working with the Vision framework is defining a request, which determines what analysis you want to perform, and defining the handler, which will be responsible for executing the request and providing means of obtaining the results (either through delegation or explicitly queried). The result of the analysis is a collection of observations that you need to cast into the appropriate observation type; concrete examples of each of these can be seen here: As illustrated in the preceding diagram, the request determines what type of image analysis will be performed; the handler, using a request or multiple requests and an image, performs the actual analysis and generates the results (also known as observations). These are accessible via a property or delegate if one has been assigned. The type of observation is dependent on the request performed; it's worth highlighting that the Vision framework is tightly integrated into Core ML and provides another layer of abstraction and uniformity between you and the data and process. For example, using a classification Core ML model would return an observation of type VNClassificationObservation. 
This layer of abstraction not only simplifies things but also provides a consistent way of working with machine learning models. In the previous figure, we showed a request handler specifically for static images. Vision also provides a specialized request handler for handling sequences of images, which is more appropriate when dealing with requests such as tracking. The following diagram illustrates some concrete examples of the types of requests and observations applicable to this use case: So, when do you use VNImageRequestHandler and VNSequenceRequestHandler? Though the names provide clues as to when one should be used over the other, it's worth outlining some differences. The image request handler is for interactive exploration of an image; it holds a reference to the image for its life cycle and allows optimizations of various request types. The sequence request handler is more appropriate for performing tasks such as tracking and does not optimize for multiple requests on an image. Let's see how this all looks in code; add the following snippet to your playground: let faceDetectionRequest = VNDetectFaceRectanglesRequest() let faceDetectionRequestHandler = VNSequenceRequestHandler() Here, we are simply creating the request and handler; as discussed in the preceding code, the request encapsulates the type of image analysis while the handler is responsible for executing the request. Next, we will get faceDetectionRequestHandler to run faceDetectionRequest; add the following code: try? faceDetectionRequestHandler.perform( [faceDetectionRequest], on: images[faceIdx].cgImage!, orientation: CGImagePropertyOrientation(images[faceIdx].imageOrientation)) The perform function of the handler can throw an error if it fails; for this reason, we wrap the call with try? at the beginning of the statement and can interrogate the error property of the handler to identify the reason for failing. We pass the handler a list of requests (in this case, only our faceDetectionRequest), the image we want to perform the analysis on, and, finally, the orientation of the image that can be used by the request during analysis. Once the analysis is done, we can inspect the observation obtained through the results property of the request itself, as shown in the following code: if let faceDetectionResults = faceDetectionRequest.results as? [VNFaceObservation]{ for face in faceDetectionResults{ // ADD THE NEXT SNIPPET OF CODE HERE } } The type of observation is dependent on the analysis; in this case, we're expecting a VNFaceObservation. Hence, we cast it to the appropriate type and then iterate through all the observations. Next, we will take each recognized face and extract the bounding box. Then, we'll proceed to draw it in the image (using an extension method of UIImageView found within the UIImageViewExtension.swift file). 
Add the following block within the for loop shown in the preceding code: if let currentImage = imageView.image{ let bbox = face.boundingBox let imageSize = CGSize( width:currentImage.size.width, height: currentImage.size.height) let w = bbox.width * imageSize.width let h = bbox.height * imageSize.height let x = bbox.origin.x * imageSize.width let y = bbox.origin.y * imageSize.height let faceRect = CGRect( x: x, y: y, width: w, height: h) let invertedY = imageSize.height - (faceRect.origin.y + faceRect.height) let invertedFaceRect = CGRect( x: x, y: invertedY, width: w, height: h) imageView.drawRect(rect: invertedFaceRect) } We can obtain the bounding box of each face via the let boundingBox property; the result is normalized, so we then need to scale this based on the dimensions of the image. For example, you can obtain the width by multiplying boundingBox with the width of the image: bbox.width * imageSize.width. Next, we invert the y axis as the coordinate system of Quartz 2D is inverted with respect to that of UIKit's coordinate system, as shown in this diagram: We invert our coordinates by subtracting the bounding box's origin and height from height of the image and then passing this to our UIImageView to render the rectangle. Click on the eye icon in the right-hand panel in line with the statement imageView.drawRect(rect: invertedFaceRect) to preview the results; if successful, you should see something like the following: An alternative to inverting the face rectangle would be to use an AfflineTransform, such as: var transform = CGAffineTransform(scaleX: 1, y: -1) transform = transform.translatedBy(x: 0, y: -imageSize.height) let invertedFaceRect = faceRect.apply(transform) This approach leads to less code and therefore less chances of errors. So, it is the recommended approach. The long approach was taken previously to help illuminate the details. As a designer and builder of intelligent systems, it is your task to interpret these results and present them to the user. Some questions you'll want to ask yourself are as follows: What is an acceptable threshold of a probability before setting the class as true? Can this threshold be dependent on probabilities of other classes to remove ambiguity? That is, if Sad and Happy have a probability of 0.3, you can infer that the prediction is inaccurate, or at least not useful. Is there a way to accept multiple probabilities? Is it useful to expose the threshold to the user and have it manually set and/or tune it? These are only a few questions you should ask. The specific questions and their answers will depend on your use case and users. At this point, we have everything we need to preprocess and perform inference We briefly explored some use cases showing how emotion recognition could be applied. For a detailed overview of this experiment, check out our book, Machine Learning with Core ML to further implement Core ML for visual-based applications using the principles of transfer learning and neural networks. Amazon Rekognition can now ‘recognize’ faces in a crowd at real-time 5 cool ways Transfer Learning is being used today My friend, the robot: Artificial Intelligence needs Emotional Intelligence


AWS machine learning: Learning AWS CLI to execute a simple Amazon ML workflow [Tutorial]

Melisha Dsouza
13 Sep 2018
15 min read
Using the AWS web interface to manage and run your projects is time-consuming. We will, therefore, start running our projects via the command line with the AWS Command Line Interface (AWS CLI). With just one tool to download and configure, multiple  AWS services can be controlled from the command line and they can be automated through scripts. The code files for this article are available on Github. This article is an excerpt from a book written by Alexis Perrier titled Effective Amazon Machine Learning. Getting started and setting up Creating a performing predictive model from raw data requires many trials and errors, much back and forth. Creating new features, cleaning up data, and trying out new parameters for the model are needed to ensure the robustness of the model. There is a constant back and forth between the data, the models, and the evaluations. Scripting this workflow either via the AWS CLI will give us the ability to speed up the create, test, select loop. Installing AWS CLI In order to set up your CLI credentials, you need your access key ID and your secret access key.  You can simply create them from the IAM console (https://console.aws.amazon.com/iam). Navigate to Users, select your IAM user name and click on the Security credentials tab. Choose Create Access Key and download the CSV file. Store the keys in a secure location. We will need the key in a few minutes to set up AWS CLI. But first, we need to install AWS CLI. Docker environment – This tutorial will help you use the AWS CLI within a docker container: https://blog.flowlog-stats.com/2016/05/03/aws-cli-in-a-docker-container/. A docker image for running the AWS CLI is available at https://hub.docker.com/r/fstab/aws-cli/. There is no need to rewrite the AWS documentation on how to install the AWS CLI. It is complete and up to date, and available at http://docs.aws.amazon.com/cli/latest/userguide/installing.html. In a nutshell, installing the CLI requires you to have Python and pip already installed. Then, run the following: $ pip install --upgrade --user awscli Add AWS to your $PATH: $ export PATH=~/.local/bin:$PATH Reload the bash configuration file (this is for OSX): $ source ~/.bash_profile Check that everything works with the following command: $ aws --version You should see something similar to the following output: $ aws-cli/1.11.47 Python/3.5.2 Darwin/15.6.0 botocore/1.5.10 Once installed, we need to configure the AWS CLI type: $ aws configure Now input the access keys you just created: $ aws configure AWS Access Key ID [None]: ABCDEF_THISISANEXAMPLE AWS Secret Access Key [None]: abcdefghijk_THISISANEXAMPLE Default region name [None]: us-west-2 Default output format [None]: json Choose the region that is closest to you and the format you prefer (JSON, text, or table). JSON is the default format. The AWS configure command creates two files: a config file and a credential file. On OSX, the files are ~/.aws/config and ~/.aws/credentials. You can directly edit these files to change your access or configuration. You will need to create different profiles if you need to access multiple AWS accounts. 
You can do so via the AWS configure command: $ aws configure --profile user2 You can also do so directly in the config and credential files: ~/.aws/config [default] output = json region = us-east-1 [profile user2] output = text region = us-west-2 You can edit Credential file as follows: ~/.aws/credentials [default] aws_secret_access_key = ABCDEF_THISISANEXAMPLE aws_access_key_id = abcdefghijk_THISISANEXAMPLE [user2] aws_access_key_id = ABCDEF_ANOTHERKEY aws_secret_access_key = abcdefghijk_ANOTHERKEY Refer to the AWS CLI setup page for more in-depth information: http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html Picking up CLI syntax The overall format of any AWS CLI command is as follows: $ aws <service> [options] <command> <subcommand> [parameters] Here the terms are stated as: <service>: Is the name of the service you are managing: S3, machine learning, and EC2 [options] : Allows you to set the region, the profile, and the output of the command <command> <subcommand>: Is the actual command you want to execute  [parameters] : Are the parameters for these commands A simple example will help you understand the syntax better. To list the content of an S3 bucket named aml.packt, the command is as follows: $ aws s3 ls aml.packt Here, s3 is the service, ls is the command, and aml.packt is the parameter. The aws help command will output a list of all available services. There are many more examples and explanations on the AWS documentation available at http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-using.html. Passing parameters using JSON files For some services and commands, the list of parameters can become long and difficult to check and maintain. For instance, in order to create an Amazon ML model via the CLI, you need to specify at least seven different elements: the Model ID, name, type, the model's parameters, the ID of the training data source, and the recipe name and URI (aws machinelearning create-ml-model help ). When possible, we will use the CLI ability to read parameters from a JSON file instead of specifying them in the command line. AWS CLI also offers a way to generate a JSON template, which you can then use with the right parameters. To generate that JSON parameter file model (the JSON skeleton), simply add --generate-cli-skeleton after the command name. For instance, to generate the JSON skeleton for the create model command of the machine learning service, write the following: $ aws machinelearning create-ml-model --generate-cli-skeleton This will give the following output: { "MLModelId": "", "MLModelName": "", "MLModelType": "", "Parameters": { "KeyName": "" }, "TrainingDataSourceId": "", "Recipe": "", "RecipeUri": "" } You can then configure this to your liking. To have the skeleton command generate a JSON file and not simply output the skeleton in the terminal, add > filename.json: $ aws machinelearning create-ml-model --generate-cli-skeleton > filename.json This will create a filename.json file with the JSON template. Once all the required parameters are specified, you create the model with the command (assuming the filename.json is in the current folder): $ aws machinelearning create-ml-model file://filename.json Before we dive further into the machine learning workflow via the CLI, we need to introduce the dataset we will be using in this chapter. Introducing the Ames Housing dataset We will use the Ames Housing dataset that was compiled by Dean De Cock for use in data science education. 
It is a great alternative to the popular but older Boston Housing dataset. The Ames Housing dataset is used in the Advanced Regression Techniques challenge on the Kaggle website: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/. The original version of the dataset is available: http://www.amstat.org/publications/jse/v19n3/decock/AmesHousing.xls and in the GitHub repository for this chapter. For more information on the genesis of this dataset and an in-depth explanation of the different variables, read the paper by Dean De Cock available in PDF at https://ww2.amstat.org/publications/jse/v19n3/decock.pdf. We will start by splitting the dataset into a train and a validate set and build a model on the train set. Both train and validate sets are available in the GitHub repository as ames_housing_training.csv and ames_housing_validate.csv. The entire dataset is in the ames_housing.csv file. Splitting the dataset with shell commands We will use shell commands to shuffle, split, and create training and validation subsets of the Ames Housing dataset: First, extract the first line into a separate file, ames_housing_header.csv and remove it from the original file: $ head -n 1 ames_housing.csv > ames_housing_header.csv We just tail all the lines after the first one into the same file: $ tail -n +2 ames_housing.csv > ames_housing_nohead.csv Then randomly sort the rows into a temporary file. (gshuf is the OSX equivalent of the Linux shuf shell command. It can be installed via brew install coreutils): $ gshuf ames_housing_nohead.csv -o ames_housing_nohead.csv Extract the first 2,050 rows as the training file and the last 880 rows as the validation file: $ head -n 2050 ames_housing_nohead.csv > ames_housing_training.csv $ tail -n 880 ames_housing_nohead.csv > ames_housing_validate.csv Finally, add back the header into both training and validation files: $ cat ames_housing_header.csv ames_housing_training.csv > tmp.csv $ mv tmp.csv ames_housing_training.csv $ cat ames_housing_header.csv ames_housing_validate.csv > tmp.csv $ mv tmp.csv ames_housing_validate.csv A simple project using AWS CLI We are now ready to execute a simple Amazon ML workflow using the CLI. This includes the following: Uploading files on S3 Creating a datasource and the recipe Creating a model Creating an evaluation Prediction batch and real time Let's start by uploading the training and validation files to S3. In the following lines, replace the bucket name aml.packt with your own bucket name. To upload the files to the S3 location s3://aml.packt/data/ch8/, run the following command lines: $ aws s3 cp ./ames_housing_training.csv s3://aml.packt/data/ch8/ upload: ./ames_housing_training.csv to s3://aml.packt/data/ch8/ames_housing_training.csv $ aws s3 cp ./ames_housing_validate.csv s3://aml.packt/data/ch8/ upload: ./ames_housing_validate.csv to s3://aml.packt/data/ch8/ames_housing_validate.csv An overview of Amazon ML CLI commands That's it for the S3 part. Now let's explore the CLI for Amazon's machine learning service. All Amazon ML CLI commands are available at http://docs.aws.amazon.com/cli/latest/reference/machinelearning/. There are 30 commands, which can be grouped by object and action. 
You can perform the following: create : creates the object describe: searches objects given some parameters (location, dates, names, and so on) get: given an object ID, returns information update: given an object ID, updates the object delete: deletes an object These can be performed on the following elements: datasource create-data-source-from-rds create-data-source-from-redshift create-data-source-from-s3 describe-data-sources delete-data-source get-data-source update-data-source ml-model create-ml-model describe-ml-models get-ml-model delete-ml-model update-ml-model evaluation create-evaluation describe-evaluations get-evaluation delete-evaluation update-evaluation batch prediction create-batch-prediction describe-batch-predictions get-batch-prediction delete-batch-prediction update-batch-prediction real-time end point create-realtime-endpoint delete-realtime-endpoint predict You can also handle tags and set waiting times. Note that the AWS CLI gives you the ability to create datasources from S3, Redshift, and RDS, while the web interface only allowed datasources from S3 and Redshift. Creating the datasource We will start by creating the datasource. Let's first see what parameters are needed by generating the following skeleton: $ aws machinelearning create-data-source-from-s3 --generate-cli-skeleton This generates the following JSON object: { "DataSourceId": "", "DataSourceName": "", "DataSpec": { "DataLocationS3": "", "DataRearrangement": "", "DataSchema": "", "DataSchemaLocationS3": "" }, "ComputeStatistics": true } The different parameters are mostly self-explanatory and further information can be found on the AWS documentation at http://docs.aws.amazon.com/cli/latest/reference/machinelearning/create-data-source-from-s3.html. A word on the schema: when creating a datasource from the web interface, you have the possibility to use a wizard, to be guided through the creation of the schema. The wizard facilitates the process by guessing the type of the variables, thus making available a default schema that you can modify. There is no default schema available via the AWS CLI. You have to define the entire schema yourself, either in a JSON format in the DataSchema field or by uploading a schema file to S3 and specifying its location, in the DataSchemaLocationS3 field. Since our dataset has many variables (79), we cheated and used the wizard to create a default schema that we uploaded to S3. Throughout the rest of the chapter, we will specify the schema location not its JSON definition. In this example, we will create the following datasource parameter file, dsrc_ames_housing_001.json: { "DataSourceId": "ch8_ames_housing_001", "DataSourceName": "[DS] Ames Housing 001", "DataSpec": { "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_training.csv", "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema" }, "ComputeStatistics": true } For the validation subset (save to dsrc_ames_housing_002.json): { "DataSourceId": "ch8_ames_housing_002", "DataSourceName": "[DS] Ames Housing 002", "DataSpec": { "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_validate.csv", "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema" }, "ComputeStatistics": true } Since we have already split our data into a training and a validation set, there's no need to specify the data DataRearrangement field. 
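As an aside, the shuffle-and-split that we did earlier with head, tail, and gshuf could equally be scripted in Python with pandas; a rough, equivalent sketch using the same file names:

import pandas as pd

# Load, shuffle, and split the Ames Housing data roughly 70/30
df = pd.read_csv("ames_housing.csv")
df = df.sample(frac=1, random_state=42).reset_index(drop=True)  # shuffle the rows

split = int(len(df) * 0.7)
df.iloc[:split].to_csv("ames_housing_training.csv", index=False)
df.iloc[split:].to_csv("ames_housing_validate.csv", index=False)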
Alternatively, we could also have avoided splitting our dataset and specified the following DataRearrangement on the original dataset, assuming it had been already shuffled: (save to dsrc_ames_housing_003.json): { "DataSourceId": "ch8_ames_housing_003", "DataSourceName": "[DS] Ames Housing training 003", "DataSpec": { "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_shuffled.csv", "DataRearrangement": "{"splitting":{"percentBegin":0,"percentEnd":70}}", "DataSchemaLocationS3": "s3://aml.packt/data/ch8/ames_housing.csv.schema" }, "ComputeStatistics": true } For the validation set (save to dsrc_ames_housing_004.json): { "DataSourceId": "ch8_ames_housing_004", "DataSourceName": "[DS] Ames Housing validation 004", "DataSpec": { "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_shuffled.csv", "DataRearrangement": "{"splitting":{"percentBegin":70,"percentEnd":100}}", }, "ComputeStatistics": true } Here, the ames_housing.csv file has previously been shuffled using the gshuf command line and uploaded to S3: $ gshuf ames_housing_nohead.csv -o ames_housing_nohead.csv $ cat ames_housing_header.csv ames_housing_nohead.csv > tmp.csv $ mv tmp.csv ames_housing_shuffled.csv $ aws s3 cp ./ames_housing_shuffled.csv s3://aml.packt/data/ch8/ Note that we don't need to create these four datasources; these are just examples of alternative ways to create datasources. We then create these datasources by running the following: $ aws machinelearning create-data-source-from-s3 --cli-input-json file://dsrc_ames_housing_001.json We can check whether the datasource creation is pending: In return, we get the datasoure ID we had specified: { "DataSourceId": "ch8_ames_housing_001" } We can then obtain information on that datasource with the following: $ aws machinelearning get-data-source --data-source-id ch8_ames_housing_001 This returns the following: { "Status": "COMPLETED", "NumberOfFiles": 1, "CreatedByIamUser": "arn:aws:iam::178277xxxxxxx:user/alexperrier", "LastUpdatedAt": 1486834110.483, "DataLocationS3": "s3://aml.packt/data/ch8/ames_housing_training.csv", "ComputeStatistics": true, "StartedAt": 1486833867.707, "LogUri": "https://eml-prod-emr.s3.amazonaws.com/178277513911-ds-ch8_ames_housing_001/.....", "DataSourceId": "ch8_ames_housing_001", "CreatedAt": 1486030865.965, "ComputeTime": 880000, "DataSizeInBytes": 648150, "FinishedAt": 1486834110.483, "Name": "[DS] Ames Housing 001" } Note that we have access to the operation log URI, which could be useful to analyze the model training later on. Creating the model Creating the model with the create-ml-model command follows the same steps: Generate the skeleton with the following: $ aws machinelearning create-ml-model --generate-cli-skeleton > mdl_ames_housing_001.json Write the configuration file: { "MLModelId": "ch8_ames_housing_001", "MLModelName": "[MDL] Ames Housing 001", "MLModelType": "REGRESSION", "Parameters": { "sgd.shuffleType": "auto", "sgd.l2RegularizationAmount": "1.0E-06", "sgd.maxPasses": "100" }, "TrainingDataSourceId": "ch8_ames_housing_001", "RecipeUri": "s3://aml.packt/data/ch8 /recipe_ames_housing_001.json" } Note the parameters of the algorithm. Here, we used mild L2 regularization and 100 passes. Launch the model creation with the following: $ aws machinelearning create-ml-model --cli-input-json file://mdl_ames_housing_001.json The model ID is returned: { "MLModelId": "ch8_ames_housing_001" } This get-ml-model command gives you a status update on the operation as well as the URL to the log. 
$ aws machinelearning get-ml-model --ml-model-id ch8_ames_housing_001 The watch command allows you to repeat a shell command every n seconds. To get the status of the model creation every 10s, just write the following: $ watch -n 10 aws machinelearning get-ml-model --ml-model-id ch8_ames_housing_001 The output of the get-ml-model will be refreshed every 10s until you kill it. It is not possible to create the default recipe via the AWS CLI commands. You can always define a blank recipe that would not carry out any transformation on the data. However, the default recipe has been shown to be positively impacting the model performance. To obtain this default recipe, we created it via the web interface, copied it into a file that we uploaded to S3. The resulting file recipe_ames_housing_001.json is available in our GitHub repository. Its content is quite long as the dataset has 79 variables and is not reproduced here for brevity purposes. Evaluating our model with create-evaluation Our model is now trained and we would like to evaluate it on the evaluation subset. For that, we will use the create-evaluation CLI command: Generate the skeleton: $ aws machinelearning create-evaluation --generate-cli-skeleton > eval_ames_housing_001.json Configure the parameter file: { "EvaluationId": "ch8_ames_housing_001", "EvaluationName": "[EVL] Ames Housing 001", "MLModelId": "ch8_ames_housing_001", "EvaluationDataSourceId": "ch8_ames_housing_002" } Launch the evaluation creation: $ aws machinelearning create-evaluation --cli-input-json file://eval_ames_housing_001.json Get the evaluation information: $ aws machinelearning get-evaluation --evaluation-id ch8_ames_housing_001 From that output, we get the performance of the model in the form of the RMSE: "PerformanceMetrics": { "Properties": { "RegressionRMSE": "29853.250469108018" } } The value may seem big, but it is relative to the range of the salePrice variable for the houses, which has a mean of 181300.0 and std of 79886.7. So an RMSE of 29853.2 is a decent score. You don't have to wait for the datasource creation to be completed in order to launch the model training. Amazon ML will simply wait for the parent operation to conclude before launching the dependent one. This makes chaining operations possible. At this point, we have a trained and evaluated model. In this tutorial, we have successfully seen the detailed steps on how to get started with CLI and we have also implemented a  simple project to get comfortable with the same. To understand how to leverage Amazon's powerful platform for your predictive analytics needs,  check out this book Effective Amazon Machine Learning Part1. Learning AWS CLI Part2. ChatOps with Slack and AWS CLI Automate tasks using Azure PowerShell and Azure CLI [Tutorial]
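Finally, note that the same Amazon ML operations used here from the CLI are also exposed in Python through boto3. The following is a rough sketch, not from the book, of polling the evaluation created above and reading its RMSE; it assumes boto3 is installed and picks up the credentials configured earlier:

import time
import boto3

ml = boto3.client("machinelearning")

# Poll until the evaluation created above has finished
while True:
    evaluation = ml.get_evaluation(EvaluationId="ch8_ames_housing_001")
    if evaluation["Status"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(10)

print(evaluation["Status"])
# RegressionRMSE is only present once the evaluation has completed successfully
print(evaluation.get("PerformanceMetrics", {}).get("Properties", {}).get("RegressionRMSE"))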


How to predict viral content using random forest regression in Python [Tutorial]

Prasad Ramesh
12 Sep 2018
9 min read
Understanding sharing behavior is a big business. As consumers become blind to traditional advertising, the push is to go beyond simple pitches to tell engaging stories. In this article we will build a predictive content scoring model that will predict whether the content will go viral or not using random forest regression. This article is an excerpt from a book written by Alexander T. Combs titled Python Machine Learning Blueprints: Intuitive data projects you can relate to. You can download the code and other relevant files used in this article from this GitHub link. What does research tell us about content virality? Increasingly, the success of these endeavors is measured in social shares. Why go to so much trouble? Because as a brand, every share that I receive represents another consumer that I've reached—all without spending an additional cent. Due to this value, several researchers have examined sharing behavior in the hopes of understanding what motivates it. Among the reasons researchers have found: To provide practical value to others (an altruistic motive) To associate ourselves with certain ideas and concepts (an identity motive) To bond with others around a common emotion (a communal motive) With regard to the last motive, one particularly well-designed study looked at the 7,000 pieces of content from the New York Times to examine the effect of emotion on sharing. They found that simple emotional sentiment was not enough to explain sharing behavior, but when combined with emotional arousal, the explanatory power was greater. For example, while sadness has a strong negative valence, it is considered to be a low arousal state. Anger, on the other hand, has a negative valence paired with a high arousal state. As such, stories that sadden the reader tend to generate far fewer stories than anger-inducing stories: Source : “What Makes Online Content Viral?” by Jonah Berger and Katherine L. Milkman Building a predictive content scoring model Let's create a model that can estimate the share counts for a given piece of content. Ideally, we would have a much larger sample of content, especially content that had more typical share counts. However, we'll make do with what we have here. We're going to use an algorithm called random forest regression. Here we're going to use a regression and attempt to predict the share counts. We could bucket our share classes into ranges, but it is preferable to use regression when dealing with continuous variables. To begin, we'll create a bare-bones model. We'll use the number of images, the site, and the word count. We'll train our model on the number of Facebook likes. We'll first import the sci-kit learn library, then we'll prepare our data by removing the rows with nulls, resetting our index, and finally splitting the frame into our training and testing set: from sklearn.ensemble import RandomForestRegressor all_data = dfc.dropna(subset=['img_count', 'word_count']) all_data.reset_index(inplace=True, drop=True) train_index = [] test_index = [] for i in all_data.index: result = np.random.choice(2, p=[.65,.35]) if result == 1: test_index.append(i) else: train_index.append(i) We used a random number generator with a probability set for approximately 2/3 and 1/3 to determine which row items (based on their index) would be placed in each set. Setting the probabilities this way ensures that we get approximately twice the number of rows in our training set as compared to the test set. 
We can see this as follows:

print('test length:', len(test_index), '\ntrain length:', len(train_index))

The preceding code will generate the following output:

Now, we'll continue on with preparing our data. Next, we need to set up categorical encoding for our sites. Currently, our DataFrame object has the name of each site represented as a string. We need to use dummy encoding. This creates a column for each site. If the row is for that particular site, then that column will be filled in with 1; all the other site columns will be filled in with 0. Let's do that now:

sites = pd.get_dummies(all_data['site'])
sites

The preceding code will generate the following output:

The dummy encoding can be seen in the preceding image. We'll now continue by splitting our data into training and test sets as follows:

y_train = all_data.iloc[train_index]['fb'].astype(int)
X_train_nosite = all_data.iloc[train_index][['img_count', 'word_count']]
X_train = pd.merge(X_train_nosite, sites.iloc[train_index], left_index=True, right_index=True)

y_test = all_data.iloc[test_index]['fb'].astype(int)
X_test_nosite = all_data.iloc[test_index][['img_count', 'word_count']]
X_test = pd.merge(X_test_nosite, sites.iloc[test_index], left_index=True, right_index=True)

With this, we've set up our X_train, X_test, y_train, and y_test variables. We'll use them now to build our model:

clf = RandomForestRegressor(n_estimators=1000)
clf.fit(X_train, y_train)

With these two lines of code, we have trained our model. Let's now use it to predict the Facebook likes for our testing set:

y_pred = clf.predict(X_test)
y_actual = y_test

deltas = pd.DataFrame(list(zip(y_pred, y_actual, (y_pred - y_actual)/(y_actual))), columns=['predicted', 'actual', 'delta'])
deltas

The preceding code will generate the following output:

Here we see the predicted value, the actual value, and the difference as a percentage. Let's take a look at the descriptive stats for this:

deltas['delta'].describe()

The preceding code will generate the following output:

Our median error is 0! Well, unfortunately, this isn't a particularly useful bit of information, as errors fall on both sides, positive and negative, and they tend to average out, which is what we see here. Let's now look at a more informative metric to evaluate our model: root mean square error as a percentage of the actual mean. To illustrate why this is more useful, let's run the following scenario on two sample series:

a = pd.Series([10,10,10,10])
b = pd.Series([12,8,8,12])

np.sqrt(np.mean((b-a)**2))/np.mean(a)

This results in the following output:

Now compare this to the simple mean of the differences:

(b-a).mean()

This results in the following output:

Clearly, the former is the more meaningful statistic. Let's now run this for our model:

np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual)

The preceding code will generate the following output:

Let's now add another set of features based on the words in each title and see if it helps our model. We'll use a count vectorizer to do this.
Much like what we did with the site names, we'll transform individual words and n-grams into features: from sklearn.feature_extraction.text import CountVectorizer vect = CountVectorizer(ngram_range=(1,3)) X_titles_all = vect.fit_transform(all_data['title']) X_titles_train = X_titles_all[train_index] X_titles_test = X_titles_all[test_index] X_test = pd.merge(X_test, pd.DataFrame(X_titles_test.toarray(), index=X_test.index), left_index=True, right_index=True) X_train = pd.merge(X_train, pd.DataFrame(X_titles_train.toarray(), index=X_train.index), left_index=True, right_index=True) In these lines, we joined our existing features to our new n-gram features. Let's now train our model and see if we have any improvement: clf.fit(X_train, y_train) y_pred = clf.predict(X_test) deltas = pd.DataFrame(list(zip(y_pred, y_actual, (y_pred - y_actual)/(y_actual))), columns=['predicted', 'actual', 'delta']) deltas The preceding code will generate the following output: While checking our errors again, we see the following: np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual) This code results in the following output: So, it appears that we have a modestly improved model. Now, let's add another feature i.e the word count of the title, as follows: all_data = all_data.assign(title_wc = all_data['title'].map(lambda x: len(x.split(' ')))) X_train = pd.merge(X_train, all_data[['title_wc']], left_index=True, right_index=True) X_test = pd.merge(X_test, all_data[['title_wc']], left_index=True, right_index=True) clf.fit(X_train, y_train) y_pred = clf.predict(X_test) np.sqrt(np.mean((y_pred-y_actual)**2))/np.mean(y_actual) The preceding code will generate the following output: It appears that each feature has modestly improved our model. There are certainly more features that we could add to our model. For example, we could add the day of the week and the hour of the posting, we could determine if the article is a listicle by running a regex on the headline, or we could examine the sentiment of each article. This only begins to touch on the features that could be important to model virality. We would certainly need to go much further to continue reducing the error in our model. We have performed only the most cursory testing of our model. Each measurement should be run multiple times to get a more accurate representation of the true error rate. It is possible that there is no statistically discernible difference between our last two models, as we only performed one test. To summarize, we learned how we can build a model to predict content virality using a random forest regression. To know more about predicting and other machine learning projects in Python projects check out Python Machine Learning Blueprints: Intuitive data projects you can relate to. Writing web services with functional Python programming [Tutorial] Visualizing data in R and Python using Anaconda [Tutorial] Python 3.7 beta is available as the second generation Google App Engine standard runtime
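One last practical note on this example: since the normalized root mean square error check is repeated after every feature addition above, it can be convenient to wrap it in a small helper. This is just a convenience around the article's own metric; the function name is ours:

import numpy as np

def rmse_over_mean(y_pred, y_actual):
    # Root mean square error expressed as a fraction of the actual mean.
    y_pred = np.asarray(y_pred, dtype=float)
    y_actual = np.asarray(y_actual, dtype=float)
    return np.sqrt(np.mean((y_pred - y_actual) ** 2)) / np.mean(y_actual)

# Usage after each round of feature engineering:
#   y_pred = clf.predict(X_test)
#   print(rmse_over_mean(y_pred, y_test))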

Implementing an AI in Unreal Engine 4 with AI Perception components [Tutorial]

Natasha Mathur
11 Sep 2018
8 min read
AI Perception is a system within Unreal Engine 4 that allows sources to register their senses to create stimuli, and then other listeners are periodically updated as the sense stimuli is created within the system. This works wonders for creating a reusable system that can react to an array of customizable sensors. In this tutorial, we will explore different components available within Unreal Engine 4 to enable artificial intelligence sensing within our games. We will do this by taking advantage of a system within Unreal Engine called AI Perception components. These components can be customized and even scripted to introduce new behavior by extending the current sensing interface. This tutorial is an excerpt taken from the book ‘Unreal Engine 4 AI Programming Essentials’ written by Peter L. Newton, Jie Feng. Let’s now have a look at AI sensing. Implementing AI sensing in Unreal Engine 4 Let's start by bringing up Unreal Engine 4 and open our New Project window. Then, perform the following steps: First, name our new project AI Sense and hit create project. After it finishes loading, we want to start by creating a new AIController that will be responsible for sending our AI the appropriate instructions. Let's navigate to the Blueprint folder and create a new AIController class, naming it EnemyPatrol. Now, to assign EnemyPatrol, we need to place a pawn into the world then assign the controller to it. After placing the pawn, click on the Details tab within the editor. Next, we want to search for AI Controller. By default, it is the parent class AI Controller, but we want this to be EnemyPatrol: Next, we will create a new PlayerController named PlayerSense. Then, we need to introduce the AI Perception component to those who we want to be seen by or to see. Let's open the PlayerSense controller first and then add the necessary components. Building AI Perception components There are two components that are currently available within the Unreal Engine Framework. The first one is the AI Perception component that listens for perception of stimulants (sight, hearing, etc.) The other is the AIPerceptionStimuliSource component. It is used to easily register the pawn as a source of stimuli, allowing it to be detected by other AI Perception components. This comes in handy, particularly in our case. Now, follow these steps: With PlayerSense open, let's add a new component called AIPerceptionStimuliSource. Then, under the Details tab, let's select AutoRegister as Source. Next, we want to add new senses to create a source for. So, looking at Register as Source for Senses, there is an AISense array. Populate this array with the AISense_Sight blueprint in order to be detected by sight by other AI Perception components. You will note that there are also other senses to choose from—for example, AISense_Hearing, AISense_Touch, and so on. The complete settings are shown in the following screenshot: This was pretty straightforward considering our next process. This allows our player pawn to be detected by Enemy AI whenever we get within their sense's configured range. Next, let's open our EnemyPatrol class and add the other AI Perception components to our AI. This component is called AIPerception and contains many other configurations, allowing you to customize and tailor the AI for different scenarios: Clicking on the AI Perception component, you will notice that under the AI section, everything is grayed out. This is because we have configurations specific to each sense. 
This also goes if you create your own AI Sense classes. Let's focus on two sections within this component: the first is the AI Perception settings, and the other is the event provided with this component: The AI Perception section should look similar to the same section on AIPerceptionStimuliSource. The differences are that you have to register your senses, and you can also specify a dominant sense. The dominant sense takes precedence of other senses determined in the same location. Let's look at the Senses configuration and add a new element. This will populate the array with a new sense configuration, which you can then modify. For now, let's select the AI Sight configuration, and then we can leave the default values as the same. In the game, we are able to visualize the configurations, allowing us to have more control over our senses. There is another configuration that allows you to specify affiliation, but at the time of writing this, these options aren't available. When you click on Detection by Affiliation, you must select Detect Neutrals to detect any pawn with Sight Sense Source. Next, we need to be able to notify our AI of a new target. We will do this by utilizing the Event we saw as part of the AI Perception component. By navigating there, we can see an event called OnPerceptionUpdated.  This will be updated when there are changes in the sensory state which makes the tracking of senses easy and straightforward. Let's move toward the OnPerceptionUpdated event and perform the following: Click on OnPerceptionUpdated and create it within the EventGraph. Now, within the EventGraph, whenever this event is called, changes will be made to the senses, and it will return the available sensed actors, as shown in the following screenshot: Now that we understand how we will obtain our referenced sensed actors, we should create a way for our pawn to maintain different states of being similar to what we would do in Behavior Tree. Let's first establish a home location for our pawn to run to when the player is no longer detected by the AI. In the same Blueprint folder, we will create a subclass of Target Point. Let's name this Waypoint and place it at an appropriate location within the world. Now, we need to open this Waypoint subclass and create additional variables to maintain traversable routes. We can do this by defining the next waypoint within a waypoint, allowing us to create what programmers call a linked list. This results in the AI being able to continuously move to the next available route after reaching the destination of its current route. With Waypoint open, add a new variable named NextWaypoint and make the type of this be the same as that of the Waypoint class we created. Navigate back to our Content Browser. Now, within our EnemyPatrol AIController, let's focus on Event Begin in EventGraph. We have to grab the reference to the waypoint we created earlier and store it within our AIController. So, let's create a new waypoint variable type and name it CurrentPoint. Now, on Event Begin Play, the first thing we need is the AIController, which is the self -reference for this EventGraph because we are in the AIController Class. So, let's grab our self-reference and check whether it is valid. Safety first! Next, we will get our AIController from our self-reference. Then, again for safety, let's check whether our AIController is valid.How does our AI sense? Next, we want to create a Get all Actors Of Class node and set the Actor class to Waypoint. 
Now, we need to convert a few instructions into a macro because we will use the instructions throughout the project. So, let's select the nodes shown as follows and hit convert to macro. Lastly, rename this variable getAIController. You can see the final nodes in the following screenshot: Next, we want our AI to grab a random new route and set it as a new variable. So, let's first get the length of the array of actors returned. Then, we want to subtract 1 from this length, and this will give us the range of our array. From there, we want to pull from Subtract and get Random Integer. Then, from our array, we want to get the Get node and pump our Random Integer node into the index to retrieve. Next, pull the returned available variable from the Get node and promote it to a local variable. This will automatically create the type dragged from thepin, and we want to rename this Current Point to understand why this variable exists. Then, from our getAIController macro, we want to assign the ReceiveMoveCompleted event. This is done so that when our AI successfully moves to the next route, we can update the information and tell our AI to move to the next route. We learned AI sensing in Unreal Engine 4 with the help of a system within Unreal Engine called AI Perception components. We also explored different components within that system. If you found this post useful, be sure to check out the book, ‘Unreal Engine 4 AI Programming Essentials’ for more concepts on AI sensing in Unreal Engine. Development Tricks with Unreal Engine 4 What’s new in Unreal Engine 4.19? Unreal Engine 4.20 released with focus on mobile and immersive (AR/VR/MR) devices    


Build a custom news feed with Python [Tutorial]

Prasad Ramesh
10 Sep 2018
13 min read
To create a model a custom news feed, we need data which can be trained. This training data will be fed into a model in order to teach it to discriminate between the articles that we'd be interested in and the ones that we would not. This article is an excerpt from a book written by Alexander T. Combs titled Python Machine Learning Blueprints: Intuitive data projects you can relate to. In this article, we will learn to build a custom news corpus and annotate a large number of articles corresponding to the interests respectively. You can download the code and other relevant files used in this article from this GitHub link. Creating a supervised training dataset Before we can create a model of our taste in news articles, we need training data. This training data will be fed into our model in order to teach it to discriminate between the articles that we'd be interested in and the ones that we would not. To build this corpus, we will need to annotate a large number of articles that correspond to these interests. For each article, we'll label it either “y” or “n”. This will indicate whether the article is the one that we would want to have sent to us in our daily digest or not. To simplify this process, we will use the Pocket app. Pocket is an application that allows you to save stories to read later. You simply install the browser extension, and then click on the Pocket icon in your browser's toolbar when you wish to save a story. The article is saved to your personal repository. One of the great features of Pocket for our purposes is its ability to save the article with a tag of your choosing. We'll use this feature to mark interesting articles as “y” and non-interesting articles as “n”. Installing the Pocket Chrome extension We use Google Chrome here, but other browsers should work similarly. For Chrome, go into the Google App Store and look for the Extensions section: Image from https://chrome.google.com/webstore/search/pocket Click on the blue Add to Chrome button. If you already have an account, log in, and if you do not have an account, go ahead and sign up (it's free). Once this is complete, you should see the Pocket icon in the upper right-hand corner of your browser. It will be greyed out, but once there is an article you wish to save, you can click on it. It will turn red once the article has been saved as seen in the following images. The greyed out icon can be seen in the upper right-hand corner. Image from https://news.ycombinator.com When the icon is clicked, it turns red to indicated the article has been saved.  Image from https://www.wsj.com Now comes the fun part! Begin saving all articles that you come across. Tag the interesting ones with “y”, and the non-interesting ones with “n”. This is going to take some work. Your end results will only be as good as your training set, so you're going to to need to do this for hundreds of articles. If you forget to tag an article when you save it, you can always go to the site, http://www.get.pocket.com, to tag it there. Using the Pocket API to retrieve stories Now that you've diligently saved your articles to Pocket, the next step is to retrieve them. To accomplish this, we'll use the Pocket API. You can sign up for an account at https://getpocket.com/developer/apps/new. Click on Create New App in the upper left-hand side and fill in the details to get your API key. Make sure to click all of the permissions so that you can add, change, and retrieve articles. 
Image from https://getpocket.com/developer Once you have filled this in and submitted it, you will receive your CONSUMER KEY. You can find this in the upper left-hand corner under My Apps. This will look like the following screen, but obviously with a real key: Image from https://getpocket.com/developer Once this is set, you are ready to move on the the next step, which is to set up the authorizations. It requires that you input your consumer key and a redirect URL. The redirect URL can be anything. Here I have used my Twitter account: import requests auth_params = {'consumer_key': 'MY_CONSUMER_KEY', 'redirect_uri': 'https://www.twitter.com/acombs'} tkn = requests.post('https://getpocket.com/v3/oauth/request', data=auth_params) tkn.content You will see the following output: The output will have the code that you'll need for the next step. Place the following in your browser bar: https://getpocket.com/auth/authorize?request_token=some_long_code&redir ect_uri=https%3A//www.twitter.com/acombs If you change the redirect URL to one of your own, make sure to URL encode it. There are a number of resources for this. One option is to use the Python library urllib, another is to use a free online source. At this point, you should be presented with an authorization screen. Go ahead and approve it, and we can move on to the next step: usr_params = {'consumer_key':'my_consumer_key', 'code': 'some_long_code'} usr = requests.post('https://getpocket.com/v3/oauth/authorize', data=usr_params) usr.content We'll use the following output code here to move on to retrieving the stories: First, we retrieve the stories tagged “n”: no_params = {'consumer_key':'my_consumer_key', 'access_token': 'some_super_long_code', 'tag': 'n'} no_result = requests.post('https://getpocket.com/v3/get', data=no_params) no_result.text The preceding code generates the following output: Note that we have a long JSON string on all the articles that we tagged “n”. There are several keys in this, but we are really only interested in the URL at this point. We'll go ahead and create a list of all the URLs from this: no_jf = json.loads(no_result.text) no_jd = no_jf['list'] no_urls=[] for i in no_jd.values(): no_urls.append(i.get('resolved_url')) no_urls The preceding code generates the following output: This list contains all the URLs of stories that we aren't interested in. Now, let's put this in a DataFrame object and tag it as such: import pandas no_uf = pd.DataFrame(no_urls, columns=['urls']) no_uf = no_uf.assign(wanted = lambda x: 'n') no_uf The preceding code generates the following output: Now, we're all set with the unwanted stories. Let's do the same thing with the stories that we are interested in: ye_params = {'consumer_key': 'my_consumer_key', 'access_token': 'some_super_long_token', 'tag': 'y'} yes_result = requests.post('https://getpocket.com/v3/get', data=yes_params) yes_jf = json.loads(yes_result.text) yes_jd = yes_jf['list'] yes_urls=[] for i in yes_jd.values(): yes_urls.append(i.get('resolved_url')) yes_uf = pd.DataFrame(yes_urls, columns=['urls']) yes_uf = yes_uf.assign(wanted = lambda x: 'y') yes_uf The preceding code generates the following output: Now that we have both types of stories for our training data, let's join them together into a single DataFrame: df = pd.concat([yes_uf, no_uf]) df.dropna(inplace=1) df The preceding code generates the following output: Now that we're set with all our URLs and their corresponding tags in a single frame, we'll move on to downloading the HTML for each article. 
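Before moving on to fetching the article bodies, it is worth a quick sanity check on how balanced the labels are in the combined frame, since a heavily skewed training set will hurt the classifier later. A minimal check on the df built above might look like this:

# Count how many stories were tagged as wanted ('y') versus not wanted ('n').
print(df['wanted'].value_counts())

# Optionally, look at proportions instead of raw counts.
print(df['wanted'].value_counts(normalize=True))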
We'll use another free service for this called embed.ly. Using the embed.ly API to download story bodies We have all the URLs for our stories, but unfortunately this isn't enough to train on. We'll need the full article body. By itself, this could become a huge challenge if we wanted to roll our own scraper, especially if we were going to be pulling stories from dozens of sites. We would need to write code to target the article body while carefully avoiding all the othersite gunk that surrounds it. Fortunately, there are a number of free services that will do this for us. We're going to use embed.ly to do this, but there are a number of other services that you also could use. The first step is to sign up for embed.ly API access. You can do this at https://app.embed.ly/signup. This is a straightforward process. Once you confirm your registration, you will receive an API key.. You need to just use this key in your HTTPrequest. Let's do this now: import urllib def get_html(x): qurl = urllib.parse.quote(x) rhtml = requests.get('https://api.embedly.com/1/extract?url=' + qurl + '&key=some_api_key') ctnt = json.loads(rhtml.text).get('content') return ctnt df.loc[:,'html'] = df['urls'].map(get_html) df.dropna(inplace=1) df The preceding code generates the following output: With that, we have the HTML of each story. As the content is embedded in HTML markup, and we want to feed plain text into our model, we'll use a parser to strip out the markup tags: from bs4 import BeautifulSoup def get_text(x): soup = BeautifulSoup(x, 'lxml') text = soup.get_text() return text df.loc[:,'text'] = df['html'].map(get_text) df The preceding code generates the following output: With this, we have our training set ready. We can now move on to a discussion of how to transform our text into something that a model can work with. Setting up your daily personal newsletter In order to set up a personal e-mail with news stories, we're going to utilize IFTTT again. Build an App to Find Cheap Airfares, we'll use the Maker Channel to send a POST request. However, this time the payload will be our news stories. If you haven't set up the Maker Channel, do this now. Instructions can be found in Chapter 3, Build an App to Find Cheap Airfares. You should also set up the Gmail channel. Once that is complete, we'll add a recipe to combine the two. First, click on Create a Recipe from the IFTTT home page. Then, search for the Maker Channel: Image from https://www.iftt.com Select this, then select Receive a web request: Image from https://www.iftt.com Then, give the request a name. I'm using news_event: Image from https://www.iftt.com Finish by clicking on Create Trigger. Next, click on that to set up the e-mail piece. Search for Gmail and click on the icon seen as follows: Image from https://www.iftt.com Once you have clicked on Gmail, click on Send an e-mail. From here, you can customize your e-mail message. Image from https://www.iftt.com Input your e-mail address, a subject line, and finally, include Value1 in the e-mail body. We will pass our story title and link into this with our POST request. Click on Create Recipe to finalize this. Now, we're ready to generate the script that will run on a schedule automatically sending us articles of interest. 
We're going to create a separate script for this, but one last thing that we need to do in our existing code is serialize our vectorizer and our model: import pickle pickle.dump(model, open (r'/Users/alexcombs/Downloads/news_model_pickle.p', 'wb')) pickle.dump(vect, open (r'/Users/alexcombs/Downloads/news_vect_pickle.p', 'wb')) With this, we have saved everything that we need from our model. In our new script, we will read these in to generate our new predictions. We're going to use the same scheduling library to run the code that we used in Chapter  3, Build an App to Find Cheap Airfares. Putting it all together, we have the following script:   # get our imports. import pandas as pd from sklearn.feature_extraction.text import TfidfVectorizer from sklearn.svm import LinearSVC import schedule import time import pickle import json import gspread import requests from bs4 import BeautifulSoup from oauth2client.client import SignedJwtAssertionCredentials # create our fetching function def fetch_news(): try: vect = pickle.load(open(r'/Users/alexcombs/Downloads/news_vect_pickle.p', 'rb')) model = pickle.load(open(r'/Users/alexcombs/Downloads/news_model_pickle.p', 'rb')) json_key = json.load(open(r'/Users/alexcombs/Downloads/APIKEY.json')) scope = ['https://spreadsheets.google.com/feeds'] credentials = SignedJwtAssertionCredentials(json_key['client_email'], json_key['private_key'].encode(), scope) gc = gspread.authorize(credentials) ws = gc.open("NewStories") sh = ws.sheet1 zd = list(zip(sh.col_values(2), sh.col_values(3), sh.col_values(4))) zf = pd.DataFrame(zd, columns=['title', 'urls', 'html']) zf.replace('', pd.np.nan, inplace=True) zf.dropna(inplace=True) def get_text(x): soup = BeautifulSoup(x, 'lxml') text = soup.get_text() return text zf.loc[:, 'text'] = zf['html'].map(get_text) tv = vect.transform(zf['text']) res = model.predict(tv) rf = pd.DataFrame(res, columns=['wanted']) rez = pd.merge(rf, zf, left_index=True, right_index=True) news_str = '' for t, u in zip(rez[rez['wanted'] == 'y']['title'], rez[rez['wanted'] == 'y']['urls']): news_str = news_str + t + '\n' + u + '\n' payload = {"value1": news_str} r = requests.post('https://maker.ifttt.com/trigger/news_event/with/key/IFTTT_KE Y', data=payload) # cleanup worksheet lenv = len(sh.col_values(1)) cell_list = sh.range('A1:F' + str(lenv)) for cell in cell_list: cell.value = "" sh.update_cells(cell_list) print(r.text) except: print('Failed') schedule.every(480).minutes.do(fetch_news) while 1: schedule.run_pending() time.sleep(1) What this script will do is run every 4 hours, pull down the news stories from Google Sheets, run the stories through the model, generate an e-mail by sending a POST request to IFTTT for the stories that are predicted to be of interest, and then finally, it will clear out the stories in the spreadsheet so that only new stories get sent in the next e-mail. Congratulations! You now have your own personalize news feed! In this tutorial we learned how to create a custom news feed, to know more about setting it up and other intuitive Python projects, check out Python Machine Learning Blueprints: Intuitive data projects you can relate to. Writing web services with functional Python programming [Tutorial] Visualizing data in R and Python using Anaconda [Tutorial] Python 3.7 beta is available as the second generation Google App Engine standard runtime
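Before relying on the full scheduled job, it can help to test the IFTTT leg on its own. The sketch below sends a single hand-written story to the Maker webhook used in the script above; IFTTT_KEY is a placeholder for your own key, and the story text is made up:

import requests

IFTTT_KEY = 'your_maker_channel_key'  # placeholder, not a real key
test_story = 'A test headline\nhttps://example.com/a-test-story\n'

payload = {'value1': test_story}
r = requests.post(
    'https://maker.ifttt.com/trigger/news_event/with/key/' + IFTTT_KEY,
    data=payload)

# A 200 response and a confirmation message mean the recipe fired,
# so the e-mail should arrive shortly.
print(r.status_code, r.text)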


Implementing Dependency Injection in Google Guice [Tutorial]

Natasha Mathur
09 Sep 2018
10 min read
Choosing a framework wisely is important when implementing Dependency Injection as each framework has its own advantages and disadvantages. There are various Java-based dependency injection frameworks available in the open source community, such as Dagger, Google Guice, Spring DI, JAVA EE 8 DI, and PicoContainer. In this article we will learn about Google Guice (pronounced juice), a lightweight DI framework that helps developers to modularize applications. Guice encapsulates annotation and generics features introduced by Java 5 to make code type-safe. It enables objects to wire together and tests with fewer efforts. Annotations help you to write error-prone and reusable code. This tutorial is an excerpt taken from the book  'Java 9 Dependency Injection', written by Krunal Patel, Nilang Patel. In Guice, the new keyword is replaced with @inject for injecting dependency. It allows constructors, fields, and methods (any method with multiple numbers of arguments) level injections. Using Guice, we can define custom scopes and circular dependency. It also has features to integrate with Spring and AOP interception. Moreover, Guice also implements Java Specification Request (JSR) 330, and uses the standard annotation provided by JSR-330. The first version of Guice was introduced by Google in 2007 and the latest version is Guice 4.1. Before we see how dependency injection gets implemented in Guice, let's first setup Guice. Guice setup To make our coding simple, throughout this tutorial, we are going to use a Maven project to understand Guice DI.  Let’s create a simple Maven project using the following parameters: groupid:, com.packt.guice.id, artifactId : chapter4, and version : 0.0.1-SNAPSHOT. By adding Guice 4.1.0 dependency on the pom.xml file, our final pom.xml will look like this: <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>com.packt.guice.di</groupId> <artifactId>chapter4</artifactId> <packaging>jar</packaging> <version>0.0.1-SNAPSHOT</version> <name>chapter4</name> <dependencies> <dependency> <groupId>junit</groupId> <artifactId>junit</artifactId> <version>4.12</version> <scope>test</scope> </dependency> <dependency> <groupId>com.google.inject</groupId> <artifactId>guice</artifactId> <version>4.1.0</version> </dependency> </dependencies> <build> <finalName>chapter2</finalName> </build> </project> For this tutorial, we have used JDK 9, but not as a module project because the Guice library is not available as a Java 9 modular jar. Basic injection in Guice We have set up Guice, now it is time to understand how injection works in Guice. Let's rewrite the example of a notification system using Guice, and along with that, we will see several indispensable interfaces and classes in Guice.  We have a base interface called NotificationService, which is expecting a message and recipient details as arguments: public interface NotificationService { boolean sendNotification(String message, String recipient); } The SMSService concrete class is an implementation of the NotificationService interface. Here, we will apply the @Singleton annotation to the implementation class. When you consider that service objects will be made through injector classes, this annotation is furnished to allow them to understand that the service class ought to be a singleton object. 
Because of JSR-330 support in Guice, annotation, either from javax.inject or the com.google.inject package, can be used: import javax.inject.Singleton; import com.packt.guice.di.service.NotificationService; @Singleton public class SMSService implements NotificationService { public boolean sendNotification(String message, String recipient) { // Write code for sending SMS System.out.println("SMS has been sent to " + recipient); return true; } } In the same way, we can also implement another service, such as sending notifications to a social media platform, by implementing the NotificationService interface. It's time to define the consumer class, where we can initialize the service class for the application. In Guice, the @Inject annotation will be used to define setter-based as well as constructor-based dependency injection. An instance of this class is utilized to send notifications by means of the accessible correspondence services. Our AppConsumer class defines setter-based injection as follows: import javax.inject.Inject; import com.packt.guice.di.service.NotificationService; public class AppConsumer { private NotificationService notificationService; //Setter based DI @Inject public void setService(NotificationService service) { this.notificationService = service; } public boolean sendNotification(String message, String recipient){ //Business logic return notificationService.sendNotification(message, recipient); } } Guice needs to recognize which service implementation to apply, so we should configure it with the aid of extending the AbstractModule class, and offer an implementation for the configure() method. Here is an example of an injector configuration: import com.google.inject.AbstractModule; import com.packt.guice.di.impl.SMSService; import com.packt.guice.di.service.NotificationService; public class ApplicationModule extends AbstractModule{ @Override protected void configure() { //bind service to implementation class bind(NotificationService.class).to(SMSService.class); } } In the previous class, the module implementation determines that an instance of SMSService is to be injected into any place a NotificationService variable is determined. In the same way, we just need to define a binding for the new service implementation, if required. Binding in Guice is similar to wiring in Spring: import com.google.inject.Guice; import com.google.inject.Injector; import com.packt.guice.di.consumer.AppConsumer; import com.packt.guice.di.injector.ApplicationModule; public class NotificationClient { public static void main(String[] args) { Injector injector = Guice.createInjector(new ApplicationModule()); AppConsumer app = injector.getInstance(AppConsumer.class); app.sendNotification("Hello", "9999999999"); } } In the previous program, the  Injector object is created using the Guice class's createInjector() method, by passing the ApplicationModule class's implementation object. By using the injector's getInstance() method, we can initialize the AppConsumer class. At the same time as creating the AppConsumer's objects, Guice injects the needy service class implementation (SMSService, in our case). The following is the yield of running the previous code: SMS has been sent to Recipient :: 9999999999 with Message :: Hello So, this is how Guice dependency injection works compared to other DI. Guice has embraced a code-first technique for dependency injection, and management of numerous XML is not required. Let's test our client application by writing a JUnit test case. 
We can simply mock the service implementation of SMSService, so there is no need to implement the actual service. The MockSMSService class looks like this: import com.packt.guice.di.service.NotificationService; public class MockSMSService implements NotificationService { public boolean sendNotification(String message, String recipient) { System.out.println("In Test Service :: " + message + "Recipient :: " + recipient); return true; } } The following is the JUnit 4 test case for the client application: import org.junit.After; import org.junit.Assert; import org.junit.Before; import org.junit.Test; import com.google.inject.AbstractModule; import com.google.inject.Guice; import com.google.inject.Injector; import com.packt.guice.di.consumer.AppConsumer; import com.packt.guice.di.impl.MockSMSService; import com.packt.guice.di.service.NotificationService; public class NotificationClientTest { private Injector injector; @Before public void setUp() throws Exception { injector = Guice.createInjector(new AbstractModule() { @Override protected void configure() { bind(NotificationService.class).to(MockSMSService.class); } }); } @After public void tearDown() throws Exception { injector = null; } @Test public void test() { AppConsumer appTest = injector.getInstance(AppConsumer.class); Assert.assertEquals(true, appTest.sendNotification("Hello There", "9898989898"));; } } Take note that we are binding the MockSMSService class to NotificationService by having an anonymous class implementation of AbstractModule. This is done in the setUp() method, which runs for some time before the test methods run. Guice dependency injection As we know what dependency injection is, let us explore how Google Guice provides injection. We have seen that the injector helps to resolve dependencies by reading configurations from modules, which are called bindings. Injector is preparing charts for the requested objects. Dependency injection is managed by injectors using various types of injection: Constructor injection Method injection Field injection Optional injection Static injection Constructor Injection Constructor injection can be achieved  by using the @Inject annotation at the constructor level. This constructor ought to acknowledge class dependencies as arguments. Multiple constructors will, at that point, assign the arguments to their final fields: public class AppConsumer { private NotificationService notificationService; //Constructor level Injection @Inject public AppConsumer(NotificationService service){ this.notificationService=service; } public boolean sendNotification(String message, String recipient){ //Business logic return notificationService.sendNotification(message, recipient); } } If our class does not have a constructor with @Inject, then it will be considered a default constructor with no arguments. When we have a single constructor and the class accepts its dependency, at that time the constructor injection works perfectly and is helpful for unit testing. It is also easy because Java is maintaining the constructor invocation, so you don't have to stress about objects arriving in an uninitialized state. Method injection Guice allows us to define injection at the method level by annotating methods with the @Inject annotation. This is similar to the setter injection available in Spring. In this approach, dependencies are passed as parameters, and are resolved by the injector before invocation of the method. 
The name of the method and the number of parameters do not affect method injection:

private NotificationService notificationService;

//Setter Injection
@Inject
public void setService(NotificationService service) {
    this.notificationService = service;
}

This can be valuable when we don't want to control the instantiation of classes. We can also use it when a superclass needs a few dependencies. (This is difficult to achieve with constructor injection.)

Field injection

Fields can be injected with the @Inject annotation in Guice. This is a simple and short form of injection, but it makes the field untestable if used with the private access modifier. It is advisable to avoid the following:

@Inject
private NotificationService notificationService;

Optional injection

Guice provides a way to declare an injection as optional. The method and field might be optional, which causes Guice to quietly overlook them when the dependencies aren't available. Optional injection is used by specifying the @Inject(optional=true) annotation:

public class AppConsumer {
    private static final String DEFAULT_MSG = "Hello";
    private String message = DEFAULT_MSG;

    @Inject(optional=true)
    public void setDefaultMessage(@Named("SMS") String message) {
        this.message = message;
    }
}

Static injection

Static injection is helpful when we have to migrate a static factory implementation to Guice. It makes it feasible for objects to partly take part in dependency injection by gaining access to injected types without being injected themselves. In a module, to indicate classes to be injected on injector creation, use requestStaticInjection(). For example, NotificationUtil is a utility class that formats a date into a string using a given time-zone format and returns the date and time zone. The time-zone format string is hardcoded in NotificationUtil, and we will attempt to inject this utility class statically. Consider that we have a static String variable, timeZoneFmt, with setter and getter methods. We will use @Inject for the setter injection, using the @Named parameter. NotificationUtil will look like this:

@Inject
static String timeZoneFmt = "yyyy-MM-dd'T'HH:mm:ss";

@Inject
public static void setTimeZoneFmt(@Named("timeZoneFmt") String timeZoneFmt){
    NotificationUtil.timeZoneFmt = timeZoneFmt;
}

Now, SMSUtilModule should look like this:

class SMSUtilModule extends AbstractModule{
    @Override
    protected void configure() {
        bindConstant().annotatedWith(Names.named("timeZoneFmt")).to("yyyy-MM-dd'T'HH:mm:ss");
        requestStaticInjection(NotificationUtil.class);
    }
}

This API is not recommended for general use, since it faces many of the same issues as static factories: it is difficult to test and it makes dependencies opaque. To sum up, we began with basic dependency injection and then saw, with examples, how constructor, method, field, optional, and static injection work in Guice. If you found this post useful, be sure to check out the book 'Java 9 Dependency Injection' to learn more about Google Guice and other concepts in dependency injection. Learning Dependency Injection (DI) Angular 2 Dependency Injection: A powerful design pattern
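For readers who think more readily in Python than Java, the shape of the wiring above can be sketched without any framework at all. This is a conceptual illustration of constructor injection only, not Guice code; the class names simply mirror the ones used in this article:

class NotificationService:
    def send_notification(self, message, recipient):
        raise NotImplementedError

class SMSService(NotificationService):
    def send_notification(self, message, recipient):
        print("SMS has been sent to " + recipient)
        return True

class AppConsumer:
    # The dependency is handed in from outside (constructor injection)
    # instead of being constructed inside the consumer.
    def __init__(self, notification_service):
        self.notification_service = notification_service

    def send_notification(self, message, recipient):
        return self.notification_service.send_notification(message, recipient)

# The module/injector role is played here by plain wiring code:
app = AppConsumer(SMSService())
app.send_notification("Hello", "9999999999")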


Building Recommendation System with Scala and Apache Spark [Tutorial]

Savia Lobo
08 Sep 2018
12 min read
Recommendation systems can be defined as software applications that draw out and learn from data such as preferences, their actions (clicks, for example), browsing history, and generated recommendations, which are products that the system determines are appealing to the user in the immediate future. In this tutorial, we will learn to build a recommendation system with Scala and Apache Spark. This article is an excerpt taken from Modern Scala Projects written Ilango Gurusamy. What does a recommendation system look like The following diagram is representative of a typical recommendation system: Recommendation system In the preceding diagram, can be thought of as a recommendation ecosystem, where the recommendation system is at the heart of it. This system needs three entities: Users Products Transactions between users and products where transactions contain feedback from users about products Implementation and deployment Implementation is documented in the following subsections. All code is developed in an Intellij code editor. The very first step is to create an empty Scala project called Chapter7. Step 1 – creating the Scala project Let's create a Scala project called Chapter7 with the following artifacts: RecommendationSystem.scala RecommendationWrapper.scala Let's break down the project's structure: .idea: Generated IntelliJ configuration files. project: Contains build.properties and plugins.sbt. project/assembly.sbt: This file specifies the sbt-assembly plugin needed to build a fat JAR for deployment. src/main/scala: This is a folder that houses Scala source files in the com.packt.modern.chapter7 package. target: This is where artifacts of the compile process are stored. The generated assembly JAR file goes here. build.sbt: This is the main SBT configuration file. Spark and its dependencies are specified here. At this point, we will start developing code in the IntelliJ code editor. We will start with the AirlineWrapper Scala file and end with the deployment of the final application JAR into Spark with spark-submit. Step 2 – creating the AirlineWrapper definition Let's create the trait definition. The trait will hold the SparkSession variable, schema definitions for the datasets, and methods to build a dataframe: trait RecWrapper { } Next, let's create a schema for past weapon sales orders. Step 3 – creating a weapon sales orders schema Let's create a schema for the past sales order dataset: val salesOrderSchema: StructType = StructType(Array( StructField("sCustomerId", IntegerType,false), StructField("sCustomerName", StringType,false), StructField("sItemId", IntegerType,true), StructField("sItemName", StringType,true), StructField("sItemUnitPrice",DoubleType,true), StructField("sOrderSize", DoubleType,true), StructField("sAmountPaid", DoubleType,true) )) Next, let's create a schema for weapon sales leads. Step 4 – creating a weapon sales leads schema Here is a schema definition for the weapon sales lead dataset: val salesLeadSchema: StructType = StructType(Array( StructField("sCustomerId", IntegerType,false), StructField("sCustomerName", StringType,false), StructField("sItemId", IntegerType,true), StructField("sItemName", StringType,true) )) Next, let's build a weapon sales order dataframe. Step 5 – building a weapon sales order dataframe Let's invoke the read method on our SparkSession instance and cache it. 
We will call this method later from the RecSystem object: def buildSalesOrders(dataSet: String): DataFrame = { session.read .format("com.databricks.spark.csv") .option("header", true).schema(salesOrderSchema).option("nullValue", "") .option("treatEmptyValuesAsNulls", "true") .load(dataSet).cache() } Next up, let's build a sales leads dataframe: def buildSalesLeads(dataSet: String): DataFrame = { session.read .format("com.databricks.spark.csv") .option("header", true).schema(salesLeadSchema).option("nullValue", "") .option("treatEmptyValuesAsNulls", "true") .load(dataSet).cache() } This completes the trait. Overall, it looks like this: trait RecWrapper { 1) Create a lazy SparkSession instance and call it session. 2) Create a schema for the past sales orders dataset 3) Create a schema for sales lead dataset 4) Write a method to create a dataframe that holds past sales order data. This method takes in sales order dataset and returns a dataframe 5) Write a method to create a dataframe that holds lead sales data } Bring in the following imports: import org.apache.spark.mllib.recommendation.{ALS, Rating} import org.apache.spark.rdd.RDD import org.apache.spark.sql.{DataFrame, Dataset, SparkSession} Create a Scala object called RecSystem: object RecSystem extends App with RecWrapper { } Before going any further, bring in the following imports: import org.apache.spark.rdd.RDD import org.apache.spark.sql.DataFrame Inside this object, start by loading the past sales order data. This will be our training data. Load the sales order dataset, as follows: val salesOrdersDf = buildSalesOrders("sales\\PastWeaponSalesOrders.csv") Verify the schema. This is what the schema looks like: salesOrdersDf.printSchema() root |-- sCustomerId: integer (nullable = true) |-- sCustomerName: string (nullable = true) |-- sItemId: integer (nullable = true) |-- sItemName: string (nullable = true) |-- sItemUnitPrice: double (nullable = true) |-- sOrderSize: double (nullable = true) |-- sAmountPaid: double (nullable = true) Here is a partial view of a dataframe displaying past weapon sales order data: Partial view of dataframe displaying past weapon sales order data Now, we have what we need to create a dataframe of ratings: val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder => Rating( salesOrder.getInt(0), salesOrder.getInt(2), salesOrder.getDouble(6) ) ).toDF("user", "item", "rating") Save all and compile the project at the command line: C:\Path\To\Your\Project\Chapter7>sbt compile You are likely to run into the following error: [error] C:\Path\To\Your\Project\Chapter7\src\main\scala\com\packt\modern\chapter7\RecSystem.scala:50:50: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. [error] val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder => [error] ^ [error] two errors found [error] (compile:compileIncremental) Compilation failed To fix this, place the following statement at the top of the declarations of the rating dataframe. It should look like this: import session.implicits._ val ratingsDf: DataFrame = salesOrdersDf.map( salesOrder => UserRating( salesOrder.getInt(0), salesOrder.getInt(2), salesOrder.getDouble(6) ) ).toDF("user", "item", "rating") Save and recompile the project. This time, it compiles just fine. Next, import the Rating class from the org.apache.spark.mllib.recommendation package. 
This transforms the rating dataframe that we obtained previously to its RDD equivalent: val ratings: RDD[Rating] = ratingsDf.rdd.map( row => Rating( row.getInt(0), row.getInt(1), row.getDouble(2) ) ) println("Ratings RDD is: " + ratings.take(10).mkString(" ") ) The following few lines of code are very important. We will be using the ALS algorithm from Spark MLlib to create and train a MatrixFactorizationModel, which takes an RDD[Rating] object as input. The ALS train method may require a combination of the following training hyperparameters: numBlocks: Preset to -1 in an auto-configuration setting. This parameter is meant to parallelize computation. custRank: The number of features, otherwise known as latent factors. iterations: This parameter represents the number of iterations for ALS to execute. For a reasonable solution to converge on, this algorithm needs roughly 20 iterations or less. regParam: The regularization parameter. implicitPrefs: This hyperparameter is a specifier. It lets us use either of the following: Explicit feedback Implicit feedback alpha: This is a hyperparameter connected to an implicit feedback variant of the ALS algorithm. Its role is to govern the baseline confidence in preference observations. We just explained the role played by each parameter needed by the ALS algorithm's train method. Let's get started by bringing in the following imports: import org.apache.spark.mllib.recommendation.MatrixFactorizationModel Now, let's get down to training the matrix factorization model using the ALS algorithm. Let's train a matrix factorization model given an RDD of ratings by customers (users) for certain items (products). Our train method on the ALS algorithm will take the following four parameters: Ratings. A rank. A number of iterations. A Lambda value or regularization parameter: val ratingsModel: MatrixFactorizationModel = ALS.train(ratings, 6, /* THE RANK */ 10, /* Number of iterations */ 15.0 /* Lambda, or regularization parameter */ ) Next, we load the sales lead file and convert it into a tuple format: val weaponSalesLeadDf = buildSalesLeads("sales\\ItemSalesLeads.csv") In the next section, we will display the new weapon sales lead dataframe. Step 6 – displaying the weapons sales dataframe First, we must invoke the show method: println("Weapons Sales Lead dataframe is: ") weaponSalesLeadDf.show   Here is a view of the weapon sales lead dataframe: View of weapon sales lead dataframe Next, create a version of the sales lead dataframe structured as (customer, item) tuples: val customerWeaponsSystemPairDf: DataFrame = weaponSalesLeadDf.map(salesLead => ( salesLead.getInt(0), salesLead.getInt(2) )).toDF("user","item") In the next section, let's display the dataframe that we just created. Step 7 – displaying the customer-weapons-system dataframe Let's the show method, as follows: println("The Customer-Weapons System dataframe as tuple pairs looks like: ") customerWeaponsSystemPairDf.show   Here is a screenshot of the new customer-weapons-system dataframe as tuple pairs: New customer-weapons-system dataframe as tuple pairs Next, we will convert the preceding dataframe into an RDD: val customerWeaponsSystemPairRDD: RDD[(Int, Int)] = customerWeaponsSystemDf.rdd.map(row => (row.getInt(0), row.getInt(1)) ) /* Notes: As far as the algorithm is concerned, customer corresponds to "user" and "product" or item corresponds to a "weapons system" */ We previously created a MatrixFactorization model that we trained with the weapons system sales orders dataset. 
We are in a position to predict how each customer country may rate a weapon system in the future. In the next section, we will generate predictions.   Step 8 – generating predictions Here is how we will generate predictions. The predict method of our model is designed to do just that. It will generate a predictions RDD that we call weaponRecs. It represents the ratings of weapons systems that were not rated by customer nations (listed in the past sales order data) previously: val weaponRecs: RDD[Rating] = ratingsModel.predict(customerWeaponsSystemPairRDD).distinct() Next up, we will display the final predictions. Step 9 – displaying predictions Here is how to display the predictions, lined up in tabular format: println("Future ratings are: " + weaponRecs.foreach(rating => { println( "Customer: " + rating.user + " Product: " + rating.product + " Rating: " + rating.rating ) } ) ) The following table displays how each nation is expected to rate a certain system in the future, that is, a weapon system that they did not rate earlier: System rating by each nation Our recommendation system proved itself capable of generating future predictions. Up until now, we did not say how all of the preceding code is compiled and deployed. We will look at this in the next section. Compilation and deployment Compiling the project Invoke the sbt compile project at the root folder of your Chapter7 project. You should get the following output: Output on compiling the project Besides loading build.sbt, the compile task is also loading settings from assembly.sbt which we will create below. What is an assembly.sbt file? We have not yet talked about the assembly.sbt file. Our scala-based Spark application is a Spark job that will be submitted to a (local) Spark cluster as a JAR file. This file, apart from Spark libraries, also needs other dependencies in it for our recommendation system job to successfully complete. The name fat JAR is from all dependencies bundled in one JAR. To build such a fat JAR, we need an sbt-assembly plugin. This explains the need for creating a new assembly.sbt and the assembly plugin. Creating assembly.sbt Create a new assembly.sbt in your IntelliJ project view and save it under your project folder, as follows: Creating assembly.sbt Contents of assembly.sbt Paste the following contents into the newly created assembly.sbt (under the project folder). The output should look like this: Output on placing contents of assembly.sbt The sbt-assembly plugin, version 0.14.7, gives us the ability to run an sbt-assembly task. With that, we are one step closer to building a fat or Uber JAR. This action is documented in the next step. Running the sbt assembly task Issue the sbt assembly command, as follows: Running the sbt assembly command This time, the assembly task loads the assembly-plugin in assembly.sbt. However, further assembly halts because of a common duplicate error. This error arises due to several duplicates, multiple copies of dependency files that need removal before the assembly task can successfully complete. To address this situation, build.sbt needs an upgrade. Upgrading the build.sbt file The following lines of code need to be added in, as follows: Code lines for upgrading the build.sbt file To test the effect of your changes, save this and go to the command line to reissue the sbt assembly task. Rerunning the assembly command Run the assembly task, as follows: Rerunning the assembly task This time, the settings in the assembly.sbt file are loaded. The task completes successfully. 
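The screenshots referenced above are not reproduced here, so for orientation, the assembly.sbt contents and the build.sbt merge-strategy upgrade typically look something like the following sketch. The plugin coordinates match version 0.14.7 mentioned in the text, but the merge rules shown are a common convention and an assumption, not the book's verbatim settings:

// project/assembly.sbt: wires in the sbt-assembly plugin
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.7")

// build.sbt: a typical merge strategy that resolves the duplicate-file errors during assembly
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}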
To verify, drill down to the target folder. If everything went well, you should see a fat JAR, as follows:

Output as a JAR file

This JAR file under the target folder is the recommendation system application that needs to be deployed into Spark. That is documented in the next step.

Deploying the recommendation application

The spark-submit command is how we will deploy the application into Spark. Here is the long form of the spark-submit command, which sets several parameters explicitly:

spark-submit --class "com.packt.modern.chapter7.RecSystem" --master local[2] --deploy-mode client --driver-memory 16g --num-executors 2 --executor-memory 2g --executor-cores 2 <path-to-jar>

Using the preceding format, let's submit our Spark job, supplying the various parameters to it:

Parameters for Spark

The different parameters are explained as follows:

Tabular explanation of parameters for Spark Job

We used Spark's support for recommendations to build a prediction model and leveraged Spark's alternating least squares algorithm to implement our collaborative filtering recommendation system.

If you've enjoyed reading this post, do check out the book Modern Scala Projects to gain insights into data that will help organizations have a strategic and competitive advantage.

How to Build a music recommendation system with PageRank Algorithm
Recommendation Systems
Building A Recommendation System with Azure

Building a Twitter news bot using Twitter API [Tutorial]

Bhagyashree R
07 Sep 2018
11 min read
This article is an excerpt from a book written by Srini Janarthanam titled Hands-On Chatbots and Conversational UI Development. In this article, we will explore the Twitter API and build core modules for tweeting, searching, and retweeting. We will further explore a data source for news around the globe and build a simple bot that tweets top news on its timeline. Getting started with the Twitter app To get started, let us explore the Twitter developer platform. Let us begin by building a Twitter app and later explore how we can tweet news articles to followers based on their interests: Log on to Twitter. If you don't have an account on Twitter, create one. Go to Twitter Apps, which is Twitter's application management dashboard. Click the Create New App button: Create an application by filling in the form providing name, description, and a website (fully-qualified URL). Read and agree to the Developer Agreement and hit Create your Twitter application: You will now see your application dashboard. Explore the tabs: Click Keys and Access Tokens: Copy consumer key and consumer secret and hang on to them. Scroll down to Your Access Token: Click Create my access token to create a new token for your app: Copy the Access Token and Access Token Secret and hang on to them. Now, we have all the keys and tokens we need to create a Twitter app. Building your first Twitter bot Let's build a simple Twitter bot. This bot will listen to tweets and pick out those that have a particular hashtag. All the tweets with a given hashtag will be printed on the console. This is a very simple bot to help us get started. In the following sections, we will explore more complex bots. To follow along you can download the code from the book's GitHub repository. Go to the root directory and create a new Node.js program using npm init: Execute the npm install twitter --save command to install the Twitter Node.js library: Run npm install request --save to install the Request library as well. We will use this in the future to make HTTP GET requests to a news data source. Explore your package.json file in the root directory: { "name": "twitterbot", "version": "1.0.0", "description": "my news bot", "main": "index.js", "scripts": { "test": "echo \"Error: no test specified\" && exit 1" }, "author": "", "license": "ISC", "dependencies": { "request": "^2.81.0", "twitter": "^1.7.1" } } Create an index.js file with the following code: //index.js var TwitterPackage = require('twitter'); var request = require('request'); console.log("Hello World! I am a twitter bot!"); var secret = { consumer_key: 'YOUR_CONSUMER_KEY', consumer_secret: 'YOUR_CONSUMER_SECRET', access_token_key: 'YOUR_ACCESS_TOKEN_KEY', access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET' } var Twitter = new TwitterPackage(secret); In the preceding code, put the keys and tokens you saved in their appropriate variables. We don't need the request package just yet, but we will later. Now let's create a hashtag listener to listen to the tweets on a specific hashtag: //Twitter stream var hashtag = '#brexit'; //put any hashtag to listen e.g. #brexit console.log('Listening to:' + hashtag); Twitter.stream('statuses/filter', {track: hashtag}, function(stream) { stream.on('data', function(tweet) { console.log('Tweet:@' + tweet.user.screen_name + '\t' + tweet.text); console.log('------') }); stream.on('error', function(error) { console.log(error); }); }); Replace #brexit with the hashtag you want to listen to. Use a popular one so that you can see the code in action. 
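As a side note, the consumer key and token values above are hard-coded only for brevity. If you prefer not to keep secrets in source code, you could read them from environment variables instead; the following is a minimal sketch (the environment variable names are illustrative, not part of the Twitter library):

// Alternative: read the Twitter credentials from environment variables set in your shell
var secret = {
  consumer_key: process.env.TWITTER_CONSUMER_KEY,
  consumer_secret: process.env.TWITTER_CONSUMER_SECRET,
  access_token_key: process.env.TWITTER_ACCESS_TOKEN_KEY,
  access_token_secret: process.env.TWITTER_ACCESS_TOKEN_SECRET
};
var Twitter = new TwitterPackage(secret);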
Run the index.js file with the node index.js command. You will see a stream of tweets from Twitter users all over the globe who used the hashtag: Congratulations! You have built your first Twitter bot. Exploring the Twitter SDK In the previous section, we explored how to listen to tweets based on hashtags. Let's now explore the Twitter SDK to understand the capabilities that we can bestow upon our Twitter bot. Updating your status You can also update your status on your Twitter timeline by using the following status update module code: tweet ('I am a Twitter Bot!', null, null); function tweet(statusMsg, screen_name, status_id){ console.log('Sending tweet to: ' + screen_name); console.log('In response to:' + status_id); var msg = statusMsg; if (screen_name != null){ msg = '@' + screen_name + ' ' + statusMsg; } console.log('Tweet:' + msg); Twitter.post('statuses/update', { status: msg }, function(err, response) { // if there was an error while tweeting if (err) { console.log('Something went wrong while TWEETING...'); console.log(err); } else if (response) { console.log('Tweeted!!!'); console.log(response) } }); } Comment out the hashtag listener code and instead add the preceding status update code and run it. When run, your bot will post a tweet on your timeline: In addition to tweeting on your timeline, you can also tweet in response to another tweet (or status update). The screen_name argument is used to create a response. tweet. screen_name is the name of the user who posted the tweet. We will explore this a bit later. Retweet to your followers You can retweet a tweet to your followers using the following retweet status code: var retweetId = '899681279343570944'; retweet(retweetId); function retweet(retweetId){ Twitter.post('statuses/retweet/', { id: retweetId }, function(err, response) { if (err) { console.log('Something went wrong while RETWEETING...'); console.log(err); } else if (response) { console.log('Retweeted!!!'); console.log(response) } }); } Searching for tweets You can also search for recent or popular tweets with hashtags using the following search hashtags code: search('#brexit', 'popular') function search(hashtag, resultType){ var params = { q: hashtag, // REQUIRED result_type: resultType, lang: 'en' } Twitter.get('search/tweets', params, function(err, data) { if (!err) { console.log('Found tweets: ' + data.statuses.length); console.log('First one: ' + data.statuses[1].text); } else { console.log('Something went wrong while SEARCHING...'); } }); } Exploring a news data service Let's now build a bot that will tweet news articles to its followers at regular intervals. We will then extend it to be personalized by users through a conversation that happens over direct messaging with the bot. In order to build a news bot, we need a source where we can get news articles. We are going to explore a news service called NewsAPI.org in this section. News API is a service that aggregates news articles from roughly 70 newspapers around the globe. Setting up News API Let us set up an account with the News API data service and get the API key: Go to NewsAPI.org: Click Get API key. Register using your email. Get your API key. Explore the sources: https://newsapi.org/v1/sources?apiKey=YOUR_API_KEY. There are about 70 sources from across the globe including popular ones such as BBC News, Associated Press, Bloomberg, and CNN. You might notice that each source has a category tag attached. 
The possible options are: business, entertainment, gaming, general, music, politics, science-and-nature, sport, and technology. You might also notice that each source also has language (en, de, fr) and country (au, de, gb, in, it, us) tags. The following is the information on the BBC-News source: { "id": "bbc-news", "name": "BBC News", "description": "Use BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provides trusted World and UK news as well as local and regional perspectives. Also entertainment, business, science, technology and health news.", "url": "http://www.bbc.co.uk/news", "category": "general", "language": "en", "country": "gb", "urlsToLogos": { "small": "", "medium": "", "large": "" }, "sortBysAvailable": [ "top" ] } Get sources for a specific category, language, or country using: https://newsapi.org/v1/sources?category=business&apiKey=YOUR_API_KEY The following is the part of the response to the preceding query asking for all sources under the business category: "sources": [ { "id": "bloomberg", "name": "Bloomberg", "description": "Bloomberg delivers business and markets news, data, analysis, and video to the world, featuring stories from Businessweek and Bloomberg News.", "url": "http://www.bloomberg.com", "category": "business", "language": "en", "country": "us", "urlsToLogos": { "small": "", "medium": "", "large": "" }, "sortBysAvailable": [ "top" ] }, { "id": "business-insider", "name": "Business Insider", "description": "Business Insider is a fast-growing business site with deep financial, media, tech, and other industry verticals. Launched in 2007, the site is now the largest business news site on the web.", "url": "http://www.businessinsider.com", "category": "business", "language": "en", "country": "us", "urlsToLogos": { "small": "", "medium": "", "large": "" }, "sortBysAvailable": [ "top", "latest" ] }, ... ] Explore the articles: https://newsapi.org/v1/articles?source=bbc-news&apiKey=YOUR_API_KEY The following is the sample response: "articles": [ { "author": "BBC News", "title": "US Navy collision: Remains found in hunt for missing sailors", "description": "Ten US sailors have been missing since Monday's collision with a tanker near Singapore.", "url": "http://www.bbc.co.uk/news/world-us-canada-41013686", "urlToImage": "https://ichef1.bbci.co.uk/news/1024/cpsprodpb/80D9/ production/_97458923_mediaitem97458918.jpg", "publishedAt": "2017-08-22T12:23:56Z" }, { "author": "BBC News", "title": "Afghanistan hails Trump support in 'joint struggle'", "description": "President Ghani thanks Donald Trump for supporting Afghanistan's battle against the Taliban.", "url": "http://www.bbc.co.uk/news/world-asia-41012617", "urlToImage": "https://ichef.bbci.co.uk/images/ic/1024x576/p05d08pf.jpg", "publishedAt": "2017-08-22T11:45:49Z" }, ... ] For each article, the author, title, description, url, urlToImage,, and publishedAt fields are provided. Now that we have explored a source of news data that provides up-to-date news stories under various categories, let us go on to build a news bot. Building a Twitter news bot Now that we have explored News API, a data source for the latest news updates, and a little bit of what the Twitter API can do, let us combine them both to build a bot tweeting interesting news stories, first on its own timeline and then specifically to each of its followers: Let's build a news tweeter module that tweets the top news article given the source. 
The following code uses the tweet() function we built earlier: topNewsTweeter('cnn', null); function topNewsTweeter(newsSource, screen_name, status_id){ request({ url: 'https://newsapi.org/v1/articles?source=' + newsSource + '&apiKey=YOUR_API_KEY', method: 'GET' }, function (error, response, body) { //response is from the bot if (!error && response.statusCode == 200) { var botResponse = JSON.parse(body); console.log(botResponse); tweetTopArticle(botResponse.articles, screen_name); } else { console.log('Sorry. No new'); } }); } function tweetTopArticle(articles, screen_name, status_id){ var article = articles[0]; tweet(article.title + " " + article.url, screen_name); } Run the preceding program to fetch news from CNN and post the topmost article on Twitter: Here is the post on Twitter: Now, let us build a module that tweets news stories from a randomly-chosen source in a list of sources: function tweetFromRandomSource(sources, screen_name, status_id){ var max = sources.length; var randomSource = sources[Math.floor(Math.random() * (max + 1))]; //topNewsTweeter(randomSource, screen_name, status_id); } Let's call the tweeting module after we acquire the list of sources: function getAllSourcesAndTweet(){ var sources = []; console.log('getting sources...') request({ url: 'https://newsapi.org/v1/sources? apiKey=YOUR_API_KEY', method: 'GET' }, function (error, response, body) { //response is from the bot if (!error && response.statusCode == 200) { // Print out the response body var botResponse = JSON.parse(body); for (var i = 0; i < botResponse.sources.length; i++){ console.log('adding.. ' + botResponse.sources[i].id) sources.push(botResponse.sources[i].id) } tweetFromRandomSource(sources, null, null); } else { console.log('Sorry. No news sources!'); } }); } Let's create a new JS file called tweeter.js. In the tweeter.js file, call getSourcesAndTweet() to get the process started: //tweeter.js var TwitterPackage = require('twitter'); var request = require('request'); console.log("Hello World! I am a twitter bot!"); var secret = { consumer_key: 'YOUR_CONSUMER_KEY', consumer_secret: 'YOUR_CONSUMER_SECRET', access_token_key: 'YOUR_ACCESS_TOKEN_KEY', access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET' } var Twitter = new TwitterPackage(secret); getAllSourcesAndTweet(); Run the tweeter.js file on the console. This bot will tweet a news story every time it is called. It will choose top news stories from around 70 news sources randomly. Hurray! You have built your very own Twitter news bot. In this tutorial, we have covered a lot. We started off with the Twitter API and got a taste of how we can automatically tweet, retweet, and search for tweets using hashtags. We then explored a News source API that provides news articles from about 70 different newspapers. We integrated it with our Twitter bot to create a new tweeting bot. If you found this post useful, do check out the book, Hands-On Chatbots and Conversational UI Development, which will help you explore the world of conversational user interfaces. Build and train an RNN chatbot using TensorFlow [Tutorial] Building a two-way interactive chatbot with Twilio: A step-by-step guide How to create a conversational assistant or chatbot using Python

Classifying flowers in Iris Dataset using Scala [Tutorial]

Savia Lobo
06 Sep 2018
15 min read
The Iris dataset is the simplest, yet the most famous data analysis task in the ML space. In this article, you will build a solution for data analysis & classification task from an Iris dataset using Scala. This article is an excerpt taken from Modern Scala Projects written by Ilango Gurusamy. The following diagrams together help in understanding the different components of this project. That said, this pipeline involves training (fitting), transformation, and validation operations. More than one model is trained and the best model (or mapping function) is selected to give us an accurate approximation predicting the species of an Iris flower (based on measurements of those flowers): Project block diagram A breakdown of the project block diagram is as follows: Spark, which represents the Spark cluster and its ecosystem Training dataset Model Dataset attributes or feature measurements An inference process, that produces a prediction column The following diagram represents a more detailed description of the different phases in terms of the functions performed in each phase. Later we will come to visualize pipeline in terms of its constituent stages. For now, the diagram depicts four stages, starting with a data pre-processing phase, which is considered separate from the numbered phases deliberately. Think of the pipeline as a two-step process:  A data cleansing phase, or pre-processing phase. An important phase that could include a subphase of Exploratory Data Analysis (EDA) (not explicitly depicted in the latter diagram). A data analysis phase that begins with Feature Extraction, followed by Model Fitting, and Model validation, all the way to deployment of an Uber pipeline JAR into Spark: Pipeline diagram Referring to the preceding diagram, the first implementation objective is to set up Spark inside an SBT project. An SBT project is a self-contained application, which we can run on the command line to predict Iris labels. In the SBT project,  dependencies are specified in a build.sbt file and our application code will create its  own  SparkSession and SparkContext. So that brings us to a listing of implementation objectives and these are as follows: Get the Iris dataset from the UCI Machine Learning Repository Conduct preliminary EDA in the Spark shell Create a new Scala project in IntelliJ, and carry out all implementation steps, until the evaluation of the Random Forest classifier Deploy the application to your local Spark cluster Step 1# Getting the Iris dataset from the UCI Machine Learning Repository Head over to the UCI Machine Learning Repository website at https://archive.ics.uci.edu/ml/datasets/iris and click on Download: Data Folder. Extract this folder someplace convenient and copy over iris.csv into the root of your project folder. You may refer back to the project overview for an in-depth description of the Iris dataset. We depict the contents of the iris.csv file here, as follows: A snapshot of the Iris dataset with 150 sets You may recall that the iris.csv file is a 150-row file, with comma-separated values. Now that we have the dataset, the first step will be performing EDA on it. The Iris dataset is multivariate, meaning there is more than one (independent) variable, so we will carry out a basic multivariate EDA on it. But we need DataFrame to let us do that. How we create a dataframe as a prelude to EDA is the goal of the next section. Step 2# Preliminary EDA Before we get down to building the SBT pipeline project, we will conduct a preliminary EDA in spark-shell. 
The plan is to derive a dataframe out of the dataset and then calculate basic statistics on it. We have three tasks at hand for spark-shell: Fire up spark-shell Load the iris.csv file and build DataFrame Calculate the statistics We will then port that code over to a Scala file inside our SBT project. That said, let's get down to loading the iris.csv file (inputting the data source) before eventually building DataFrame. Step 3# Creating an SBT project Lay out your SBT project in a folder of your choice and name it IrisPipeline or any name that makes sense to you. This will hold all of our files needed to implement and run the pipeline on the Iris dataset. The structure of our SBT project looks like the following: Project structure We will list dependencies in the build.sbt file. This is going to be an SBT project. Hence, we will bring in the following key libraries: Spark Core Spark MLlib Spark SQL The following screenshot illustrates the build.sbt file: The build.sbt file with Spark dependencies The build.sbt file referenced in the preceding snapshot is readily available for you in the book's download bundle. Drill down to the folder Chapter01 code under ModernScalaProjects_Code and copy the folder over to a convenient location on your computer. Drop the iris.csv file that you downloaded in Step 1 – getting the Iris dataset from the UCI Machine Learning Repository into the root folder of our new SBT project. Refer to the earlier screenshot that depicts the updated project structure with the iris.csv file inside of it. Step 4# Creating Scala files in SBT project Step 4 is broken down into the following steps: Create the Scala file iris.scala in the com.packt.modern.chapter1 package. Up until now, we relied on SparkSession and SparkContext, which spark-shell gave us. This time around, we need to create SparkSession, which will, in turn, give us SparkContext. What follows is how the code is laid out in the iris.scala file. In iris.scala, after the package statement, place the following import statements: import org.apache.spark.sql.SparkSession Create SparkSession inside a trait, which we shall call IrisWrapper: lazy val session: SparkSession = SparkSession.builder().getOrCreate() Just one SparkSession is made available to all classes extending from IrisWrapper. Create val to hold the iris.csv file path: val dataSetPath = "<<path to folder containing your iris.csv file>>\\iris.csv" Create a method to build DataFrame. This method takes in the complete path to the Iris dataset path as String and returns DataFrame: def buildDataFrame(dataSet: String): DataFrame = { /* The following is an example of a dataSet parameter string: "C:\\Your\\Path\\To\\iris.csv" */ Import the DataFrame class by updating the previous import statement for SparkSession: import org.apache.spark.sql.{DataFrame, SparkSession} Create a nested function inside buildDataFrame to process the raw dataset. Name this function getRows. getRows which takes no parameters but returns Array[(Vector, String)]. The textFile method on the SparkContext variable processes the iris.csv into RDD[String]: val result1: Array[String] = session.sparkContext.textFile(<<path to iris.csv represented by the dataSetPath variable>>) The resulting RDD contains two partitions. Each partition, in turn, contains rows of strings separated by a newline character, '\n'. Each row in the RDD represents its original counterpart in the raw data. In the next step, we will attempt several data transformation steps. 
We start by applying a flatMap operation over the RDD, culminating in the DataFrame creation. DataFrame is a view over Dataset, which happens to the fundamental data abstraction unit in the Spark 2.0 line. Step 5# Preprocessing, data transformation, and DataFrame creation We will get started by invoking flatMap, by passing a function block to it, and successive transformations listed as follows, eventually resulting in Array[(org.apache.spark.ml.linalg.Vector, String)]. A vector represents a row of feature measurements. The Scala code to give us Array[(org.apache.spark.ml.linalg.Vector, String)] is as follows: //Each line in the RDD is a row in the Dataset represented by a String, which we can 'split' along the new //line character val result2: RDD[String] = result1.flatMap { partition => partition.split("\n").toList } //the second transformation operation involves a split inside of each line in the dataset where there is a //comma separating each element of that line val result3: RDD[Array[String]] = result2.map(_.split(",")) Next, drop the header column, but not before doing a collection that returns an Array[Array[String]]: val result4: Array[Array[String]] = result3.collect.drop(1) The header column is gone; now import the Vectors class: import org.apache.spark.ml.linalg.Vectors Now, transform Array[Array[String]] into Array[(Vector, String)]: val result5 = result4.map(row => (Vectors.dense(row(1).toDouble, row(2).toDouble, row(3).toDouble, row(4).toDouble),row(5))) Step 6# Creating, training, and testing data Now, let's split our dataset in two by providing a random seed: val splitDataSet: Array[org.apache.spark.sql.Dataset [org.apache.spark.sql.Row]] = dataSet.randomSplit(Array(0.85, 0.15), 98765L) Now our new splitDataset contains two datasets: Train dataset: A dataset containing Array[(Vector, iris-species-label-column: String)] Test dataset: A dataset containing Array[(Vector, iris-species-label-column: String)] Confirm that the new dataset is of size 2: splitDataset.size res48: Int = 2 Assign the training dataset to a variable, trainSet: val trainDataSet = splitDataSet(0) trainSet: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iris-features-column: vector, iris-species-label-column: string] Assign the testing dataset to a variable, testSet: val testDataSet = splitDataSet(1) testSet: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iris-features-column: vector, iris-species-label-column: string] Count the number of rows in the training dataset: trainSet.count res12: Long = 14 Count the number of rows in the testing dataset: testSet.count res9: Long = 136 There are 150 rows in all. Step 7# Creating a Random Forest classifier In reference to Step 5 - DataFrame Creation. This DataFrame 'dataFrame' contains column names that corresponds to the columns present in the DataFrame produced in that step The first step to create a classifier is to  pass into it (hyper) parameters. 
A fairly comprehensive list of parameters look like this: From 'dataFrame' we need the Features column name - iris-features-column From 'dataFrame' we also need the Indexed label column name - iris-species-label-column The sqrt setting for featureSubsetStrategy Number of features to be considered per split (we have 150 observations and four features that will make our max_features value 2) Impurity settings—values can be gini and entropy Number of trees to train (since the number of trees is greater than one, we set a tree maximum depth), which is a number equal to the number of nodes The required minimum number of feature measurements (sampled observations), also known as the minimum instances per node Look at the IrisPipeline.scala file for values of each of these parameters. But this time, we will employ an exhaustive grid search-based model selection process based on combinations of parameters, where parameter value ranges are specified. Create a randomForestClassifier instance. Set the features and featureSubsetStrategy: val randomForestClassifier = new RandomForestClassifier() .setFeaturesCol(irisFeatures_CategoryOrSpecies_IndexedLabel._1) .setFeatureSubsetStrategy("sqrt") Start building Pipeline, which has two stages, Indexer and Classifier: val irisPipeline = new Pipeline().setStages(Array[PipelineStage](indexer) ++ Array[PipelineStage](randomForestClassifier)) Next, set the hyperparameter num_trees (number of trees) on the classifier to 15, a Max_Depth parameter, and an impurity with two possible values of gini and entropy. Build a parameter grid with all three hyperparameters: val finalParamGrid: Array[ParamMap] = gridBuilder3.build() Step 8# Training the Random Forest classifier Next, we want to split our training set into a validation set and a training set: val validatedTestResults: DataFrame = new TrainValidationSplit() On this variable, set Seed, set EstimatorParamMaps, set Estimator with irisPipeline, and set a training ratio to 0.8: val validatedTestResults: DataFrame = new TrainValidationSplit().setSeed(1234567L).setEstimator(irisPipeline) Finally, do a fit and a transform with our training dataset and testing dataset. Great! Now the classifier is trained. In the next step, we will apply this classifier to testing the data. Step 9# Applying the Random Forest classifier to test data The purpose of our validation set is to be able to make a choice between models. We want an evaluation metric and hyperparameter tuning. We will now create an instance of a validation estimator called TrainValidationSplit, which will split the training set into a validation set and a training set: val validatedTestResults.setEvaluator(new MulticlassClassificationEvaluator()) Next, we fit this estimator over the training dataset to produce a model and a transformer that we will use to transform our testing dataset. Finally, we perform a validation for hyperparameter tuning by applying an evaluator for a metric. 
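Pulling these fragments together, the grid construction and the validation stage might look something like the following sketch. Note that gridBuilder3 is not shown in the excerpt, so its construction here is an assumption based on the three hyperparameters named above, and it is the fitted model (not the TrainValidationSplit estimator itself) that produces the validatedTestResults DataFrame:

import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator

// Grid over the three hyperparameters discussed above (maxDepth values are illustrative)
val gridBuilder3 = new ParamGridBuilder()
  .addGrid(randomForestClassifier.numTrees, Array(15))
  .addGrid(randomForestClassifier.maxDepth, Array(3, 5))
  .addGrid(randomForestClassifier.impurity, Array("gini", "entropy"))
val finalParamGrid: Array[ParamMap] = gridBuilder3.build()

// Validation estimator: 80% of the training split is used for fitting, 20% for validation
val trainValidationSplit = new TrainValidationSplit()
  .setSeed(1234567L)
  .setEstimator(irisPipeline)
  .setEvaluator(new MulticlassClassificationEvaluator())
  .setEstimatorParamMaps(finalParamGrid)
  .setTrainRatio(0.8)

val validatedModel = trainValidationSplit.fit(trainDataSet)
val validatedTestResults: DataFrame = validatedModel.transform(testDataSet)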
The new ValidatedTestResults DataFrame should look something like this: --------+ |iris-features-column|iris-species-column|label| rawPrediction| probability|prediction| +--------------------+-------------------+-----+--------------------+ | [4.4,3.2,1.3,0.2]| Iris-setosa| 0.0| [40.0,0.0,0.0]| [1.0,0.0,0.0]| 0.0| | [5.4,3.9,1.3,0.4]| Iris-setosa| 0.0| [40.0,0.0,0.0]| [1.0,0.0,0.0]| 0.0| | [5.4,3.9,1.7,0.4]| Iris-setosa| 0.0| [40.0,0.0,0.0]| [1.0,0.0,0.0]| 0.0| Let's return a new dataset by passing in column expressions for prediction and label: val validatedTestResultsDataset:DataFrame = validatedTestResults.select("prediction", "label") In the line of code, we produced a new DataFrame with two columns: An input label A predicted label, which is compared with its corresponding value in the input label column That brings us to the next step, an evaluation step. We want to know how well our model performed. That is the goal of the next step. Step 10# Evaluate Random Forest classifier In this section, we will test the accuracy of the model. We want to know how well our model performed. Any ML process is incomplete without an evaluation of the classifier. That said, we perform an evaluation as a two-step process: Evaluate the model output Pass in three hyperparameters: val modelOutputAccuracy: Double = new MulticlassClassificationEvaluator() Set the label column, a metric name, the prediction column label, and invoke evaluation with the validatedTestResults dataset. Note the accuracy of the model output results on the testing dataset from the modelOutputAccuracy variable. The other metrics to evaluate are how close the predicted label value in the 'predicted' column is to the actual label value in the (indexed) label column. Next, we want to extract the metrics: val multiClassMetrics = new MulticlassMetrics(validatedRDD2) Our pipeline produced predictions. As with any prediction, we need to have a healthy degree of skepticism. Naturally, we want a sense of how our engineered prediction process performed. The algorithm did all the heavy lifting for us in this regard. That said, everything we did in this step was done for the purpose of evaluation. Who is being evaluated here or what evaluation is worth reiterating? That said, we wanted to know how close the predicted values were compared to the actual label value. To obtain that knowledge, we decided to use the MulticlassMetrics class to evaluate metrics that will give us a measure of the performance of the model via two methods: Accuracy Weighted precision val accuracyMetrics = (multiClassMetrics.accuracy, multiClassMetrics.weightedPrecision) val accuracy = accuracyMetrics._1 val weightedPrecsion = accuracyMetrics._2 These metrics represent evaluation results for our classifier or classification model. In the next step, we will run the application as a packaged SBT application. 
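One loose end from the evaluation above: the validatedRDD2 value handed to MulticlassMetrics is not shown in this excerpt. MulticlassMetrics expects an RDD of (prediction, label) pairs of Doubles, so one plausible derivation from the prediction/label DataFrame looks like this sketch (an illustration, not the book's exact code):

import org.apache.spark.sql.Row

// Convert the (prediction, label) DataFrame into the RDD[(Double, Double)] that MulticlassMetrics expects
val validatedRDD2 = validatedTestResultsDataset.rdd.map {
  case Row(prediction: Double, label: Double) => (prediction, label)
}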
Step 11# Running the pipeline as an SBT application At the root of your project folder, issue the sbt console command, and in the Scala shell, import the IrisPipeline object and then invoke the main method of IrisPipeline with the argument iris: sbt console scala> import com.packt.modern.chapter1.IrisPipeline IrisPipeline.main(Array("iris") Accuracy (precision) is 0.9285714285714286 Weighted Precision is: 0.9428571428571428 Step 12# Packaging the application In the root folder of your SBT application, run: sbt package When SBT is done packaging, the Uber JAR can be deployed into our cluster, using spark-submit, but since we are in standalone deploy mode, it will be deployed into [local]: The application JAR file The package command created a JAR file that is available under the target folder. In the next section, we will deploy the application into Spark. Step 13# Submitting the pipeline application to Spark local At the root of the application folder, issue the spark-submit command with the class and JAR file path arguments, respectively. If everything went well, the application does the following: Loads up the data. Performs EDA. Creates training, testing, and validation datasets. Creates a Random Forest classifier model. Trains the model. Tests the accuracy of the model. This is the most important part—the ML classification task. To accomplish this, we apply our trained Random Forest classifier model to the test dataset. This dataset consists of Iris flower data of so far not seen by the model. Unseen data is nothing but Iris flowers picked in the wild. Applying the model to the test dataset results in a prediction about the species of an unseen (new) flower. The last part is where the pipeline runs an evaluation process, which essentially is about checking if the model reports the correct species. Lastly, pipeline reports back on how important a certain feature of the Iris flower turned out to be. As a matter of fact, the petal width turns out to be more important than the sepal width in carrying out the classification task. Thus we implemented an ML workflow or an ML pipeline. The pipeline combined several stages of data analysis into one workflow. We started by loading the data and from there on, we created training and test data, preprocessed the dataset, trained the RandomForestClassifier model, applied the Random Forest classifier to test data, evaluated the classifier, and computed a process that demonstrated the importance of each feature in the classification. If you've enjoyed reading this post visit the book, Modern Scala Projects to build efficient data science projects that fulfill your software requirements. Deep Learning Algorithms: How to classify Irises using multi-layer perceptrons Introducing Android 9 Pie, filled with machine learning and baked-in UI features Paper in Two minutes: A novel method for resource efficient image classification

Intelligent mobile projects with TensorFlow: Build your first Reinforcement Learning model on Raspberry Pi [Tutorial]

Bhagyashree R
05 Sep 2018
13 min read
OpenAI Gym (https://gym.openai.com) is an open source Python toolkit that offers many simulated environments to help you develop, compare, and train reinforcement learning algorithms, so you don't have to buy all the sensors and train your robot in the real environment, which can be costly in both time and money. In this article, we'll show you how to develop and train a reinforcement learning model on Raspberry Pi using TensorFlow in an OpenAI Gym's simulated environment called CartPole (https://gym.openai.com/envs/CartPole-v0). This tutorial is an excerpt from a book written by Jeff Tang titled Intelligent Mobile Projects with TensorFlow. To install OpenAI Gym, run the following commands: git clone https://github.com/openai/gym.git cd gym sudo pip install -e . You can verify that you have TensorFlow 1.6 and gym installed by running pip list: pi@raspberrypi:~ $ pip list gym (0.10.4, /home/pi/gym) tensorflow (1.6.0) Or you can start IPython then import TensorFlow and gym: pi@raspberrypi:~ $ ipython Python 2.7.9 (default, Sep 17 2016, 20:26:04) IPython 5.5.0 -- An enhanced Interactive Python. In [1]: import tensorflow as tf In [2]: import gym In [3]: tf.__version__ Out[3]: '1.6.0' In [4]: gym.__version__ Out[4]: '0.10.4' We're now all set to use TensorFlow and gym to build some interesting reinforcement learning model running on Raspberry Pi. Understanding the CartPole simulated environment CartPole is an environment that can be used to train a robot to stay in balance. In the CartPole environment, a pole is attached to a cart, which moves horizontally along a track. You can take an action of 1 (accelerating right) or 0 (accelerating left) to the cart. The pole starts upright, and the goal is to prevent it from falling over. A reward of 1 is provided for every time step that the pole remains upright. An episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center. Let's play with the CartPole environment now. First, create a new environment and find out the possible actions an agent can take in the environment: env = gym.make("CartPole-v0") env.action_space # Discrete(2) env.action_space.sample() # 0 or 1 Every observation (state) consists of four values about the cart: its horizontal position, its velocity, its pole's angle, and its angular velocity: obs=env.reset() obs # array([ 0.04052535, 0.00829587, -0.03525301, -0.00400378]) Each step (action) in the environment will result in a new observation, a reward of the action, whether the episode is done (if it is then you can't take any further steps), and some additional information: obs, reward, done, info = env.step(1) obs # array([ 0.04069127, 0.2039052 , -0.03533309, -0.30759772]) Remember action (or step) 1 means moving right, and 0 left. To see how long an episode can last when you keep moving the cart right, run: while not done: obs, reward, done, info = env.step(1) print(obs) #[ 0.08048328 0.98696604 -0.09655727 -1.54009127] #[ 0.1002226 1.18310769 -0.12735909 -1.86127705] #[ 0.12388476 1.37937549 -0.16458463 -2.19063676] #[ 0.15147227 1.5756628 -0.20839737 -2.52925864] #[ 0.18298552 1.77178219 -0.25898254 -2.87789912] Let's now manually go through a series of actions from start to end and print out the observation's first value (the horizontal position) and third value (the pole's angle in degrees from vertical) as they're the two values that determine whether an episode is done. 
First, reset the environment and accelerate the cart right a few times: import numpy as np obs=env.reset() obs[0], obs[2]*360/np.pi # (0.008710582898326602, 1.4858315848689436) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.009525842685697472, 1.5936049816642313) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.014239775393474322, 1.040038643681757) obs, reward, done, info = env.step(1) obs[0], obs[2]*360/np.pi # (0.0228521194217381, -0.17418034908781568) You can see that the cart's position value gets bigger and bigger as it's moved right, the pole's vertical degree gets smaller and smaller, and the last step shows a negative degree, meaning the pole is going to the left side of the center. All this makes sense, with just a little vivid picture in your mind of your favorite dog pushing a cart with a pole. Now change the action to accelerate the cart left (0) a few times: obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.03536432554326476, -2.0525933052704954) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04397450935915654, -3.261322987287562) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04868738508385764, -3.812330822419413) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04950617929263011, -3.7134404042580687) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.04643238384389254, -2.968245724428785) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.039465670006712444, -1.5760901885345346) You may be surprised at first to see the 0 action causes the positions (obs[0]) to continue to get bigger for several times, but remember that the cart is moving at a velocity and one or several actions of moving the cart to the other direction won't decrease the position value immediately. But if you keep moving the cart to the left, you'll see that the cart's position starts becoming smaller (toward the left). Now continue the 0 action and you'll see the position gets smaller and smaller, with a negative value meaning the cart enters the left side of the center, while the pole's angle gets bigger and bigger: obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.028603948219811447, 0.46789197320636305) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (0.013843572459953138, 3.1726728882727504) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (-0.00482029774222077, 6.551160678086707) obs, reward, done, info = env.step(0) obs[0], obs[2]*360/np.pi # (-0.02739315127299434, 10.619948631208114) For the CartPole environment, the reward value returned in each step call is always 1, and the info is always {}.  So that's all there's to know about the CartPole simulated environment. Now that we understand how CartPole works, let's see what kinds of policies we can come up with so at each state (observation), we can let the policy tell us which action (step) to take in order to keep the pole upright for as long as possible, in other words, to maximize our rewards. Using neural networks to build a better policy Let's first see how to build a random policy using a simple fully connected (dense) neural network, which takes 4 values in an observation as input, uses a hidden layer of 4 neurons, and outputs the probability of the 0 action, based on which, the agent can sample the next action between 0 and 1: To follow along you can download the code files from the book's GitHub repository. 
# nn_random_policy.py import tensorflow as tf import numpy as np import gym env = gym.make("CartPole-v0") num_inputs = env.observation_space.shape[0] inputs = tf.placeholder(tf.float32, shape=[None, num_inputs]) hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu) outputs = tf.layers.dense(hidden, 1, activation=tf.nn.sigmoid) action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1) with tf.Session() as sess: sess.run(tf.global_variables_initializer()) total_rewards = [] for _ in range(1000): rewards = 0 obs = env.reset() while True: a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards += reward if done: break total_rewards.append(rewards) print(np.mean(total_rewards)) Note that we use the tf.multinomial function to sample an action based on the probability distribution of action 0 and 1, defined as outputs and 1-outputs, respectively (the sum of the two probabilities is 1). The mean of the total rewards will be around 20-something. This is a neural network that is generating a random policy, with no training at all. To train the network, we use tf.nn.sigmoid_cross_entropy_with_logits to define the loss function between the network output and the desired y_target action, defined using the basic simple policy in the previous subsection, so we expect this neural network policy to achieve about the same rewards as the basic non-neural-network policy: # nn_simple_policy.py import tensorflow as tf import numpy as np import gym env = gym.make("CartPole-v0") num_inputs = env.observation_space.shape[0] inputs = tf.placeholder(tf.float32, shape=[None, num_inputs]) y = tf.placeholder(tf.float32, shape=[None, 1]) hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu) logits = tf.layers.dense(hidden, 1) outputs = tf.nn.sigmoid(logits) action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1) cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits) optimizer = tf.train.AdamOptimizer(0.01) training_op = optimizer.minimize(cross_entropy) with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for _ in range(1000): obs = env.reset() while True: y_target = np.array([[1. if obs[2] < 0 else 0.]]) a, _ = sess.run([action, training_op], feed_dict={inputs: obs.reshape(1, num_inputs), y: y_target}) obs, reward, done, info = env.step(a[0][0]) if done: break print("training done") We define outputs as a sigmoid function of the logits net output, that is, the probability of action 0, and then use the tf.multinomial to sample an action. Note that we use the standard tf.train.AdamOptimizer and its minimize method to train the network. To test and see how good the policy is, run the following code: total_rewards = [] for _ in range(1000): rewards = 0 obs = env.reset() while True: y_target = np.array([1. if obs[2] < 0 else 0.]) a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards += reward if done: break total_rewards.append(rewards) print(np.mean(total_rewards)) We're now all set to explore how we can implement a policy gradient method on top of this to make our neural network perform much better, getting rewards several times larger. 
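For reference, the "basic simple policy" that the y_target line above encodes is just an angle-sign heuristic. Written out on its own, it would look something like this sketch (an illustration, not the book's exact code):

def basic_policy(obs):
    # Accelerate left (0) when the pole leans left, right (1) otherwise,
    # mirroring the y_target rule used to train the network above
    return 0 if obs[2] < 0 else 1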
The basic idea of a policy gradient is that in order to train a neural network to generate a better policy, when all an agent knows from the environment is the rewards it can get when taking an action from any given state, we can adopt two new mechanisms: Discounted rewards: Each action's value needs to consider its future action rewards. For example, an action that gets an immediate reward, 1, but ends the episode two actions (steps) later should have fewer long-term rewards than an action that gets an immediate reward, 1, but ends the episode 10 steps later. Test run the current policy and see which actions lead to higher discounted rewards, then update the current policy's gradients (of the loss for weights) with the discounted rewards, in a way that an action with higher discounted rewards will, after the network update, have a higher probability of being chosen next time. Repeat such test runs and update the process many times to train a neural network for a better policy. Implementing a policy gradient in TensorFlow Let's now see how to implement a policy gradient for our CartPole problem in TensorFlow. First, import tensorflow, numpy, and gym, and define a helper method that calculates the normalized and discounted rewards: import tensorflow as tf import numpy as np import gym def normalized_discounted_rewards(rewards): dr = np.zeros(len(rewards)) dr[-1] = rewards[-1] for n in range(2, len(rewards)+1): dr[-n] = rewards[-n] + dr[-n+1] * discount_rate return (dr - dr.mean()) / dr.std() Next, create the CartPole gym environment, define the learning_rate and discount_rate hyper-parameters, and build the network with four input neurons, four hidden neurons, and one output neuron as before: env = gym.make("CartPole-v0") learning_rate = 0.05 discount_rate = 0.95 num_inputs = env.observation_space.shape[0] inputs = tf.placeholder(tf.float32, shape=[None, num_inputs]) hidden = tf.layers.dense(inputs, 4, activation=tf.nn.relu) logits = tf.layers.dense(hidden, 1) outputs = tf.nn.sigmoid(logits) action = tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1) prob_action_0 = tf.to_float(1-action) cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=prob_action_0) optimizer = tf.train.AdamOptimizer(learning_rate) To manually fine-tune the gradients to take into consideration the discounted rewards for each action we first use the compute_gradients method, then update the gradients the way we want, and finally call the apply_gradients method. So let's now  compute the gradients of the cross-entropy loss for the network parameters (weights and biases), and set up gradient placeholders, which are to be fed later with the values that consider both the computed gradients and the discounted rewards of the actions taken using the current policy during test run: gvs = optimizer.compute_gradients(cross_entropy) gvs = [(g, v) for g, v in gvs if g != None] gs = [g for g, _ in gvs] gps = [] gvs_feed = [] for g, v in gvs: gp = tf.placeholder(tf.float32, shape=g.get_shape()) gps.append(gp) gvs_feed.append((gp, v)) training_op = optimizer.apply_gradients(gvs_feed) The  gvs returned from optimizer.compute_gradients(cross_entropy) is a list of tuples, and each tuple consists of the gradient (of the cross_entropy for a trainable variable) and the trainable variable. 
If you run the script multiple times from IPython, the default graph of the tf object will contain trainable variables from previous runs, so unless you call tf.reset_default_graph(), you need to use gvs = [(g, v) for g, v in gvs if g != None] to remove those obsolete training variables, which would return None gradients. Now, play some games and save the rewards and gradient values: with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for _ in range(1000): rewards, grads = [], [] obs = env.reset() # using current policy to test play a game while True: a, gs_val = sess.run([action, gs], feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards.append(reward) grads.append(gs_val) if done: break After the test play of a game, update the gradients with discounted rewards and train the network (remember that training_op is defined as optimizer.apply_gradients(gvs_feed)): # update gradients and do the training nd_rewards = normalized_discounted_rewards(rewards) gp_val = {} for i, gp in enumerate(gps): gp_val[gp] = np.mean([grads[k][i] * reward for k, reward in enumerate(nd_rewards)], axis=0) sess.run(training_op, feed_dict=gp_val) Finally, after 1,000 iterations of test play and updates, we can test the trained model: total_rewards = [] for _ in range(100): rewards = 0 obs = env.reset() while True: a = sess.run(action, feed_dict={inputs: obs.reshape(1, num_inputs)}) obs, reward, done, info = env.step(a[0][0]) rewards += reward if done: break total_rewards.append(rewards) print(np.mean(total_rewards)) Note that we now use the trained policy network and sess.run to get the next action with the current observation as input. The output mean of the total rewards will be about 200. You can also save a trained model after the training using tf.train.Saver: saver = tf.train.Saver() saver.save(sess, "./nnpg.ckpt") Then you can reload it in a separate test program with: with tf.Session() as sess: saver.restore(sess, "./nnpg.ckpt") Now that you have a powerful neural-network-based policy model that can help your robot keep in balance, fully tested in a simulated environment, you can deploy it in a real physical environment, after replacing the simulated environment API returns with real environment data, of course—but the code to build and train the neural network reinforcement learning model can certainly be easily reused. If you liked this tutorial and would like to learn more such techniques, pick up this book, Intelligent Mobile Projects with TensorFlow, authored by Jeff Tang. AI on mobile: How AI is taking over the mobile devices marketspace Introducing Intelligent Apps AI and the Raspberry Pi: Machine Learning and IoT, What’s the Impact?

How to use artificial intelligence to create games with rich and interactive environments [Tutorial]

Sugandha Lahoti
04 Sep 2018
10 min read
Many of the most popular games on the planet have one thing in common: they all have rich, vivid worlds for the player to inhabit and interact with. This doesn't just mean a huge terrain or an extensive map (although it might do), it could simply be how things appear within the world. Similarly, it's not just about the environment - it's also about characters who are able to react in different ways according to the game. The only way to achieve an impressive level of 'realism' is through powerful artificial intelligence. This isn't easy, but it can be done. And learning how to do it will be well worth it, as it will create a much more engaging end product for players. This tutorial is taken from the book Practical Game AI Programming by Micael DaGraca. This book teaches you to create Game AI and implement cutting-edge AI algorithms from scratch. Let's take a look at how we can use AI to create rich environments. Breaking down the game environment by area When we create a map, often we have two or more different areas that could be used to change the gameplay, areas that could contain water, quicksand, flying zones, caves, and much more. If we wish to create an AI character that can be used in any level of our game, and anywhere, we need to take this into consideration and make the AI aware of the different zones of the map. Usually, that means that we need to input more information into the character's behavior, including how to react according to the position in which he is currently placed, or a situation where he can choose where to go. Should he avoid some areas? Should he prefer others? This type of information is relevant because it makes the character aware of the surroundings, choosing or adapting and taking into consideration his position. Not planning this correctly can lead to some unnatural decisions. For example, in Elder Scrolls V: Skyrim developed by Bethesda Softworks studio, we can watch some AI characters of the game simply turning back when they do not have information about how they should behave in some parts of the map, especially on mountains or rivers. Depending on the zones that our character finds, he might react differently or update his behavior tree to adapt to his environment. The environment that surrounds our characters can redefine their priorities or completely change their behaviors. This is a little similar to what Jean-Jacques Rousseau said about humanity: "We are good by nature, but corrupted by society." As humans, we are a representation of the environment that surrounds us, and for that reason, artificial intelligence should follow the same principle. Let's pick a  soldier and update his code to work on a different scenario. We want to change his behavior according to three different zones, beach, river, and forest. So, we'll create three public static Boolean functions with the names Beach, Forest and River; then we define the zones on the map that will turn them on or off. public static bool Beach; public static bool River; public static bool Forest; Because in this example, just one of them can be true at a time, we'll add a simple line of code that disables the other options once one of them gets activated. if(Beach == true) { Forest = false; River = false; } if(Forest == true){ Beach = false; River = false; } if(River == true){ Forest = false; Beach = false; } Once we have that done, we can start defining the different behaviors for each zone. 
Once we have that done, we can start defining the different behaviors for each zone. For example, in the beach zone the characters don't have anywhere to take cover, so that option needs to be taken away and replaced with a new one. The river zone can be used to get across to the other side, so the character can hide from the player and attack from that position. Finally, in the forest zone we can define the character to be more careful and use the trees for cover. Depending on the zone, we can change the values to better adapt to the environment, or create new functions that let us use some specific characteristic of that zone.

if (Forest == true)
{
    // The AI will remain passive until an interaction with the player occurs
    if (Health == 100 && triggerL == false && triggerR == false && triggerM == false)
    {
        statePassive = true;
        stateAggressive = false;
        stateDefensive = false;
    }
    // The AI will shift to defensive mode if the player comes from the right side or if the AI is below 20 HP
    if (Health <= 100 && triggerR == true || Health <= 20)
    {
        statePassive = false;
        stateAggressive = false;
        stateDefensive = true;
    }
    // The AI will shift to aggressive mode if the player comes from the left side or the middle and the AI is above 20 HP
    if (Health > 20 && triggerL == true || Health > 20 && triggerM == true)
    {
        statePassive = false;
        stateAggressive = true;
        stateDefensive = false;
    }
    walk = speed * Time.deltaTime;
    walkBack = speedBack * Time.deltaTime; // backward movement scaled with its own speed variable
}

Advanced environment interactions with AI

As the video game industry and its technology kept evolving, new gameplay ideas appeared, and the interaction between game characters and the environment rapidly became even more interesting, especially when physics is involved. This means that the state of the environment can be completely unpredictable, requiring the AI characters to constantly adapt to different situations. One honorable mention on this subject is the video game Worms, developed by Team17, where the map can be fully destroyed and the AI characters are still able to adapt and make smart decisions.

The objective of the game is to destroy the opposing team by killing all of their worms; the last team standing wins. From the start, the characters can find extra health points or ammunition on the map, and from time to time more of them drop from the sky. So, each character has two main objectives: survive and kill. To survive, a worm needs to keep a decent amount of HP and stay away from enemies; to kill, it needs to choose the best target to shoot and take as much health as possible from it. Meanwhile, the map is being destroyed by bombs and all the firepower used by the characters, which makes this a real challenge for artificial intelligence.

Adapting to unstable terrain

Let's decompose this example and create a character that could be used in this game. We'll start by looking at the map. At the bottom, there's water that automatically kills the worms. Then, we have the terrain, which the worms can walk on or destroy if needed. Finally, there's the absence of terrain: the empty space that cannot be walked on. Then we have the characters (worms), which are placed in random positions at the beginning of the game and can walk, jump, and shoot. The characters should be able to constantly adapt to the instability of the terrain, so we need to make that awareness part of the behavior tree, as sketched below.
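As a rough illustration of what the top level of that behavior tree could look like, here is a minimal Python sketch of a priority selector over the options discussed in the next section (attack, grab an item, take cover, or simply wait); the function and option names are hypothetical, not the book's Unity code.

def choose_action(can_attack, item_reachable, wall_nearby):
    # Try the options in priority order and fall back to waiting
    # when the terrain leaves no good move this turn.
    if can_attack:
        return "attack_closest_enemy"
    if item_reachable:
        return "move_to_item"
    if wall_nearby:
        return "take_cover_near_wall"
    return "wait_for_next_turn"

# Example: terrain blocks both attacking and item pickup, but a wall is close by
print(choose_action(can_attack=False, item_reachable=False, wall_nearby=True))
# take_cover_near_wall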
The character will need to understand the position where he is currently placed, as well as the opponents' positions, health, and available items. Because the terrain can block him, the AI character may end up in a situation where he cannot attack or reach an item. So, we give him options for what to do in those situations, and in the many others he might find himself in; most importantly, we define what happens if he cannot successfully accomplish any of them. Because the terrain can be reshaped into different forms during gameplay, there will be times when it is nearly impossible to do anything, and that is why we need to provide options for those situations.

For example, in a situation where the worm doesn't have enough free space to move, an item close enough to pick up, or an enemy that can be properly attacked, what should he do? It's necessary to make information about the surroundings available to our character so he can make a good judgment of the situation. In this scenario, we have defined our character either to shoot anyway, against the closest enemy, or to stay close to a wall. Because he is too close to the explosion that would result from attacking the closest enemy, he decides to stay in a corner and wait there until the next turn.

Using raycasts to evaluate decisions

Ideally, at the start of the turn, the character casts two rays, one to his left side and another to his right. This checks whether a wall is obstructing either direction, which can be used to determine which side the character should move toward if he wants to protect himself from being attacked. Then we use another raycast in the aim direction, to see if something is blocking the way when the character is preparing to shoot. If there is something in the middle, the character should calculate the distance to it to determine whether it is still safe to shoot.

Each character should also have a shared list of all the worms currently in the game; that way, they can compare the distances between them all, choose the closest enemy, and shoot it. Add the two raycasts to check whether something is blocking the sides, and we have the basic information needed to make the character adapt to the constant modifications of the terrain.
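Before the full character script, here is a minimal sketch, again in plain Python rather than the book's Unity C#, of that nearest-enemy selection, comparing the distance to every other worm in the shared list instead of a single entry; the worm_list structure and field names are hypothetical.

import math

def closest_enemy(me, worm_list, attack_range):
    """Return the nearest living enemy within attack_range, or None."""
    best, best_dist = None, float("inf")
    for worm in worm_list:
        if worm is me or worm["hp"] <= 0:
            continue  # skip ourselves and dead worms
        dist = math.hypot(worm["x"] - me["x"], worm["y"] - me["y"])
        if dist < best_dist:
            best, best_dist = worm, dist
    return best if best_dist <= attack_range else None

# Example with three worms; the attacker sits at the origin
worms = [
    {"id": 0, "x": 0, "y": 0, "hp": 100},
    {"id": 1, "x": 12, "y": 5, "hp": 80},
    {"id": 2, "x": 40, "y": 2, "hp": 60},
]
target = closest_enemy(worms[0], worms, attack_range=30)
print(target["id"] if target else "no reachable enemy")  # 1

The book's full excerpt below keeps the proximity check simpler, comparing against a single list entry, and combines it with the two side raycasts: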
public int HP;
public int Ammunition;
public static List<GameObject> wormList = new List<GameObject>(); // creates a list with all the worms
public static int wormCount; // amount of worms in the game
public int ID; // used to differentiate the worms

private float proximityValueX;
private float proximityValueY;
private float nearValue;
public float distanceValue; // how far the enemy should be
private bool canAttack;

void Awake ()
{
    wormList.Add(gameObject); // add this worm to the list
    wormCount++; // adds 1 to the amount of worms in the game
}

void Start ()
{
    HP = 100;
    distanceValue = 30f;
}

void Update ()
{
    proximityValueX = wormList[1].transform.position.x - this.transform.position.x;
    proximityValueY = wormList[1].transform.position.y - this.transform.position.y;
    nearValue = proximityValueX + proximityValueY;

    if(nearValue <= distanceValue)
    {
        canAttack = true;
    }
    else
    {
        canAttack = false;
    }

    // Cast a ray to each side to check whether a wall is blocking that direction
    Vector3 raycastRight = transform.TransformDirection(Vector3.forward);
    if (Physics.Raycast(transform.position, raycastRight, 10))
        print("There is something blocking the Right side!");

    Vector3 raycastLeft = transform.TransformDirection(-Vector3.forward);
    if (Physics.Raycast(transform.position, raycastLeft, 10))
        print("There is something blocking the Left side!");
}

In this post, we explored different ways for AI characters to interact with the environment. First, we learned how to break down the game environment by area. Then we looked at more advanced environment interactions with AI. To learn how to manipulate animation behavior with AI, read the book Practical Game AI Programming.

Read Next
Developing Games Using AI
Techniques and Practices of Game AI
Unite Berlin 2018 Keynote: Unity partners with Google, launches Ml-Agents ToolKit 0.4, Project MARS and more

Implementing cost-effective IoT analytics for predictive maintenance [Tutorial]

Prasad Ramesh
04 Sep 2018
10 min read
Predictive maintenance is a common value proposition cited for IoT analytics. In this tutorial, we will look at a value formula for net savings, and then walk through an example that highlights how to think financially about when it makes sense to implement a decision and when it does not. The economics of predictive maintenance are not entirely obvious. Believe it or not, it does not always make sense, even if you can predict early failures accurately; in many cases, you will actually lose money by doing it. Even when it can save you money, there is an optimal point for when it should be used, and that optimal point depends on the costs and the accuracy of the predictive model.

This article is an excerpt from a book written by Andrew Minteer titled Analytics for the Internet of Things (IoT).

The value formula

A formula to guide decision making compares the cost of allowing a failure to occur with the cost of proactively repairing the component, while considering the probability of predicting the failure:

Net Savings = (Cost of Failure * (Expected Number of Failures - Expected True Positive Predictions))
              - (Proactive Repair Cost * (Expected True Positives + Expected False Positives))

If the cost of failure is the same as the proactive repair cost, there will be no savings even with a perfect prediction model. Make sure to include intangible costs in the cost of failure; some examples are legal expenses, loss of brand equity, and even the customer's expenses.

Predictive repair does make sense when there is a large spread between the cost of failure and the cost of proactive replacement, combined with a well-performing prediction model. For example, if a failure means a locomotive engine replacement at $1 million USD and the proactive repair costs $200 USD, then the model's accuracy does not have to be all that great before a proactive replacement program makes financial sense. On the other hand, if the failure is a $400 USD automotive turbocharger replacement and the proactive repair is a $350 USD turbocharger actuator subcomponent replacement, the predictive model would need to be highly accurate for the program to make financial sense.

An example of making a value decision

To illustrate, we will walk through a business situation and then some R code that simulates a cost-benefit curve for that decision. The code uses a fitted predictive model to calculate the net savings (or lack thereof) and generate a cost curve. The cost curve can then be used in a business decision about what proportion of units with predicted failures should receive a proactive replacement.

Imagine you work for a company that builds diesel-powered generators. There is a coolant control valve that normally lasts for 4,000 hours of operation, at which point there is a planned replacement. From its analysis, your company has realized that the generators built two years ago are experiencing earlier than expected failure of the valve. When the valve fails, the engine overheats and several other components are damaged. The cost of failure, including labor rates for repair personnel and the cost to the customer of downtime, averages $1,000 USD. The cost of a proactive replacement of the valve is $253 USD.

Should you replace all coolant valves in the population? It depends on how high a failure rate is expected. In this case, about 10% of the current non-failed units are expected to fail before the scheduled replacement.
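To get a feel for the economics before running the full simulation, here is a quick back-of-the-envelope calculation in Python rather than the chapter's R. It compares the cost of running a proactive repair program at one operating point with the cost of doing nothing, which is also how the R simulation below scores each threshold. The $1,000 failure cost, $253 repair cost, and 10% failure rate come from the scenario above; the fleet size and the model's true/false positive counts are made-up numbers purely for illustration.

failure_cost = 1000.0   # average cost of letting the valve fail (USD)
repair_cost = 253.0     # cost of a proactive valve replacement (USD)
fleet_size = 10000      # hypothetical number of generators in the field
failure_rate = 0.10     # about 10% expected to fail before planned replacement

expected_failures = fleet_size * failure_rate   # 1,000 expected failures

# Hypothetical model performance at the chosen threshold: of the flagged units,
# 600 would truly have failed (true positives) and 900 would not (false positives).
true_positives = 600
false_positives = 900

# Program cost: failures the model misses are paid in full,
# plus every flagged unit gets a proactive repair.
program_cost = (failure_cost * (expected_failures - true_positives)
                + repair_cost * (true_positives + false_positives))

# Do-nothing cost: every expected failure is paid in full.
do_nothing_cost = failure_cost * expected_failures

savings = do_nothing_cost - program_cost
print(f"Program cost:    ${program_cost:,.0f}")    # $779,500
print(f"Do-nothing cost: ${do_nothing_cost:,.0f}") # $1,000,000
print(f"Net savings:     ${savings:,.0f}")         # $220,500

With these assumed numbers the program saves money, but shrinking the gap between failure cost and repair cost, or worsening the false positive count, quickly erases the savings.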
Also, importantly, it matters how well you can predict the failures. The following R code simulates this situation and uses a simple predictive model (logistic regression) to estimate a cost curve. The model has an AUC of close to 0.75; this will vary as you run the code, since the dataset is randomly simulated:

# Make sure all needed packages are installed
if(!require(caret)){
  install.packages("caret")
}
if(!require(pROC)){
  install.packages("pROC")
}
if(!require(dplyr)){
  install.packages("dplyr")
}
if(!require(data.table)){
  install.packages("data.table")
}

# Load required libraries
library(caret)
library(pROC)
library(dplyr)
library(data.table)

# Generate sample data
simdata = function(N=1000) {
  # Simulate 4 features
  X = data.frame(replicate(4, rnorm(N)))
  # Create a hidden data structure to learn
  hidden = X[,1]^2 + sin(X[,2]) + rnorm(N)*1
  # 10% TRUE, 90% FALSE
  rare.class.probability = 0.1
  # Simulate the true classification values
  y.class = factor(hidden < quantile(hidden, c(rare.class.probability)))
  return(data.frame(X, Class = y.class))
}

# Make some data
model_data = simdata(N=50000)

# Train a logistic regression model on the simulated data
training <- createDataPartition(model_data$Class, p = 0.6, list=FALSE)
trainData <- model_data[training,]
testData <- model_data[-training,]
glmModel <- glm(Class ~ . , data=trainData, family=binomial)
testData$predicted <- predict(glmModel, newdata=testData, type="response")

# Calculate AUC
roc.glmModel <- pROC::roc(testData$Class, testData$predicted)
auc.glmModel <- pROC::auc(roc.glmModel)
print(auc.glmModel)

# Pull together test data and predictions
simModel <- data.frame(trueClass = testData$Class, predictedClass = testData$predicted)

# Reorder rows and columns
simModel <- simModel[order(simModel$predictedClass, decreasing = TRUE), ]
simModel <- select(simModel, trueClass, predictedClass)
simModel$rank <- 1:nrow(simModel)

# Assign costs for failures and proactive repairs
proactive_repair_cost <- 253  # Cost of proactively repairing a part
failure_repair_cost <- 1000   # Cost of a failure of the part (include all costs, such as lost production, not just the repair cost)

# Define each predicted/actual combination
fp.cost <- proactive_repair_cost # The part was predicted to fail but did not (False Positive)
fn.cost <- failure_repair_cost   # The part was not predicted to fail and it did (False Negative)
tp.cost <- (proactive_repair_cost - failure_repair_cost) # The part was predicted to fail and it did (True Positive). This will be negative for a savings.
tn.cost <- 0.0                   # The part was not predicted to fail and it did not (True Negative)

# Incorporate probability of future failure
prob_failure <- 0.10 # About 10% of current non-failed units are expected to fail before the scheduled replacement
simModel$future_failure_prob <- prob_failure

# Function to assign costs for each instance
assignCost <- function(pred, outcome, tn.cost, fn.cost, fp.cost, tp.cost, prob){
  cost <- ifelse(pred == 0 & outcome == FALSE, tn.cost, # No cost since no action was taken and no failure
          ifelse(pred == 0 & outcome == TRUE, fn.cost,  # The cost of no action and a repair resulted
          ifelse(pred == 1 & outcome == FALSE, fp.cost, # The cost of proactive repair which was not needed
          ifelse(pred == 1 & outcome == TRUE, tp.cost, 999999999)))) # The cost of proactive repair which avoided a failure
  return(cost)
}

# Initialize list to hold final output
master <- vector(mode = "list", length = 100)
# Use the simulated model. In practice, this code can be adapted to compare multiple models
test_model <- simModel

# Loop through a dynamic threshold, starting at 1.0 (no proactive repairs) and ending at 0.0 (all proactive repairs)
threshold <- 1.00
for (i in 1:101) {
  # Add predicted class with percentile ranking
  test_model$prob_ntile <- ntile(test_model$predictedClass, 100) / 100

  # Dynamically determine if proactive repair would apply based on the incrementing threshold
  test_model$glm_failure <- ifelse(test_model$prob_ntile >= threshold, 1, 0)
  test_model$threshold <- threshold

  # Compare to actual outcome to assign costs
  test_model$glm_impact <- assignCost(test_model$glm_failure, test_model$trueClass, tn.cost, fn.cost, fp.cost, tp.cost, test_model$future_failure_prob)

  # Compute cost for not doing any proactive repairs
  test_model$nochange_impact <- ifelse(test_model$trueClass == TRUE, fn.cost, tn.cost) # *test_model$future_failure_prob)

  # Running sum to produce the overall impact
  test_model$glm_cumul_impact <- cumsum(test_model$glm_impact) / nrow(test_model)
  test_model$nochange_cumul_impact <- cumsum(test_model$nochange_impact) / nrow(test_model)

  # Count the number of classified failures
  test_model$glm_failure_ct <- cumsum(test_model$glm_failure)

  # Store one row per iteration for the final plot
  master[[i]] <- test_model[nrow(test_model),]

  # Reduce the threshold by 1% and repeat to calculate the new value
  threshold <- threshold - 0.01
}

finalOutput <- rbindlist(master)
finalOutput <- subset(finalOutput, select = c(threshold, glm_cumul_impact, glm_failure_ct, nochange_cumul_impact))

# Set baseline to the cost of not doing any proactive repairs
baseline <- finalOutput$nochange_cumul_impact

# Plot the cost curve
par(mfrow = c(2,1))
plot(row(finalOutput)[,1], finalOutput$glm_cumul_impact, type = "l", lwd = 3,
     main = paste("Net Costs: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost, sep = ""),
     ylim = c(min(finalOutput$glm_cumul_impact) - 100, max(finalOutput$glm_cumul_impact) + 100),
     xlab = "Percent of Population", ylab = "Net Cost ($) / Unit")

# Plot the cost difference between the proactive repair program and a 'do nothing' approach
plot(row(finalOutput)[,1], baseline - finalOutput$glm_cumul_impact, type = "l", lwd = 3, col = "black",
     main = paste("Savings: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost, sep = ""),
     ylim = c(min(baseline - finalOutput$glm_cumul_impact) - 100, max(baseline - finalOutput$glm_cumul_impact) + 100),
     xlab = "% of Population", ylab = "Savings ($) / Unit")
abline(h=0, col="gray")

As seen in the resulting net cost and savings curves, based on the model's predictions, the optimal savings would come from a proactive repair program covering the top 30 percent of units, ranked by predicted failure probability. The savings decrease after this point, although you would still save money when replacing up to 75% of the population. Beyond that, you should expect to spend more than you save. The following set of charts is the output from the preceding code:

Cost and savings curves for the proactive repair cost of $253 and failure cost of $1,000 scenario

Note the change in the following graph when the failure cost drops to $300 USD. At no point do you save money, as the proactive repair cost always outweighs the reduced failure cost. This does not mean you should never do a proactive repair; you may still want to do one in order to satisfy your customers.
Even in such a case, this cost-curve method can help in deciding how much you are willing to spend to address the problem. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 300 to generate the following charts:

Cost and savings curves for the proactive repair cost of $253 and failure cost of $300 scenario

Finally, notice how the savings curve changes when the failure cost moves to $5,000. The spread between the proactive repair cost and the failure cost determines much of when a proactive repair makes business sense. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 5000 to generate the following charts:

Cost and savings curves for the proactive repair cost of $253 and failure cost of $5,000 scenario

Ultimately, the decision is a business case based on the expected costs and benefits. Machine learning modeling can help optimize savings under the right conditions, and cost curves help determine the expected costs and savings of proactive replacements. In this tutorial, we looked at implementing economically cost-effective IoT analytics for predictive maintenance with an example. To further explore IoT analytics and the cloud, check out the book Analytics for the Internet of Things (IoT).

Read Next
AWS IoT Analytics: The easiest way to run analytics on IoT data, Amazon says
Build an IoT application with Azure IoT [Tutorial]
Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2