
How-To Tutorials - Data


Neural Network Architectures 101: Understanding Perceptrons

Kunal Chaudhari
12 Feb 2018
9 min read
Note: This article is an excerpt from the book Neural Network Programming with Java, Second Edition, written by Fabio M. Soares and Alan M. F. Souza. The book is for Java developers who want to master developing smarter applications, like weather forecasting and pattern recognition, using neural networks.

In this article we will discuss perceptrons, along with their features, applications, and limitations. Perceptrons are a very popular neural network architecture that implements supervised learning. Proposed by Frank Rosenblatt in 1957, a perceptron has just one layer of neurons, receiving a set of inputs and producing a set of outputs. This was one of the first representations of neural networks to gain attention, especially because of its simplicity. In our Java implementation, this is illustrated with one neural layer (the output layer). The following code creates a perceptron with three inputs and two outputs, with a linear function at the output layer:

int numberOfInputs = 3;
int numberOfOutputs = 2;
Linear outputAcFnc = new Linear(1.0);
NeuralNet perceptron = new NeuralNet(numberOfInputs, numberOfOutputs, outputAcFnc);

Applications and limitations

However, it did not take scientists long to conclude that, because of that very simplicity, a perceptron network could only be applied to simple tasks. At the time, neural networks were being used for simple classification problems, but perceptrons usually failed when faced with more complex datasets. Let's illustrate this with a very basic example (an AND function) to understand the issue better.

Linear separation

The example consists of an AND function that takes two inputs, x1 and x2, and can be plotted in a two-dimensional chart. Now let's examine how the network evolves during training with the perceptron rule, considering a pair of weights, w1 and w2, both initially 0.5, and a bias of 0.5 as well. Assume the learning rate η equals 0.2:

Epoch  x1  x2  w1     w2     b       y       t  E       Δw1     Δw2     Δb
1      0   0   0.5    0.5    0.5     0.5     0  -0.5    0       0       -0.1
1      0   1   0.5    0.5    0.4     0.9     0  -0.9    0       -0.18   -0.18
1      1   0   0.5    0.32   0.22    0.72    0  -0.72   -0.144  0       -0.144
1      1   1   0.356  0.32   0.076   0.752   1  0.248   0.0496  0.0496  0.0496
2      0   0   0.406  0.370  0.126   0.126   0  -0.126  0.000   0.000   -0.025
2      0   1   0.406  0.370  0.100   0.470   0  -0.470  0.000   -0.094  -0.094
2      1   0   0.406  0.276  0.006   0.412   0  -0.412  -0.082  0.000   -0.082
2      1   1   0.323  0.276  -0.076  0.523   1  0.477   0.095   0.095   0.095
…
89     0   0   0.625  0.562  -0.312  -0.312  0  0.312   0       0       0.062
89     0   1   0.625  0.562  -0.25   0.313   0  -0.313  0       -0.063  -0.063
89     1   0   0.625  0.500  -0.312  0.313   0  -0.313  -0.063  0       -0.063
89     1   1   0.562  0.500  -0.375  0.687   1  0.313   0.063   0.063   0.063

After 89 epochs, the network produces values near the desired outputs. Since in this example the outputs are binary (zero or one), we can assume that any value produced by the network below 0.5 is considered to be 0, and any value above 0.5 is considered to be 1. So, with the final weights and bias found by the learning algorithm (w1 = 0.562, w2 = 0.5, and b = -0.375), we can draw the linear boundary they define in the chart. This boundary is a definition of all classifications given by the network. You can see that the boundary is linear, given that the function is also linear. Thus, the perceptron network is really suitable for problems whose patterns are linearly separable.
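The table above can be reproduced in a few lines of code. What follows is an illustrative Python sketch (not the book's Java NeuralNet API) of the same per-pattern updates, Δwi = η·(t − y)·xi and Δb = η·(t − y), with a linear output:

# Perceptron (delta) rule on the AND function, as in the table above.
# Illustrative Python sketch; not the book's Java NeuralNet API.
eta = 0.2                                             # learning rate
w1, w2, b = 0.5, 0.5, 0.5                             # initial weights and bias
data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]   # (x1, x2, target) for AND

for epoch in range(89):
    for x1, x2, t in data:
        y = w1 * x1 + w2 * x2 + b                     # linear activation
        e = t - y                                     # error
        w1 += eta * e * x1                            # delta-rule updates
        w2 += eta * e * x2
        b += eta * e

print(w1, w2, b)   # prints values close to those in the epoch-89 rows

The per-pattern updates never settle exactly; they fall into the small cycle visible in the epoch-89 rows, oscillating around the least-squares solution.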
The XOR case

Now let's analyze the XOR case. We see that in two dimensions, it is impossible to draw a line to separate the two patterns. What would happen if we tried to train a single-layer perceptron to learn this function? Let's see what happens in the following table:

Epoch  x1  x2  w1      w2      b      y      t  E       Δw1     Δw2     Δb
1      0   0   0.5     0.5     0.5    0.5    0  -0.5    0       0       -0.1
1      0   1   0.5     0.5     0.4    0.9    1  0.1     0       0.02    0.02
1      1   0   0.5     0.52    0.42   0.92   1  0.08    0.016   0       0.016
1      1   1   0.516   0.52    0.436  1.472  0  -1.472  -0.294  -0.294  -0.294
2      0   0   0.222   0.226   0.142  0.142  0  -0.142  0.000   0.000   -0.028
2      0   1   0.222   0.226   0.113  0.339  1  0.661   0.000   0.132   0.132
2      1   0   0.222   0.358   0.246  0.467  1  0.533   0.107   0.000   0.107
2      1   1   0.328   0.358   0.352  1.038  0  -1.038  -0.208  -0.208  -0.208
…
127    0   0   -0.250  -0.125  0.625  0.625  0  -0.625  0.000   0.000   -0.125
127    0   1   -0.250  -0.125  0.500  0.375  1  0.625   0.000   0.125   0.125
127    1   0   -0.250  0.000   0.625  0.375  1  0.625   0.125   0.000   0.125
127    1   1   -0.125  0.000   0.750  0.625  0  -0.625  -0.125  -0.125  -0.125

The perceptron simply could not find any pair of weights that would drive the error below 0.625. This can be explained mathematically: as we already perceived from the chart, this function is not linearly separable in two dimensions. So what if we add another dimension? Looking at the chart in three dimensions, it is possible to draw a plane that separates the patterns, provided that this additional dimension properly transforms the input data. Okay, but now there is an additional problem: how could we derive this additional dimension, since we have only two input variables? One obvious, if somewhat makeshift, answer would be to add a third variable derived from the two original ones.
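The article leaves the derivation unspecified; one common choice (assumed here purely for illustration) is the product of the two inputs, which makes XOR expressible by a purely linear function of the three variables:

# Lifting XOR into three dimensions makes it linearly separable.
# The derived variable x3 = x1 * x2 is an assumption for illustration.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    x3 = x1 * x2                # derived third input
    y = x1 + x2 - 2 * x3        # linear in (x1, x2, x3)
    print((x1, x2), '->', y)    # prints 0, 1, 1, 0: exactly XOR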
With this third (derived) variable, our neural network would take the following shape: the perceptron now has three inputs, one of them being a composition of the other two. This leads to a new question: how should that composition be processed? We can see that this component could act as a neuron, giving the neural network a nested architecture. If so, there would be another new question: how would the weights of this new neuron be trained, since the error is measured at the output neuron?

Multi-layer perceptrons

As we can see, one simple example in which the patterns are not linearly separable has led us to more and more issues with the perceptron architecture. That need led to the development of multi-layer perceptrons. It is already established that natural neural networks are structured in layers as well, and that each layer captures pieces of information from a specific environment. In artificial neural networks, layers of neurons act in the same way, extracting and abstracting information from data and transforming it into another dimension or shape. In the XOR example, we found the solution to be the addition of a third component that would make a linear separation possible, but a few questions remained regarding how that third component would be computed. Now let's consider the same solution as a two-layer perceptron. We now have three neurons instead of just one, and at the output, the information transferred by the previous layer is transformed into another dimension or shape in which it is theoretically possible to establish a linear boundary on those data points. However, the question of finding the weights for the first layer remains unanswered. Or can we apply the same training rule to neurons other than the output? We are going to deal with this issue in the Generalized delta rule section.

MLP properties

Multi-layer perceptrons can have any number of layers and any number of neurons in each layer. The activation functions may differ from layer to layer. An MLP network is usually composed of at least two layers: one output layer and one hidden layer. Some references also consider the input layer to be the nodes that collect input data; for those cases, the MLP is considered to have at least three layers. For the purposes of this article, let's consider the input layer as a special type of layer that has no weights, and as the effective layers, that is, those that can be trained, we'll consider the hidden and output layers. A hidden layer is called that because it actually hides its outputs from the external world. Hidden layers can be connected in series in any number, thus forming a deep neural network. However, the more layers a neural network has, the slower both training and running will be, and according to mathematical foundations, a neural network with at most one or two hidden layers may learn as well as a deep neural network with dozens of hidden layers. But it depends on several factors.

MLP weights

In an MLP feedforward network, a particular neuron i receives data from a neuron j of the previous layer and forwards its output to a neuron k of the next layer. The mathematical description of a neural network is recursive:

$$ y_o = f_o\left( \sum_{i=1}^{n_{h_l}} w_i \, f_i\left( \sum_{j} w_{ij}\, y_j + b_i \right) \right) $$

Here, $y_o$ is the network output (should we have multiple outputs, we can replace $y_o$ with $Y$, representing a vector); $f_o$ is the activation function of the output; $l$ is the number of hidden layers; $n_{h_i}$ is the number of neurons in hidden layer $i$; $w_i$ is the weight connecting the ith neuron of the last hidden layer to the output; $f_i$ is the activation function of neuron $i$; and $b_i$ is the bias of neuron $i$. Each $y_j$ expands recursively into a sum of the same form, so the equation gets larger as the number of layers increases, and in the last (innermost) summing operation the inputs $x_i$ appear.

Recurrent MLP

The neurons of an MLP may feed signals not only to neurons in the next layer (a feedforward network), but also to neurons in the same or previous layers (a feedback, or recurrent, network). This behavior allows the neural network to maintain state over a data sequence, a feature that is especially exploited when dealing with time series or handwriting recognition. Recurrent networks are usually harder to train, and the computer may eventually run out of memory while executing them. In addition, there are recurrent architectures that work better than recurrent MLPs, such as Elman, Hopfield, echo state, and bidirectional RNNs (recurrent neural networks). But we are not going to dive deep into these architectures here.

Coding an MLP

Bringing these concepts into the OOP point of view, we can review the classes already designed so far. The neural network structure is hierarchical: a neural network is composed of layers, which are composed of neurons. In the MLP architecture, there are three types of layers: input, hidden, and output. So suppose that, in Java, we would like to define a neural network consisting of three inputs, one output (linear activation function), and one hidden layer (sigmoid function) containing five neurons.
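To make the recursive description concrete before the book's Java version below, here is a minimal Python sketch of the forward pass such a network computes (illustrative only, not the book's API; the weight values are arbitrary):

import math

def forward(x, layers):
    # layers: list of (weights, biases, activation) per effective layer;
    # weights[i][j] connects input j of the layer to its neuron i
    y = x
    for weights, biases, activation in layers:
        y = [activation(sum(w * v for w, v in zip(row, y)) + b)
             for row, b in zip(weights, biases)]
    return y

sigmoid = lambda s: 1.0 / (1.0 + math.exp(-s))
linear = lambda s: s

# three inputs -> five sigmoid hidden neurons -> one linear output
hidden = ([[0.1, -0.2, 0.3]] * 5, [0.0] * 5, sigmoid)
output = ([[0.2, 0.2, 0.2, 0.2, 0.2]], [0.0], linear)
print(forward([1.0, 2.0, 3.0], [hidden, output]))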
In the book's Java library, the resulting code would be as follows:

int numberOfInputs = 3;
int numberOfOutputs = 1;
int[] numberOfHiddenNeurons = {5};
Linear outputAcFnc = new Linear(1.0);
Sigmoid hiddenAcFnc = new Sigmoid(1.0);
NeuralNet neuralnet = new NeuralNet(numberOfInputs, numberOfOutputs,
        numberOfHiddenNeurons, hiddenAcFnc, outputAcFnc);

To summarize, we saw how perceptrons can be applied to solve linear separation problems, their limitations in classifying nonlinear data, and how to overcome those limitations with multi-layer perceptrons (MLPs). If you enjoyed this excerpt, check out the book Neural Network Programming with Java, Second Edition for a better understanding of neural networks and how they fit into different real-world projects.


Brad Miro talks TensorFlow 2.0 features and how Google is using it internally

Sugandha Lahoti
10 Dec 2019
6 min read
TensorFlow 2.0, released in October, has developers excited about a myriad of features and its ease of use. At the EuroPython Conference 2019, Brad Miro, developer programs engineer at Google, talked about the updates being made in TensorFlow 2.0. He also gave an overview of how Google is using TensorFlow, moving on to why Python is important for TensorFlow development and how to migrate from TF 1.x to TF 2.0. EuroPython is one of the most popular Python programming language community conferences. Below are some highlights from Brad's talk at EuroPython.

What is TensorFlow?

TensorFlow is an open-source deep learning library developed at Google and first released in 2015. It's a Python framework that includes a number of utilities to help you write deep neural networks, supporting both GPUs and TPUs. A lot of deep learning involves mathematics, statistics, and algebra, and performing low-level optimizations on your system. TensorFlow abstracts much of that away, leaving you to focus on actually writing your model.

How TensorFlow is used internally at Google

TensorFlow is used internally at Google to power all of its machine learning and AI. Google's data centers are powered using AI and TensorFlow to help optimize their usage: to reduce bandwidth, to ensure network connections are optimized, and to reduce power consumption. TensorFlow is also useful for performing global localization in Google Maps, and it is used heavily in the Google Pixel range of smartphones to help optimize the software. These technologies are also used in medical research, specifically in the field of computer vision. For example, TensorFlow is used to distinguish the retinal image of a healthy eye from the retinal image of an eye that has diabetic retinopathy.

Further learning: If you want to learn to build more computer vision applications with TensorFlow 2.0, check out the book Hands-On Computer Vision with TensorFlow 2 by Benjamin Planche and Eliot Andres. This book by Packt Publishing is a practical guide to building high-performance systems for object detection, segmentation, video processing, smartphone applications, and more. By the end of the book, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.0.

Furthermore, Google is using AI and TensorFlow to predict whether or not objects in space are planets: they use AI to predict whether fluctuations in the brightness of an object are due to it being a planet.

Why Python is so important for TensorFlow

Python has always been the choice for TensorFlow, because the language is extremely easy to use and has a rich ecosystem for data science, including tools such as NumPy, scikit-learn, and pandas. When TensorFlow was being built, the idea was that it should have the simplicity of NumPy, the performance of C, and the ease of use of Python.

What does TensorFlow 2.0 bring to the table?

TensorFlow 2.0 is powerful, flexible, scalable, and easily deployable.

What's gone:
- Session.run
- tf.control_dependencies
- tf.global_variables_initializer
- tf.cond, tf.while_loop
- tf.contrib

What's new:
- Eager execution enabled by default
- tf.function
- Keras as the main high-level API
- Distribution Strategy API
- SavedModel API

TensorFlow 2.0 had a major API cleanup: many API symbols were removed or renamed for better consistency and clarity. Session.run has been replaced with eager execution, which effectively means that your TensorFlow code runs like NumPy code.
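As a minimal sketch of that point (any TensorFlow 2.x install should behave this way; the tensor values are arbitrary):

import tensorflow as tf

# Operations execute immediately and return concrete values,
# much like NumPy -- no graph building or Session.run required.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
print(tf.matmul(a, b))   # the actual result, not a symbolic node
print(a.numpy())         # tensors convert directly to NumPy arrays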
Eager execution enables fast iteration and intuitive debugging without building a graph. It also makes creating and experimenting with models using TensorFlow easier, and it can be especially useful when using the tf.keras model subclassing API. TensorFlow 2.0 has tf.function, a Python decorator that lets you run regular Python code that is later compiled down to a TensorFlow graph using AutoGraph. The Distribution Strategy API in TensorFlow 2.0 allows machine learning researchers to distribute training across a wide variety of compute configurations; this release also allows distributed training with Keras' model.fit and with custom training loops.

Keras is introduced as the main high-level API. Keras is a popular high-level API used for easy and fast prototyping, building, and training of deep learning models, and this will enable developers to easily leverage its various model-building APIs. Using Keras with TensorFlow has two main methods:

Symbolic (Keras Sequential):
- Your model is a graph of layers
- Any graph you compile will run
- TensorFlow helps you debug by catching errors at compile time

Imperative (Keras subclassing):
- Your model is Python bytecode
- Complete flexibility and control
- Harder to debug, harder to maintain

There are pros and cons to each method; it really just depends on what your specific use cases are. The SavedModel API allows you to save your trained ML model in a language-neutral format. With TensorFlow 2.0, all TensorFlow ecosystem projects, including TensorFlow Lite, TensorFlow.js, TensorFlow Serving, and TensorFlow Hub, support SavedModels. On TensorFlow Hub, you can store and download pre-built models. You can use TensorFlow Extended, a Python library that can be run on your servers to productionize your models. TensorFlow Lite lets you run your TensorFlow models on edge devices. With TensorFlow.js, you can run machine learning models using JavaScript in the browser, or run them on servers using Node.js. TensorFlow also has Swift for TensorFlow to help developers use Swift to develop machine learning models: "Swift for TensorFlow provides a new programming model that combines the performance of graphs with the flexibility and expressivity of Eager execution, with a strong focus on improved usability at every level of the stack. This is not just a TensorFlow API wrapper written in Swift — we added compiler and language enhancements to Swift to provide a first-class user experience for machine learning developers." Other packages in the TensorFlow ecosystem for niche use cases include TF Probability, TF Agents (reinforcement learning), Tensor2Tensor, TF Ranking, TF Text (natural language processing), TF Federated, TF Privacy, and more.

How to upgrade from TensorFlow 1.x to TensorFlow 2.0

There are several migration guides available on TensorFlow's website. You can also use the tf.compat.v1 library for backwards compatibility, and the tf_upgrade_v2 script, which you can execute on top of any Python script to convert TF 1.x code to 2.0 code. You can also read more about TF 2.0 migration in our book Hands-On Computer Vision with TensorFlow 2, which introduces the automatic migration tool and compares TensorFlow 1 concepts with their TensorFlow 2 counterparts, with a detailed guide on migrating to idiomatic TensorFlow 2 code.
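Pulling a few of these pieces together, here is a minimal TF 2.0-style sketch of the features described above; the layer sizes, names, and save path are illustrative assumptions, not anything from the talk:

import tensorflow as tf

# tf.function: regular Python compiled to a TensorFlow graph via AutoGraph.
@tf.function
def mean_squared_error(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

# Symbolic style: the model is a graph of Keras layers.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# SavedModel: a language-neutral export, consumable by TF Serving,
# TensorFlow Lite, TensorFlow.js, and TensorFlow Hub.
tf.saved_model.save(model, "exported_model")  # illustrative path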
You can watch Brad's full talk on YouTube. The video is licensed under the CC BY-NC-SA 3.0 license.

Read more:
- TensorFlow.js contributor Kai Sasaki on how TensorFlow.js eases web-based machine learning application development
- Introducing Spleeter, a TensorFlow-based Python library that extracts voice and sound from any music track
- TensorFlow 2.0 released with tighter Keras integration, eager execution enabled by default, and more!


18 striking AI Trends to watch in 2018 - Part 1

Sugandha Lahoti
27 Dec 2017
14 min read
Artificial Intelligence is the talk of the town. It evolved past being merely a buzzword in 2016 to being used in a more practical manner in 2017. As 2018 rolls out, we will gradually notice AI transitioning into a necessity. We have prepared a detailed report on what we can expect from AI in the upcoming year. So sit back, relax, and enjoy the ride through the future. (Don't forget to wear your VR headgear!) Here are 18 things that will happen in 2018 that are either AI driven or driving AI:

1. Artificial General Intelligence may gain major traction in research.
2. We will turn to AI-enabled solutions to solve mission-critical problems.
3. Machine learning adoption in business will see rapid growth.
4. Safety, ethics, and transparency will become an integral part of AI application design conversations.
5. Mainstream adoption of AI on mobile devices.
6. Major research on data-efficient learning methods.
7. AI personal assistants will continue to get smarter.
8. The race to conquer the AI-optimized hardware market will heat up further.
9. We will see closer AI integration into our everyday lives.
10. The cryptocurrency hype will normalize and pave the way for AI-powered blockchain applications.
11. Advancements in AI and quantum computing will share a symbiotic relationship.
12. Deep learning will continue to play a significant role in AI development progress.
13. AI will be on both sides of the cybersecurity challenge.
14. Augmented reality content will be brought to smartphones.
15. Reinforcement learning will be applied to a large number of real-world situations.
16. Robotics development will be powered by deep reinforcement learning and meta-learning.
17. A rise in immersive media experiences enabled by AI.
18. A large number of organizations will use digital twins.

1. General AI: AGI may start gaining traction in research

AlphaZero is only the beginning. 2017 saw Google's AlphaGo Zero (and later AlphaZero) beat human players at Go, chess, and other games. In addition to this, computers are now able to recognize images, understand speech, drive cars, and diagnose diseases better with time. AGI is an advancement of AI that deals with bringing machine intelligence as close to humans as possible, so that machines can potentially do any intellectual task that a human can. The success of AlphaGo covered one of the crucial aspects of AGI systems: the ability to learn continually, avoiding catastrophic forgetting. However, there is a lot more to achieving human-level general intelligence than the ability to learn continually. For instance, today's AI systems can draw on skills learned in one game to play another, but they lack the ability to generalize the learned skill. Unlike humans, these systems do not seek solutions from previous experiences; an AI system cannot ponder and reflect on a new task, analyze its capabilities, and work out how best to apply them. In 2018, we expect to see advanced research in the areas of deep reinforcement learning, meta-learning, transfer learning, evolutionary algorithms, and other areas that aid in developing AGI systems. Detailed aspects of these ideas are highlighted in later points. We can indeed say Artificial General Intelligence is inching closer than ever before, and 2018 is expected to cover major ground in that direction.

2. Enterprise AI: Machine learning adoption in enterprises will see rapid growth
2017 saw a rise in cloud offerings from major tech players, such as Amazon SageMaker, Microsoft Azure, and Google Cloud Platform, allowing business professionals and innovators to transfer labor-intensive research and analysis to the cloud. The cloud is a $130 billion industry as of now, and it is projected to grow. Statista carried out a survey on the level of AI adoption among businesses worldwide as of 2017: almost 80% of the participants had incorporated some form of AI into their organizations or planned to do so in the future. (Source: https://www.statista.com/statistics/747790/worldwide-level-of-ai-adoption-business/) According to a report from Deloitte, medium and large enterprises are set to double their usage of machine learning by the end of 2018. Apart from this, 2018 will see better data visualization techniques, powered by machine learning, which is a critical aspect of every business. Artificial intelligence is going to automate the cycle of report generation and KPI analysis, and also bring in deeper analysis of consumer behavior. Also, with abundant big data sources coming into the picture, BI tools powered by AI will emerge that can harness the raw computing power of voluminous big data, making data models streamlined and efficient.

3. Transformative AI: We will turn to AI-enabled solutions to solve mission-critical problems

2018 will see the involvement of AI in more and more mission-critical problems that can have world-changing consequences: enabling genetic engineering, solving the energy crisis, space exploration, slowing climate change, smart cities, reducing starvation through precision farming, elder care, and so on. Recently, NASA revealed the discovery of a new exoplanet using data crunched with machine learning and AI; with this recent reveal, more AI techniques will be used for space exploration and to find other exoplanets. We will also see the real-world deployment of AI applications, so it will not only be about academic research but also about industry readiness. 2018 could very well be the year when AI becomes real for medicine. According to Mark Michalski, executive director of the Massachusetts General Hospital and Brigham and Women's Center for Clinical Data Science, "By the end of next year, a large number of leading healthcare systems are predicted to have adopted some form of AI within their diagnostic groups." We will also see the rise of robot assistants, such as virtual nurses, diagnostic apps in smartphones, and real clinical robots that can monitor patients, take care of the elderly, alert doctors, and send notifications in case of emergency. More research will be done on how AI-enabled technology can help in difficult-to-diagnose areas of health care, like mental health and the onset of hereditary diseases, among others. Facebook's attempt at detecting potentially suicidal messages using AI is a sign of things to come in this direction. As we explore AI-enabled solutions to problems that have a serious impact on individuals and societies at large, considering the ethical and moral implications of such solutions will become central to developing them, let alone hard to ignore.

4. Safe AI: Safety, ethics, and transparency in AI applications will become integral to conversations on AI adoption and app design

The rise of machine learning capabilities has also given rise to forms of bias, stereotyping, and unfair determination in such systems.
2017 saw some high-profile news stories, from gender bias in object recognition datasets like MS COCO to racial disparities in education AI systems. At NIPS 2017, Kate Crawford talked about bias in machine learning systems, which resonated greatly with the community and became pivotal to starting conversations and thinking by other influencers on how to address the problems raised. DeepMind also launched a new unit, DeepMind Ethics & Society, to help technologists put ethics into practice, and to help society anticipate and direct the impact of AI for the benefit of all. Independent bodies like the IEEE also pushed for standards in its Ethically Aligned Design paper. As news about the bro culture in Silicon Valley and the lack of diversity in the tech sector stayed in the news all of 2017, it hit closer home as the year came to an end, when Kristian Lum, lead statistician at HRDAG, described her experiences with harassment as a graduate student at prominent statistics conferences. This has had a butterfly effect of sorts, with many more women coming forward to raise the issue in the ML/AI community. They talked about the formation of a stronger code of conduct by the boards of key conferences such as NIPS, among others. Eric Horvitz, a Microsoft research director, called Lum's post a "powerful and important report." Jeff Dean, head of Google's Brain AI unit, applauded Lum for having the courage to speak about this behavior. Other key influencers from the ML and statistics community also spoke in support of Lum and added their views on how to tackle the problem. While the road to recovery is long, and machines with moral intelligence may be decades away, 2018 is expected to start that journey in the right direction by including safety, ethics, and transparency in AI/ML systems. Instead of just thinking about ML contributing to decision making in, say, hiring or criminal justice, data scientists will begin to think about the potential role of ML in the harmful representation of human identity. These policies will be included not only in the development of larger AI ecosystems but also in national and international debates in politics, business, and education.

5. Ubiquitous AI: AI will start redefining life as we know it, and we may not even know it happened

Artificial intelligence will gradually integrate into our everyday lives. We will see it in everyday decisions like what kind of food we eat, the entertainment we consume, and the clothes we wear. Artificially intelligent systems will get better at complex tasks that humans still take for granted, like walking around a room and over objects. We are going to see more and more products that contain some form of AI enter our lives, and AI-enabled products will become more common and available. We will also start seeing it in the background for the life-altering decisions we make, such as what to learn, where to work, whom to love, who our friends are, whom we should vote for, where we should invest, and where we should live, among other things.

6. Embedded AI: Mobile AI means a radically different way of interacting with the world

There is no denying that AI is the power source behind the next generation of smartphones. A large number of organizations are enabling the use of AI in smartphones, whether in the form of deep learning chips or inbuilt software with AI capabilities. Mobile AI will be a combination of on-device AI and cloud AI.
Intelligent phones will have end-to-end capabilities that support the coordinated development of chips, devices, and the cloud. The release of the iPhone X's Face ID, which uses a neural network chip to construct a mathematical model of the user's face, and self-driving cars are only the beginning. As 2018 rolls out, we will see vast applications on smartphones and other mobile devices that run deep neural networks to enable AI. AI going mobile is not just limited to the embedding of neural chips in smartphones. The next generation of mobile networks, 5G, will soon greet the world. 2018 is going to be a year of closer collaboration and increasing partnerships between telecom service providers, handset makers, chip makers, and AI tech enablers/researchers. The Baidu-Huawei partnership, to build an open AI mobile ecosystem consisting of devices, technology, internet services, and content, is one example of many steps in this direction. We will also see edge computing rapidly becoming a key part of the Industrial Internet of Things (IIoT) to accelerate digital transformation. In combination with cloud computing, other forms of network architecture, such as fog and mist, will also gain major traction. All of the above will lead to a large-scale implementation of cognitive IoT, which combines traditional IoT implementations with cognitive computing: it will make sensors capable of diagnosing and adapting to their environment without the need for human intervention, and bring in the ability to combine multiple data streams to identify patterns. This means we will be a lot closer to seeing smart cities in action.

7. Data-sparse AI: Research into data-efficient learning methods will intensify

2017 saw highly scalable solutions for problems in object detection and recognition, machine translation, text-to-speech, recommender systems, and information retrieval. The second Conference on Machine Translation happened in September 2017, and the 11th ACM Conference on Recommender Systems in August 2017 witnessed a series of paper presentations, featured keynotes, invited talks, tutorials, and workshops in the field of recommender systems. Google launched Tacotron 2 for generating human-like speech from text. However, most of these research systems attain state-of-the-art performance only when trained with large amounts of data. With GDPR and other data regulatory frameworks coming into play, 2018 is expected to witness machine learning systems that can learn efficiently, maintaining performance but with less time and less data. A data-efficient learning system allows learning in complex domains without requiring large quantities of data. For this, there will be developments in the field of semi-supervised learning techniques, where we can use generative models to better guide the training of discriminative models. More research will happen in the areas of transfer learning (reusing generalized knowledge across domains), active learning, one-shot learning, and Bayesian optimization, as well as other non-parametric methods. In addition, researchers and organizations will exploit bootstrapping and data augmentation techniques for efficient reuse of available data. Other key trends propelling data-efficient learning research are growing on-device/edge computing, advancements in robotics, AGI research, and energy optimization of data centers, among others.

8. Conversational AI: AI personal assistants will continue to get smarter

AI-powered virtual assistants are expected to skyrocket in 2018.
2017 was filled to the brim with new releases. Amazon brought out the Echo Look and the Echo Show. Google made its personal assistant more personal by allowing the linking of six accounts to the Google Assistant built into the Home via the Home app. Bank of America unveiled Erica, its AI-enabled digital assistant. As 2018 rolls out, AI personal assistants will find their way into an increasing number of homes and consumer gadgets, including increased availability of AI assistants in our smartphones and smart speakers with built-in support for platforms such as Amazon's Alexa and Google Assistant. With the beginning of the new year, we can see personal assistants integrating into our daily routines. Developers will build voice support into a host of appliances and gadgets by using various voice assistant platforms. More importantly, developers in 2018 will try their hands at conversational technology that includes emotional sensitivity (affective computing) as well as machine translation technology (the ability to communicate seamlessly between languages). Personal assistants will be able to recognize speech patterns, for instance those indicative of wanting help. AI bots may also be utilized for psychiatric counseling or for providing support to isolated people. And it's all set to begin with the AI Assistant Summit in San Francisco, scheduled for 25-26 January 2018, which will feature talks by the world's leading innovators in AI assistants and artificial intelligence.

9. AI Hardware: The race to conquer the AI-optimized hardware market will heat up further

Top tech companies (read: Google, IBM, Intel, Nvidia) are investing heavily in the development of AI/ML-optimized hardware. Research and Markets has predicted the global AI chip market will have a growth rate of about 54% between 2017 and 2021. 2018 will see further hardware designs intended to greatly accelerate the next generation of applications and run AI computational jobs. As 2018 begins, chip makers will battle it out to determine who creates the hardware that artificial intelligence lives on. Not only that, there will be a rise in the development of new AI products, both hardware and software platforms, that run deep learning programs and algorithms. Chips that move away from the traditional one-size-fits-all approach toward application-based AI hardware will also grow in popularity. 2018 will see hardware that does not only store data but also transforms it into usable information; the trend for AI will head in the direction of task-optimized hardware. 2018 may also see hardware organizations move into software domains and vice versa. Nvidia, most famous for its Volta GPUs, has come up with the NVIDIA DGX-1, an integrated hardware and software system for AI research designed to streamline the deep learning workflow. More such transitions are expected at the highly anticipated CES 2018. Phew, that was a lot of writing! But I hope you found it just as interesting to read as I found writing it. However, we are not done yet. Here is part 2 of our 18 AI trends in '18.


Twitter Sentiment Analysis

Packt
19 Feb 2016
30 min read
In this article, we will cover:

- Twitter and its importance
- Getting hands-on with Twitter's data and using various Twitter APIs
- Use of data to solve business problems: a comparison of various businesses based on tweets

Twitter and its importance

Twitter can be considered an extension of the short message service, or SMS, but on an Internet-based platform. In the words of Jack Dorsey, co-founder and co-creator of Twitter: "...We came across the word 'twitter', and it was just perfect. The definition was 'a short burst of inconsequential information,' and 'chirps from birds'. And that's exactly what the product was." Twitter acts as a utility where one can send their SMSs to the whole world. It enables people to instantaneously get heard and get a response. Since the audience of this SMS is so large, responses often come very quickly. So, Twitter facilitates the basic social instincts of humans. By sharing on Twitter, a user can easily express his/her opinion on just about everything, at any time. Friends who are connected (or, in the case of Twitter, followers) immediately get the information about what's going on in someone's life. This in turn serves another human emotion: the innate need to know what is going on in someone's life. Apart from being real time, Twitter's UI is really easy to work with; it's naturally and instinctively understood, that is, the UI is very intuitive in nature. Each tweet on Twitter is a short message with a maximum of 140 characters. Twitter is an excellent example of a microblogging service. As of July 2014, the Twitter user base had reached above 500 million, with more than 271 million active users. Around 23 percent of adult Internet users are on Twitter, which is also about 19 percent of the entire adult population. If we can properly mine what users are tweeting about, Twitter can act as a great tool for advertisement and marketing. But this is not the only information Twitter provides. Because of its non-symmetric nature in terms of followers and followings, Twitter assists better in understanding user interests than in measuring impact on the social network. An interest graph can be thought of as a method to learn the links between individuals and their diverse interests. Computing the degree of association or correlation between individuals' interests and potential advertisements is one of the most important applications of interest graphs. Based on these correlations, a user can be targeted so as to attain the maximum response to an advertisement campaign, along with follower recommendations. One interesting fact about Twitter (and Facebook) is that the user does not need to be a real person. A user on Twitter (or on Facebook) can be anything and anyone: an organization, a campaign itself, or a famous but imaginary personality (a fictional character recognizable in the media), apart from a real/actual person. If a real person follows these users on Twitter, a lot can be inferred about their personality, and hence they can be recommended ads or other followers based on such information. For example, @fakingnews is an Indian blog that publishes news satires ranging from Indian politics to typical Indian mindsets. People who follow @fakingnews are the ones who, in general, like to read satirical news. Hence, these people can be thought of as belonging to the same cluster, or community.
If we have another satirical blog, we can always recommend it to this community and improve the advertisement return on investment. The chances of getting more hits via people belonging to this community will be higher than via a community that doesn't follow @fakingnews, or any such news in general. Once you have comprehended that Twitter allows you to create, link, and investigate a community of interest for a random topic, the influence of Twitter, and the knowledge one can gain from mining it, becomes clearer.

Understanding Twitter's API

Twitter APIs provide a means to access Twitter data, that is, the tweets sent by its millions of users. Let's get to know these APIs a bit better.

Twitter vocabulary

As described earlier, Twitter is a microblogging service with a social aspect attached. It allows its users to express their views/sentiments by means of an Internet SMS, called a tweet in the context of Twitter. Tweets are entities formed of a maximum of 140 characters, and their content can be anything, ranging from a person's mood to a person's location to a person's curiosity. The platform on which these tweets are posted is called the timeline. To use Twitter's APIs, one must understand the basic terminology. Tweets are the crux of Twitter. Theoretically, a tweet is just 140 characters of text content tweeted by a user, but there is more to it than just that. There is additional metadata associated with the same tweet, which Twitter classifies as entities and places. The entities consist of hashtags, URLs, and other media data that users have included in their tweet. The places are locations from where the tweet originated; it is possible that the place is a real-world location from where the tweet was sent, or a location mentioned in the text of the tweet. Take the following tweet as an example:

Learn how to consume millions of tweets with @twitterapi at #TDC2014 in São Paulo #bigdata tomorrow at 2:10pm http://t.co/pTBlWzTvVd

The preceding tweet was posted by @TwitterDev, and it's about 132 characters long. The following are the entities mentioned in this tweet:

- Handle: @twitterapi
- Hashtags: #TDC2014, #bigdata
- URL: http://t.co/pTBlWzTvVd

São Paulo is the place mentioned in this tweet. This is one example of a tweet with a fairly good amount of metadata. Although the actual tweet's length is well within the 140-character limit, it contains more information than one might think. This actually enables us to figure out that this tweet belongs to a specific community, based on cross-referencing the topics present in the hashtags, the URL to the website, the different users mentioned in it, and so on. The interface (web or mobile) on which the tweets are displayed is called the timeline. The tweets are, in general, arranged in chronological order of posting time. On a specific user's account, only a certain number of tweets are displayed by Twitter; this is generally based on the users the given user is following and is being followed by. This is the interface a user sees upon logging in to his/her Twitter account. A Twitter stream is different from a Twitter timeline in the sense that it is not specific to a user. The tweets on a user's timeline come from only a certain number of users and are displayed/updated less frequently, while the Twitter stream is a chronological collection of all the tweets posted by all users. The number of active users on Twitter is on the order of hundreds of millions.
During public events of widespread interest, such as presidential debates, the combined tweet rate can reach several hundred thousand tweets per minute. The behavior is very similar to a stream; hence the name of such a collection is the Twitter stream. You can try the following by creating a Twitter account (it will be more insightful if you don't already have many followers). Before creating the account, it is advised that you read all its terms and conditions. You can also start reading its API documentation.

Creating a Twitter API connection

We need to have an app created at https://dev.twitter.com/apps before making any API requests to Twitter. This is the standard method for developers to gain API access and, more importantly, it helps Twitter observe and restrict developers making high-load API requests. The ROAuth package is the one we are going to use in our experiments. Tokens allow users to authorize third-party apps to access the data in any user account without the need for their passwords (or other sensitive information); ROAuth basically facilitates this.

Creating a new app

The first step to getting any kind of token access from Twitter is to create an app on it. The user has to go to https://dev.twitter.com/ and log in with their Twitter credentials. Once you are logged in with your credentials, the steps for creating an app are as follows:

1. Go to https://apps.twitter.com/app/new.
2. Put the name of your application in the Name field. This name can be anything you like.
3. Similarly, enter the description in the Description field.
4. The Website field needs to be filled with a valid URL, but again, that can be any random URL.
5. You can leave the Callback URL field blank.
6. After the creation of this app, we need to find the API Key and API Secret values from the Keys and Access Tokens tab.
7. Under the Keys and Access Tokens tab, you will find a button to generate access tokens. Click on it and you will be provided with an Access Token and an Access Token Secret value.

Before using the preceding keys, we need to install twitteR to access the data in R using the app we just created, using the following code:

install.packages(c("devtools", "rjson", "bit64", "httr"))
library(devtools)
install_github("geoffjentry/twitteR")
library(twitteR)

Here's sample code that helps us access the tweets posted since any given date and containing a specific keyword. In this example, we are searching for tweets containing the word Earthquake, posted since September 29, 2014. In order to get this information, we provide four pieces of information to obtain the authorization token:

- API key
- API secret
- access token
- access token secret

We'll show you how to use the preceding information to get an app authorized by the user and access its resources on Twitter. The setup_twitter_oauth() function in twitteR (built on ROAuth) will make our next steps very smooth and clear:

api_key <- "your_api_key"
api_secret <- "your_api_secret"
access_token <- "your_access_token"
access_token_secret <- "your_access_token_secret"
setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret)
EarthQuakeTweets = searchTwitter("EarthQuake", since='2014-09-29')

The results of this example should simply display "Using direct authentication", with 25 tweets loaded into the EarthQuakeTweets variable, as shown here:

head(EarthQuakeTweets, 2)
[[1]]
[1] "TamamiJapan: RT @HistoricalPics: Japan. Top: One Month After Hiroshima, 1945. Bottom: One Month After The Earthquake and Tsunami, 2011. Incredible. http…"
[[2]]
[1] "OldhamDs: RT @HistoricalPics: Japan. Top: One Month After Hiroshima, 1945. Bottom: One Month After The Earthquake and Tsunami, 2011. Incredible. http…"

Here we have shown the first two of the 25 tweets containing the word Earthquake posted since September 29, 2014. If you closely observe the results, you'll find all the metadata using str(EarthQuakeTweets[1]).

Finding trending topics

Now that we understand how to create API connections to Twitter and fetch data using them, we will see how to find out what is trending on Twitter, that is, which topics (worldwide or local) are being talked about the most right now. Using the same API, we can easily access the trending information:

# Return a data frame with name, country & woeid,
# where woeid is a numerical identification code describing a location
Locs <- availableTrendLocations()
# Filter the data frame for Delhi (India) and extract its woeid
LocsIndia = subset(Locs, country == "India")
woeidDelhi = subset(LocsIndia, name == "Delhi")$woeid
# getTrends takes a specified woeid and returns the trending topics associated with it
trends = getTrends(woeid=woeidDelhi)

The function availableTrendLocations() returns an R data frame containing the name, country, and woeid parameters. We then filter this data frame for a location of our choosing; in this example, it's Delhi, India. The function getTrends() fetches the top 10 trends in the location determined by the woeid. Here are the top four trending hashtags in the region defined by woeid = 20070458, that is, Delhi, India:

head(trends)
  name                  url                                                  query                    woeid
1 #AntiHinduNGOsExposed http://twitter.com/search?q=%23AntiHinduNGOsExposed %23AntiHinduNGOsExposed 20070458
2 #KhaasAadmi           http://twitter.com/search?q=%23KhaasAadmi           %23KhaasAadmi           20070458
3 #WinGOSF14            http://twitter.com/search?q=%23WinGOSF14            %23WinGOSF14            20070458
4 #ItsForRealONeBay     http://twitter.com/search?q=%23ItsForRealONeBay     %23ItsForRealONeBay     20070458

Searching tweets

Now, similar to trends, there is one more important function that comes with the twitteR package: searchTwitter(). This function returns tweets containing the searched string, subject to other constraints. Some of the constraints that can be imposed are as follows:

- lang: restricts the tweets to the given language
- since/until: restricts the tweets to those posted since the given date or until the given date
- geocode: restricts the tweets to those from users located within a certain distance of the given latitude/longitude

For example, here we extract tweets about the cricketer Sachin Tendulkar in the month of November 2014:

head(searchTwitter('Sachin Tendulkar', since='2014-11-01', until='2014-11-30'))
[[1]]
[1] "TendulkarFC: RT @Moulinparikh: Sachin Tendulkar had a long session with the Mumbai Ranji Trophy team after today's loss."
[[2]]
[1] "tyagi_niharika: @WahidRuba @Anuj_dvn @Neel_D_ @alishatariq3 @VWellwishers @Meenal_Rathore oh... Yaadaaya....hmaraesachuuu sirxedxa0xbdxedxb8x8d..i mean sachin Tendulkar"
[[3]]
[1] "Meenal_Rathore: @WahidRuba @Anuj_dvn @tyagi_niharika @Neel_D_ @alishatariq3 @AliaaFcc @VWellwishers .. Sachin Tendulkar xedxa0xbdxedxb8x8a☺️"
[[4]]
[1] "MishraVidyanand: Vidyanand Mishra is following the Interest "The Living Legend SachinTendu..." on http://t.co/tveHXMB4BM - http://t.co/CocNMcxFge"
[[5]]
[1] "CSKalwaysWin: I have never tried to compare myself to anyone else.n - Sachin Tendulkar"

Twitter sentiment analysis

Depending on the objective, and based on the ability to search for any type of tweets from the public timeline, one can always collect the required corpus. For example, you may want to learn about customer satisfaction levels with the various cab services coming into the Indian market. These start-ups are offering various discounts and coupons to attract customers, but at the end of the day, service quality determines the business of any organization. These start-ups are constantly promoting themselves on various social media websites, and customers show various levels of sentiment on the same platforms. Let's target the following:

- Meru Cabs: A radio cab service based in Mumbai, India. Launched in 2007.
- Ola Cabs: A taxi aggregator company based in Bangalore, India. Launched in 2011.
- TaxiForSure: A taxi aggregator company based in Bangalore, India. Launched in 2011.
- Uber India: A taxi aggregator company headquartered in San Francisco, California. Launched in India in 2014.

Let's set our goal: to get the general sentiment about each of the preceding service providers, based on the customer sentiments present in the tweets on Twitter.

Collecting tweets as a corpus

We'll start with the searchTwitter() function (discussed previously) from the twitteR package to gather the tweets for each of the preceding organizations. In order to avoid writing the same code again and again, we push the following authorization code into a file called authenticate.R:

library(twitteR)
api_key <- "xx"
api_secret <- "xx"
access_token <- "xx"
access_token_secret <- "xx"
setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret)

We run the following script to get the required tweets:

# Load the necessary packages
source('authenticate.R')
Meru_tweets = searchTwitter("MeruCabs", n=2000, lang="en")
Ola_tweets = searchTwitter("OlaCabs", n=2000, lang="en")
TaxiForSure_tweets = searchTwitter("TaxiForSure", n=2000, lang="en")
Uber_tweets = searchTwitter("Uber_Delhi", n=2000, lang="en")

Now, as mentioned in Twitter's REST API documentation, "Due to capacity constraints, the index currently only covers about a week's worth of tweets", so we do not always get the desired number of tweets (here, 2000). Instead, these are the sizes of each of the above tweet lists:

> length(Meru_tweets)
[1] 393
> length(Ola_tweets)
[1] 984
> length(TaxiForSure_tweets)
[1] 720
> length(Uber_tweets)
[1] 2000

As you can see, the lengths of these tweet lists are not equal to the number of tweets we asked for in our query scripts. There are many takeaways from this information. Since these tweets are only from the last week on Twitter, they suggest there is more discussion about these taxi services in the following order:

1. Uber India
2. Ola Cabs
3. TaxiForSure
4. Meru Cabs

A ban was imposed on Uber India after an alleged rape incident by an Uber India driver. The decision to ban the entire organization because one of its drivers committed a crime became a matter of public outcry; hence, the number of tweets about Uber increased on social media. Meru Cabs, on the other hand, has been in India for almost 7 years now, so it is quite a stable organization. The amount of promotion Ola Cabs and TaxiForSure are doing is way higher than that of Meru Cabs.
This can be one reason for Meru Cabs having the least number (393) of tweets in the last week. The numbers of tweets in the last week are comparable for Ola Cabs (984) and TaxiForSure (720). There can be several reasons for this: they both started their business in the same year and, more importantly, they follow the same business model. Meru Cabs is a radio taxi service that owns and manages a fleet of cars, while Ola Cabs, TaxiForSure, and Uber are marketplaces for users to compare the offerings of various operators and book easily. Let's dive deeper into the data and get more insights.

Cleaning the corpus

Before applying any intelligent algorithms to gather more insights from the tweets collected so far, let's first clean them. In order to clean up, we should understand what the list of tweets looks like:

head(Meru_tweets)
[[1]]
[1] "MeruCares: @KapilTwitts 2&gt;...and other details at feedback@merucabs.com We'll check back and reach out soon."
[[2]]
[1] "vikasraidhan: @MeruCabs really disappointed with @GenieCabs. Cab is never assigned on time. Driver calls after 30 minutes. Why would I ride with Meru?"
[[3]]
[1] "shiprachowdhary: fallback of #ubershame , #MERUCABS taking customers for a ride"
[[4]]
[1] "shiprachowdhary: They book Genie, but JIT inform of cancellation &amp; send full fare #MERUCABS . Very disappointed.Always used these guys 4 and recommend them."
[[5]]
[1] "shiprachowdhary: No choice bt to take the #merucabs premium service. Driver told me that this happens a lot with #merucabs."
[[6]]
[1] "shiprachowdhary: booked #Merucabsyestrdy. Asked for Meru Genie. 10 mins 4 pick up time, they call to say Genie not available, so sending the full fare cab"

The first tweet here is a grievance response, while the second, fourth, and fifth are actual customer sentiments about the services provided by Meru Cabs. We see:

- Lots of meta information, such as @people, URLs, and #hashtags
- Punctuation marks, numbers, and unnecessary spaces
- Some of these tweets are retweets from other users; for the given application, we would not like to consider retweets (RTs) in the sentiment analysis

We clean all this data using the following code block:

MeruTweets <- sapply(Meru_tweets, function(x) x$getText())
OlaTweets = sapply(Ola_tweets, function(x) x$getText())
TaxiForSureTweets = sapply(TaxiForSure_tweets, function(x) x$getText())
UberTweets = sapply(Uber_tweets, function(x) x$getText())

catch.error = function(x) {
  # let us create a missing value for test purposes
  y = NA
  # try to catch the error (NA) we just created
  catch_error = tryCatch(tolower(x), error=function(e) e)
  # if not an error
  if (!inherits(catch_error, "error"))
    y = tolower(x)
  # check the result: if an error exists, NA is kept; otherwise the function works fine
  return(y)
}

cleanTweets <- function(tweet) {
  # Clean the tweet for sentiment analysis
  # remove html links, which are not required for sentiment analysis
  tweet = gsub("(f|ht)(tp)(s?)(://)(.*)[.|/](.*)", " ", tweet)
  # first we remove retweet entities from the stored tweets (text)
  tweet = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ", tweet)
  # then remove all "#Hashtag"
  tweet = gsub("#\\w+", " ", tweet)
  # then remove all "@people"
  tweet = gsub("@\\w+", " ", tweet)
  # then remove all the punctuation
  tweet = gsub("[[:punct:]]", " ", tweet)
  # then remove numbers; we need only text for analytics
  tweet = gsub("[[:digit:]]", " ", tweet)
  # finally, we remove unnecessary spaces (white spaces, tabs etc)
  tweet = gsub("[ \t]{2,}", " ", tweet)
  tweet = gsub("^\\s+|\\s+$", "", tweet)
  # if you feel anything else should be removed (for example slang words),
  # you can do so using the above functions and methods
  # next we convert all the words to lower case; this makes a uniform pattern
  tweet = catch.error(tweet)
  tweet
}

cleanTweetsAndRemoveNAs <- function(Tweets) {
  TweetsCleaned = sapply(Tweets, cleanTweets)
  # Remove the "NA" tweets from this tweet list
  TweetsCleaned = TweetsCleaned[!is.na(TweetsCleaned)]
  names(TweetsCleaned) = NULL
  # Remove repetitive tweets from this tweet list
  TweetsCleaned = unique(TweetsCleaned)
  TweetsCleaned
}

MeruTweetsCleaned = cleanTweetsAndRemoveNAs(MeruTweets)
OlaTweetsCleaned = cleanTweetsAndRemoveNAs(OlaTweets)
TaxiForSureTweetsCleaned <- cleanTweetsAndRemoveNAs(TaxiForSureTweets)
UberTweetsCleaned = cleanTweetsAndRemoveNAs(UberTweets)

Here are the sizes of the cleaned tweet lists:

> length(MeruTweetsCleaned)
[1] 309
> length(OlaTweetsCleaned)
[1] 811
> length(TaxiForSureTweetsCleaned)
[1] 574
> length(UberTweetsCleaned)
[1] 1355

Estimating sentiment (A)

There are many sophisticated resources available for estimating sentiment. Many research papers and software packages are available open source, and they implement very complex algorithms for sentiment analysis. Now that we have the cleaned Twitter data, we are going to use a few of the R packages available to assess the sentiment in the tweets. It's worth mentioning here that not all tweets represent a sentiment: some tweets can be pure information/facts, while others can be customer care responses. Ideally, those should not be used to assess the customer sentiment about a particular organization. As a first step, we'll use a naive algorithm, which gives a score based on the number of times a positive or a negative word occurs in the given sentence (in our case, in a tweet). Please download the list of positive and negative opinion/sentiment words in English (nearly 6,800 words). This opinion lexicon will be used as a first example in our sentiment analysis experiment. The good thing about this approach is that we are relying on a highly researched, and at the same time customizable, input.
Here are a few examples of existing positive and negative sentiment words:

Positive: love, best, cool, great, good, and amazing
Negative: hate, worst, sucks, awful, and nightmare

> opinion.lexicon.pos = scan('opinion-lexicon-English/positive-words.txt', what='character', comment.char=';')
> opinion.lexicon.neg = scan('opinion-lexicon-English/negative-words.txt', what='character', comment.char=';')
> head(opinion.lexicon.neg)
[1] "2-faced" "2-faces" "abnormal" "abolish" "abominable" "abominably"
> head(opinion.lexicon.pos)
[1] "a+" "abound" "abounds" "abundance" "abundant" "accessable"

We'll add a few industry-specific and/or especially emphatic terms based on our requirements:

pos.words = c(opinion.lexicon.pos, 'upgrade')
neg.words = c(opinion.lexicon.neg, 'wait', 'waiting', 'wtf', 'cancellation')

Now, we create a function, getSentimentScore(), which computes the raw sentiment based on this simple matching algorithm:

getSentimentScore = function(sentences, words.positive, words.negative, .progress='none') {
  require(plyr)
  require(stringr)
  scores = laply(sentences, function(sentence, words.positive, words.negative) {
    # first remove the digits, punctuation characters, and control characters
    sentence = gsub('[[:cntrl:]]', '', gsub('[[:punct:]]', '', gsub('\\d+', '', sentence)))
    # then convert the sentence to lower case
    sentence = tolower(sentence)
    # now split each sentence by the space delimiter
    words = unlist(str_split(sentence, '\\s+'))
    # get the boolean match of each word against the positive and negative opinion lexicons
    pos.matches = !is.na(match(words, words.positive))
    neg.matches = !is.na(match(words, words.negative))
    # the score is the total of positive matches minus the total of negative matches
    score = sum(pos.matches) - sum(neg.matches)
    return(score)
  }, words.positive, words.negative, .progress=.progress)
  # return a data frame with each sentence and its score
  return(data.frame(text=sentences, score=scores))
}

Now, we apply the preceding function to the corpus of tweets collected and cleaned so far (passing the extended word lists we built above):

MeruResult = getSentimentScore(MeruTweetsCleaned, pos.words, neg.words)
OlaResult = getSentimentScore(OlaTweetsCleaned, pos.words, neg.words)
TaxiForSureResult = getSentimentScore(TaxiForSureTweetsCleaned, pos.words, neg.words)
UberResult = getSentimentScore(UberTweetsCleaned, pos.words, neg.words)

Here are some sample results:

Tweet for Meru Cabs | Score
gt and other details at feedback com we ll check back and reach out soon | 0
really disappointed with cab is never assigned on time driver calls after minutes why would i ride with meru | -1
so after years of bashing today i m pleasantly surprised clean car courteous driver prompt pickup mins efficient route | 4
a min drive cost hrs used to cost less ur unreliable and expensive trying to lose ur customers | -3

Tweet for Ola Cabs | Score
the service is going from bad to worse the drivers deny to come after a confirmed booking | -3
love the olacabs app give it a swirl sign up with my referral code dxf n and earn rs download the app from | 1
crn kept me waiting for mins amp at last moment driver refused pickup so unreliable amp irresponsible | -4
this is not the first time has delighted me punctuality and free upgrade awesome that | 4

Tweet for TaxiForSure | Score
great service now i have become a regular customer of tfs thank you for the upgrade as well happy taxi ing saving | 5
really disappointed with cab is never assigned on time driver calls after minutes why would i ride with meru | -1
horrible taxi service had to wait for one hour with a new born in the chilly weather of new delhi waiting for them | -4
what do i get now if you resolve the issue after i lost a crucial business because of the taxi delay | -3

Tweet for Uber India | Score
that s good uber s fares will prob be competitive til they gain local monopoly then will go sky high as in new york amp delhi saving | 3
from a shabby backend app stack to daily pr fuck ups its increasingly obvious that is run by child minded blow hards | -3
you say that uber is illegally running were you stupid to not ban earlier and only ban it now after the rape | -3
perhaps uber biz model does need some looking into it s not just in delhi that this happens but in boston too | 0

From the preceding observations, it's clear that this basic sentiment analysis method works fine in normal circumstances, but in the case of Uber India the results deviate too much from a subjective score. It's safe to say that basic word matching gives a good indicator of overall customer sentiment, except when the data itself is not reliable. In our case, the tweets about Uber India are not really about the services Uber provides but about one incident of crime by one of its drivers, and the whole score went haywire. Let's now compute a point statistic of the scores we have computed so far. Since the numbers of tweets are not equal for the four organizations, we compute a mean and standard deviation for each:

Organization | Mean Sentiment Score | Standard Deviation
Meru Cabs | -0.2218543 | 1.301846
Ola Cabs | 0.197724 | 1.170334
TaxiForSure | -0.09841828 | 1.154056
Uber India | -0.6132666 | 1.071094

Estimating sentiment (B)

Let's now move one step further. Instead of using simple matching against an opinion lexicon, we'll use a Naive Bayes classifier to decide on the emotion present in any tweet. We will require packages called Rstem and sentiment to assist in this. It's important to mention here that both these packages are no longer available on CRAN, and hence we have to provide the repository location as a parameter to the install.packages() function, or install from an archived source. Here's the R script to install the required packages:

install.packages("Rstem", repos = "http://www.omegahat.org/R", type="source")
require(devtools)
install_url("http://cran.r-project.org/src/contrib/Archive/sentiment/sentiment_0.2.tar.gz")
require(sentiment)
ls("package:sentiment")

Now that we have the sentiment and Rstem packages installed in our R workspace, we can build the Bayes classifier for sentiment analysis:

library(sentiment)
# classify_emotion() returns an object of class data frame
# with seven columns (anger, disgust, fear, joy, sadness, surprise,
# best_fit) and one row for each document:
MeruTweetsClassEmo = classify_emotion(MeruTweetsCleaned, algorithm="bayes", prior=1.0)
OlaTweetsClassEmo = classify_emotion(OlaTweetsCleaned, algorithm="bayes", prior=1.0)
TaxiForSureTweetsClassEmo = classify_emotion(TaxiForSureTweetsCleaned, algorithm="bayes", prior=1.0)
UberTweetsClassEmo = classify_emotion(UberTweetsCleaned, algorithm="bayes", prior=1.0)

The following figure shows a few results from the Bayesian analysis of the Meru Cabs tweets using the sentiment package. Similarly, we generated results for the other cab services in our problem setup. The sentiment package was built on a trained dataset of emotion words (nearly 1,500 words). The classify_emotion() function assigns each document one of the following six emotions: anger, disgust, fear, joy, sadness, and surprise.
Hence, when the system is not able to classify the overall emotion as any of the six, NA is returned. Let's substitute these NA values with the word unknown to make the further analysis easier:

# we will fetch the emotion category best_fit for our analysis purposes
MeruEmotion = MeruTweetsClassEmo[,7]
OlaEmotion = OlaTweetsClassEmo[,7]
TaxiForSureEmotion = TaxiForSureTweetsClassEmo[,7]
UberEmotion = UberTweetsClassEmo[,7]

MeruEmotion[is.na(MeruEmotion)] = "unknown"
OlaEmotion[is.na(OlaEmotion)] = "unknown"
TaxiForSureEmotion[is.na(TaxiForSureEmotion)] = "unknown"
UberEmotion[is.na(UberEmotion)] = "unknown"

The best-fit emotions present in these tweets are as follows:

Further, we'll use another function, classify_polarity(), provided by the sentiment package to classify the tweets into two classes, pos (positive sentiment) or neg (negative sentiment). The idea is to compute the log-likelihood of a tweet under each of the two classes. Once these likelihoods are calculated, the ratio of the pos-likelihood to the neg-likelihood is computed, and based on this ratio the tweets are assigned to a class. It's important to note that if this ratio turns out to be 1, the overall sentiment of the tweet is assumed to be "neutral". The code is as follows:

MeruTweetsClassPol = classify_polarity(MeruTweetsCleaned, algorithm="bayes")
OlaTweetsClassPol = classify_polarity(OlaTweetsCleaned, algorithm="bayes")
TaxiForSureTweetsClassPol = classify_polarity(TaxiForSureTweetsCleaned, algorithm="bayes")
UberTweetsClassPol = classify_polarity(UberTweetsCleaned, algorithm="bayes")

We get the following output: the preceding figure shows a few results obtained using the classify_polarity() function of the sentiment package on the Meru Cabs tweets.
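Before consolidating, it's worth eyeballing the raw polarity output; a quick sketch, relying on the column layout described above (positive and negative log-likelihoods, their ratio, and the best-fit label in the fourth column):

head(MeruTweetsClassPol)
# distribution of the best-fit polarity labels for Meru Cabs
table(MeruTweetsClassPol[,4])

A quick table() like this gives an early sense of the positive/negative split before any plotting is done.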
We'll now consolidate the results from the two functions into a data frame for each cab service, for plotting purposes:

# we will fetch the polarity category best_fit for our analysis purposes
MeruPol = MeruTweetsClassPol[,4]
OlaPol = OlaTweetsClassPol[,4]
TaxiForSurePol = TaxiForSureTweetsClassPol[,4]
UberPol = UberTweetsClassPol[,4]

# let us now create a data frame with the above results
MeruSentimentDataFrame = data.frame(text=MeruTweetsCleaned, emotion=MeruEmotion, polarity=MeruPol, stringsAsFactors=FALSE)
OlaSentimentDataFrame = data.frame(text=OlaTweetsCleaned, emotion=OlaEmotion, polarity=OlaPol, stringsAsFactors=FALSE)
TaxiForSureSentimentDataFrame = data.frame(text=TaxiForSureTweetsCleaned, emotion=TaxiForSureEmotion, polarity=TaxiForSurePol, stringsAsFactors=FALSE)
UberSentimentDataFrame = data.frame(text=UberTweetsCleaned, emotion=UberEmotion, polarity=UberPol, stringsAsFactors=FALSE)

# rearrange the data inside each frame by sorting the emotion levels by frequency
MeruSentimentDataFrame = within(MeruSentimentDataFrame, emotion <- factor(emotion, levels=names(sort(table(emotion), decreasing=TRUE))))
OlaSentimentDataFrame = within(OlaSentimentDataFrame, emotion <- factor(emotion, levels=names(sort(table(emotion), decreasing=TRUE))))
TaxiForSureSentimentDataFrame = within(TaxiForSureSentimentDataFrame, emotion <- factor(emotion, levels=names(sort(table(emotion), decreasing=TRUE))))
UberSentimentDataFrame = within(UberSentimentDataFrame, emotion <- factor(emotion, levels=names(sort(table(emotion), decreasing=TRUE))))

Let's now plot the emotions one by one. First, we create a single plotting function, plotSentiments1(), to be used for each business's tweets, and then call it for each business:

plotSentiments1 <- function (sentiment_dataframe, title) {
  library(ggplot2)
  ggplot(sentiment_dataframe, aes(x=emotion)) +
    geom_bar(aes(y=..count.., fill=emotion)) +
    scale_fill_brewer(palette="Dark2") +
    ggtitle(title) +
    theme(legend.position='right') +
    ylab('Number of Tweets') +
    xlab('Emotion Categories')
}

plotSentiments1(MeruSentimentDataFrame, 'Sentiment Analysis of Tweets on Twitter about MeruCabs')
plotSentiments1(OlaSentimentDataFrame, 'Sentiment Analysis of Tweets on Twitter about OlaCabs')
plotSentiments1(TaxiForSureSentimentDataFrame, 'Sentiment Analysis of Tweets on Twitter about TaxiForSure')
plotSentiments1(UberSentimentDataFrame, 'Sentiment Analysis of Tweets on Twitter about UberIndia')

The first dashboard shows the analysis for Meru Cabs.
The following dashboard shows the analysis for Ola Cabs:
The following dashboard shows the analysis for TaxiForSure:
The following dashboard shows the analysis for Uber India:

These plots reflect more or less the same observations as the basic word-matching algorithm. Tweets with joy constitute the largest part of the tweets for all these organizations, indicating that they are trying their best to provide good service in the country. The sadness tweets are less numerous than the joy tweets. However, compared with each other, the plots indicate the overall market share versus the level of customer satisfaction of each service provider in question. Similarly, these graphs can be used to assess the level of dissatisfaction in terms of anger and disgust in the tweets.
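Since the four corpora differ in size, comparing raw counts across companies can mislead; a brief sketch normalizing the emotion counts to proportions with base R:

# proportion of each emotion label per company, rounded for readability
round(prop.table(table(MeruEmotion)), 3)
round(prop.table(table(UberEmotion)), 3)

The same two lines work for the Ola Cabs and TaxiForSure vectors, making the cross-company comparison independent of corpus size.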
Let's now consider only the positive and negative sentiments present in the tweets:

# similarly, we will plot the distribution of polarity in the tweets
plotSentiments2 <- function (sentiment_dataframe, title) {
  library(ggplot2)
  ggplot(sentiment_dataframe, aes(x=polarity)) +
    geom_bar(aes(y=..count.., fill=polarity)) +
    scale_fill_brewer(palette="RdGy") +
    ggtitle(title) +
    theme(legend.position='right') +
    ylab('Number of Tweets') +
    xlab('Polarity Categories')
}

plotSentiments2(MeruSentimentDataFrame, 'Polarity Analysis of Tweets on Twitter about MeruCabs')
plotSentiments2(OlaSentimentDataFrame, 'Polarity Analysis of Tweets on Twitter about OlaCabs')
plotSentiments2(TaxiForSureSentimentDataFrame, 'Polarity Analysis of Tweets on Twitter about TaxiForSure')
plotSentiments2(UberSentimentDataFrame, 'Polarity Analysis of Tweets on Twitter about UberIndia')

The first dashboard shows the polarity analysis for Meru Cabs.
The following dashboard shows the polarity analysis for Ola Cabs:
The following dashboard shows the analysis for TaxiForSure:
The following dashboard shows the analysis for Uber India:

It's a basic human trait to report what went wrong rather than what went right. That is to say, we tend to tweet when something bad has happened rather than when the experience was good. Hence, negative tweets generally outnumber positive ones. Still, over a period of time (a week in our case), the ratio of the two is a fair reflection of the overall market share versus the level of customer satisfaction of each service provider in question. Next, we try to get a sense of the overall content of the tweets using word clouds:

removeCustomWords <- function (TweetsCleaned) {
  for (i in 1:length(TweetsCleaned)) {
    TweetsCleaned[i] <- tryCatch({
      TweetsCleaned[i] = removeWords(TweetsCleaned[i], c(stopwords("english"), "care", "guys", "can", "dis", "didn", "guy", "booked", "plz"))
      TweetsCleaned[i]
    }, error=function(cond) {
      TweetsCleaned[i]
    }, warning=function(cond) {
      TweetsCleaned[i]
    })
  }
  return(TweetsCleaned)
}

getWordCloud <- function (sentiment_dataframe, TweetsCleaned, Emotion) {
  emos = levels(factor(sentiment_dataframe$emotion))
  n_emos = length(emos)
  emo.docs = rep("", n_emos)
  TweetsCleaned = removeCustomWords(TweetsCleaned)
  for (i in 1:n_emos) {
    emo.docs[i] = paste(TweetsCleaned[Emotion == emos[i]], collapse=" ")
  }
  corpus = Corpus(VectorSource(emo.docs))
  tdm = TermDocumentMatrix(corpus)
  tdm = as.matrix(tdm)
  colnames(tdm) = emos
  require(wordcloud)
  suppressWarnings(comparison.cloud(tdm, colors = brewer.pal(n_emos, "Dark2"), scale = c(3,.5), random.order = FALSE, title.size = 1.5))
}

getWordCloud(MeruSentimentDataFrame, MeruTweetsCleaned, MeruEmotion)
getWordCloud(OlaSentimentDataFrame, OlaTweetsCleaned, OlaEmotion)
getWordCloud(TaxiForSureSentimentDataFrame, TaxiForSureTweetsCleaned, TaxiForSureEmotion)
getWordCloud(UberSentimentDataFrame, UberTweetsCleaned, UberEmotion)

The preceding figures show the word clouds from the tweets about Meru Cabs, Ola Cabs, TaxiForSure, and Uber India, respectively.
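Before summarizing, the polarity labels can also be condensed into a single comparison table; a minimal sketch, assuming the best-fit labels produced by classify_polarity() are "positive", "neutral", and "negative":

polaritySummary <- function(pol, company) {
  # fix the factor levels so every company's row has the same columns
  counts = table(factor(pol, levels=c("positive", "neutral", "negative")))
  data.frame(company=company,
             positive=as.integer(counts["positive"]),
             neutral=as.integer(counts["neutral"]),
             negative=as.integer(counts["negative"]),
             pos.neg.ratio=round(as.integer(counts["positive"]) / as.integer(counts["negative"]), 2),
             row.names=NULL)
}

rbind(polaritySummary(MeruPol, "Meru Cabs"),
      polaritySummary(OlaPol, "Ola Cabs"),
      polaritySummary(TaxiForSurePol, "TaxiForSure"),
      polaritySummary(UberPol, "Uber India"))

One row per company, with the positive-to-negative ratio discussed above, makes the week-on-week comparison a one-liner.

Summary

In this article, we gained knowledge of the various Twitter APIs, discussed how to create a connection with Twitter, and saw how to retrieve tweets with various attributes. We saw the power of Twitter in helping us determine customer attitudes toward today's various businesses.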
The analysis can be repeated on a weekly basis, so one can easily track the monthly, quarterly, or yearly changes in customer sentiment. This not only helps customers spot trending businesses; the business itself gets a well-defined metric of its own performance and can use such scores and graphs to improve. We also discussed various methods of sentiment analysis, from basic word matching to the more advanced Bayesian algorithms.

Resources for Article:
Further resources on this subject:
Find Friends on Facebook [article]
Supervised learning [article]
Warming Up [article]
How to perform Numeric Metric Aggregations with Elasticsearch

Pravin Dhandre
22 Feb 2018
7 min read
[box type="note" align="" class="" width=""]This article is an excerpt from the book Learning Elastic Stack 6.0 written by Pranav Shukla and Sharath Kumar M N. This book provides detailed coverage of the fundamentals of each component of Elastic Stack, making it easy to search, analyze, and visualize data across different sources in real time.[/box] Today, we are going to demonstrate how to run numeric and statistical queries, such as summation, average, count, and similar metric aggregations, on the Elastic Stack to serve as a better analytics engine for your dataset.

Metric aggregations

Metric aggregations work with numeric data, computing one or more aggregate metrics within the given context. The context could be a query, a filter, or no query at all to include the whole index/type. Metric aggregations can also be nested inside other bucket aggregations. In this case, these metrics will be computed for each bucket in the bucket aggregations. We will start with simple metric aggregations without nesting them inside bucket aggregations. When we learn about bucket aggregations later in the chapter, we will also learn how to use metric aggregations inside bucket aggregations. We will learn about the following metric aggregations:

Sum, average, min, and max aggregations
Stats and extended stats aggregations
Cardinality aggregation

Let us learn about them one by one.

Sum, average, min, and max aggregations

Finding the sum of a field, the minimum value for a field, the maximum value for a field, or an average are very common operations. For people who are familiar with SQL, the query to find the sum would look like the following:

SELECT sum(downloadTotal) FROM usageReport;

The preceding query will calculate the sum of the downloadTotal field across all records in the table. This requires going through all records of the table, or all records in the given context, and adding the values of the given field. In Elasticsearch, a similar query can be written using the sum aggregation. Let us understand the sum aggregation first.

Sum aggregation

Here is how to write a simple sum aggregation:

GET bigginsight/_search
{
  "aggregations": {                  (1)
    "download_sum": {                (2)
      "sum": {                       (3)
        "field": "downloadTotal"     (4)
      }
    }
  },
  "size": 0                          (5)
}

1. The aggs or aggregations element at the top level should wrap any aggregation.
2. Give a name to the aggregation; here we are doing a sum aggregation on the downloadTotal field, and hence the name we chose is download_sum. You can name it anything. This name will be useful while looking up this particular aggregation's result in the response.
3. We are doing a sum aggregation, hence the sum element.
4. We want the sum computed over the downloadTotal field.
5. Specify size = 0 to prevent raw search results from being returned. We just want the aggregation results and not the search results in this case. Since we haven't specified any top-level query elements, it matches all documents. We do not want any raw documents (or search hits) in the result.

The response should look like the following:

{
  "took": 92,
  ...
  "hits": {
    "total": 242836,                 (1)
    "max_score": 0,
    "hits": []
  },
  "aggregations": {                  (2)
    "download_sum": {                (3)
      "value": 2197438700            (4)
    }
  }
}

Let us understand the key aspects of the response. The key parts are numbered 1, 2, 3, and so on, and are explained in the following points:

1. The hits.total element shows the number of documents that were considered, or were in the context of the query. If there was no additional query or filter specified, it will include all documents in the type or index.
2. Just like the request, this response is wrapped inside aggregations to indicate it as such.
3. The response of the aggregation we requested was named download_sum, hence we get our response from the sum aggregation inside an element with the same name.
4. This is the actual value after applying the sum aggregation.

The average, min, and max aggregations are very similar. Let's look at them briefly.

Average aggregation

The average aggregation finds an average across all documents in the querying context:

GET bigginsight/_search
{
  "aggregations": {
    "download_average": {
      "avg": {
        "field": "downloadTotal"
      }
    }
  },
  "size": 0
}

The only notable differences from the sum aggregation are as follows:

We chose a different name, download_average, to make it apparent that the aggregation is computing the average.
The type of aggregation that we are doing is avg instead of the sum aggregation that we were doing earlier.

The response structure is identical, but the value field will now represent the average of the requested field. The min and max aggregations are exactly the same.

Min aggregation

Here is how we will find the minimum value of the downloadTotal field in the entire index/type:

GET bigginsight/_search
{
  "aggregations": {
    "download_min": {
      "min": {
        "field": "downloadTotal"
      }
    }
  },
  "size": 0
}

Let's finally look at the max aggregation too.

Max aggregation

Here is how we will find the maximum value of the downloadTotal field in the entire index/type:

GET bigginsight/_search
{
  "aggregations": {
    "download_max": {
      "max": {
        "field": "downloadTotal"
      }
    }
  },
  "size": 0
}

These aggregations were really simple. Now let's look at some more advanced, yet still simple, stats and extended stats aggregations.

Stats and extended stats aggregations

These aggregations compute some common statistics in a single request, without having to issue multiple requests. This saves resources on the Elasticsearch side as well, because the statistics are computed in a single pass rather than being requested multiple times. The client code also becomes simpler if you are interested in more than one of these statistics. Let's look at the stats aggregation first.

Stats aggregation

The stats aggregation computes the sum, average, min, max, and count of documents in a single pass:

GET bigginsight/_search
{
  "aggregations": {
    "download_stats": {
      "stats": {
        "field": "downloadTotal"
      }
    }
  },
  "size": 0
}

The structure of the stats request is the same as for the other metric aggregations we have seen so far, so nothing special is going on here. The response should look like the following:

{
  "took": 4,
  ...,
  "hits": {
    "total": 242836,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "download_stats": {
      "count": 242835,
      "min": 0,
      "max": 241213,
      "avg": 9049.102065188297,
      "sum": 2197438700
    }
  }
}

As you can see, the response with the download_stats element contains count, min, max, average, and sum; everything is included in the same response. This is very handy, as it reduces the overhead of multiple requests and also simplifies the client code.
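Before that, note that the same metric aggregations can be scoped to a query context instead of the whole index. Here is a hedged sketch that averages downloadTotal for a single customer; the customer field and its value are assumptions about the dataset, not taken from the book:

GET bigginsight/_search
{
  "query": {
    "term": { "customer": "Linkedin" }
  },
  "aggregations": {
    "download_average": {
      "avg": { "field": "downloadTotal" }
    }
  },
  "size": 0
}

Only the documents matching the query are considered, so hits.total in the response will reflect the filtered count rather than the whole index. Let us now look at the extended stats aggregation.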
Extended stats aggregation

The extended stats aggregation returns a few more statistics in addition to the ones returned by the stats aggregation:

GET bigginsight/_search
{
  "aggregations": {
    "download_estats": {
      "extended_stats": {
        "field": "downloadTotal"
      }
    }
  },
  "size": 0
}

The response looks like the following:

{
  "took": 15,
  "timed_out": false,
  ...,
  "hits": {
    "total": 242836,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "download_estats": {
      "count": 242835,
      "min": 0,
      "max": 241213,
      "avg": 9049.102065188297,
      "sum": 2197438700,
      "sum_of_squares": 133545882701698,
      "variance": 468058704.9782911,
      "std_deviation": 21634.664429528162,
      "std_deviation_bounds": {
        "upper": 52318.43092424462,
        "lower": -34220.22679386803
      }
    }
  }
}

It also returns the sum of squares, variance, standard deviation, and standard deviation bounds.

Cardinality aggregation

Finding the count of unique elements can be done with the cardinality aggregation. It is similar to finding the result of a query such as the following:

select count(*) from (select distinct username from usageReport) u;

Finding the cardinality, or the number of unique values, for a specific field is a very common requirement. If you have a click-stream from the different visitors on your website, you may want to find out how many unique visitors you got in a given day, week, or month. Let us understand how we find the count of unique users for whom we have network traffic data:

GET bigginsight/_search
{
  "aggregations": {
    "unique_visitors": {
      "cardinality": {
        "field": "username"
      }
    }
  },
  "size": 0
}

The cardinality aggregation response is just like the other metric aggregations:

{
  "took": 110,
  ...,
  "hits": {
    "total": 242836,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "unique_visitors": {
      "value": 79
    }
  }
}
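As mentioned at the beginning, metric aggregations can also be nested inside bucket aggregations, in which case the metric is computed once per bucket. A brief sketch is shown below, assuming a keyword-typed customer field in the same index (an assumption about the dataset):

GET bigginsight/_search
{
  "aggregations": {
    "by_customer": {
      "terms": { "field": "customer" },
      "aggregations": {
        "download_sum": {
          "sum": { "field": "downloadTotal" }
        }
      }
    }
  },
  "size": 0
}

The response contains one bucket per customer, each carrying its own download_sum value. To summarize, we learned how to perform numerous metric aggregations on numeric datasets and how Elasticsearch can easily power a strong analytics application. If you found this tutorial useful, do check out the book Learning Elastic Stack 6.0 to examine the fundamentals of the Elastic Stack in detail and start developing solutions for problems like logging, site search, app search, metrics, and more.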
How to Build TensorFlow Models for Mobile and Embedded devices

Savia Lobo
15 May 2018
12 min read
TensorFlow models can be used in applications running on mobile and embedded platforms. TensorFlow Lite and TensorFlow Mobile are two flavors of TensorFlow for resource-constrained mobile devices. TensorFlow Lite supports a subset of the functionality of TensorFlow Mobile and delivers better performance due to its smaller binary size and fewer dependencies. This article covers what you need to integrate TensorFlow into a mobile application: a model is trained, saved, and then used for inference and prediction in the mobile application. [box type="note" align="" class="" width=""]This article is an excerpt from the book Mastering TensorFlow 1.x written by Armando Fandango. This book will help you leverage the power of TensorFlow and Keras to build deep learning models, using concepts such as transfer learning, generative adversarial networks, and deep reinforcement learning.[/box] To learn how to use TensorFlow models on mobile devices, the following topics are covered:

TensorFlow on mobile platforms
TF Mobile in Android apps
TF Mobile demo on Android
TF Mobile demo on iOS
TensorFlow Lite
TF Lite demo on Android
TF Lite demo on iOS

TensorFlow on mobile platforms

TensorFlow can be integrated into mobile apps for many use cases that involve one or more of the following machine learning tasks:

Speech recognition
Image recognition
Gesture recognition
Optical character recognition
Image or text classification
Image, text, or speech synthesis
Object identification

To run TensorFlow on mobile apps, we need two major ingredients:

A trained and saved model that can be used for predictions
A TensorFlow binary that can receive the inputs, apply the model, produce the predictions, and send the predictions as output

The high-level architecture looks like the following figure: the mobile application code sends the inputs to the TensorFlow binary, which uses the trained model to compute predictions and send them back.

TF Mobile in Android apps

The TensorFlow ecosystem enables it to be used in Android apps through the interface class TensorFlowInferenceInterface, and the TensorFlow Java API in the jar file libandroid_tensorflow_inference_java.jar. You can either use the jar file from JCenter, download a precompiled jar from ci.tensorflow.org, or build it yourself. The inference interface has been made available as a JCenter package and can be included in the Android project by adding the following code to the build.gradle file:

allprojects {
  repositories {
    jcenter()
  }
}
dependencies {
  compile 'org.tensorflow:tensorflow-android:+'
}

Note: Instead of using the pre-built binaries from JCenter, you can also build them yourself using Bazel or Cmake by following the instructions at this link: https://github.com/tensorflow/tensorflow/blob/r1.4/tensorflow/contrib/android/README.md

Once the TF library is configured in your Android project, you can call the TF model with the following four steps:

1. Load the model:
TensorFlowInferenceInterface inferenceInterface = new TensorFlowInferenceInterface(assetManager, modelFilename);

2. Send the input data to the TensorFlow binary:
inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3);

3. Run the prediction or inference:
inferenceInterface.run(outputNames, logStats);

4. Receive the output from the TensorFlow binary:
inferenceInterface.fetch(outputName, outputs);
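Putting the four steps together, a minimal sketch of what this looks like inside an activity is shown below; the model path, node names, input size, and the preprocessing helper are placeholder assumptions, not values from the demo:

// A minimal sketch; node names and shapes depend on your trained model.
TensorFlowInferenceInterface inferenceInterface =
    new TensorFlowInferenceInterface(getAssets(), "file:///android_asset/my_model.pb");

float[] floatValues = preprocessBitmap(bitmap);   // hypothetical helper returning normalized pixels
inferenceInterface.feed("input", floatValues, 1, 224, 224, 3);
inferenceInterface.run(new String[] {"output"}, false);

float[] outputs = new float[NUM_CLASSES];         // hypothetical constant: number of labels
inferenceInterface.fetch("output", outputs);      // outputs now holds the class scores

TF Mobile demo on Android

In this section, we shall learn about recreating the Android demo app provided by the TensorFlow team in their official repo.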
The Android demo will install the following four apps on your Android device:

TF Classify: This is an object identification app that identifies the images in the input from the device camera and classifies them into one of the pre-defined classes. It does not learn new types of pictures, but tries to classify them into one of the categories that it has already learned. The app is built using the Inception model pre-trained by Google.
TF Detect: This is an object detection app that detects multiple objects in the input from the device camera. It continues to identify the objects as you move the camera around in continuous picture-feed mode.
TF Stylize: This is a style transfer app that transfers one of the selected predefined styles to the input from the device camera.
TF Speech: This is a speech recognition app that identifies your speech, and if it matches one of the predefined commands in the app, it highlights that specific command on the device screen.

Note: The sample demo only works for Android devices with an API level greater than 21, and the device must have a modern camera that supports FOCUS_MODE_CONTINUOUS_PICTURE. If your device camera does not support this feature, then you have to apply the patch submitted to TensorFlow by the author: https://github.com/tensorflow/tensorflow/pull/15489/files.

The easiest way to build and deploy the demo app on your device is using Android Studio. To build it this way, follow these steps:

1. Install Android Studio. We installed Android Studio on Ubuntu 16.04 from the instructions at the following link: https://developer.android.com/studio/install.html
2. Check out the TensorFlow repository, and apply the patch mentioned in the previous tip. Let's assume you checked out the code in the tensorflow folder in your home directory.
3. Using Android Studio, open the Android project in the path ~/tensorflow/tensorflow/examples/android. Your screen will look similar to this:
4. Expand the Gradle Scripts option from the left bar and then open the build.gradle file.
5. In the build.gradle file, locate the def nativeBuildSystem definition and set it to 'none'. In the version of the code we checked out, this definition is at line 43:
def nativeBuildSystem = 'none'
6. Build the demo and run it on either a real or simulated device. We tested the app on these devices.
7. You can also build the APK and install it on a virtual or actual connected device. Once the app installs on the device, you will see the four apps we discussed earlier.

You can also build the whole demo app from source using Bazel or Cmake by following the instructions at this link: https://github.com/tensorflow/tensorflow/tree/r1.4/tensorflow/examples/android

TF Mobile in iOS apps

TensorFlow enables support for iOS apps by following these steps:

1. Include TF Mobile in your app by adding a file named Podfile in the root directory of your project. Add the following content to the Podfile:
target 'Name-Of-Your-Project'
pod 'TensorFlow-experimental'
2. Run the pod install command to download and install the TensorFlow Experimental pod.
3. Open the myproject.xcworkspace file to open the workspace, so you can add the prediction code to your application logic.
Note: To create your own TensorFlow binaries for iOS projects, follow the instructions at this link: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/ios

Once the TF library is configured in your iOS project, you can call the TF model with the following four steps:

1. Load the model:
PortableReadFileToProto(file_path, &tensorflow_graph);
2. Create a session:
tensorflow::Status s = session->Create(tensorflow_graph);
3. Run the prediction or inference and get the outputs:
std::string input_layer = "input";
std::string output_layer = "output";
std::vector<tensorflow::Tensor> outputs;
tensorflow::Status run_status = session->Run(
    {{input_layer, image_tensor}},
    {output_layer}, {}, &outputs);
4. Fetch the output data:
tensorflow::Tensor* output = &outputs[0];

TF Mobile demo on iOS

In order to build the demo on iOS, you need Xcode 7.3 or later. Follow these steps to build the iOS demo apps:

1. Check out the TensorFlow code in a tensorflow folder in your home directory.
2. Open a terminal window and execute the following commands from your home folder to download the Inception V1 model, extract the label and graph files, and move these files into the data folders inside the sample app code:
$ mkdir -p ~/Downloads
$ curl -o ~/Downloads/inception5h.zip https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip && unzip ~/Downloads/inception5h.zip -d ~/Downloads/inception5h
$ cp ~/Downloads/inception5h/* ~/tensorflow/tensorflow/examples/ios/benchmark/data/
$ cp ~/Downloads/inception5h/* ~/tensorflow/tensorflow/examples/ios/camera/data/
$ cp ~/Downloads/inception5h/* ~/tensorflow/tensorflow/examples/ios/simple/data/
3. Navigate to one of the sample folders and download the experimental pod:
$ cd ~/tensorflow/tensorflow/examples/ios/camera
$ pod install
4. Open the Xcode workspace:
$ open tf_simple_example.xcworkspace
5. Run the sample app in the device simulator. The sample app will appear with a Run Model button. The camera app requires an Apple device to be connected, while the other two can run in a simulator too.

TensorFlow Lite

TF Lite is the new kid on the block and, at the time of writing this book, is still in developer preview. TF Lite is a very small subset of TensorFlow Mobile and TensorFlow, so the binaries compiled with TF Lite are very small in size and deliver superior performance. Apart from reducing the size of binaries, TensorFlow employs various other techniques, such as:

The kernels are optimized for various device and mobile architectures
The values used in the computations are quantized
The activation functions are pre-fused
It leverages specialized machine learning software or hardware available on the device, such as the Android NN API

The workflow for using models in TF Lite is as follows:

1. Get the model: You can train your own model or pick a pre-trained model available from different sources, use the pre-trained model as is, retrain it with your own data, or retrain it after modifying some parts of the model. As long as you have a trained model in a file with the extension .pb or .pbtxt, you are good to proceed to the next step. We learned how to save models in the previous chapters.
2. Checkpoint the model: The model file only contains the structure of the graph, so you need to save the checkpoint file. The checkpoint file contains the serialized variables of the model, such as weights and biases. We learned how to save a checkpoint in the previous chapters.
3. Freeze the model: The checkpoint and the model files are merged, which is also known as freezing the graph. TensorFlow provides the freeze_graph tool for this step, which can be executed as follows:
$ freeze_graph --input_graph=mymodel.pb --input_checkpoint=mycheckpoint.ckpt --input_binary=true --output_graph=frozen_model.pb --output_node_names=mymodel_nodes
4. Convert the model: The frozen model from step 3 needs to be converted to TF Lite format with the toco tool provided by TensorFlow:
$ toco --input_file=frozen_model.pb --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --input_type=FLOAT --input_arrays=input_nodes --output_arrays=mymodel_nodes --input_shapes=n,h,w,c
5. The .tflite model saved in step 4 can now be used inside an Android or iOS app that employs the TFLite binary for inference. The process of including the TFLite binary in your app is continuously evolving, so we recommend that the reader follows the information at this link to include the TFLite binary in your Android or iOS app: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/g3doc

Generally, you would use the graph_transforms:summarize_graph tool to prune the model obtained in step 1. The pruned model will only have the paths that lead from input to output at the time of inference or prediction. Any other nodes and paths that are required only for training or for debugging purposes, such as saving checkpoints, are removed, thus making the size of the final model very small.

The official TensorFlow repository comes with a TF Lite demo that uses a pre-trained mobilenet to classify the input from the device camera into 1001 categories. The demo app displays the probabilities of the top three categories.

TF Lite demo on Android

To build a TF Lite demo on Android, follow these steps:

1. Install Android Studio. We installed Android Studio on Ubuntu 16.04 from the instructions at the following link: https://developer.android.com/studio/install.html
2. Check out the TensorFlow repository, and apply the patch mentioned in the previous tip. Let's assume you checked out the code in the tensorflow folder in your home directory.
3. Using Android Studio, open the Android project from the path ~/tensorflow/tensorflow/contrib/lite/java/demo. If it complains about a missing SDK or Gradle components, please install those components and sync Gradle.
4. Build the project and run it on a virtual device with API > 21.

We received the following warnings, but the build succeeded. You may want to resolve the warnings if the build fails:

Warning: The Jack toolchain is deprecated and will not run. To enable support for Java 8 language features built into the plugin, remove 'jackOptions { ... }' from your build.gradle file, and add
android.compileOptions.sourceCompatibility 1.8
android.compileOptions.targetCompatibility 1.8
Note: Future versions of the plugin will not support usage of 'jackOptions' in build.gradle. To learn more, go to https://d.android.com/r/tools/java-8-support-message.html

Warning: The specified Android SDK Build Tools version (26.0.1) is ignored, as it is below the minimum supported version (26.0.2) for Android Gradle Plugin 3.0.1. Android SDK Build Tools 26.0.2 will be used. To suppress this warning, remove "buildToolsVersion '26.0.1'" from your build.gradle file, as each version of the Android Gradle Plugin now has a default version of the build tools.
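Under the hood, the demo drives the TFLite binary through the Java Interpreter API. A minimal sketch of that call path is shown below; the model file, tensor shapes, and label count are placeholder assumptions (1001 matching the mobilenet demo mentioned above):

import org.tensorflow.lite.Interpreter;
// ...
// modelFile: a java.io.File pointing to your converted .tflite model
try (Interpreter tflite = new Interpreter(modelFile)) {
    float[][][][] input = new float[1][224][224][3];  // hypothetical input shape
    float[][] output = new float[1][1001];            // one score per category
    tflite.run(input, output);                        // fills output with class probabilities
}

The Interpreter takes care of allocating tensors and dispatching to the optimized kernels, so the app code stays as small as the snippet above.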
TF Lite demo on iOS

In order to build the demo on iOS, you need Xcode 7.3 or later. Follow these steps to build the iOS demo apps:

1. Check out the TensorFlow code in a tensorflow folder in your home directory.
2. Build the TF Lite binary for iOS from the instructions at this link: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite
3. Navigate to the sample folder and download the pod:
$ cd ~/tensorflow/tensorflow/contrib/lite/examples/ios/camera
$ pod install
4. Open the Xcode workspace:
$ open tflite_camera_example.xcworkspace
5. Run the sample app in the device simulator.

We learned about using TensorFlow models in mobile applications and on mobile devices. TensorFlow provides two ways to run on mobile devices: TF Mobile and TF Lite. We learned how to build TF Mobile and TF Lite apps for iOS and Android, using the TensorFlow demo apps as examples.

If you found this post useful, do check out the book Mastering TensorFlow 1.x to skill up for building smarter, faster, and efficient machine learning and deep learning systems.

The 5 biggest announcements from TensorFlow Developer Summit 2018
Getting started with Q-learning using TensorFlow
Implement Long-short Term Memory (LSTM) with TensorFlow
Microsoft open sources Infer.NET, its popular model-based machine learning framework

Melisha Dsouza
08 Oct 2018
3 min read
Last week, Microsoft open sourced Infer.NET, the cross-platform framework used for model-based machine learning. This popular machine learning engine, used in Office, Xbox, and Azure, will be available on GitHub under the permissive MIT license for free use in commercial applications.

Features of Infer.NET

The team at Microsoft Research in Cambridge initially envisioned Infer.NET as a research tool and released it for academic use in 2008. The framework has served as a base for hundreds of published papers across a variety of fields, including information retrieval and healthcare. The team then started using the framework as a machine learning engine within a wide range of Microsoft products.

A model-based approach to machine learning

Infer.NET allows users to incorporate domain knowledge into their model. The framework can be used to build bespoke machine learning algorithms directly from that model. To sum it up, this framework actually constructs a learning algorithm for users based on the model they have provided.

Facilitates interpretability

Infer.NET also facilitates interpretability. If users have designed the model themselves, and the learning algorithm follows that model, they can understand why the system behaves in a particular way or makes certain predictions.

Probabilistic approach

In Infer.NET, models are described using a probabilistic program, which is used to describe real-world processes in a language that machines understand. Infer.NET compiles the probabilistic program into high-performance code implementing something cryptically called deterministic approximate Bayesian inference. This approach allows a notable amount of scalability. For instance, it can be used in a system that automatically extracts knowledge from billions of web pages, comprising petabytes of data.

Additional features

The framework also supports online learning, that is, the ability of the system to learn as new data arrives. The team is also working towards developing and growing it further. Infer.NET will become a part of ML.NET (the machine learning framework for .NET developers). They have already set up the repository under the .NET Foundation and moved the package and namespaces to Microsoft.ML.Probabilistic. Being cross-platform, Infer.NET supports .NET Framework 4.6.1, .NET Core 2.0, and Mono 5.0. Windows users get to use Visual Studio 2017, while macOS and Linux folks have command-line options, which can be incorporated into the code wrangler of their choice.

Download the framework to learn more about Infer.NET. You can also check the documentation for a detailed user guide. To know more about this news, head over to Microsoft's official blog.
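To make the model-based approach concrete, here is a brief sketch of the classic two-coins example in C#; the namespace reflects the Microsoft.ML.Probabilistic move mentioned above, and the example is illustrative rather than taken from the announcement:

using System;
using Microsoft.ML.Probabilistic.Models;

class TwoCoins
{
    static void Main()
    {
        // The probabilistic program: two fair coins and their conjunction.
        Variable<bool> firstCoin = Variable.Bernoulli(0.5);
        Variable<bool> secondCoin = Variable.Bernoulli(0.5);
        Variable<bool> bothHeads = firstCoin & secondCoin;

        // The engine compiles this model into an inference algorithm.
        InferenceEngine engine = new InferenceEngine();
        Console.WriteLine("P(both heads) = " + engine.Infer(bothHeads));
    }
}

Running it prints a Bernoulli distribution with probability 0.25, which is exactly what the hand-written model implies; the same pattern scales to the bespoke models described above.

Microsoft announces new Surface devices to enhance user productivity, with style and elegance
Neural Network Intelligence: Microsoft's open source automated machine learning toolkit
Microsoft's new neural text-to-speech service lets machines speak like people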
How PyTorch is bridging the gap between research and production at Facebook: PyTorch team at F8 conference

Vincy Davis
04 Dec 2019
7 min read
PyTorch, the machine learning library which was originally developed as a research framework by a Facebook intern in 2017, has now grown into a popular deep learning workflow. One of the most loved products by Facebook, PyTorch is free, open source, and used for applications like computer vision and natural language processing (NLP). At the F8 conference held this year, the PyTorch team, consisting of Joe Spisak, the project manager for PyTorch at Facebook AI, and Dmytro Dzhulgakov, the tech lead at Facebook AI, gave a talk on how Facebook is developing and scaling AI experiences with PyTorch.

Spisak describes PyTorch as an eager, define-by-run execution model: when a user executes Python code, the graph is generated on the fly. Execution is dynamic in nature, while still allowing compilation into a static graph. The dynamic neural networks are accessible, thus allowing the user to change parameters very quickly. This feature comes in handy for applications like control flow in NLP. Another important feature of PyTorch, according to Spisak, is the ability to accurately train distributed models with close to a billion parameters, including cutting-edge ones. It also has a simple and intuitive API. This is one of the qualities that has endeared PyTorch to many developers, claims Spisak.

Become a pro at Deep Learning with PyTorch! If you want to become an expert in building and training neural network models with high speed and flexibility in text, vision, and advanced analytics using PyTorch 1.x, read our book Deep Learning with PyTorch 1.x - Second Edition written by Sri. Yogesh K., Laura Mitchell, et al. It will give you an insight into solving real-world problems using CNNs, RNNs, and LSTMs, along with discovering state-of-the-art modern deep learning architectures, such as ResNet, DenseNet, and Inception.

How PyTorch is bridging the gap between research and production at Facebook

Dzhulgakov points out how general advances in AI are driven by innovative research in academia and industry, and why it's necessary to bridge the big lag between research and production. He says, "If you have a new idea and you want to take it all the way through to deployment, you usually need to go through multiple steps - figure out what the approach is and then find the training data maybe prepare massage it a little bit. Actually, build and train your model after that and then there is this painful step of transferring your model to a production environment which often historically involved reimplementation of a lot of code so you can actually take and deploy it and scale-up."

According to Dzhulgakov, PyTorch is trying to minimize this big gap by encouraging advances and experimentation in the field, so that research is brought into production in a few days instead of months.

Challenges in bringing research to production

The following are the various classes of challenges associated with bringing research to production, according to the PyTorch team.

Hardware efficiency: In a tight latency-constrained environment, users are required to fit all the work into the hardware performance budget. On the other hand, underused hardware leads to an increase in cost.

Scalability: In Facebook's recent work, Dzhulgakov says, they have trained on billions of public images, showing significant accuracy gains as compared to regular datasets like ImageNet.
Similarly, when models are taken to inference, billions of inferences per second run with multiple diverse models sharing the same hardware.

Cross-platform: Neural networks are mostly not isolated, as they need to be deployed inside their target application. They have a lot of interdependence with the surrounding code and application, which poses different constraints: the target may not be able to run Python code, or may offer very constrained compute capabilities, as on a mobile device, and more.

Reliability: A lot of PyTorch jobs run for multiple weeks on hundreds of GPUs, hence it is important to design reliable software that can tolerate hardware failures and deliver results.

How PyTorch is tackling these challenges

In order to tackle the above-listed challenges, Dzhulgakov says, Facebook develops systems that can take a training job and apply performance-focused optimizations to the performance-critical pieces. The system also applies "recipes for reliability" so that developer-written modeling code is automatically transformed. The Jit package comes into the picture here: it is a key component, built to capture the structure of the Python program with minimal changes. The main goal of the Jit package is to make this process almost seamless. He asserts that PyTorch has been successful since it feels like regular programming in Python, and most of its users start developing in traditional PyTorch mode (eager mode), just by writing and prototyping in the program. "For the subset of promising models which show the results you need to bring to production or scale up, you can apply techniques provided by Jit to existing model code and annotate it in order to run in so-called script mode."

The Jit operates on a restricted subset of Python semantics, which allows it to apply transparent transformations to the user's eager-mode code. Annotating means adding a few lines of Python code on top of a function, and it can be done incrementally, in a function-by-function or module-by-module fashion. This hybrid approach ensures that the model keeps working along the way. Such powerful PyTorch tools permit the user to share the same code base between research and production environments.
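As an illustration of what such an annotation looks like, here is a brief sketch using the public torch.jit API; the function and threshold are made-up examples, not code from the talk:

import torch

@torch.jit.script
def clipped_relu(x: torch.Tensor, threshold: float) -> torch.Tensor:
    # data-dependent control flow like this is captured by the Jit,
    # so the same function runs eagerly and as a compiled script
    if bool(x.max() > threshold):
        x = torch.clamp(x, max=threshold)
    return torch.relu(x)

print(clipped_relu(torch.randn(4), 1.0))

Because the decorator is just a few lines on top of an ordinary function, it can be rolled out incrementally, matching the function-by-function workflow described above.

Next, Dzhulgakov observes that the common factor between research and production is that both teams of developers work on the same code base, built on top of PyTorch. Thus, code is shared among teams working on a common domain, such as text classification, object detection, or reinforcement learning. These developers prototype models, train new algorithms, and address new tasks, quickly transitioning this functionality to the opposite environment. Watch the full talk to see Dzhulgakov's examples of PyTorch bridging the gap between research and production at Facebook.

If you want to become an expert at implementing deep learning applications in PyTorch, check out our latest book Deep Learning with PyTorch 1.x - Second Edition written by Sri. Yogesh K., Laura Mitchell, and et al. This book will show you how to apply neural networks to domains such as computer vision and NLP. It will also guide you to build, train, and scale a model with PyTorch and cover complex neural networks such as GANs and autoencoders for producing text and images.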
NVIDIA releases Kaolin, a PyTorch library to accelerate research in 3D computer vision and AI
Introducing ESPRESSO, an open-source, PyTorch based, end-to-end neural automatic speech recognition (ASR) toolkit for distributed training across GPUs
Facebook releases PyTorch 1.3 with named tensors, PyTorch Mobile, 8-bit model quantization, and more
Transformers 2.0: NLP library with deep interoperability between TensorFlow 2.0 and PyTorch, and 32+ pretrained models in 100+ languages
PyTorch announces the availability of PyTorch Hub for improving machine learning research reproducibility
DynamoDB Best Practices

Packt
15 Sep 2015
24 min read
In this article by Tanmay Deshpande, the author of the book DynamoDB Cookbook, we will cover the following topics:

Using a standalone cache for frequently accessed items
Using AWS ElastiCache for frequently accessed items
Compressing large data before storing it in DynamoDB
Using AWS S3 for storing large items
Catching DynamoDB errors
Performing auto-retries on DynamoDB errors
Performing atomic transactions on DynamoDB tables
Performing asynchronous requests to DynamoDB

Introduction

We are going to talk about DynamoDB implementation best practices, which will help you improve performance while reducing operation cost. So let's get started.

Using a standalone cache for frequently accessed items

In this recipe, we will see how to use a standalone cache for frequently accessed items. A cache is a temporary data store, which saves items in memory and serves them from memory instead of making a DynamoDB call. Note that this should be used only for items that you don't expect to change frequently.

Getting ready

We will perform this recipe using Java libraries, so the prerequisite is that you should have performed the recipes that use the AWS SDK for Java.

How to do it…

Here, we will be using the AWS SDK for Java, so create a Maven project with the SDK dependency. Apart from the SDK, we will also be using one of the most widely used open source caches, EhCache. To know more about EhCache, refer to http://ehcache.org/. Let's use a standalone cache for frequently accessed items:

To use EhCache, we need to include the following repository in pom.xml:

<repositories>
  <repository>
    <id>sourceforge</id>
    <name>sourceforge</name>
    <url>https://oss.sonatype.org/content/repositories/sourceforge-releases/</url>
  </repository>
</repositories>

We will also need to add the following dependency:

<dependency>
  <groupId>net.sf.ehcache</groupId>
  <artifactId>ehcache</artifactId>
  <version>2.9.0</version>
</dependency>

Once the project setup is done, we will create a cache manager class, which will be used in the following code:

public class ProductCacheManager {
  // EhCache cache manager
  CacheManager cacheManager = CacheManager.getInstance();
  private Cache productCache;

  public Cache getProductCache() {
    return productCache;
  }

  // create an instance of the cache using the cache manager
  public ProductCacheManager() {
    cacheManager.addCache("productCache");
    this.productCache = cacheManager.getCache("productCache");
  }

  public void shutdown() {
    cacheManager.shutdown();
  }
}

Now, we will create another class where we will write the code to get the item from DynamoDB. Here, we first initiate the ProductCacheManager:

static ProductCacheManager cacheManager = new ProductCacheManager();

Next, we will write a method to get the item from DynamoDB. Before we fetch the data from DynamoDB, we first check whether the item with the given key is available in the cache. If it is available, we return it from the cache itself. If the item is not found in the cache, we first fetch it from DynamoDB and immediately put it into the cache.
Once the item is cached, every time we need it, we will get it from the cache, unless the cached item is evicted:

private static Item getItem(int id, String type) {
  Item product = null;
  if (cacheManager.getProductCache().isKeyInCache(id + ":" + type)) {
    Element prod = cacheManager.getProductCache().get(id + ":" + type);
    product = (Item) prod.getObjectValue();
    System.out.println("Returning from Cache");
  } else {
    AmazonDynamoDBClient client = new AmazonDynamoDBClient(new ProfileCredentialsProvider());
    client.setRegion(Region.getRegion(Regions.US_EAST_1));
    DynamoDB dynamoDB = new DynamoDB(client);
    Table table = dynamoDB.getTable("product");
    product = table.getItem(new PrimaryKey("id", id, "type", type));
    cacheManager.getProductCache().put(new Element(id + ":" + type, product));
    System.out.println("Making DynamoDB Call for getting the item");
  }
  return product;
}

Now we can use this method whenever needed. Here is how we can test it:

Item product = getItem(10, "book");
System.out.println("First call :Item: " + product);
Item product1 = getItem(10, "book");
System.out.println("Second call :Item: " + product1);
cacheManager.shutdown();

How it works…

EhCache is one of the most popular standalone caches used in the industry. Here, we are using EhCache to store frequently accessed items from the product table. The cache keeps all its data in memory, and we save every cached item against its key. The product table has composite hash and range keys, so we store each item against the combination of its hash and range keys. Note that caching should be used only for tables that expect few updates, that is, tables holding mostly static data. If you use a cache for tables that are not static, you will get stale data. You can also go to the next level and implement a time-based cache, which holds the data for a certain time and clears it afterwards. We can also implement algorithms, such as Least Recently Used (LRU) or First In First Out (FIFO), to make the cache more efficient. This way, we make comparatively fewer calls to DynamoDB and, ultimately, save some cost for ourselves.

Using AWS ElastiCache for frequently accessed items

In this recipe, we will do the same thing that we did in the previous recipe. The only change is that we will use a cloud-hosted distributed caching solution instead of a local standalone cache. ElastiCache is a hosted caching solution provided by Amazon Web Services. There are two options for the caching technology: one is Memcached and the other is Redis. Depending upon your requirements, you can decide which one to use. Here are links that will give you more information on the two options:

http://memcached.org/
http://redis.io/

Getting ready

To get started with this recipe, we will need to have an ElastiCache cluster launched. If you are not aware of how to do it, you can refer to http://aws.amazon.com/elasticache/.

How to do it…

Here, I am using a Memcached cluster. You can choose the size of the instance as you wish. We will need a Memcached client to access the cluster. Amazon has provided a compiled version of the Memcached client, which can be downloaded from https://github.com/amazonwebservices/aws-elasticache-cluster-client-memcached-for-java.
Using AWS ElastiCache for frequently accessed items

In this recipe, we will do the same thing that we did in the previous one. The only change is that we will use a cloud-hosted, distributed caching solution instead of a local standalone cache. ElastiCache is a hosted caching solution provided by Amazon Web Services. There are two caching technologies to select from: one option is Memcached and the other is Redis. Depending upon your requirements, you can decide which one to use. Here are links with more information on the two options:

- http://memcached.org/
- http://redis.io/

Getting ready

To get started with this recipe, we will need to have an ElastiCache cluster launched. If you are not aware of how to do it, you can refer to http://aws.amazon.com/elasticache/.

How to do it…

Here, I am using a Memcached cluster. You can choose the size of the instance as you wish. We will need a Memcached client to access the cluster. Amazon has provided a compiled version of the Memcached client, which can be downloaded from https://github.com/amazonwebservices/aws-elasticache-cluster-client-memcached-for-java. Once the JAR download is complete, you can add it to your Java project's class path:

1. To start with, we will need to get the configuration endpoint of the Memcached cluster that we launched. This configuration endpoint can be found on the AWS ElastiCache console itself. Here is how we can save the configuration endpoint and port:

   static String configEndpoint = "my-elastic-cache.mlvymb.cfg.usw2.cache.amazonaws.com";
   static Integer clusterPort = 11211;

2. Similarly, we can instantiate the Memcached client:

   static MemcachedClient client;
   static {
     try {
       client = new MemcachedClient(new InetSocketAddress(configEndpoint, clusterPort));
     } catch (IOException e) {
       e.printStackTrace();
     }
   }

3. Now, we can write the getItem method as we did for the previous recipe. Here, we will first check whether the item is present in the cache; if not, we will fetch it from DynamoDB and put it into the cache. If the same request comes the next time, we will return it from the cache itself. While putting the item into the cache, we also set the expiry time of the item. We are going to set it to 3,600 seconds; that is, after 1 hour, the key entry will be deleted automatically:

   private static Item getItem(int id, String type) {
     Item product = null;
     if (null != client.get(id + ":" + type)) {
       System.out.println("Returning from Cache");
       return (Item) client.get(id + ":" + type);
     } else {
       AmazonDynamoDBClient dynamoDBClient = new AmazonDynamoDBClient(
           new ProfileCredentialsProvider());
       dynamoDBClient.setRegion(Region.getRegion(Regions.US_EAST_1));
       DynamoDB dynamoDB = new DynamoDB(dynamoDBClient);
       Table table = dynamoDB.getTable("product");
       product = table.getItem(new PrimaryKey("id", id, "type", type));
       System.out.println("Making DynamoDB Call for getting the item");
       // Cache the item with a one-hour expiry
       client.add(id + ":" + type, 3600, product);
     }
     return product;
   }

How it works…

A distributed cache works in the same fashion as a local one. A standalone cache keeps the data in memory and returns it if it finds the key. In a distributed cache, we have multiple nodes, and the keys are kept in a distributed manner, divided between nodes based on the hash values of the keys. So, when any request comes, it is redirected to the node holding the key and the value is returned from there.

Note that ElastiCache will give you faster retrieval of items at the additional cost of the ElastiCache cluster. Also note that the preceding code will work only if you execute the application from an EC2 instance; if you try to execute it on a local machine, you will get connection errors.

Compressing large data before storing it in DynamoDB

We are all aware of DynamoDB's storage limitation on item size. Suppose that we get into a situation where storing large attributes in an item is a must. In that case, it's always a good choice to compress these attributes and then save them in DynamoDB. In this recipe, we are going to see how to compress large items before storing them.

Getting ready

To get started with this recipe, you should have your workstation ready with Eclipse or any other IDE of your choice.

How to do it…

There are numerous algorithms with which we can compress large items, for example, GZIP, LZO, BZ2, and so on. Each algorithm has a trade-off between compression time and compression ratio, so it's your choice whether to go with a faster algorithm or with one that provides a higher compression ratio. Consider a scenario in our e-commerce website where we need to save the product reviews written by various users.
For this, we created a ProductReviews table, where we will save the reviewer's name, the detailed product review, and the time when the review was submitted. Here, there are chances that the product review messages will be large, and it would not be a good idea to store them as they are. So, it is important to understand how to compress these messages before storing them.

Let's see how to compress large data:

1. First of all, we will write a method that accepts a string input and returns a compressed byte buffer. Here, we are using the GZIP algorithm for compression. Java has built-in support for it, so we don't need any third-party library:

   private static ByteBuffer compressString(String input)
       throws UnsupportedEncodingException, IOException {
     // Write the input as a GZIP output stream using UTF-8 encoding
     ByteArrayOutputStream baos = new ByteArrayOutputStream();
     GZIPOutputStream os = new GZIPOutputStream(baos);
     os.write(input.getBytes("UTF-8"));
     os.finish();
     byte[] compressedBytes = baos.toByteArray();
     // Write the bytes to a byte buffer
     ByteBuffer buffer = ByteBuffer.allocate(compressedBytes.length);
     buffer.put(compressedBytes, 0, compressedBytes.length);
     buffer.position(0);
     return buffer;
   }

2. Now, we can simply use this method to compress the data before saving it in DynamoDB. Here is an example of how to use it in our code:

   private static void putReviewItem()
       throws UnsupportedEncodingException, IOException {
     AmazonDynamoDBClient client = new AmazonDynamoDBClient(
         new ProfileCredentialsProvider());
     client.setRegion(Region.getRegion(Regions.US_EAST_1));
     DynamoDB dynamoDB = new DynamoDB(client);
     Table table = dynamoDB.getTable("ProductReviews");
     Item product = new Item()
         .withPrimaryKey(new PrimaryKey("id", 10))
         .withString("reviewerName", "John White")
         .withString("dateTime", "20-06-2015T08:09:30")
         .withBinary("reviewMessage", compressString("My Review Message"));
     PutItemOutcome outcome = table.putItem(product);
     System.out.println(outcome.getPutItemResult());
   }

3. In a similar way, we can write a method that decompresses the data on retrieval from DynamoDB. Here is an example:

   private static String uncompressString(ByteBuffer input) throws IOException {
     byte[] bytes = input.array();
     ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
     ByteArrayOutputStream baos = new ByteArrayOutputStream();
     GZIPInputStream is = new GZIPInputStream(bais);
     int chunkSize = 1024;
     byte[] buffer = new byte[chunkSize];
     int length = 0;
     while ((length = is.read(buffer, 0, chunkSize)) != -1) {
       baos.write(buffer, 0, length);
     }
     return new String(baos.toByteArray(), "UTF-8");
   }

How it works…

Compressing data on the client side has numerous advantages. Smaller size means less use of network and disk resources. Compression algorithms generally maintain a dictionary of words; while compressing, if they see words getting repeated, those words are replaced by their positions in the dictionary. In this way, redundant data is eliminated and only references to it are kept in the compressed string. While uncompressing the same data, the word references are replaced with the actual words, and we get our normal string back. Different compression algorithms use different techniques, so the algorithm you choose will depend on your needs.

Using AWS S3 for storing large items

Sometimes, we might get into a situation where storing data in a compressed format is still not sufficient.
Consider a case where we need to store large images or binaries that exceed DynamoDB's per-item storage limitation. In this case, we can use AWS S3 to store such items and only save the S3 location in our DynamoDB table. AWS S3 (Simple Storage Service) allows us to store data in a cheap and efficient manner. To know more about AWS S3, you can visit http://aws.amazon.com/s3/.

Getting ready

To get started with this recipe, you should have your workstation ready with the Eclipse IDE.

How to do it…

Consider a case in our e-commerce website where we would like to store the product images along with the product data. So, we will save the images on AWS S3 and only store their locations along with the product information in the product table:

1. First of all, we will see how to store data in AWS S3. For this, we need to go to the AWS console and create an S3 bucket. Here, I created a bucket called e-commerce-product-images, and inside this bucket, I created folders to store the images, for example, /phone/apple/iphone6.

2. Now, let's write the code to upload the images to S3:

   private static void uploadFileToS3() {
     String bucketName = "e-commerce-product-images";
     String keyName = "phone/apple/iphone6/iphone.jpg";
     String uploadFileName = "C:\\tmp\\iphone.jpg";
     // Create an instance of the S3 client
     AmazonS3 s3client = new AmazonS3Client(new ProfileCredentialsProvider());
     // Start the file upload
     File file = new File(uploadFileName);
     s3client.putObject(new PutObjectRequest(bucketName, keyName, file));
   }

3. Once the file is uploaded, you can save its path in one of the attributes of the product table, as follows:

   private static void putItemWithS3Link() {
     AmazonDynamoDBClient client = new AmazonDynamoDBClient(
         new ProfileCredentialsProvider());
     client.setRegion(Region.getRegion(Regions.US_EAST_1));
     DynamoDB dynamoDB = new DynamoDB(client);
     Table table = dynamoDB.getTable("productTable");

     Map<String, String> features = new HashMap<String, String>();
     features.put("camera", "13MP");
     features.put("intMem", "16GB");
     features.put("processor", "Dual-Core 1.4 GHz Cyclone (ARM v8-based)");

     Set<String> imagesSet = new HashSet<String>();
     imagesSet.add("https://s3-us-west-2.amazonaws.com/e-commerce-product-images/phone/apple/iphone6/iphone.jpg");

     Item product = new Item()
         .withPrimaryKey(new PrimaryKey("id", 250, "type", "phone"))
         .withString("mnfr", "Apple").withNumber("stock", 15)
         .withString("name", "iPhone 6").withNumber("price", 45)
         .withMap("features", features)
         .withStringSet("productImages", imagesSet);
     PutItemOutcome outcome = table.putItem(product);
     System.out.println(outcome.getPutItemResult());
   }

4. So whenever required, we can fetch the item by its key and fetch the actual images from S3 using the URL saved in the productImages attribute.

How it works…

AWS S3 provides storage services at very cheap rates. It's like a flat data dumping ground where we can store any type of file. So, it's always a good option to store large datasets in S3 and only keep their URL references in DynamoDB attributes. The URL reference is the connecting link between the DynamoDB item and the S3 file. If your file is too large to be sent in one S3 client call, you may want to explore the S3 multipart API, which allows you to send the file in chunks, as sketched below.
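As a hedged illustration of that multipart option, the AWS SDK's TransferManager splits large files into parts and uploads them in parallel. The bucket, key, and file paths below reuse the placeholders from the steps above:

import java.io.File;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.Upload;

TransferManager tm = new TransferManager(new ProfileCredentialsProvider());
// upload() returns immediately; the transfer continues in background threads
Upload upload = tm.upload("e-commerce-product-images",
    "phone/apple/iphone6/iphone.jpg", new File("C:\\tmp\\iphone.jpg"));
try {
  upload.waitForCompletion(); // block until all parts have been uploaded
} catch (InterruptedException e) {
  e.printStackTrace();
}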
Catching DynamoDB errors

Till now, we discussed how to perform various operations in DynamoDB. We saw how to use the AWS-provided SDK and play around with DynamoDB items and attributes. Amazon claims that AWS provides high availability and reliability, which is quite true considering my years of experience using their services, but we still cannot rule out the possibility that services such as DynamoDB might not perform as expected. So, it's important to have a proper error-catching mechanism in place to ensure that the disaster recovery system kicks in when needed. In this recipe, we are going to see how to catch such errors.

Getting ready

To get started with this recipe, you should have your workstation ready with the Eclipse IDE.

How to do it…

Catching errors in DynamoDB is quite easy. Whenever we perform any operation, we need to put it in a try block. Along with it, we need to add a couple of catch blocks in order to catch the errors. Here, we will consider a simple operation to put an item into the DynamoDB table:

   try {
     AmazonDynamoDBClient client = new AmazonDynamoDBClient(
         new ProfileCredentialsProvider());
     client.setRegion(Region.getRegion(Regions.US_EAST_1));
     DynamoDB dynamoDB = new DynamoDB(client);
     Table table = dynamoDB.getTable("productTable");
     Item product = new Item()
         .withPrimaryKey(new PrimaryKey("id", 10, "type", "mobile"))
         .withString("mnfr", "Samsung").withNumber("stock", 15)
         .withBoolean("isProductionStopped", true)
         .withNumber("price", 45);
     PutItemOutcome outcome = table.putItem(product);
     System.out.println(outcome.getPutItemResult());
   } catch (AmazonServiceException ase) {
     System.out.println("Error Message: " + ase.getMessage());
     System.out.println("HTTP Status Code: " + ase.getStatusCode());
     System.out.println("AWS Error Code: " + ase.getErrorCode());
     System.out.println("Error Type: " + ase.getErrorType());
     System.out.println("Request ID: " + ase.getRequestId());
   } catch (AmazonClientException e) {
     System.out.println("Amazon Client Exception :" + e.getMessage());
   }

We should first catch AmazonServiceException, which is raised if the service you are trying to access throws an exception. AmazonClientException should be put last in order to catch any client-related exceptions.

How it works…

Amazon assigns a unique request ID to each and every request that it receives. Keeping this request ID is very important if something goes wrong: if you would like to know what happened, this request ID is the only source of information, and we need to contact Amazon to learn more about it. There are two types of errors in AWS:

- Client errors: These errors normally occur when the request we submit is incorrect. Client errors are shown with a status code starting with 4XX. They typically occur when there is an authentication failure, a bad request, a missing required attribute, or when the provisioned throughput is exceeded. In short, these errors occur when users provide invalid inputs.
- Server errors: These errors occur when there is something wrong on Amazon's side, and they occur at runtime. The only way to handle such errors is to retry; if that does not succeed, you should log the request ID, and then you can reach Amazon support with that ID to learn more about the details.

You can read more about DynamoDB-specific errors at http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ErrorHandling.html. A sketch of acting differently on these two error types follows.
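Here is a minimal sketch, not part of the original recipe, of acting on that distinction. AmazonServiceException exposes the error type, so we can retry only when the failure originated on the service side:

try {
  PutItemOutcome outcome = table.putItem(product);
  System.out.println(outcome.getPutItemResult());
} catch (AmazonServiceException ase) {
  if (ase.getErrorType() == AmazonServiceException.ErrorType.Service) {
    // 5XX: the problem is on Amazon's side, so a retry may succeed;
    // log the request ID in case we need to contact support later
    System.out.println("Server error, retrying. Request ID: " + ase.getRequestId());
  } else {
    // 4XX: the request itself is invalid; retrying the same request will not help
    System.out.println("Client error, fix the request: " + ase.getErrorCode());
  }
}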
Performing auto-retries on DynamoDB errors

As mentioned in the previous recipe, we can perform auto-retries on DynamoDB requests if we get errors. In this recipe, we are going to see how to perform auto-retries.

Getting ready

To get started with this recipe, you should have your workstation ready with the Eclipse IDE.

How to do it…

Auto-retries are required if we get any errors during the first request. We can use the Amazon client configurations to set our retry strategy. By default, the DynamoDB client auto-retries a request three times if an error is generated. If we think that this is not sufficient for us, then we can define the strategy on our own, as follows:

1. First of all, we need to create a custom implementation of RetryCondition. It contains a method called shouldRetry, which we need to implement as per our needs. Here is a sample CustomRetryCondition class:

   public class CustomRetryCondition implements RetryCondition {
     public boolean shouldRetry(AmazonWebServiceRequest originalRequest,
         AmazonClientException exception, int retriesAttempted) {
       if (retriesAttempted < 3 && exception.isRetryable()) {
         return true;
       } else {
         return false;
       }
     }
   }

2. Similarly, we can implement a CustomBackoffStrategy. The back-off strategy gives a hint on how long to wait before a request is retried. You can choose either a flat back-off time or an exponential back-off time:

   public class CustomBackoffStrategy implements BackoffStrategy {
     /** Base sleep time (milliseconds) **/
     private static final int SCALE_FACTOR = 25;

     /** Maximum exponential back-off time before retrying a request */
     private static final int MAX_BACKOFF_IN_MILLISECONDS = 20 * 1000;

     public long delayBeforeNextRetry(AmazonWebServiceRequest originalRequest,
         AmazonClientException exception, int retriesAttempted) {
       if (retriesAttempted < 0) return 0;
       long delay = (1 << retriesAttempted) * SCALE_FACTOR;
       delay = Math.min(delay, MAX_BACKOFF_IN_MILLISECONDS);
       return delay;
     }
   }

3. Next, we need to create an instance of RetryPolicy and set the RetryCondition and BackoffStrategy classes that we created. Apart from this, we can also set a maximum number of retries. The last parameter is honorMaxErrorRetryInClientConfig, which controls whether this retry policy should honor the maximum error retry count set by ClientConfiguration.setMaxErrorRetry(int):

   RetryPolicy retryPolicy = new RetryPolicy(customRetryCondition,
       customBackoffStrategy, 3, false);

4. Now, initiate the ClientConfiguration and set the RetryPolicy we created earlier:

   ClientConfiguration clientConfiguration = new ClientConfiguration();
   clientConfiguration.setRetryPolicy(retryPolicy);

5. Finally, we need to set this client configuration when we initiate the AmazonDynamoDBClient; once done, your retry policy with a custom back-off strategy will be in place:

   AmazonDynamoDBClient client = new AmazonDynamoDBClient(
       new ProfileCredentialsProvider(), clientConfiguration);

How it works…

Auto-retries are quite handy when we receive a sudden burst in DynamoDB requests. If there are more requests than the provisioned throughput can serve, auto-retries with an exponential back-off strategy will definitely help in handling the load: if the client gets an exception, the request is retried automatically after some time, and if by then the load has dropped, your application loses nothing. The Amazon DynamoDB client internally uses HttpClient to make the calls, which is quite a popular and reliable implementation. In the case of batch operations, if any failure occurs, DynamoDB does not fail the complete operation. For batch write operations, if a particular operation fails, DynamoDB returns the unprocessed items, which can be retried, as sketched below.
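To make that last point concrete, here is a hedged sketch of resubmitting unprocessed items after a batch write; populating requestItems is left as a placeholder, and the loop is the part that matters:

Map<String, List<WriteRequest>> requestItems = new HashMap<String, List<WriteRequest>>();
// ... populate requestItems with PutRequest/DeleteRequest entries for your tables ...
BatchWriteItemResult result = client.batchWriteItem(
    new BatchWriteItemRequest().withRequestItems(requestItems));
// Resubmit whatever DynamoDB could not process; the client's retry policy
// and back-off strategy space out these repeated calls
while (!result.getUnprocessedItems().isEmpty()) {
  result = client.batchWriteItem(new BatchWriteItemRequest()
      .withRequestItems(result.getUnprocessedItems()));
}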
Performing atomic transactions on DynamoDB tables

I hope we are all aware that operations in DynamoDB are eventually consistent. Considering this nature, it obviously does not support transactions the way we do in an RDBMS. A transaction is a group of operations that need to be performed in one go, and they should be handled in an atomic manner: if one operation fails, the complete transaction should be rolled back. There might be use cases where you need to perform transactions in your application. Considering this need, AWS has provided an open source, client-side transaction library, which helps us achieve atomic transactions in DynamoDB. In this recipe, we are going to see how to perform transactions on DynamoDB.

Getting ready

To get started with this recipe, you should have your workstation ready with the Eclipse IDE.

How to do it…

To get started, we will first need to download the source code of the library from GitHub and build it to generate the JAR file. You can download the code from https://github.com/awslabs/dynamodb-transactions/archive/master.zip:

1. Extract the code and run the following command to generate the JAR file:

   mvn clean install -DskipTests

2. On a successful build, you will see a JAR file generated in the target folder. Add this JAR to the project by choosing Configure Build Path in Eclipse.

3. Now, let's understand how to use transactions. For this, we need to create the DynamoDB client and use it to create two helper tables. The first table is the Transactions table, which stores the transactions, while the second is the TransactionImages table, which keeps snapshots of the items modified in a transaction:

   AmazonDynamoDBClient client = new AmazonDynamoDBClient(
       new ProfileCredentialsProvider());
   client.setRegion(Region.getRegion(Regions.US_EAST_1));

   // Create the transactions table
   TransactionManager.verifyOrCreateTransactionTable(client,
       "Transactions", 10, 10, (long) (10 * 60));

   // Create the transaction images table
   TransactionManager.verifyOrCreateTransactionImagesTable(client,
       "TransactionImages", 10, 10, (long) (60 * 10));

4. Next, we need to create a transaction manager by providing the names of the tables we created earlier:

   TransactionManager txManager = new TransactionManager(client,
       "Transactions", "TransactionImages");

5. Now, we create one transaction and perform the operations we need to do in one go. Consider our product table, where we need to add two new products in one single transaction, and the changes should be reflected only if both operations are successful. We can do this using transactions, as follows. Note that each item needs its own AttributeValue instance; reusing one instance for both items would silently overwrite the first item's key:

   Transaction t1 = txManager.newTransaction();

   Map<String, AttributeValue> product = new HashMap<String, AttributeValue>();
   AttributeValue id = new AttributeValue();
   id.setN("250");
   product.put("id", id);
   product.put("type", new AttributeValue("phone"));
   product.put("name", new AttributeValue("MI4"));
   t1.putItem(new PutItemRequest("productTable", product));

   Map<String, AttributeValue> product1 = new HashMap<String, AttributeValue>();
   AttributeValue id1 = new AttributeValue();
   id1.setN("350");
   product1.put("id", id1);
   product1.put("type", new AttributeValue("phone"));
   product1.put("name", new AttributeValue("MI3"));
   t1.putItem(new PutItemRequest("productTable", product1));

   t1.commit();

6. Now, execute the code to see the results. If everything goes fine, you will see two new entries in the product table. In case of an error, neither of the entries will be in the table.
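As a hedged sketch, not shown in the original recipe, a failure around commit can be handled by rolling the transaction back explicitly; the transaction library exposes a rollback() method for this purpose:

Transaction t2 = txManager.newTransaction();
try {
  t2.putItem(new PutItemRequest("productTable", product));
  t2.putItem(new PutItemRequest("productTable", product1));
  t2.commit();
} catch (Exception e) {
  // Undo any partially applied writes so that neither item becomes visible
  t2.rollback();
  System.out.println("Transaction rolled back: " + e.getMessage());
}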
How it works…

When invoked, the transaction library first writes the changes to the Transactions table and then to the actual table. If we perform an update item operation, it keeps the old values of that item in the TransactionImages table. The library also supports multi-attribute and multi-table transactions. This way, we can use the transaction library to perform atomic writes. It also supports isolated reads. You can refer to the code and examples for more details at https://github.com/awslabs/dynamodb-transactions.

Performing asynchronous requests to DynamoDB

Till now, we have used a synchronous DynamoDB client to make requests to DynamoDB. Synchronous requests block the thread until the operation is complete. Due to network issues, it can sometimes take a while for an operation to finish. In such cases, we can go for asynchronous client requests so that we can submit a request and do some other work in the meantime.

Getting ready

To get started with this recipe, you should have your workstation ready with the Eclipse IDE.

How to do it…

The asynchronous client is easy to use:

1. First, we need to instantiate the AmazonDynamoDBAsync client:

   AmazonDynamoDBAsync dynamoDBAsync = new AmazonDynamoDBAsyncClient(
       new ProfileCredentialsProvider());

2. Next, we need to create the request to be performed in an asynchronous manner. Let's say we need to delete a certain item from our product table. Then, we can create the DeleteItemRequest, as shown in the following code snippet:

   Map<String, AttributeValue> key = new HashMap<String, AttributeValue>();
   AttributeValue id = new AttributeValue();
   id.setN("10");
   key.put("id", id);
   key.put("type", new AttributeValue("phone"));
   DeleteItemRequest deleteItemRequest = new DeleteItemRequest(
       "productTable", key);

3. Next, invoke the deleteItemAsync method to delete the item. Here, we can optionally define an AsyncHandler if we want to use the result of the request we invoked. I am also printing the messages with timestamps so that we can confirm the asynchronous nature of the calls:

   dynamoDBAsync.deleteItemAsync(deleteItemRequest,
       new AsyncHandler<DeleteItemRequest, DeleteItemResult>() {
         public void onSuccess(DeleteItemRequest request, DeleteItemResult result) {
           System.out.println("Item deleted successfully: " + System.currentTimeMillis());
         }

         public void onError(Exception exception) {
           System.out.println("Error deleting item in async way");
         }
       });
   System.out.println("Delete item initiated " + System.currentTimeMillis());

How it works…

The asynchronous client uses AsyncHttpClient to invoke the DynamoDB APIs. This is a wrapper implementation on top of the Java asynchronous APIs, so it is quite easy to use and understand. The AsyncHandler is an optional configuration you can use in order to consume the results of asynchronous calls. We can also use the Java Future object to handle the response.

Summary

We have covered various recipes on the cost- and performance-efficient use of DynamoDB. Recipes like error handling and auto-retries help readers make their applications robust. We also highlighted the use of the transaction library in order to implement atomic transactions on DynamoDB.

Further resources on this subject:

- The EMR Architecture
- Amazon DynamoDB - Modelling relationships, Error handling
- Index, Item Sharding, and Projection in DynamoDB

How to perform regression analysis using SAS

Gebin George
27 Feb 2018
7 min read
[box type="note" align="" class="" width=""]This article is an excerpt from the book, Big Data Analysis with SAS, written by David Pope. This book will help you leverage the power of SAS for data management, analysis and reporting. It contains practical use-cases and real-world examples on predictive modelling, forecasting, optimizing, and reporting your Big Data analysis using SAS.[/box]

Today, we will perform regression analysis using SAS in a step-by-step manner with a practical use-case.

Regression analysis is one of the earliest predictive techniques most people learn because it can be applied across a wide variety of problems dealing with data that is related in linear and non-linear ways. Linear data is one of the easier use cases, and as such PROC REG is a well-known and often-used procedure to help predict likely outcomes before they happen.

The REG procedure provides extensive capabilities for fitting linear regression models that involve individual numeric independent variables. Many other procedures can also fit regression models, but they focus on more specialized forms of regression, such as robust regression, generalized linear regression, nonlinear regression, nonparametric regression, quantile regression, regression modeling of survey data, regression modeling of survival data, and regression modeling of transformed variables. The SAS/STAT procedures that can fit regression models include the ADAPTIVEREG, CATMOD, GAM, GENMOD, GLIMMIX, GLM, GLMSELECT, LIFEREG, LOESS, LOGISTIC, MIXED, NLIN, NLMIXED, ORTHOREG, PHREG, PLS, PROBIT, QUANTREG, QUANTSELECT, REG, ROBUSTREG, RSREG, SURVEYLOGISTIC, SURVEYPHREG, SURVEYREG, TPSPLINE, and TRANSREG procedures. Several procedures in SAS/ETS software also fit regression models. (SAS/STAT 14.2 / SAS/STAT User's Guide, Introduction to Regression Procedures, Overview: http://documentation.sas.com/?cdcId=statcdc&cdcVersion=14.2&docsetId=statug&docsetTarget=statug_introreg_sect001.htm&locale=en&showBanner=yes)

Regression analysis attempts to model the relationship between a response or output variable and a set of input variables. The response is the target variable, the variable one is trying to predict, while the rest of the input variables make up the parameters used as input to the algorithm; they are used to derive the predicted value for the response variable.

PROC REG

One of the easiest ways to determine whether regression analysis is applicable to a question is to check whether the question has only two answers. For example, should a bank lend an applicant money: yes or no? This is known as a binary response, and as such, regression analysis can be applied to help determine the answer.

In the following example, the reader will use the SASHELP.BASEBALL dataset to create a regression model to predict the value of a baseball player's salary. The SASHELP.BASEBALL dataset contains salary and performance information for Major League Baseball players who played at least one game in both the 1986 and 1987 seasons, excluding pitchers. The salaries (Sports Illustrated, April 20, 1987) are for the 1987 season and the performance measures are from 1986 (Collier Books, The 1987 Baseball Encyclopedia Update). (SAS/STAT 14.2 / SAS/STAT User's Guide, Example 99: Modeling Salaries of Major League Baseball Players: http://documentation.sas.com/?cdcId=statcdc&cdcVersion=14.2&docsetId=statug&docsetTarget=statug_reg_examples01.htm&locale=en&showBanner=yes)
Let's first use PROC UNIVARIATE to learn something about this baseball data by submitting the following code:

   proc univariate data=sashelp.baseball;
   quit;

While reviewing the output, the reader will notice that the variance associated with logSalary, 0.79066, is much less than the variance associated with the actual target variable Salary, 203508. In this case, it makes better sense to attempt to predict the logSalary value of a player instead of Salary. Write the following code in a SAS Studio program section and submit it:

   proc reg data=sashelp.baseball;
      id name team league;
      model logSalary = nAtBat nHits nHome nRuns nRBI YrMajor
                        CrAtBat CrHits CrHome CrRuns CrRbi;
   quit;

Notice that there are 59 observations, as reported in the first output table, with missing values in at least one of the input variables; those observations are not used in the development of the regression model. The Root Mean Squared Error (RMSE) and R-square are statistics that typically inform the analyst how good the model is at predicting the target. R-square ranges from 0 to 1.0, with higher values typically indicating a better model. However, a high R-square can sometimes reflect over-fitting rather than true predictive power. Over-fitting can happen when an analyst doesn't have enough real-life data and chooses data, or a sample of data, that over-represents the target event; such a model performs poorly when real-world data is used as input.

Since several of the input variables appear to have little predictive power on the target, an analyst may decide to drop them, thereby reducing the amount of information needed to make a decent prediction. In this case, it appears we only need four input variables: YrMajor, nHits, nRuns, and nAtBat. Modify the code as follows and submit it again:

   proc reg data=sashelp.baseball;
      id name team league;
      model logSalary = YrMajor nHits nRuns nAtBat;
   quit;

The p-value associated with each input variable provides the analyst with insight into which variables have the biggest impact on predicting the target variable; the smaller the value, the higher the predictive value of the input variable. Both the RMSE and R-square values for this second model are slightly lower than those of the original. However, the adjusted R-square value is slightly higher. In this case, an analyst may choose to use the second model, since it requires much less data and provides basically the same predictive power.

Prior to accepting any model, an analyst should determine whether a few observations may be over-influencing the results by investigating the influence and fit diagnostics. The default output from PROC REG provides this type of visual insight. The top-right corner plot, showing the externally studentized residuals (RStudent) by leverage values, shows that there are a few observations with high leverage that may be overly influencing the fit produced. In order to investigate this further, we will add a plots statement to our PROC REG to produce a labeled version of this plot.
Type the following code in a SAS Studio program section and submit it:

   proc reg data=sashelp.baseball plots(only label)=(RStudentByLeverage);
      id name team league;
      model logSalary = YrMajor nHits nRuns nAtBat;
   quit;

Sure enough, there are three to five individuals whose input variables may have excessive influence on fitting this model. Let's remove those points and see if the model improves. Type this code in a SAS Studio program section and submit it:

   proc reg data=sashelp.baseball plots=(residuals(smooth));
      where name NOT IN ("Mattingly, Don", "Henderson, Rickey",
                         "Boggs, Wade", "Davis, Eric", "Rose, Pete");
      id name team league;
      model logSalary = YrMajor nHits nRuns nAtBat;
   quit;

This change, in itself, has not improved the model; it has actually made the model worse, as can be seen in the R-square value of 0.5592. However, the plots=(residuals(smooth)) option gives some insight as it pertains to YrMajor: players at the beginning and the end of their careers tend to be paid less compared to others, as can be seen in Figure 4.12. In order to address this lack of fit, an analyst can use a polynomial of degree two for the YrMajor variable. Type the following code in a SAS Studio program section and submit it:

   data work.baseball;
      set sashelp.baseball;
      where name NOT IN ("Mattingly, Don", "Henderson, Rickey",
                         "Boggs, Wade", "Davis, Eric", "Rose, Pete");
      YrMajor2 = YrMajor*YrMajor;
   run;

   proc reg data=work.baseball;
      id name team league;
      model logSalary = YrMajor YrMajor2 nHits nRuns nAtBat;
   quit;

After removing some outliers and adjusting for the YrMajor variable, the model's predictive power has improved significantly, as can be seen in the much improved R-square value of 0.7149.

We saw an effective way of performing regression analysis using the SAS platform. If you found our post useful, do check out this book, Big Data Analysis with SAS, to understand other data analysis models and perform them practically using SAS.

How to build a Scatterplot in IBM SPSS

Kartikey Pandey
27 Nov 2017
4 min read
[box type="note" align="" class="" width=""]The following excerpt is from the title Data Analysis with IBM SPSS Statistics, Chapter 5, written by Kenneth Stehlik-Barry and Anthony J. Babinec. Analytical tools such as SPSS can readily provide even a novice user with an overwhelming amount of information and a broad range of options for analyzing patterns in the data.[/box]

In this article we help you learn the techniques of SPSS to build a scatterplot using the Chart Builder feature. One of the most valuable methods for examining the relationship between two variables containing scale-level data is a scatterplot. In the previous chapter, scatterplots were used to detect points that deviated from the typical pattern, that is, multivariate outliers.

To produce a similar scatterplot using two fields from the 2016 General Social Survey data, navigate to Graphs | Chart Builder. An information box is displayed indicating that each field's measurement properties will be used to identify the types of graphs available, so adjusting these properties is advisable. In this example, the properties will be modified as a part of the graph specification process, but you may want to alter the properties of some variables permanently so that they don't need to be changed for each use. For now, just select OK to move ahead.

In the main Chart Builder window, select Scatter/Dot from the menu at the lower left, double-click on the first graph to the right (Simple Scatter) to place it in the preview pane at the upper right, and then right-click on the first field labeled HIGHEST YEAR OF SCHOOL. Change this variable from Nominal to Scale. After changing the respondent's education to Scale, drag this field to the X-Axis location in the preview pane and drag spouse's education to the Y-Axis location. Once both elements are in place, the OK choice becomes available; select it to produce the scatterplot.

The scatterplot produced by default provides some sense of the trend, in that the denser circles are concentrated in a band from the lower left to the upper right. This pattern, however, is rather subtle visually. With some editing, the relationship can be made more evident. Double-click on the graph to open the Chart Editor, select the X icon at the top, and change the major increment to 4 so that there are numbers corresponding to completing high school and college. Do the same for the y-axis values.

Select a point on the graph to highlight all the "dots" and right-click to display the marker dialog. Click on the Marker tab and change the symbol to the star shape, increase the size to 6, increase the border to 2, and change the border color to a dark blue. Use Apply to make the changes visible on the scatterplot. Then use the Add Fit Line at Total icon above the graph to show the regression line for this data. Drag the R2 box from the upper right to the bottom, below the graph, and drag the box with the equation displayed to the lower left, away from the points.

These modifications make the pattern easier to see, since the "stars" near the line are darker and denser than those farther from it, indicating that fewer cases are associated with the distant points. The SPSS scatterplot capabilities covered in this article will give you a foundation for creating visual representations of data, both for deeper pattern discovery and for communicating results to a broader audience.
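If you prefer command syntax to the Chart Builder dialogs, a minimal sketch of the equivalent command is shown below. The variable names educ and speduc are assumptions standing in for the respondent's and spouse's years of schooling in your General Social Survey file; substitute the names used in your dataset:

* Hypothetical syntax equivalent of the Chart Builder steps above.
GRAPH
  /SCATTERPLOT(BIVAR)=educ WITH speduc
  /MISSING=LISTWISE.

The fit line, marker styling, and axis increments would still be applied in the Chart Editor as described above.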
Several other graph types, such as pie charts and multiple line charts, can be built and edited using the approach shown in Chapter 5, Visually Exploring the Data, from our title Data Analysis with IBM SPSS Statistics. Go on, explore these alternative graph styles to see when they may be better suited to your needs.

How to use M functions within Microsoft Power BI for querying data

Amarabha Banerjee
21 May 2018
10 min read
Microsoft Power BI Desktop contains a rich set of data source connectors and transformation capabilities that support the integration and enhancement of source data. These features are all driven by a powerful functional language and query engine, M, which leverages source system resources when possible and can greatly extend the scope and robustness of the data retrieval process beyond the possibilities of the standard query editor interface alone. As with almost all BI projects, the design and development of the data access and retrieval process has great implications for the analytical value, scalability, and sustainability of the overall Power BI solution.

[box type="note" align="" class="" width=""]Our article is an excerpt from the book Microsoft Power BI Cookbook, written by Brett Powell. This book shows how to leverage Microsoft Power BI and the development tools to create better data driven analytics and visualizations.[/box]

In this article, we dive into Power BI Desktop's Get Data experience and go through the process of establishing and managing data source connections and queries. Examples are provided of using the Query Editor interface and the M language directly to construct and refine queries to meet common data transformation and cleansing needs. In practice, and as per the examples, a combination of both tools is recommended to aid the query development process.

Viewing and analyzing M functions

Every time you click on a button to connect to any of Power BI Desktop's supported data sources or apply any transformation to a data source object, such as changing a column's data type, one or more M expressions are created reflecting your choices. These M expressions are automatically written to dedicated M documents and, if saved, are stored within the Power BI Desktop file as queries. M is a functional programming language like F#, and it's important that Power BI developers become familiar with analyzing, and later writing and enhancing, the M code that supports their queries.

Getting ready

Build a query through the user interface that connects to the AdventureWorksDW2016CTP3 SQL Server database on the ATLAS server and retrieves the DimGeography table, filtered to United States for English:

1. Click on Get Data from the Home tab of the ribbon, select SQL Server from the list of database sources, and provide the server and database names. For the Data Connectivity mode, select Import.
2. A navigation window will appear, with the different objects and schemas of the database. Select the DimGeography table from the Navigation window and click on Edit.
3. In the Query Editor window, select the EnglishCountryRegionName column and then filter on United States from its dropdown (Figure 2: Filtering for United States only in the Query Editor).

At this point, a preview of the filtered table is exposed in the Query Editor, and the Query Settings pane displays the steps applied so far (Figure 3: The Query Settings pane in the Query Editor).

How to do it…

Formula Bar

With the Formula Bar visible in the Query Editor, click on the Source step under Applied Steps in the Query Settings pane.
You should see the formula expression created for the Source step, the Sql.Database() function call (Figure 4). Click on the Navigation step to expose the metadata record created for that step (Figure 5). The navigation expression (2) references the source expression (1).

The Formula Bar in the Query Editor displays individual query steps, which are technically individual M expressions. It's convenient, and very often essential, to view and edit all the expressions in a centralized window, and for this there's the Advanced Editor. M is a functional language, and it can be useful to think of query evaluation in M as similar to Excel spreadsheet formulas, in which multiple formulas can reference each other. The M engine can determine which expressions are required by the final expression to return, and evaluates only those expressions.

When configuring Power BI development tools, the display settings for both the Query Settings pane and the Formula Bar should be enabled under GLOBAL | Query Editor options (Figure 6: Global layout options for the Query Editor). Alternatively, on a per-file basis, you can control these settings and others from the View tab of the Query Editor toolbar (Figure 7: Property settings of the View tab in the Query Editor).

Advanced Editor window

Given its importance to the query development process, the Advanced Editor dialog is exposed on both the Home and View tabs of the Query Editor. It's recommended to use the Query Editor when getting started with a new query and when learning the M language. After several steps have been applied, use the Advanced Editor to review and optionally enhance or customize the M query. As a rich, functional programming language, M has many functions and optional parameters not exposed via the Query Editor; going beyond the limits of the Query Editor enables more robust data retrieval and integration processes.

Click on Advanced Editor from either the View or Home tabs (Figure 8 and Figure 9, respectively). All M function expressions and any comments are exposed (Figure 9: The Advanced Editor view of the DimGeography query); a sketch of the full query text follows below.

When developing retrieval processes for Power BI models, consider these common ETL questions:

- How are our queries impacting the source systems?
- Can we make our retrieval queries more resilient to changes in source data, such that they avoid failure?
- Is our retrieval process efficient and simple to follow and support, or are there unnecessary steps and queries?
- Are our retrieval queries delivering sufficient performance to the BI application?
- Is our process flexible, such that we can quickly apply changes to data sources and logic?

M queries are not intended as a substitute for the workloads typically handled by enterprise ETL tools such as SSIS or Informatica. However, just as BI professionals would carefully review the logic and test the performance of SQL stored procedures and ETL packages supporting their cubes and reports, they should also review the M queries created to support Power BI models and reports.
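For reference, the full query text that the Advanced Editor displays for this recipe should look roughly like the following sketch; the step names are the defaults generated by the Query Editor, and your server and database names may differ:

let
    Source = Sql.Database("ATLAS", "AdventureWorksDW2016CTP3"),
    dbo_DimGeography = Source{[Schema = "dbo", Item = "DimGeography"]}[Data],
    #"Filtered Rows" = Table.SelectRows(dbo_DimGeography,
        each [EnglishCountryRegionName] = "United States")
in
    #"Filtered Rows"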
How it works…

Two of the top performance and scalability features of M's engine are query folding and lazy evaluation. If possible, the M queries developed in Power BI Desktop are converted (folded) into SQL statements and passed to source systems for processing. M can also reduce the resources required for a given query by ignoring any unnecessary or redundant steps (variables).

M is a case-sensitive language. This includes referencing variables in M expressions (RenameColumns versus Renamecolumns) as well as the values in M queries. For example, the values "Apple" and "apple" are considered unique values in an M query; the Table.Distinct() function will not remove rows for one of the values.

Variable names in M expressions cannot have spaces without a hash sign and double quotes. Per Figure 10, when the Query Editor graphical interface is used to create M queries, this syntax is applied automatically, along with a name describing the M transformation applied. Applying short, descriptive variable names (with no spaces) improves the readability of M queries.

Query folding

The query from this recipe was "folded" into a SQL statement and sent to the ATLAS server for processing (Figure 10: The SQL statement generated from the DimGeography M query). Right-click on the Filtered Rows step and select View Native Query to access the Native Query window (Figure 11: View Native Query in Query Settings). Finding and revising queries that are not being folded to source systems is a top technique for enhancing large Power BI datasets. See the Pushing Query Processing Back to Source Systems recipe of Chapter 11, Enhancing and Optimizing Existing Power BI Solutions, for an example of this process.

M query structure

The great majority of queries created for Power BI will follow the let...in structure, as per this recipe, since they contain multiple steps with dependencies among them. Individual expressions are separated by commas, and the expression referred to following the in keyword is the expression returned by the query. The individual step expressions are technically variables, and if the identifiers for these variables (the names of the query steps) contain spaces, then the step is placed in quotes and prefixed with a # sign, as per the Filtered Rows step shown in the sketch above.

Lazy evaluation

The M engine also has powerful "lazy evaluation" logic for ignoring any redundant or unnecessary variables, as well as short-circuiting evaluation (computation) once a result is determinate, such as when one side (operand) of an OR logical operator is computed as True. The order of evaluation of the expressions is determined at runtime; it doesn't have to be sequential from top to bottom. In the following example, a step for retrieving Canada was added and the step for the United States was ignored. Since the CanadaOnly variable satisfies the overall let expression of the query, only the Canada query is issued to the server, as if the United States row were commented out or didn't exist (Figure 12: Revised query that ignores the Filtered Rows step to evaluate Canada only). View Native Query is not available given this revision, but a SQL Profiler trace against the source database server (and a refresh of the M query) confirms that CanadaOnly was the only SQL query passed to the source database.
Figure 13: Capturing the SQL statement passed to the server via a SQL Server Profiler trace

There's more…

Partial query folding

- A query can be "partially folded", in which a SQL statement is created resolving only part of the overall query.
- The results of this SQL statement are returned to Power BI Desktop (or the on-premises data gateway), and the remaining logic is computed using M's in-memory engine with local resources.
- M queries can be designed to maximize the use of source system resources by using standard expressions supported by query folding early in the query process.
- Minimizing the use of local or on-premises data gateway resources is a top consideration.

Limitations of query folding

- No folding will take place once a native SQL query has been passed to the source system, for example, by passing a SQL query directly through the Get Data dialog. Such a query, specified in the Get Data dialog, is included in the Source step (Figure 14: Providing a user-defined native SQL query), and any transformations applied after it will use local system resources. The general implication for query development with native or user-defined SQL queries is that, if they're used, you should try to include all required transformations in them (that is, joins and derived columns), or use them to exploit an important feature of the source database not used by the folded query, such as an index.
- Not all data sources support query folding; text and Excel files, for example, do not.
- Not all transformations available in the Query Editor or via M functions directly are supported by some data sources.
- The privacy levels defined for the data sources will also impact whether folding is used or not.
- SQL statements are not parsed before they're sent to the source system.
- The Table.Buffer() function can be used to avoid query folding: the table output of this function is loaded into local memory, and transformations against it remain local.

We have discussed effective techniques for accessing and retrieving data using Microsoft Power BI. Do check out the book Microsoft Power BI Cookbook for more information on using Microsoft Power BI for data analysis and visualization.

Further resources on this subject:

- Expert Interview: Unlocking the secrets of Microsoft Power BI
- Tutorial: Building a Microsoft Power BI Data Model
- Expert Insights: Ride the third wave of BI with Microsoft Power BI

TensorFlow.js contributor Kai Sasaki on how TensorFlow.js eases web-based machine learning application development

Sugandha Lahoti
28 Nov 2019
6 min read
Running machine learning applications in the web browser is one of the hottest trends in software development right now, and many notable machine learning projects are being built with TensorFlow.js. It is one of the most popular frameworks for building performant machine learning applications that run smoothly in a web browser. Recently, we spoke with Kai Sasaki, one of the initial contributors to TensorFlow.js. He talked about current and future versions of TF.js, how it compares to other browser-based ML tools, and his contributions to the community. He also shared his views on why he thinks JavaScript is good for machine learning. If you are a web developer with a working knowledge of JavaScript who wants to learn how to integrate machine learning techniques with web-based applications, we recommend you read the book Hands-On Machine Learning with TensorFlow.js. This hands-on course covers important aspects of machine learning with TensorFlow.js using practical examples; throughout the course, you'll learn how different algorithms work and follow step-by-step instructions to implement them.

On how TensorFlow.js has improved web-based machine learning

How do you think machine learning for the web has evolved in the last 2-3 years? What are some current applications of web-based machine learning and TensorFlow.js? What can we expect in future releases?

Machine learning on the web platform is a field attracting more developers and machine learning practitioners. There are two reasons. First, the web platform is universally available; the web browser mostly provides us a way to access the underlying resources transparently. The second reason is security. Training a model on the client side means you can keep sensitive data inside the client environment, as the entire training process is completed on the client side itself. The data is not sent to the cloud, making it more secure and less susceptible to vulnerabilities or hacking. In future releases as well, TensorFlow.js is expected to provide more secure and accessible functionality. You can find various kinds of TensorFlow.js based applications here.

How does TensorFlow.js compare with other web and browser-based machine learning tools? Does it make web-based machine learning application development easier?

The most significant advantage of TensorFlow.js is its full compatibility with the TensorFlow ecosystem. Not only can a TensorFlow model be seamlessly used in TensorFlow.js, but tools for visualization and model deployment in the TensorFlow ecosystem can also be used with TensorFlow.js.

TensorFlow 2 was released in October. What are some new changes made specific to TensorFlow.js as a part of TF 2.0 that machine learning developers will find useful? What are your first impressions of this new release?

Although there is nothing special related to TensorFlow 2.0 itself, full support for new backends, such as WASM and WebGPU, is under active development. These hardware acceleration mechanisms provided by the web platform can enhance performance for any TensorFlow.js application. It surely makes the potential of TensorFlow.js stronger and the range of possible use cases broader.

On Kai's experience working on his book, Hands-On Machine Learning with TensorFlow.js

Tell us the motivation behind writing your book Hands-On Machine Learning with TensorFlow.js. What are some of your favorite chapters/projects from the book?

TensorFlow.js does not have much history; only three years have passed since its initial publication.
Due to the lack of resources for learning TensorFlow.js usage, I was motivated to write a book illustrating how to use TensorFlow.js practically. I think chapters 4 to 9 of my book Hands-On Machine Learning with TensorFlow.js provide readers with good material to practice writing ML applications with TensorFlow.js.

Why JavaScript for machine learning

Why do you think JavaScript is good for machine learning? What are some of the good machine learning packages available in JavaScript? How does it compare to other languages like Python, R, and MATLAB, especially in terms of performance?

JavaScript is the primary programming language of the web platform, so it can work as a bridge between the web and machine learning applications. We have several other libraries working similarly; for example, machinelearn.js is a general machine learning framework running in JavaScript. Although JavaScript is not a highly performant language, its universal availability on the web platform is attractive to developers, as they can build machine learning applications that are "write once, run anywhere". We can compare the performance practically by running state-of-the-art machine learning models such as MobileNet or ResNet.

On his contribution towards TF.js

You are a contributor to TensorFlow.js and were awarded by the Google Open Source Peer Bonus Program. What were your main contributions? How was your experience working on TF.js?

One of the significant contributions I made was the fast Fourier transformation operations. I created the initial implementations of fft, ifft, rfft, and irfft. I also added stft (short-term Fourier transformation). These operators are mainly used for performing signal analysis in audio applications. I have done several bug fixes and test enhancements in TensorFlow.js too.

What are the biggest challenges today in the field of machine learning and AI in web development? What do you see as some of the greatest technology disruptors in the next 5 years?

While many developers write Python in the machine learning field, not many web developers are familiar with machine learning, in spite of the substantial advantage of integrating machine learning with the web platform. I believe machine learning technologies will be democratized among web developers, so that a vast amount of creativity can flourish in the next five years. By cooperating with these enthusiastic developers in the community, I believe machine learning on the client side or edge device will become one of the major contributions to the machine learning field.

About the author

Kai Sasaki works as a software engineer at Treasure Data, building large-scale distributed systems. He is one of the initial contributors to TensorFlow.js and contributes to developing operators for newer machine learning models. He also received the Google Open Source Peer Bonus in 2018. You can find him on Twitter, LinkedIn, and GitHub.

About the book

Hands-On Machine Learning with TensorFlow.js is a comprehensive guide that will help you easily get started with machine learning algorithms and techniques using TensorFlow.js. Throughout the course, you'll learn how different algorithms work and follow step-by-step instructions to implement them through various examples. By the end of this book, you will be able to create and optimize your own web-based machine learning applications.
Further resources on this subject:

- Baidu adds Paddle Lite 2.0, new development kits, EasyDL Pro, and other upgrades to its PaddlePaddle platform
- Introducing Spleeter, a TensorFlow based Python library that extracts voice and sound from any music track
- TensorFlow 2.0 released with tighter Keras integration, eager execution enabled by default, and more!
OpenCV: Tracking Faces with Haar Cascades

Packt
13 May 2013
4 min read
Conceptualizing Haar cascades

When we talk about classifying objects and tracking their location, what exactly are we hoping to pinpoint? What constitutes a recognizable part of an object?

Photographic images, even from a webcam, may contain a lot of detail for our (human) viewing pleasure. However, image detail tends to be unstable with respect to variations in lighting, viewing angle, viewing distance, camera shake, and digital noise. Moreover, even real differences in physical detail might not interest us for the purpose of classification. I was taught in school that no two snowflakes look alike under a microscope. Fortunately, as a Canadian child, I had already learned how to recognize snowflakes without a microscope, as the similarities are more obvious in bulk.

Thus, some means of abstracting image detail is useful in producing stable classification and tracking results. The abstractions are called features, which are said to be extracted from the image data. There should be far fewer features than pixels, though any pixel might influence multiple features. The level of similarity between two images can be evaluated based on distances between the images' corresponding features. For example, distance might be defined in terms of spatial coordinates or color coordinates.

Haar-like features are one type of feature that is often applied to real-time face tracking. They were first used for this purpose by Paul Viola and Michael Jones in 2001. Each Haar-like feature describes the pattern of contrast among adjacent image regions. For example, edges, vertices, and thin lines each generate distinctive features. For any given image, the features may vary depending on the regions' size, which may be called the window size. Two images that differ only in scale should be capable of yielding similar features, albeit for different window sizes. Thus, it is useful to generate features for multiple window sizes. Such a collection of features is called a cascade. We may say a Haar cascade is scale-invariant or, in other words, robust to changes in scale. OpenCV provides a classifier and tracker for scale-invariant Haar cascades, which it expects to be in a certain file format.

Haar cascades, as implemented in OpenCV, are not robust to changes in rotation. For example, an upside-down face is not considered similar to an upright face, and a face viewed in profile is not considered similar to a face viewed from the front. A more complex and more resource-intensive implementation could improve Haar cascades' robustness to rotation by considering multiple transformations of images as well as multiple window sizes. However, we will confine ourselves to the implementation in OpenCV.

Getting Haar cascade data

As part of your OpenCV setup, you probably have a directory called haarcascades. It contains cascades that are trained for certain subjects using tools that come with OpenCV.
The directory's full path depends on your system and method of setting up OpenCV, as follows:

Build from source archive: <unzip_destination>/data/haarcascades
Windows with self-extracting ZIP: <unzip_destination>/data/haarcascades
Mac with MacPorts: /opt/local/share/OpenCV/haarcascades
Mac with Homebrew: The haarcascades directory is not included; to get it, download the source archive
Ubuntu with apt or Software Center: The haarcascades directory is not included; to get it, download the source archive

If you cannot find haarcascades, then download the source archive from http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.3/OpenCV-2.4.3.tar.bz2/download (or the Windows self-extracting ZIP from http://sourceforge.net/projects/opencvlibrary/files/opencvwin/2.4.3/OpenCV-2.4.3.exe/download), unzip it, and look for <unzip_destination>/data/haarcascades.

Once you find haarcascades, create a directory called cascades in the same folder as cameo.py and copy the following files from haarcascades into cascades:

haarcascade_frontalface_alt.xml
haarcascade_eye.xml
haarcascade_mcs_nose.xml
haarcascade_mcs_mouth.xml

As their names suggest, these cascades are for tracking faces, eyes, noses, and mouths. They require a frontal, upright view of the subject. With a lot of patience and a powerful computer, you can make your own cascades, trained for various types of objects.

Creating modules

We should continue to maintain good separation between application-specific code and reusable code. Let's make new modules for tracking classes and their helpers.

A file called trackers.py should be created in the same directory as cameo.py (and, equivalently, in the parent directory of cascades). Let's put the following import statements at the start of trackers.py:

import cv2
import rects
import utils

Alongside trackers.py and cameo.py, let's make another file called rects.py containing the following import statement:

import cv2

Our face tracker and a definition of a face will go in trackers.py, while various helpers will go in rects.py and our preexisting utils.py file.
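Before we build those modules, it may help to see a cascade in action. The following is a minimal sketch, separate from the Cameo application, that loads the frontal face cascade and marks any detected faces in a still image; the input and output filenames are illustrative assumptions, not files from the book:

import cv2

# Load one of the cascade files copied into the cascades directory.
face_cascade = cv2.CascadeClassifier('cascades/haarcascade_frontalface_alt.xml')

# Haar classifiers operate on single-channel intensity data, so convert
# the image to grayscale first.
image = cv2.imread('people.jpg')  # hypothetical input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# detectMultiScale scans a range of window sizes, which is what makes
# the cascade robust to changes in scale.
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=4)

# Draw a rectangle around each detected face and save the result.
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)
cv2.imwrite('people_faces.jpg', image)

Remember that this cascade expects a frontal, upright view; a face in profile or upside down will usually be missed.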

Recurrent neural networks and the LSTM architecture

Richard Gall
11 Apr 2018
2 min read
A recurrent neural network is a class of artificial neural networks in which the connections between nodes form cycles, so information can persist from one step of a sequence to the next. These nodes can be classified as either input, output, or hidden. Input nodes receive data from outside of the network, hidden nodes modify the input data, and output nodes provide the intended results. RNNs are well known for their extensive usage in NLP tasks. The video tutorial above has been taken from Natural Language Processing with Python.

Why are recurrent neural networks well suited for NLP?

What makes RNNs so popular and effective for natural language processing tasks is that they operate sequentially over data sets. For example, a movie review is an arbitrary sequence of letters and characters, which the RNN can take as an input. The subsequent hidden and output layers are also capable of working with sequences. In a basic sentiment analysis example, you might just have a binary output, like classifying movie reviews as positive or negative. RNNs can do more than this: they are capable of generating a sequential output, such as taking an input sentence in English and translating it into Spanish. This ability to process data sequentially is what makes recurrent neural networks so well suited to NLP tasks.

RNNs and long short-term memory

Recurrent neural networks can sometimes become unstable due to the complexity of the connections they are built upon. That's where the LSTM architecture helps. LSTM introduces something called a memory cell. The memory cell simplifies what could otherwise be incredibly complex dynamics by using a series of gates to govern the way its state changes within the network:

The input gate manages inputs
The output gate manages outputs
A self-recurrent connection keeps the memory cell in a consistent state between steps
The forget gate allows the memory cell to 'forget' its previous state

A minimal sketch of this setup, applied to the movie-review example above, follows the Read Next list below.

Read Next

High-level concepts
Neural Network Architectures 101: Understanding Perceptrons
4 ways to enable Continual learning into Neural Networks

Tutorials
Build a generative chatbot using recurrent neural networks (LSTM RNNs)
Training RNNs for Time Series Forecasting
Implement Long-short Term Memory (LSTM) with TensorFlow
How to auto-generate texts from Shakespeare writing using deep recurrent neural networks

Research in this area
Paper in Two minutes: Attention Is All You Need
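As promised above, here is a minimal Keras sketch of the binary sentiment-classification setup described in this article. It is not taken from the video tutorial; the vocabulary size, sequence length, layer sizes, and the dummy data are illustrative assumptions:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 10000  # assumed vocabulary size
MAX_LEN = 200       # assumed review length after padding

model = Sequential()
# Map word indices to dense vectors that the LSTM can consume.
model.add(Embedding(VOCAB_SIZE, 64, input_length=MAX_LEN))
# A single LSTM layer; its input, output, and forget gates are created
# internally and govern how the memory cell's state changes over time.
model.add(LSTM(32))
# Binary output: positive versus negative review.
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Dummy data, just to show the expected input and output shapes.
x = np.random.randint(0, VOCAB_SIZE, size=(8, MAX_LEN))
y = np.random.randint(0, 2, size=(8, 1))
model.fit(x, y, epochs=1, batch_size=4)

Note that the gates are not exposed as separate layers; passing the sequence through LSTM(32) is enough for Keras to create and train them.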