Thus far, we have studied predictive modeling techniques that use a set of features (columns in a tabular dataset) that are pre-defined for the problem at hand. For example, a user account, an internet transaction, a product, or any other item that is important to a business scenario is often described using properties derived from domain knowledge of a particular industry. More complex data, such as a document, can still be transformed into a vector representing something about the words in the text, and images can be represented by matrix factors as we saw in Chapter 6, Words and Pixels – Working with Unstructured Data. However, with both simple and complex data types, we could easily imagine higher-level interactions between features (for example, a user in a certain country and age range using a particular device is more likely to click on a webpage, while none of these three factors alone is predictive...
You're reading from Mastering Predictive Analytics with Python
The core building blocks for the deep learning algorithms we will examine are neural networks, predictive models that simulate the way cells inside the brain fire impulses to transmit signals. By combining individual contributions from many inputs (for example, the many columns we might have in a tabular dataset, words in a document, or pixels in an image), the network integrates signals to predict an output of interest (whether it is price, click-through rate, or some other response). Fitting this sort of model to data therefore involves determining the best parameters of the neurons to perform this mapping from input data to output variable.
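To make the idea of a neuron combining many inputs concrete, here is a minimal sketch of a single artificial neuron in plain Python. It assumes a logistic (sigmoid) activation, which the text above does not specify, and the function name `neuron` along with the feature, weight, and bias values are hypothetical and chosen only for illustration.

```python
import math


def neuron(inputs, weights, bias):
    """Combine the inputs as a weighted sum, then squash the result to (0, 1).

    The sigmoid activation is an assumption made for this sketch.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical example: three input features and hand-picked parameters.
features = [0.5, -1.0, 2.0]
weights = [0.8, 0.2, -0.4]
bias = 0.1

prediction = neuron(features, weights, bias)
print(prediction)
```

Fitting the model amounts to searching for the values of `weights` and `bias` that make outputs like `prediction` match the observed responses as closely as possible.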
Some common features of the deep learning models we will discuss in this chapter are the large number of parameters we can tune and the complexity of the models themselves. Whereas the regression models we have seen so far required us to determine the optimal value of ~50 coefficients, in deep learning models...
For the exercises in this chapter, we will be using the TensorFlow library open-sourced by Google (available at https://www.tensorflow.org/). Installation instructions vary by operating system. Additionally, for Linux systems, it is possible to leverage both the CPU and graphics processing unit (GPU) on your computer to run deep learning models. Because many of the steps in training (such as the multiplications required to update a grid of weight values) involve matrix operations, they can be readily parallelized (and thus accelerated) by using a GPU. However, the TensorFlow library will work on CPU as well, so don't worry if you don't have access to an Nvidia GPU card.
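As a rough illustration of why such updates parallelize well, the sketch below applies one gradient-style update to a small grid of weights in plain Python. The update rule (subtracting a learning rate times the product of an error term and an input value) and the names `update_weights`, `errors`, and `learning_rate` are assumptions made for this example rather than details taken from TensorFlow. The point is only that each cell of the grid is computed independently of the others, which is the kind of work a GPU can perform simultaneously.

```python
def update_weights(weights, inputs, errors, learning_rate):
    """Return a new weight grid: W[i][j] - learning_rate * errors[i] * inputs[j].

    Every cell depends only on its own old value, one error term, and one
    input value, so all cells could be computed at the same time.
    """
    return [
        [w - learning_rate * e * x for w, x in zip(row, inputs)]
        for row, e in zip(weights, errors)
    ]


# Hypothetical 2x2 weight grid with made-up inputs and error terms.
weights = [[0.5, -0.5], [1.0, 0.0]]
inputs = [1.0, 2.0]
errors = [0.5, -1.0]

updated = update_weights(weights, inputs, errors, learning_rate=0.5)
print(updated)
```

On a CPU the nested loops above run one cell at a time; a matrix library backed by a GPU would express the same update as a single matrix operation.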
In this chapter, we introduced deep neural networks as a way to generate models for complex data types where features are difficult to engineer. We examined how neural networks are trained through back-propagation, and why additional layers make this optimization intractable. We discussed solutions to this problem and demonstrated the use of the TensorFlow library to build an image classifier for hand-drawn digits.
Now that you have covered a wide range of predictive models, we will turn in the final two chapters to the last two tasks in generating analytical pipelines: turning the models that we have trained into a repeatable, automated process, and visualizing the results for ongoing insights and monitoring.