1. Introduction to Machine Learning with TensorFlow
In this chapter, you will learn how to create, utilize, and apply linear transformations to the fundamental building blocks of programming with TensorFlow: tensors. You will then utilize tensors to understand the complex concepts associated with neural networks, including tensor reshaping, transposition, and multiplication.
Machine learning (ML) has permeated many aspects of daily life, often without people realizing it. From the recommendations in your social feeds to the results of your online searches, many everyday services are powered by machine learning algorithms. These algorithms began in research environments solving niche problems, but as they became more accessible, their applications broadened to more general use cases. Researchers and businesses of all types recognize the value of using models to optimize their operations. Doctors can use machine learning to inform diagnosis and treatment options, retailers can use ML to get the right products to their stores at the right time, and entertainment companies can use ML to provide personalized recommendations to their customers.
In the age of data, machine learning models have proven to be valuable assets to any data-driven company. The large quantities of data available allow powerful and accurate models to be created...
Implementing Artificial Neural Networks in TensorFlow
The advanced flexibility that TensorFlow offers lends itself well to creating artificial neural networks (ANNs). ANNs are algorithms that are inspired by the connectivity of neurons in the brain and are intended to replicate the process in which humans learn. They consist of layers through which information propagates from the input to the output.
Figure 1.1 shows a visual representation of an ANN. An input layer is on the left-hand side, which, in this example, has two features (X1 and X2). The input layer is connected to the first hidden layer, which has three units. All the data from the previous layer gets passed to each unit in the first hidden layer. The data is then passed to the second hidden layer, which also has three units. Again, the information from each unit of the prior layer is passed to each unit of the second hidden layer. Finally, all the information from the second hidden layer is passed to the output layer...
The TensorFlow Library in Python
TensorFlow can be used in Python by importing certain libraries. You can import libraries in Python using the import statement:
import tensorflow as tf
In the preceding command, you have imported the TensorFlow library and used the shorthand tf.
In the next exercise, you will learn how to import the TensorFlow library and check its version, a necessary first step before you can use the classes and functions that the library supplies.
Exercise 1.01: Verifying Your Version of TensorFlow
In this exercise, you will load TensorFlow and check which version is installed on your system.
Perform the following steps:
- Open a Jupyter notebook to implement this exercise by typing jupyter notebook in the terminal.
- Import the TensorFlow library by entering the following code in the Jupyter cell:
import tensorflow as tf
- Verify the version of TensorFlow using the following command...
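The exact command for this step is not reproduced above. As a minimal sketch, one common way to check the installed version is to read the package's `__version__` attribute; the variable name `version` is only illustrative.

```python
import tensorflow as tf

# The installed version is exposed as a string attribute of the package.
version = tf.__version__
print(version)
```

The printed value depends on your installation, so no particular output is assumed here.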
Introduction to Tensors
Tensors can be thought of as the core components of ANNs—the input data, output predictions, and weights that are learned throughout the training process are all tensors. Information propagates through a series of linear and nonlinear transformations to turn the input data into predictions. This section demonstrates how to apply linear transformations such as additions, transpositions, and multiplications to tensors. Other linear transformations, such as rotations, reflections, and shears, also exist. However, their applications as they pertain to ANNs are less common.
Scalars, Vectors, Matrices, and Tensors
Tensors can be represented as multi-dimensional arrays. The number of dimensions a tensor spans is known as the tensor's rank. Tensors with ranks 0, 1, and 2 are used often and have their own names, which are scalars, vectors, and matrices, respectively, although the term tensors can be used to describe each of them. Figure 1.2 shows...
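The relationship between rank and shape can be sketched in TensorFlow as follows. The values and variable names are arbitrary and chosen only for illustration; `tf.constant`, `tf.zeros`, and `tf.rank` are standard TensorFlow functions.

```python
import tensorflow as tf

scalar = tf.constant(4.0)                        # rank 0, shape ()
vector = tf.constant([1.0, 2.0, 3.0])            # rank 1, shape (3,)
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])   # rank 2, shape (2, 2)
tensor3 = tf.zeros([2, 3, 4])                    # rank 3, shape (2, 3, 4)

# tf.rank returns the number of dimensions as a scalar tensor.
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor3", tensor3)]:
    print(name, "rank:", tf.rank(t).numpy(), "shape:", tuple(t.shape))
```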
Tensors can be added together to create new tensors. You will use the example of matrices in this chapter, but the concept can be extended to tensors with any rank. Matrices may be added to scalars, vectors, and other matrices under certain conditions in a process known as broadcasting. Broadcasting refers to the process of array arithmetic on tensors of different shapes.
Two matrices may be added (or subtracted) if they have the same shape. For such matrix-matrix addition, the resultant matrix is determined by the element-wise addition of the input matrices and therefore has the same shape as the two input matrices. You can define the matrix $Z = [z_{ij}]$ as the matrix sum $Z = X + Y$, where $z_{ij} = x_{ij} + y_{ij}$, so that each element in $Z$ is the sum of the same element in $X$ and $Y$.
Matrix addition is commutative, which means that the order of $X$ and $Y$ does not matter, that is, $X + Y = Y + X$. Matrix addition is also associative, which ...
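Element-wise addition, commutativity, and broadcasting can be sketched as follows. The matrices are arbitrary examples, and the names `row`, `X_plus_scalar`, and `X_plus_row` are introduced here only for illustration.

```python
import tensorflow as tf

X = tf.constant([[1, 2, 3],
                 [4, 5, 6]])
Y = tf.constant([[10, 20, 30],
                 [40, 50, 60]])

# Matrix-matrix addition: both inputs have shape (2, 3), and so does Z.
Z = X + Y  # equivalent to tf.add(X, Y)

# Commutativity: X + Y and Y + X agree element by element.
commutative = tf.reduce_all(tf.equal(X + Y, Y + X)).numpy()

# Broadcasting: a scalar is added to every element, and a vector of
# length 3 is added to every row of the (2, 3) matrix.
row = tf.constant([100, 200, 300])
X_plus_scalar = X + 1
X_plus_row = X + row

print(Z.numpy())
print(X_plus_row.numpy())
```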
Some operations, such as addition, can only be applied to tensors if they meet certain conditions. Reshaping is one method for modifying the shape of tensors so that such operations can be performed. Reshaping takes the elements of a tensor and rearranges them into a tensor of a different size. A tensor of any size can be reshaped so long as the number of total elements remains the same.
For example, a (4x3) matrix can be reshaped into a (6x2) matrix since they both have a total of 12 elements. The rank, or number of dimensions, can also be changed in the reshaping process. For instance, a (4x3) matrix that has a rank equal to 2 can be reshaped into a (3x2x2) tensor that has a rank equal to 3. The (4x3) matrix can also be reshaped into a (12x1) vector in which the rank has changed from 2 to 1.
Figure 1.13 illustrates tensor reshaping. On the left is a tensor with shape (3x2), which can be reshaped to a tensor of shape equal to either ...
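The reshaping rules described in this section can be sketched with `tf.reshape`. The starting matrix is built from `tf.range(12)` purely for illustration. Note that, in TensorFlow, a shape of `[12, 1]` is still reported as rank 2 (a column matrix), whereas a shape of `[12]` is rank 1.

```python
import tensorflow as tf

# A (4x3) matrix holding the 12 values 0..11.
matrix = tf.reshape(tf.range(12), [4, 3])

# Same 12 elements, different shapes.
reshaped_6x2 = tf.reshape(matrix, [6, 2])        # rank 2
reshaped_3x2x2 = tf.reshape(matrix, [3, 2, 2])   # rank 3
column = tf.reshape(matrix, [12, 1])             # rank 2 (one column)
flat = tf.reshape(matrix, [12])                  # rank 1

for t in [matrix, reshaped_6x2, reshaped_3x2x2, column, flat]:
    print(tuple(t.shape), "rank:", tf.rank(t).numpy())
```

Reshaping never changes the order of the elements; it only changes how they are grouped into dimensions.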
Tensor multiplication is another fundamental operation that is used frequently in the process of building and training ANNs since information propagates through the network from the inputs to the result via a series of additions and multiplications. While the rules for addition are simple and intuitive, the rules for tensor multiplication are more complex. Tensor multiplication involves more than simple element-wise multiplication of the elements. Rather, each element of the resulting tensor is calculated as the dot product between an entire row of the first tensor and an entire column of the second. This section will explain how multiplication works for two-dimensional tensors, or matrices. However, tensors of higher orders can also be multiplied.
Given a matrix, $X = [x_{ij}]_{m \times n}$, and another matrix, $Y = [y_{ij}]_{n \times p}$, the product of the two matrices is $Z = XY = [z_{ij}]_{m \times p}$, and each element, $z_{ij}$, is defined element-wise as $z_{ij} = \sum_{k=1}^{n} x_{ik} y_{kj}$. The shape of the resultant...
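To make the row-by-column definition concrete, here is a minimal sketch in plain Python that computes each element as the dot product of a row of X with a column of Y. The helper name `matmul` and the example values are illustrative only; in practice, TensorFlow's `tf.matmul` performs this operation on tensors.

```python
def matmul(X, Y):
    """Multiply an (m x n) matrix by an (n x p) matrix given as nested lists."""
    m, n, p = len(X), len(Y), len(Y[0])
    if any(len(row) != n for row in X):
        raise ValueError("the number of columns of X must equal the number of rows of Y")
    # Each z_ij is the sum over k of x_ik * y_kj.
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


X = [[1, 2, 3],
     [4, 5, 6]]     # shape (2 x 3)
Y = [[7, 8],
     [9, 10],
     [11, 12]]      # shape (3 x 2)

Z = matmul(X, Y)    # shape (2 x 2)
print(Z)            # [[58, 64], [139, 154]]
```

For example, the top-left element is 1*7 + 2*9 + 3*11 = 58.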
In this section, you will learn about some optimization approaches that are fundamental to training machine learning models. Optimization is the process by which the weights of the layers of an ANN are updated such that the error between the predicted values of the ANN and the true values of the training data is minimized.
Forward propagation is the process by which information propagates through ANNs. A series of operations, such as tensor multiplications and additions, occurs at each layer of the network until the final output. Forward propagation is explained in Figure 1.37, which shows an ANN with a single hidden layer. The input data has two features, while the output layer has a single value for each input record.
The weights and biases for the hidden layer and output are shown as matrices and vectors with the appropriate indexes. For the hidden layer, the number of rows in the weight matrix is equal to the number of features of the input, and...
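A forward pass through such a network can be sketched as two matrix multiplications, each followed by a bias addition. The number of hidden units (three), all weight and bias values, and the variable names are assumptions made for illustration and are not taken from the figure. No activation function is applied in this sketch.

```python
import tensorflow as tf

# Two input records, each with two features.
X = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])              # shape (2, 2)

# Hidden layer: rows match the input features, columns are hidden units
# (three units assumed here).
W1 = tf.constant([[0.1, 0.2, 0.3],
                  [0.4, 0.5, 0.6]])        # shape (2, 3)
b1 = tf.constant([0.1, 0.1, 0.1])          # shape (3,)

# Output layer: a single value for each input record.
W2 = tf.constant([[1.0], [1.0], [1.0]])    # shape (3, 1)
b2 = tf.constant([0.5])                    # shape (1,)

hidden = tf.matmul(X, W1) + b1             # shape (2, 3)
output = tf.matmul(hidden, W2) + b2        # shape (2, 1)

print(hidden.shape, output.shape)
print(output.numpy())
```

The bias vectors are broadcast across the records, so every row of the layer output receives the same bias.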
Activation functions are mathematical functions that are generally applied to the outputs of ANN layers to limit or bound the values of the layer. Bounding matters because, without activation functions, the values and their corresponding gradients can either explode or vanish, making the results unusable. This is because the final value is the cumulative product of the values from each successive layer. As the number of layers increases, so does the likelihood of values and gradients exploding to infinity or vanishing to zero. This is known as the exploding and vanishing gradient problem. Activation functions are also used to decide whether a node in a layer should be activated, hence their name. Common activation functions, which are visualized in Figure 1.36, are as follows:
- Step function: The value is non-zero if it is above a certain threshold, otherwise it is zero. This is shown in Figure 1.36a...
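The step function, and the reason that unbounded values become a problem in deep networks, can be sketched in plain Python. The threshold of 0.0, the non-zero output of 1.0, the factors 0.5 and 1.5, and the depth of 50 layers are all assumptions chosen for illustration.

```python
def step(value, threshold=0.0):
    """Return 1.0 if the value is above the threshold, otherwise 0.0."""
    return 1.0 if value > threshold else 0.0


activations = [step(v) for v in [-2.0, -0.5, 0.0, 0.5, 2.0]]
print(activations)  # [0.0, 0.0, 0.0, 1.0, 1.0]


def repeated_product(factor, layers):
    """Multiply 1.0 by the same factor once per layer."""
    result = 1.0
    for _ in range(layers):
        result *= factor
    return result


# A factor below 1 shrinks toward zero; a factor above 1 grows rapidly.
vanishing = repeated_product(0.5, 50)
exploding = repeated_product(1.5, 50)
print(vanishing, exploding)
```

Repeated multiplication is only a simplified stand-in for what happens across layers, but it shows how quickly a cumulative product can leave a usable range.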
In this chapter, you were introduced to the TensorFlow library. You learned how to use it in the Python programming language. You created the building blocks of ANNs (tensors) with various ranks and shapes, performed linear transformations on tensors using TensorFlow, and implemented addition, reshaping, transposition, and multiplication on tensors—all of which are fundamental for understanding the underlying mathematics of ANNs.
In the next chapter, you will improve your understanding of tensors and learn how to load data of various types and pre-process it so that it is appropriate for training ANNs in TensorFlow. You will work with tabular, visual, and textual data, all of which must be pre-processed differently. By working with visual data (that is, images), you will also learn how to use training data that is too large to fit into memory.