Deep learning intuition

If you're a newbie to deep learning, you may be wondering how exactly it is differs from machine learning; or is it the same? Deep learning is a subset of the larger domain of machine learning. Let's think about this in the context of an automobile image classification problem:

As you can see in the preceding diagram, we need to perform feature extraction ourselves as legacy machine learning algorithms cannot do that on their own. They might be super-efficient with accurate results, but they cannot learn signals from data. In fact, they don't learn on their own and still rely on human effort:

On the other hand, deep learning algorithms learn to perform tasks on their own. Neural networks under the hood are based on the concept of deep learning and it trains on their own to optimize the results. However, the final decision process is hidden and cannot be tracked. The intent of deep learning is to imitate the functioning of a human brain.

Backpropagation

The backbone of a neural network is the backpropagation algorithm. Refer to the sample neural network structure shown as follows:

For any neural network, data flows from the input layer to the output layer during the forward pass. Each circle in the diagram represents a neuron. Every layer has a number of neurons present. Our data will pass through the neurons across layers. The input needs to be in a numerical format to support computational operations in neurons. Each neuron in the neural network is assigned a weight (matrix) and an activation function. Using the input data, weight matrix, and an activation function, a probabilistic value is generated at each neuron. The error (that is, a deviation from the actual value) is calculated at the output layer using a loss function. We utilize the loss score during the backward pass (that is, from the output layer to the input layer ) by reassigning weights to the neurons to reduce the loss score. During this stage, some output layer neurons will be assigned with high weights and vice versa depending upon the loss score results. This process will continue backward as far as the input layer by updating the weights of neurons. In a nutshell, we are tracking the rate of change of loss with respect to the change in weights across all neurons. This entire cycle (a forward and backward pass) is called an epoch. We perform multiple epochs during a training session. A neural network will tend to optimize the results after every training epoch.

Multilayer Perceptron (MLP)

An MLP is a standard feed-forward neural network with at least three layers: an input layer, a hidden layer, and an output layer. Hidden layers come after the input layer in the structure. Deep neural networks have two or more hidden layers in the structure, while an MLP has only one.

Convolutional Neural Network (CNN)

CNNs are generally used for image classification problems, but can also be exposed in Natural Language Processing (NLP), in conjunction with word vectors, because of their proven results. Unlike a regular neural network, a CNN will have additional layers such as convolutional layers and subsampling layers. Convolutional layers take input data (such as images) and apply convolution operations on top of them. You can think of it as applying a function to the input. Convolutional layers act as filters that pass a feature of interest to the upcoming subsampling layer. A feature of interest can be anything (for example, a fur, shade and so on in the case of an image) that can be used to identify the image. In the subsampling layer, the input from convolutional layers is further smoothed. So, we end up with a much smaller image resolution and reduced color contrast, preserving only the important information. The input is then passed on to fully connected layers. Fully connected layers resemble regular feed-forward neural networks.

Recurrent Neural Network (RNN)

An RNN is a neural network that can process sequential data. In a regular feed-forward neural network, the current input is considered for neurons in the next layer. On the other hand, an RNN can accept previously received inputs as well. It can also use memory to memorize previous inputs. So, it is capable of preserving long-term dependencies throughout the training session. RNN is a popular choice for NLP tasks such as speech recognition. In practice, a slightly variant structure called Long Short-Term Memory (LSTM) is used as a better alternative to RNN.

Why is DL4J important for deep learning?

The following points will help you understand why DL4J is important for deep learning:

DL4J provides commercial support. It is the first commercial-grade, open source, deep learning library in Java.
Writing training code is simple and precise. DL4J supports Plug and Play mode, which means switching between hardware (CPU to GPU) is just a matter of changing the Maven dependencies and no modifications are needed on the code.
DL4J uses ND4J as its backend. ND4J is a computation library that can run twice as fast as NumPy (a computation library in Python) in large matrix operations. DL4J exhibits faster training times in GPU environments compared to other Python counterparts.
DL4J supports training on a cluster of machines that are running in CPU/GPU using Apache Spark. DL4J brings in automated parallelism in distributed training. This means that DL4J bypasses the need for extra libraries by setting up worker nodes and connections.
DL4J is a good production-oriented deep learning library. As a JVM-based library, DL4J applications can be easily integrated/deployed with existing corporate applications that are running in Java/Scala.