Designing Deep Learning Architectures

In the previous chapter, we went through the entire deep learning life cycle and understood what it means to make a deep learning project successful from end to end. With that knowledge, we are now ready to dive further into the technicalities of deep learning models. In this chapter, we will examine common deep learning architectures used in the industry and understand the reasoning behind each architecture’s design. For intermediate and advanced readers, this will be a brief recap to ensure alignment on the definitions of terms. For beginners, the architectures will be presented in a way that is easy to digest so that you can get up to speed on the most useful neural architectures in the world of deep learning.

Grasping the methodologies behind a wide variety of architectures allows you to innovate custom architectures specific to your use case and, most importantly, gain the skill to choose an appropriate foundational architecture based...

Technical requirements

This chapter includes some practical implementations in the Python programming language. To complete it, you will need to have a computer with the following libraries installed:

  • pandas
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • NumPy
  • Keras
  • PyTorch

The code files are available on GitHub: https://github.com/PacktPublishing/The-Deep-Learning-Architect-Handbook/tree/main/CHAPTER_2.

Exploring the foundations of neural networks using an MLP

A deep learning architecture is created when at least three perceptron layers are stacked, excluding the input layer. A perceptron is a single-layer network consisting of neuron units. Each neuron unit holds a bias variable and acts as a node; neurons interact with neurons in adjacent layers through weighted connections (edges) between them. A perceptron layer is also known as a fully connected layer or dense layer, and MLPs are also known as feedforward neural networks or fully connected neural networks.
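To make this concrete, here is a minimal sketch of such an architecture in PyTorch; the layer sizes are arbitrary and chosen purely for illustration:

    import torch.nn as nn

    # A minimal MLP sketch: the input layer is implicit in the first
    # Linear layer's in_features; layer sizes are arbitrary here.
    mlp = nn.Sequential(
        nn.Linear(3, 8),   # 3 input features -> hidden layer of 8 neurons
        nn.ReLU(),         # non-linear activation between layers
        nn.Linear(8, 8),   # second hidden (fully connected/dense) layer
        nn.ReLU(),
        nn.Linear(8, 1),   # output layer producing a single prediction
    )

Each nn.Linear here is one fully connected (dense) layer of the kind just described.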

Let’s refer back to the MLP figure from the previous chapter to get a better idea.

Figure 2.1 – Simple deep learning architecture, also called an MLP

The figure shows how three data column inputs are passed into the input layer, then propagated through the hidden layer, and finally through the output layer. Although not...

Understanding neural network gradients

The goal of machine learning for an MLP is to find the weights and biases that effectively map the inputs to the desired outputs. The weights and biases generally get initialized randomly. During training on a provided dataset, they are updated iteratively, batch by batch, to minimize the loss function; the updates use gradients computed with a method called backward propagation, also known as backpropagation. A batch is a subset of the dataset used for training or evaluation, allowing the neural network to process the data in smaller groups rather than the entire dataset at once. The loss function is also known as the error function or the cost function.
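To make one such batch update concrete, here is a minimal sketch; all the names, shapes, and gradient values are illustrative placeholders, not the book’s code:

    import numpy as np

    # One illustrative gradient-descent update on a single batch.
    # In practice, grad_w and grad_b would come from backpropagation.
    learning_rate = 0.01
    weights = np.random.randn(3, 4)   # randomly initialized weights
    bias = np.zeros(4)                # biases are often initialized to zero
    grad_w = np.ones((3, 4))          # placeholder gradients for illustration
    grad_b = np.ones(4)

    # Move the parameters a small step against the gradient to reduce the loss.
    weights -= learning_rate * grad_w
    bias -= learning_rate * grad_b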

Backpropagation is a technique to find out how sensitive the overall loss is to a change in the weights and bias of every neuron, by using the partial derivatives of the loss with respect to those weights and biases. Partial derivatives from calculus are a measure of the rate...
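As a small, hedged example of such a partial derivative, consider a single neuron y = w * x + b with a squared-error loss L = (y - t)^2. The sketch below computes dL/dw analytically and verifies it with a finite-difference approximation; all values are made up for illustration:

    import numpy as np

    # Single neuron: y = w * x + b, squared-error loss L = (y - t)^2.
    w, b, x, t = 0.5, 0.1, 2.0, 1.0
    y = w * x + b
    grad_w = 2 * (y - t) * x          # analytic partial derivative dL/dw
    grad_b = 2 * (y - t)              # analytic partial derivative dL/db

    # Finite-difference check: nudge w slightly and see how the loss moves.
    eps = 1e-6
    loss = lambda w_: ((w_ * x + b) - t) ** 2
    numeric_grad_w = (loss(w + eps) - loss(w - eps)) / (2 * eps)
    print(grad_w, numeric_grad_w)     # the two values should closely match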

Understanding gradient descent

A good way to think about the loss of a deep learning model is as a point on a three-dimensional loss landscape with many different hills and valleys, where valleys are more optimal, as shown in Figure 2.4.

Figure 2.4 – An example loss landscape
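To build intuition for what such a surface can look like, here is a hedged sketch that plots a made-up two-parameter “loss” with hills and valleys using Matplotlib; the function itself is invented purely for visualization:

    import numpy as np
    import matplotlib.pyplot as plt

    # A made-up two-parameter "loss" surface with hills and valleys,
    # purely to illustrate what a loss landscape can look like.
    w1, w2 = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
    loss = np.sin(w1) * np.cos(w2) + 0.1 * (w1 ** 2 + w2 ** 2)

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(w1, w2, loss, cmap="viridis")
    ax.set_xlabel("weight 1")
    ax.set_ylabel("weight 2")
    ax.set_zlabel("loss")
    plt.show()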

In reality, however, we can only approximate these loss landscapes, as the parameter values of a neural network can exist in an infinite number of combinations. The most common way practitioners monitor the behavior of the loss during each epoch of training and validation is to simply plot a two-dimensional line graph with the x axis being the epochs executed and the y axis being the loss value. An epoch is a single iteration through the entire dataset during the training process of a neural network. The loss landscape in Figure 2.4 is a three-dimensional approximation of a neural network’s loss landscape. To visualize the three-dimensional loss landscape in...
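For the two-dimensional line graph just described, a minimal sketch could look like the following; the loss values here are made up for illustration:

    import matplotlib.pyplot as plt

    # Illustrative training/validation losses recorded once per epoch.
    train_loss = [0.9, 0.6, 0.45, 0.38, 0.33, 0.30]
    val_loss = [0.95, 0.70, 0.55, 0.50, 0.49, 0.50]
    epochs = range(1, len(train_loss) + 1)

    plt.plot(epochs, train_loss, label="training loss")
    plt.plot(epochs, val_loss, label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()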

Implementing an MLP from scratch

Today, the process of creating a neural network and its layers, along with the backpropagation process, has been encapsulated in deep learning frameworks. The differentiation process has been automated, so there is no need to define the derivative formulas manually. Removing the abstraction layer provided by the deep learning libraries will help solidify your understanding of neural network internals. So, let’s create this neural network manually and explicitly, with the logic for the forward pass and backward pass, instead of using the deep learning libraries:

  1. We’ll start by importing numpy and the methods from the scikit-learn library to load sample datasets and perform data partitioning:
    import numpy as np
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
  2. Next, we define ReLU, the activation function that makes an MLP non-linear:
    def ReLU(x):
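      # Element-wise rectified linear unit: negative values become zero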
      return np.maximum(x, 0)
  3. Now, let’s define...
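The remaining steps are omitted here, but to give a sense of where this is heading, below is a rough sketch, under the assumption of a single hidden layer and a mean squared error loss (not the book’s exact code), of what the manual forward and backward passes could look like:

    # Hypothetical forward/backward pass for a single-hidden-layer MLP.
    # X: (batch, n_in) inputs, y: (batch, 1) targets; W1, b1, W2, b2 are parameters.
    def forward(X, W1, b1, W2, b2):
        hidden = ReLU(X @ W1 + b1)    # hidden layer activations
        output = hidden @ W2 + b2     # linear output layer
        return hidden, output

    def backward(X, y, hidden, output, W2):
        batch_size = X.shape[0]
        d_out = 2 * (output - y) / batch_size     # gradient of mean squared error
        grad_W2 = hidden.T @ d_out                # dL/dW2
        grad_b2 = d_out.sum(axis=0)               # dL/db2
        d_hidden = (d_out @ W2.T) * (hidden > 0)  # chain rule through ReLU
        grad_W1 = X.T @ d_hidden                  # dL/dW1
        grad_b1 = d_hidden.sum(axis=0)            # dL/db1
        return grad_W1, grad_b1, grad_W2, grad_b2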

Summary

MLPs are a foundational piece of architecture in deep learning: they transcend just processing tabular data and are more than an old architecture that has been superseded. MLPs are very commonly utilized as a sub-component in many advanced neural network architectures today, whether to provide more automatic feature engineering, to reduce the dimensionality of large features, or to shape features into the desired shapes for target predictions. Look out for MLPs or, more importantly, the fully connected layer, in the architectures that are going to be introduced in the next few chapters!

The automatic gradient computation provided by deep learning frameworks simplifies the implementation of backpropagation and allows us to focus on designing new neural networks. It is essential to ensure that the mathematical functions used in these networks are differentiable, although this is often taken care of when adopting successful research findings. And that’s the beauty of...
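As a tiny illustration of this automatic gradient computation, here is a generic PyTorch snippet (not code from this chapter):

    import torch

    # PyTorch records the operations into a graph and differentiates automatically.
    w = torch.tensor(0.75, requires_grad=True)
    x, t = torch.tensor(2.0), torch.tensor(1.0)
    loss = (w * x - t) ** 2   # every operation used must be differentiable
    loss.backward()           # backpropagation in a single call
    print(w.grad)             # dL/dw = 2 * (w*x - t) * x = tensor(2.)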
