You're reading from The Deep Learning Architect's Handbook

Product typeBook

Published inDec 2023

PublisherPackt

ISBN-139781803243795

Edition1st Edition

Concepts

Deep Learning

Author (1)

Ee Kin Chin

Developing deep learning models

Let’s start with a short recap of what deep learning is. Deep learning’s core foundational building block is a neural network. A neural network is an algorithm that was made to simulate the human brain. Its building blocks are called neurons, which mimic the billions of neurons the human brain contains. Neurons, in the context of neural networks, are objects that store simple information called weights and biases. Think of these as the memory of the algorithm.

Deep learning architectures are essentially neural network architectures that have three or more neural network layers. Neural network layers can be categorized into three high-level groups – the input layer, the hidden layer, and the output layer. The input layer is the simplest layer group and whose functionality is to pass the input data to subsequent layers. This layer group does not contain biases and can be considered passive neurons, but the group still contains weights in its connections to neurons from subsequent layers. The hidden layer comprises neurons that contain biases and weights in their connections to neurons from subsequent layers. Finally, the output layer comprises neurons that relate to the number of classes and problem types and contains bias. A best practice when counting neural network layers is to exclude the input layer when doing so. So, a neural network with one input layer, one hidden layer, and one output layer is considered to be a two-layer neural network. The following figure shows a basic neural network, called a multilayer perceptron (MLP), with a single input layer, a single hidden layer, and a single output layer:

Figure 1.12 – A simple deep learning architecture, also called an MLP

Being a subset of the wider machine learning category, deep learning models are capable of learning patterns from the data through a loss function and an optimizer algorithm that optimizes the loss function. A loss function defines the error made by the model so that its memory (weights and biases) can be updated to perform better in the next iteration. An optimizer algorithm is an algorithm that decides the strategy to update the weights given the loss value.

With this short recap, let’s dive into a summary of the common deep learning model families.

Deep learning model families

These layers can come in many forms as researchers have been able to invent new layer definitions to tackle new problem types and almost always comes with a non-linear activation function that allows the model to capture non-linear relationships between the data. Along with the variation of layers come many different deep learning architecture families that are meant for different problem types. A few of the most common and widely used deep learning models are as follows:

MLP for tabular data types. This will be explored in Chapter 2, Designing Deep Learning Architectures.
Convolutional neural network for image data types. This will be explored in Chapter 3, Understanding Convolutional Neural Networks.
Autoencoders for anomaly detection, data compression, data denoising, and feature representation learning. This will be explored in Chapter 5, Understanding Autoencoders.
Gated recurrent unit (GRU), Long Short-Term Memory (LSTM), and Transformers for sequence data types. These will be explored in Chapter 4, Understanding Recurrent Neural Networks, and Chapter 6, Understanding Neural Network Transformers, respectively.

These architectures will be the focus of Chapters 2 to 6, where we will discuss their methodology and go through some practical evaluation. Next, let’s discover the problem types we can tackle in deep learning.

The model development strategy

Today, deep learning models are easy to invent and create due to the advent of deep learning frameworks such as PyTorch and TensorFlow, along with their high-level library wrappers. Which framework you should choose at this point is a matter of preference regarding their interfaces as both frameworks are matured with years of improvement work done. Only when there is a pressing need for a very custom function to tackle a unique problem type will you need to choose the framework that can execute what you need. Once you’ve chosen your deep learning framework, the deep model creation, training, and evaluation process is pretty much covered all around.

However, model management functions do not come out of the box from these frameworks. Model management is an area of technology that allows teams, businesses, and deep learning practitioners to reliably, quickly, and effectively build models, evaluate models, deliver model insights, deploy models to production, and govern models. Model management can sometimes be referred to as machine learning operations (MLOps). You might still be wondering why you’d need such functionalities, especially if you’ve been building some deep learning models off Kaggle, a platform that hosts data and machine learning problems as competitions. So, here are some factors that drive the need to utilize these functionalities:

It is cumbersome to compare models manually:
- Manually typing performance data in an Excel sheet to keep track of model performance is slow and unreliable
Model artifacts are hard to keep track of:
- A model has many artifacts, such as its trained weights, performance graphs, feature importance, and prediction explanations
- It is also cumbersome to compare model artifacts
Model versioning is needed to make sure model-building experiments are not repeated:
- Overriding the top-performing model with the most reliable model insights is the last thing you want to experience
- Versioning should depend on the data partitioning method, model settings, and software library versions
It is not straightforward to deploy and govern models

Depending on the size of the team involved in the project and how often components need to be reused, different software and libraries would fit the bill. These software and libraries are split into paid and free (usually open sourced) categories. Metaflow, an open sourced software, is suitable for bigger data science teams where there are many chances of components needing to be reused across other projects and MLFlow (open sourced software) would be more suitable for small or single-person teams. Other notable model management tools are Comet (paid), Weights & Biases (paid), Neptune (paid), and Algorithmia (paid).

With that, we have provided a brief overview of deep learning model development methodology and strategy; we will dive deeper into model development topics in the next few chapters. But before that, let’s continue with an overview of the topic of delivering model insights.

You have been reading a chapter from

The Deep Learning Architect's Handbook

Published in: Dec 2023Publisher: PacktISBN-13: 9781803243795

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Ee Kin Chin

Ee Kin Chin is a Senior Deep Learning Engineer at DataRobot. He holds a Bachelor of Engineering (Honours) in Electronics with a major in Telecommunications. Ee Kin is an expert in the field of Deep Learning, Data Science, Machine Learning, Artificial Intelligence, Supervised Learning, Unsupervised Learning, Python, Keras, Pytorch, and related technologies. He has a proven track record of delivering successful projects in these areas and is dedicated to staying up to date with the latest advancements in the field.
Read more about Ee Kin Chin

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages