You're reading from TinyML Cookbook - Second Edition

Product typeBook

Published inNov 2023

PublisherPackt

ISBN-139781837637362

Edition2nd Edition

Concepts

Machine Learning

Author (1)

Gian Marco Iodice

Overview of deep learning

ML is the ingredient that makes our tiny devices capable of making intelligent decisions. These software algorithms heavily rely on the correct data to learn patterns or actions based on experience. As we commonly say, data is everything for ML because it is what makes or breaks an application.

This book will refer to deep learning (DL) as a specific class of ML that can perform complex prediction tasks directly on raw images, text, or sound. These algorithms have state-of-the-art accuracy and can be better and faster than humans in solving some data analysis problems.

A complete discussion of DL architectures and algorithms is beyond the scope of this book. However, this section will summarize some essential points relevant to understanding the following chapters.

Deep neural networks

A deep neural network consists of several stacked layers aimed at learning patterns.

Each layer contains several neurons, the fundamental computing elements for artificial neural networks (ANNs) inspired by the human brain.

A neuron produces a single output through a linear transformation, defined as the weighted sum of the inputs plus a constant value called bias, as shown in the following diagram:

Diagram

Description automatically generated

Figure 1.3: A neuron representation

The coefficients of this weighted sum are called weights.

Weights and bias are obtained after an iterative training process to make the neuron capable of learning complex patterns. However, neurons can only solve simple linear problems with linear transformations. Therefore, non-linear functions, called activations, generally follow the neuron’s output to help the network learn complex patterns:

Figure 1.4: An activation function

An example of a widely adopted activation function is the rectified linear unit (ReLU), which returns the maximum value between the input value and 0:

float relu(float input) {
  return max(input, 0);
}

Its computational simplicity makes it preferable to other non-linear functions, such as a hyperbolic tangent or logistic sigmoid, requiring more computational resources.

In the following subsection, we will see how the neurons are connected to solve complex visual recognition tasks.

Convolutional neural networks

Convolutional neural networks (CNNs) are specialized deep neural networks predominantly applied to visual recognition tasks.

We can consider CNNs as the evolution of a regularized version of the classic fully connected neural networks with dense layers, also known as fully connected layers.

As we can see in the following diagram, a characteristic of fully connected networks is connecting every neuron to all the output neurons of the previous layer:

Figure 1.5: A fully connected network

Unfortunately, this method of connecting neurons does not work well for training a model for image classification.

For instance, if we considered an RGB image of size 320x240 (width x height), we would need 230,400 (320*240*3) weights for just one neuron. Since our models will undoubtedly need several layers of neurons to discern complex problems, the model will likely overfit, given the unmanageable number of trainable parameters. Overfitting implies that the model learns to predict the training data well but struggles to generalize data not used during the training process (unseen data).

In the past, data scientists adopted manual feature engineering techniques to extract a reduced set of good features from images. However, the approach suffered from being difficult, time-consuming, and domain-specific.

With the rise of CNNs, visual recognition tasks saw improvement thanks to convolution layers, which make feature extraction part of the learning problem.

Based on the assumption that we are dealing with images and inspired by biological processes in the animal visual cortex, the convolution layer borrows the widely adopted convolution operator from image processing to create a set of learnable features.

The convolution operator is performed similarly to other image processing routines: sliding a window application (filter or kernel) onto the entire input image and applying the dot product between its weights and the underlying pixels, as shown in Figure 1.6:

A picture containing graphical user interface

Description automatically generated

Figure 1.6: Convolution operator

This approach brings two significant benefits:

It extracts the relevant features automatically without human intervention.
It reduces the number of input signals per neuron considerably.

For instance, applying a 3x3 filter on the preceding RGB image would only require 27 weights (3*3*3).

Like fully connected layers, convolution layers need several kernels to learn as many features as possible. Therefore, the convolution layer’s output generally produces a set of images (feature maps), commonly kept in a multidimensional memory object called a tensor, as shown in the following illustration:

Figure 1.7: Representation of a 3D tensor

Traditional CNNs for visual recognition tasks usually include the fully connected layers at the network’s end to carry out the prediction stage. Since the output of the convolution layers is a set of images, we generally adopt subsampling strategies to reduce the information propagated through the network and the risk of overfitting when feeding the fully connected layers.

Typically, there are two ways to perform subsampling:

Skipping the convolution operator for some input pixels. As a result, the output of the convolution layer will have fewer spatial dimensions than the input ones.
Adopting subsampling functions such as pooling layers.

The following figure shows a generic CNN architecture, where the pooling layer reduces the spatial dimensionality, and the fully connected layer performs the classification stage:

Figure 1.8: Traditional CNN with a pooling layer to reduce the spatial dimensionality

When developing DL networks for tinyML, one of the most crucial factors is the model’s size, defined as the number of trainable weights. Due to the limited physical memory of our platforms, the model needs to be compact to fit the target device. However, memory constraints are not the only challenge we may face. For instance, while trained models often use floating-point precision arithmetic operations, the CPUs on our platforms may lack hardware acceleration.

Thus, to overcome these limitations, quantization becomes an indispensable technique.

Model quantization

Quantization is the process of performing neural network computations in lower bit precision. The widely adopted technique for microcontrollers applies the quantization post-training and converts the 32-bit floating-point weights to 8-bit integer values. This technique brings a 4x model size reduction and a significant latency improvement with little or no accuracy drop.

Other techniques like pruning (setting weights to zero) or clustering (grouping weights into clusters) can help reduce the model size. However, in this book, we will limit the scope to the quantization technique because it is sufficient to showcase the model deployment on microcontrollers.

If you are interested in learning more about pruning and clustering, you can refer to the following practical blog post, which shows the benefit of these two techniques on the model size: https://community.arm.com/arm-community-blogs/b/ai-and-ml-blog/posts/pruning-clustering-arm-ethos-u-npu.

As we know, ML is the component that allows smartness into our application. Nevertheless, to ensure the longevity of battery-powered applications, it is essential to use low-power devices. So far, we have mentioned power and energy in general terms, but let’s see what they mean practically in the following section.

You have been reading a chapter from

TinyML Cookbook - Second Edition

Published in: Nov 2023Publisher: PacktISBN-13: 9781837637362

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Gian Marco Iodice

Gian Marco Iodice is team and tech lead in the Machine Learning Group at Arm, who co-created the Arm Compute Library in 2017. The Arm Compute Library is currently the most performant library for ML on Arm, and it's deployed on billions of devices worldwide – from servers to smartphones. Gian Marco holds an MSc degree, with honors, in electronic engineering from the University of Pisa (Italy) and has several years of experience developing ML and computer vision algorithms on edge devices. Now, he's leading the ML performance optimization on Arm Mali GPUs. In 2020, Gian Marco cofounded the TinyML UK meetup group to encourage knowledge-sharing, educate, and inspire the next generation of ML developers on tiny and power-efficient devices.
Read more about Gian Marco Iodice

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages