You're reading from TinyML Cookbook - Second Edition

Published in: Nov 2023
Publisher: Packt
ISBN-13: 9781837637362
Edition: 2nd Edition
Author: Gian Marco Iodice

Gian Marco Iodice is team and tech lead in the Machine Learning Group at Arm, who co-created the Arm Compute Library in 2017. The Arm Compute Library is currently the most performant library for ML on Arm, and it's deployed on billions of devices worldwide – from servers to smartphones. Gian Marco holds an MSc degree, with honors, in electronic engineering from the University of Pisa (Italy) and has several years of experience developing ML and computer vision algorithms on edge devices. Now, he's leading the ML performance optimization on Arm Mali GPUs. In 2020, Gian Marco cofounded the TinyML UK meetup group to encourage knowledge-sharing, educate, and inspire the next generation of ML developers on tiny and power-efficient devices.

Deploying a CIFAR-10 Model for Memory-Constrained Devices with the Zephyr OS on QEMU

Prototyping a tinyML application on a physical device is really fun because we can instantly transform our ideas into something that looks and feels like a real product. However, before any application comes to life, we must ensure that the model works as expected, ideally across different devices. Testing and debugging applications directly on microcontroller boards often requires a lot of development time, mainly because every code change forces us to re-flash the program onto the device. Virtual platforms can come in handy here, making testing more straightforward and faster.

In this chapter, we will build an image classification application with TensorFlow Lite for Microcontrollers (tflite-micro) for an emulated Arm Cortex-M3 microcontroller. To accomplish our task, we will start by installing the Zephyr OS, the primary framework used in this chapter. Next, we will design...

Technical requirements

To complete all the practical recipes of this chapter, we will need the following:

  • A laptop/PC with either Linux or macOS

The source code and additional material are available in the Chapter10 folder of the GitHub repository: https://github.com/PacktPublishing/TinyML-Cookbook_2E/tree/main/Chapter10

Getting started with the Zephyr OS

In this recipe, we will install the Zephyr Project, the framework that will allow us to build and run the TensorFlow application on the emulated Arm Cortex-M3 microcontroller. By the end of this recipe, we will have the environment ready on our laptop/PC to run a sample application on the virtual platform considered for our project.

Getting ready

Zephyr (https://zephyrproject.org/) is an open-source Apache 2.0 project that provides a small-footprint Real-Time Operating System (RTOS) for various hardware platforms based on multiple architectures, including Arm Cortex-M, Intel x86, ARC, Nios II, and RISC-V. The RTOS has been designed for memory-constrained devices with security in mind.

Zephyr does not provide just an RTOS, though. It also offers a Software Development Kit (SDK) with a collection of ready-to-use examples and tools to build Zephyr-based applications for a wide range of devices, including virtual platforms through QEMU.
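Before the recipe steps, it helps to see the overall installation and build flow at a glance. The commands below sketch the official Zephyr getting-started sequence with the west meta-tool; exact package names and steps can vary slightly across Zephyr releases, so treat this as an outline and check the project's documentation for your version:

```shell
# Sketch of the Zephyr getting-started flow (details may vary per release).

# 1. Install west, Zephyr's meta-tool, in a Python virtual environment
python3 -m venv ~/zephyrproject/.venv
source ~/zephyrproject/.venv/bin/activate
pip install west

# 2. Fetch the Zephyr source tree and its modules (tflite-micro included)
west init ~/zephyrproject
cd ~/zephyrproject
west update

# 3. Export the CMake packages and install the Python dependencies
west zephyr-export
pip install -r zephyr/scripts/requirements.txt

# 4. Build and run a sample on the emulated Arm Cortex-M3 board via QEMU
cd zephyr
west build -b qemu_cortex_m3 samples/hello_world -t run
```

The `qemu_cortex_m3` board target is what makes this chapter possible without hardware: `west build -t run` launches the compiled firmware directly inside QEMU instead of flashing a physical device.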

...

Designing and training a CIFAR-10 model for memory-constrained devices

The tight memory constraint of the LM3S6965 forces us to develop a model with extremely low memory utilization. In fact, the target microcontroller has a quarter of the memory capacity of the Arduino Nano.

Despite this challenging constraint, in this recipe, we will demonstrate the effective deployment of CIFAR-10 image classification on this microcontroller by employing the following convolutional neural network (CNN) with TensorFlow:

Figure 10.1: The model tailored for CIFAR-10 dataset image classification

As you can see from the preceding model architecture, the depthwise separable convolution (DepthSeparableConv2D) layer is the leading operator of the model. This operator, discussed in the upcoming Getting ready section, will make the model compact and accurate.

Getting ready

The network tailored in this recipe takes inspiration from the success of the MobileNet v1 model (https://arxiv...
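To see why depthwise separable convolutions make the model so compact, it is worth comparing parameter counts directly. The layer sizes below (3×3 kernel, 32 input channels, 64 output channels) are illustrative picks, not the exact dimensions of the chapter's model:

```python
# Parameter-count comparison: standard convolution vs. depthwise separable
# convolution (biases ignored). Layer sizes are illustrative only.

def conv2d_params(k, c_in, c_out):
    # A standard convolution learns one k x k x c_in filter per output channel
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k x k filter per input channel
    # Pointwise step: a 1 x 1 convolution that mixes channels
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 32, 64
std = conv2d_params(k, c_in, c_out)               # 18432 parameters
dsc = depthwise_separable_params(k, c_in, c_out)  # 2336 parameters

print(std, dsc, round(std / dsc, 1))  # 18432 2336 7.9
```

For this configuration, the depthwise separable variant needs roughly 8x fewer parameters, and the saving grows with the number of output channels — exactly the property MobileNet v1 exploits.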

Evaluating the accuracy of the quantized model

The trained model can classify the 10 classes of CIFAR-10 with an accuracy of 71.9%. However, before deploying the model on a microcontroller, it must be quantized with TensorFlow Lite, which may reduce the accuracy.

In this recipe, we will demonstrate the quantization process and perform an accuracy evaluation on the validation dataset using the TensorFlow Lite Python interpreter. The reason for using the validation rather than the test dataset is to assess how much the 8-bit quantization alters the accuracy observed during model training. Following the accuracy evaluation, we will finalize the recipe by converting the TensorFlow Lite model into a C-byte array.

Getting ready

As we know, the trained model must be converted into a more compact and lightweight representation before being deployed on a resource-constrained device such as a microcontroller.

Quantization is the essential part of this step to make the model...
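The arithmetic behind 8-bit quantization can be sketched in a few lines. TensorFlow Lite uses an affine scheme, q = round(x / scale) + zero_point, clamped to the int8 range; the scale and zero-point values below are made up for illustration, whereas the TFLite converter derives the real ones from the model and a representative dataset:

```python
# Minimal sketch of the affine int8 quantization scheme used by
# TensorFlow Lite: q = round(x / scale) + zero_point, clamped to [-128, 127].
# The scale and zero-point below are illustrative, not converter-derived.

def quantize(x, scale, zero_point):
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

scale, zero_point = 0.02, -5

x = 1.0
q = quantize(x, scale, zero_point)        # 45
x_hat = dequantize(q, scale, zero_point)  # 1.0 (recovered exactly here)

print(q, x_hat)
```

The round-trip is lossless only when x happens to be an exact multiple of the scale; in general, rounding and clamping introduce the small errors that make re-evaluating the accuracy after quantization necessary.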

Converting a NumPy image into a C-byte array

Our application will be running on a virtual platform with no access to a camera module. Therefore, we must provide a valid test input image for our application to check whether the model works as expected.

In this recipe, we will get an image from the validation dataset belonging to the ship class. The sample will then be converted into an int8_t C array and saved as an input.h file.

Getting ready

To prepare this recipe, we must know how to structure the C file containing the input test image. The structure of this file is quite simple and is illustrated in Figure 10.7:

Figure 10.7: The C header file structure for the input test image

As you can observe from the file structure, we only need an array and two variables to describe our input test sample. These variables are as follows:

  • g_test: An int8_t array containing a ship image with the normalized and quantized pixel values. The pixel values (// Data...
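Generating such a header from Python is straightforward. The sketch below shows one way to do it: the array name g_test comes from the text above, while the helper name and the length-variable name are illustrative assumptions, and the tiny 2×2 RGB array stands in for the real ship image:

```python
# Sketch: turn a quantized NumPy image into a C header like input.h.
# g_test is the array name used in the chapter; the helper function and
# the g_test_len variable name are illustrative assumptions.
import numpy as np

def numpy_to_c_header(arr, var_name="g_test"):
    flat = arr.astype(np.int8).flatten()
    values = ", ".join(str(v) for v in flat)
    return (f"const int8_t {var_name}[] = {{{values}}};\n"
            f"const unsigned int {var_name}_len = {len(flat)};\n")

# Illustrative 2x2 RGB image with already-quantized pixel values
img = np.array([[[-128, 0, 127], [1, 2, 3]],
                [[4, 5, 6], [7, 8, 9]]], dtype=np.int8)

header = numpy_to_c_header(img)
print(header)
```

Writing the returned string to input.h gives the application a test sample it can classify without any camera hardware.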

Preparing the Zephyr Project structure

Only a few steps separate us from the completion of this project. Now that we have the model and the input test image, we can leave Colab's environment and focus on the application development with the Zephyr OS.

In this recipe, we will prepare the skeleton of the tflite-micro application from the pre-built hello_world sample available in the Zephyr SDK.

Getting ready

The easiest way to start a new tflite-micro project is to copy and edit one of the pre-built samples provided by the Zephyr SDK, available in the ~/zephyrproject/zephyr/samples/modules/tflite-micro/ folder. At the time of writing, there are two ready-to-use examples:

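Whichever sample we copy, the project skeleton boils down to a handful of files: the source under src/, a CMakeLists.txt, and a prj.conf holding the Kconfig options. The fragment below is a hedged sketch of a prj.conf for this project; the exact option names can differ across Zephyr releases (for example, CONFIG_CPP replaced the older CONFIG_CPLUSPLUS), so match them against the copied sample:

```
# prj.conf — illustrative sketch; verify option names for your Zephyr version
CONFIG_CPP=y                     # tflite-micro is a C++ library
CONFIG_TENSORFLOW_LITE_MICRO=y   # pull in the tflite-micro module
CONFIG_MAIN_STACK_SIZE=4096      # enlarge the main thread stack for inference
```

The accompanying CMakeLists.txt is the standard Zephyr application boilerplate: a find_package(Zephyr) call followed by target_sources(app PRIVATE ...) listing the application sources.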
Deploying the TensorFlow Lite for Microcontrollers program on QEMU

The skeleton of our Zephyr Project is ready, so we just need to finalize our application to classify our input test image.

In this recipe, we will see how to implement the tflite-micro application and run the model on the emulated Arm Cortex-M3-based microcontroller.

Getting ready

Most of the ingredients required for this recipe are related to tflite-micro and have already been covered in earlier chapters, such as Chapter 3, Building a Weather Station with TensorFlow Lite for Microcontrollers. Nevertheless, there is one useful feature of this framework that has not been introduced yet, but it can come in handy when we want to reduce the program memory usage of our application drastically. This feature is the tflite::MicroMutableOpResolver, which enables the compilation of a subset of all operators available in tflite-micro.
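In code, using this resolver looks roughly like the sketch below. It needs the tflite-micro headers, so it only builds inside the Zephyr project, and the operator list is an assumption based on a CNN like ours — the real list must match the operators in the converted model:

```
// Sketch of tflite::MicroMutableOpResolver: only the registered operators
// are linked into the binary, unlike tflite::AllOpsResolver.
// The op list below is an assumption for a CNN like this chapter's model.
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"

// The template argument is the maximum number of registered operators
tflite::MicroMutableOpResolver<5> resolver;

void register_ops() {
  resolver.AddConv2D();
  resolver.AddDepthwiseConv2D();
  resolver.AddAveragePool2D();
  resolver.AddFullyConnected();
  resolver.AddSoftmax();
}
```

Passing this resolver to the tflite::MicroInterpreter instead of an all-ops resolver is what trims the unused operator kernels from the final firmware image.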

As shown in the following ML model example, the model is composed of a series...

Summary

In this chapter, our focus has been on tailoring an ML model for image classification on a memory-constrained device with just 64 KB of SRAM.

In the first part, we prepared the Zephyr development environment by installing the components required to build and run a Zephyr application on virtual devices with QEMU.

Following the Zephyr installation, our attention shifted to model design. Here, we designed and trained a CNN based on the depthwise separable convolution layer, allowing us to drastically reduce the trainable parameters and the computational demand.

Once the model was trained, we quantized it to 8-bit using the TensorFlow Lite converter and assessed its accuracy on the validation dataset. The evaluation proved that quantizing to 8-bit only marginally reduces the model’s accuracy.

Finally, we developed the Zephyr application to deploy the TensorFlow Lite quantized model and run it on the virtual device.

TensorFlow Lite for Microcontrollers has...

Learn more on Discord

To join the Discord community for this book – where you can share feedback, ask questions to the author, and learn about new releases – visit https://packt.link/tiny.
