Running ML Models on Arduino and the Arm Ethos-U55 microNPU Using Apache TVM

In all the projects developed so far, we have relied on TensorFlow Lite for Microcontrollers (tflite-micro) as the software stack to deploy machine learning (ML) models on the Arduino Nano, Raspberry Pi Pico, and SparkFun Artemis Nano. However, other frameworks are available within the open-source community for this purpose. Among these alternatives, Apache TVM (or simply TVM) has gained considerable traction due to its ability to generate optimized code tailored to the desired target platform.

In this chapter, we will explore how to leverage this technology to deploy a quantized CIFAR-10 TensorFlow Lite model in various scenarios.

The chapter will start by giving an overview of the Arduino Command Line Interface (CLI), an indispensable tool for compiling and running the code generated by TVM on any Arduino-compatible platform.

After introducing Arduino CLI, we will present TVM by showing how to generate C...

Technical requirements

To complete all the practical recipes of this chapter, we will need the following:

  • An Arduino Nano 33 BLE Sense
  • A Raspberry Pi Pico
  • A SparkFun RedBoard Artemis Nano (optional)
  • A micro-USB data cable
  • A USB-C data cable (optional)
  • A laptop/PC with either Linux, macOS, or Windows
  • A Google account

The source code and additional material are available in the Chapter11 folder in the GitHub repository: https://github.com/PacktPublishing/TinyML-Cookbook_2E/tree/main/Chapter11.

Getting familiar with Arduino CLI

In this first recipe, we will install Arduino CLI on our local machine so that we are ready to compile and run the code generated by TVM on any Arduino-compatible microcontroller. This step is essential for learning how to use this tool to create sketches for the Arduino Nano and Raspberry Pi Pico.

Getting ready

Arduino CLI (https://arduino.github.io/arduino-cli) is a software package that exposes almost all functionalities bundled in the Arduino IDE into a CLI tool. Therefore, using shell commands, you can use Arduino CLI to create new sketches, download libraries, compile and upload programs on Arduino-compatible microcontrollers, and much more!
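For example, a typical workflow for the boards used in this chapter might look like the following (the board package, FQBN, and serial port are assumptions that depend on your setup):

$ arduino-cli sketch new cifar10                                  # create a new sketch
$ arduino-cli core update-index                                   # refresh the board package index
$ arduino-cli core install arduino:mbed_nano                      # install the core for the Arduino Nano 33 BLE Sense
$ arduino-cli compile --fqbn arduino:mbed_nano:nano33ble cifar10  # compile the sketch
$ arduino-cli upload -p /dev/ttyACM0 --fqbn arduino:mbed_nano:nano33ble cifar10  # upload it over USB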

Arduino CLI is also available through pyduinocli, a Python wrapper library around Arduino CLI. You can discover more about pyduinocli at the following link: https://pypi.org/project/pyduinocli/.

To install Arduino CLI, you can follow the instructions reported at the following...

Downloading a pre-trained CIFAR-10 model and input test image

In this project, we will not develop an ML model from scratch; instead, we will use a pre-trained model so that we can dedicate more space to the model deployment with TVM. In this recipe, we will explain where to download the CIFAR-10 model and a header file containing an input test image as a C byte array, which is required to validate the output classification.

Getting ready

The focus of this chapter will be primarily on the use of TVM to generate code for multiple target devices. To keep the problem as simple as possible and only focus on the model deployment, we will use a pre-trained CIFAR-10 quantized model and a constant input with a known output class to validate the inference process.

How to do it…

Open your web browser and create a new Colab notebook. Then, follow these steps to download the pre-trained CIFAR-10 model and a C header file containing the input data, with an image that can...
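As a minimal sketch of what such a download cell might look like in Colab (the filenames below are hypothetical; the actual ones are listed in the Chapter11 folder of the book's GitHub repository):

import urllib.request

# Hypothetical filenames: check the Chapter11 folder of the repository for the actual names
base_url = "https://raw.githubusercontent.com/PacktPublishing/TinyML-Cookbook_2E/main/Chapter11"
for fname in ("cifar10_int8.tflite", "input_image.h"):
    urllib.request.urlretrieve(f"{base_url}/{fname}", fname)
    print(f"Downloaded {fname}")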

Deploying the model with TVM using the AoT executor on the host machine

In this recipe, we will employ TVM for the first time to convert the pre-trained CIFAR-10 model into generic C code. Instead of deploying this code on a microcontroller, we will leverage the host-driven executor through Python. This simple approach will allow us to focus more on the essential TVM components needed for generating code.

Getting ready

The main advantage we have found in all the projects developed with tflite-micro is certainly code portability. Regardless of the target device, the model inference can be accelerated using almost the same application code, as exemplified by the following pseudocode:

model = load_model(tflite_model)
model.allocate_memory()
model.invoke()

In the preceding code snippet, we do the following:

  1. Load the model at runtime with load_model()
  2. Allocate the memory required for the model inference with allocate_memory()
  3. Invoke the model inference with invoke()

When writing the tflite-micro application code, it is not strictly necessary to have prior knowledge of the target microcontroller because the software stack takes advantage of vendor-specific optimized operator libraries (performance libraries) to execute the model efficiently. As a result, the selection of the appropriate set of optimized operators happens...
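In contrast, TVM compiles the model ahead of time for a chosen target. As a minimal sketch of the host-side compilation flow used in this recipe (the model filename, input tensor name, shape, and dtype are assumptions, and the exact API may vary slightly across TVM versions):

import tflite
import tvm
from tvm import relay
from tvm.relay.backend import Executor, Runtime

# Parse the quantized TFLite flatbuffer (filename is an assumption)
with open("cifar10_int8.tflite", "rb") as f:
    tflite_model = tflite.Model.GetRootAsModel(f.read(), 0)

# Convert the model to Relay IR (input name, shape, and dtype are assumptions)
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={"input": (1, 32, 32, 3)},
    dtype_dict={"input": "int8"},
)

# Compile to generic C code with the Ahead-of-Time (AoT) executor and the C runtime
with tvm.transform.PassContext(opt_level=3, config={"tir.disable_vectorize": True}):
    module = relay.build(
        mod,
        target=tvm.target.Target("c"),
        executor=Executor("aot"),
        runtime=Runtime("crt", {"system-lib": True}),
        params=params,
    )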

Deploying the model on the Arduino Nano

Now that we know how to generate code with TVM, we are ready to shift our attention to deploying an actual model on physical microcontrollers.

Thus, in this recipe, we aim to deploy the quantized CIFAR-10 model on the Arduino Nano.

Getting ready

To get ready for this recipe, we need to know how to generate and structure an Arduino project.

In the previous recipe, we executed the CIFAR-10 model on the host machine through the Python host-driven interface. However, we haven’t seen any actual code generated apart from a few Python objects returned by TVM.

As mentioned earlier, when dealing with microcontrollers, the output of TVM is a TAR package, known as the Model Library Format (MLF), which contains the C code for the TVM runtime and the model library (TVM Lib). The TAR file is created when we call the tvm.micro.generate_project() function and is automatically decompressed, integrating only the necessary files into the target template...
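As a rough illustration of this step (the board key, project options, and output path are assumptions based on the microTVM Arduino template, and module is the object returned by relay.build() in the previous recipe):

import pathlib
import tvm.micro

# Path to the Arduino project template shipped with TVM
template_dir = tvm.micro.get_microtvm_template_projects("arduino")

# Generate the Arduino project: the MLF TAR is created and unpacked into the template
project = tvm.micro.generate_project(
    template_dir,
    module,                              # compiled module from relay.build()
    pathlib.Path("./arduino_cifar10"),   # output project directory (assumption)
    {"board": "nano33ble",               # board key (assumption)
     "arduino_cli_cmd": "arduino-cli"},  # path to the Arduino CLI executable
)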

Deploying the model on the Raspberry Pi Pico

The deployment of the quantized CIFAR-10 model on the Arduino Nano showcased TVM’s capability to generate code to run the model inference on this specific platform.

In this recipe, we will discover how we can use TVM to generate code for the Raspberry Pi Pico.

Getting ready

As we saw in the previous recipe, the tvm.micro.generate_project() function is responsible for generating the Arduino project. Among its input arguments, this function requires the Arduino board name. However, what is the purpose of this information?

The board name is not used during the code generation phase because the code is already generated when calling the tvm.micro.generate_project() function. Instead, this information is required because TVM offers commands that allow building and flashing the application directly on the Arduino board from Python. Since we are not employing these commands to compile and upload the Arduino...
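Although we will not rely on them in this recipe, a hedged sketch of those Python-side helpers follows (the board key for the Raspberry Pi Pico is hypothetical, and template_dir and module come from the previous recipes):

# Generate the Arduino project for the Raspberry Pi Pico ("pico" board key is hypothetical)
project = tvm.micro.generate_project(
    template_dir,
    module,
    pathlib.Path("./pico_cifar10"),
    {"board": "pico", "arduino_cli_cmd": "arduino-cli"},
)
project.build()   # compile the generated sketch through Arduino CLI
project.flash()   # upload the binary to the connected board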

Installing the Fixed Virtual Platform (FVP) for the Arm Corstone-300

So far, our focus has been primarily on Arduino boards. However, TVM can generate code for various platforms, including those with the Arm Ethos-U55 processor, the first microNPU designed by Arm to extend the ML capabilities of Cortex-M-based devices.

In this recipe, we will give more details on the computational capabilities of the Arm Ethos-U55 microNPU and install the FVP model for the Arm Corstone-300 platform. This virtual device will allow us to see this new processor in action without needing a physical device.

Getting ready

The FVP model for the Arm Corstone-300 platform (https://developer.arm.com/tools-and-software/open-source-software/arm-platforms-software/arm-ecosystem-fvps) is a free-of-charge virtual platform based on the Arm Cortex-M55 CPU and the Arm Ethos-U55 microNPU.

Arm Ethos-U55 (https://www.arm.com/products/silicon-ip-cpu/ethos/ethos-u55) is a specialized processor for ML...
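As a rough sketch, on Linux the FVP is usually downloaded as a tarball and installed with its bundled installer (archive and installer names vary by release, and the installation path and PATH entry below are assumptions):

$ tar -xzf FVP_Corstone_SSE-300_Ethos-U55_<version>.tgz   # archive name depends on the release
$ ./FVP_Corstone_SSE-300_Ethos-U55.sh --i-agree-to-the-contained-eula --no-interactive -d /opt/arm/FVP_Corstone_SSE-300
$ export PATH=/opt/arm/FVP_Corstone_SSE-300/models/Linux64_GCC-6.4:$PATH   # the models/ subfolder name also depends on the release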

Code generation with TVMC for Arm Ethos-U55

Until now, we have generated code with TVM using its native Python interface. However, TVM provides an alternative approach through the TVMC tool, which allows us to execute the same actions as the Python interface but from the command line.

In this recipe, we will show how you can use this tool in Colab to produce the MLF model, containing the generated code to run the CIFAR-10 model inference on the Arm Ethos-U55.

Getting ready

The native TVM Python interface proved convenient for generating code for a desired target. However, the framework also offers an additional tool to simplify code generation. This tool is TVMC, a command-line driver that exposes the same features as the Python API through a single command-line interface.

At this point, you may wonder: where can we find the TVMC tool?

TVMC is bundled with the TVM Python installation, and you can invoke it using the following shell command:

$ python -m tvm.driver...
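To give an idea of its shape, a TVMC compile invocation for the Ethos-U55 might look like the following (the flag set is adapted from TVM's microNPU examples, and the model filename and exact options depend on your files and TVM version):

$ python -m tvm.driver.tvmc compile cifar10_int8.tflite \
    --target=ethos-u,cmsis-nn,c \
    --target-ethos-u-accelerator_config=ethos-u55-256 \
    --target-cmsis-nn-mcpu=cortex-m55 \
    --target-c-mcpu=cortex-m55 \
    --runtime=crt \
    --executor=aot \
    --executor-aot-interface-api=c \
    --executor-aot-unpacked-api=1 \
    --output-format=mlf \
    --output=cifar10.tar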

Installing the software dependencies to build an application for the Arm Ethos-U microNPU

The code generated by TVM for Arm Ethos-U55 cannot be compiled with Arduino CLI, as the target device is not an Arduino-based platform. Therefore, this recipe will guide you in preparing the required dependencies to build the application for the Arm Corstone-300 platform.

Getting ready

To build the code generated by TVM for Corstone-300, the following components are required:

  • A compiler to produce the application binary for the Arm Cortex-M55 CPU
  • The Ethos-U driver, which offloads the computation from the Arm Cortex-M CPU to the Arm Ethos-U55
  • The Ethos-U platform, which provides basic drivers for the device peripherals, such as the UART and timer
  • The CMSIS library, which provides a collection of optimized ML and digital signal processing (DSP) functions

The following subsections will provide further details about the compiler, the...
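The versions used in the book are pinned in its material, but as a rough sketch, these components are commonly obtained as follows (the clone URLs and the choice of toolchain package are assumptions about your setup):

$ git clone https://review.mlplatform.org/ml/ethos-u/ethos-u-core-driver      # Ethos-U driver
$ git clone https://review.mlplatform.org/ml/ethos-u/ethos-u-core-platform    # Ethos-U platform
$ git clone https://github.com/ARM-software/CMSIS_5.git                       # CMSIS library
$ sudo apt-get install gcc-arm-none-eabi                                      # Arm GCC cross-compiler (or use Arm's prebuilt toolchain)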

Running the CIFAR-10 model inference on the Arm Ethos-U55 microNPU

Now that all the necessary tools and software libraries are installed, our final step is to build the application with the code generated by TVM for the Arm Ethos-U55 microNPU and run it on the Corstone-300 FVP.

Although it seems there is still a lot left to do, this recipe offers a solution to simplify the remaining technicalities.

In this recipe, we will show you how to modify the Ethos-U prebuilt sample available in the TVM source code to run the CIFAR-10 inference on the Arm Ethos-U55. After making the necessary modifications, we will compile the application using the Makefile and linker scripts provided with the prebuilt sample, and run the compiled application on the Corstone-300 FVP.

Getting ready

The prebuilt sample considered in this recipe is available in the TVM source code within the tvm/apps/microtvm/ethosu directory (https://github.com/apache/tvm/tree/v0.11.1/apps/microtvm/ethosu...
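As a rough sketch of the final build-and-run step (the paths, the application name, and the FVP options are assumptions based on how the TVM Ethos-U sample is commonly driven):

$ cd tvm/apps/microtvm/ethosu
$ make                                                        # build the sample with the provided Makefile and linker script
$ FVP_Corstone_SSE-300_Ethos-U55 \
    -C mps3_board.visualisation.disable-visualisation=1 \
    -C mps3_board.telnetterminal0.start_telnet=0 \
    -C mps3_board.uart0.out_file="-" \
    ./build/demo                                              # run the compiled binary on the virtual platform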

Summary

In this chapter, we have explored the capabilities of TVM, a deep learning compiler capable of generating code to run model inference on various target devices, including the latest Arm Ethos-U55 microNPU.

In the first part, we delved into this framework to deploy the CIFAR-10 model on the Arduino Nano and Raspberry Pi Pico. Here, we used the TVM Python API to generate the code for model inference and showed the steps to build and run the Arduino sketches on the microcontrollers using Arduino CLI.

Following the successful model deployment on the Arduino Nano and Raspberry Pi Pico, we moved our attention to a new and advanced processor: the microNPU.

In this second part, we introduced the Arm Ethos-U55 microNPU and installed the FVP model for the Arm Corstone-300 platform to play with this processor without needing a physical device.

After installing the virtual device, we generated the code to run the CIFAR-10 model inference on the microNPU using TVMC,...

Learn more on Discord

To join the Discord community for this book – where you can share feedback, ask the author questions, and learn about new releases – follow the link below:

https://packt.link/tiny
