You're reading from TinyML Cookbook - Second Edition

Product type: Book
Published in: Nov 2023
Publisher: Packt
ISBN-13: 9781837637362
Edition: 2nd Edition
Author: Gian Marco Iodice

Gian Marco Iodice is team and tech lead in the Machine Learning Group at Arm, who co-created the Arm Compute Library in 2017. The Arm Compute Library is currently the most performant library for ML on Arm, and it's deployed on billions of devices worldwide – from servers to smartphones. Gian Marco holds an MSc degree, with honors, in electronic engineering from the University of Pisa (Italy) and has several years of experience developing ML and computer vision algorithms on edge devices. Now, he's leading the ML performance optimization on Arm Mali GPUs. In 2020, Gian Marco cofounded the TinyML UK meetup group to encourage knowledge-sharing, educate, and inspire the next generation of ML developers on tiny and power-efficient devices.

Recognizing Music Genres with TensorFlow and the Raspberry Pi Pico – Part 2

The first part of this project gave us the prerequisites to train a music genre recognition model. Now that we have obtained the dataset and become acquainted with the MFCCs feature extraction, we can delve into the model design and the application deployment. Although we might be tempted to leave the deployment considerations until the end, this should never be the case when building tinyML applications. Given its limited computational and memory capabilities, the target device must always be at the center of our design choices, from the feature extraction to the model design.

For this reason, this second part will discuss in depth how the target device influences the implementation of the MFCCs feature extraction.

We will start our discussion by tailoring the MFCCs implementation for the Raspberry Pi Pico. Here, we will learn how fixed-point arithmetic can help minimize latency and...

Technical requirements

To complete all the practical recipes of this chapter, we will need the following:

  • A Raspberry Pi Pico
  • A SparkFun RedBoard Artemis Nano (optional)
  • A micro-USB data cable
  • A USB-C data cable (optional)
  • 1 x electret microphone amplifier – MAX9814
  • A 5-pin press-fit header strip (optional but recommended)
  • 1 x half-size solderless breadboard
  • 6 x jumper wires
  • A laptop/PC with either Linux, macOS, or Windows
  • A Google Drive account

    The source code and additional material are available in the Chapter05_06 folder on the GitHub repository: https://github.com/PacktPublishing/TinyML-Cookbook_2E/tree/main/Chapter05_06.

Computing the FFT magnitude with fixed-point arithmetic using the CMSIS-DSP library

In the previous chapter, we discovered that the Raspberry Pi Pico has enough memory to run the MFCCs extraction pipeline using floating-point arithmetic. However, this data format does not offer the best computational efficiency on our target platform.

In this recipe, we will uncover why floating-point arithmetic is inefficient on the Raspberry Pi Pico and propose the 16-bit fixed-point (Q15) arithmetic as a more practical alternative. To provide a hands-on understanding of Q15, we will guide you through calculating the FFT magnitude, using this data type with the CMSIS-DSP Python library in the Colab notebook. This approach will simplify the transition of the code to the Raspberry Pi Pico in subsequent chapters.
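Before turning to the CMSIS-DSP functions themselves, the essence of Q15 can be illustrated with plain NumPy. The sketch below is a hypothetical illustration, not the book's code: it quantizes a test tone to Q15 (a 16-bit signed integer with 15 fractional bits, covering [-1, 1)), dequantizes it, and computes the FFT magnitude, showing how small the round-trip quantization error is:

```python
import numpy as np

Q15_SCALE = 2**15  # Q15: 16-bit signed integer, 15 fractional bits, range [-1, 1)

def float_to_q15(x):
    # Round to the nearest Q15 step and clip to the representable range
    return np.clip(np.round(x * Q15_SCALE), -Q15_SCALE, Q15_SCALE - 1).astype(np.int16)

def q15_to_float(q):
    return q.astype(np.float32) / Q15_SCALE

# 1 kHz test tone sampled at 16 kHz, one 256-point frame
fs, n_fft = 16000, 256
t = np.arange(n_fft) / fs
signal_f32 = 0.5 * np.sin(2 * np.pi * 1000 * t)

# Quantize the input, then compute the FFT magnitude on the dequantized values
signal_q15 = float_to_q15(signal_f32)
mag = np.abs(np.fft.rfft(q15_to_float(signal_q15)))

# The Q15 round trip introduces at most half an LSB of error (~1.5e-5)
err = np.max(np.abs(q15_to_float(signal_q15) - signal_f32))
print(f"max Q15 quantization error: {err:.7f}")
```

The CMSIS-DSP library performs the FFT directly on the `int16` values with integer arithmetic, which is where the speed-up on the Pico comes from; this sketch only shows why the precision loss of the Q15 representation is acceptable for audio features.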

Getting ready

In the previous recipe, we learned how to extract MFCCs from an audio sample using TensorFlow. Nevertheless, it is necessary to consider whether...

Implementing the MFCCs feature extraction with the CMSIS-DSP library

The practical use of Q15 fixed-point arithmetic in computing the FFT magnitude provided the foundational knowledge to build functions using fixed-point arithmetic.

In this recipe, we will exploit this knowledge to rewrite the implementation of the MFCCs feature extraction using the Q15 data format.

Getting ready

In the preceding recipe, we learned that the Raspberry Pi Pico, like many other microcontrollers, has no hardware acceleration for floating-point arithmetic. Despite this limitation, we now know we can carry out the computation in a fixed-point format and rely solely on integer arithmetic.

However, the MFCCs computation can be optimized even further on the Raspberry Pi Pico by exploiting another feature offered by this device: its large program memory.

As we know, the program memory is reserved for storing the program. However, this memory can also be utilized...
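For instance, constant tables that the MFCCs pipeline reuses on every frame, such as the window coefficients, could be computed offline in Python and emitted as C arrays that the firmware places in flash. The snippet below is a sketch under that assumption; the array name `hann_q15` is illustrative, not from the book:

```python
import numpy as np

N_FFT = 256

# Periodic Hann window coefficients, quantized to Q15
window = 0.5 * (1.0 - np.cos(2.0 * np.pi * np.arange(N_FFT) / N_FFT))
window_q15 = np.clip(np.round(window * 2**15), -2**15, 2**15 - 1).astype(np.int16)

# Emit a C array the firmware can keep in program memory (flash),
# saving both RAM and the cost of computing the table at runtime
lines = [f"const int16_t hann_q15[{N_FFT}] = {{"]
for i in range(0, N_FFT, 8):
    chunk = ", ".join(str(v) for v in window_q15[i:i + 8])
    lines.append(f"  {chunk},")
lines.append("};")
c_source = "\n".join(lines)
print(lines[0])
```

The same offline-generation idea applies to the Mel filterbank weights and DCT coefficients, which are fixed once the feature extraction parameters are chosen.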

Designing and training an LSTM RNN model

In this project, the model designed for classifying music genres is an LSTM RNN, as illustrated in the following diagram:

Figure 6.9: LSTM recurrent neural network for music genre classification

As shown in the previous image, the MFCCs extracted from 1 second of raw audio are the input for the model, which consists of the following layers:

  • 2 x LSTM layers with 32 units each (Num. units)
  • 1 x Dropout layer with a 50% rate (0.5)
  • 1 x Fully connected layer with three output neurons, followed by a Softmax activation function

In this recipe, we will design and train this LSTM model with TensorFlow.
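The architecture above can be sketched in Keras as follows. The input dimensions are assumptions for illustration (the actual number of MFCC frames and coefficients depends on the feature extraction settings chosen earlier in the chapter):

```python
import tensorflow as tf

# Assumed feature shape for a 1-second clip: NUM_FRAMES MFCC vectors of
# NUM_MFCCS coefficients each. Adjust to match your extraction settings.
NUM_FRAMES, NUM_MFCCS, NUM_GENRES = 49, 18, 3

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, NUM_MFCCS)),
    # The first LSTM returns the full sequence so the second LSTM
    # can consume one hidden state per time step
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_GENRES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Note that the Dropout layer is only active during training; at inference time on the microcontroller it is a no-op, so it adds no runtime cost.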

Getting ready

In Chapter 4, Using Edge Impulse and the Arduino Nano to Control LEDs with Voice Commands, we addressed an audio classification problem using a standard convolutional neural network (CNN) that learned visual patterns from the Mel-filterbank energy...

Evaluating the accuracy of the quantized model on the test dataset

After training the model using TensorFlow, we are ready to make it suitable for microcontroller deployment.

In this recipe, we will quantize the trained model to 8-bit using the TensorFlow Lite converter tool and then assess its accuracy with the test dataset. After evaluating the model’s accuracy, we will use the xxd tool to convert the TensorFlow Lite model to a C-byte array, preparing it for deployment on the microcontroller.

Getting ready

Quantization is a pivotal technique for ML on microcontrollers because it makes the model storage-efficient and improves its inference latency.

The quantization adopted in this book involves converting 32-bit floating-point numbers to 8-bit integers. While this technique offers a four-fold reduction in model size and an improvement in latency, we could lose accuracy because of the reduced numerical precision. For this reason, it is paramount to evaluate...

Deploying the MFCCs feature extraction algorithm on the Raspberry Pi Pico

The final two recipes of this chapter will guide us through the development of the music genre classification application on the microcontroller.

In this particular recipe, we will deploy the MFCCs feature extraction algorithm using the CMSIS-DSP on the Raspberry Pi Pico.

Getting ready

Extracting MFCCs from raw audio data is the first stage of our computing chain to classify music genres. Since we developed this algorithm in Python using the CMSIS-DSP library, transitioning to a C language implementation should be relatively straightforward.

The C version of the CMSIS-DSP library mirrors its Python counterpart, offering the same functions and, often, an identical API, simplifying the migration to the final implementation on a board.

The only ingredients needed to deploy the MFCCs feature extraction algorithm on the Raspberry Pi Pico or other Arm-based microcontrollers are the following:

...

Recognizing music genres with the Raspberry Pi Pico

Here we are, ready to finalize our application on the Raspberry Pi Pico.

In this final recipe, we will deploy the TensorFlow Lite model to recognize the music genre from audio clips recorded with the microphone connected to the microcontroller.

Getting ready

The application we will design in this recipe aims to continuously record a 1-second audio clip and run the model inference, as illustrated in the following image:

Figure 6.17: Recording and processing tasks running sequentially

From the task execution timeline shown in the preceding image, you can observe that the feature extraction and model inference are always performed after the audio recording and not concurrently. Therefore, it is evident that we do not process some segments of the live audio stream.

Unlike a real-time keyword spotting (KWS) application, which should capture and process all pieces of the audio stream to never miss any spoken...

Summary

In this chapter, we have completed our music genre recognition application on the Arduino Nano and Raspberry Pi Pico. Here, our focus has been optimizing the MFCCs feature extraction algorithm through software optimizations, using fixed-point arithmetic. Additionally, we have developed, trained, and deployed an RNN model that relies on the LSTM operator to classify music genres.

Our practical journey started by getting acquainted with fixed-point arithmetic and its role in accelerating the extraction of MFCCs from audio clips. To achieve this objective, the CMSIS-DSP Python library played a crucial role, helping us to develop an algorithm in a Python environment that closely resembles the final implementation on the microcontroller.

Following the implementation of the MFCCs feature extraction, our attention shifted to model design. Here, we designed and trained an RNN model based on the LSTM layer with TensorFlow to classify music genres.

Once the model was trained, we...

Learn more on Discord

To join the Discord community for this book – where you can share feedback, ask questions to the author, and learn about new releases – follow the QR code below:

https://packt.link/tiny
