
You're reading from TinyML Cookbook – Second Edition

Product type: Book
Published in: Nov 2023
Publisher: Packt
ISBN-13: 9781837637362
Edition: 2nd Edition

Gian Marco Iodice is a team and tech lead in the Machine Learning Group at Arm, where he co-created the Arm Compute Library in 2017. The Arm Compute Library is currently the most performant library for ML on Arm, and it's deployed on billions of devices worldwide – from servers to smartphones. Gian Marco holds an MSc degree, with honors, in electronic engineering from the University of Pisa (Italy) and has several years of experience developing ML and computer vision algorithms on edge devices. He is now leading ML performance optimization on Arm Mali GPUs. In 2020, Gian Marco cofounded the TinyML UK meetup group to encourage knowledge sharing and to educate and inspire the next generation of ML developers on tiny and power-efficient devices.

Recognizing Music Genres with TensorFlow and the Raspberry Pi Pico – Part 1

The project we will develop together holds a special place in my heart because it brings back memories of my first programming experience on an Arm Cortex-M microcontroller.

Back in my university days, portable MP3 audio players were definitely one of the coolest things to have. I was fascinated by this technology that allowed me to carry thousands of high-quality songs in my pocket to enjoy anywhere. As a technology and music enthusiast, I wanted to learn more about it. Therefore, I undertook the challenge of building an audio player from scratch using a microcontroller and a touch screen (https://youtu.be/LXm6-LuMmUU).

Completing that project was fun and gave me valuable hands-on experience. One feature I hoped to include was a machine learning (ML) algorithm for recognizing the music genre to auto-equalize the sound and get the best listening experience.

However, deep learning (DL) was...

Technical requirements

To complete all the practical recipes of this chapter, we will need the following:

  • A Raspberry Pi Pico board
  • A SparkFun RedBoard Artemis Nano (optional)
  • A micro-USB data cable
  • A USB-C data cable (optional)
  • 1 x electret microphone amplifier – MAX9814
  • 5-pin press-fit header strip (optional but recommended)
  • 1 x half-size solderless breadboard
  • 1 x push-button
  • 8 x jumper wires
  • Laptop/PC with either Linux, macOS, or Windows
  • Google Drive account
  • Kaggle account

The source code and additional material are available in the Chapter05_06 folder in the GitHub repository: https://github.com/PacktPublishing/TinyML-Cookbook_2E/tree/main/Chapter05_06

Connecting the microphone to the Raspberry Pi Pico

Our application requires a microphone to record songs and classify their music genre. Since the Raspberry Pi Pico does not have a built-in microphone, we need to employ an external one and figure out the appropriate way to connect it to the microcontroller.

This recipe will help you achieve this objective by offering a step-by-step guide on integrating a microphone into an electronic circuit alongside the microcontroller.

Getting ready

The microphone used in this recipe is a low-cost electret microphone with the MAX9814 amplifier, which you can buy from one of several electronics distributors.

The signal generated by a microphone is generally weak, as it ranges from several microvolts to a few millivolts...
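To get a feel for why the amplifier is essential, the following back-of-the-envelope sketch converts the MAX9814's selectable gains (40, 50, or 60 dB, per its datasheet) into linear voltage ratios. The 5 mV microphone peak is an illustrative assumption, not a measured value:

```python
# Back-of-the-envelope: why the electret microphone needs the MAX9814.
# A raw electret capsule outputs only microvolts to millivolts, far too
# weak for an ADC expecting signals in the volt range.

def db_to_ratio(gain_db):
    """Convert an amplifier gain in dB to a linear voltage ratio."""
    return 10 ** (gain_db / 20)

mic_peak_v = 0.005  # assumed 5 mV peak microphone signal (illustrative)
for gain_db in (40, 50, 60):  # the MAX9814's selectable gain settings
    out_v = mic_peak_v * db_to_ratio(gain_db)
    print(f"{gain_db} dB gain -> x{db_to_ratio(gain_db):.0f} -> {out_v:.2f} V peak")
```

At 40 dB the 5 mV signal becomes roughly 0.5 V, a level the microcontroller's ADC can measure comfortably; in practice, the amplifier's output swing is bounded by its supply rail.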

Recording audio samples with the Raspberry Pi Pico

Like any ML project, we need to prepare a dataset. For our purposes, the audio clips will be acquired with the microphone connected to the microcontroller. This choice helps us obtain an ML model with high accuracy because the training samples are captured with the same microphone used in the final application.

In this recipe, we will create an Arduino sketch to record 4-second audio clips with the Raspberry Pi Pico. The audio samples will then be transmitted over the serial connection and transformed into audio files in the following recipe.
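Before writing the sketch, it is worth estimating the storage budget for one recording. The 4-second duration comes from the recipe; the 16 kHz sample rate and 16-bit sample width below are illustrative assumptions (the Pico's ADC natively produces 12-bit readings, typically stored in 16-bit words):

```python
# Rough RAM budget for buffering one audio clip on the Raspberry Pi Pico.
DURATION_S = 4            # clip length from the recipe
SAMPLE_RATE_HZ = 16_000   # assumed sample rate (illustrative)
BYTES_PER_SAMPLE = 2      # 12-bit ADC readings stored in 16-bit words

n_samples = DURATION_S * SAMPLE_RATE_HZ
buffer_bytes = n_samples * BYTES_PER_SAMPLE
print(f"{n_samples} samples, {buffer_bytes / 1024:.0f} KiB of RAM")
```

Under these assumptions, a whole clip needs about 125 KiB, which fits in the Pico's 264 KiB of SRAM; this is why the sketch can buffer an entire recording before streaming it over serial.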

Getting ready

Microcontrollers can be employed to build fully functional digital audio recorders when paired with a microphone.

To accomplish this task with our microphone, we must develop a program to sample the audio analog signal at regular intervals, also known as the sample rate.

To create the correct digital representation, we must consider the following:

  • The sample...

Generating audio files from samples transmitted over the serial

The previous recipe taught us how to record audio with a microcontroller; our next objective is to turn those recordings into audio files we can play back on our computer.

In this recipe, we will develop a Python script locally to generate audio files in .wav format from the audio samples transmitted over the serial.

Getting ready

In this recipe, we will develop a Python script locally to create audio files from the data transmitted over the serial. To facilitate this task, we will need two main Python libraries:

The soundfile library can be installed with the following pip command:

$ pip install soundfile

Using the write() method provided by this library, we can generate audio files in various formats from NumPy...
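The chapter generates the files with soundfile.write(); as a stdlib-only illustration of the same idea, the sketch below uses Python's built-in wave module to turn a sequence of 16-bit samples (as they would arrive over serial) into a playable mono .wav file. The file name, sample rate, and test tone are illustrative assumptions:

```python
import math
import struct
import wave

SAMPLE_RATE = 16_000  # assumed; must match the recording sketch

# Synthesize 1 s of a 440 Hz tone as a stand-in for samples received
# over the serial connection (16-bit signed integers).
samples = [int(32767 * 0.3 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE))
           for t in range(SAMPLE_RATE)]

with wave.open("test_tone.wav", "wb") as wav:
    wav.setnchannels(1)        # mono recording
    wav.setsampwidth(2)        # 16-bit samples
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))

with wave.open("test_tone.wav", "rb") as wav:
    print(wav.getnframes(), "frames at", wav.getframerate(), "Hz")
```

soundfile.write() performs the same conversion in one call and additionally accepts NumPy arrays and other output formats, which is why the recipe prefers it.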

Building the dataset for classifying music genres

Now that we can record audio with the Raspberry Pi Pico, we are ready to build the dataset for classifying music genres.

This recipe will walk you through collecting training samples from two sources: audio clips from the GTZAN dataset and audio recordings captured with the Raspberry Pi Pico. The collected audio clips will then be uploaded to Google Drive, ensuring easy access from Google Colab during the ML model preparation phase.

Getting ready

The dataset we need for training our model requires a substantial number of training samples for each music genre. Typically, a minimum of 100 samples per genre is recommended to yield better accuracy results.

However, the number of training samples is not the only factor to consider. In fact, the dataset must encompass a broad spectrum of songs, capturing the diverse stylistic variations within each genre.

As you might guess, collecting such a vast number of audio...
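A quick back-of-the-envelope calculation shows how GTZAN helps here. GTZAN ships 100 tracks per genre, each 30 seconds long, while our recordings are 4-second clips; slicing each track into non-overlapping 4-second segments (the slicing scheme below is an assumption for illustration) multiplies the available training samples:

```python
# Rough estimate of training clips obtainable per genre from GTZAN.
TRACKS_PER_GENRE = 100   # GTZAN provides 100 tracks per genre
TRACK_LEN_S = 30         # each GTZAN track is 30 s long
CLIP_LEN_S = 4           # the recipe's recording length

clips_per_track = TRACK_LEN_S // CLIP_LEN_S  # non-overlapping segments
per_genre = TRACKS_PER_GENRE * clips_per_track
print(f"{clips_per_track} clips/track -> {per_genre} training clips per genre")
```

Even without overlap, this comfortably exceeds the recommended minimum of 100 samples per genre.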

Extracting MFCCs from audio samples with TensorFlow

Acoustic models rely heavily on hand-engineered input features to achieve high accuracy. Mel Frequency Cepstral Coefficients (MFCCs) are extensively used in audio applications and have demonstrated remarkable success in various use cases, including music genre classification.

In this recipe, we will show you how to extract MFCCs in Python using the TensorFlow signal processing functions (https://www.tensorflow.org/versions/r2.11/api_docs/python/tf/signal).

Getting ready

The primary goal of MFCCs is to combine the temporal information and spectral characteristics of the audio signal in a very compact manner.

In Chapter 4, Using Edge Impulse and Arduino Nano to Control LEDs with Voice Commands, we gave a high-level summary of this feature extraction method. Here, we will delve deeper into its underlying compute blocks and implement it with the TensorFlow signal processing functions.
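As a preview of those compute blocks, the NumPy sketch below walks through the classic MFCC pipeline: framing and windowing, power spectrum, mel filterbank, log compression, and DCT-II. The chapter implements the equivalent steps with tf.signal (tf.signal.stft, tf.signal.linear_to_mel_weight_matrix, and tf.signal.mfccs_from_log_mel_spectrograms); all sizes below are illustrative, not the book's exact parameters:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, frame_len=640, hop=320, n_mels=40, n_mfcc=13):
    # 1) Split the signal into overlapping frames and apply a Hann window
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)

    # 2) Power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    n_bins = power.shape[1]

    # 3) Triangular mel filterbank (linear-frequency-to-mel projection)
    mel_pts = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((frame_len + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_bins))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)

    # 4) Log-compress the mel energies (small offset avoids log(0))
    log_mel = np.log(power @ fbank.T + 1e-6)

    # 5) DCT-II decorrelates the log-mel energies; keep the first n_mfcc
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T

rng = np.random.default_rng(0)
feats = mfcc(rng.standard_normal(16000))  # 1 s of noise as a stand-in
print(feats.shape)  # one 13-coefficient vector per frame
```

The result is a compact 2D feature map (frames × coefficients) that the classifier consumes in place of the raw waveform, which is exactly the compression the "Getting ready" paragraph describes.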

...

Summary

In the first part of this chapter, we walked through the steps of recording audio clips using an external microphone with the Raspberry Pi Pico and analyzed the compute building blocks of the MFCC feature extraction algorithm.

Our practical journey started by learning to connect the microphone to the Raspberry Pi Pico and record audio clips using the ADC peripheral and timer interrupts.

Then, we crafted a Python script to create audio files from the samples transmitted by the microcontroller over the serial connection. This script was then extended to upload the audio files to Google Drive, laying the foundation for building the training dataset. Given the large number of samples required for training the ML model, we collected the training data from the GTZAN dataset and audio recordings captured with the Raspberry Pi Pico. After the dataset preparation, we finally analyzed and implemented the MFCC feature extraction using TensorFlow.

In the upcoming second part...


Learn more on Discord

To join the Discord community for this book – where you can share feedback, ask questions to the author, and learn about new releases – visit the following link:

https://packt.link/tiny

