Assessments

Chapter 1: Introduction to Magenta and Generative Art

  1. Randomness.
  2. Markov chain.
  3. Algorave.
  4. Long short-term memory (LSTM).
  5. Autonomous systems generate music without operator input; assistive music systems complement an artist during their creative work.
  6. Symbolic: sheet music, MIDI, MusicXML, ABC notation. Sub-symbolic: raw audio (waveform), spectrogram.
  7. "Note On" and "Note Off" timing, pitch between 1 and 127 kHz, velocity, and channel.
  8. At a sample rate of 96 kHz, the Nyquist frequency is 96 kHz / 2 = 48 kHz, so the representable frequency range is 0 to 48 kHz. This is no better for listening, since the top 28 kHz of that range is lost on the ear (anything over 20 kHz cannot be heard), and that sampling rate is not properly supported by much audio equipment. It is useful in recording and audio editing, though (see the second sketch after this list).
  9. A single musical note, A4, is played for 1 second loudly.
  10. Drums, voice (melody...
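To make answer 7 concrete, here is a minimal sketch of the two note messages using the mido library; A4 is MIDI pitch 69, and the specific velocity, channel, and time values are arbitrary examples:

```python
import mido

# A "Note On" followed by a "Note Off": each message carries a pitch
# (A4 is MIDI pitch 69), a velocity, and a channel; in a MIDI file,
# time is the delta in ticks since the previous message.
note_on = mido.Message('note_on', note=69, velocity=100, channel=0)
note_off = mido.Message('note_off', note=69, velocity=0, channel=0, time=480)
print(note_on)
print(note_off)
```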
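And the Nyquist arithmetic from answer 8, as a worked computation:

```python
sample_rate = 96_000          # samples per second
nyquist = sample_rate / 2     # 48,000 Hz, the highest representable frequency
inaudible = nyquist - 20_000  # 28,000 Hz of the range lies above human hearing
print(nyquist, inaudible)     # 48000.0 28000.0
```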

Chapter 2: Generating Drum Sequences with the Drums RNN

  1. Given the current sequence, predict a score for each possible next note, choose the next note from those scores, then repeat the prediction for each step you want to generate (see the first sketch after this list).

  2. (1) RNNs operate on sequences of vectors for both the input and the output, which is a good fit for sequential data such as a music score, and (2) they keep an internal state composed of the previous output steps, which is good for making a prediction based on past inputs, not only the current input.
  3. (1) First, the hidden layer receives h(t + 1), the output of the previous hidden layer, and (2) it also receives x(t + 2), the input of the current step.
  4. The number of bars generated will be 2 bars, or 32 steps, since we have 16 steps per bar. At 80 QPM, each step takes 0.1875 seconds, because you take the number of seconds in a minute, divide by the QPM, then divide by the number of steps per quarter note: 60 / 80 / 4 = 0.1875 (see the second sketch after this list).
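A minimal sketch of the generation loop from answer 1, with a hypothetical model_scores placeholder standing in for the trained Drums RNN:

```python
import numpy as np

def model_scores(sequence, vocab_size=512):
    # Placeholder: a trained Drums RNN would return a score per event class here.
    rng = np.random.default_rng(len(sequence))
    return rng.random(vocab_size)

def generate(primer, num_steps):
    sequence = list(primer)
    for _ in range(num_steps):
        scores = model_scores(sequence)
        probabilities = np.exp(scores) / np.exp(scores).sum()  # softmax
        next_event = int(np.argmax(probabilities))  # greedy; sampling also works
        sequence.append(next_event)
    return sequence

print(generate(primer=[36], num_steps=32))  # 2 bars at 16 steps per bar
```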
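And the timing arithmetic from answer 4, as a worked computation:

```python
qpm = 80                 # quarter notes per minute
steps_per_quarter = 4    # 16 steps per bar in 4/4
seconds_per_step = 60 / qpm / steps_per_quarter
num_steps = 2 * 16       # 2 bars
print(seconds_per_step, num_steps * seconds_per_step)  # 0.1875 6.0
```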

Chapter 3: Generating Polyphonic Melodies

  1. Vanishing gradients (values get multiplied by small values at each RNN step) and exploding gradients are common RNN problems that occur during the backpropagation step of training. LSTM alleviates those problems by providing a dedicated cell state that is modified by forget, input, and output gates.

  2. Gated recurrent units (GRUs) are simpler but less expressive memory cells, where the forget and input gates are combined into a single update gate.

  3. For a 3/4 time signature, you have 3 quarter notes per bar times 4 steps per quarter note, which equals 12 steps per bar. The binary step counter still needs 5 bits (as for 4/4 time), but it will only count to 12. For 3 lookbacks, you'll need to look at the past 3 bars, with each bar being 12 steps, so you have [36, 24, 12] (see the sketch after this list).
  4. The resulting vector is the sum of the previous...
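The bar and lookback arithmetic from answer 3, as a short computation:

```python
steps_per_quarter = 4
quarters_per_bar = 3  # 3/4 time signature
steps_per_bar = steps_per_quarter * quarters_per_bar  # 12
lookback_distances = [steps_per_bar * bars for bars in (3, 2, 1)]
print(steps_per_bar, lookback_distances)  # 12 [36, 24, 12]
```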

Chapter 4: Latent Space Interpolation with MusicVAE

  1. The main use is dimensionality reduction: forcing the input through a smaller hidden layer makes the network learn important features, which make it possible to reconstruct the original input. The downside of an AE is that the latent space represented by the hidden layer is not continuous, which makes it hard to sample, since the decoder won't be able to make sense of some of the points.

  2. The reconstruction loss penalizes the network when it creates outputs that are different from the input.
  3. In a VAE, the latent space is continuous and smooth, making it possible to sample any point of the space and to interpolate between two points. This is achieved by having the latent variables follow a probability distribution P(z), often a Gaussian distribution (see the sketch after this list).
  4. The KL divergence measures how much two probability distributions diverge from each other. When combined with the reconstruction loss...
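A minimal sketch of sampling and interpolating in such a latent space, assuming a 256-dimensional Gaussian latent like MusicVAE's; z_start and z_end stand in for the encodings of two real sequences, and the decoder call is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
z_start = rng.normal(size=256)  # any point of the space decodes to a valid output
z_end = rng.normal(size=256)

num_outputs = 5
for i in range(num_outputs):
    t = i / (num_outputs - 1)
    z = (1 - t) * z_start + t * z_end  # interpolation between the two points
    # A real decoder would turn each z back into a note sequence here.
    print(f"t={t:.2f}", z[:3])
```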

Chapter 5: Audio Generation with NSynth and GANSynth

  1. You have to handle 16,000 samples per second (at least) and keep track of the general structure at a bigger time scale.
  2. NSynth is a WaveNet-style autoencoder that learns its own temporal embedding, making it possible to capture long-term structure and providing access to a useful hidden space.
  3. The colors in the rainbowgram are the 16 dimensions of the temporal embedding.
  4. Check the timestretch method in the audio_utils.py file in the chapter's code.

  5. GANSynth uses upsampling convolutions, making it possible to process the entire audio sample in parallel during training and generation.
  6. You need to sample a random normal distribution using np.random.normal(size=[10, 256]), where 10 is the number of sampled instruments and 256 is the size of the latent vector (given by the latent_vector_size configuration); see the sketch below.
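As a short illustration of that sampling step (only the NumPy call from answer 6; feeding the vectors to the GANSynth generator is omitted):

```python
import numpy as np

# 10 instruments, each a 256-dimensional latent vector drawn from a
# standard normal distribution (latent_vector_size=256).
latent_vectors = np.random.normal(size=[10, 256])
print(latent_vectors.shape)  # (10, 256)
```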

...

Chapter 6: Data Preparation for Training

  1. MIDI is not a text format, so it is harder to read and modify, but it is extremely common. MusicXML is rather rare and cumbersome, but has the advantage of being a text format. ABC notation is also rather rare, but has the advantage of being a text format that is closer to sheet music.
  2. Use the code from chapter_06_example_08.py, and change the program=43 in the extraction (see the sketch after this list).
  3. There are 1,116 rock songs in LMD and 3,138 songs for jazz, blues, and country. Refer to chapter_06_example_02.py and chapter_06_example_03.py to see how to make statistics with genre information.
  4. Use the RepeatSequence class in melody_rnn_pipeline_example.py.
  5. Use the code from chapter_06_example_09.py. Yes, we can train a quantized model with it since the data preparation pipeline quantizes the input.
  6. For small datasets, data augmentation plays an essential role in creating...
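A minimal sketch of the program-based extraction idea from answer 2, assuming the pretty_midi library and a hypothetical local file song.mid; the chapter's example file implements the full pipeline:

```python
import pretty_midi

midi = pretty_midi.PrettyMIDI('song.mid')  # hypothetical input file
# Keep only the instruments whose MIDI program is 43.
matching = [inst for inst in midi.instruments if inst.program == 43]
for inst in matching:
    print(inst.name, len(inst.notes))
```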

Chapter 7: Training Magenta Models

  1. See chapter_07_example_03.py.
  2. A network that underfits is a network that hasn't reached its optimum, meaning it won't predict well on the evaluation data because it fits the training data poorly (for now). It can be fixed by letting it train long enough, by adding more network capacity, or by adding more data.

  3. A network that overfits is a network that has learned to predict the input but cannot generalize to values outside of its training set. It can be fixed by adding more data, by reducing the network capacity, or by using regularization techniques such as dropout.
  4. Early stopping (see the sketch after this list).
  5. Read On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, which explains that a larger batch size leads to sharp minimizers, which in turn lead to poorer generalization. Therefore it is worse in terms of efficiency, but might...
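A minimal sketch of early stopping (answer 4), with hypothetical train_step, evaluate, and save_checkpoint placeholders standing in for a real training loop:

```python
import random

def train_step():
    pass  # one optimizer update in a real training loop

def evaluate():
    return random.random()  # evaluation loss in a real training loop

def save_checkpoint():
    pass  # persist the best weights in a real training loop

best_loss = float('inf')
patience, bad_checks = 3, 0
for step in range(10_000):
    train_step()
    if step % 100 == 0:
        loss = evaluate()
        if loss < best_loss:
            best_loss, bad_checks = loss, 0
            save_checkpoint()
        else:
            bad_checks += 1
            if bad_checks >= patience:
                break  # stop: the evaluation loss has not improved for a while
```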

Chapter 8: Magenta in the Browser with Magenta.js

  1. We can train models using TensorFlow.js, but we cannot train models using Magenta.js. We need to train the models in Magenta using Python and import the resulting models in Magenta.js.
  2. The Web Audio API enables audio synthesis in the browser using audio nodes for generation, transformation, and routing. The easiest way to use it is to use an audio framework such as Tone.js.
  3. The method is randomSample and the argument is the pitch of the generated note. As an example, using 60 will result in a single note at MIDI pitch 60, or C4 in letter notation. This is also useful as a reference for pitching the note up or down using Tone.js.

  4. The method is sample and the number of instruments depends on the model that is being used. In our example, we've used the trio model, which generates three instruments. Using a melody model will...

Chapter 9: Making Magenta Interact with Music Applications

  1. A DAW has more functions geared toward music production, such as recording, audio and MIDI editing, effects and mastering, and song composition. A software synthesizer like FluidSynth has fewer features, but has the advantage of being lightweight and easy to use.

  2. Most music software won't open MIDI ports by itself, so to send sequences back and forth between two programs we have to open the ports manually (see the sketch after this list).
  3. See the code in chapter_09_example_05.py in this chapter's code.
  4. Because re-syncing two pieces of software that have drifted apart requires restarting them; a MIDI clock keeps them in sync by sending a timing signal on each beat.
  5. Because Magenta Studio integrates with existing music production tools such as DAWs and doesn't require any technical knowledge, it makes AI-generated music available to a greater audience, which is ultimately...
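A minimal sketch of manually opening a MIDI port with mido (answer 2); the port name is an arbitrary example, and virtual ports require a backend that supports them, such as RtMidi:

```python
import mido

# Open a virtual output port that other software (a DAW, FluidSynth, ...)
# can connect to, then send a "Note On" / "Note Off" pair through it.
with mido.open_output('magenta_out', virtual=True) as port:
    port.send(mido.Message('note_on', note=60, velocity=100, channel=0))
    port.send(mido.Message('note_off', note=60, channel=0))
```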
