You're reading from Deep Learning with PyTorch Lightning

Product type: Book
Published in: Apr 2022
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561618
Edition: 1st Edition
Author (1)
Kunal Sawarkar

Kunal Sawarkar is a chief data scientist and AI thought leader. He leads the worldwide partner ecosystem in building innovative AI products. He also serves as an advisory board member and an angel investor. He holds a master's degree from Harvard University with major coursework in applied statistics. He has been applying machine learning to solve previously unsolved problems in industry and society, with a special focus on deep learning and self-supervised learning. Kunal has led various AI product R&D labs and has 20+ patents and papers published in this field. When not diving into data, he loves doing rock climbing and learning to fly aircraft, in addition to an insatiable curiosity for astronomy and wildlife.

Automatic speech recognition using Flash

Recognizing speech in an audio file is one of the most widely used applications of AI. It powers voice assistants such as Alexa, found on smartphones and smart speakers, as well as automatically generated captions on video streaming platforms such as YouTube and features on many music platforms. A speech recognition model detects speech in an audio file and converts it into text. Detecting speech is challenging because of differences between speakers, pitch, and pronunciation, as well as dialect and the language itself:

Figure 4.6 – A concept of automatic speech recognition

To train a model for Automatic Speech Recognition (ASR), we need a training dataset: a collection of audio files, each paired with a text transcription of what is said in it. The more diverse the audio files are, covering speakers of different age groups, ethnicities, dialects, and so on, the more robust the ASR model will be on unseen audio files.
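
As a rough illustration of what such a dataset can look like, here is a minimal sketch in plain Python that pairs each audio file with its transcription and counts how many clips each dialect contributes. The `Utterance` class, the field names, the file paths, and the dialect labels are all hypothetical and chosen only for this example; they are not taken from the book or from any particular library.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass
class Utterance:
    """One training example: an audio clip plus the text spoken in it."""
    audio_path: str   # hypothetical path to a clip on disk
    transcript: str   # the text transcription of that clip
    dialect: str      # hypothetical metadata used to check diversity


def to_manifest(utterances):
    """Serialize the examples as JSON Lines, one audio/transcript pair per line."""
    return "\n".join(json.dumps(asdict(u)) for u in utterances)


def dialect_counts(utterances):
    """Count how many clips each dialect contributes to the dataset."""
    return Counter(u.dialect for u in utterances)


# Illustrative samples only; the files do not need to exist for this sketch.
samples = [
    Utterance("clips/0001.wav", "turn on the lights", "en-US"),
    Utterance("clips/0002.wav", "what is the weather today", "en-IN"),
    Utterance("clips/0003.wav", "play some music", "en-GB"),
    Utterance("clips/0004.wav", "set a timer for ten minutes", "en-US"),
]

manifest = to_manifest(samples)
counts = dialect_counts(samples)
print(manifest)
print(counts)
```

A skewed count (for example, nearly all clips from one dialect) is a hint that the trained model may be less robust on speakers who are underrepresented in the data.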

In the previous...
