Practical Guide to Azure Cognitive Services

Product type Book
Published in May 2023
Publisher Packt
ISBN-13 9781801812917
Pages 454 pages
Edition 1st Edition
Authors (3): Chris Seferlis, Christopher Nellis, Andy Roberts

Table of Contents (22) Chapters

Preface
Part 1: Ocean Smart – an AI Success Story
  • Chapter 1: How Azure AI Changed Ocean Smart
  • Chapter 2: Why Azure Cognitive Services?
  • Chapter 3: Architectural and Cost Optimization Considerations
Part 2: Deploying Next-Generation Knowledge Mining Solutions with Azure Cognitive Search
  • Chapter 4: Deriving Value from Knowledge Mining Solutions in Azure
  • Chapter 5: Azure Cognitive Search Overview and Implementation
  • Chapter 6: Exploring Further Azure Cognitive Services for Successful KM Solutions
  • Chapter 7: Pulling It All Together for a Complete KM Solution
Part 3: Other Cognitive Services That Will Help Your Company Optimize Operations
  • Chapter 8: Decluttering Paperwork with Form Recognizer
  • Chapter 9: Identifying Problems with Anomaly Detector
  • Chapter 10: Streamlining the Quality Control Process with Custom Vision
  • Chapter 11: Deploying a Content Moderator
  • Chapter 12: Using Personalizer to Cater to Your Audience
  • Chapter 13: Improving Customer Experience with Speech to Text
  • Chapter 14: Using Language Services in Chat Bots and Beyond
  • Chapter 15: Surveying Our Progress
  • Chapter 16: Appendix – Azure OpenAI Overview
Index
Other Books You May Enjoy

Improving Customer Experience with Speech to Text

When a customer calls the Ocean Smart customer service line, they are anticipating a friendly and helpful associate who will solve their current challenge. With years of exceptional service and quality being provided to customers, this is the expectation that has been built. However, with massive growth and global expansion, monitoring this quality is not as easy as it once was. Ocean Smart hoped that AI could help provide a solution and a way to track the quality of the customer service that was being provided.

A great customer experience is becoming more and more critical for successful businesses in this climate of on-demand everything. If a person has a not-so-great experience with a company, they’re sure to let the world know as quickly as possible using as many social media outlets as possible. Because of this, Ocean Smart wanted a better system for improving how customer calls were handled and wanted to set a precedent...

Technical requirements

As with previous chapters and deployments, there are some requirements you’ll need in order to build the examples yourself. First is an Azure account with at least Contributor rights, if you are not an owner of the subscription, so that you can deploy resources. To help reduce development costs, you can use Visual Studio Code as your editor. You can download it for Windows, Linux, or macOS here: https://code.visualstudio.com/download. For the examples shown later in the chapter, you will need several extensions in Visual Studio Code, which can be installed directly within the tool. These extensions are as follows:

Overview of Azure Speech services

Azure Speech services are a collection of APIs for converting speech to text, converting text to speech, translating speech, and other related tasks. When we consider the importance of speech in any business, and the ability to improve communications for accessibility and cultural reasons, it is easy to position these capabilities as transformative. As organizations become globalized and language translation becomes an everyday part of internal and external communications, Microsoft has made significant effort and investment to support the most popular languages worldwide. In this chapter, you will learn how those investments have evolved into solutions that significantly close communication gaps, which has led to an enhanced customer service experience for Ocean Smart customers.

In the Ocean Smart example, we are taking audio recordings from voice messages and...

Working with real-time audio and batch speech-to-text data

Now that we have provided an overview of the various services you can leverage and the use cases you can expect to deploy using the Speech services, we will explore them in more depth to better position our example of building a customer service feedback system. Due to the nature of our example, we will focus on how to use batch audio transcription services; however, because real-time transcription has so many applications, we will explore both options in this section, as the approaches are vastly different.
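To make the batch side of this comparison more concrete, the following minimal sketch builds (but does not send) a request that creates a batch transcription job, using only the Python standard library. It assumes the Speech to text REST API v3.1 endpoint shape; the region, key, and audio URL are placeholders, and the helper name build_batch_request is hypothetical and does not come from the book's own code.

```python
import json
import urllib.request


def build_batch_request(region, key, audio_urls, locale="en-US",
                        display_name="call-center-batch"):
    """Build (without sending) a request that creates a batch transcription job.

    Assumption: the Speech to text REST API v3.1 'transcriptions' endpoint.
    """
    url = (f"https://{region}.api.cognitive.microsoft.com"
           "/speechtotext/v3.1/transcriptions")
    body = {
        "contentUrls": list(audio_urls),  # URLs of audio files, e.g. in Blob Storage
        "locale": locale,                 # language of the recordings
        "displayName": display_name,      # label for the job
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    request = build_batch_request(
        "eastus",
        "<your-speech-key>",
        ["https://example.blob.core.windows.net/audio/call-001.wav"],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would submit the job, after which the service is expected to return a job record that is polled until it finishes. Real-time transcription, by contrast, typically streams audio through the Speech SDK and returns results as the audio arrives, which is why the two approaches are structured so differently.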

In the case of a call center, and improving the customer service process, there could be applications for real-time feedback to be provided to the customer service agent. This could provide a sentiment score as the conversation is happening based on the words being used within the conversation; however, with the nature of any conversation, the tone could change very quickly from positive to negative and could cause a distraction...

Improving speech-to-text accuracy with Custom Speech

Even though the Microsoft research and development teams have received tremendous acclaim for their work in developing groundbreaking machine learning technology for transcribing speech to text, they are aware that the base model cannot capture every business domain-specific detail. For this reason, they have provided the ability for customers to augment the base machine learning model with domain-specific terms directly related to the customer’s business. This portion of the chapter will focus on how to work with and deploy these custom models for use in your organization.

To build your augmented model, you will use the Speech Studio, which can be found at https://speech.microsoft.com/portal. After you have logged in with your Azure account, you will be presented with several options for working with various speech operations, including the following:

  • Speech-to-text
  • Text-to-speech
  • Voice assistant
  • Additional resources...

Working with different languages

As we have previously discussed in this book, globalization over the past 30-50 years has driven an evolution in technology unlike any previously imagined. Moore’s law observes that the number of transistors in a dense integrated circuit doubles about every two years, and this roughly held true from Gordon Moore’s revised prediction in 1975 until very recently, when it was declared no longer possible (Wikipedia, https://en.wikipedia.org/wiki/Moore%27s_law). As one result, we have seen a massive proliferation of technology to help humans adapt to the challenges of globalization, and we cannot overlook the language barriers faced by international travelers and companies. Even more compelling than simple text translation with a search engine is the ability to translate the spoken word on the fly using that technology. For this reason and many more, Microsoft has made significant...

Building a complete batch solution

In this chapter, we use the Speech service to transcribe audio files that are sent to Azure Blob Storage. Once the transcript file is created, we can then choose to perform other downstream activities, such as extracting sentiment from the document and tracking the results. The following diagram shows the process that we follow to monitor the storage account, begin the async request to start the transcription, and, once the transcription is complete, write the results file to Azure Blob Storage:

Figure 13.8 – Process outline for creating a transcript from an audio file


To support this process, we create the following:

  • Azure Cognitive Speech service to perform the transcription activity.
  • Azure Storage account:
    • Blob container to store audio files.
    • Blob container to store transcription results.
    • Storage table to track transcription job progress.
  • Azure Functions App:
    • One function to monitor the audio...
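As a rough illustration of how the tracking table in the list above might be used, the following sketch models one tracked job in memory and decides what a polling function should do next. It is not the book's implementation: the TranscriptionJob fields and the next_action and apply_status helpers are hypothetical, and the status strings are assumed to match those reported by the batch transcription API (NotStarted, Running, Succeeded, Failed).

```python
from dataclasses import dataclass

# Assumption: status values as reported by the batch transcription API.
TERMINAL_STATUSES = {"Succeeded", "Failed"}


@dataclass
class TranscriptionJob:
    """One row of the (hypothetical) tracking table."""
    audio_blob: str          # name of the blob in the audio container
    job_url: str             # URL returned when the transcription was created
    status: str = "NotStarted"
    result_blob: str = ""    # filled in once the transcript has been written


def next_action(job):
    """Decide what the polling function should do with a tracked job."""
    if job.status == "Succeeded" and not job.result_blob:
        return "write_results"   # transcript is ready but not yet stored
    if job.status in TERMINAL_STATUSES:
        return "done"            # nothing more to do for this job
    return "poll"                # still queued or running


def apply_status(job, reported_status):
    """Record the status reported by the service and return the next action."""
    job.status = reported_status
    return next_action(job)


if __name__ == "__main__":
    job = TranscriptionJob("call-001.wav", "https://example.invalid/transcriptions/1")
    for reported in ("Running", "Succeeded"):
        print(reported, "->", apply_status(job, reported))
    job.result_blob = "call-001.json"
    print("after writing results ->", next_action(job))
```

In a deployed solution, the row would live in the storage table rather than in memory, and the "write_results" action would correspond to saving the transcript to the results container.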

Summary

With that, you now have the ability to use the Speech service to create a transcript from an audio conversation or to capture a live audio transcript and display it in real time for captioning and other uses. From there, you have the opportunity to track the quality of calls using the sentiment skill available with Language services, giving your organization the means to greatly enhance both the customer service experience and its training tools. These capabilities are some of the more prevalent examples of Cognitive Services tools applied to real-world scenarios, but they are just a small portion of the overall capabilities of the Speech and Language services. Be sure to use examples such as the one laid out in this chapter, and think critically about which other skills offered within the services, as well as enhancements applied over time, might be beneficial to your organization. Be mindful of the limitations of the service we discussed...
