Voice Application Development for Android

Product type Book
Published in Nov 2013
Publisher Packt
ISBN-13 9781783285297
Pages 134 pages
Edition 1st Edition
Chapter 5. Form-filling Dialogs

Many speech-enabled apps use one-shot dialogs like the ones described in the previous chapter. Do you feel that speech interfaces can go further than that? Can you imagine more complex interactions in which several items of information have to be elicited from the user for a wide variety of purposes, for example, to launch apps, query databases, start web services or web service mashups, and a lot more?

These types of dialog are similar to form-filling in a traditional web application. By the end of this chapter you should be able to implement simple form-filling dialogs in order to obtain the data necessary to access a web service.

Form-filling dialogs


A form-filling dialog can be seen in terms of a number of slots to be filled. For example, in the case of a flight booking app, the system may have to fill five slots: destination, arrival date, arrival time, departure date, and departure time. In a simple form-filling dialog the slots are processed one at a time and the relevant questions are asked until all the slots have been filled. At that point the app can look up the required flight and present the results to the user. The following is an example of how a dialog might proceed and how the status of the slots changes as the dialog progresses.

App: Welcome to the Flight Information Service. Where would you like to travel to?

Caller: London.

Slot              Value
Destination       London
Arrival date      unknown
Arrival time      unknown
Departure date    unknown
Departure time    unknown

App: What date would you like to fly to London?

Caller: The 10th of July.

Implementing form-filling dialogs


In order to implement form-filling dialogs it is necessary to:

  • Create a data structure to represent the slots that will hold the information that the system has to elicit from the user.

  • Develop an algorithm to process the slots, extracting the required prompts for each of them.
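
The two steps above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class SlotFiller and its methods addSlot and fill are hypothetical names, not part of the book's libraries, and the ask function stands in for speaking a prompt and recognizing the user's answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: SlotFiller and its method names are illustrative only.
class SlotFiller {
    // Slot name -> value; null means the slot is still unknown.
    // LinkedHashMap keeps the slots in the order in which they were added.
    private final Map<String, String> values = new LinkedHashMap<>();
    // Slot name -> question used to elicit the value.
    private final Map<String, String> prompts = new LinkedHashMap<>();

    void addSlot(String name, String prompt) {
        prompts.put(name, prompt);
        values.put(name, null);
    }

    // Visits the slots one at a time and repeats a slot's prompt until it is filled.
    // 'ask' returns the understood answer, or null if nothing usable was understood.
    // Note: this simple loop never gives up; a real app would limit the retries.
    Map<String, String> fill(Function<String, String> ask) {
        for (Map.Entry<String, String> slot : values.entrySet()) {
            while (slot.getValue() == null) {
                slot.setValue(ask.apply(prompts.get(slot.getKey())));
            }
        }
        return values;
    }
}
```

Once fill returns, every slot has a value and the app can go on to look up the flight. Keeping the prompts separate from the values means the same loop can serve any form.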

VoiceXML (http://www.w3.org/TR/voicexml20/) provides a useful structure for this task in terms of forms containing fields that represent the different items of information (slots) required to complete the form. The following code is an example:

  <form id="flight">
    <field name="destination">
      <prompt>where would you like to travel to?</prompt>
      <grammar src="destinations.grxml"/>
    </field>
    <field name="date">
      <prompt>what day would you like to travel?</prompt>
      <grammar src="days.grxml"/>
    </field>
  </form>

The preceding example shows an app that asks for two pieces of information: destination and...
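
To make the link between such a form and Java objects concrete, the following sketch reads the field names and prompts from a VoiceXML-style string. It is a minimal illustration, not the parser used in the book's libraries: the class VxmlFormReader and the method readPrompts are hypothetical names, and the standard DOM parser (javax.xml.parsers) is used here simply because it runs on any JVM.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: the name and behavior are assumptions made for illustration.
class VxmlFormReader {
    // Returns field name -> prompt text, in document order.
    static Map<String, String> readPrompts(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> prompts = new LinkedHashMap<>();
        NodeList fields = doc.getElementsByTagName("field");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            // Take the first <prompt> found inside the field, if there is one.
            NodeList promptNodes = field.getElementsByTagName("prompt");
            String prompt = promptNodes.getLength() > 0
                    ? promptNodes.item(0).getTextContent().trim()
                    : "";
            prompts.put(field.getAttribute("name"), prompt);
        }
        return prompts;
    }
}
```

The grammar elements are ignored in this sketch; only the name attribute and the prompt text of each field are extracted.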

Threading


We will use XML files for various purposes in the remaining chapters and have encapsulated the common code in the XMLLib library. One important issue involves threading. When an app is launched, a thread is created to run its code. This thread is responsible for the actions that involve updating the user interface, so it is sometimes called the UI thread. Carrying out expensive operations in the UI thread, such as downloading files, making HTTP requests, opening socket connections, or accessing databases, might block it for a long time, making the app unresponsive and freezing updates of the interface. For this reason, from Android 3.0 (Honeycomb) onwards, android.os.NetworkOnMainThreadException is thrown when an app tries to perform a networking operation on its main thread.
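
The general idea of moving slow work off the calling thread can be sketched with the standard java.util.concurrent classes. This is a plain-Java stand-in, not Android code: the class BackgroundRunner is a hypothetical name, and on Android the callback would additionally have to be handed back to the UI thread before touching any views.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Android or of the book's libraries.
class BackgroundRunner {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Runs slowWork on the worker thread and passes its result to onDone.
    // In this sketch onDone also runs on the worker thread; an Android app
    // would post it to the UI thread instead.
    <T> Future<?> run(Supplier<T> slowWork, Consumer<T> onDone) {
        return worker.submit(() -> onDone.accept(slowWork.get()));
    }

    // Lets the worker thread finish its queued tasks and then stop.
    void shutdown() {
        worker.shutdown();
    }
}
```

The caller returns immediately from run, so the thread that started the work stays free to keep the interface responsive while the download or query proceeds.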

Android provides several ways to enable communication between background threads and the UI thread as explained here: http://developer.android.com/guide/components/processes-and...

XMLLib


In our library (sandra.libs.util.xmllib in the code bundle), the RetrieveXMLTask (see RetrieveXMLTask.java) is responsible for fetching an XML file from the web and saving its content in a String to be further processed. It is declared in the following way:

  class RetrieveXMLTask extends AsyncTask<String, Void, String>

It has been defined as an asynchronous task (AsyncTask), whose three type parameters are <parameters, progress, result>. It receives a collection of Strings as input parameters, does not publish any progress values (Void), and produces a String as the result of the background computation. In our case, the String input is the URL from which to retrieve the XML file, and the String result is the XML code in the file. The XML file is read from the specified URL as a background task in the doInBackground method, which uses other private methods to open the HTTP connection and read the byte streams (saveXmlInString and readStream). Take a look at the doInBackground and saveXMLInString...
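
As a rough idea of what reading a byte stream into a String involves, here is a minimal plain-Java sketch. It is not the library's readStream method: the class StreamText and the method readAll are hypothetical names, and UTF-8 is assumed as the encoding of the XML file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; the encoding (UTF-8) is an assumption.
class StreamText {
    // Reads the whole stream as text and closes it when done.
    static String readAll(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int n;
            // read returns the number of chars read, or -1 at the end of the stream.
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }
}
```

In an asynchronous task, a method like this would be called from the background computation with the input stream of the HTTP connection, so that the potentially slow read never runs on the UI thread.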

FormFillLib


To build a form-filling app, we must specify a data structure such as the one in the flight example. To do this, we define two classes: Form and Field. As shown in the UML diagram, a Form has a collection of Fields, and a Field has five attributes: a name; a string representing the prompt that the app will use to ask for the piece of data; two strings representing the prompts to be used when the app does not understand the user's response to the initial prompt (nomatch) or does not hear it (noinput); and the value that has been understood by the app.

For example, a Field in the flight form could have the following values for its attributes:

name: Destination

prompt: What is your destination?

nomatch: Sorry, I did not understand what you said

noinput: Sorry, I could not hear you

value: Rome (when the user has said Rome in response to the system prompt)
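
The five attributes listed above map naturally onto a small Java class. The sketch below is an illustration only and is not the code of the FormFillLib: the constructor, the Outcome enum, and the methods messageAfter and nextUnfilled are hypothetical, and repeating the original prompt after a nomatch or noinput message is an assumption about how the app might behave.

```java
import java.util.ArrayList;
import java.util.List;

// Outcome of the most recent recognition attempt for a field (hypothetical helper type).
enum Outcome { FIRST_ASK, NO_MATCH, NO_INPUT }

// Sketch of a field with the five attributes described in the text.
class Field {
    final String name;
    final String prompt;
    final String nomatch;
    final String noinput;
    String value; // null until the app has understood an answer

    Field(String name, String prompt, String nomatch, String noinput) {
        this.name = name;
        this.prompt = prompt;
        this.nomatch = nomatch;
        this.noinput = noinput;
    }

    // Chooses what to say next; after a failure the prompt is repeated (an assumption).
    String messageAfter(Outcome last) {
        switch (last) {
            case NO_MATCH: return nomatch + " " + prompt;
            case NO_INPUT: return noinput + " " + prompt;
            default:       return prompt;
        }
    }
}

// Sketch of a form as an ordered collection of fields.
class Form {
    private final List<Field> fields = new ArrayList<>();

    void addField(Field field) {
        fields.add(field);
    }

    // Returns the first field that still has no value, or null when the form is complete.
    Field nextUnfilled() {
        for (Field field : fields) {
            if (field.value == null) {
                return field;
            }
        }
        return null;
    }
}
```

With this shape, the dialog loop only needs to call nextUnfilled, speak the message chosen by messageAfter, and store the recognized answer in value until nextUnfilled returns null.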

This structure will suffice to build an app of the type we are discussing in this chapter. It is only necessary to create as many objects...

MusicBrain app


To illustrate how to use the FormFillLib we will develop an app that asks the user for the pieces of data necessary to query a web service. The relations between the classes in the app and the libraries described in this chapter are shown in the following class diagram:

Apps are no longer standalone, isolated applications; usually they combine their own resources with data and functionality gathered from third-party web services. Many web applications now publish APIs (Application Programming Interfaces) that allow interested developers to use them in their own apps. This integration can be as complex as desired and can involve multiple sources, in which case the result is known as a mashup. For example, a travel mashup can integrate Google Maps to indicate geographical locations with Flickr to show pictures of the relevant tourist attractions, while at the same time checking for good restaurants in FoodSpotting.

A list of a wide range of available APIs can be found at: http://www.programmableweb...

Summary


This chapter has shown how to implement form-filling dialogs in which the app engages in simple conversations with the user in order to retrieve several pieces of data which can later be used to provide advanced functionalities to the user through web services or mashups.

The FormFillLib contains the classes to retrieve and parse an XML definition of the dialog structure into Java objects. These objects are employed to control the oral interaction with the user. This library makes it possible to easily build any form-filling dialog in an Android app by specifying its structure in a simplified VoiceXML file accessible on the Internet.

The MusicBrain app shows how to use the library to gather information from the user through a spoken conversation to query a web service. In this case, the app asks the user for a word and two dates, which are used to query the MusicBrainz open music encyclopedia for albums that have the word in their title and were released between the specified dates. The...

