Implementing form-filling dialogs
To implement form-filling dialogs, it is necessary to:
Create a data structure to represent the slots that will hold the information that the system has to elicit from the user.
Develop an algorithm to process the slots, extracting the required prompts for each of them.
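The two steps above can be sketched in plain Java. This is only an illustration under stated assumptions, not the library's code: the Slot class, the SlotFiller.fill method, the second slot ("date") and the scripted answers that stand in for speech recognition are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical slot: a name, the prompt used to ask for it, and the value once filled.
class Slot {
    final String name;
    final String prompt;
    String value; // null until the user supplies it

    Slot(String name, String prompt) {
        this.name = name;
        this.prompt = prompt;
    }
}

class SlotFiller {
    // Visits each unfilled slot in order, "asks" its prompt, and stores the next answer.
    // Scripted answers replace speech recognition (an assumption for this sketch);
    // an empty answer means nothing usable was recognized, so the prompt is repeated.
    static List<String> fill(List<Slot> slots, Deque<String> answers) {
        List<String> promptsAsked = new ArrayList<>();
        for (Slot slot : slots) {
            while (slot.value == null && !answers.isEmpty()) {
                promptsAsked.add(slot.prompt);
                String answer = answers.poll();
                if (!answer.isEmpty()) {
                    slot.value = answer;
                }
            }
        }
        return promptsAsked;
    }

    public static void main(String[] args) {
        List<Slot> slots = Arrays.asList(
                new Slot("destination", "What is your destination?"),
                new Slot("date", "When do you want to travel?")); // hypothetical second slot
        Deque<String> answers = new ArrayDeque<>(Arrays.asList("", "Rome", "Friday"));
        for (String prompt : fill(slots, answers)) {
            System.out.println(prompt);
        }
        for (Slot slot : slots) {
            System.out.println(slot.name + " = " + slot.value);
        }
    }
}
```

Keeping the prompts inside the slots means the loop itself never changes when a form gains or loses a slot.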
VoiceXML (http://www.w3.org/TR/voicexml20/) provides a useful structure for this task in terms of forms containing fields that represent the different items of information (slots) required to complete the form. The following code is an example:
The preceding example shows an app that asks for two pieces of information: destination and...
We will use XML files for various purposes in the remaining chapters and have encapsulated the common code in the XMLLib library.
One important issue involves threading. When an app is launched, a thread is created to run its code. This thread is responsible for the actions that involve updating the user interface, so it is sometimes called the UI thread. Carrying out expensive operations in the UI thread, such as downloading files, making HTTP requests, opening socket connections, or accessing databases, might block it for a long time, making the app unresponsive and freezing updates of the interface. For this reason, from Android 3.0 (Honeycomb) onwards, an attempt to perform a networking operation on the main thread of an Android app causes an android.os.NetworkOnMainThreadException to be thrown.
Android provides several ways to enable communication between background threads and the UI thread as explained here: http://developer.android.com/guide/components/processes-and...
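The idea of moving a blocking call off the calling thread can be illustrated in plain Java with an ExecutorService, so that the sketch runs outside Android. The BackgroundWork class and the fetchSlowly method, which merely sleeps to imitate a slow HTTP request, are assumptions made for the example; in an Android app the result would be handed back to the UI thread through one of the mechanisms described in the guide above.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundWork {
    // Hypothetical stand-in for a slow operation such as an HTTP request.
    static String fetchSlowly(String url) throws InterruptedException {
        Thread.sleep(200);
        return "<xml source='" + url + "'/>";
    }

    // Runs the slow call on a worker thread. Returns the worker thread's name and
    // the fetched text, so the caller can check the work ran on a different thread.
    static String[] runOffCallingThread(String url) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<String[]> pending = worker.submit(() ->
                    new String[] { Thread.currentThread().getName(), fetchSlowly(url) });
            // The calling thread could do other work here before collecting the result.
            return pending.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Background work failed", e);
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] outcome = runOffCallingThread("http://example.com/form.xml");
        System.out.println("Caller thread: " + Thread.currentThread().getName());
        System.out.println("Worker thread: " + outcome[0]);
        System.out.println("Result: " + outcome[1]);
    }
}
```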
In our library (sandra.libs.util.xmllib in the code bundle), the RetrieveXMLTask class (see RetrieveXMLTask.java) is responsible for fetching an XML file from the web and saving its content in a String to be further processed. It is declared in the following way:
It has been defined as an asynchronous task (AsyncTask) that receives a collection of Strings as input parameters, does not produce any progress values (Void), and produces a String as the result of the background computation. These are the three type parameters of AsyncTask (<Params, Progress, Result>), that is, AsyncTask<String, Void, String>. In our case, the String input is the URL from which to retrieve the XML file, and the String result is the XML code in the file. The XML file is read from the specified URL as a background task in the doInBackground method, which uses other private methods that open the HTTP connection and read the byte streams (saveXmlInString and readStream). Take a look at the doInBackground and saveXMLInString...
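As a rough idea of what the stream-reading step involves, the following plain-Java sketch copies the bytes of an InputStream into a String. The method name readStream comes from the text above, but the body, the StreamReading wrapper class and the UTF-8 decoding are assumptions, not the library's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamReading {
    // Reads the stream to its end in fixed-size chunks and decodes the bytes
    // as UTF-8 (an assumed encoding; a real XML file may declare another one).
    static String readStream(InputStream in) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            int count;
            while ((count = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read the stream", e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for the body of an HTTP response.
        byte[] body = "<form id=\"flight\"/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readStream(new ByteArrayInputStream(body)));
    }
}
```

In an AsyncTask, a method like this would be called from doInBackground, so that the blocking reads never run on the UI thread.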
To build a form-filling app, we must specify a data structure such as the one in the flight example. To do this, we define two classes: Form and Field. As shown in the UML diagram, a Form has a collection of Fields, and a Field has five attributes: a name, a string representing the prompt that the app will use to ask for the piece of data, two strings representing the prompts to be used when the app does not understand the user's response to the initial prompt (nomatch) or does not hear it (noinput), and the value that has been understood by the app.
For example, a Field in the flight setting could have the following values for its attributes:
name: Destination
prompt: What is your destination?
nomatch: Sorry, I did not understand what you said
noinput: Sorry, I could not hear you
value: Rome (when the user has said Rome in response to the system prompt)
This structure will suffice to build an app of the type we are discussing in this chapter. It is only necessary to create as many objects...
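The Form and Field structure described above might be written along the following lines. The class names and the five attributes come from the text; the constructor, the accessor names and the isComplete helper are assumptions made for this sketch and may differ from the code in the bundle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// One slot of the form, with the five attributes listed in the text.
class Field {
    private final String name;
    private final String prompt;
    private final String nomatch;
    private final String noinput;
    private String value; // stays null until the app has understood an answer

    Field(String name, String prompt, String nomatch, String noinput) {
        this.name = name;
        this.prompt = prompt;
        this.nomatch = nomatch;
        this.noinput = noinput;
    }

    String getName() { return name; }
    String getPrompt() { return prompt; }
    String getNomatch() { return nomatch; }
    String getNoinput() { return noinput; }
    String getValue() { return value; }
    void setValue(String value) { this.value = value; }
    boolean isFilled() { return value != null; }
}

// A form is an ordered collection of fields.
class Form {
    private final List<Field> fields = new ArrayList<>();

    void addField(Field field) { fields.add(field); }

    List<Field> getFields() { return Collections.unmodifiableList(fields); }

    // Hypothetical helper: true once every field has a value.
    boolean isComplete() {
        for (Field field : fields) {
            if (!field.isFilled()) {
                return false;
            }
        }
        return true;
    }
}
```

With these classes, the Destination example corresponds to a Field created with the four strings listed above, whose value is set to Rome once the user's answer has been recognized.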
To illustrate how to use the FormFillLib, we will develop an app that asks the user for the pieces of data necessary to query a web service. The relations between the classes in the app and the libraries described in this chapter are shown in the following class diagram:
Apps are no longer standalone, isolated applications; they usually combine their own resources with data and functionality gathered from third-party web services. Many web applications have published APIs (Application Programming Interfaces) that allow interested developers to use them in their own apps. This integration can be as complex as desired and can involve multiple sources; such combinations are known as mashups. For example, a travel mashup can integrate Google Maps to indicate geographical locations with Flickr to show pictures of the relevant tourist attractions, while at the same time checking for good restaurants in Foodspotting.
A list of a wide range of available APIs can be found at: http://www.programmableweb...
This chapter has shown how to implement form-filling dialogs, in which the app engages in a simple conversation with the user to obtain several pieces of data that can later be used to provide advanced functionality through web services or mashups.
The FormFillLib contains the classes to retrieve and parse an XML definition of the dialog structure into Java objects. These objects are employed to control the oral interaction with the user. This library makes it possible to easily build any form-filling dialog in an Android app by specifying its structure in a simplified VoiceXML file accessible on the Internet.
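To give an impression of what parsing such a definition involves, the sketch below reads a small VoiceXML-style form with the DOM parser that ships with the JDK. It is not the FormFillLib code, whose parser may work differently: the VxmlSketch class, the parseFields method and the sample document are assumptions, although form, field, prompt, nomatch and noinput are standard VoiceXML 2.0 elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class VxmlSketch {
    // Hypothetical dialog definition with a single field.
    static final String SAMPLE =
            "<form id=\"flight\">"
          + "<field name=\"destination\">"
          + "<prompt>What is your destination?</prompt>"
          + "<nomatch>Sorry, I did not understand what you said</nomatch>"
          + "<noinput>Sorry, I could not hear you</noinput>"
          + "</field>"
          + "</form>";

    // Returns one array per <field> element: {name, prompt, nomatch, noinput}.
    static List<String[]> parseFields(String vxml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(vxml.getBytes(StandardCharsets.UTF_8)));
            NodeList fieldNodes = doc.getElementsByTagName("field");
            List<String[]> fields = new ArrayList<>();
            for (int i = 0; i < fieldNodes.getLength(); i++) {
                Element field = (Element) fieldNodes.item(i);
                fields.add(new String[] {
                        field.getAttribute("name"),
                        childText(field, "prompt"),
                        childText(field, "nomatch"),
                        childText(field, "noinput") });
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the dialog definition", e);
        }
    }

    // Text of the first descendant element with the given tag, or "" if there is none.
    private static String childText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() == 0 ? "" : matches.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        for (String[] field : parseFields(SAMPLE)) {
            System.out.println(field[0] + ": " + field[1]);
        }
    }
}
```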
The MusicBrain app shows how to use the library to gather information from the user through a spoken conversation in order to query a web service. In this case, the app asks the user for a word and two dates, which are used to query the MusicBrainz open music encyclopedia for albums that have the word in their title and were released between the specified dates. The...