Voice Interaction and Android Marshmallow

Raka Mahesa

July 01st, 2016

"Jarvis, play some music."

You might imagine that to be a quote from some Iron Man stories (and hey, that might be an actual quote), but if you replace the "Jarvis" part with "OK Google," you'll get an actual line that you can speak to your Android phone right now that will open a music player and play a song. Go ahead and try it out yourself. Just make sure you're on your phone's home screen when you do it.

This feature is called Voice Action, and it was introduced back in 2010, though at the time it only worked with certain apps. However, Voice Action only accepts a single-line voice command, unlike Jarvis, who usually engages in a conversation with his master. For example, if you ask Jarvis to play music, he will probably reply by asking what music you want to play. Fortunately, this type of conversation is no longer limited to movies or comic books, because with Android Marshmallow, Google has introduced an API for that: the Voice Interaction API.

As the name implies, the Voice Interaction API enables you to add voice-based interaction to your app. When implemented properly, users will be able to command their phone to do a particular task without any touch interaction, just by having a conversation with it. Pretty similar to Jarvis, isn't it?

So, let's try it out!

One thing to note before beginning: the Voice Interaction API can only be activated if the app is launched using Voice Action. This means that if the app is opened from the launcher via touch, the API will return a null object and cannot be used in that instance. So let's cover a bit of Voice Action first before we delve further into using the Voice Interaction API.

Requirements

To use the Voice Interaction API, you need:

  • Android Studio v1.0 or above
  • Android 6.0 (API 23) SDK
  • A device with Android Marshmallow installed (optional)

Voice Action

Let's start by creating a new project with a blank activity. You won't use the app's interface, and you can rely on log output to check what the app does, so it's fine to have an activity with no user interface here.

Okay, you now have the activity. Let's give the user the ability to launch it using a voice command. For this app, let's pick a simple one: "take a picture." This can be achieved by adding an intent filter to the activity. Add these lines to your app manifest file, below the original intent filter of your app activity.

<intent-filter>
    <action android:name="android.media.action.STILL_IMAGE_CAMERA" />
    <category android:name="android.intent.category.DEFAULT" />
    <category android:name="android.intent.category.VOICE" />
</intent-filter>

These lines will notify the operating system that your activity should be triggered when a certain voice command is spoken. The action "android.media.action.STILL_IMAGE_CAMERA" is associated with the "take a picture" command, so to activate the app using a different command, you need to specify a different action. Check out this list if you want to find out what other commands are supported.

And that's all you need to do to implement Voice Action for your app. Build the app and run it on your phone. Now, when you say "OK Google, take a picture," your activity will show up.

Voice Interaction

All right, let's move on to Voice Interaction.

When the activity is created, before you start the voice interaction part, you must always check whether the activity was started from Voice Action and whether the VoiceInteractor service is available. To do that, call the activity's isVoiceInteraction() method and check the returned value. If it returns true, the service is available for you to use.
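
The check described above boils down to two conditions: the activity was launched by voice, and the interactor object is not null. Here is a minimal sketch of that logic in plain Java. The VoiceHost interface and the VoiceLaunchGuard class are hypothetical stand-ins, introduced only so the logic can be read and run without the Android SDK. In a real app you would call isVoiceInteraction() and getVoiceInteractor() on the activity itself.

```java
// Hypothetical helper, not part of the Android SDK.
final class VoiceLaunchGuard {

    /** Stand-in for the two Activity methods this article relies on (an assumption for illustration). */
    interface VoiceHost {
        boolean isVoiceInteraction();
        Object getVoiceInteractor();
    }

    /** Returns true only when the host was launched by voice and an interactor is available. */
    static boolean canStartConversation(VoiceHost host) {
        return host.isVoiceInteraction() && host.getVoiceInteractor() != null;
    }

    private VoiceLaunchGuard() { }
}
```

If the check fails, the app was most likely opened by touch, so it should skip the conversation and fall back to its regular behavior.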

Let's say you want your app to first ask users which side they are on, then change the app background color accordingly. If the user chooses the dark side, the color will be black, but if the user chooses the light side, the color will be white. Sounds like a simple and fun app, doesn't it?

So first, let’s define what options are available for users to choose. You can do this by creating an instance of VoiceInteractor.PickOptionRequest.Option for each available choice. Note that you can associate more than one word with a single option, as can be seen in the following code.

// Option 0: the light side, with two extra words that select it
VoiceInteractor.PickOptionRequest.Option option1 = new VoiceInteractor.PickOptionRequest.Option("Light", 0);
option1.addSynonym("White");
option1.addSynonym("Jedi");
// Option 1: the dark side
VoiceInteractor.PickOptionRequest.Option option2 = new VoiceInteractor.PickOptionRequest.Option("Dark", 1);
option2.addSynonym("Black");
option2.addSynonym("Sith");
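
To make the label, synonym, and index relationship concrete, here is a small plain-Java model of it. This is only an illustration: SideOption is a hypothetical class, not the Android one, and the exact-word, case-insensitive matching is an assumption. On a device, the voice interaction service does the actual speech matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical model of an option with synonyms; NOT VoiceInteractor.PickOptionRequest.Option.
final class SideOption {
    private final int index;
    private final List<String> words = new ArrayList<>();

    SideOption(String label, int index) {
        this.index = index;
        words.add(normalize(label));
    }

    /** Registers another word that selects this option; returns this for chaining. */
    SideOption addSynonym(String synonym) {
        words.add(normalize(synonym));
        return this;
    }

    int getIndex() {
        return index;
    }

    /** Assumed matching rule: exact word match, ignoring case and surrounding spaces. */
    boolean matches(String spoken) {
        return words.contains(normalize(spoken));
    }

    /** Returns the index of the first option that matches the spoken word, or -1 if none does. */
    static int pick(String spoken, SideOption... options) {
        for (SideOption option : options) {
            if (option.matches(spoken)) {
                return option.getIndex();
            }
        }
        return -1;
    }

    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }
}
```

With the two options from this article, the words "Light," "White," and "Jedi" would all resolve to index 0, while "Dark," "Black," and "Sith" would resolve to index 1.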

The next step is to define a Voice Interaction request and tell the VoiceInteractor service to execute that request. For this app, use PickOptionRequest as the request object. You can check out other request types on this page.

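// Collect the options and the question the phone will ask
VoiceInteractor.PickOptionRequest.Option[] options = new VoiceInteractor.PickOptionRequest.Option[] { option1, option2 };
VoiceInteractor.Prompt prompt = new VoiceInteractor.Prompt("Which side are you on?");

// Submit the request; the callbacks are overridden in the anonymous subclass
getVoiceInteractor().submitRequest(new VoiceInteractor.PickOptionRequest(prompt, options, null) {
     //Handle each option here
});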

Finally, determine what to do based on the choice picked by the user by overriding the request's callbacks inside the anonymous class above. This time, we'll simply check the index of the selected option and change the app background color based on that (we won't delve into how to change the app background color here; let's leave it for another occasion).

@Override
public void onPickOptionResult(boolean finished, Option[] selections, Bundle result) {
    if (finished && selections.length == 1) {
        if (selections[0].getIndex() == 0) changeBackgroundToWhite();
        else if (selections[0].getIndex() == 1) changeBackgroundToBlack();
    }
}

@Override
public void onCancel() {
    closeActivity();
}
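
The helpers changeBackgroundToWhite() and changeBackgroundToBlack() are left undefined in this article. As a rough sketch of one way to fill them in, the hypothetical SideColors class below maps the selected option index to an ARGB color value in plain Java. The two constants have the same values as android.graphics.Color.WHITE and android.graphics.Color.BLACK.

```java
// Hypothetical helper, not part of the Android SDK.
final class SideColors {
    static final int WHITE = 0xFFFFFFFF; // same value as android.graphics.Color.WHITE
    static final int BLACK = 0xFF000000; // same value as android.graphics.Color.BLACK

    /** Maps the option index used in this article (0 = light, 1 = dark) to an ARGB color. */
    static int colorForSide(int optionIndex) {
        switch (optionIndex) {
            case 0:
                return WHITE;
            case 1:
                return BLACK;
            default:
                throw new IllegalArgumentException("Unknown option index: " + optionIndex);
        }
    }

    private SideColors() { }
}
```

Inside the activity, the returned value could then be applied with something like getWindow().getDecorView().setBackgroundColor(SideColors.colorForSide(index)), though any view in your layout would work just as well.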

And that's it! When you run your app on your phone, it should ask which side you're on if you launch it using Voice Action. You've only learned the basics here, but this should be enough to add a little voice interactivity to your app. And if you ever want to create a Jarvis version, you just need to add "sir" to every question your app asks.

About the author

Raka Mahesa is a game developer at Chocoarts who is interested in digital technology in general. Outside of work hours, he likes to work on his own projects, with Corridoom VR being his latest released game. Raka also tweets regularly as @legacy99.