Appium Essentials

Appium Essentials: Explore mobile automation with Appium and discover new ways to test native, web, and hybrid applications


Product Details

Publication date : Apr 9, 2015
Length 188 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781784392482

Appium Essentials

Chapter 1. Appium – Important Conceptual Background

In this chapter, we will learn about the Appium architecture, JavaScript Object Notation (JSON) wire protocol, and Appium sessions as well as gain an understanding of the desired capabilities before starting Appium. This chapter will also touch upon the topics of the Appium server and its client library.

In short, we will cover the following topics:

  • Appium's architecture

  • The Selenium JSON wire protocol

  • Appium sessions

  • Desired capabilities

  • The Appium server and its client library

Appium architecture

Appium is an HTTP server written in Node.js that creates and handles WebDriver sessions. Like the Selenium WebDriver server, it receives HTTP requests from client libraries as JSON and then handles those requests in different ways, depending on the platform being automated.

Let's discuss how Appium works in iOS and Android.

Appium on iOS

On an iOS device, Appium uses Apple's UIAutomation API to interact with the UI elements. UIAutomation is a JavaScript library provided by Apple to write test scripts; Appium utilizes these same libraries to automate iOS apps.

Let's take a look at the architecture, which is shown in the following diagram:

In the preceding diagram, when we execute a test script, each command is sent as JSON in an HTTP request to the Appium server. The Appium server passes the command to Instruments, which looks for the bootstrap.js file that the Appium server has pushed to the iOS device. The command then executes in bootstrap.js within the Instruments environment. After the command has executed, the client sends a message back to the Appium server with the log details of the executed command.

A similar kind of architecture follows in the case of Android app automation. Let's discuss the Appium architecture for Android.

Appium on Android

On an Android device, Appium uses the UIAutomator framework to automate the apps. UIAutomator is a framework that is developed by the Android developers to test the Android user interface.

Let's take a look at the architecture, which is shown in the following diagram:

In the preceding diagram, we have UIAutomator/Selendroid in place of Apple Instruments and bootstrap.jar in place of the bootstrap.js file.

Appium uses UIAutomator on Android API level 17 and above; for earlier versions, it uses the Selendroid framework. When we execute the test scripts, Appium sends each command to UIAutomator or Selendroid on the basis of the Android version.

Here, bootstrap.jar plays the role of a TCP server, to which the test commands are sent in order to perform the corresponding actions on the Android device using UIAutomator/Selendroid.

The Selenium JSON wire protocol

The JSON wire protocol (JSONWP) is a transport mechanism created by WebDriver developers. This wire protocol is a specific set of predefined, standardized endpoints exposed via a RESTful API. The purpose of WebDriver and JSONWP is the automated testing of websites through a browser, using a browser-specific driver such as the Firefox driver, IE driver, or Chrome driver.

Appium implements the Mobile JSONWP, an extension to the Selenium JSONWP, which controls mobile-specific device behaviors, such as installing and uninstalling apps during a session.

Let's have a look at some of the endpoints from the API which are used to interact with mobile applications:

  • /session/:sessionId

  • /session/:sessionId/element

  • /session/:sessionId/elements

  • /session/:sessionId/element/:id/click

  • /session/:sessionId/source

  • /session/:sessionId/url

  • /session/:sessionId/timeouts/implicit_wait
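To make the endpoint list above concrete, here is a minimal sketch of how a client could build these paths from a session identifier. The class `JsonWireEndpoints` and its method names are hypothetical helpers for illustration only; they are not part of Appium or any client library.

```java
import java.util.Objects;

/** Illustrative helper (not part of Appium) that builds the JSONWP paths listed above. */
final class JsonWireEndpoints {
    private final String sessionId;

    JsonWireEndpoints(String sessionId) {
        this.sessionId = Objects.requireNonNull(sessionId, "sessionId");
    }

    String session()      { return "/session/" + sessionId; }
    String element()      { return session() + "/element"; }
    String elements()     { return session() + "/elements"; }
    String source()       { return session() + "/source"; }
    String url()          { return session() + "/url"; }
    String implicitWait() { return session() + "/timeouts/implicit_wait"; }

    /** Path used to click the element with the given element id. */
    String click(String elementId) {
        return element() + "/" + elementId + "/click";
    }

    public static void main(String[] args) {
        // "1234" and "7" are made-up identifiers for demonstration.
        JsonWireEndpoints endpoints = new JsonWireEndpoints("1234");
        System.out.println(endpoints.click("7"));
    }
}
```

Every path starts with the session segment, which reflects the point that a command is always sent in the context of a session.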

Appium provides client libraries similar to WebDriver that act as an interface to the REST API. These libraries have functions similar to the following method:

getPageSource()

This method issues an HTTP request to the JSONWP and gets the response from the applicable API endpoint. In this case, the API endpoint that handles the getPageSource method is as follows:

/session/:sessionId/source

The driver executes the command, which arrives in JSON format from the Appium server, to get the source, and returns the page source as a string. On non-HTML platforms (native mobile apps), the Appium library responds with an XML document representing the UI hierarchy. The specific structure of the document may vary from platform to platform.
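As a rough illustration of what a client library does for getPageSource, the following sketch builds (but does not send) the corresponding HTTP request with the JDK's `java.net.http` API (Java 11 or later). The server address `http://127.0.0.1:4723/wd/hub` is an assumption about a default local Appium setup, and `PageSourceRequest` is a hypothetical name.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch of the HTTP request behind getPageSource; names and address are assumptions. */
final class PageSourceRequest {
    // Assumed address of a locally running Appium server.
    static final String SERVER = "http://127.0.0.1:4723/wd/hub";

    /** Builds a GET request for /session/:sessionId/source without sending it. */
    static HttpRequest build(String sessionId) {
        return HttpRequest.newBuilder()
                .uri(URI.create(SERVER + "/session/" + sessionId + "/source"))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("1234"); // "1234" is a made-up session id
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would additionally need an `HttpClient` and a running server; a real client library hides all of this behind a single method call.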

Appium session

A session is a medium to send commands to the specific test application; a command is always performed in the context of a session. As we saw in the previous section, a client uses the session identifier as the sessionId parameter before performing any command. The client library requests the server to create a session. The server will then respond with a sessionId endpoint, which is used to send more commands to interact with the application(s) being tested.
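The session handshake can be sketched as follows: the client reads the session identifier out of the server's response and reuses it in later commands. The response text in `main` is a made-up example of the general JSONWP response shape, and the regular expression is a shortcut for illustration; a real client would use a proper JSON parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative extraction of the session id from a new-session response. */
final class SessionIdParser {
    // Matches "sessionId":"<value>" and captures the value.
    private static final Pattern SESSION_ID =
            Pattern.compile("\"sessionId\"\\s*:\\s*\"([^\"]+)\"");

    static Optional<String> parse(String responseBody) {
        Matcher matcher = SESSION_ID.matcher(responseBody);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical server response; the id "f4a1" is invented.
        String response = "{\"status\":0,\"sessionId\":\"f4a1\",\"value\":{}}";
        String sessionId = parse(response).orElseThrow(IllegalStateException::new);
        // Later commands are addressed to this session.
        System.out.println("/session/" + sessionId + "/source");
    }
}
```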

Desired capabilities

Desired capabilities is a JSON object (a set of keys and values) sent by the client to the server. It describes the capabilities for the automation session in which we are interested.
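To show what "a JSON object sent by the client" can look like, here is a small sketch that renders a capability map as JSON using only the JDK. `CapabilitiesJson` is a hypothetical helper, the renderer handles only flat maps of strings, numbers, and Booleans, and the capability values are sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Minimal, illustrative JSON rendering of a flat capability map. */
final class CapabilitiesJson {
    static String toJson(Map<String, Object> caps) {
        StringJoiner body = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> entry : caps.entrySet()) {
            Object value = entry.getValue();
            // Strings are quoted; numbers, Booleans, and null are written as-is.
            String rendered = (value instanceof String)
                    ? quote((String) value)
                    : String.valueOf(value);
            body.add(quote(entry.getKey()) + ":" + rendered);
        }
        return body.toString();
    }

    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("platformName", "Android");
        caps.put("deviceName", "Nexus 5");
        System.out.println(toJson(caps));
    }
}
```

In practice the client library performs this serialization itself; the sketch only makes the key/value nature of the capabilities visible.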

Let's discuss the capabilities one by one; first, we will see the Appium server's capabilities:

We need to import the org.openqa.selenium.remote.DesiredCapabilities class in Java to work with the desired capabilities.




This capability is used to define the automation engine. If you want to work with an Android SDK version less than 17, then you need to define the value as Selendroid; otherwise, the capability takes the default value as Appium. Let's see how we can implement it practically:

DesiredCapabilities caps = new DesiredCapabilities(); // creating an object
// to set capability value

We can also set the capabilities using Appium's client library. For this, users need to import the io.appium.java_client.remote.MobileCapabilityType class:


There's no need to use this capability in the case of iOS.
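The choice of automation engine described above can be sketched as a small decision function. The capability key `automationName` is an assumption, since the key itself does not appear in the text above, and `AutomationEngine` is a hypothetical helper. The resulting map could, for instance, be handed to the `DesiredCapabilities(Map)` constructor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative engine selection based on the Android API level. */
final class AutomationEngine {
    // UIAutomator is used from this API level upward, as stated in the text.
    static final int MIN_UIAUTOMATOR_API_LEVEL = 17;

    static String forApiLevel(int apiLevel) {
        return apiLevel < MIN_UIAUTOMATOR_API_LEVEL ? "Selendroid" : "Appium";
    }

    static Map<String, Object> capabilities(int apiLevel) {
        Map<String, Object> caps = new LinkedHashMap<>();
        // "automationName" is the assumed capability key.
        caps.put("automationName", forApiLevel(apiLevel));
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(capabilities(16));
        System.out.println(capabilities(19));
    }
}
```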


It is used to set the mobile OS platform. Its value can be iOS, Android, or FirefoxOS:


In case of the Appium client library, you can use this:

caps.setCapability(MobileCapabilityType.PLATFORM_NAME, "Android");


To set the mobile OS version, for example, 7.1, 4.4.4, use the following command:


Alternatively, you can use the following command as well:

caps.setCapability(MobileCapabilityType.PLATFORM_VERSION, "4.4.4");


We can define the type of mobile device or emulator to use, for example, iPhone Simulator, iPad Simulator, iPhone Retina 4-inch, Android Emulator, Moto x, Nexus 5, and so on, using the following command:

caps.setCapability("deviceName", "Nexus 5");

You can use the following command as well:

caps.setCapability(MobileCapabilityType.DEVICE_NAME,"Nexus 5");


We can add the absolute local path or remote HTTP URL of the .ipa, .apk, or .zip file. Appium will install the app binary on the appropriate device first. Note that in the case of Android, if you specify the appPackage and appActivity capabilities (both will be discussed later in this section), then the capability shown here is not required:

caps.setCapability("app", "/apps/demo/demo.apk");

Alternatively, you can use the following command:

caps.setCapability(MobileCapabilityType.APP, "/apps/demo/demo.apk");


If you want to automate mobile web applications, then you have to use this capability to define the browser.

For Safari on iOS, you can use this:

caps.setCapability("browserName", "Safari");

Also, you can use the following command:

caps.setCapability(MobileCapabilityType.BROWSER_NAME, "Safari");

For Chrome on Android, you can use this:

caps.setCapability("browserName", "Chrome");

Alternatively, you can use the following command:

caps.setCapability(MobileCapabilityType.BROWSER_NAME, "Chrome");


This capability sets how long (in seconds) Appium waits for a new command from the client before assuming that the client has quit and ending the session. The default value is 60. To set this time, you can use the following command:

caps.setCapability("newCommandTimeout", "30");

You can also use this command to end the session:



This capability is used to install and launch the app automatically. The default value is set to true. You can set the capability with the following command:



This is used to set the language on the simulator/emulator, for example, fr, es, and so on. The following command will work only on the simulator/emulator:



This is used to set the locale for the simulator/emulator, for example, fr_CA, tr_TR, and so on:



A unique device identifier (udid) is used to identify a physical iOS device. It is a 40-character value (for example, 1be204387fc072g1be204387fc072g4387fc072g). This capability is used when you are automating apps on a physical iOS device. We can easily get the device udid from iTunes by clicking on Serial Number:

caps.setCapability("udid", "1be204387fc072g1be204387fc072g4387fc072g");


This is used to start in a certain orientation in simulator/emulator only, for example, LANDSCAPE or PORTRAIT:

caps.setCapability("orientation", "PORTRAIT");


If you are automating hybrid apps and want to move directly into the Webview context, then you can set it by using this capability; the default value is false:

caps.setCapability("autoWebview", "true");


This capability tells Appium not to reset the app's state before the session starts; the default value is false:

caps.setCapability("noReset", "true");


In iOS, this will delete the entire simulator folder. In Android, you can reset the app's state by uninstalling the app instead of clearing the app data; also, it will remove the app after the session is complete. The default value is false. The following is the command for fullReset:

caps.setCapability("fullReset", "true");
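The reset options can be summarized in a small sketch. `ResetStrategy` is a hypothetical enum, not an Appium type. It uses the `noReset` and `fullReset` keys from the commands above and Boolean values instead of the string values shown there, which is an assumption about what the server accepts.

```java
import java.util.Collections;
import java.util.Map;

/** Illustrative mapping from a reset strategy to capability entries. */
enum ResetStrategy {
    /** Leave both capabilities unset and rely on the server defaults. */
    DEFAULT(Collections.<String, Object>emptyMap()),
    /** Ask Appium not to reset the app's state before the session. */
    NO_RESET(Collections.<String, Object>singletonMap("noReset", true)),
    /** Ask Appium for a full reset, as described for fullReset above. */
    FULL_RESET(Collections.<String, Object>singletonMap("fullReset", true));

    private final Map<String, Object> capabilities;

    ResetStrategy(Map<String, Object> capabilities) {
        this.capabilities = capabilities;
    }

    Map<String, Object> toCapabilities() {
        return capabilities;
    }

    public static void main(String[] args) {
        for (ResetStrategy strategy : values()) {
            System.out.println(strategy + " -> " + strategy.toCapabilities());
        }
    }
}
```

Keeping the two flags behind one enum is a design choice of this sketch; it avoids setting both capabilities at once.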

Android capabilities

Now, let's discuss the Android capabilities, as shown in the following table:




This capability is for the Java package of the Android app that you want to run:

caps.setCapability("appPackage", "");

Alternatively, you can use this command:

caps.setCapability(MobileCapabilityType.APP_PACKAGE, "");


By using this capability, you can specify the Android activity that you want to launch from your package, for example, MainActivity, .Settings, and so on:

caps.setCapability("appActivity", "");

You can also use the following command:

caps.setCapability(MobileCapabilityType.APP_ACTIVITY, "");
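The rule stated earlier, that the app capability is not required on Android when both appPackage and appActivity are given, can be sketched as a simple check. `LaunchTargetCheck` is a hypothetical helper, the check is a simplification, and the package name used in `main` is invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative check that a capability set names something to launch. */
final class LaunchTargetCheck {
    static boolean hasLaunchTarget(Map<String, ?> caps) {
        boolean hasApp = isSet(caps, "app");
        boolean hasPackageAndActivity =
                isSet(caps, "appPackage") && isSet(caps, "appActivity");
        boolean hasBrowser = isSet(caps, "browserName");
        return hasApp || hasPackageAndActivity || hasBrowser;
    }

    // A capability counts as set when it is present and not blank.
    private static boolean isSet(Map<String, ?> caps, String key) {
        Object value = caps.get(key);
        return value != null && !value.toString().trim().isEmpty();
    }

    public static void main(String[] args) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("appPackage", "com.example.demo"); // hypothetical package
        System.out.println(hasLaunchTarget(caps));  // activity still missing
        caps.put("appActivity", "MainActivity");
        System.out.println(hasLaunchTarget(caps));
    }
}
```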


Android activity for which the user wants to wait can be defined using this capability:


Alternatively, you can also use this command:



The Java package of the Android app you want to wait for can be defined using the following capability:



You can set the timeout (in seconds) while waiting for the device to be ready, as follows; the default value is 5 seconds:


Alternatively, you can also use this command:



You can enable the Chrome driver's performance logging by the use of this capability. It will enable logging only for Chrome and web view; the default value is false:

caps.setCapability("enablePerformanceLogging", "true");


To set the timeout in seconds for a device to become ready after booting, you can use the following capability:



This capability is used to set DevTools socket name. It is only needed when an app is a Chromium-embedding browser. The socket is opened by the browser and the ChromeDriver connects to it as a DevTools client, for example, chrome_DevTools_remote:



Using this capability, you can specify the name of avd that you want to launch, for example, AVD_NEXUS_5:



This capability will help you define how long you need to wait (in milliseconds) for an avd to launch and connect to the Android Debug Bridge (ADB) (the default value is 120000):



You can specify the wait time (in milliseconds) for an avd to finish its boot animations using the following capability; the default wait timeout is 120000:



To pass the additional emulator arguments when launching an avd, use the following capability, for example, netfast:
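The emulator-related settings described in the last few paragraphs can be collected in one place, as in the sketch below. The capability keys `avd`, `avdLaunchTimeout`, `avdReadyTimeout`, and `avdArgs` are assumptions, because the keys themselves do not appear in the text above; `AvdCapabilities` is a hypothetical helper. The timeout values are the defaults stated in the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative capability set for launching an Android Virtual Device. */
final class AvdCapabilities {
    // Default wait stated in the text for both launch and boot animation (ms).
    static final long DEFAULT_TIMEOUT_MS = 120_000L;

    static Map<String, Object> forAvd(String avdName, String emulatorArgs) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("avd", avdName);                          // assumed key
        caps.put("avdLaunchTimeout", DEFAULT_TIMEOUT_MS);  // assumed key
        caps.put("avdReadyTimeout", DEFAULT_TIMEOUT_MS);   // assumed key
        if (emulatorArgs != null && !emulatorArgs.isEmpty()) {
            caps.put("avdArgs", emulatorArgs);             // assumed key
        }
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(forAvd("AVD_NEXUS_5", "netfast"));
    }
}
```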



You can give the absolute local path to the WebDriver executable (if the Chromium embedder provides its own WebDriver, it should be used instead of the original ChromeDriver bundled with Appium) using the following capability:



The following capability allows you to set the time (in milliseconds) for which you need to wait for the Webview context to become active; the default value is 2000:



Intent action is basically used to start an activity, as shown in the following code. The default value is android.intent.action.MAIN. For example, android.intent.action.MAIN, android.intent.action.VIEW, and so on:



This provides the intent category that will be used to start the activity (the default is android.intent.category.LAUNCHER), for example, android.intent.category.LAUNCHER, android.intent.category.APP_CONTACTS:



Flags are used to start an activity (the default is 0x10200000), for example, 0x10200000:
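The intent-related defaults can be written out as in the following sketch. The capability keys `intentAction`, `intentCategory`, and `intentFlags` are assumptions, since the keys do not appear in the text above, and `IntentCapabilities` is a hypothetical helper. The sketch also shows that the default flag value 0x10200000 is the bitwise OR of two flag values defined on Android's Intent class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative defaults for the intent used to start an activity. */
final class IntentCapabilities {
    // Flag values as defined on android.content.Intent.
    static final int FLAG_ACTIVITY_NEW_TASK = 0x10000000;
    static final int FLAG_ACTIVITY_RESET_TASK_IF_NEEDED = 0x00200000;

    static Map<String, Object> defaults() {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("intentAction", "android.intent.action.MAIN");         // assumed key
        caps.put("intentCategory", "android.intent.category.LAUNCHER"); // assumed key
        caps.put("intentFlags", "0x10200000");                          // assumed key
        return caps;
    }

    /** Parses a hexadecimal flag string such as "0x10200000". */
    static int parseFlags(String hex) {
        return Integer.decode(hex);
    }

    public static void main(String[] args) {
        int flags = parseFlags((String) defaults().get("intentFlags"));
        System.out.println((flags & FLAG_ACTIVITY_NEW_TASK) != 0);
        System.out.println((flags & FLAG_ACTIVITY_RESET_TASK_IF_NEEDED) != 0);
    }
}
```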



You can enable Unicode input by using the following code; the default value is false:



You can reset the keyboard to its original state by using this capability. The default value is false:


iOS capabilities

Let's discuss the iOS capabilities, as shown in the following table:




This is used to set the calendar format for the iOS simulator. It applies only to a simulator, for example, Gregorian:

caps.setCapability("calendarFormat", "Gregorian");


The bundleId is used to start an app on a real device or to use other apps that require the bundleId during the test startup, for example, io.appium.TestApp:

caps.setCapability("bundleId", "io.appium.TestApp");


This is used to specify the amount of time (in milliseconds) to wait for Instruments before assuming that it has hung and the session has failed. This can be done using the following command:



This capability is used to enable location services. You can apply it only on a simulator; you can give the Boolean value, as follows:



If you want to use this capability, you must provide the bundleId by using the bundleId capability. You can use this capability on a simulator only. After setting this, the location services alert doesn't pop up. The default is to keep the current simulator setting:
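The dependency on bundleId described above can be sketched as a validation step. The capability key `locationServicesAuthorized` is an assumption, since the key does not appear in the text above; `bundleId` is the key used earlier in this section, and `LocationServicesCheck` is a hypothetical helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative check of the bundleId requirement for location authorization. */
final class LocationServicesCheck {
    static final String AUTHORIZED_KEY = "locationServicesAuthorized"; // assumed key

    /** Returns false when authorization is requested without a bundleId. */
    static boolean isValid(Map<String, ?> caps) {
        boolean wantsAuthorization = caps.containsKey(AUTHORIZED_KEY);
        boolean hasBundleId = caps.get("bundleId") != null;
        return !wantsAuthorization || hasBundleId;
    }

    public static void main(String[] args) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put(AUTHORIZED_KEY, true);
        System.out.println(isValid(caps)); // bundleId not yet provided
        caps.put("bundleId", "io.appium.TestApp");
        System.out.println(isValid(caps));
    }
}
```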



Using this capability, you can accept privacy permission alerts automatically, such as location, contacts, photos, and so on, if they arise; the default value is false:



You can use the native instruments library by setting up this capability:



This can be used to enable real web taps in Safari, which are non-JavaScript based. The default value is false. Let me warn you that this might not perfectly deal with an element; it depends on the viewport's size/ratio:



You can use this capability on a simulator only. It allows JavaScript to open new windows in Safari. The default is the current simulator setting. To do this, you can use the following command:



This capability can be used only on a simulator. It prohibits Safari from displaying a fraudulent website warning. The default value is the current simulator setting, as follows:



This capability enables Safari to open links in new windows; the default keeps the current simulator settings:



Whether you need to keep keychains (Library/Keychains) when an Appium session is started/finished can be defined using this capability. You can apply it on a simulator, as follows:



This capability allows you to pass arguments to the application under test (AUT) through Instruments, for example, myflag:



You can use this capability to delay the keystrokes sent to an element when typing. It takes the value in milliseconds:


We have seen all the desired capabilities that are used in Appium. Now, we will talk in brief about the Appium server and its client library.

The Appium server and its client libraries

The Appium server is used to interact with different platforms such as iOS and Android. It creates a session to interact with mobile apps on the target platform. It is an HTTP server written in Node.js and uses the same concept as the Selenium Server, which receives the HTTP requests from the client libraries and sends these requests to the appropriate platform. To start the Appium server, users need to download the source or install it directly from npm. Appium also provides a GUI version of the server, which you can download from the official Appium site. In the next chapter, we will discuss the GUI version in more detail.

One of the biggest advantages of Appium is that, because it is simply a REST API at its core, the code you use to interact with it can be written in a number of languages, such as Java, C#, Ruby, Python, and others. Appium extends the WebDriver client libraries and adds extra commands to them for working with mobile devices. Because of these extensions to the WebDriver protocol, it is important to use Appium-specific client libraries to write automation tests or procedures, instead of generic WebDriver client libraries.

Appium added some interesting functionality for working closely with mobile devices, such as multitouch gestures and screen orientation. We will see the practical implementation of these functionalities later.


We should now have an understanding of the Appium architecture, JSON wire protocol, desired capabilities, and its uses. We also learned about the Appium server and its language-specific client library in this chapter.

Specifically, we dove into JSONWP and Appium session, which are used to send further commands in order to interact with the application. We also set up automation sessions using the desired capabilities. In the last section, we grasped some information about the Appium server and its language-specific client libraries.

In the next chapter, we will take a look at what we require to get started with Appium.


What you will learn

  • Understand the desired capabilities that need to be acquired before starting Appium

  • Get to know and use the settings that are required to automate mobile applications

  • Interact with mobile applications by identifying elements using different locator strategies and techniques

  • Install mobile apps onto an emulator/simulator or a real device using Appium scripts

  • Test scripts written to automate applications

  • Configure mobile devices and perform automation testing on them

  • Explore the advanced features of Appium such as scroll, zoom, and swipe


Table of Contents

14 Chapters
Appium Essentials
Credits
About the Author
About the Reviewers
Preface
1. Appium – Important Conceptual Background
2. Getting Started with Appium
3. The Appium GUI
4. Finding Elements with Different Locators
5. Working with Appium
6. Driving Appium on Real Devices
7. Advanced User Interactions
Index
