In this chapter, we will learn about the Appium architecture, JavaScript Object Notation (JSON) wire protocol, and Appium sessions as well as gain an understanding of the desired capabilities before starting Appium. This chapter will also touch upon the topics of the Appium server and its client library.
In short, we will cover the following topics:
Appium's architecture
The Selenium JSON wire protocol
Appium sessions
Desired capabilities
The Appium server and its client library
Appium is an HTTP server written in Node.js that creates and handles WebDriver sessions. The Appium web server follows the same approach as Selenium WebDriver: it receives HTTP requests carrying JSON from client libraries and then handles those requests in different ways, depending on the platform it is running on.
Let's discuss how Appium works in iOS and Android.
On an iOS device, Appium uses Apple's UIAutomation API to interact with the UI elements. UIAutomation is a JavaScript library provided by Apple to write test scripts; Appium utilizes these same libraries to automate iOS apps.
Let's take a look at the architecture, which is shown in the following diagram:
In the preceding diagram, when we execute the test scripts, the commands go to the Appium server as JSON in HTTP requests. The Appium server sends each command to Instruments, which looks for the bootstrap.js file that the Appium server pushes to the iOS device. The commands then execute in the bootstrap.js file within the Instruments environment. After a command has executed, the client sends a message back to the Appium server with the log details of the executed command.
A similar kind of architecture follows in the case of Android app automation. Let's discuss the Appium architecture for Android.
On an Android device, Appium uses the UIAutomator framework to automate the apps. UIAutomator is a framework that is developed by the Android developers to test the Android user interface.
Let's take a look at the architecture, which is shown in the following diagram:
In the preceding diagram, we have UIAutomator/Selendroid in place of Apple Instruments and bootstrap.jar in place of the bootstrap.js file.
Appium uses UIAutomator on Android API level 17 and above; for earlier versions, it uses the Selendroid framework. When we execute the test scripts, Appium sends each command to UIAutomator or Selendroid, depending on the Android version of the device.
Here, bootstrap.jar plays the role of a TCP server, to which we can send test commands in order to perform actions on the Android device using UIAutomator/Selendroid.
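The version-based choice between UIAutomator and Selendroid described above can be sketched in a few lines of Java. This is only an illustration of the decision rule stated in the text; the class and method names are hypothetical and are not taken from Appium's source code.

```java
// Illustrative sketch only: AndroidBackendChooser is a hypothetical name,
// not a class from Appium itself.
final class AndroidBackendChooser {

    // Threshold taken from the text: 17 and above use UIAutomator.
    static final int MIN_UIAUTOMATOR_API_LEVEL = 17;

    // Returns the name of the automation framework for a given API level.
    static String chooseBackend(int apiLevel) {
        return apiLevel >= MIN_UIAUTOMATOR_API_LEVEL ? "UIAutomator" : "Selendroid";
    }

    public static void main(String[] args) {
        System.out.println(chooseBackend(19));
        System.out.println(chooseBackend(16));
    }
}
```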
The JSON wire protocol (JSONWP) is a transport mechanism created by the WebDriver developers. This wire protocol is a specific set of predefined, standardized endpoints exposed via a RESTful API. The purpose of WebDriver and JSONWP is the automated testing of websites in a browser through browser-specific drivers such as the Firefox driver, IE driver, and Chrome driver.
Appium implements the Mobile JSONWP, the extension to the Selenium JSONWP, and it controls the different mobile device behaviors, such as installing/uninstalling apps over the session.
Let's have a look at some of the endpoints from the API which are used to interact with mobile applications:
/session/:sessionId
/session/:sessionId/element
/session/:sessionId/elements
/session/:sessionId/element/:id/click
/session/:sessionId/source
/session/:sessionId/url
/session/:sessionId/timeouts/implicit_wait
Note
The complete list of endpoints is available at https://code.google.com/p/selenium/wiki/JsonWireProtocol and https://code.google.com/p/selenium/source/browse/spec-draft.md?repo=mobile.
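Each endpoint listed above is a template in which segments such as :sessionId and :id are placeholders for real values. The following sketch shows one way a client could turn a template into a concrete path. The EndpointPaths class and the sample values are hypothetical and only illustrate the idea.

```java
import java.util.Map;

// Illustrative sketch only: EndpointPaths is a hypothetical helper,
// not part of any Appium or Selenium client library.
final class EndpointPaths {

    // Replaces every ":name" path segment with the matching value from params.
    static String fill(String template, Map<String, String> params) {
        StringBuilder path = new StringBuilder();
        for (String segment : template.split("/")) {
            if (segment.isEmpty()) {
                continue; // the leading slash produces an empty first segment
            }
            path.append('/');
            if (segment.startsWith(":")) {
                String value = params.get(segment.substring(1));
                if (value == null) {
                    throw new IllegalArgumentException("Missing value for " + segment);
                }
                path.append(value);
            } else {
                path.append(segment);
            }
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // "abc123" and "7" are made-up identifiers for illustration.
        Map<String, String> params = Map.of("sessionId", "abc123", "id", "7");
        System.out.println(fill("/session/:sessionId/element/:id/click", params));
    }
}
```

Working segment by segment, rather than with a plain string replace, avoids accidental matches when one placeholder name is a prefix of another.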
Appium provides client libraries similar to WebDriver that act as an interface to the REST API. These libraries have functions similar to the following method:
AppiumDriver.getPageSource();
This method issues an HTTP request over the JSONWP and gets the response from the applicable API endpoint. In this case, the API endpoint that handles the getPageSource method is as follows:
/session/:sessionId/source
The driver executes the command, which arrives in JSON format, and returns the page source as a string. On non-HTML platforms (native mobile apps), the Appium library responds with an XML document that represents the UI hierarchy. The specific structure of this document may vary from platform to platform.
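To make the mapping from a client method to an endpoint concrete, the following sketch builds (but does not send) the kind of HTTP request a client library could issue for the page source. It uses the java.net.http API from Java 11. The server address http://127.0.0.1:4723/wd/hub and the session ID abc123 are assumptions for illustration, and PageSourceRequest is a hypothetical class, not the actual client implementation.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative sketch only: shows the shape of the request, not how the
// real Appium client library is implemented.
final class PageSourceRequest {

    // Builds a GET request for /session/:sessionId/source on the given server.
    static HttpRequest build(String serverUrl, String sessionId) {
        URI uri = URI.create(serverUrl + "/session/" + sessionId + "/source");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Assumed server address and made-up session ID.
        HttpRequest request = build("http://127.0.0.1:4723/wd/hub", "abc123");
        System.out.println(request.method() + " " + request.uri());
    }
}
```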
A session is the medium through which commands are sent to a specific test application; a command is always performed in the context of a session. As we saw in the previous section, a client passes the session identifier as the sessionId parameter when performing any command. The client library first requests the server to create a session. The server then responds with a sessionId, which is used to send further commands to interact with the application(s) being tested.
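The session handshake can be sketched as follows: the client reads the sessionId from the server's response and then uses it in the path of every later command. The sample response string is made up for illustration, SessionResponse is a hypothetical class, and the regular expression is a shortcut; a real client library would use a proper JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: a real client would parse the JSON properly.
final class SessionResponse {

    private static final Pattern SESSION_ID =
            Pattern.compile("\"sessionId\"\\s*:\\s*\"([^\"]+)\"");

    // Pulls the sessionId value out of a JSON response body.
    static String extractSessionId(String responseJson) {
        Matcher matcher = SESSION_ID.matcher(responseJson);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No sessionId in response");
        }
        return matcher.group(1);
    }

    // Builds the path of a follow-up command within the session.
    static String commandPath(String sessionId, String command) {
        return "/session/" + sessionId + "/" + command;
    }

    public static void main(String[] args) {
        // Made-up response body for illustration.
        String response = "{\"status\":0,\"sessionId\":\"abc123\",\"value\":{}}";
        String sessionId = extractSessionId(response);
        System.out.println(commandPath(sessionId, "source"));
    }
}
```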
Desired capabilities is a JSON object (a set of keys and values) sent by the client to the server. It describes the capabilities for the automation session in which we are interested.
Let's discuss the capabilities one by one; first, we will see the Appium server's capabilities:
In Java, we need the import statement "import org.openqa.selenium.remote.DesiredCapabilities;" to work with the desired capabilities.
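Because desired capabilities are simply a set of keys and values serialized as JSON, the idea can be sketched with a plain Java map and no external library. The capability names used here (platformName, deviceName, app, newCommandTimeout) are Appium capabilities, but the values are placeholders, and CapabilitiesSketch is a hypothetical class. With the Selenium library on the classpath, the same pairs would typically be set through the setCapability method of DesiredCapabilities instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative sketch only: a stand-in for DesiredCapabilities that shows
// the key/value structure sent to the server.
final class CapabilitiesSketch {

    // Example capability set; all values are placeholders.
    static Map<String, Object> androidExample() {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("platformName", "Android");
        caps.put("deviceName", "Android Emulator");
        caps.put("app", "/path/to/app.apk");
        caps.put("newCommandTimeout", 60);
        return caps;
    }

    // Minimal JSON serialization: quotes strings, prints other values as-is.
    // It does not escape special characters, so it is not for production use.
    static String toJson(Map<String, Object> caps) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> entry : caps.entrySet()) {
            Object value = entry.getValue();
            String rendered = (value instanceof String)
                    ? "\"" + value + "\""
                    : String.valueOf(value);
            json.add("\"" + entry.getKey() + "\":" + rendered);
        }
        return json.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson(androidExample()));
    }
}
```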
Now, let's discuss the Android capabilities, as shown in the following table:
Let's discuss the iOS capabilities, as shown in the following table:
We have seen all the desired capabilities that are used in Appium. Now, we will talk in brief about the Appium server and its client library.
The Appium server is used to interact with different platforms such as iOS and Android. It creates a session through which the client interacts with mobile apps. It is an HTTP server written in Node.js and uses the same concept as the Selenium Server: it receives the HTTP requests from the client libraries and sends them to the appropriate platform. To start the Appium server, users need to download the source or install it directly from npm. Appium also provides a GUI version of the server, which you can download from the official Appium site, http://appium.io. In the next chapter, we will discuss the GUI version in more detail.
One of the biggest advantages of Appium is that, because it is simply a REST API at its core, the code you use to interact with it can be written in a number of languages such as Java, C#, Ruby, Python, and others. Appium extends the WebDriver client libraries and adds extra commands to them for working with mobile devices. Because of these extensions to the WebDriver protocol, it is important to use the Appium-specific client libraries, rather than the generic WebDriver client libraries, to write automation tests or procedures.
Appium added some interesting functionality for working closely with mobile devices, such as multitouch gestures and screen orientation. We will see the practical implementation of these functionalities later.
We should now have an understanding of the Appium architecture, the JSON wire protocol, and the desired capabilities and their uses. We also learned about the Appium server and its language-specific client libraries in this chapter.
Specifically, we dove into the JSONWP and Appium sessions, which are used to send further commands in order to interact with the application. We also saw how automation sessions are set up using the desired capabilities. In the last section, we took a brief look at the Appium server and its language-specific client libraries.
In the next chapter, we will take a look at what we require to get started with Appium.