Learn Selenium

By Unmesh Gundecha , Carl Cocchiaro

About this book

Selenium WebDriver 3.x is an open source API for testing both browser and mobile applications. With the help of this book, you can build a solid foundation and can easily perform end-to-end testing on web and mobile browsers. You'll begin by being introduced to the Selenium Page Object Model for software development. You'll architect your own framework with a scalable driver class, Java utility classes, and support for third-party tools and plugins. You'll design and build a Selenium grid from scratch to enable the framework to scale and support different browsers, mobile devices, and platforms. You'll strategize and handle a rich web UI using the advanced WebDriver API and learn techniques to handle real-time challenges in WebDriver. You'll perform different types of testing, such as cross-browser testing, load testing, and mobile testing. Finally, you will also be introduced to data-driven testing, using TestNG to create your own automation framework. By the end of this Learning Path, you'll be able to design your own automation testing framework and perform data-driven testing with Selenium WebDriver.

This Learning Path includes content from the following Packt products:

• Selenium WebDriver 3 Practical Guide - Second Edition by Unmesh Gundecha

• Selenium Framework Design in Data-Driven Testing by Carl Cocchiaro

Publication date:
July 2019
Publisher
Packt
Pages
536
ISBN
9781838983048

 

Introducing WebDriver and WebElements

In this chapter, we will look briefly into Selenium, its various components, such as Appium, and proceed to the basic components of a web page, including the various types of WebElements. We will learn different ways to locate WebElements on a web page and execute various user actions on them. We will cover the following topics in this chapter:

  • Various components of Selenium Testing Tools
  • Setting up a project in Eclipse with Maven and TestNG
  • Locating WebElements on a Web Page
  • Actions that can be taken on the WebElements

Selenium is a set of widely popular tools used to automate browsers. It is largely used to test applications, but its uses are not limited to testing. It can also be used to perform screen scraping and automate repetitive tasks in a browser window. Selenium supports automation on all the major browsers, including Google Chrome, Mozilla Firefox, Microsoft Internet Explorer and Edge, Apple Safari, and Opera. The WebDriver API at the core of Selenium 3.0 is now a W3C standard and is supported by major browser vendors.

 

Selenium Testing Tools

Selenium 3.0 offers three important tools, Selenium WebDriver, Selenium Server, and Selenium IDE. Each of these tools provides features to create, debug, and run tests on supported browsers and operating systems. Let's explore each of them in detail.

Selenium WebDriver 

Selenium WebDriver is the successor of Selenium RC (Remote Control), which has been officially deprecated. Selenium WebDriver accepts commands using the JSON-Wire protocol (also called Client API) and sends them to a browser launched by the specific driver class (such as ChromeDriver, FirefoxDriver, or InternetExplorerDriver). This is implemented through a browser-specific driver, which works in the following sequence:

  1. The driver listens to the commands from Selenium 
  2. It converts these commands into the browser's native API
  3. The driver takes the result of the native commands and sends it back to Selenium.
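The three steps above can be pictured with a small toy model. This is only a sketch under stated assumptions: the ToyBrowserDriver class, its command table, and the returned strings are hypothetical and do not reflect how any real browser driver is implemented. It simply shows a command arriving, being translated into a "native" call, and the result being handed back.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy model of a browser driver's listen / translate / reply loop (hypothetical, not Selenium code). */
final class ToyBrowserDriver {
    // Hypothetical table mapping command names to stand-ins for native browser calls.
    private final Map<String, Function<String, String>> nativeApi = new HashMap<>();

    ToyBrowserDriver() {
        nativeApi.put("get", url -> "navigated:" + url);
        nativeApi.put("getTitle", unused -> "Madison Island");
    }

    /** Receives a command (step 1), runs the native call (step 2), and returns its result (step 3). */
    String handle(String command, String argument) {
        Function<String, String> nativeCall = nativeApi.get(command);
        if (nativeCall == null) {
            return "error:unknown command " + command;
        }
        return nativeCall.apply(argument);
    }

    public static void main(String[] args) {
        ToyBrowserDriver driver = new ToyBrowserDriver();
        System.out.println(driver.handle("get", "http://demo-store.seleniumacademy.com/"));
        System.out.println(driver.handle("getTitle", ""));
    }
}
```

A real driver receives these commands over HTTP and talks to the browser through its own automation interface; the map lookup here only stands in for that translation step.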

We can use Selenium WebDriver to do the following:

  • Create robust, browser-based regression automation
  • Scale and distribute scripts across many browsers and platforms
  • Create scripts in your favourite programming language

Selenium WebDriver offers a collection of language-specific bindings (client libraries) to drive a browser. WebDriver comes with a better set of APIs that meet the expectations of most developers because they follow object-oriented conventions. WebDriver is under active development, and it supports many advanced interactions with web as well as mobile applications.

The Selenium Client API is a language-specific Selenium library that provides a consistent Selenium API in programming languages such as Java, C#, Python, Ruby, and JavaScript. These language bindings let tests launch a WebDriver session and communicate with the browser or Selenium Server.

Selenium Server

Selenium Server allows us to run tests on browser instances running on remote machines and in parallel, thus spreading the testing load across several machines. We can create a Selenium Grid, where one server runs as the Hub, managing a pool of Nodes. We can configure our tests to connect to the Hub, which then obtains a node that is free and matches the browser we need to run the tests. The Hub has a list of nodes that provide access to browser instances, and lets tests use these instances in a way similar to a load balancer. Selenium Grid enables us to execute tests in parallel on multiple machines by managing different types of browsers, their versions, and operating system configurations centrally.
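The matching behavior of the Hub can be sketched with a toy model. Everything here is an assumption made for illustration: the ToyHub and Node classes, the host names, and the busy flag are hypothetical and are not part of the Selenium Grid API. The sketch only shows the idea of handing out a free node that matches the requested browser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy model of a Grid hub handing out free nodes (hypothetical, not the Selenium Grid API). */
final class ToyHub {
    /** A node offers one browser and is either free or busy. */
    static final class Node {
        final String host;
        final String browser;
        boolean busy;

        Node(String host, String browser) {
            this.host = host;
            this.browser = browser;
        }
    }

    private final List<Node> nodes = new ArrayList<>();

    void register(Node node) {
        nodes.add(node);
    }

    /** Returns the first free node offering the requested browser and marks it busy. */
    Optional<Node> acquire(String browser) {
        for (Node node : nodes) {
            if (!node.busy && node.browser.equals(browser)) {
                node.busy = true;
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }

    /** Marks the node as free again once the test session ends. */
    void release(Node node) {
        node.busy = false;
    }
}
```

A second request for the same browser returns an empty Optional until the first node is released, which is roughly why adding nodes increases how many tests can run in parallel.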

Selenium IDE

Selenium IDE is a Firefox add-on that allows users to record, edit, debug, and play back tests captured in the Selenese format, which was introduced in the Selenium Core version. It also provides us with the ability to convert these tests into the Selenium RC or Selenium WebDriver format. We can use Selenium IDE to do the following:

  • Create quick and simple scripts using record and replay, or use them in exploratory testing
  • Create scripts to aid in automation-aided exploratory testing
  • Create macros to perform repetitive tasks on Web pages

The Selenium IDE add-on for Firefox stopped working when Firefox 55 moved from the XPI format to the WebExtension format, and it is currently no longer maintained.

 

Differences between Selenium 2 and Selenium 3 

Before we dive further into Selenium 3, let's understand the differences between Selenium 2 and Selenium 3.

Handling the browser  

As Selenium WebDriver has been accepted as a W3C Standard, Selenium 3 brings a number of changes to the browser implementations. All of the major browser vendors now support the WebDriver specification and provide the necessary features along with the browser. For example, Microsoft came up with EdgeDriver, and Apple supports the SafariDriver implementation. We will see some of these changes later in this book.

Having better APIs

The W3C-standard WebDriver comes with a better set of APIs, which meet the expectations of most developers by following object-oriented conventions.

Having developer support and advanced functionalities

WebDriver is being actively developed and is now supported by browser vendors per the W3C specification; it offers many advanced interactions with web as well as mobile applications, such as File-Handling and Touch APIs.

Testing Mobile Apps with Appium

One of the major changes in Selenium 3 was the introduction of the Appium project. The mobile-testing features that were part of Selenium 2 have been moved into a separate project named Appium.

Appium is an open source mobile-automation framework for testing native, hybrid, and web mobile apps on iOS and Android platforms using the JSON-Wire protocol with Selenium WebDriver. Appium replaces the iPhoneDriver and AndroidDriver APIs in Selenium 2 that were used to test mobile web applications.

Appium enables the use and extension of the existing Selenium WebDriver framework to build mobile tests. As it uses Selenium WebDriver to drive the tests, we can create tests in any programming language that has a Selenium client library.

 

Setting up a project in Eclipse with Maven and TestNG using Java

Selenium WebDriver is a library that helps you automate browsers. However, much more is needed when using it for testing and building a test framework or automating browsers for non-testing purposes. You will need an Integrated Development Environment (IDE) or a code editor to create a new Java project and add Selenium WebDriver and other dependencies in order to build a testing framework.

In the Java development community, Eclipse is a widely used IDE, alongside IntelliJ IDEA and NetBeans. Eclipse provides a feature-rich environment for Selenium WebDriver test development.

Along with Eclipse, Apache Maven provides support for managing the life cycle of a test project. Maven is used to define the project structure, dependencies, build, and test-management.

We can use Eclipse and Maven to build our Selenium WebDriver test framework from a single window. Another important benefit of using Maven is that we can get all the Selenium library files and their dependencies by configuring the pom.xml file. Maven automatically downloads the necessary files from the repository while building the project.

In this section, we will learn how to configure Eclipse and Maven for the Selenium WebDriver test development. Most of the code in this book has been developed in Eclipse and Maven.

You will need Eclipse and Maven to set up the test-development environment. Download and set up Maven from http://maven.apache.org/download.html. Follow the instructions on the Maven download page (see the Installation Instructions section of the page).

Download and set up Eclipse IDE for Java Developers from https://eclipse.org/downloads/

Along with Eclipse and Maven, we will also use TestNG as a testing framework for our project. The TestNG library will help us define test cases, test fixtures, and assertions. We need to install the TestNG plugin for Eclipse via Eclipse Marketplace.

Let's configure Eclipse with Maven to develop Selenium WebDriver tests using the following steps:

  1. Launch the Eclipse IDE.
  2. Create a new project by selecting File | New | Other from the Eclipse Main Menu.
  3. On the New dialog, select Maven | Maven Project, as shown in the following screenshot, and click Next:

  4. The New Maven Project dialog will be displayed. Select the Create a simple project (skip archetype selection) checkbox and click on the Next button, as shown in the following screenshot:
  5. On the New Maven Project dialog box, enter com.example in the Group Id: textbox and chapter1 in the Artifact Id: textbox. You can also add a name and description. Click on the Finish button, as shown in the following screenshot:
  6. Eclipse will create the chapter1 project with a structure (in Package Explorer) similar to the one shown in the following screenshot:
  7. Select pom.xml from Package Explorer. This will open the pom.xml file in the editor area with the Overview tab open. Select the pom.xml tab next to the Overview tab, as shown in the following screenshot:
  8. Add the Selenium WebDriver and TestNG dependencies shown in the following code snippet to pom.xml, inside the project node:
<properties>
    <java.version>1.8</java.version>
    <selenium.version>3.13.0</selenium.version>
    <testng.version>6.13.1</testng.version>
    <maven.compiler.version>3.7.0</maven.compiler.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>${selenium.version}</version>
    </dependency>
    <dependency>
        <groupId>org.testng</groupId>
        <artifactId>testng</artifactId>
        <version>${testng.version}</version>
    </dependency>
</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>${maven.compiler.version}</version>
            <configuration>
                <source>${java.version}</source>
                <target>${java.version}</target>
            </configuration>
        </plugin>
    </plugins>
</build>
  9. Select src/test/java in Package Explorer and right-click on it to show the menu. Select New | Other, as shown in the following screenshot:
  10. Select TestNG | TestNG class from the Select a wizard dialog, as shown in the following screenshot:
  11. On the New TestNG class dialog box, enter /chapter1/src/test/java in the Source folder: field. Enter com.example in the Package name: field. Enter NavigationTest in the Class name: field. Select the @BeforeMethod and @AfterMethod checkboxes and add src/test/resources/suites/testng.xml in the XML suite file: field. Click on the Finish button:
  12. This will create the NavigationTest.java class in the com.example package with TestNG annotations such as @Test, @BeforeMethod, and @AfterMethod, and the beforeMethod and afterMethod methods:
  13. Modify the NavigationTest class with the following code:
package com.example;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.*;

public class NavigationTest {

    WebDriver driver;

    @BeforeMethod
    public void beforeMethod() {

        // set path of the chromedriver executable
        System.setProperty("webdriver.chrome.driver",
                "./src/test/resources/drivers/chromedriver");

        // initialize new WebDriver session
        driver = new ChromeDriver();
    }

    @Test
    public void navigateToAUrl() {
        // navigate to the web site
        driver.get("http://demo-store.seleniumacademy.com/");
        // validate page title
        Assert.assertEquals(driver.getTitle(), "Madison Island");
    }

    @AfterMethod
    public void afterMethod() {

        // close and quit the browser
        driver.quit();
    }
}

In the preceding code, three methods are added as part of the NavigationTest class. We also declared a WebDriver driver; instance variable, which we will use later in the test to launch a browser and navigate to the site.

beforeMethod(), which is annotated with the @BeforeMethod TestNG annotation, will execute before the test method. It will set the path of the chromedriver executable required by Google Chrome. It will then instantiate the driver variable using the ChromeDriver() class. This will launch a new Google Chrome window on the screen.

The next method, navigateToAUrl(), annotated with the @Test annotation is the test method. We will call the get() method of the WebDriver interface passing the URL of the application. This will navigate to the site in the browser. We will check the title of the page by calling TestNG's Assert.assertEquals method and the getTitle() method of the WebDriver interface.

Lastly, afterMethod(), which is annotated with the @AfterMethod TestNG annotation, will close the browser window.

We need to download and copy the chromedriver executable from https://sites.google.com/a/chromium.org/chromedriver/downloads. Download the appropriate version based on the Google Chrome browser version installed on your computer as well as the operating system. Copy the executable file to the /src/test/resources/drivers folder.

To run the tests, right-click in the code editor and select Run As | TestNG Test, as shown in the following screenshot:

This will launch a new Google Chrome browser window and navigate to the site. The test will validate the page title and the browser window will be closed at the end of the test. The TestNG Plugin will display results in Eclipse:

You can download the example code files for all the Packt books you have purchased from your account at http://www.packtpub.com. If you have purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you. The example code is also hosted at https://github.com/PacktPublishing/Selenium-WebDriver-3-Practical-Guide-Second-Edition

 

WebElements

A web page is composed of many different types of HTML elements, such as links, textboxes, dropdown buttons, a body, labels, and forms. These are called WebElements in the context of WebDriver. Together, these elements on a web page will achieve the user functionality. For example, let's look at the HTML code of the login page of a website:    

<html>
<body>
<form id="loginForm">
<label>Enter Username: </label>
<input type="text" name="Username"/>
<label>Enter Password: </label>
<input type="password" name="Password"/>
<input type="submit"/>
</form>
<a href="forgotPassword.html">Forgot Password ?</a>
</body>
</html>

In the preceding HTML code, there are different types of WebElements, such as <html>, <body>, <form>, <label>, <input>, and <a>, which together make a web page provide the Login feature for the user. Let's analyze the following WebElement:

<label>Enter Username: </label>

Here, <label> is the start tag of the WebElement label. Enter Username: is the text present on the label element. Finally, </label> is the end tag, which indicates the end of a WebElement.

Similarly, take another WebElement:                                                                 

<input type="text" name="Username"/>

In the preceding code, type and name are the attributes of the WebElement input with the text and Username values, respectively.

UI-automation using Selenium is mostly about locating these WebElements on a web page and executing user actions on them. In the rest of the chapter, we will use various methods to locate WebElements and execute relevant user actions on them.

 

Locating WebElements using WebDriver 

Let's start this section by automating the Search feature from the Homepage of the demo application, http://demo-store.seleniumacademy.com/, which involves navigating to the homepage, typing the search text in the textbox, and executing the search. The code is as follows:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

import static org.assertj.core.api.AssertionsForClassTypes.assertThat;

public class SearchTest {

    WebDriver driver;

    @BeforeMethod
    public void setup() {
        System.setProperty("webdriver.chrome.driver",
                "./src/test/resources/drivers/chromedriver");
        driver = new ChromeDriver();
        driver.get("http://demo-store.seleniumacademy.com/");
    }

    @Test
    public void searchProduct() {
        // find search box and enter search string
        WebElement searchBox = driver.findElement(By.name("q"));
        searchBox.sendKeys("Phones");
        WebElement searchButton =
                driver.findElement(By.className("search-button"));
        searchButton.click();
        assertThat(driver.getTitle())
                .isEqualTo("Search results for: 'Phones'");
    }

    @AfterMethod
    public void tearDown() {
        driver.quit();
    }
}

As you can see, there are three new things that are highlighted, as follows:

WebElement searchBox = driver.findElement(By.name("q"));

They are the findElement() method, the By.name() method, and the WebElement interface. The findElement() method and the By locator together instruct WebDriver to locate a WebElement on a web page, and once it is found, findElement() returns the WebElement instance of that element. Actions, such as click and type, are performed on the returned WebElement using the methods declared in the WebElement interface, which will be discussed in detail in the next section.

The findElement method

In UI automation, locating an element is the first step before executing any user actions on it. WebDriver's findElement() method is a convenient way to locate an element on the web page. According to WebDriver's Javadoc (http://selenium.googlecode.com/git/docs/api/java/index.html), the method declaration is as follows:

WebElement findElement(By by)

So, the input parameter for the findElement() method is the By instance. The By instance is a WebElement-locating mechanism. There are eight different ways to locate a WebElement on a web page. We will see each of these eight methods later in the chapter.

The return type of the findElement() method is the WebElement instance that represents the actual HTML element or component of the web page. The method returns the first WebElement that the driver comes across that satisfies the locating-mechanism condition. This WebElement instance will act as a handle to that component from then on. Appropriate actions can be taken on that component by the test-script developer using this returned WebElement instance.

If WebDriver doesn't find the element, it throws a runtime exception named NoSuchElementException, which the invoking class or method should handle. 

The findElements method

For finding multiple elements matching the same locator criteria on a web page, the findElements() method can be used. It returns a list of WebElements found for a given locating mechanism. The method declaration of the findElements() method is as follows:

java.util.List<WebElement> findElements(By by)

The input parameter is the same as the findElement() method, which is an instance of the By class. The difference lies in the return type. Here, if no element is found, an empty list is returned and if there are multiple WebElements present that satisfy the locating mechanism, all of them are returned to the caller in a list.
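The difference between the two return contracts can be sketched without a browser. The following is a minimal model, not Selenium code: ToyFinder, findFirst, and findAll are hypothetical names, a Predicate stands in for the By locator, and java.util.NoSuchElementException stands in for Selenium's own org.openqa.selenium.NoSuchElementException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Toy model of the findElement() and findElements() return contracts (hypothetical names). */
final class ToyFinder {
    /** Like findElement(): returns the first match, or throws if nothing matches. */
    static <T> T findFirst(List<T> elements, Predicate<T> locator) {
        for (T element : elements) {
            if (locator.test(element)) {
                return element;
            }
        }
        throw new NoSuchElementException("Unable to locate element");
    }

    /** Like findElements(): returns every match, or an empty list if nothing matches. */
    static <T> List<T> findAll(List<T> elements, Predicate<T> locator) {
        List<T> matches = new ArrayList<>();
        for (T element : elements) {
            if (locator.test(element)) {
                matches.add(element);
            }
        }
        return matches;
    }
}
```

One practical consequence is that a check for an optional element can call the list-returning variant and test for an empty result instead of catching an exception.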

Inspecting Elements with Developer Tools

Before we start exploring how to find elements on a page and what locator mechanism to use, we need to look at the HTML code of the page to understand the Document Object Model (DOM) tree, what properties or attributes are defined for the elements displayed on the page, and how JavaScript or AJAX calls are made from the application. Browsers use the HTML code written for the page to render visual elements in the browser window. They use other resources, including JavaScript, CSS, and images, to decide on the look, feel, and behavior of these elements.

Here is an example of a login page of the demo application and the HTML code written to render this page in a browser, as displayed in the following screenshot:

We need tools that can display the HTML code of the page in a structured and easy-to-understand format. Almost all browsers now offer Developer tools to inspect the structure of the page and associated resources.

Inspecting pages and elements with Mozilla Firefox

The newer versions of Mozilla Firefox provide built-in ways to inspect the page and elements. To inspect an element from the page, move the mouse over the desired element and right-click to open the pop-up menu. Select the Inspect Element option, as shown in the following screenshot:

This will display the Inspector tab with the HTML code in a tree format with the selected element highlighted, as shown in the following screenshot:

Using Inspector, we can also validate the XPath or CSS Selectors using the search box shown in the Inspector section. Just enter the XPath or CSS Selector and Inspector will highlight the elements that match the expression, as shown in the following screenshot:

The Developer tools provide various other debugging features. They also generate XPath and CSS selectors for elements. For this, select the desired element in the tree, right-click, and select the Copy > XPath or Copy > CSS Path option from the pop-up menu, as shown in the following screenshot:

This will copy the suggested XPath or CSS selector value to the clipboard to be used later with the findElement() method.

Inspecting pages and elements in Google Chrome with Developer Tools

Similar to Mozilla Firefox, Google Chrome also provides a built-in feature to inspect pages and elements. We can move the mouse over a desired element on the page, right-click to open the pop-up menu, and then select the Inspect element option. This will open Developer tools in the browser, which displays information similar to that of Firefox, as shown in the following screenshot:

Similar to Firefox, we can also test XPath and CSS Selectors in Google Chrome Developer tools. Press Ctrl + F (on Mac, use Command + F) in the Elements tab. This will display a search box. Just enter XPath or CSS Selector, and matching elements will be highlighted in the tree, as shown in the following screenshot:

Chrome Developer Tools also provides a feature where you can get the XPath for an element by right-clicking on the desired element in the tree and selecting the Copy XPath option from the pop-up menu.

As with Mozilla Firefox and Google Chrome, you will find comparable Developer tools in any major browser, including Microsoft Internet Explorer and Edge.

Browser developer tools come in really handy during test-script development. These tools will help you to find the locator details for the elements with which you need to interact as part of the test. These tools parse the code for a page and display the information in a hierarchical tree.

WebElements on a web page may not have all the attributes declared. It is up to the developer of the test script to select the attribute that uniquely identifies the WebElement on the web page for the automation.

Using the By locating mechanism

By is the locating mechanism passed to the findElement() method or the findElements() method to fetch the respective WebElement(s) on a web page. There are eight different locating mechanisms; that is, eight different ways to identify an HTML element on a web page. They are located by ID, Name, ClassName, TagName, LinkText, PartialLinkText, XPath, and CSS Selector.
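One way to build intuition for the attribute-based locators is to write out a CSS selector that would match the same elements. The sketch below is illustrative only: the LocatorEquivalents class and its method names are hypothetical, the selectors are assumed equivalents rather than what WebDriver necessarily sends to the browser, and values are not escaped, so it only suits simple names without quotes or special characters.

```java
/** Illustrative CSS equivalents for four locator strategies (hypothetical helper, no escaping). */
final class LocatorEquivalents {
    /** An id locator matches elements whose id attribute equals the value. */
    static String idToCss(String id) {
        return "[id='" + id + "']";
    }

    /** A name locator matches elements whose name attribute equals the value. */
    static String nameToCss(String name) {
        return "[name='" + name + "']";
    }

    /** A class-name locator matches elements carrying that single class. */
    static String classNameToCss(String className) {
        return "." + className;
    }

    /** A tag-name locator matches every element with that tag. */
    static String tagNameToCss(String tagName) {
        return tagName;
    }

    public static void main(String[] args) {
        System.out.println(idToCss("search"));
        System.out.println(nameToCss("q"));
        System.out.println(classNameToCss("search-button"));
        System.out.println(tagNameToCss("a"));
    }
}
```

Selectors such as these can be pasted into the search box of the browser's developer tools to check which elements a locator would match before it is used in a test.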

The By.id() method

On a web page, an element can be uniquely identified by its ID attribute, which is optional. An ID can be assigned manually by the developer of the web application or left to be dynamically generated by the application. Dynamically generated IDs can change on every page refresh or over a period of time. Now, consider the HTML code of the Search box:

<input id="search" type="search" name="q" value="" class="input-text required-entry" maxlength="128" placeholder="Search entire store here..." autocomplete="off">

In the preceding code, the id attribute value of the search box is search.

Let's see how to use the ID attribute as a locating mechanism to find the Search box:

@Test
public void byIdLocatorExample() {
    WebElement searchBox = driver.findElement(By.id("search"));
    searchBox.sendKeys("Bags");
    searchBox.submit();
    assertThat(driver.getTitle())
            .isEqualTo("Search results for: 'Bags'");
}

In the preceding code, we used the By.id() method and the search box's id attribute value to find the element.

Here, try to use the By.id identifier, and use the name value (that is, q) instead of the id value (that is, search). Modify line three as follows:

WebElement searchBox = driver.findElement(By.id("q")); 

The test script will fail and throw an exception, as follows:

Exception in thread "main" org.openqa.selenium.NoSuchElementException: Unable to locate element: {"method":"id","selector":"q"}

WebDriver couldn't find an element by id whose value is q. Thus, it throws an exception saying NoSuchElementException.

The By.name() method

As seen earlier, every element on a web page has many attributes. Name is one of them. For instance, the HTML code for the Search box is:

<input id="search" type="search" name="q" value="" class="input-text required-entry" maxlength="128" placeholder="Search entire store here..." autocomplete="off">

Here, name is one of the many attributes of the search box, and its value is q. If we want to identify this search box and set a value in it in our test script, the code will look as follows:

@Test
public void searchProduct() {
    // find search box and enter search string
    WebElement searchBox = driver.findElement(By.name("q"));
    searchBox.sendKeys("Phones");
    searchBox.submit();
    assertThat(driver.getTitle())
            .isEqualTo("Search results for: 'Phones'");
}

If you observe line four, the locating mechanism used here is By.name and the name is q. So, where did we get this name from? As discussed in the previous section, it is the browser developer tools that helped us get the name of the search box. Launch Developer tools and use the inspect elements widget to get the attributes of an element.

The By.className() method

Before we discuss the className() method, we have to talk a little about style and CSS. Every HTML element on a web page, generally, is styled by the web page developer or designer. It is not mandatory that each element should be styled, but they generally are to make the page appealing to the end user.

So, in order to apply styles to an element, they can be declared directly in the element tag, or placed in a separate CSS file and referenced in the element using the class attribute. For instance, a style rule for a button can be declared in a CSS file as follows:

.buttonStyle{
width: 50px;
height: 50px;
border-radius: 50%;
margin: 0% 2%;
}

Now, this style can be applied to the button element in a web page as follows:

<button name="sampleBtnName" id="sampleBtnId" class="buttonStyle">I'm Button</button>

So, buttonStyle is used as the value for the class attribute of the button element, and it inherits all the styles declared in the CSS file. Now, let's try this on our Homepage. We will try to make WebDriver identify the search button using its class name and click on it.

First, in order to get the class name of the search button, we will use Developer tools to fetch it. After getting it, change the locating mechanism to By.className and specify the class attribute value in it. The code for that is as follows:

@Test
public void byClassNameLocatorExample() {
    WebElement searchBox = driver.findElement(By.id("search"));
    searchBox.sendKeys("Electronics");
    WebElement searchButton =
            driver.findElement(By.className("search-button"));
    searchButton.click();
    assertThat(driver.getTitle())
            .isEqualTo("Search results for: 'Electronics'");
}

In the preceding code, we have used the By.className locating mechanism by passing the class attribute value to it.

Sometimes, an element might have multiple values given for the class attribute. For example, the Search button has button and search-button values specified in the class attribute in the following HTML snippet:

<button type="submit" title="Search" class="button search-button"><span><span>Search</span></span></button>

We have to use one of the values of the class attribute with the By.className method. In this case, we can either use button or search-button, whichever uniquely identifies the element.
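The idea of matching a single value out of a multi-valued class attribute can be sketched in plain Java. This is a simplified model under stated assumptions: the ClassAttribute class and hasClass method are hypothetical, and the whitespace-splitting rule is an approximation of how class names are separated in HTML, not WebDriver's actual matching code.

```java
import java.util.Arrays;

/** Simplified model of matching one class name inside a class attribute (hypothetical helper). */
final class ClassAttribute {
    /** Returns true if className is one of the whitespace-separated values in classAttribute. */
    static boolean hasClass(String classAttribute, String className) {
        String[] values = classAttribute.trim().split("\\s+");
        return Arrays.asList(values).contains(className);
    }

    public static void main(String[] args) {
        String searchButtonClasses = "button search-button";
        System.out.println(hasClass(searchButtonClasses, "search-button"));
        System.out.println(hasClass(searchButtonClasses, "button search-button"));
    }
}
```

In this model, passing the whole attribute value "button search-button" does not match, because it is two class names rather than one; that mirrors the advice above to pass a single value.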

The By.linkText() method

As the name suggests, the By.linkText locating mechanism can only be used to identify the HTML links. Before we start discussing how WebDriver can be commanded to identify a link element using link text, let's see what an HTML link element looks like. The HTML link elements are represented on a web page using the <a> tag, an abbreviation for the anchor tag. A typical anchor tag looks like this:

<a href="http://demo-store.seleniumacademy.com/customer/account/" title="My Account">My Account</a>

Here, href is the link to a different page where your web browser will take you when you click on the link. So, the preceding HTML code when rendered by the browser looks like this:

This MY ACCOUNT is the link text. So the By.linkText locating mechanism uses this text on an anchor tag to identify the WebElement. The code would look like this:

@Test
public void byLinkTextLocatorExample() {
    WebElement myAccountLink =
            driver.findElement(By.linkText("MY ACCOUNT"));
    myAccountLink.click();
    assertThat(driver.getTitle())
            .isEqualTo("Customer Login");
}

Here, the By.linkText locating mechanism is used to identify the MY ACCOUNT link.

The linkText and partialLinkText methods are case-sensitive. 

The By.partialLinkText() method

The By.partialLinkText locating mechanism is an extension of the By.linkText locator. If you are not sure of the entire link text or want to use only part of the link text, you can use this locator to identify the link element. So, let's modify the previous example to use only partial text on the link; in this case, we will use only the PRIVACY part of the Privacy Policy link text in the site footer. The code would look like this:

@Test
public void byPartialLinkTextLocatorExample() {
WebElement privacyPolicyLink =
driver.findElement(By.partialLinkText("PRIVACY"));
privacyPolicyLink.click();
assertThat(driver.getTitle())
.isEqualTo("Privacy Policy");
}

What happens if there are multiple links whose text has Privacy in it? That is a question for the findElement() method rather than the locator. Remember that, as we discussed earlier, the findElement() method returns only the first WebElement that it comes across. If you want all the WebElements that contain Privacy in their link text, use the findElements() method, which will return a list of all those elements.

Use WebDriver's findElements() method if you think you need all the WebElements that satisfy a locating-mechanism condition.
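The difference between the first match and all matches, together with the case sensitivity noted earlier, can be modeled in plain Java. This is only a sketch of the matching rule and does not call WebDriver: the link texts are hypothetical, and allMatches and firstMatch are stand-ins that mimic what findElements() and findElement() would return for a partial link text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class PartialLinkTextSketch {

    // Hypothetical link texts, in the order they appear on an imaginary page.
    static final List<String> LINK_TEXTS = Arrays.asList(
        "ABOUT US", "PRIVACY POLICY", "PRIVACY SETTINGS", "Privacy FAQ");

    // Models findElements(): every link whose text contains the fragment.
    static List<String> allMatches(String fragment) {
        return LINK_TEXTS.stream()
            .filter(text -> text.contains(fragment))
            .collect(Collectors.toList());
    }

    // Models findElement(): only the first match in document order.
    static Optional<String> firstMatch(String fragment) {
        return LINK_TEXTS.stream()
            .filter(text -> text.contains(fragment))
            .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("PRIVACY").orElse("no match")); // PRIVACY POLICY
        System.out.println(allMatches("PRIVACY")); // [PRIVACY POLICY, PRIVACY SETTINGS]
        System.out.println(allMatches("privacy")); // []
    }
}
```

One difference from the model is worth keeping in mind: when nothing matches, WebDriver's findElement() throws a NoSuchElementException, whereas findElements() returns an empty list.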

The By.tagName() method

Locating an element by tag name is slightly different from the locating mechanisms we saw earlier. For example, on the Homepage, if you search for an element with the button tag name, it will result in multiple WebElements because there are nine buttons present on the Homepage. So, it is always advisable to use the findElements() method rather than the findElement() method when trying to locate elements using tag names.

Let's see how the code looks when a search for the number of links present on the Homepage is made:

@Test
public void byTagNameLocatorExample() {

// get all links from the Home page
List<WebElement> links = driver.findElements(By.tagName("a"));

System.out.println("Found links:" + links.size());

// print links which have text using Java 8 Streams API
links.stream()
.filter(elem -> elem.getText().length() > 0)
.forEach(elem -> System.out.println(elem.getText()));
}

In the preceding code, we have used the By.tagName locating mechanism and the findElements() method, which returns a list of all the links, that is, the a anchor tags defined on the page. We then printed the size of the list, followed by the text of only those links that have text. We use the Java 8 Stream API to filter the element list and output the text value by calling the getText() method. This will generate the following output:

Found links:88
ACCOUNT
CART
WOMEN
...

The By.xpath() method

WebDriver can use XPath to identify a WebElement on the web page. Before we see how it does that, let's quickly look at the syntax for XPath. XPath is short for XML Path Language, the query language used for searching XML documents. The HTML of our web page can be treated as a similar tree of elements. So, in order to identify an element on an HTML page, we need to use a specific XPath syntax:

  • The root of the document is identified as /, whereas // selects matching elements at any depth in the document.
  • To identify all the div elements, the syntax will be //div.
  • To identify the link tags that are direct children of a div element, the syntax will be //div/a.
  • To identify all the child elements of a div element regardless of their tag, we use *. The syntax will be //div/*.
  • To identify all the div elements that are at least three levels down from the root, we can use //*/*/div. To match only the third level, use /*/*/div.
  • To identify specific elements, we use attribute values of those elements, such as //*/div/a[@id='attrValue'], which will return the anchor element. This element is a child of a div element, which in turn has a parent element, and has an id value of attrValue.

So, we need to pass the XPath expression to the By.xpath locating mechanism to make it identify our target element. 
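The expressions in the list above can be tried out without a browser by using the XPath engine that ships with Java (javax.xml.xpath). The following is a minimal sketch under stated assumptions: the PAGE markup is hypothetical and deliberately well-formed, and the class name XPathSyntaxSketch and its count helper are names made up for this illustration. A real browser evaluates XPath against the live page rather than against a string.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSyntaxSketch {

    // Hypothetical, well-formed markup used only for this illustration.
    static final String PAGE =
        "<html><body>"
        + "<div><a id='attrValue' href='/one'>One</a><span>text</span></div>"
        + "<div><a href='/two'>Two</a></div>"
        + "</body></html>";

    // Count the nodes that an XPath expression selects in PAGE.
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(PAGE)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//div"));                        // 2: every div
        System.out.println(count("//div/a"));                      // 2: anchors directly inside a div
        System.out.println(count("//div/*"));                      // 3: any child of a div
        System.out.println(count("/*/*/div"));                     // 2: divs exactly three levels down
        System.out.println(count("//*/div/a[@id='attrValue']"));   // 1: the anchor with that id
    }
}
```

Changing PAGE and rerunning is a quick way to check what an expression selects before handing it to By.xpath.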

Now, let's see the code example and how WebDriver uses this XPath to identify the element:

@Test
public void byXPathLocatorExample() {
WebElement searchBox =
driver.findElement(By.xpath("//*[@id='search']"));
searchBox.sendKeys("Bags");
searchBox.submit();
assertThat(driver.getTitle())
.isEqualTo("Search results for: 'Bags'");
}

In the preceding code, we are using the By.xpath locating mechanism and passing the XPath of the WebElement to it.

One disadvantage of using XPath is that it can be costly in terms of time. Depending on the browser and the expression, evaluating an XPath may require scanning a large part of the page, so heavy use of XPath in your test scripts can make them noticeably slower to execute.

The By.cssSelector() method

The By.cssSelector() method is similar to the By.xpath() method in its usage, and it is often slightly faster than the By.xpath locating mechanism, although the difference depends on the browser. The following are the commonly used syntaxes to identify elements:

  • To identify the div element with the flrs ID, we use the #flrs syntax
  • To identify the child anchor element, we use the #flrs > a syntax, which will return the link element
  • To identify the anchor element with its attribute, we use the #flrs > a[href="/intl/en/about.html"] syntax

Let's try to modify the previous code, which uses the XPath locating mechanism to use the cssSelector mechanism:

@Test
public void byCssSelectorLocatorExample() {
WebElement searchBox =
driver.findElement(By.cssSelector("#search"));
searchBox.sendKeys("Bags");
searchBox.submit();
assertThat(driver.getTitle())
.isEqualTo("Search results for: 'Bags'");
}

The preceding code uses the By.cssSelector locating mechanism with the CSS ID selector of the Search box.

Let's look at a slightly more complex example. We will try to identify the About Us link on the Homepage:

@Test
public void byCssSelectorLocatorComplexExample() {

WebElement aboutUs =
driver.findElement(By
.cssSelector("a[href*='/about-magento-demo-store/']"));

aboutUs.click();

assertThat(driver.getTitle())
.isEqualTo("About Us");
}

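The preceding code uses the cssSelector() method to find the anchor element identified by its href attribute. The *= operator matches any anchor whose href contains the given text, so the full URL does not have to be spelled out.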

 

Interacting with WebElements

In the previous section, we saw how to locate WebElements on a web page by using different locator methods. Here, we will see all the different user actions that can be performed on a WebElement. Different WebElements will have different actions that can be taken on them. For example, in a textbox element, we can type in some text or clear the text that is already typed in it. Similarly, we can click on a button or a link and get its dimensions, but we cannot type into either of them. So, though all the actions are listed in one WebElement interface, it is the test script developer's responsibility to use the actions that are supported by the target element. If we try to execute an unsupported action on a WebElement, the outcome depends on the action: in some cases WebDriver silently ignores it and nothing happens, while in others an exception is thrown, as we will see with the submit() method.

Now, let's get into each of the actions individually by looking at their Javadocs and a code example. 

Getting element properties and attributes

In this section, we will learn the various methods of the WebElement interface that retrieve values and properties.

The getAttribute() method

The getAttribute method can be executed on all the WebElements. Remember, we have seen attributes of WebElement in the WebElements section. The HTML attributes are modifiers of HTML elements. They are generally key-value pairs that appear in the start tag of an element. For example:

  <label name="Username" id="uname">Enter Username: </label>

In the preceding code, name and id are the attributes or attribute keys and Username and uname are the attribute values.

The API syntax of the getAttribute() method is as follows:

java.lang.String getAttribute(java.lang.String name)

In the preceding code, the input parameter is String, which is the name of the attribute. The return type is again String, which is the value of the attribute.

Now let's see how we can read the attributes of a WebElement using WebDriver. Here, we will make use of the Search box from the example application. This is what the element looks like:

<input id="search" type="search" name="q" value="" class="input-text required-entry" maxlength="128" placeholder="Search entire store here..." autocomplete="off">

We will print several attributes of this WebElement using WebDriver. The code for that is as follows:

@Test
public void elementGetAttributesExample() {
WebElement searchBox = driver.findElement(By.name("q"));
System.out.println("Name of the box is: "
+ searchBox.getAttribute("name"));
System.out.println("Id of the box is: " + searchBox.getAttribute("id"));
System.out.println("Class of the box is: "
+ searchBox.getAttribute("class"));
System.out.println("Placeholder of the box is: "
+ searchBox.getAttribute("placeholder"));
}

In the preceding code, the four println statements use the getAttribute() method to fetch the values of the name, id, class, and placeholder attributes of the Search box WebElement. The output of the preceding code will be as follows:

Name of the box is: q
Id of the box is: search
Class of the box is: input-text required-entry
Placeholder of the box is: Search entire store here...

Going back to the By.tagName() method of the previous section, if the search by a locating mechanism, By.tagName, results in more than one result, you can use the getAttribute() method to further filter the results and get to your exact intended element.

The getText() method

The getText method can be called on all the WebElements. It will return the visible text if the element contains any, otherwise it will return an empty string. The API syntax for the getText() method is as follows:

java.lang.String getText()

There is no input parameter for the preceding method, but it returns the visible innerText string of the WebElement if anything is available, otherwise it will return an empty string.

The following is the code to get the text present on the Site notice element present on the example application Homepage:

@Test
public void elementGetTextExample() {
WebElement siteNotice = driver.findElement(By
.className("global-site-notice"));

System.out.println("Complete text is: "
+ siteNotice.getText());
}

The preceding code uses the getText() method to fetch the text present on the Site notice element, which returns the following:

Complete text is: This is a demo store. Any orders placed through this store will not be honored or fulfilled.

The getCssValue() method

The getCssValue method can be called on all the WebElements. This method is used to fetch a CSS property value from a WebElement. CSS properties can be font-family, background-color, color, and so on. This is useful when you want to validate the CSS styles that are applied to your WebElements through your test scripts. The API syntax for the getCssValue() method is as follows:

java.lang.String getCssValue(java.lang.String propertyName)

In the preceding code, the input parameter is the String value of the CSS property name, and the return type is the value assigned to that property name.

The following is the code example to retrieve font-family of the text from the Search box:    

@Test
public void elementGetCssValueExample() {
WebElement searchBox = driver.findElement(By.name("q"));
System.out.println("Font of the box is: "
+ searchBox.getCssValue("font-family"));
}

The preceding code uses the getCssValue() method to find font-family of the text visible in the Search box. The output of the method is shown here:

Font of the box is: Raleway, "Helvetica Neue", Verdana, Arial, sans-serif
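When a test validates styles, the string returned by getCssValue() usually has to be broken down before it can be compared. As a rough sketch, the following plain-Java helper splits a font-family value like the one printed above into individual font names. The class FontFamilySketch and its fontNames method are hypothetical names for this illustration, and the parsing is deliberately simple: it assumes that font names do not contain commas.

```java
import java.util.ArrayList;
import java.util.List;

public class FontFamilySketch {

    // Split a font-family value into font names, dropping surrounding quotes.
    // Simplification: assumes no font name contains a comma.
    static List<String> fontNames(String fontFamily) {
        List<String> names = new ArrayList<>();
        for (String part : fontFamily.split(",")) {
            String name = part.trim();
            boolean doubleQuoted = name.startsWith("\"") && name.endsWith("\"");
            boolean singleQuoted = name.startsWith("'") && name.endsWith("'");
            if (name.length() >= 2 && (doubleQuoted || singleQuoted)) {
                name = name.substring(1, name.length() - 1);
            }
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The value printed for the Search box in the preceding output.
        String value = "Raleway, \"Helvetica Neue\", Verdana, Arial, sans-serif";
        List<String> names = fontNames(value);
        System.out.println(names.get(0)); // Raleway
        System.out.println(names.size()); // 5
    }
}
```

In a test, the value passed to fontNames would come from searchBox.getCssValue("font-family"), and the first entry could then be compared with the font the design calls for.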

The getLocation() method

The getLocation method can be executed on all the WebElements. This is used to get the relative position of an element where it is rendered on the web page. This position is calculated relative to the top-left corner of the web page of which the (x, y) coordinates are assumed to be (0, 0). This method will be of use if your test script tries to validate the layout of your web page.

The API syntax of the getLocation() method is as follows:

Point getLocation()

The preceding method obviously doesn't take any input parameters, but the return type is a Point class that contains the (x, y) coordinates of the element.

The following is the code to retrieve the location of the Search box:

WebElement searchBox = driver.findElement(By.name("q"));
System.out.println("Location of the box is: "
+ searchBox.getLocation());

The output for the preceding code is the (x, y) location of the Search box, as shown here:

Location of the box is: (873, 136)

The getSize() method

The getSize method can also be called on all the visible components of HTML. It will return the width and height of the rendered WebElement. The API syntax of the getSize() method is as follows:

Dimension getSize()

The preceding method doesn't take any input parameters, and the return type is a class instance named Dimension. This class contains the width and height of the target WebElement. The following is the code to get the width and height of the Search box:

WebElement searchBox = driver.findElement(By.name("q"));
System.out.println("Size of the box is: "
+ searchBox.getSize());

The output for the preceding code is the width and height of the Search box, as shown here:

Size of the box is: (281, 40) 
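Location and size become useful for layout validation once they are combined. The following plain-Java sketch checks whether an element's box lies completely inside a viewport. It uses the Search box values printed above; the viewport sizes, the class name LayoutCheckSketch, and the fitsInViewport helper are assumptions made up for this illustration.

```java
public class LayoutCheckSketch {

    // True when the element's box lies completely inside the viewport.
    static boolean fitsInViewport(int x, int y, int width, int height,
                                  int viewportWidth, int viewportHeight) {
        return x >= 0 && y >= 0
            && x + width <= viewportWidth
            && y + height <= viewportHeight;
    }

    public static void main(String[] args) {
        // Location (873, 136) and size (281, 40) are the Search box values shown above.
        // Both viewport sizes are assumed values for illustration.
        System.out.println(fitsInViewport(873, 136, 281, 40, 1280, 800)); // true
        System.out.println(fitsInViewport(873, 136, 281, 40, 1024, 768)); // false
    }
}
```

In a WebDriver test, the four element values would come from getLocation().getX(), getLocation().getY(), getSize().getWidth(), and getSize().getHeight().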

The getTagName() method

The getTagName method can be called on all the WebElements. This will return the HTML tag name of the WebElement. For example, in the following HTML code, button is the tag name of the HTML element:

<button id="gbqfba" class="gbqfba" name="btnK" aria-label="Google Search">

The API syntax for the getTagName() method is as follows:

java.lang.String getTagName()

The return type of the preceding method is String, and it returns the tag name of the target element.

The following is the code that returns the tag name of the Search button:

@Test
public void elementGetTagNameExample() {
WebElement searchButton = driver.findElement(By.className("search-button"));
System.out.println("Html tag of the button is: "
+ searchButton.getTagName());
}

The preceding code uses the getTagName() method to get the tag name of the Search button element. The output of the code is as expected:

Html tag of the button is: button

Performing actions on WebElements

In the previous section, we saw how to retrieve values or properties of WebElements. In this section, we will see how to perform actions on WebElements, which is the most crucial part of automation. Let's explore the various methods available in the WebElement interface.

The sendKeys() method

The sendKeys action is applicable for textbox or textarea HTML elements. It is used to type text into the textbox. It simulates the user's keyboard and types text into WebElements exactly as a user would. The API syntax for the sendKeys() method is as follows:

void sendKeys(java.lang.CharSequence... keysToSend)

The input parameter for the preceding method is CharSequence of text that has to be entered into the element. This method doesn't return anything. Now, let's see a code example of how to type a search text into the Search box using the sendKeys() method:

@Test
public void elementSendKeysExample() {
WebElement searchBox = driver.findElement(By.name("q"));
searchBox.sendKeys("Phones");
searchBox.submit();
assertThat(driver.getTitle())
.isEqualTo("Search results for: 'Phones'");
}

In the preceding code, the sendKeys() method is used to type the required text in the textbox element of the web page. This is how we deal with normal keys, but if you want to type in some special keys, such as Backspace, Enter, Tab, or Shift, we need to use a special enum class of WebDriver, named Keys. Using the Keys enumeration, you can simulate many special keys while typing into a WebElement.

Now let's see a code example that uses the Shift key to type the text in uppercase in the Search box:

@Test
public void elementSendKeysCompositeExample() {
WebElement searchBox = driver.findElement(By.name("q"));
searchBox.sendKeys(Keys.chord(Keys.SHIFT,"phones"));
searchBox.submit();
assertThat(driver.getTitle())
.isEqualTo("Search results for: 'PHONES'");
}

In the preceding code, the chord() method from the Keys enum combines the Shift key with the specified text, so the text is typed into the textbox as if Shift were held down. Try this in your environment to see all the text being typed in uppercase.

The clear() method

Like the sendKeys() method, the clear action is applicable for the textbox and textarea elements. It is used to erase the text entered in a WebElement using the sendKeys() method. The same result could be achieved using the Keys.BACK_SPACE enum, but WebDriver has given us an explicit method to clear the text easily. The API syntax for the clear() method is as follows:

void clear()

This method doesn't take any input and doesn't return any output. It is simply executed on the target text-entry element.

Now, let's see how we can clear text that is entered in the Search box. The code example for it is as follows:

@Test
public void elementClearExample() {
WebElement searchBox = driver.findElement(By.name("q"));
searchBox.sendKeys(Keys.chord(Keys.SHIFT,"phones"));
searchBox.clear();
}

We have used the WebElement's clear() method to clear the text after typing phones into the Search box.

The submit() method

The submit() action can be taken on a form or on an element that is inside a form element. It is used to submit a form of a web page to the server hosting the web application. The API syntax for the submit() method is as follows:

void submit()

The preceding method doesn't take any input parameters and doesn't return anything. But a NoSuchElementException is thrown when this method is executed on a WebElement that is not present within the form.

Now, let's see a code example to submit the form on a Search page:                                       

@Test
public void elementSubmitExample() {
WebElement searchBox = driver.findElement(By.name("q"));
searchBox.sendKeys(Keys.chord(Keys.SHIFT,"phones"));
searchBox.submit();
}

In the preceding code, the final statement submits the Search form to the application server using the submit() method. Now, try to execute the submit() method on an element that is not a part of any form, let's say the About link. We should see that a NoSuchElementException is thrown. So, when you use the submit() method on a WebElement, make sure it is part of a form element.

Checking the WebElement state

In the previous sections, we saw how to retrieve values and perform actions on WebElements. Now, we will see how to check the state of a WebElement. We will explore methods to check whether the WebElement is displayed in the browser window, whether it is enabled, and, if the WebElement is a radio button or checkbox, whether it's selected or unselected. Let's see how we can use the methods available in the WebElement interface.

The isDisplayed() method

The isDisplayed action verifies whether an element is displayed on the web page and can be executed on all the WebElements. The API syntax for the isDisplayed() method is as follows:

boolean isDisplayed()

The preceding method returns a Boolean value specifying whether the target element is displayed on the web page. The following is the code to verify whether the Search box is displayed, which obviously should return true in this case:

@Test
public void elementStateExample() {
WebElement searchBox = driver.findElement(By.name("q"));
System.out.println("Search box is displayed: "
+ searchBox.isDisplayed());
}

The preceding code uses the isDisplayed() method to determine whether the element is displayed on a web page. The preceding code returns true for the Search box:  

Search box is displayed: true

The isEnabled() method

The isEnabled action verifies whether an element is enabled on the web page and can be executed on all the WebElements. The API syntax for the isEnabled() method is as follows:

boolean isEnabled()

The preceding method returns a Boolean value specifying whether the target element is enabled on the web page. The following is the code to verify whether the Search box is enabled, which obviously should return true in this case:    

@Test
public void elementIsEnabledExample() {
WebElement searchBox = driver.findElement(By.name("q"));
System.out.println("Search box is enabled: "
+ searchBox.isEnabled());
}

The preceding code uses the isEnabled() method to determine whether the element is enabled on a web page. The preceding code returns true for the Search box:  

Search box is enabled: true 

The isSelected() method

The isSelected method returns a Boolean value indicating whether an element is selected on the web page. It is meaningful only for radio buttons, options in a select, and checkbox WebElements. When executed on other elements, it will return false. The API syntax for the isSelected() method is as follows:

boolean isSelected()

The preceding method returns a Boolean value specifying whether the target element is selected on the web page. The following is the code to verify whether the Search box is selected on a search page:

@Test
public void elementIsSelectedExample() {
WebElement searchBox = driver.findElement(By.name("q"));
System.out.println("Search box is selected: "
+ searchBox.isSelected());
}

The preceding code uses the isSelected() method. It returns false for the Search box, because this is not a radio button, an option in a select, or a checkbox:

Search box is selected: false

To select a checkbox or radio button, we need to call the WebElement.click() method. Clicking a checkbox toggles its state, whereas clicking a radio button selects it. We can then use the isSelected() method to see whether it's selected.
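Because click() on a checkbox flips whatever state it is currently in, a test that must end with the box checked usually reads isSelected() first and clicks only when needed. The following is a minimal sketch of that pattern. So that it runs without Selenium on the classpath, it uses a hypothetical Selectable interface as a stand-in for the two WebElement methods involved, and a FakeCheckbox class that exists only for this illustration; with Selenium available, the same setSelected logic could take a WebElement instead.

```java
public class CheckboxSketch {

    // Hypothetical stand-in for the two WebElement methods used here,
    // so the sketch compiles without Selenium on the classpath.
    interface Selectable {
        boolean isSelected();
        void click();
    }

    // Click only when the current state differs from the desired one.
    static void setSelected(Selectable element, boolean desired) {
        if (element.isSelected() != desired) {
            element.click();
        }
    }

    // Minimal in-memory checkbox whose click() toggles its state.
    static class FakeCheckbox implements Selectable {
        private boolean selected;

        @Override
        public boolean isSelected() {
            return selected;
        }

        @Override
        public void click() {
            selected = !selected;
        }
    }

    public static void main(String[] args) {
        FakeCheckbox box = new FakeCheckbox();
        setSelected(box, true);
        setSelected(box, true);               // already selected, so no second click
        System.out.println(box.isSelected()); // true
        setSelected(box, false);
        System.out.println(box.isSelected()); // false
    }
}
```

Checking the state before clicking keeps the helper safe to call repeatedly, which matters when a test cannot be sure what state an earlier step left the page in.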

 

Summary

In this chapter, we covered a brief overview of the Selenium testing tools and the architecture of WebDriver and WebElements. We learned how to set up a test-development environment using Eclipse, Maven, and TestNG. This will provide us with the foundation to build a testing framework using Selenium. Then, we saw how to locate elements, and the actions that can be taken on them. This is the most important aspect when automating web applications. In this chapter, we used ChromeDriver to run our tests. In the next chapter, we will learn and implement the Streams API of Java 8, since Selenium 3.0 makes use of several Java 8 features.

 

Questions

  1. True or false: Selenium is a browser automation library.
  2. What are the different types of locator mechanisms provided by Selenium?
  3. True or false: With the getAttribute() method, we can read CSS attributes as well?
  4. What actions can be performed on a WebElement?
  5. How can we determine whether the checkbox is checked or unchecked?
 

Further information

You can check out the following links for more information on the topics covered in this chapter:

  • Read the WebDriver Specification at https://www.w3.org/TR/webdriver/
  • Read more about using TestNG and Maven in Chapter 1, Creating a Faster Feedback Loop from Mastering Selenium WebDriver by Mark Collin, Packt Publishing
  • Read more about element interaction in Chapter 2, Finding Elements and Chapter 3, Working with Elements from Selenium Testing Tools Cookbook, 2nd Edition, by Unmesh Gundecha, Packt Publishing

About the Authors

  • Unmesh Gundecha

    Unmesh Gundecha has over 16 years of experience in Agile software development, test automation, and DevOps methodologies. He is an Agile, open source, and DevOps evangelist with extensive experience in a diverse set of tools and technologies. He has extensive hands-on experience in building sustainable and repeatable test automation solutions for web and mobile platforms, APIs, and CLI apps with continuous integration and delivery pipelines, using best-of-breed open source and commercial tools to do so. He is the author of Selenium Testing Tools Cookbook and Learning Selenium Testing Tools with Python, both by Packt Publishing.

  • Carl Cocchiaro

    Carl Cocchiaro has a bachelor's degree in business and over 30 years of experience in the software engineering field, designing and building test frameworks for desktop, browser, and mobile applications. He is an expert in the Selenium WebDriver/TestNG Java-based technologies. He is a certified SilkTest engineer and has architected UI and RESTful API automation frameworks for 25 major corporations. Carl is currently a software architect, quality engineering, at RSA/Dell Technologies, Boston, MA, USA.

Latest Reviews

(2 reviews total)
Helped me a lot to learn programming with Selenium technology
Interesting content base on the new version of Selenium. Easy to understand, but some links available in the books doesn't work (web page used in examples is not available). Apart of that I can recommend it as introduction form to journey with the Selenium 3.
