Selenium WebDriver Quick Start Guide

By Pinakin Chaubal

About this book

Selenium WebDriver is a platform-independent API for automating the testing of both browser and mobile applications. It is also a core technology in many other browser automation tools, APIs, and frameworks. This book will guide you through the WebDriver APIs that are used in automation tests.

Chapter by chapter, we will construct the building blocks of a page object model framework as you learn about the required Java and Selenium methods and terminology.

The book starts with an introduction to the same-origin policy, cross-site scripting dangers, and the Document Object Model (DOM). Moving ahead, we'll learn about XPath, which allows us to select items on a page, and how to design a customized XPath. After that, we will be creating singleton patterns and drivers. Then you will learn about synchronization and handling pop-up windows. You will see how to create a factory for browsers and understand command design patterns applicable to this area.

At the end of the book, we tie all this together by creating a framework and implementing multi-browser testing with Selenium Grid.

Publication date:
October 2018


Introducing Selenium WebDriver and Environment Setup

Welcome to the exciting world of test automation using Java 8 and Selenium WebDriver 3.x. Throughout this book, we will get up to speed with Selenium and its surrounding technologies. Selenium is a browser automation tool that has progressed considerably since its inception. Combined with tools such as AutoIt, it can also be used for automating desktop applications, and it is now used extensively in mobile automation. Most importantly, it is open source, has a vast developer community, and is constantly evolving. With Selenium Grid, we can simulate different browsers on a single machine.

First, we will start with understanding the basics. This chapter is a gentle introduction to Selenium, and we will be covering the following topics:

  • The need for test automation and its advantages
  • Java 8 (briefly)
  • Selenium RC
  • Selenium WebDriver
  • Various drivers in Selenium
  • Preparing for the first script

Technical requirements


Why is test automation required?

Let's get started by understanding why test automation is needed. Today's agile world needs quick feedback on code quality. Developers check application code into a source code repository such as GitHub. It is imperative that these changes be tested, and the best way to do so is through automation. A test automation suite can eliminate the mundane work of manual regression testing and can help find bugs earlier, thus reducing manual testing time. It can also be configured to run at a particular time of day.

A cut-off time, such as 6 P.M., should be given to the developers, by which time they should have checked in their code, built the application, and deployed it to a server such as Apache Tomcat. The automation suite may then be scheduled to run at 7 P.M. daily. Jenkins is a continuous integration tool that can be used for this purpose.

Advantages of test automation

Test automation reduces the burden of manual execution on testers so that they can focus on the functional aspects of the application. Generally, smoke, sanity, and regression test suites are created for this purpose. The advantage of automatic triggering through Jenkins is that it allows tests to run unattended.


Some pointers on Selenium

We will be using version 3.13, which is the latest version of Selenium at the time of writing this book. It has developed a lot from its early ancestor, Selenium 1. Selenium RC was another tool that let you write automated web application UI tests in programming languages such as Java, C#, Python, and Ruby against any HTTP website, using any JavaScript-enabled browser. For the coding part, we will be working with Java 8. Learning with Java can be fun and, at the same time, fast.


What's new in Java 8

Up until Java 7, we only had object-oriented features in Java. Java 8 has added many new features. Some of these features are as follows:

  • Lambda expressions and functional interfaces
  • Default and static methods in interfaces
  • The forEach() method in the Iterable interface
  • The Java Stream API for bulk data operations on collections

Don't worry if you find this intimidating. We will slowly uncover Java 8 as we progress throughout this book.

Lambda expressions and functional interfaces

Lambda expressions are essential in functional programming. They are constructs that exist in a standalone fashion and not as part of any class. One scenario where Lambda expressions are useful is when you would otherwise create a class that consists of just a single method. In this case, Lambda expressions offer an alternative to anonymous classes (classes without names), which might not be practical in certain situations. We will briefly look at two snippets, side by side, to see how a conventional Java method can be converted into a Lambda expression.

The following pseudocode, which is not valid Java, tries to assign a method to a variable called blockofCodeA. This is exactly the problem that Lambda expressions solve:

blockofCodeA = public void demo() { System.out.println("Hello World"); }

The same piece of code can be written using Lambda expressions, as shown here:

blockofCodeA = () -> {
    System.out.println("Hello World");
};

Remove the name, return type, and the modifier, and simply add the arrow after the brackets. This becomes your Lambda expression.

Functional interfaces

Functional interfaces contain one, and only one, abstract method. An abstract method is one that has no body in the interface and must be given a body in the implementing class, unless that class is itself abstract. A functional interface can have any number of default and static methods (methods which have a body in the interface itself), but it must have exactly one abstract method. These interfaces are used hand-in-hand with Lambda expressions.

In the following code block, the demo method is declared inside an interface called Greeting. For Greeting to be a functional interface, demo must be its only abstract method. To signal to other developers that this is a functional interface, we annotate it with the @FunctionalInterface annotation. The annotation is optional, but when it is present, the compiler rejects the interface if it does not have exactly one abstract method.

The variable blockofCodeA will be of this functional interface type:

@FunctionalInterface
public interface Greeting {
    public void demo();
}
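To make the comparison with anonymous classes concrete, here is a minimal sketch rather than the book's own listing. It assumes a variant of the Greeting interface whose demo method returns a String instead of printing, so that the result is easy to inspect; the GreetingDemo class and its method names are hypothetical.

```java
// Assumed variant of Greeting: demo() returns the text instead of printing it.
@FunctionalInterface
interface Greeting {
    String demo();
}

class GreetingDemo {
    // The pre-Java 8 way: an anonymous class that implements the interface.
    static Greeting anonymousGreeting() {
        return new Greeting() {
            @Override
            public String demo() {
                return "Hello World";
            }
        };
    }

    // The Java 8 way: a lambda expression with the same behaviour.
    static Greeting lambdaGreeting() {
        return () -> "Hello World";
    }

    public static void main(String[] args) {
        System.out.println(anonymousGreeting().demo());
        System.out.println(lambdaGreeting().demo());
    }
}
```

Both factory methods return an object of the same functional interface type, which is why a lambda can stand in wherever the anonymous class was used.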

Default and static methods in an interface

Up until Java 1.7, it was not possible to provide a method body inside an interface. Java 1.8 introduces default methods, through which we can provide an implementation for a method inside the interface itself. Let's see an example of this here:

interface Phone {
    void dial();
    default void text() {
        System.out.println("Texting a message");
    }
}
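The effect of a default method is easiest to see from the implementing classes. The following sketch assumes a version of Phone whose methods return Strings rather than print; BasicPhone, SmartPhone, and DefaultMethodDemo are hypothetical names used only for illustration.

```java
interface Phone {
    String dial();                 // abstract: every implementation must supply it

    default String text() {        // default: inherited unless overridden
        return "Texting a message";
    }
}

// Inherits the default text() implementation unchanged.
class BasicPhone implements Phone {
    @Override
    public String dial() {
        return "Dialing from a basic phone";
    }
}

// Overrides the default text() with its own behaviour.
class SmartPhone implements Phone {
    @Override
    public String dial() {
        return "Dialing from a smartphone";
    }

    @Override
    public String text() {
        return "Texting with emoji";
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        System.out.println(new BasicPhone().text());
        System.out.println(new SmartPhone().text());
    }
}
```

BasicPhone compiles without defining text() at all, which would not have been possible for an interface method before Java 8.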

Static methods in Java belong to a class itself, so they can be invoked without creating an object of that class. In Java 8, static methods can also be defined inside an interface, as shown here:

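interface Phone {
    int x = 0; // interface fields are implicitly public static final
    void changeRingtone();
    static void text() {
        System.out.println("Texting a message");
    }
}
public class PhoneDemo {
    public static void main(String[] args) {
        Phone.text(); // invoked directly on the interface name
    }
}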

You can invoke the text() method directly using the name of the interface.

The forEach method for a collection

Starting with Java 8, we can invoke the forEach method on a collection to iterate through its contents. Let's compare the Java 7 and Java 8 ways of iterating over a list of strings.

The following code, written in Java 7 style, fetches the individual fruit names from the fruits list and prints them to the console:

List<String> fruits = Arrays.asList("Apples", "Oranges", "Bananas", "Pears");
for (int i = 0; i < fruits.size(); i++) {
    System.out.println(fruits.get(i));
}

A second alternative that you can use is as follows:

for (String fruit : fruits) { System.out.println(fruit); }

The example shown here does the same thing in Java 8 using lambda expressions:

fruits.forEach(i -> System.out.println(i));

Streams in Java 8

As per the Java documentation's definition:

Streams are a sequence of elements supporting sequential and parallel aggregate operations.

Imagine a factory in which workers are standing with tools in their hands, and machine parts keep moving around so that the individual worker can do their part. Streams can be compared somewhat to such a scenario:

List<String> fruits = Arrays.asList("Apples","Oranges","Bananas","Pears");
fruits.stream().forEach(fruit -> System.out.println(fruit));
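In the factory analogy, each worker corresponds to one stream operation applied to every element as it passes by. The sketch below chains a few standard Stream operations on the same fruits list; the StreamDemo class and the longNamesUpperCased method are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamDemo {
    // Keeps names longer than minLength, upper-cases them, and sorts them.
    static List<String> longNamesUpperCased(List<String> names, int minLength) {
        return names.stream()
                .filter(name -> name.length() > minLength) // worker 1: reject short names
                .map(String::toUpperCase)                  // worker 2: transform each name
                .sorted()                                  // worker 3: put them in order
                .collect(Collectors.toList());             // end of the line: pack the result
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apples", "Oranges", "Bananas", "Pears");
        System.out.println(longNamesUpperCased(fruits, 5));
    }
}
```

With a minimum length of 5, "Pears" is filtered out and the remaining three names come back upper-cased in alphabetical order.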

Understanding Selenium RC

Selenium RC was a popular UI automation library for automating browsers. It uses a JavaScript program called Selenium Core to perform automation. However, this JavaScript has to comply with a security policy called the same-origin policy. The same-origin policy is a security measure that prevents scripts loaded from one website from accessing the content of another website. For example, JavaScript served by Google cannot access or communicate with a page served by Yahoo. Three things are checked under the same-origin policy: the protocol, the domain, and the port. Only if all three match is a request considered to come from the same origin.
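The three-part comparison can be sketched in plain Java. This is only an illustration of the rule, not how a browser implements it; it assumes absolute http or https URLs, and the names SameOriginCheck, sameOrigin, and effectivePort are hypothetical.

```java
import java.net.URI;

class SameOriginCheck {
    // Returns the explicit port, or the scheme's default when none is given.
    // Assumption: only http (80) and https (443) are considered.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // Two URLs share an origin when protocol, host, and port all match.
    static boolean sameOrigin(String first, String second) {
        URI a = URI.create(first);
        URI b = URI.create(second);
        return a.getScheme().equalsIgnoreCase(b.getScheme())
                && a.getHost().equalsIgnoreCase(b.getHost())
                && effectivePort(a) == effectivePort(b);
    }

    public static void main(String[] args) {
        // Same protocol, host, and (default) port: same origin.
        System.out.println(sameOrigin("http://example.com/a", "http://example.com:80/b"));
        // Different protocol: different origin.
        System.out.println(sameOrigin("http://example.com/a", "https://example.com/a"));
    }
}
```

Note that the path plays no part in the comparison; only the protocol, host, and port decide the outcome.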

Selenium Core was introduced by Jason Huggins; it was nothing but a JavaScript program. Prior to Selenium RC, testers had to install both Selenium Core and the entire web application on their local machine so that requests appeared to come from the same domain. Selenium RC introduced the RC server, which acted as an HTTP proxy and handled the requests between the web application and Selenium Core.

What is cross-site scripting (XSS)?

Another concept related to the same-origin policy is cross-site scripting. Cross-site scripting refers to an attack in which a hacker injects one or more malicious scripts into web pages that other users browse. These scripts can, for example, steal cookie information belonging to sensitive websites such as those of banks. Because the injected script runs as part of the trusted page, it bypasses the same-origin policy control.

Selenium RC consists of two parts:

  • Selenium server
  • Client libraries

The following diagram shows the functioning of Selenium RC, where the RC Server sits in-between the libraries like Java and Python and sends instructions to Selenium Core, thereafter operating on the individual browser:

Image modelled from

The role of the Remote Control Server is to inject the Selenium Core in the respective browser. The client libraries send instructions in the form of requests to the RC Server, and the RC Server communicates this to the browser. After receiving a response, this is communicated back to the user by the RC Server.


Introducing Selenium WebDriver

Selenium WebDriver automates web browsers through individual browser drivers, which use the browser's internal plugins or DLLs. A separate driver is available for each browser.

The following diagram shows the high-level functionality of Selenium WebDriver. Instructions from language bindings such as Java and Python are sent through the JSON wire protocol to the browser driver, which invokes and operates on the browser concerned:

Class structure of Selenium WebDriver

The following diagram is a snapshot of the class structure of Selenium WebDriver. The WebDriver interface is implemented by RemoteWebDriver, which is a public class. Drivers for Internet Explorer, Firefox, Chrome, and so on inherit from RemoteWebDriver. In future chapters, we will dig deeper into these drivers:


Drivers in Selenium

We will now take a look at the various drivers that are available in Selenium and their usage.

Remote WebDriver

RemoteWebDriver is the implementation class of the WebDriver interface. Apart from WebDriver, it also implements interfaces such as TakesScreenshot, JavascriptExecutor, and the FindsBy family (FindsById, FindsByXPath, and so on).
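The shape of this hierarchy can be mimicked with a few stand-in types. The sketch below deliberately does not use the real Selenium classes: every type carries a hypothetical Sketch prefix, and the methods are simplified to return Strings, so it only illustrates the interface and inheritance relationships described above.

```java
// Simplified stand-ins; these are NOT the real Selenium types.
interface SketchWebDriver {
    String get(String url);
}

interface SketchTakesScreenshot {
    String screenshotName();
}

// Plays the role of RemoteWebDriver: one class implementing several interfaces.
class SketchRemoteWebDriver implements SketchWebDriver, SketchTakesScreenshot {
    protected String browserName() {
        return "remote";
    }

    @Override
    public String get(String url) {
        return browserName() + " opens " + url;
    }

    @Override
    public String screenshotName() {
        return browserName() + ".png";
    }
}

// Plays the role of a browser-specific driver that inherits from the remote driver.
class SketchChromeDriver extends SketchRemoteWebDriver {
    @Override
    protected String browserName() {
        return "chrome";
    }
}

class HierarchyDemo {
    public static void main(String[] args) {
        // Test code is written against the interface, not the concrete driver.
        SketchWebDriver driver = new SketchChromeDriver();
        System.out.println(driver.get("page.html"));
        // A cast exposes the additional capability implemented by the parent class.
        System.out.println(((SketchTakesScreenshot) driver).screenshotName());
    }
}
```

Because the variable is typed as the interface, swapping in another browser-specific subclass would not require any change to the calling code.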

Mobile drivers

All modern web apps have implementations for mobile devices. The two most popular operating systems on mobile devices are Android and iOS. Selenium has implementations for Android and iPhone, that is, AndroidDriver and IOSDriver. Both of these are implementations of WebDriver.

Headless browsers

Headless browsers are those that do not have a graphical user interface (GUI). Everything runs in the background. When a test is executed with a headless browser, no screen is displayed to the user. Two popular headless browsers are HtmlUnit and PhantomJS. Chrome now supports a headless mode as well.

Why do we need headless browsers?

Headless browsers are useful when Selenium tests have to be executed on an OS that does not have a GUI, such as a Linux server, or when multiple browser behaviors have to be simulated on just one machine. The advantage of a headless browser is that the resources utilized by the test are minimal. Another scenario where you can use these browsers is test data creation. In these situations, there is no special need to display the screen to the user.


Preparing for the very first script

Follow the steps shown here to get started with Selenium WebDriver.

Installing Java 8

Follow the instructions below to install Java 8:

  1. Go to and click on the appropriate version. I have selected the 64-bit Windows version since mine is a Windows machine.
  2. Once the file has downloaded, run the .exe file. Java will start installing on your machine. Next, we have to set two environment variables in order to use Java.
  3. Go to Control Panel and click Advanced System Settings.
  4. Click on Environment variables and add two system variables:
    • One is JAVA_HOME. Provide the path of the root folder where Java is installed. In this case, this will be C:\Program Files\Java\jdk1.8.0_152.
    • The second is the Path variable. Append the new entry to the existing value, separated by a semicolon (;). The entry to add is the path of the bin folder. In this case, this will be C:\Program Files\Java\jdk1.8.0_152\bin.
  5. The next step is to check our configuration. Open the Command Prompt and type java -version:

If you get an output similar to the one shown in the preceding screenshot, you are all set to start coding.
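The same check can also be made from Java itself. The following is a small sketch using only standard JDK calls; the JavaSetupCheck class and its isJava8 helper are hypothetical names, and the helper relies on the fact that Java 8 version strings begin with 1.8 (for example, 1.8.0_152).

```java
class JavaSetupCheck {
    // True when the version string belongs to the Java 8 line, e.g. "1.8.0_152".
    static boolean isJava8(String version) {
        return version.startsWith("1.8");
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        String javaHome = System.getenv("JAVA_HOME"); // null if the variable is not set

        System.out.println("java.version = " + version);
        System.out.println("JAVA_HOME    = " + (javaHome == null ? "(not set)" : javaHome));
        System.out.println("Java 8?      = " + isJava8(version));
    }
}
```

If JAVA_HOME is reported as not set, revisit the environment variable step and open a new Command Prompt before trying again.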

Now, let's get our hands dirty!

Setting up Eclipse

We will be using Eclipse as an IDE for developing Selenium Scripts in this book, but you are free to use whichever IDE suits you best.

Downloading Eclipse

Navigate to the Eclipse website and click on the Download link. There, you can find specific instructions on how to install your favourite IDE version (Kepler, Neon, and so on).

Creating a Maven project

Once the IDE is installed, perform the following steps:

  1. Double click on the .exe file for Eclipse and go to File | New | Other.
  2. Select Maven Project. Click Next.
  3. Click Next on the screen that appears.
  4. Select Create a simple project (skip archetype selection). Then, click Next.
  5. Input the Group ID. Ideally, this is the package name of the project. The Artifact ID corresponds to the name of the JAR file in case you want to create one. Keep the packaging as JAR. Notice that the version is 0.0.1-SNAPSHOT. The SNAPSHOT part indicates that the project is still under development and has not been released.
  6. Click Finish. The following is a snapshot of the Project Explorer:

When you create a Maven project, the src/main/java, src/main/resources, src/test/java, and src/test/resources folders are created for you. Apart from these, you will see a Maven Dependencies folder that is currently empty. There is also an XML file called pom.xml, marked with a black box in the snapshot. This is where you declare all of the dependencies for your project, that is, the JARs (Java archives) that it depends on.

Understanding pom.xml

It's time to explore pom.xml. This is what pom.xml looks like:

<project xmlns="" xmlns:xsi="" xsi:schemaLocation="">

The Group ID and Artifact ID that you entered in the previous screens appear in the preceding file, inside the project tag. In order to work with Selenium, we need to add the Selenium dependencies within the project tag. Let's go ahead and add those from the Maven repository:

  1. Go to the Maven repository and grab the selenium-java dependency (group ID org.seleniumhq.selenium, artifact ID selenium-java) for the Selenium version you are using, which is 3.13 in this book.
  2. Place this dependency inside a dependencies tag, within the project tag of the pom.xml file.
  3. Save the pom.xml. You will see a small activity indicator in the bottom-right corner of Eclipse, stating that the project is being built.

The Maven Dependencies folder now gets populated with all of the downloaded JARs, as shown previously.

Manual configuration

With this, we have the basic Eclipse setup for Selenium WebDriver. But we are not done yet. Behind a corporate firewall, you might be unable to download the required JARs through Maven. In this situation, perform the following steps:

  1. Simply create a plain Java project.
  2. Right-click on the project in Project Explorer.
  3. Select Build Path | Configure Build Path.
  4. Click on Add External JARs and add the required JARs manually.
  5. Next, we will write a very simple script which just opens a URL in the browser (this is shown in the following section). Right-click the project and select New | Class.

Creating the first script

Type the following code. What the following script does is simply opens a new Chrome browser and navigates to the URL

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
public class FirstTest {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
    }
}

Right-click the file and click Run As | Java Application. A Chrome browser opens and the page gets loaded.

You have successfully created your first Selenium Script.



This chapter gave you an idea of what Selenium RC and WebDriver are and also touched upon concepts like same-origin policy and cross-site scripting. We also did a basic Eclipse setup using Maven as well as without Maven, and finally we created a very simple program to open a URL in the browser.

In Chapter 2, Understanding the Document Object Model and Creating Customized XPaths, we will learn about the Document Object Model and its various traversal techniques.

About the Author

  • Pinakin Chaubal

    Pinakin holds a BE (Computer Science) and has 17+ years of IT experience, which includes development and test automation. He is PMP and ISTQB certified. He is currently working at Intellect Design Arena as an automation architect, where he handles the creation and maintenance of automation frameworks for various projects. Previously, Pinakin worked at Patni, Accenture, and L&T Infotech, among others. The prestigious clients he has worked for include General Electric, Bellsouth Telecommunications, Albertsons Retail, Travelers Insurance, Harleysville Insurance, Barclays, Bank of Santander, Bank of Montreal, HSBC, CITI, Canadian Imperial Bank of Commerce, and HDFC. Pinakin has used tools such as QTP, WinRunner, and UFT, and now works with Selenium WebDriver.


Latest Reviews

(1 reviews total)
The wording structure and continuity within the book strikes me as mediocre. Instructions and code samples are scant and incomplete (the first offender being how to install and set up selenium). The author seems more concerned with teasing what will appear in later chapters than properly addressing the topic for the current chapter. I got this book at a discount and feel sincerely duped. That's how lacking this one book feels to me. Will probably explore further titles in the future. Hope to get better content when do.