JavaScript Execution with Selenium

Mark Collin

September 2015

 In this article, by Mark Collin, the author of the book, Mastering Selenium WebDriver, we will look at how we can directly execute JavaScript snippets in Selenium. We will explore the sort of things that you can do and how they can help you work around some of the limitations that you will come across while writing your scripts. We will also have a look at some examples of things that you should avoid doing.


Introducing the JavaScript executor

Selenium has a mature API that caters to the majority of automation tasks that you may want to throw at it. That being said, you will occasionally come across problems that the API doesn't really seem to support. This was very much on the development team's mind when Selenium was written. So, they provided a way for you to easily inject and execute arbitrary blocks of JavaScript. Let's have a look at a basic example of using a JavaScript executor in Selenium:

JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript(
      "console.log('I logged something to the Javascript console');");

Note that the first thing we do is cast a WebDriver object into a JavascriptExecutor object. The JavascriptExecutor interface is implemented through the RemoteWebDriver class. So, it's not a part of the core set of API functions. Since we normally pass around a WebDriver object, the executeScript functions will not be available unless we perform this cast.

If you are directly using an instance of RemoteWebDriver or something that extends it (most driver implementations now do this), you will have direct access to the .executeScript() function. Here's an example:

FirefoxDriver driver = new FirefoxDriver(new FirefoxProfile());
driver.executeScript(
      "console.log('I logged something to the Javascript console');");

The second line (in both the preceding examples) is just telling Selenium to execute an arbitrary piece of JavaScript. In this case, we are just going to print something to the JavaScript console in the browser.

We can also get the .executeScript() function to return things to us. For example, if we tweak the JavaScript in the first example, we can ask Selenium to hand back whatever the snippet returns, as follows:

JavascriptExecutor js = (JavascriptExecutor) driver;
Object response = js.executeScript(
      "return console.log('I logged something to the Javascript console');");

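In the preceding example, the response that comes back from the JavaScript executor will be null: console.log() returns undefined in the browser, and Selenium maps an undefined result to null. To get a real value back, return an expression that produces one, for example return document.title;.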

Why does our JavaScript start with return? Well, the JavaScript executed by Selenium is executed as a body of an anonymous function. This means that if we did not add a return statement to the start of our JavaScript snippet, we would actually be running this JavaScript function using Selenium:

var anonymous = function () {
   console.log('I logged something to the Javascript console');
};

This function does log to the console, but it does not return anything. So, we can't access the result of the JavaScript snippet. If we prefix it with a return, it will execute this anonymous function:

var anonymous = function () {
   return console.log('I logged something to the Javascript console');
};

This does return something for us to work with. In this case, it is the return value of console.log(), which is undefined, so Selenium hands back null. Return an expression that produces a value (a string, a number, a Boolean, or an element) and that value is what you will get back.
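To make the wrapping easier to picture, here is a minimal sketch in plain Java that builds the same kind of function text from a snippet. The class and method names are made up for illustration, and the real wrapping happens inside the driver, not in your test code.

```java
// Visualization aid only: shows roughly how a snippet becomes the body of an
// anonymous function. AnonymousWrapperSketch is a hypothetical name and is not
// part of Selenium.
final class AnonymousWrapperSketch {

    // Returns the snippet wrapped in function text, one indented line per snippet line.
    static String wrap(String snippet) {
        StringBuilder wrapped = new StringBuilder("var anonymous = function () {\n");
        for (String line : snippet.split("\n")) {
            wrapped.append("   ").append(line).append('\n');
        }
        return wrapped.append("};").toString();
    }

    // Naive check: does the snippet begin with a return statement?
    static boolean startsWithReturn(String snippet) {
        return snippet.trim().startsWith("return");
    }
}
```

Printing wrap("console.log('hello');") next to wrap("return document.title;") shows why only the second one gives the Java side a value to work with.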

Note that in our example, we saved the response as an object—not a string or a Boolean. This is because the JavaScript executor can return lots of different types of objects. What we get as a response can be one of the following:

  • If the result is null or there is no return value, a null will be returned
  • If the result is an HTML element, a WebElement will be returned
  • If the result is a decimal, a double will be returned
  • If the result is a nondecimal number, a long will be returned
  • If the result is a Boolean, a Boolean will be returned
  • If the result is an array, a List of objects will be returned, with each item converted according to these same rules (nested lists are supported)
  • For all other cases, a string will be returned
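Because the response arrives as a plain Object, it usually pays to check its type before casting. The following is a minimal sketch of such a check for the types listed above. The class name is hypothetical, and the WebElement case is left to the fallback branch so that the sketch compiles without Selenium on the classpath.

```java
import java.util.List;

// Hypothetical helper that reports which of the documented return types
// a script result turned out to be.
final class ScriptResultSketch {

    static String describe(Object result) {
        if (result == null) {
            return "null (no return value)";
        }
        if (result instanceof Boolean) {
            return "Boolean: " + result;
        }
        if (result instanceof Long) {
            return "Long: " + result;
        }
        if (result instanceof Double) {
            return "Double: " + result;
        }
        if (result instanceof List) {
            return "List of " + ((List<?>) result).size() + " item(s)";
        }
        if (result instanceof String) {
            return "String: " + result;
        }
        // Anything else, for example a WebElement in a real test.
        return "Other: " + result.getClass().getName();
    }
}
```

In a test, you might call ScriptResultSketch.describe(js.executeScript("return document.title;")) while debugging to confirm what actually came back before committing to a cast.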

It is an impressive list, and it makes you realize just how powerful this method is. There is more as well. You can also pass arguments into the .executeScript() function. The arguments that you pass in can be any one of the following:

  • Number
  • Boolean
  • String
  • WebElement
  • List

They are then put into a magic variable called arguments, which can be accessed by the JavaScript. Let's extend our example a little bit to pass in some arguments, as follows:

String animal = "Lion";
int seen = 5;
JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript("console.log('I have seen a ' + arguments[0] " +
      "+ ' ' + arguments[1] + ' time(s)');", animal, seen);

This time, you will see that we managed to print the following text into the console:

I have seen a Lion 5 time(s)

As you can see, there is a huge amount of flexibility with the JavaScript executor. You can write some complex bits of JavaScript code and pass in lots of different types of arguments from your Java code.
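Passing an unsupported argument type only shows up as a failure when the script is sent to the browser, so a quick check on the Java side can make mistakes easier to spot. This is a minimal sketch under one loud assumption: WebElement is deliberately left out so that the code compiles without Selenium, and a real check would accept it as well. The class name is hypothetical.

```java
import java.util.List;

// Hypothetical pre-flight check for script arguments.
// Assumption: WebElement is omitted here; a real version would also allow it.
final class ScriptArgumentsSketch {

    static boolean isSupported(Object arg) {
        if (arg instanceof Number || arg instanceof Boolean || arg instanceof String) {
            return true;
        }
        if (arg instanceof List) {
            // A list is acceptable only if every item in it is acceptable.
            for (Object item : (List<?>) arg) {
                if (!isSupported(item)) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }
}
```

With the example above, both animal and seen pass the check, whereas something like a java.util.Date would be rejected before the script is ever sent.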

Think of all the things that you could do!

Let's not get carried away

We now know the basics of how one can execute JavaScript snippets in Selenium. This is where some people can start to get a bit carried away.

If you go through the mailing list of the users of Selenium, you will see many instances of people asking why they can't click on an element. Most of the time, this is due to the element that they are trying to interact with not being visible, which is blocking a click action. The real solution to this problem is to perform an action (the same one that they would perform if they were manually using the website) to make the element visible so that they can interact with it.

However, there is a shortcut offered by many, which is a very bad practice. You can use a JavaScript executor to trigger a click event on this element. Doing this will probably make your test pass. So why is it a bad solution?

The Selenium development team has spent quite a lot of time writing code that works out if a user can interact with an element. It's pretty reliable. So, if Selenium says that you cannot currently interact with an element, it's highly unlikely that it's wrong. When figuring out whether you can interact with an element, lots of things are taken into account, including the z-index of an element. For example, you may have a transparent element that is covering the element that you want to click on and blocking the click action so that you can't reach it. Visually, it will be visible to you, but Selenium will correctly see it as not visible.

If you now invoke a JavaScript executor to trigger a click event on this element, your test will pass, but users will not be able to interact with it when they try to manually use your website.

However, what if Selenium got it wrong and I can interact with the element that I want to click manually? Well, that's great, but there are two things that you need to think about.

First of all, does it work in all browsers? If Selenium thinks that it is something that you cannot interact with, it's probably for a good reason. Is the markup, or the CSS, overly complicated? Can it be simplified?

Secondly, if you invoke a JavaScript executor, you will never know whether the element that you want to interact with really does get blocked at some point in the future. Your test may well keep passing when your application is broken. Tests that can't fail when something goes wrong are worse than no test at all!

If you think of Selenium as a toolbox, a JavaScript executor is a very powerful tool that is present in it. However, it really should be seen as a last resort when all other avenues have failed you. Too many people use it as a solution to any slightly sticky problem that they come across.

If you are writing JavaScript code that attempts to mirror existing Selenium functions but are removing the restrictions, you are probably doing it wrong! Your code is unlikely to be better. The Selenium development team have been doing this for a long time with a lot of input from a lot of people, many of them being experts in their field.

If you are thinking of writing methods to find elements on a page, don't! Use the .findElement() method provided by Selenium.

Occasionally, you may find a bug in Selenium that prevents you from interacting with an element in the way you would expect to. Many people first respond by reaching for the JavascriptExecutor to code around the problem in Selenium.

Hang on for just one moment though. Have you upgraded to the latest version of Selenium to see if that fixes your problem? Alternatively, did you just upgrade to the latest version of Selenium when you didn't need to? Using a slightly older version of Selenium that works correctly is perfectly acceptable. Don't feel forced to upgrade for no reason, especially if it means that you have to write your own hacks around problems that didn't exist before.

The correct thing to do is to use a stable version of Selenium that works for you. You can always raise bugs for functionality that doesn't work, or even code a fix and submit a pull request. Don't give yourself the additional work of writing a workaround that's probably not the ideal solution, unless you need to.

So what should we do with it?

Let's have a look at some examples of the things that we can do with the JavaScript executor that aren't really possible using the base Selenium API.

First of all, we will start off by getting the element text.

Wait a minute, element text? But, that’s easy! You can use the existing Selenium API with the following code:

WebElement myElement = driver.findElement(By.id("foo"));
String elementText = myElement.getText();

So why would we want to use a JavaScript executor to find the text of an element?

Getting text is easy using the Selenium API, but only under certain conditions. The element that you are collecting the text from needs to be displayed. If Selenium thinks that the element from which you are collecting the text is not displayed, it will return an empty string. If you want to collect some text from a hidden element, you are out of luck. You will need to implement a way to do it with a JavaScript executor.

Why would you want to do this? Well, maybe you have a responsive website that shows different elements based on different resolutions. You may want to check whether these two different elements are displaying the same text to the user. To do this, you will need to get the text of the visible and invisible elements so that you can compare them. Let's create a method to collect some hidden text for us:

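private String getHiddenText(WebElement element) {

   // Recover the driver from the element instead of passing it around
   JavascriptExecutor js = (JavascriptExecutor)
         ((RemoteWebElement) element).getWrappedDriver();

   // textContent is available on every node and includes hidden text;
   // the .text property only exists on a few element types such as links
   return (String) js.executeScript(
         "return arguments[0].textContent;", element);

}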

There is some cleverness in this method. First of all, we took the element that we wanted to interact with and then extracted the driver object associated with it. We did this by casting the WebElement into a RemoteWebElement, which allowed us to use the getWrappedDriver() method. This removes the need to pass a driver object around the place all the time (this is something that happens a lot in some code bases).

We then took the driver object and cast it into a JavascriptExecutor so that we would have the ability to invoke the executeScript() method. Next, we executed the JavaScript snippet and passed in the original element as an argument. Finally, we took the response of the executeScript() call and cast it into a string that we can return as a result of the method.
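Text read straight from the DOM by a script tends to keep the raw whitespace of the markup, whereas getText() returns text roughly as it is rendered. If you compare the two, a small normalization step may avoid false mismatches. The following is a sketch only, with hypothetical names, and it assumes that collapsing runs of whitespace is an acceptable comparison rule for your content.

```java
// Hypothetical helper for comparing visible text with text collected from a
// hidden element. Assumption: differences in whitespace are not significant.
final class HiddenTextSketch {

    // Trims the text and collapses every run of whitespace into a single space.
    static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.trim().replaceAll("\\s+", " ");
    }

    static boolean sameText(String visibleText, String hiddenText) {
        return normalize(visibleText).equals(normalize(hiddenText));
    }
}
```

A responsive-layout check could then read something like sameText(desktopElement.getText(), getHiddenText(mobileElement)), where both element names are placeholders.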

Generally, getting text is a code smell. Your tests should not rely on specific text being displayed on a website because content always changes. Maintaining tests that check the content of a site is a lot of work, and it makes your functional tests brittle. The best thing to do is test the mechanism that injects the content into the website. If you use a CMS that injects text into a specific template key, you can test whether each element has the correct template key associated with it.

I want to see a more complex example!

So you want to see something more complicated. The Advanced User Interactions API cannot deal with HTML5 drag and drop. So, what happens if we come across an HTML5 drag-and-drop implementation that we want to automate? Well, we can use the JavascriptExecutor. Let's have a look at the markup for the HTML5 drag-and-drop page:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset=utf-8>
   <title>Drag and drop</title>
   <style type="text/css">
       li {
           list-style: none;
       }

       li a {
           text-decoration: none;
           color: #000;
           margin: 10px;
           width: 150px;
           border-width: 2px;
           border-color: black;
           border-style: groove;
           background: #eee;
           padding: 10px;
           display: block;
       }

       *[draggable=true] {
           cursor: move;
       }

       ul {
           margin-left: 200px;
           min-height: 300px;
       }

       #obliterate {
           background-color: green;
           height: 250px;
           width: 166px;
           float: left;
           border: 5px solid #000;
           position: relative;
           margin-top: 0;
       }

       #obliterate.over {
           background-color: red;
       }
   </style>
</head>
<body>
<header>
   <h1>Drag and drop</h1>
</header>

<article>
   <p>Drag items over to the green square to remove them</p>

   <div id="obliterate"></div>

   <ul>
       <li><a id="one" href="#" draggable="true">one</a></li>
       <li><a id="two" href="#" draggable="true">two</a></li>
       <li><a id="three" href="#" draggable="true">three</a></li>
       <li><a id="four" href="#" draggable="true">four</a></li>
       <li><a id="five" href="#" draggable="true">five</a></li>
   </ul>
</article>
</body>
<script>
   var draggableElements = document.querySelectorAll('li > a'),
           obliterator = document.getElementById('obliterate');

   for (var i = 0; i < draggableElements.length; i++) {
       var element = draggableElements[i];
       element.addEventListener('dragstart', function (event) {
           event.dataTransfer.effectAllowed = 'copy';
           event.dataTransfer.setData('being-dragged', this.id);
       });
   }

   obliterator.addEventListener('dragover', function (event) {
       if (event.preventDefault) event.preventDefault();
       obliterator.className = 'over';
       event.dataTransfer.dropEffect = 'copy';
       return false;
   });

   obliterator.addEventListener('dragleave', function () {
       obliterator.className = '';
       return false;
   });

   obliterator.addEventListener('drop', function (event) {
       var elementToDelete = document.getElementById(
       event.dataTransfer.getData('being-dragged'));
       elementToDelete.parentNode.removeChild(elementToDelete);
       obliterator.className = '';
       return false;
   });
</script>
</html>

Note that you need a browser that supports HTML5/CSS3 for this page to work. The latest versions of Google Chrome, Opera Blink, Safari, and Firefox will work. You may have issues with Internet Explorer (depending on the version that you are using). For an up-to-date list of HTML5/CSS3 support, have a look at http://caniuse.com.

If you try to use the Advanced User Interactions API to automate this page, you will find that it just doesn't work. It looks like it's time to reach for JavascriptExecutor.

First of all, we need to write some JavaScript that can simulate the events that we need to trigger to perform the drag-and-drop action. To do this, we are going to create three JavaScript functions. The first function is going to create a JavaScript event:

function createEvent(typeOfEvent) {
   var event = document.createEvent("CustomEvent");
   event.initCustomEvent(typeOfEvent, true, true, null);
   event.dataTransfer = {
       data: {},
       setData: function (key, value) {
           this.data[key] = value;
       },
       getData: function (key) {
           return this.data[key];
       }
   };
   return event;
}

We then need to write a function that will fire events that we have created. This also allows you to pass in the dataTransfer value set on an element. We need this to keep track of the element that we are dragging:

function dispatchEvent(element, event, transferData) {
   if (transferData !== undefined) {
       event.dataTransfer = transferData;
   }
   if (element.dispatchEvent) {
       element.dispatchEvent(event);
   } else if (element.fireEvent) {
       element.fireEvent("on" + event.type, event);
   }
}

Finally, we need something that will use these two functions to simulate the drag-and-drop action:

function simulateHTML5DragAndDrop(element, target) {
   var dragStartEvent = createEvent('dragstart');
   dispatchEvent(element, dragStartEvent);
   var dropEvent = createEvent('drop');
   dispatchEvent(target, dropEvent, dragStartEvent.dataTransfer);
   var dragEndEvent = createEvent('dragend');
   dispatchEvent(element, dragEndEvent, dropEvent.dataTransfer);
}

Note that the simulateHTML5DragAndDrop function needs us to pass in two elements—the element that we want to drag, and the element that we want to drag it to.

It's always a good idea to try out your JavaScript in a browser first. You can copy the preceding functions into the JavaScript console in a modern browser and then try using them to make sure that they work as expected. If things go wrong in your Selenium test, you then know that it is most likely an error invoking it via the JavascriptExecutor rather than a bad piece of JavaScript.

We now need to take these scripts and put them into a JavascriptExecutor along with something that will call the simulateHTML5DragAndDrop function:

private void simulateDragAndDrop(WebElement elementToDrag,
      WebElement target) throws Exception {
   WebDriver driver = getDriver();
   JavascriptExecutor js = (JavascriptExecutor) driver;
   js.executeScript("function createEvent(typeOfEvent) {\n" +
               "   var event = document.createEvent(\"CustomEvent\");\n" +
               "   event.initCustomEvent(typeOfEvent, true, true, null);\n" +
               "   event.dataTransfer = {\n" +
               "       data: {},\n" +
               "       setData: function (key, value) {\n" +
               "           this.data[key] = value;\n" +
               "       },\n" +
               "       getData: function (key) {\n" +
               "           return this.data[key];\n" +
               "       }\n" +
               "   };\n" +
               "   return event;\n" +
               "}\n" +
               "\n" +
               "function dispatchEvent(element, event, transferData) {\n" +
               "   if (transferData !== undefined) {\n" +
               "       event.dataTransfer = transferData;\n" +
               "   }\n" +
               "   if (element.dispatchEvent) {\n" +
               "       element.dispatchEvent(event);\n" +
               "   } else if (element.fireEvent) {\n" +
               "       element.fireEvent(\"on\" + event.type, event);\n" +
               "   }\n" +
               "}\n" +
               "\n" +
               "function simulateHTML5DragAndDrop(element, target) {\n" +
               "   var dragStartEvent = createEvent('dragstart');\n" +
               "   dispatchEvent(element, dragStartEvent);\n" +
               "   var dropEvent = createEvent('drop');\n" +
               "   dispatchEvent(target, dropEvent, dragStartEvent.dataTransfer);\n" +
               "   var dragEndEvent = createEvent('dragend');\n" +
               "   dispatchEvent(element, dragEndEvent, dropEvent.dataTransfer);\n" +
               "}\n" +
               "\n" +
               "var elementToDrag = arguments[0];\n" +
               "var target = arguments[1];\n" +
               "simulateHTML5DragAndDrop(elementToDrag, target);",
           elementToDrag, target);
}

This method is really just a wrapper around the JavaScript code. We take a driver object and cast it into a JavascriptExecutor. We then pass the JavaScript code into the executor as a string. We have made a couple of additions to the JavaScript functions that we previously wrote. Firstly, we set a couple of variables (mainly for code clarity; they can quite easily be inlined) that take the WebElements that we have passed in as arguments. Finally, we invoke the simulateHTML5DragAndDrop function using these elements.
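Long concatenated strings like this are easy to break with a stray quote or a missing newline. One alternative is to keep the three functions in a .js file and read them in at runtime. The sketch below assumes such a file exists; its name and the ScriptLoaderSketch class are hypothetical and not part of Selenium.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads JavaScript function definitions from a file and
// appends the line that invokes them, giving one string for executeScript().
final class ScriptLoaderSketch {

    static String loadWithInvocation(Path scriptFile, String invocation)
            throws IOException {
        String functions = new String(
                Files.readAllBytes(scriptFile), StandardCharsets.UTF_8);
        return functions + "\n" + invocation;
    }
}
```

The wrapper method could then shrink to a single call along the lines of js.executeScript(ScriptLoaderSketch.loadWithInvocation(Paths.get("dragAndDrop.js"), "simulateHTML5DragAndDrop(arguments[0], arguments[1]);"), elementToDrag, target), with the JavaScript itself staying readable and testable in the browser console.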

The final piece of the puzzle is to write a test that utilizes the simulateDragAndDrop method, as follows:

@Test
public void dragAndDropHTML5() throws Exception {
   WebDriver driver = getDriver();
   driver.get("http://ch6.masteringselenium.com/dragAndDrop.html");

   final By destroyableBoxes = By.cssSelector("ul > li > a");
   WebElement obliterator = driver.findElement(By.id("obliterate"));
   WebElement firstBox = driver.findElement(By.id("one"));
   WebElement secondBox = driver.findElement(By.id("two"));

   assertThat(driver.findElements(destroyableBoxes).size(),
         is(equalTo(5)));

   simulateDragAndDrop(firstBox, obliterator);

   assertThat(driver.findElements(destroyableBoxes).size(),
         is(equalTo(4)));

   simulateDragAndDrop(secondBox, obliterator);

   assertThat(driver.findElements(destroyableBoxes).size(),
         is(equalTo(3)));
}

This test finds a couple of boxes and destroys them one by one using the simulated drag and drop. As you can see, the JavascriptExecutor is extremely powerful.

Can I use JavaScript libraries?

The logical progression is, of course, to write your own JavaScript libraries that you can import instead of sending everything over as a string. Alternatively, maybe you would just like to import an existing library.

Let's write some code that allows you to import a JavaScript library of your choice. The JavaScript is not particularly complex. All that we are going to do is create a new <script> element in a page and then load our library into it, as follows:

public void injectScript(String scriptURL) throws Exception {
   WebDriver driver = getDriver();
   JavascriptExecutor js = (JavascriptExecutor) driver;
   js.executeScript("function injectScript(url) {\n" +
           "   var script = document.createElement('script');\n" +
           "   script.src = url;\n" +
           "   var head = document.getElementsByTagName('head')[0];\n" +
           "   head.appendChild(script);\n" +
           "}\n" +
           "\n" +
           "var scriptURL = arguments[0];\n" +
           "injectScript(scriptURL);",
           scriptURL);
}

We have again set arguments[0] to a variable before injecting it for clarity, but you can inline this part if you want to. All that remains now is to inject this into a page and check whether it works. Let's write a test!

We are going to use this function to inject jQuery into the Google website. The first thing that we need to do is write a method that can tell us whether jQuery has been loaded or not, as follows:

public Boolean isjQueryLoaded() throws Exception {
   WebDriver driver = getDriver();
   JavascriptExecutor js = (JavascriptExecutor) driver;
   return (Boolean) js.executeScript(
         "return typeof jQuery != 'undefined';");
}

Now, we need to put all of this together in a test, as follows:

@Test
public void injectjQueryIntoGoogle() throws Exception {

   WebDriver driver = DriverFactory.getDriver();

   driver.get("http://www.google.com");

   assertThat(isjQueryLoaded(), is(equalTo(false)));

   injectScript("https://code.jquery.com/jquery-latest.min.js");

   assertThat(isjQueryLoaded(), is(equalTo(true)));
}

It's a very simple test. We loaded the Google website. Then, we checked whether jQuery existed. Once we were sure that it didn't exist, we injected jQuery into the page. Finally, we again checked whether jQuery existed.

We have used jQuery in our example, but you don't have to use jQuery. You can inject any script that you desire.
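One thing to watch for is that a script element added to the page loads asynchronously, so a check made immediately after the injection may run before the library has finished loading. The following is a minimal polling sketch in plain Java with hypothetical names. In a real suite, Selenium's own WebDriverWait would be the more usual choice.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper: retries a condition until it holds or time runs out.
final class PollingSketch {

    static boolean waitUntil(BooleanSupplier condition,
                             long timeoutMillis, long pollMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The final assertion in the test could then wait on a condition that wraps isjQueryLoaded() (handling its checked exception) for a few seconds, instead of assuming that the library is available straight away.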

Should I inject JavaScript libraries?

It's very easy to inject JavaScript into a page, but stop and think before you do it. Adding lots of different JavaScript libraries may affect the existing functionality of the site. You may have functions in your JavaScript that overwrite existing functions that are already on the page and break the core functionality.

If you are testing a site, it may make all of your tests invalid. Failures may arise because there is a clash between the scripts that you inject and the existing scripts used on the site. The flip side is also true: injecting a script may make broken functionality appear to work.

If you are going to inject scripts into an existing site, be sure that you know what the consequences are.

If you are going to regularly inject a script, it may be a good idea to add some assertions to ensure that the functions that you are injecting do not already exist before you inject the script. This way, your tests will fail if the developers add a JavaScript function with the same name at some point in the future without your knowledge.
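Such an assertion only needs a one-line script per function name. The sketch below builds that script in Java. The class name is hypothetical, and it assumes that the functions you care about are exposed as globals on window.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds a script that returns true when no global with
// the given name exists on the page yet.
final class FunctionClashSketch {

    private static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    static String notDefinedScript(String globalName) {
        // Reject anything that is not a plain identifier so the name cannot
        // break out of the quoted string in the generated script.
        if (!IDENTIFIER.matcher(globalName).matches()) {
            throw new IllegalArgumentException(
                    "Not a simple identifier: " + globalName);
        }
        return "return typeof window['" + globalName + "'] === 'undefined';";
    }
}
```

Before injecting, a test might assert that (Boolean) js.executeScript(FunctionClashSketch.notDefinedScript("jQuery")) is true, and repeat the check for every global that the injected library defines.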

What about asynchronous scripts?

Everything that we have looked at so far has been a synchronous piece of JavaScript. However, what if we wanted to perform some asynchronous JavaScript calls as a part of our test? Well, we can do this. The JavascriptExecutor also has a method called executeAsyncScript(). This will allow you to run some JavaScript that does not respond instantly. Let's have a look at some examples.

First of all, we are going to write a very simple bit of JavaScript that will wait for 25 seconds before triggering a callback, as follows:

@Test
public void javascriptExample() throws Exception {
   WebDriver driver = DriverFactory.getDriver();

   driver.manage().timeouts().setScriptTimeout(60, TimeUnit.SECONDS);
   JavascriptExecutor js = (JavascriptExecutor) driver;
   js.executeAsyncScript(
         "var callback = arguments[arguments.length - 1]; " +
         "window.setTimeout(callback, 25000);");

   driver.get("http://www.google.com");
}

Note that we defined a JavaScript variable named callback, which uses a script argument that we have not set. For asynchronous scripts, Selenium needs to have a callback defined, which is used to detect when the JavaScript that you are executing has finished. This callback object is automatically added to the end of your arguments array. This is what we have defined as the callback variable.

If we now run the script, it will load our browser and then sit there for 25 seconds as it waits for the JavaScript snippet to complete and call the callback. It will then load the Google website and finish.

We have also set a script timeout on the driver object that will wait for up to 60 seconds for our piece of JavaScript to execute.

Let's see what happens if our script takes longer to execute than the script timeout:

@Test
public void javascriptExample() throws Exception {
   WebDriver driver = DriverFactory.getDriver();

   driver.manage().timeouts().setScriptTimeout(5, TimeUnit.SECONDS);
   JavascriptExecutor js = (JavascriptExecutor) driver;
   js.executeAsyncScript(
         "var callback = arguments[arguments.length - 1]; " +
         "window.setTimeout(callback, 25000);");

   driver.get("http://www.google.com");
}

This time, when we run our test, it waits for 5 seconds and then throws a TimeoutException. It is important to set a script timeout on the driver object when running asynchronous scripts, to give them enough time to execute.

What do you think will happen if we execute this as a normal script?

@Test
public void javascriptExample() throws Exception {
   WebDriver driver = DriverFactory.getDriver();

   driver.manage().timeouts().setScriptTimeout(5, TimeUnit.SECONDS);
   JavascriptExecutor js = (JavascriptExecutor) driver;
   js.executeScript(
         "var callback = arguments[arguments.length - 1]; " +
         "window.setTimeout(callback, 25000);");

   driver.get("http://www.google.com");
}

You may have been expecting an error, but that's not what you got. The script was executed as normal, but because Selenium was not waiting for a callback, it did not wait for the timer inside the script to complete. Since Selenium did not wait, it never hit the script timeout, and no error was thrown.

Wait a minute. What about the callback definition? There was no argument that was used to set the callback variable. Why didn't it blow up?

Well, JavaScript isn't as strict as Java. It tried to work out what arguments[arguments.length - 1] resolves to and found nothing there, because we passed no arguments. Rather than raising an error, it set the callback variable to undefined. Our test then completed before setTimeout() had a chance to fire. So, you won't see any console errors.

As you can see, it's very easy to make a small error that stops things from working when working with asynchronous JavaScript. It's also very hard to find these errors because there can be very little user feedback. Always take extra care when using the JavascriptExecutor to execute asynchronous bits of JavaScript.
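One way to get louder feedback is to have the script itself check that it really received a callback. The sketch below prepends such a guard to an asynchronous script body. The class name and the error message are hypothetical, and it assumes that an error thrown inside the script surfaces as an exception on the Java side.

```java
// Hypothetical helper: prefixes an asynchronous script body with a guard that
// fails fast when the last argument is not a callback function.
final class AsyncScriptSketch {

    static String withCallbackGuard(String body) {
        return "var callback = arguments[arguments.length - 1];\n"
                + "if (typeof callback !== 'function') {\n"
                + "   throw new Error('No callback supplied; was this run "
                + "with executeScript() instead of executeAsyncScript()?');\n"
                + "}\n"
                + body;
    }
}
```

Calling js.executeAsyncScript(AsyncScriptSketch.withCallbackGuard("window.setTimeout(callback, 25000);")) behaves as before, whereas sending the same string through executeScript() should now fail immediately instead of passing silently.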

Summary

In this article, we:

  • Learned how to use a JavaScript executor to execute JavaScript snippets
    in the browser through Selenium
  • Learned about passing arguments into a JavaScript executor and the sort
    of arguments that are supported
  • Learned what the possible return types are for a JavaScript executor
  • Gained a good understanding of when we shouldn't use a JavaScript executor
  • Worked through a series of examples that showed ways in which we can
    use a JavaScript executor to enhance our tests

You've been reading an excerpt of:

Mastering Selenium WebDriver
