Getting Started with PhantomJS

Chapter 1. Getting Started

PhantomJS is a new solution that provides headless testing of web applications. It is also a tool for dynamically capturing and rendering pages as images. It lets you programmatically manipulate page content, scrape websites and save important information to files, and obtain network-level information about your pages and site resources. These are just a few of the things that PhantomJS can do. It gives web designers, testers, and developers a fresh way to build browser-based solutions.

PhantomJS uses QtWebKit as its core browser capability and uses the WebKit JavaScript engine for script interpretation and execution. Anything that you can do in a WebKit-based browser (such as Chrome, Safari, and Opera) you can do with PhantomJS. It supports web standards such as CSS selectors, DOM manipulation, JSON, HTML5 Canvas, and SVG. It is also more than just a browser: you can perform file system I/O, access system environment variables, or even instantiate your own implementation of a web server daemon.

Building PhantomJS from source

You may also want to build your own binaries by compiling PhantomJS from source. The sources are hosted on GitHub at https://github.com/ariya/phantomjs.

Before you start downloading the sources, you will need these tools installed on your machine:

Required development tools

Windows: Visual Studio 2010 or 2008 (Express edition), git

Mac OS X: Xcode, git

Ubuntu/RHEL/CentOS Linux: gcc, gcc-c++, make, git, openssl-devel, freetype-devel, fontconfig-devel

The PhantomJS team is always trying to find the optimal way to build the sources, and the build instructions are frequently modified. To build PhantomJS properly, you must follow the steps found here: http://phantomjs.org/build.html.

If you are not planning to hack into PhantomJS code and develop new features, then it is best to download the pre-packaged binaries.

Working with PhantomJS

Now, let's see how PhantomJS's magic works. It is a command-line-based application, so we need to execute it in an OS terminal or console. The PhantomJS package contains a series of files and comes with one main executable file, which is named phantomjs.

Open your terminal and then navigate to your PhantomJS bin folder. In the prompt, execute phantomjs without any arguments.

Tip

PhantomJS Windows build

In the Windows build, the PhantomJS executable is in the root folder and is named phantomjs.exe.

Running PhantomJS without any arguments will give you an interactive prompt that is similar to the JavaScript debug console you could find in any modern browser. In this interactive prompt, we can execute JavaScript code line by line. This functionality is very useful for debugging or testing code before you actually build your script.

Say "Hello Ghost!" to PhantomJS using the interactive prompt. Using console.log will output any type of data to the output console of a JavaScript interpreter.

phantomjs> console.log("Hello Ghost!")
Hello Ghost!
undefined
phantomjs>

See? It is simple, just like coding any JavaScript. But wait. What is that undefined message just after the Hello Ghost! message? That is not an error. It is just how the interactive mode behaves: each statement is evaluated like an ordinary function call, and its return value is automatically printed to the output stream.

Since console.log does not return a value, the interactive prompt prints undefined. If we instead assign a value to a variable, the following output is displayed:

phantomjs> name = "Tara"
"Tara"
phantomjs>

The assignment takes place and the result of the operation is displayed. Because an assignment evaluates to the assigned value, here a string, undefined is not displayed. The interactive mode is similar to a long-running script; any variable or function you define stays in memory and can be accessed at any time during the session. So, based on our preceding example, the name variable can also be displayed by referencing it.

phantomjs> name = "Tara"
"Tara"
phantomjs> name
"Tara"
phantomjs> name + " and Cecil"
"Tara and Cecil"
phantomjs>

We can even use the variable with another operation as seen in the preceding lines of code. However, any operation's result that is not assigned to a variable will be available only during the execution of the line. The operation that concatenates the name variable with another string literal will be performed, and the resulting string will be displayed in the console but will not be kept in memory.

Objects can also be accessed within the interactive mode, and one of the most commonly used objects is phantom. Try typing phantom in the prompt and you will get the following output:

phantomjs> phantom
{
   "clearCookies": "[Function]",
   "deleteCookie": "[Function]",
   "addCookie": "[Function]",
   "injectJs": "[Function]",
   "debugExit": "[Function]",
   "exit": "[Function]",
   "cookies": [],
   "cookiesEnabled": true,
   "version": {
      "major": 1,
      "minor": 7,
      "patch": 0
   },
   "scriptName": "",
   "outputEncoding": "UTF-8",
   "libraryPath": "/Users/Aries/phantomjs/bin",
   "defaultPageSettings": {
      "XSSAuditingEnabled": false,
      "javascriptCanCloseWindows": true,
      "javascriptCanOpenWindows": true,
      "javascriptEnabled": true,
      "loadImages": true,
      "localToRemoteUrlAccessEnabled": false,
      "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.7.0 Safari/534.34",
      "webSecurityEnabled": true
   },
   "args": []
}
phantomjs>

PhantomJS displays the content of the object when used in the interactive prompt, and even its own phantom object can be referenced. You may also observe that the object is displayed in the form of JSON and details every attribute of the object except for the function definition. Using this approach, we can also examine each and every object, and we will be able to know what the exposed attributes and available functions are.
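To make the shape of that listing concrete, here is a minimal sketch of a helper that summarizes an object in a similar way, replacing functions with the "[Function]" placeholder. The summarize name and the sample object are made up for illustration; this is not how PhantomJS itself implements the interactive prompt.

```javascript
// Builds a plain summary of an object: functions become "[Function]",
// every other property keeps its value.
function summarize(obj) {
    var summary = {};
    for (var key in obj) {
        summary[key] = (typeof obj[key] === 'function') ? '[Function]' : obj[key];
    }
    return summary;
}

// A made-up object that loosely resembles a few phantom properties.
var sample = {
    exit: function () {},
    cookiesEnabled: true,
    version: { major: 1, minor: 7, patch: 0 }
};

console.log(JSON.stringify(summarize(sample), null, 3));
```

The same helper can be pasted into the interactive prompt and applied to your own objects.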

Let's try using one of the most important functions available in the phantom object: the exit() function. This function will enable us to quit PhantomJS and return to the caller or to the underlying operating system.

phantomjs> phantom.exit()
$

Called without arguments, this function signals the application to exit with a return code of zero, which indicates normal termination without errors. Passing a numeric value as the argument of exit() sets the exit code that is passed back to the caller. This is helpful when writing scripts that need to verify whether the execution was successful or, if an error occurred, what type of error it was.
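As a rough sketch of how a script might choose its exit code, the following maps a load status string to a number and passes it to phantom.exit(). The constant names and the exitCodeFor helper are hypothetical; only phantom.exit() itself comes from PhantomJS. The guard lets the file load outside PhantomJS as well, where the phantom object does not exist.

```javascript
// Hypothetical exit codes for illustration; PhantomJS does not define these names.
var EXIT_OK = 0;
var EXIT_LOAD_FAILED = 1;

// Maps a status string ('success' or anything else) to an exit code.
function exitCodeFor(status) {
    return status === 'success' ? EXIT_OK : EXIT_LOAD_FAILED;
}

var code = exitCodeFor('success');
console.log('Exiting with code ' + code);

// The phantom object exists only when the file is run by PhantomJS.
if (typeof phantom !== 'undefined') {
    phantom.exit(code);
}
```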

If we trap the error code in a shell script, it will look as follows:

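#!/bin/bash
# Start PhantomJS; the script resumes once PhantomJS exits.
bin/phantomjs
# $? holds the exit code of the last command.
OUT=$?
if [ "$OUT" -eq 0 ]; then
   echo "Done."
else
   echo "Ooops! Failed.!"
fi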

In the preceding lines of code, right after calling phantomjs, we capture the exit code coming from the application using the $? special variable, which holds the exit status of the last command. We assign that to an OUT variable and then perform a test on it in the succeeding lines. If the exit code is equal to zero, then we display Done; otherwise, we say that the call failed.

$ ./trapme.sh 
phantomjs> phantom.exit(0)
undefined
Done.
$ ./trapme.sh 
phantomjs> phantom.exit(1)
undefined
Ooops! Failed.!
$

Use the interactive mode to experiment with the PhantomJS API.

Before we begin creating PhantomJS scripts, we first need to make a quick roundup of the PhantomJS JavaScript API.

PhantomJS JavaScript API

PhantomJS runs JavaScript and comes with a JavaScript API to make your life easy. It extends the standard JavaScript API and adds richer layers of capabilities, such as access to the underlying file system, easy access to and manipulation of DOM objects, and access to system information and environment variables. It even gives us the ability to inject custom scripts into the web page.

But be warned. PhantomJS has a very active community, and changes are introduced every now and then. New APIs and objects are added, a few existing items change, and some are eventually deprecated or removed altogether. The PhantomJS website has a full list of all the functions and their syntax, with deprecated functions tagged as such. We should visit it regularly to check for upcoming updates so that we can adjust our code accordingly.

The Module API

While writing custom objects and API sets, you may want to create custom modules that will make your life easier. You can do that in PhantomJS using the Module API. It allows you to create your own modules and import them anywhere in your implementation. The built-in modules are webpage, system, fs (File System), and webserver.
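A custom module might look like the following minimal sketch. The greeter.js filename and the greet function are made up for illustration; the only mechanism assumed is the CommonJS-style exports object that require() hands back to the caller.

```javascript
// greeter.js: a hypothetical custom module.
function greet(name) {
    return 'Hello ' + name + '!';
}

// Expose greet to whoever loads this file with require().
// The guard keeps the file loadable where no exports object is provided.
if (typeof exports !== 'undefined') {
    exports.greet = greet;
}
```

Assuming the file is saved as greeter.js next to your script, another script could then load it with var greeter = require('./greeter'); and call greeter.greet('Ghost').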

The WebPage API

PhantomJS is a headless browser. Accessing and manipulating web documents are its core functionalities, and that's what the WebPage API is used for. The WebPage API allows us to access, control, and manipulate web documents. It provides a rich interface to easily reference and extract page details, including document content. It lets us capture events, such as when a page loads, when an error occurs within the page, or when navigation to another page is requested. It is also capable of capturing pages and saving them as images. More importantly, it allows you to manipulate documents on the fly and traverse the DOM as you do with any web page. This is enormously valuable for writing automated user interface tests (for example, you can force click events or post forms, and capture the results) as well as for standard web scraping of public URLs.
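The following is a minimal sketch of that workflow: open a page, read its title from inside the document, and save a screenshot. The URL, the output filename, and the describeLoad helper are placeholders chosen for illustration; create(), open(), evaluate(), and render() are the WebPage calls being shown.

```javascript
// Turns the status string passed to the page.open() callback into a message.
function describeLoad(url, status) {
    return status === 'success' ? 'Loaded ' + url : 'Failed to load ' + url;
}

// The phantom object exists only when the file is run by PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var url = 'http://example.com/';       // placeholder URL (assumption)

    page.open(url, function (status) {
        console.log(describeLoad(url, status));
        if (status === 'success') {
            // evaluate() runs the function inside the page and returns its result.
            var title = page.evaluate(function () {
                return document.title;
            });
            console.log('Title: ' + title);
            page.render('example.png');    // hypothetical output filename
        }
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```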

The System API

The System module provides system-level functionality, ranging from OS information and environment variables to command-line arguments and process-related properties. The System module becomes very useful as you develop more applications with PhantomJS.
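As a small sketch of the System module, the script below prints the command-line arguments and one environment variable. The userArgs helper is made up for illustration, and the PATH variable is an assumption about your environment; system.args and system.env are the properties being shown.

```javascript
// system.args[0] is the script name, so user arguments start at index 1.
function userArgs(args) {
    return Array.prototype.slice.call(args, 1);
}

// The phantom object exists only when the file is run by PhantomJS.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    console.log('Arguments: ' + JSON.stringify(userArgs(system.args)));
    // system.env maps environment variable names to their values.
    console.log('PATH: ' + system.env.PATH);
    phantom.exit(0);
}
```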

The FileSystem API

Accessing files, writing to text files, or just reading a custom configuration file: these are all tasks that can be done with PhantomJS. FileSystem provides a standard API to perform file I/O. You can read, write, and delete files; you can even list the files in a folder.

FileSystem has 31 functions to manage and manipulate files within PhantomJS. We have been using this set extensively, and it helps solve several problems without fail. Writing JSON data to a file is very basic, but you will find it much easier using the FileSystem API.
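Writing JSON data to a file could be sketched as follows. The settings object, the toJson helper, and the settings.json filename are made up for illustration; write(), read(), and remove() are the fs module calls being shown.

```javascript
// Serializes data as indented JSON text.
function toJson(data) {
    return JSON.stringify(data, null, 2);
}

var settings = { name: 'Tara', retries: 3 };   // sample data, made up
var text = toJson(settings);

// The phantom object exists only when the file is run by PhantomJS.
if (typeof phantom !== 'undefined') {
    var fs = require('fs');
    var path = 'settings.json';                // hypothetical filename

    fs.write(path, text, 'w');                 // 'w' opens the file for writing
    var loaded = JSON.parse(fs.read(path));
    console.log('Read back name: ' + loaded.name);

    fs.remove(path);                           // clean up the sample file
    phantom.exit(0);
}
```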

The WebServer API

Perhaps you have a grand idea, but it requires you to process a web request and execute a PhantomJS script on it. PhantomJS can do that; you can embed your own web server implementation using the WebServer API within your PhantomJS application. This feature is marked as experimental, but it does work, and with roughly five lines of code you can have your own web server running.

This module is based on the open source Mongoose web server library that supports multiple platforms, authorization, web sockets, URL rewrite, and even "resumeable" downloads. For more information about Mongoose, visit http://code.google.com/p/mongoose/.
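A web server along those lines might look like the sketch below. The port number and the greetingFor helper are assumptions made for illustration; create(), listen(), and the statusCode, write(), and close() members of the response object are the WebServer calls being shown.

```javascript
// Builds the response body for a requested URL.
function greetingFor(url) {
    return 'Hello from PhantomJS, you asked for ' + url;
}

// The phantom object exists only when the file is run by PhantomJS.
if (typeof phantom !== 'undefined') {
    var server = require('webserver').create();
    var port = 8080;                           // assumption: this port is free

    // listen() returns false when the server could not be started.
    var started = server.listen(port, function (request, response) {
        response.statusCode = 200;
        response.write(greetingFor(request.url));
        response.close();
    });

    if (!started) {
        console.log('Could not listen on port ' + port);
        phantom.exit(1);
    }
}
```

The script does not call phantom.exit() on success, so the server keeps running until the process is stopped.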
