Node.js Web Development - Third Edition

By David Herron
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Free Chapter
    About Node.js
About this book

Node.js is a server-side JavaScript platform using an event driven, non-blocking I/O model allowing users to build fast and scalable data-intensive applications running in real time. Node.js Web Development shows JavaScript is not just for browser-side applications. It can be used for server-side web application development, real-time applications, microservices, and much more.

This book gives you an excellent starting point, bringing you straight to the heart of developing web applications with Node.js. You will progress from a rudimentary knowledge of JavaScript and server-side development to being able to create and maintain your own Node.js application. With this book you'll learn how to use the HTTP Server and Client objects, data storage with both SQL and MongoDB databases, real-time applications with Socket.IO, mobile-first theming with Bootstrap, microservice deployment with Docker, authenticating against third-party services using OAuth, and much more.

Publication date:
June 2016
Publisher
Packt
Pages
376
ISBN
9781785881503

 

Chapter 1. About Node.js

Node.js is an exciting new platform for developing web applications, application servers, any sort of network server or client, and general purpose programming. It is designed for extreme scalability in networked applications through an ingenious combination of server-side JavaScript, asynchronous I/O, asynchronous programming, built around JavaScript anonymous functions, and a single execution thread event-driven architecture.

While only a few years old, Node.js has quickly grown in prominence to where it's playing a significant role. Companies, small and large, are using it for large-scale and small-scale projects. PayPal, for example, has converted many services from Java to Node.js.

The Node.js model is very different from common application server platforms using threads. The claim is that with the single-thread event-driven architecture, memory footprint is low, throughput is high, the latency profile under load is better, and the programming model is simpler. The Node.js platform is in a phase of rapid growth, and many are seeing it as a compelling alternative to the traditional Java, PHP, Python, Ruby on Rails, and so on, approach to building web applications.

At its heart, it is a standalone JavaScript engine with extensions making it suitable for general purpose programming and with a clear focus on application server development. Even though we're comparing Node.js to application server platforms, it is not an application server. Instead, Node.js is a programming runtime akin to Python, Go, or Java SE. There are web application frameworks and application servers written in Node.js, however. In the few years that Node.js has been available, it's quickly gained a significant role, fulfilling the prediction that it could potentially supplant other web application stacks.

It is implemented around a non-blocking I/O event loop and a layer of file and network I/O libraries, all built on top of the V8 JavaScript engine (from the Chrome web browser). At the time of writing this, Microsoft had just proposed a patch to allow Node.js to utilize the ChakraCore JavaScript engine (from the Edge web browser). The theoretical possibility of hosting the Node.js API on top of a different JavaScript engine may come true, in the due course of time. Visit https://github.com/nodejs/node-chakracore to take a look at the project.

The I/O library is general enough to implement any sort of server implementing any TCP or UDP protocol, whether it's DNS, HTTP, IRC, or FTP. While it supports developing servers or clients for any network protocol, its biggest use case is in regular websites in place of technology such as an Apache/PHP or Rails stack or to complement existing websites. For example, adding real-time chat or monitoring existing websites can be easily done with the Socket.IO library for Node.js.

A particularly intriguing combination is deploying small services using Docker into cloud hosting infrastructure. A large application can be divided into what's now called microservices and easily deployed at scale using Docker. The result fits agile project management methods since each microservice can be easily managed by a small team which collaborates at the boundary of their individual API.

This book will give you an introduction to Node.js. We presume the following:

  • You already know how to write software

  • You are familiar with JavaScript

  • You know something about developing web applications in other languages

We will cover the following topics in this chapter:

  • An introduction to Node.js

  • Why you should use Node.js

  • The architecture of Node.js

  • Performance, utilization, and scalability with Node.js

  • Node.js, microservice architecture, and testing

  • Implementing the Twelve-Factor App model with Node.js

We will dive right into developing working applications and recognize that often the best way to learn is by rummaging around in working code.

 

The capabilities of Node.js


Node.js is a platform for writing JavaScript applications outside web browsers. This is not the JavaScript we are familiar with in web browsers! For example, there is no DOM built into Node.js, nor any other browser capability.

Beyond its native ability to execute JavaScript, the bundled modules provide capabilities of this sort:

  • Command-line tools (in shell script style)

  • An interactive-TTY style of program (REPL which stands for Read-Eval-Print Loop)

  • Excellent process control functions to oversee child processes

  • A buffer object to deal with binary data

  • TCP or UDP sockets with comprehensive event-driven callbacks

  • DNS lookup

  • An HTTP and HTTPS client/server layered on top of the TCP library filesystem access

  • Built-in rudimentary unit testing support through assertions

The network layer of Node.js is low level while being simple to use. For example, the HTTP modules allow you to write an HTTP server (or client) using a few lines of code. This is powerful, but it puts you, the programmer, very close to the protocol requests and makes you implement precisely those HTTP headers that you should return in request responses.

In other words, it's very easy to write an HTTP server in Node.js, but the typical web application developer doesn't need to work at that level of detail. For example, PHP coders assume that Apache is already there, and that they don't have to implement the HTTP server portion of the stack. The Node.js community has developed a wide range of web application frameworks such as Express, allowing developers to quickly configure an HTTP server that provides all of the basics we've come to expect—sessions, cookies, serving static files, logging, and so on—thus letting developers focus on their business logic.

Server-side JavaScript

Quit scratching your head already. Of course you're doing it, scratching your head and mumbling to yourself, "What's a browser language doing on the server?". In truth, JavaScript has a long and largely unknown history outside the browser. JavaScript is a programming language, just like any other language, and the better question to ask is "Why should JavaScript remain trapped inside browsers?".

Back in the dawn of the web age, the tools for writing web applications were at a fledgling stage. Some were experimenting with Perl or TCL to write CGI scripts, and the PHP and Java languages had just been developed. Even then, JavaScript saw use on the server side. One early web application server was Netscape's LiveWire server, which used JavaScript. Some versions of Microsoft's ASP used JScript, their version of JavaScript. A more recent server-side JavaScript project is the RingoJS application framework in the Java universe In other words, JavaScript outside the browser is not a new thing, even if it is uncommon.

 

Why should you use Node.js?


Among the many available web application development platforms, why should you chose Node.js? There are many stacks to choose from; What is it about Node.js that makes it rise above the others? We will see in the following sections.

Popularity

Node.js is quickly becoming a popular development platform with adoption from plenty of big and small players. One of those is PayPal, who are replacing their incumbent Java-based system with one written in Node.js. For PayPal's blog post about this, visit https://www.paypal-engineering.com/2013/11/22/node-js-at-paypal/. Other large Node.js adopters include Walmart's online e-commerce platform, LinkedIn, and eBay.

Since we shouldn't just follow the crowd, let's look at technical reasons to adopt Node.js.

JavaScript at all levels of the stack

Having the same programming language on the server and client has been a long time dream on the web. This dream dates back to the early days of Java, where Java applets were to be the frontend to server applications written in Java, and JavaScript was originally envisioned as a lightweight scripting language for those applets. Java never fulfilled the hype and we ended up with JavaScript as the principle in-browser client-side language, rather than Java. With Node.js we may finally be able to implement applications with the same programming language on the client and server by having JavaScript at both ends of the web, in the browser and server.

A common language for frontend and backend offers several potential wins:

  • The same programming staff can work on both ends of the wire

  • Code can be migrated between server and client more easily

  • Common data formats (JSON) exist between server and client

  • Common software tools exist for server and client

  • Common testing or quality reporting tools for server and client

  • When writing web applications, view templates can be used on both sides

The JavaScript language is very popular due to its ubiquity in web browsers. It compares favorably against other languages while having many modern advanced language concepts. Thanks to its popularity, there is a deep talent pool of experienced JavaScript programmers out there.

Leveraging Google's investment in V8

To make Chrome a popular and excellent web browser, Google invested in making V8 a super-fast JavaScript engine. The competition to make the best web browser leads Google to keep on improving V8. As a result, Node.js programmers automatically win as each V8 iteration ratchets up performance and capabilities.

The Node.js community may change things to utilize any JavaScript engine, in case another one ends up surpassing V8.

Leaner asynchronous event-driven model

We'll get into this later. The Node.js architecture, a single execution thread and a fast JavaScript engine, has less overhead than thread-based architectures.

Microservice architecture

A new hotness in software development is the microservice idea. Node.js is an excellent platform for implementing microservices. We'll get into this later.

The Node.js is stronger for having survived a major schism and hostile fork

During 2014 and 2015, the Node.js community faced a major split over policy, direction, and control. The io.js project was a hostile fork driven by a group who wanted to incorporate several features and change who's in control of making decisions. What resulted is a merge of the Node.js and io.js repositories, an independent Node.js foundation to run the show, and the community is working together to move forward in a common direction.

Threaded versus event-driven architecture Node.js's blistering performance is said to be because of its asynchronous event-driven architecture, and its use of the V8 JavaScript engine. That's a nice thing to say, but what's the rationale for the statement?

The normal application server model uses blocking I/O to retrieve data, and it uses threads for concurrency. Blocking I/O causes threads to wait, causing a churn between threads as they are forced to wait on I/O while the application server handles requests. Threads add complexity to the application server as well as server overhead.

Node.js has a single execution thread with no waiting on I/O or context switching. Instead, there is an event loop looking for events and dispatching them to handler functions. The paradigm is that any operation that would block or otherwise take time to complete must use the asynchronous model. These functions are to be given an anonymous function to act as a handler callback, or else (with the advent of ES2015 promises), the function would return a Promise. The handler function, or promise, is invoked when the operation is complete. In the meantime, control returns to the event loop, which continues dispatching events.

To help us wrap our heads around this, Ryan Dahl, the creator of Node.js, (in his Cinco de Node presentation) asked us what happens while executing a line of code like this:

result = query('SELECT * from db');
// operate on the result

Of course, the program pauses at that point while the database layer sends the query to the database, which determines the result and returns the data. Depending on the query, that pause can be quite long. Well, a few milliseconds, which is an eon in computer time. This pause is bad because while the entire thread is idling, another request might come in and need to be handled. This is where a thread-based server architecture would need to make a thread context switch. The more outstanding connections to the server, the greater the number of thread context switches. Context switching is not free because more threads requires more memory for per-thread state and more time for the CPU to spend on thread management overhead.

Simply using an asynchronous event-driven I/O, Node.js removes most of this overhead while introducing very little of its own.

Using threads to implement concurrency often comes with admonitions like these: expensive and error-prone, the error-prone synchronization primitives of Java, or designing concurrent software can be complex and error prone. The complexity comes from the access to shared variables and various strategies to avoid deadlock and competition between threads. The synchronization primitives of Java are an example of such a strategy, and obviously many programmers find them difficult to use. There's the tendency to create frameworks such as java.util.concurrent to tame the complexity of threaded concurrency, but some might argue that papering over complexity does not make things simpler.

Node.js asks us to think differently about concurrency. Callbacks fired asynchronously from an event loop are a much simpler concurrency model—simpler to understand, and simpler to implement.

Ryan Dahl points to the relative access time of objects to understand the need for asynchronous I/O. Objects in memory are more quickly accessed (on the order of nanoseconds) than objects on disk or objects retrieved over the network (milliseconds or seconds). The longer access time for external objects is measured in zillions of clock cycles, which can be an eternity when your customer is sitting at their web browser ready to move on if it takes longer than two seconds to load the page.

In Node.js, the query discussed previously will read as follows:

query('SELECT * from db', function (err, result) {
    if (err) throw err; // handle errors
    // operate on result
});

Or if written with an ES2015 Promise:

query('SELECT * from db')
.then(result => {
    // operate on result
})
.catch(err => {
    // handle errors
});

This code performs the same query written earlier. The difference is that the query result is not the result of the function call, but it is provided to a callback function that will be called later. The order of execution is not one line after another, but it is instead determined by the order of callback function execution.

Once the call to the query function finishes, control will return almost immediately to the event loop, which goes on to servicing other requests. One of those requests will be the response to the query, which invokes the callback function.

Commonly, web pages bring together data from dozens of sources. Each one has a query and response as discussed earlier. Using asynchronous queries, each one can happen in parallel, where the page construction function can fire off dozens of queries—no waiting, each with their own callback—and then go back to the event loop, invoking the callbacks as each is done. Because it's in parallel, the data can be collected much more quickly than if these queries were done synchronously one at a time. Now, the reader on the web browser is happier because the page loads more quickly.

Performance and utilization

Some of the excitement over Node.js is due to its throughput (the requests per second it can serve). Comparative benchmarks of similar applications, for example, Apache show that Node.js has tremendous performance gains.

One benchmark going around is this simple HTTP server (borrowed from https://nodejs.org/en/), which simply returns a "Hello World" message directly from memory:

var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(8124, "127.0.0.1");
console.log('Server running at http://127.0.0.1:8124/');

This is one of the simpler web servers one can build with Node.js. The http object encapsulates the HTTP protocol, and its http.createServer method creates a whole web server, listening on the port specified in the listen method. Every request (whether a GET or POST on any URL) on that web server calls the provided function. It is very simple and lightweight. In this case, regardless of the URL, it returns a simple text/plain Hello World response.

Ryan Dahl (Node.js's original author) showed a simple benchmark (http://nodejs.org/cinco_de_node.pdf) that returned a 1-megabyte binary buffer; Node.js gave 822 req/sec while Nginx gave 708 req/sec, for a 15% improvement over Nginx. He also noted that Nginx peaked at 4 megabytes memory, while Node.js peaked at 64 megabytes.

Yahoo! search engineer Fabian Frank published a performance case study of a real-world search query suggestion widget implemented with Apache/PHP and two variants of Node.js stacks (http://www.slideshare.net/FabianFrankDe/nodejs-performance-case-study). The application is a pop-up panel showing search suggestions as the user types in phrases, using a JSON-based HTTP query. The Node.js version could handle eight times the number of requests per second with the same request latency. Fabian Frank said both Node.js stacks scaled linearly until CPU usage hit 100%. In another presentation (http://www.slideshare.net/FabianFrankDe/yahoo-scale-nodejs), he discussed how Yahoo!Axis is running on Manhattan + Mojito and the value of being able to use the same language (JavaScript) and framework (YUI/YQL) on both frontend and backend.

LinkedIn did a massive overhaul of their mobile app using Node.js for the server-side to replace an old Ruby on Rails app. The switch let them move from 30 servers down to three, and allowed them to merge the frontend and backend team because everything was written in JavaScript. Before choosing Node.js, they'd evaluated Rails with Event Machine, Python with Twisted, and Node.js, choosing Node.js for the reasons that we just discussed. For a look at what LinkedIn did, see http://arstechnica.com/information-technology/2012/10/a-behind-the-scenes-look-at-linkedins-mobile-engineering/.

Mikito Takada blogged about benchmarking and performance improvements in a 48 hour hackathon application (http://blog.mixu.net/2011/01/17/performance-benchmarking-the-node-js-backend-of-our-48h-product-wehearvoices-net/) he built comparing Node.js with what he claims is a similar application written with Django (a web application framework for Python). The unoptimized Node.js version is quite a bit slower (in response time) than the Django version but a few optimizations (MySQL connection pooling, caching, and so on) made drastic performance improvements handily beating out Django.

Is Node.js a cancerous scalability disaster?

In October 2011, software developer and blogger Ted Dziuba wrote an infamous blog post (since pulled from his blog) claiming that Node.js is a cancer, calling it a "scalability disaster". The example he showed for proof is a CPU-bound implementation of the Fibonacci sequence algorithm. While his argument was flawed, he raised a valid point that Node.js application developers have to consider—where do you put the heavy computational tasks?

A key to maintaining high throughput of Node.js applications is ensuring that events are handled quickly. Because it uses a single execution thread, if that thread is bogged down with a big calculation, it cannot handle events, and the system performance will suffer.

The Fibonacci sequence, serving as a stand-in for heavy computational tasks, quickly becomes computationally expensive to calculate, especially for a naïve implementation like this:

var fibonacci = exports.fibonacci = function(n) {
    if (n === 1 || n === 2)
        return 1;
    else
        return fibonacci(n-1) + fibonacci(n-2);
}

Yes, there are many ways to calculate Fibonacci numbers more quickly. We are showing this as a general example of what happens to Node.js when event handlers are slow, and not to debate the best ways to calculate mathematics functions:

var http = require('http');
var url  = require('url');

var fibonacci = // as above

http.createServer(function (req, res) {
  var urlP = url.parse(req.url, true);
  var fibo;
  res.writeHead(200, {'Content-Type': 'text/plain'});
  if (urlP.query['n']) {
    fibo = fibonacci(urlP.query['n']);
    res.end('Fibonacci '+ urlP.query['n'] +'='+ fibo);
  } else {
    res.end('USAGE: http://127.0.0.1:8124?n=## where ## is the Fibonacci number desired');
  }
}).listen(8124, '127.0.0.1');
console.log('Server running at http://127.0.0.1:8124');

If you call this from the request handler in a Node.js HTTP server, for sufficiently large values of n (for example, 40), the server becomes completely unresponsive because the event loop is not running, as this function is grinding through the calculation.

Does this mean that Node.js is a flawed platform? No, it just means that the programmer must take care to identify code with long-running computations and develop a solution. The possible solutions include rewriting the algorithm to work with the event loop or to foist computationally expensive calculations to a backend server.

A simple rewrite dispatches the computations through the event loop, letting the server continue handling requests on the event loop. Using callbacks and closures (anonymous functions), we're able to maintain asynchronous I/O and concurrency promises:

var fibonacciAsync = exports.fibonacciAsync = function(n, done) {
    if (n === 1 || n === 2) done(1);
    else {
        process.nextTick(function() {
            fibonacciAsync(n-1, function(val1) {
                process.nextTick(function() {
                    fibonacciAsync(n-2, function(val2) {
                        done(val1+val2);
                    });
                });
            });
        });
    }
}

Dziuba's valid point wasn't expressed well in his blog post, and it was somewhat lost in the flames following that post. Namely, that while Node.js is a great platform for I/O-bound applications, it isn't a good platform for computationally intensive ones.

Server utilization, the bottom line, and green web hosting

The striving for optimal efficiency (handling more requests per second) is not just about the geeky satisfaction that comes from optimization. There are real business and environmental benefits. Handling more requests per second, as Node.js servers can do, means the difference between buying lots of servers and buying only a few servers. Node.js can let your organization do more with less.

Roughly speaking, the more servers you buy, the greater the cost, and the greater the environmental impact. There's a whole field of expertise around reducing cost and the environmental impact of running web server facilities, to which that rough guideline doesn't do justice. The goal is fairly obvious—fewer servers, lower costs, and lower environmental impact.

Intel's paper, Increasing Data Center Efficiency with Server Power Measurements (http://download.intelintel.com/it/pdf/Server_Power_Measurement_final.pdf), gives an objective framework for understanding efficiency and data center costs. There are many factors such as buildings, cooling systems, and computer system designs. Efficient building design, efficient cooling systems, and efficient computer systems (datacenter efficiency, datacenter density, and storage density) can decrease costs and environmental impact. But you can destroy those gains by deploying an inefficient software stack compelling you to buy more servers than you would if you had an efficient software stack. Alternatively, you can amplify gains from datacenter efficiency with an efficient software stack.

This talk about efficient software stacks isn't just for altruistic environmental purposes. This is one of those cases where being green can help your business bottom line.

Node.js, the microservice architecture, and easily testable systems

New capabilities such as cloud deployment systems and Docker make it possible to implement a new kind of service architecture. Docker makes it possible to define server process configuration in a repeatable container that's easy to deploy by the millions into a cloud hosting system. It lends itself best to small single-purpose service instances that can be connected together to make a complete system. Docker isn't the only tool to help simplify cloud deployments; however, its features are well attuned to modern application deployment needs.

Some have popularized the microservice concept as a way to describe this kind of system. According to the microservices.io website, a microservice consists of a set of narrowly focused, independently deployable services. They contrast this with the monolithic application deployment pattern where every aspect of the system is integrated into one bundle (such as a single WAR file for a Java EE appserver). The microservice model gives developers much needed flexibility.

Some advantages of microservices are as follows:

  • Each microservice can be managed by a small team

  • Each team can work on its own schedule, so long as the service API compatibility is maintained

  • Microservices can be deployed independently, such as for easier testing

  • It's easier to switch technology stack choices

Where does Node.js fit with this? Its design fits the microservice model like a glove:

  • Node.js encourages small, tightly focused, single purpose modules

  • These modules are composed into an application by the excellent npm package management system

  • Publishing modules is incredibly simple, whether via the NPM repository or a Git URL

Node.js and the Twelve-Factor app model

Throughout this book, we'll call out aspects of the Twelve-Factor application model, and ways to implement those ideas in Node.js. This model is published on http://12factor.net, and it is a set of guidelines for application deployment in the modern cloud computing era.

The guidelines are straightforward, and once you read them, they seem like pure common sense. As a best practice, the Twelve-Factor model is a compelling strategy for delivering the kind of fluid self-contained cloud deployed applications called for by our current computing environment.

 

Summary


You learned a lot in this chapter. Specifically, you saw that JavaScript has a life outside web browsers and you learned about the difference between asynchronous and blocking I/O. We then covered the attributes of Node.js and where it fits in the overall web application platform market and threaded versus asynchronous software. Lastly, we saw the advantages of fast event-driven asynchronous I/O, coupled with a language with great support for anonymous closures.

Our focus in this book is real-world considerations of developing and deploying Node.js applications. We'll cover as many aspects as we can of developing, refining, testing, and deploying Node.js applications.

Now that we've had this introduction to Node.js, we're ready to dive in and start using it. In Chapter 2, Setting up Node.js, we'll go over setting up a Node.js environment, so let's get started.

About the Author
  • David Herron

    David Herron is a software engineer living in Silicon Valley who has worked on projects ranging from an X.400 email server to being part of the team that launched the OpenJDK project, to Yahoo's Node.js application-hosting platform, and a solar array performance monitoring service. That took David through several companies until he grew tired of communicating primarily with machines, and developed a longing for human communication. Today, David is an independent writer of books and blog posts covering topics related to technology, programming, electric vehicles, and clean energy technologies.

    Browse publications by this author
Latest Reviews (10 reviews total)
Хорошая книга, но немного не тот материал который мне сейчас нужен. Качество изложения не дотягивает в том числе до 5 звезд.
I got the $5 purchase and was a good deal for me. It helped
Making everything available during special sales.
Node.js Web Development - Third Edition
Unlock this book and the full library FREE for 7 days
Start now