
Mastering Node.js - Second Edition

By Sandro Pasquali , Kevin Faaborg
About this book
Node.js is a modern development environment that enables developers to write server- and client-side code with JavaScript, which has made it a popular choice among developers. This book covers the features of Node that are especially helpful to developers creating highly concurrent real-time applications. It takes you on a tour of Node's innovative evented, non-blocking design, showing you how to build professional applications. This edition has been updated to cover the latest features of Node 9 and ES6. All code examples and demo applications have been completely rewritten using the latest techniques, introducing Promises, functional programming, async/await, and other cutting-edge patterns for writing JavaScript code. Learn how to use microservices to simplify the design and composition of distributed systems. From building serverless cloud functions to native C++ plugins, from chatbots to massively scalable SMS-driven applications, you'll be prepared for building the next generation of distributed software. By the end of this book, you'll be building better Node applications more quickly, with less code and more power, and know how to run them at scale in production environments.
Publication date:
December 2017
Publisher
Packt
Pages
498
ISBN
9781785888960

 

Understanding the Node Environment

 

Introduction – JavaScript as a systems language

When John Bardeen, Walter Brattain, and William Shockley invented the transistor in 1947, they changed the world in ways we are still discovering today. From their revolutionary building block, engineers could design and manufacture digital circuits far more complex than those possible earlier. Each decade that followed has seen a new generation of these devices: smaller, faster, and cheaper, often by orders of magnitude.

By the 1970s, corporations and universities could afford mainframe computers small enough to fit in a single room, and powerful enough that they could serve multiple users simultaneously. The minicomputer, a new and different kind of device, needed new and different kinds of technologies to help users get the most out of the machine. Ken Thompson and Dennis Ritchie at Bell Labs developed the operating system Unix, and the programming language C to write it. They built constructs into their system, like processes, threads, streams, and the hierarchical filesystem. Today, these constructs are so familiar that it's hard to imagine a computer working any other way. However, they're just constructs, made up by these pioneers with the goal of helping people like us understand the otherwise inscrutable patterns of data in memory and storage inside the machine.

C is a systems language, and it is a safe and powerful shorthand alternative for developers familiar with keying in assembly instructions. Given its familiar setting of a microprocessor, C makes low-level system tasks easy. For instance, you can search a block of memory for a byte of a specific value:

// find-byte.c
int find_byte(const char *buffer, int size, const char b) {
  for (int i = 0; i < size; i++) {
    if (buffer[i] == b) {
      return i;
    }
  }
  return -1;
}

By the 1990s, what we could build with transistors had evolved again. A personal computer (PC) was light and cheap enough to be found on workplace and dormitory desktops. Increased speed and capacity allowed users to boot from a character-only teletype to graphical environments, with pretty fonts and color images. And with an Ethernet card and cable, your computer got a static IP address on the internet, where network programs could connect to send and receive data with any other computer on the planet.

It was within this landscape of technology that Sir Tim Berners-Lee invented the World Wide Web, and Brendan Eich created JavaScript. Designed for coders familiar with HTML tags, JavaScript was a way to move beyond static pages of text with animation and interactivity. Given its familiar setting of a webpage, JavaScript makes high-level tasks easy. Web pages are filled with text and tags, so combining two strings is easy:

// combine-text.js
const s1 = "first string";
const s2 = "second string";
let s3 = s1 + s2;

Now, let's port each program to the other language and platform. First, from the preceding combine-text.js, let's write combine-text.c:

// combine-text.c
#include <stdlib.h>
#include <string.h>
int main(void) {
  const char *s1 = "first string";
  const char *s2 = "second string";
  size_t size = strlen(s1) + strlen(s2);
  char *buffer = malloc(size + 1); // One more for the 0x00 byte that terminates strings
  if (buffer == NULL) return 1; // Allocation can fail
  strcpy(buffer, s1);
  strcat(buffer, s2);
  free(buffer); // Never forget to free memory!
  return 0;
}

The two string literals are easy to define, but after that, it gets a lot harder. Without automatic memory management, it's your responsibility as a developer to determine how much memory you need, allocate it from the system, write to it without overwriting the buffer, and then free it afterwards.

Secondly, let's attempt the reverse: from the find-byte.c code prior, let's write find-byte.js. Before Node, it was not possible to use JavaScript to search a block of memory for a specific byte. In the browser, JavaScript can't allocate a buffer, and doesn't even have a type for byte. But with Node, it's both possible and easy:

// find-byte.js
function find_byte(buffer, b) {
  let i;
  for (i = 0; i < buffer.length; i++) {
    if (buffer[i] === b) {
      return i;
    }
  }
  return -1; // Not found
}
let buffer = Buffer.from("ascii A is byte value sixty-five", "utf8");
let r = find_byte(buffer, 65); // Find the first byte with value 65
console.log(r); // 6 bytes into the buffer
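For comparison, Node's Buffer also comes with a built-in indexOf method that performs the same kind of search. The following is only a minimal sketch; the file name and variable names are illustrative and not part of the original example:

```javascript
// find-byte-indexof.js (illustrative file name)
const buffer = Buffer.from("ascii A is byte value sixty-five", "utf8");

// Buffer#indexOf accepts a number and searches for that byte value
const found = buffer.indexOf(65);   // Byte value 65 is "A"
const missing = buffer.indexOf(0);  // There is no zero byte in this buffer

console.log(found, missing); // 6 -1
```

Writing the loop by hand, as above, shows what is happening; in everyday code the built-in method is usually the shorter route.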

Emerging from generations of computing decades apart, when both computers and what people were doing with them were wildly different, there's no real reason the design, purpose, or use that drives these two languages, C and JavaScript, should necessarily come together. But they did, because in 2008 Google released Chrome, and in 2009, Ryan Dahl wrote Node.js.

Applying design principles previously only considered for operating systems, Chrome uses multiple processes to render different tabs, ensuring their isolation. Chrome was released open source and built on WebKit, but one part inside was completely new. Coding from scratch in his farmhouse in Denmark, Lars Bak built V8, which used hidden class transitions, incremental garbage collection, and dynamic code generation to execute (not interpret) JavaScript faster than ever before.

With V8 under the hood, how fast can Node run JavaScript? Let's write a little program to show execution speed:

// speed-loop.js
function main() {
  const cycles = 1000000000;
  let start = Date.now();
  for (let i = 0; i < cycles; i++) {
    /* Empty loop */
  }
  let end = Date.now();
  let duration = (end - start) / 1000;
  console.log("JavaScript looped %d times in %d seconds", cycles, duration);
}
main();

The following is the output for speed-loop.js:

$ node --version
v9.3.0
$ node speed-loop.js
JavaScript looped 1000000000 times in 0.635 seconds

There's no code in the body of the for loop, but your processor is busy incrementing i, comparing it to cycles, and repeating the process. It's late 2017 as I write this, typing on a MacBook Pro with a 2.8 GHz Intel Core i7 processor. Node v9.3.0 is current, and takes less than a second to loop a billion times.
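Date.now() only resolves to the millisecond. If you want a finer-grained measurement, Node's process.hrtime() reports elapsed time as a pair of seconds and nanoseconds. The following is a rough sketch rather than a rigorous benchmark; the timeLoop helper is a hypothetical name, and the cycle count is deliberately small:

```javascript
// speed-loop-hrtime.js (illustrative file name)
// Hypothetical helper: time an empty loop using Node's high-resolution timer
function timeLoop(cycles) {
  const start = process.hrtime();           // [seconds, nanoseconds]
  for (let i = 0; i < cycles; i++) {
    /* Empty loop */
  }
  const [seconds, nanoseconds] = process.hrtime(start); // Elapsed since start
  return seconds + nanoseconds / 1e9;
}

const cycles = 10000000;
console.log(`JavaScript looped ${cycles} times in ${timeLoop(cycles)} seconds`);
```

The number you see will vary from run to run and from machine to machine.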

How fast is pure C? Let's see:

/* speed-loop.c */
#include <stdio.h>
#include <time.h>
int main() {
  int cycles = 1000000000;
  clock_t start, end;
  double duration;
  start = clock();
  for (int i = 0; i < cycles; i++) {
    /* Empty loop */
  }
  end = clock();
  duration = ((double)(end - start)) / CLOCKS_PER_SEC;
  printf("C looped %d times in %lf seconds\n", cycles, duration);
  return 0;
}

The following is the output for speed-loop.c:

$ gcc --version
Apple LLVM version 8.1.0 (clang-802.0.42)
$ gcc speed-loop.c -o speed-loop
$ ./speed-loop
C looped 1000000000 times in 2.398294 seconds

For additional comparison, let's try an interpreted language, like Python:

# speed-loop.py
import time

def main():
    cycles = 1000000000
    start = time.perf_counter()
    for i in range(0, cycles):
        pass # Empty loop
    end = time.perf_counter()
    duration = end - start
    print("Python looped %d times in %.3f seconds" % (cycles, duration))

main()

The following is the output for speed-loop.py:

$ python3 --version
Python 3.6.1
$ python3 speed-loop.py
Python looped 1000000000 times in 31.096 seconds

Node runs code fast enough that you don't have to worry that your application might be slowed down by the execution speed. You'll still have to think about performance, of course, but the constraints will come from factors beyond language and platform choice, such as algorithms, I/O, and external processes, services, and APIs. As V8 compiles JavaScript rather than interpreting it, Node lets you enjoy high-level language features like automatic memory management and dynamic types, without having to give up the performance of a natively-compiled binary. Earlier, you had to choose one or the other; but now, you can have both. It's great.

Computing in the 1970s was about the microprocessor, and computing in the 1990s was about the web page. Today, in 2017, another new generation of physical computing technology has once again changed our machines. The smartphone in your pocket communicates wirelessly with scalable, pay-as-you-go software services in the cloud. Those services run on virtualized instances of Unix, which in turn run on physical hardware in data centers, some of which are so large they were strategically placed to draw current from a neighboring hydroelectric dam. With such new and different machines as these, we shouldn't be surprised that what's possible for users and what's necessary for developers is also new and different, once again.

Node.js imagines JavaScript as a systems language, like C. On the page, JavaScript can manipulate headers and styles. As a systems language, JavaScript can manipulate memory buffers, processes and streams, and files and sockets. This anachronism, made possible by the performance V8 gives the language, sends it back two decades, transplanting it from the web page to the microprocessor die.

"Node's goal is to provide an easy way to build scalable network programs."
– Ryan Dahl, creator of Node.js

In this book, we will study the techniques professional Node developers use to tackle the software challenges of today. By mastering Node, you are learning how to build the next generation of software. In this chapter, we will explore how a Node application is designed, the shape and texture of its footprint on a server, and the powerful base set of tools and features Node provides for developers. Throughout, we will examine progressively more intricate examples demonstrating how Node's simple, comprehensive, and consistent architecture solves many difficult problems well.

The Unix design philosophy

As a network application scales, the volume of information it must recognize, organize, and maintain increases. This volume, in terms of I/O streams, memory usage, and processor load, expands as more clients connect. This expansion of information volume also burdens the software developer. Scaling issues appear, usually demonstrating a failure to accurately predict the behavior of a large system from the behavior of its smaller predecessors:

  • Can a data layer designed for storing a few thousand records accommodate a few million?
  • Are the algorithms used to search a handful of records efficient enough to search many more?
  • Can this server handle 10,000 simultaneous client connections?

The edge of innovation is sharp and cuts quickly, presenting less time for deliberation precisely when the cost of error is magnified. The shape of objects comprising the whole of an application becomes amorphous and difficult to understand, particularly as ad hoc modifications are made, reactively, in response to dynamic tension in the system. What is described in a specification as a small subsystem may have been patched into so many other systems, that its actual boundaries are misunderstood. When this happens, it becomes impossible to accurately trace the outline of the composite parts of the whole.

Eventually, an application becomes unpredictable. It is dangerous when one cannot predict all future states of an application, or the side effects of change. Any number of servers, programming languages, hardware architectures, management styles, and so on, have attempted to subdue the intractable problem of risk following growth, of failure menacing success. Oftentimes, systems of even greater complexity are sold as the cure. The hold that any one person has on information is tenuous. Complexity follows scale; confusion follows complexity. As resolution blurs, errors happen.

Node chose clarity and simplicity instead, echoing a philosophy from decades earlier:

"Write programs that do one thing and do it well.
Write programs to work together.
Write programs to handle text streams, because that is a universal interface."
-Peter H. Salus, A Quarter-Century of Unix, 1994

From their experiences creating and maintaining Unix, Ken Thompson and Dennis Ritchie came up with a philosophy for how people should best build software. Using this philosophy as his guide, Ryan Dahl made a number of decisions in the design of Node:

  • Node's design favors simplicity over complexity
  • Node uses familiar POSIX APIs, rather than attempting an improvement
  • Node does everything with events, and doesn't need threads
  • Node leverages the existing C libraries, rather than trying to reimplement their functionality
  • Node favors text over binary formats

Text streams are the language of Unix programs. JavaScript got good at manipulating text from its beginning as a web scripting language. It's a natural fit.
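To make the idea concrete, here is a minimal sketch of a program that does one thing to a stream of text. The makeUpcase helper is a hypothetical name introduced for illustration; Transform is part of Node's built-in stream module:

```javascript
// upcase.js (illustrative file name)
const { Transform } = require('stream');

// Hypothetical helper: a transform stream that upper-cases text passing through
function makeUpcase() {
  return new Transform({
    transform(chunk, encoding, done) {
      done(null, chunk.toString().toUpperCase());
    }
  });
}

// Used as a Unix filter, this would read:
//   process.stdin.pipe(makeUpcase()).pipe(process.stdout);
// Here we feed it one line directly so the sketch runs on its own
const upcase = makeUpcase();
upcase.pipe(process.stdout);
upcase.end('text streams are a universal interface\n');
```

Because the program only reads text and writes text, it could be combined with other tools through an ordinary shell pipe.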

POSIX

POSIX, the Portable Operating System Interface, defines the standard APIs for Unix. It's adopted in Unix-based operating systems and beyond. The IEEE created and maintains the POSIX standard to enable systems from different manufacturers to be compatible. Write your C program using POSIX APIs on your laptop running macOS, and you'll have an easier time later building it on a Raspberry Pi.

As a common denominator, POSIX is old, simple, and most importantly, well-known to developers of all stripes. To make a new directory in a C program, use this API:

int mkdir(const char *path, mode_t mode);

And here it is in Node:

fs.mkdir(path[, mode], callback)

The Node documentation for the filesystem module starts out by telling the developer, there's nothing new here:

File I/O is provided by simple wrappers around standard POSIX functions.
https://nodejs.org/api/fs.html

For Node, Ryan Dahl implemented proven POSIX APIs, rather than trying to come up with something on his own. While such an attempt might be better in some ways, or some situations, it would lose the instant familiarity that POSIX gives to new Node developers trained in other systems.

In choosing POSIX for the API, Node is in no way limited to the standards from the 1970s. It's easy for anyone to write their own module that calls down to Node's API, while presenting a different one upwards. These fancier alternatives can then compete in a Darwinian quest to prove themselves better than POSIX.
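As a small sketch of what such a module might look like, the following wraps fs.mkdir so that it returns a Promise instead of taking a callback. The mkdirAsync name and the directory names are assumptions made for illustration; util.promisify, fs.mkdtempSync, and os.tmpdir are part of Node's core:

```javascript
// mkdir-promise.js (illustrative file name)
const fs = require('fs');
const os = require('os');
const path = require('path');
const { promisify } = require('util');

// Hypothetical wrapper: the same POSIX-style call underneath, a Promise on top
const mkdirAsync = promisify(fs.mkdir);

// Work inside a unique scratch directory so the sketch can be run repeatedly
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'mkdir-demo-'));
const target = path.join(base, 'new-directory');

// Resolves to true once the directory exists
const done = mkdirAsync(target, 0o755)
  .then(() => fs.statSync(target).isDirectory());

done.then(ok => console.log('created directory:', ok));
```

The underlying call is unchanged; only the interface presented to the caller differs.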

Events for everything

If a program asks the operating system to open a file on the disk, that task might complete right away. Or, it might take a moment for the disk to spin up, or for other file system activity the operating system is working on to finish before it can perform this new request. Tasks that go beyond manipulating the memory of our application's process space to more distant hardware in the computer, network, and internet are not fast or reliable enough to program in the same way. Software designers needed a way to code these tasks, which can be slow and unreliable, without making their applications slow and unreliable as a whole. For systems programmers using languages like C and Java, the standard and accepted tool to use to solve this problem is the thread.

pthread_t my_thread;
int x = 0;
/* Make a thread and have it run my_function(&x) */
pthread_create(&my_thread, NULL, my_function, &x);

If a program asks the user a question, the user might respond right away. Or, the user may take a moment to think before clicking Yes or No. For web developers using HTML and JavaScript, the way to do this is the event as follows:

<button onclick="myFunction()">Click me</button>

At first glance, these two scenarios may seem completely distinct:

  • In the first, a low-level system is shuttling blocks of memory from program to program, with delays so short that milliseconds can be too coarse a unit to measure them
  • In the second, the very top surface of a huge stack of software is asking the user a question

Conceptually, however, they're the same. Node's design realizes this, and uses events for both. In Node, there is one thread, bound to an event loop. Deferred tasks are encapsulated, entering and exiting the execution context via callbacks. I/O operations generate evented data streams, and these are piped through a single stack. Concurrency is managed by the system, abstracting thread pools, and simplifying shared access to memory.
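A minimal sketch of that idea, using only Node's timers and the built-in fs module: a timer and a filesystem request are both registered with callbacks, and the single thread carries on with the synchronous code before either callback runs. The order array and its labels are illustrative only:

```javascript
// one-thread.js (illustrative file name)
const fs = require('fs');
const order = [];

// A timer event: Node will call us back a little later
setTimeout(() => order.push('timer fired'), 10);

// An I/O event: Node will call us back when the directory listing is ready
fs.readdir('.', (err, files) => {
  if (err) throw err;
  order.push('directory read');
});

// This line runs first, because neither callback can interrupt it
order.push('synchronous code finished');

// Print what happened, in order, just before the process exits
process.on('exit', () => console.log(order));
```

Whichever of the two callbacks arrives first, the synchronous entry is always at the front of the list.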

Node showed us that JavaScript doesn't need threads to be useful as a systems language. Additionally, by not having threads, JavaScript and Node avoid the concurrency issues that hurt performance and reliability, and that even developers expert in a code base can have difficulty reasoning about. In Chapter 2, Understanding Asynchronous Event-Driven Programming, we'll go deeper into events and the event loop.

 

Standard libraries

Node is built on standard open source C libraries. For example, the TLS and SSL protocols are implemented by OpenSSL. More than just adopting an API, the C source code of OpenSSL is included and compiled into Node. When your JavaScript program hashes a cryptographic key, it's not JavaScript that's actually doing the work. Your JavaScript, run by Node, has called down to the C code of OpenSSL. Essentially, you are scripting the native library.
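Here is a short sketch of what scripting the native library looks like from the JavaScript side, using Node's built-in crypto module. The input string 'abc' is just an illustrative value:

```javascript
// hash.js (illustrative file name)
const crypto = require('crypto');

// These are JavaScript calls, but the SHA-256 computation itself
// is carried out by the native code that Node's crypto module wraps
const digest = crypto
  .createHash('sha256')
  .update('abc')
  .digest('hex');

console.log(digest); // A 64-character hexadecimal string
```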

This design choice of using the existing and proven open source libraries helped Node in a number of ways:

  • It meant that Node could arrive on the scene very rapidly, with the core set of functionality systems programmers needed and expected already there
  • It ensures performance, reliability, and security continue to match the libraries
  • It also didn't break cross-platform use, as all of these C libraries have been written and maintained to compile for different architectures for years

Previous platforms and languages have made a different choice in trying to achieve software portability. The 100% Pure Java™ Standard, for instance, was a Sun Microsystems initiative to promote the development of portable applications. Rather than leveraging the existing code in a hybrid stack, it encouraged developers to rewrite everything in Java. Developers had to keep features, performance, and security up to the standard by writing and testing new code. Node, on the other hand, picked a design that gets this all for free.

 

Extending JavaScript

When he designed Node, JavaScript was not Ryan Dahl's original language choice. Yet, upon exploration, he found a good modern language without opinions on streams, the filesystem, handling binary objects, processes, networking, and other capabilities one would expect to exist in a systems language. JavaScript, strictly limited to the browser, had no use for, and had not implemented, these features.

Following the Unix philosophy, Dahl held to a few rigid principles:

  • A Node program/process runs on a single thread, ordering execution through an event loop
  • Web applications are I/O intensive, so the focus should be on making I/O fast
  • Program flow is always directed through asynchronous callbacks
  • Expensive CPU operations should be split off into separate parallel processes, emitting events as results arrive
  • Complex programs should be assembled from simpler programs

The general principle is, operations must never block. Node's desire for speed (high concurrency) and efficiency (minimal resource usage) demands the reduction of waste. A waiting process is a wasteful process, especially when waiting for I/O.
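One of the principles listed above, splitting expensive CPU work off into a separate process, can be sketched with Node's built-in child_process module. The sumInChild helper and the summing script are hypothetical, chosen only to stand in for a heavier computation:

```javascript
// sum-in-child.js (illustrative file name)
const { execFile } = require('child_process');

// Hypothetical helper: add up 1..n in a separate Node process,
// then hand the result back through a callback
function sumInChild(n, callback) {
  const script =
    `let t = 0; for (let i = 1; i <= ${Number(n)}; i++) t += i; console.log(t);`;
  // process.execPath is the path of the Node binary currently running
  execFile(process.execPath, ['-e', script], (err, stdout) => {
    if (err) return callback(err);
    callback(null, Number(stdout.trim()));
  });
}

sumInChild(1000000, (err, total) => {
  if (err) throw err;
  console.log('child computed', total);
});

// This line prints first: the parent did not wait for the child
console.log('parent is free to do other work');
```

The parent's event loop stays available while the child does the arithmetic, and the result arrives as an event like any other.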

JavaScript's asynchronous, event-driven design fits neatly into this model. Applications express interest in some future event, and are notified when that event occurs. This common JavaScript pattern should be familiar to you:

window.onload = function() {
  // When all requested document resources are loaded,
  // do something with the resulting environment
}
element.onclick = function() {
  // Do something when the user clicks on this element
}

The time it will take for an I/O action to complete is unknown, so the pattern is to ask for notification when an I/O event is emitted, whenever that may be, allowing other operations to be completed in the meantime.

Node adds an enormous amount of new functionality to JavaScript. Primarily, the additions provide evented I/O libraries offering the developer system access not available to browser-based JavaScript, such as writing to the filesystem or opening another system process. Additionally, the environment is designed to be modular, allowing complex programs to be assembled out of smaller and simpler components.

Let's look at how Node imported JavaScript's event model, extended it, and used it in the creation of interfaces to powerful system commands.

Events

Many objects in the Node API emit events. These objects are instances of events.EventEmitter. Any object can extend EventEmitter, providing Node developers with a simple and uniform way to build tight, asynchronous interfaces to object methods.

The following code sets an instance of Node's EventEmitter as the prototype of a function constructor we define. Each constructed instance has the EventEmitter object exposed to its prototype chain, providing a natural reference to the event API. The counter instance methods emit events, and code after that listens for them. After making a Counter, we listen for the incremented event, specifying a callback Node will call when the event happens. Then, we call increment twice. Each time, our Counter increments the internal count it holds, and then emits the incremented event. This calls our callback, giving it the current count, which our callback logs:

// File counter.js
// Load Node's 'events' module, and point directly to EventEmitter there
const EventEmitter = require('events').EventEmitter;
// Define our Counter function
const Counter = function(i) { // Takes a starting number
  this.increment = function() { // The counter's increment method
    i++; // Increment the count we hold
    this.emit('incremented', i); // Emit an event named incremented
  }
}
// Base our Counter on Node's EventEmitter
Counter.prototype = new EventEmitter(); // We did this afterwards, not before!
// Now that we've defined our objects, let's see them in action
// Make a new Counter starting at 10
const counter = new Counter(10);
// Define a callback function which logs the number n you give it
const callback = function(n) {
  console.log(n);
}
// Counter is an EventEmitter, so it comes with addListener
counter.addListener('incremented', callback);
counter.increment(); // 11
counter.increment(); // 12
The following is the output for counter.js:

$ node counter.js
11
12

To remove the event listeners bound to counter, use this:

counter.removeListener('incremented', callback);

For consistency with browser-based JavaScript, counter.on and counter.addListener are interchangeable.
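Since this edition targets ES6, the same counter can also be written with class syntax. This is one possible sketch and not the book's own listing; the seen array is an illustrative addition used to record what the listener receives:

```javascript
// counter-class.js (illustrative file name)
const EventEmitter = require('events');

// Counter inherits the event API by extending EventEmitter
class Counter extends EventEmitter {
  constructor(start) {
    super();
    this.count = start; // Takes a starting number
  }
  increment() {
    this.count++;                         // Increment the count we hold
    this.emit('incremented', this.count); // Emit an event named incremented
  }
}

const counter = new Counter(10);
const seen = [];
const callback = n => seen.push(n);

counter.on('incremented', callback);
counter.increment(); // callback receives 11
counter.increment(); // callback receives 12

counter.removeListener('incremented', callback);
counter.increment(); // No listener is attached any more, so nothing is recorded

console.log(seen); // [ 11, 12 ]
```

Note that emit calls listeners synchronously, which is why seen is already filled in by the time the last line runs.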

Node brought EventEmitter to JavaScript and made it an object your objects can extend. This greatly increases the possibilities available to developers. With EventEmitter, Node can handle I/O data streams in an event-oriented manner, performing long-running tasks while keeping true to Node's principles of asynchronous, non-blocking programming:

// File stream.js
// Use Node's stream module, and get Readable inside
let Readable = require('stream').Readable;
// Make our own readable stream, named r
let r = new Readable;
// Start the count at 0
let count = 0;
// Downstream code will call r's _read function when it wants some data from r
r._read = function() {
  count++;
  if (count > 10) { // After our count has grown beyond 10
    return r.push(null); // Push null downstream to signal we've got no more data
  }
  setTimeout(() => r.push(count + '\n'), 500); // A half second from now, push our count on a line
};
// Have our readable send the data it produces to standard out
r.pipe(process.stdout);

The following is the output for stream.js:

$ node stream.js
1
2
3
4
5
6
7
8
9
10

This example creates a readable stream r, and pipes its output to the standard out. Every 500 milliseconds, code increments a counter and pushes a line of text with the current count downstream. Try running the program yourself, and you'll see the series of numbers appear on your terminal.

On what would be the 11th count, r pushes null downstream, indicating that it has no more data to send. This closes the stream, and with nothing more to do, Node exits the process.

Subsequent chapters will explain streams in more detail. Here, just note how pushing data onto a stream causes an event to fire, how you can assign a custom callback to handle this event, and how the data flows downstream.
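The events involved can be observed directly by listening for them instead of piping. The following is a minimal sketch, assuming a shortened count of three and no delay so that it finishes immediately; the chunks array is an illustrative name:

```javascript
// stream-events.js (illustrative file name)
const { Readable } = require('stream');

let count = 0;
const r = new Readable({
  // Called whenever downstream code wants more data
  read() {
    count++;
    // Push three lines, then null to signal there is no more data
    this.push(count <= 3 ? count + '\n' : null);
  }
});

const chunks = [];
// Each pushed chunk arrives as a 'data' event
r.on('data', chunk => chunks.push(chunk.toString()));
// The 'end' event fires once null has been pushed and all data is consumed
r.on('end', () => console.log('received:', chunks.join('').trim().split('\n')));
```

Attaching a 'data' listener switches the stream into flowing mode, which is what r.pipe does on your behalf in stream.js.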

Node consistently implements I/O operations as asynchronous, evented data streams. This design choice enables Node's excellent performance. Instead of creating a thread (or spinning up an entire process) for a long-running task like a file upload that a stream may represent, Node only needs to commit the resources to handle callbacks. Additionally, in the long stretches of time in between the short moments when the stream is pushing data, Node's event loop is free to process other instructions.

As an exercise, re-implement stream.js to send the data r produces to a file instead of the terminal. You'll need to make a new writable stream w, using Node's fs.createWriteStream:

// File stream2file.js
// Bring in Node's file system module
const fs = require('fs');
// Make the file counter.txt we can fill by writing data to writable stream w
const w = fs.createWriteStream('./counter.txt', { flags: 'w', mode: 0o666 });
...
// Put w beneath r instead
r.pipe(w);

Modularity

In his book, The Art of Unix Programming, Eric Raymond proposed the Rule of Modularity:

"Developers should build a program out of simple parts connected by well-defined interfaces, so problems are local, and parts of the program can be replaced in future versions to support new features. This rule aims to save time on debugging code that is complex, long, and unreadable."

Large systems are hard to reason about, especially when the boundaries of internal components are fuzzy, and the interactions between them are complex. This principle of building large systems out of small, simple, and loosely-coupled pieces is a good idea for software and beyond. Physical manufacturing, management theory, education, and government, all have benefited from this design philosophy.

When developers began employing JavaScript for larger and more complex software challenges, they ran into exactly this problem. There was not yet a good way (and later, no common standard way) to assemble a JavaScript program from smaller ones. For example, you've probably seen HTML pages with tags like these at the top:

<head>
<script src="fileA.js"></script>
<script src="fileB.js"></script>
<script src="fileC.js"></script>
<script src="fileD.js"></script>
...
</head>

This works, but leads to a number of problems:

  • The page must declare all potential dependencies before any are needed or used. If, while running, your program encounters a situation where it needs an additional dependency, dynamically loading another module is possible, but a separate hack.
  • The scripts are not encapsulated. Code in every file writes to the same global object. Adding a new dependency may break an earlier one because of a name collision.
  • fileA cannot address fileB as a collection. An addressable context like fileB.function1 isn't available.

The <script> tag would be a nice place for useful module services such as dependency awareness and version control, but it doesn't have these features.

These difficulties and dangers made creating and using JavaScript modules feel more treacherous than effortless. A good module system with features like encapsulation and versioning can reverse this, encouraging code organization and sharing, and leading to a robust ecosystem of high-quality open source software components.

JavaScript needed a standard way to load and share discrete program modules, and found one in 2009 with the CommonJS Modules specification. Node follows this specification, making it easy to define and share bits of reusable code called modules or packages.

Choosing a delightfully simple design, a package is just a directory of JavaScript files. Metadata about the package, such as its name, version, and software license, lives in an additional file named package.json. The JSON contents of this file are easily both human and machine-readable. Let's take a look:

{
  "name": "mypackage1",
  "version": "0.1.2",
  "dependencies": {
    "jquery": "^3.1.0",
    "bluebird": "^3.4.1"
  },
  "license": "MIT"
}

This package.json defines a package named mypackage1, which depends on two other packages: jQuery and Bluebird. Alongside each package name is a version number. Version numbers follow the Semantic Versioning (SemVer) rules, with a pattern like Major.Minor.Patch. Looking at the incremented version numbers of a package your code has been using, here's what that means:

  • Major: There's a change in the purpose or outcome of the API. If your code calls an updated function, it may break or produce an unintended result. Figure out what's changed, and determine if it affects your code.
  • Minor: The package has added functionality, but remains compatible. Run all your tests, and you're good to go. Check out the documentation if you're curious, as there might be new, more advanced parts of the API alongside the functions and objects you're familiar with.
  • Patch: The package fixed a bug, improved performance, or refactored a little. Run all your tests, and you're good to go.
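The three cases above can be sketched as a small comparison function. The name bumpType is hypothetical, and the sketch assumes plain Major.Minor.Patch strings with no pre-release or build suffixes:

```javascript
// Hypothetical helper: classify the change between two Major.Minor.Patch versions
function bumpType(from, to) {
  const [fromMajor, fromMinor, fromPatch] = from.split('.').map(Number);
  const [toMajor, toMinor, toPatch] = to.split('.').map(Number);
  if (toMajor !== fromMajor) return 'major';
  if (toMinor !== fromMinor) return 'minor';
  if (toPatch !== fromPatch) return 'patch';
  return 'none';
}

console.log(bumpType('3.1.0', '4.0.0')); // major
console.log(bumpType('3.1.0', '3.2.0')); // minor
console.log(bumpType('3.1.0', '3.1.1')); // patch
```

For real projects, a maintained SemVer library is a better choice than hand-rolled parsing; this only illustrates the ordering of the three fields.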

Packages enable the construction of large systems from many small, interdependent systems. Perhaps even more importantly, packages encourage sharing. More detailed information about SemVer is available in Appendix A, Organizing Your Work Into Modules, where npm and packages are discussed in more depth.

"What I'm describing here is not a technical problem. It's a matter of people getting together and making a decision to step forward and start building up something bigger and cooler together."
– Kevin Dangoor, creator of CommonJS
Not just about modules, CommonJS is actually a whole collection of standards founded with the goal of removing everything that was holding JavaScript back from world domination, open source developer Kris Kowal explained in a 2009 post evangelizing the initiative. He names the first of these impediments as the absence of a good module system. The second? The absence of a standard library, including such systems-level fundamentals as access to the filesystem, manipulation of I/O streams, and types for bytes and blocks of binary data. Today, CommonJS is known for giving JavaScript a module system, while Node is what gave JavaScript systems-level access:

https://arstechnica.com/information-technology/2009/12/commonjs-effort-sets-javascript-on-path-for-world-domination/

CommonJS gave JavaScript packages. With packages, the next thing JavaScript needed was a package manager. Node provided one with npm.

A registry of packages, npm is accessible in two ways. First, at the website www.npmjs.com, you can link to and search for packages, essentially shopping for the right one. Stats that count how many times a package has been downloaded in the last day, week, and month show popularity and usage. Most packages link to a developer profile page and open source code on GitHub, so you can see the code, visualize recent development activity, and judge the reputations of the authors and contributors.

The second way to access npm is through the command-line tool npm, which is installed with Node. Using npm as a traditional package manager for your workstation, you can install packages globally, creating new command-line tools on your shell's path. npm also knows how to create, read, and edit package.json files, and can start you out with a new, empty Node package, add the dependencies it needs, download all the code, and keep everything up to date.

Along with Git and GitHub, npm is now achieving a dream of software development identified in the 1970s: that code could be reused more often, and software projects would be written entirely from scratch less frequently.
Earlier attempts at reaching this goal through version control systems like CVS and Subversion, and open source code sharing websites like SourceForge.net, focused on bigger units of both code and people, and didn't achieve as much.

GitHub and npm took a different approach in two important ways:
  • Favoring individual developers working alone over communities meeting and discussing, developers could focus more on code and less on conversation
  • Favoring small, atomic software components over complete applications, encapsulated composition started happening not just at a micro-level of subroutines and objects, but at the more important macroscale of application design
Even documentation is better with the new approach: in a monolithic software application, documentation was too often the afterthought that may or may not have happened after the product shipped.
With components, great documentation is necessary to sell your package to the world, earning it a larger public daily download count, and earning you more followers on the social media accounts you keep as a developer.

In no small part, Node's success is due to the number and quality of packages available to you as a Node developer.

More extensive information on creating and managing Node packages can be found in Appendix A, Organizing Your Work into Modules.

The key design philosophy to follow is this: build programs out of packages where possible, and share those packages when possible. The shape of your applications will be clearer and easier to maintain. Importantly, the efforts of thousands of other developers can be linked into applications via npm, directly by inclusion, and indirectly as shared packages are tested, improved, refactored, and repurposed by members of the Node community.

Contrary to popular belief, npm is not an abbreviation for Node Package Manager, and should never be used or explained as an acronym:
https://docs.npmjs.com/policies/trademark

The network

I/O in the browser is mercilessly hobbled, for very good reasons - if the JavaScript on any given website could access your filesystem, for instance, users could only click links to new sites they trusted, rather than ones they simply wanted to try out. Keeping pages in a limited sandbox, the design of the web made navigating from thing1.com to thing2.com not have the consequences of double-clicking thing1.exe and thing2.exe.

Node, of course, recasts JavaScript in the role of a systems language, giving it direct and unfettered access to operating system kernel objects such as files, sockets, and processes. This lets Node create scalable systems with high I/O requirements. It's likely the first thing you coded in Node was an HTTP server.

Node supports standard network protocols in addition to HTTP, such as TLS/SSL, and UDP. With these tools we can easily build scalable network programs, moving far beyond the comparatively limited AJAX solutions JavaScript developers know from the browser.

Let's write a simple program that sends a UDP packet to another node:

const dgram = require('dgram');
let client = dgram.createSocket("udp4");
let server = dgram.createSocket("udp4");
let message = process.argv[2] || "message";
message = Buffer.from(message);
server
  .on('message', msg => {
    process.stdout.write(`Got message: ${msg}\n`);
    process.exit();
  })
  .bind(41234);
client.send(message, 0, message.length, 41234, "localhost");

Go ahead and open two terminal windows and navigate each to your code bundle for Chapter 8, Scaling Your Application, under the /udp folder. We're now going to run a UDP server in one window, and a UDP client in another.

In the right window, run receive.js with a command like the following:

$ node receive.js

On the left, run send.js with a command, as follows:

$ node send.js

Executing that command will cause the message to appear on the right:

$ node receive.js
Message received!

A UDP server is an instance of EventEmitter, emitting a message event when messages are received on the bound port. With Node, you can use JavaScript to write your application at the I/O level, moving packets and streams of binary data with ease.

Let's continue to explore I/O, the process object, and events. First, let's dig into the machine powering Node's core, V8.

 

V8, JavaScript, and optimizations

V8 is Google's JavaScript engine, written in C++. It compiles and executes JavaScript code inside of a VM (Virtual Machine). When a webpage loaded into Google Chrome demonstrates some sort of dynamic effect, like automatically updating a list or news feed, you are seeing JavaScript, compiled by V8, at work.

V8 manages Node's main process thread. When executing JavaScript, V8 does so in its own process, and its internal behavior is not controlled by Node. In this section, we will investigate the performance benefits that can be had by adjusting the options V8 exposes, learn how to write optimizable JavaScript, and look at the cutting-edge JavaScript features available to users of the latest Node versions (such as 9.x, the version we use in this book).

Flags

There are a number of settings available to you for manipulating the Node runtime. Try this command:

$ node -h

In addition to standards such as --version, you can also flag Node to --abort-on-uncaught-exception.

You can also list the options available for V8:

$ node --v8-options

Some of these settings can save the day. For example, if you are running Node in a constrained environment like a Raspberry Pi, you might want to limit the amount of memory a Node process can consume, to avoid memory spikes. In that case, you might want to set the --max_old_space_size (by default ~1.5GB) to a few hundred MB.
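One way to check the effect of such a flag is to ask V8 for its heap statistics from inside the program. This is a minimal sketch; the file name heap-limit.js used in the note below is an assumption:

```javascript
const v8 = require('v8');

// heap_size_limit is reported in bytes; convert to megabytes for readability
const limitMb = v8.getHeapStatistics().heap_size_limit / (1024 * 1024);
console.log(`Heap size limit: ${Math.round(limitMb)} MB`);
```

Running this once with plain node heap-limit.js and once with node --max_old_space_size=256 heap-limit.js should show different limits. The exact figures depend on your Node version and platform.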

You can use the -e argument to execute a Node program as a string; in this case, logging the version of V8 your copy of Node contains:

$ node -e "console.log(process.versions.v8)"

It's worth your time to experiment with Node/V8 settings, both for their utility and to give you a slightly stronger understanding of what is happening (or might happen) under the hood.

Optimizing your code

The simple optimizations of smart code design can really help you. Traditionally, JavaScript developers working in browsers did not need to concern themselves with memory usage optimizations, having quite a lot to use for what were typically uncomplicated programs. On a server, this is no longer the case. Programs are generally more complicated, and running out of memory takes down your server.

The convenience of a dynamic language is in avoiding the strictness that compiled languages impose. For example, you need not explicitly define object property types, and can actually change those property types at will. This dynamism makes traditional compilation impossible, but opens up some interesting new opportunities for exploratory languages such as JavaScript. Nevertheless, dynamism introduces a significant penalty in terms of execution speeds when compared to statically compiled languages. The limited speed of JavaScript has regularly been identified as one of its major weaknesses.

V8 attempts to give JavaScript the sorts of speeds one observes for compiled languages. V8 compiles JavaScript into native machine code, rather than interpreting bytecode, or using other just-in-time techniques. Because the precise runtime topology of a JavaScript program cannot be known ahead of time (the language is dynamic), compilation consists of a two-stage, speculative approach:

  1. Initially, a first-pass compiler (the full compiler) converts your code into a runnable state as quickly as possible. During this step, type analysis and other detailed analysis of the code is deferred, prioritizing fast compilation – your JavaScript can begin executing as close to instantly as possible. Further optimizations are accomplished during the second step.
  2. Once the program is up and running, an optimizing compiler then begins its job of watching how your program runs, and attempting to determine its current and future runtime characteristics, optimizing and re-optimizing as necessary. For example, if a certain function is being called many thousands of times with similar arguments of a consistent type, V8 will re-compile that function with code optimized on the optimistic assumption that future types will be like the past types. While the first compile step was conservative with as-yet unknown and un-typed functional signature, this hot function's predictable texture impels V8 to assume a certain optimal profile and re-compile based on that assumption.

Assumptions help us make decisions more quickly, but can lead to mistakes. What if the hot function V8's compiler just optimized against a certain type signature is now called with arguments violating that optimized profile? V8 has no choice, in that case: it must de-optimize the function. V8 must admit its mistake and roll back the work it has done. It will re-optimize in the future if a new pattern is seen. However, if V8 must again de-optimize at a later time, and if this optimize/de-optimize binary switching continues, V8 will simply give up, and leave your code in a de-optimized state.

Let's look at some ways to approach the design and declaration of arrays, objects, and functions, so that you are helping, rather than hindering the compiler.

Numbers and tracing optimization/de-optimization

The ECMA-262 specification defines the Number value as a "primitive value corresponding to a double-precision 64-bit binary format IEEE 754 value". The point is that there is no Integer type in JavaScript; there is a Number type defined as a double-precision floating-point number.

V8 uses 32-bit numbers for all values internally, for performance reasons that are too technical to discuss here. It can be said that one bit is used to point to another 32-bit number, should greater width be needed. Regardless, it is clear that there are two types of values tagged as numbers by V8, and switching between these types will cost you something. Try to restrict your needs to 31-bit signed Integers where possible.
Because of the type ambiguity of JavaScript, switching the types of numbers assigned to a slot is allowed. For example, the following code does not throw an error:

let a = 7;
a = 7.77;

However, a speculative compiler like V8 will be unable to optimize this variable assignment, given that its guess that a will always be an Integer turned out to be wrong, forcing de-optimization.
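If you want to check whether a value stays inside the small-integer range recommended above, a helper along these lines can be used. The function name and constants are hypothetical, and the exact internal representation V8 uses varies by platform and version, so treat the range as the rule of thumb from the text rather than a guarantee:

```javascript
// Hypothetical helper based on the 31-bit signed integer range mentioned above
const MIN_SMALL_INT = -(2 ** 30);
const MAX_SMALL_INT = 2 ** 30 - 1;

function fitsIn31Bits(n) {
  return Number.isInteger(n) && n >= MIN_SMALL_INT && n <= MAX_SMALL_INT;
}

console.log(fitsIn31Bits(7));       // true
console.log(fitsIn31Bits(7.77));    // false
console.log(fitsIn31Bits(2 ** 30)); // false
```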

We can demonstrate the optimization/de-optimization process by setting some powerful V8 options, executing V8 native commands in your Node program, and tracing how V8 optimizes/de-optimizes your code.

Consider the following Node program:

// program.js
let someFunc = function foo(){};
console.log(%FunctionGetName(someFunc));

If you try to run this normally, you will receive an Unexpected Token error – the modulo (%) symbol cannot be used within an identifier name in JavaScript. What is this strange method with a % prefix? It is a V8 native command, and we can turn on execution of these types of functions by using the --allow-natives-syntax flag:

node --allow-natives-syntax program.js
// 'foo', the function name, is printed to the console.

Now, consider the following code, which uses native functions to assert information about the optimization status of the square function, using the %OptimizeFunctionOnNextCall native method:

let operand = 3;
function square() {
  return operand * operand;
}
// Make first pass to gather type information
square();
// Ask that the next call of #square trigger an optimization attempt,
// then call it
%OptimizeFunctionOnNextCall(square);
square();

Create a file using the previous code, and execute it using the following command: node --allow-natives-syntax --trace_opt --trace_deopt myfile.js. You will see something like the following returned:

 [deoptimize context: c39daf14679]
[optimizing: square / c39dafca921 - took 1.900, 0.851, 0.000 ms]

We can see that V8 has no problem optimizing the square function, as operand is declared once and never changed. Now, append the following lines to your file and run it again:

%OptimizeFunctionOnNextCall(square);
operand = 3.01;
square();

On this execution, following the optimization report given earlier, you should now receive something like the following:

**** DEOPT: square at bailout #2, address 0x0, frame size 8
[deoptimizing: begin 0x2493d0fca8d9 square @2]
...
[deoptimizing: end 0x2493d0fca8d9 square => node=3, pc=0x29edb8164b46, state=NO_REGISTERS, alignment=no padding, took 0.033 ms]
[removing optimized code for: square]

This very expressive optimization report tells the story very clearly: the once-optimized square function was de-optimized following the change we made in one number's type. You are encouraged to spend some time writing code and testing it using these methods.

Objects and arrays

As we learned when investigating numbers, V8 works best when your code is predictable. The same holds true with arrays and objects. Nearly all of the following bad practices are bad for the simple reason that they create unpredictability.

Remember that in JavaScript, an object and an array are very similar under the hood (resulting in strange rules that provide no end of material for those poking fun at the language!). We won't be discussing those differences, only the important similarities, specifically in terms of how both these data constructs benefit from similar optimization techniques.

Avoid mixing types in arrays. It is always better to have a consistent data type, such as all integers or all strings. As well, avoid changing types in arrays, or in property assignments after initialization if possible. V8 creates blueprints of objects by creating hidden classes to track types, and when those types change, the optimization blueprints will be destroyed and rebuilt, if you're lucky. Visit https://github.com/v8/v8/wiki/Design%20Elements for more information.

Don't create arrays with gaps, such as the following:

let a = [];
a[2] = 'foo';
a[23] = 'bar';

Sparse arrays are bad for this reason: V8 can either use a very efficient linear storage strategy to store (and access) your array data, or it can use a hash table (which is much slower). If your array is sparse, V8 must choose the less efficient of the two. For the same reason, always start your arrays at the zero index. As well, do not ever use delete to remove elements from an array. You are simply leaving an empty slot (which reads as undefined) at that position, which is just another way of creating a sparse array. Similarly, be careful about populating an array with empty values; ensure that the external data you are pushing into an array is not incomplete.
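The difference between delete and a proper removal is easy to see in a short sketch. The array contents here are arbitrary:

```javascript
const withHole = ['a', 'b', 'c'];
delete withHole[1];           // leaves a gap at index 1
console.log(withHole.length); // 3
console.log(1 in withHole);   // false

const dense = ['a', 'b', 'c'];
dense.splice(1, 1);           // removes the element and closes the gap
console.log(dense);           // [ 'a', 'c' ]
```

Using splice (or building a new array with filter) keeps the array contiguous, which is the property the advice above is trying to preserve.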

Try not to preallocate large arrays; grow as you go. Similarly, do not preallocate an array and then exceed that size. You always want to avoid spooking V8 into turning your array into a hash table. V8 creates a new hidden class whenever a new property is added to an object constructor. Try to avoid adding properties after an object is instantiated. Initialize all members in constructor functions in the same order. Same properties + same order = same object.

Remember that JavaScript is a dynamic language that allows object (and object prototype) modifications after instantiation. Since the shape and volume of an object can, therefore, be altered after the fact, how does V8 allocate memory for objects? It makes some reasonable assumptions. After a set number of objects are instantiated from a given constructor (I believe 8 is the trigger amount), the largest of these is assumed to be of the maximum size, and all further instances are allocated that amount of memory (and the initial objects are similarly resized). A total of 32 fast property slots, inclusive, are then allocated to each instance based on this assumed maximum size. Any extra properties are slotted into a (slower) overflow property array, which can be resized to accommodate any further new properties.

With objects, as with arrays, try to define as much as possible the shape of your data structures in a futureproof manner, with a set number of properties, of types, and so on.
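A minimal sketch of this advice follows, assuming a hypothetical Point class. Every instance receives the same properties in the same order, including one that is only filled in later:

```javascript
// Every Point gets the same properties, in the same order, in the constructor
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    this.label = null; // declared up front even if it is filled in later
  }
}

const a = new Point(1, 2);
const b = new Point(3, 4);
b.label = 'offset'; // assigning an existing property adds nothing new

console.log(Object.keys(a).join() === Object.keys(b).join()); // true
```

Whether V8 actually shares a hidden class between the two instances is an engine detail this code cannot observe; the sketch only shows the coding pattern.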

Functions

Functions are typically called often, and should be one of your prime optimization focuses. Functions containing try-catch constructs are not optimizable, nor are functions containing other unpredictable constructs, like with or eval. If, for some reason, your function is not optimizable, keep its use to a minimum.

A very common optimization error involves the use of polymorphic functions. Functions that accept variable function arguments will be de-optimized. Avoid polymorphic functions.
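One way to follow this advice is to give each kind of input its own function, so that every function always sees the same argument types. The function names below are hypothetical, and whether V8 optimizes either one depends on the engine version:

```javascript
// Each function always sees the same argument types
function addNumbers(a, b) {
  return a + b;
}

function joinStrings(a, b) {
  return a + b;
}

let total = 0;
for (let i = 0; i < 1000; i++) {
  total = addNumbers(total, i); // always number + number
}

const label = joinStrings('total: ', String(total)); // always string + string
console.log(label); // total: 499500
```

A single add function called with both numbers and strings would produce the same results, but it would present the compiler with the mixed type profile the text warns about.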

An excellent explanation of how V8 performs speculative optimization can be found here: https://ponyfoo.com/articles/an-introduction-to-speculative-optimization-in-v8

Optimized JavaScript

The JavaScript language is in constant flux, and some major changes and improvements have begun to find their way into native compilers. The V8 engine used in the latest Node builds supports nearly all of the latest features. Surveying all of these is beyond the scope of this chapter. In this section, we'll mention a few of the most useful updates and how they might be used to simplify your code, helping to make it easier to understand and reason about, to maintain, and perhaps even become more performant.

We will be using the latest JavaScript features throughout this book. You can use Promises, Generators, and async/await constructs as of Node 8.x. These concurrency operators will be discussed in depth in Chapter 2, Understanding Asynchronous Event-Driven Programming, but a good takeaway for now is that the callback pattern is losing its dominance, and the Promise pattern in particular is coming to dominate module interfaces.

In fact, a new method util.promisify was recently added to Node's core, which converts a callback-based function to a Promise-based one:

const {promisify} = require('util');
const fs = require('fs');

// Promisification happens here
let readFileAsync = promisify(fs.readFile);

let [executable, absPath, target, ...message] = process.argv;

console.log(message.length ? message.join(' ') : `Running file ${absPath} using binary ${executable}`);

readFileAsync(target, {encoding: 'utf8'})
  .then(console.log)
  .catch(err => {
    let message = err.message;
    console.log(`
An error occurred!
Read error: ${message}
`);
  });

Being able to easily promisify fs.readFile is very useful.
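The same conversion works for any function that follows Node's (err, result) callback convention. Here is a small sketch combining it with async/await; delayedDouble is a hypothetical function invented for this example:

```javascript
const { promisify } = require('util');

// A hypothetical callback-style function using Node's (err, result) convention
function delayedDouble(n, callback) {
  setImmediate(() => {
    if (typeof n !== 'number') {
      return callback(new TypeError('n must be a number'));
    }
    callback(null, n * 2);
  });
}

const delayedDoubleAsync = promisify(delayedDouble);

async function main() {
  const result = await delayedDoubleAsync(21);
  console.log(result); // 42
  return result;
}

const done = main();
```

An error passed as the first callback argument becomes a rejection, so it can be handled with catch or with try/catch around the await.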

Did you notice any other new JavaScript constructs possibly unfamiliar to you?

Help with variables

You'll be seeing let and const throughout this book. These are new variable declaration types. Unlike var, let is block scoped; it does not apply outside of its containing block:

let foo = 'bar';

if (foo == 'bar') {
  let foo = 'baz';
  console.log(foo); // 1st
}
console.log(foo); // 2nd

// baz
// bar
// If we had used var instead of let:
// baz
// baz

For variables that will never change, use const, for constant. This is helpful for the compiler as well, as it can optimize more easily if a variable is guaranteed never to change. Note that const only guards against reassignment, so the following is illegal:

const foo = 1;
foo = 2; // Error: assignment to a constant variable

However, if the value is an object, const doesn't protect members:

const foo = { bar: 1 }
console.log(foo.bar) // 1
foo.bar = 2;
console.log(foo.bar) // 2

Another powerful new feature is destructuring, which allows us to easily assign the values of arrays to new variables:

let [executable, absPath, target, ...message] = process.argv;

Destructuring allows you to rapidly map arrays to variable names. Since process.argv is an array, which always contains the path to the Node executable and the path to the executing file as the first two arguments, we can pass a file target to the previous script by executing node script.js /some/file/path, where the third argument is assigned to the target variable.

Maybe we also want to pass a message with something like this:

node script.js /some/file/path This is a really great file!

The problem here is that This is a really great file! is space-separated, so it will be split into the array on each word, which is not what we want:

[... , /some/file/path, This, is, a, really, great, file!]

The rest pattern comes to the rescue here: the final argument ...message collapses all remaining destructured arguments into a single array, which we can simply join(' ') into a single string. This also works for objects:

let obj = {
foo: 'foo!',
bar: 'bar!',
baz: 'baz!'
};

// assign keys to local variables with same names
let {foo, baz} = obj;

// Note that we "skipped" #bar
console.log(foo, baz); // foo! baz!

This pattern is especially useful for processing function arguments. Prior to rest parameters, you might have been grabbing function arguments in this way:

function f(a, b) {
  // Grab any arguments after a & b and convert to proper Array
  let args = Array.prototype.slice.call(arguments, f.length);
}

This was necessary previously, as the arguments object was not a true Array. In addition to being rather clumsy, this method also triggers de-optimization in compilers like V8.

Now, you can do this instead:

function f(a, b, ...args) {
  // #args is already an Array!
}
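A runnable version of the idea might look like the following sketch, where splitArgs is a hypothetical function that simply reports what it received:

```javascript
// Hypothetical function: the first two arguments are named, the rest are collected
function splitArgs(a, b, ...args) {
  return { first: a, second: b, restIsArray: Array.isArray(args), rest: args };
}

const result = splitArgs(1, 2, 3, 4, 5);
console.log(result.restIsArray); // true
console.log(result.rest);        // [ 3, 4, 5 ]
```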

The spread pattern is the rest pattern in reverse—you expand a single variable into many:

const week = ['mon','tue','wed','thur','fri'];
const weekend = ['sat','sun'];

console.log([...week, ...weekend]); // ['mon','tue','wed','thur','fri','sat','sun']

week.push(...weekend);
console.log(week); // ['mon','tue','wed','thur','fri','sat','sun']

Arrow functions

Arrow functions allow you to shorten function declarations, from function() {} to simply () => {}. Indeed, you can replace a line like this:

SomeEmitter.on('message', function(message) { console.log(message) });

With this:

SomeEmitter.on('message', message => console.log(message));

Here, we lose both the brackets and curly braces, and the tighter code works as expected.

Another important feature of arrow functions is that they are not assigned their own this; instead, they inherit this from the scope in which they are defined. Regular functions do not, so the following code does not work:

function Counter() {
  this.count = 0;

  setInterval(function() {
    console.log(this.count++);
  }, 1000);
}

new Counter();

The function within setInterval is being called in the context of setInterval, rather than the Counter object, so this does not have any reference to count. That is, at the function call site, this is a Timeout object, which you can check yourself by adding console.log(this) to the prior code.

With arrow functions, this is assigned at the point of definition. Fixing the code is easy:

setInterval(() => { // arrow function to the rescue!
  console.log(this);
  console.log(this.count++);
}, 1000);
// Counter { count: 0 }
// 0
// Counter { count: 1 }
// 1
// ...
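The same behavior can be seen without timers. This sketch uses forEach so that it runs synchronously; the tally object and its addAll method are assumptions made for illustration:

```javascript
const tally = {
  count: 0,
  addAll(numbers) {
    // The arrow function has no `this` of its own; it uses addAll's `this`
    numbers.forEach(n => { this.count += n; });
    return this.count;
  }
};

console.log(tally.addAll([1, 2, 3])); // 6
```

Replacing the arrow function with function(n) { this.count += n; } would leave tally.count untouched, because this inside that callback would no longer refer to tally.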

String manipulation

Finally, you will see a lot of backticks in the code. This is the new template literal syntax, and along with other things, it (finally!) makes working with strings in JavaScript much less error-prone and tedious. You saw in the example how it is now easy to express multiline strings (avoiding 'First line\n' + 'Next line\n' types of constructs). String interpolation is similarly improved:

let name = 'Sandro';
console.log('My name is ' + name);
console.log(`My name is ${name}`);
// My name is Sandro
// My name is Sandro

This sort of substitution is especially effective when concatenating many variables, and the contents of each ${expression} can be any JavaScript expression:

console.log(`2 + 2 = ${2+2}`)  // 2 + 2 = 4

You can also use repeat to generate strings: 'ha'.repeat(3) // hahaha.

Strings are now iterable. Using the new for...of construct, you can pluck apart a string character by character:

for (let c of 'Mastering Node.js') {
  console.log(c);
  // M
  // a
  // s
  // ...
}

Alternatively, use the spread operator:

console.log([...'Mastering Node.js']);
// ['M', 'a', 's',...]

Searching is also easier. New methods allow common substring seeks without much ceremony:

let targ = 'The rain in Spain lies mostly on the plain';
console.log(targ.startsWith('The', 0)); // true
console.log(targ.startsWith('The', 1)); // false
console.log(targ.endsWith('plain')); // true
console.log(targ.includes('rain', 5)); // false

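For startsWith and includes, the second argument indicates a search offset, defaulting to 0. For endsWith, it instead indicates how much of the string to consider, defaulting to the string's full length. The substring 'The' is found at position 0, so beginning the search at position 1 fails in the second case.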
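
The optional second argument behaves slightly differently across these methods, which a short sketch can make concrete. The variable names here are illustrative only:

```javascript
const targ = 'The rain in Spain lies mostly on the plain';

// startsWith and includes: the second argument is the index to search from
const startsAt4 = targ.startsWith('rain', 4);   // true: 'rain' begins at index 4
const spainFrom13 = targ.includes('Spain', 13); // false: 'Spain' begins at index 12

// endsWith: the second argument is how much of the string to consider
const endsWithin8 = targ.endsWith('rain', 8);   // true: the first 8 characters are 'The rain'

console.log(startsAt4, spainFrom13, endsWithin8); // true false true
```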

Great, writing JavaScript programs just got a little easier. The next question is what's going on when that program is executed within a V8 process?

 

The process object

Node's process object provides information on and control over the current running process. It is an instance of EventEmitter, is accessible from any scope, and exposes very useful low-level pointers. Consider the following program:

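// Command-line arguments arrive as strings, so convert them to numbers;
// Buffer.alloc throws a TypeError if given a string size
const size = parseInt(process.argv[2], 10);
const n = parseInt(process.argv[3], 10) || 100;
const buffers = [];
for (let i = 0; i < n; i++) {
  buffers.push(Buffer.alloc(size));
  process.stdout.write(process.memoryUsage().heapTotal + "\n");
}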

Have Node run process.js with a command like this:

$ node process.js 1000000 100

The program gets the command-line arguments from process.argv, loops to allocate memory, and reports memory usage back to standard out. Instead of logging back to the terminal, you could stream output to another process, or a file:

$ node process.js 1000000 100 > output.txt
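
Because process is an EventEmitter available in every scope, a program can also subscribe to its lifecycle events. The following is a minimal sketch under that assumption; the variable names are illustrative, and 'exit' is just one of the events the process object emits:

```javascript
const EventEmitter = require('events');

// process is available in every module without a require call
const isEmitter = process instanceof EventEmitter;
console.log(isEmitter); // true

// Because it is an emitter, we can subscribe to lifecycle events
process.on('exit', (code) => {
  console.log(`Exiting with code ${code}`);
});

// process.argv holds [node executable, script path, ...user arguments]
const userArgs = process.argv.slice(2);
console.log(userArgs);
```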

A Node process begins by constructing a single execution stack, with the global context forming the base of the stack. Functions on this stack execute within their own local context (sometimes referred to as scope), which remains enclosed within the global context. This way of keeping the execution of a function together with the environment the function runs in is called closure. Because Node is evented, any given execution context can commit the running thread to handling an eventual execution context. This is the purpose of callback functions.
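
A closure can be shown in a few lines. In this sketch (makeCounter is a hypothetical name used only for illustration), the returned function keeps access to a variable from the context in which it was defined, even after that context has finished executing:

```javascript
function makeCounter() {
  let count = 0;        // lives in makeCounter's local context
  return () => ++count; // the returned function closes over count
}

const next = makeCounter();
console.log(next()); // 1
console.log(next()); // 2
```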

Consider the following schematic of a simple interface for accessing the filesystem:

If we were to instantiate Filesystem and call readDir, a nested execution context structure would be created:

(global (fileSystem (readDir (anonymous function) ) ) )
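
One way such an interface might look is sketched below. This is an assumption for illustration, not the book's own listing: the Filesystem class and its readDir method are hypothetical names matching the nesting above, wrapped around Node's fs.readdir.

```javascript
const fs = require('fs');

// Hypothetical interface: a thin wrapper around fs.readdir
class Filesystem {
  readDir(path, done) {
    // The anonymous function below is the innermost execution context
    fs.readdir(path, (err, files) => {
      done(err, files);
    });
  }
}

const fileSystem = new Filesystem();
fileSystem.readDir('.', (err, files) => {
  if (err) throw err;
  console.log(files); // names of the entries in the current directory
});
```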

Inside Node, a C library named libuv creates and manages the event loop. It connects to low-level operating system kernel mode objects that can produce events, such as timers that go off, sockets that receive data, files that open for reading, and child processes that complete. It loops while there are still events to process, and calls callbacks associated with events. It does this at a very low level, and with a very performant architecture. Written for Node, libuv is now a building block of a number of software platforms and languages.

The concomitant execution stack is introduced to Node's single-process thread. This stack remains in memory until libuv reports that fs.readdir has completed, at which point the registered anonymous callback fires, resolving the sole pending execution context. As no further events are pending, and the maintenance of closures is no longer necessary, the entire structure can be safely torn down (in reverse, beginning with anonymous), and the process can exit, freeing any allocated memory. This method of building up and tearing down a single stack is what Node's event loop is ultimately doing.
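
The ordering this implies can be observed directly. In the following sketch (the order array is an illustrative name), a timer callback stands in for any event handled by the loop; it cannot run until the current stack has unwound:

```javascript
const order = [];

order.push('start');

// The callback is handed to the event loop; it cannot run until the
// current stack has fully unwound, even with a delay of 0
setTimeout(() => {
  order.push('timer callback');
  console.log(order.join(' -> '));
  // start -> end -> timer callback
}, 0);

order.push('end');
```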

 

The REPL

Node's REPL (Read-Eval-Print-Loop) represents the Node shell. To enter the shell prompt, enter Node via your terminal without passing a filename:

$ node

You now have access to a running Node process, and may pass JavaScript commands to this process. Additionally, if you enter an expression, the REPL will echo back the value of the expression. As a simple example of this, you can use the REPL as a pocket calculator:

$ node
> 2+2
4

Enter the 2+2 expression, and Node will echo back the value of the expression, 4. Going beyond simple number literals, you can use this behavior to query, set, and again, query the values of variables:

> a
ReferenceError: a is not defined
at repl:1:1
at sigintHandlersWrap (vm.js:22:35)
at sigintHandlersWrap (vm.js:96:12)
at ContextifyScript.Script.runInThisContext (vm.js:21:12)
at REPLServer.defaultEval (repl.js:346:29)
at bound (domain.js:280:14)
at REPLServer.runBound [as eval] (domain.js:293:12)
at REPLServer.<anonymous> (repl.js:545:10)
at emitOne (events.js:101:20)
at REPLServer.emit (events.js:188:7)
> a = 7
7
> a
7

Node's REPL is an excellent place to try out, debug, test, or otherwise play with JavaScript code.

As the REPL is a native object, programs can also use instances as a context in which to run JavaScript interactively. For example, here we create our own custom function sayHello, add it to the context of a REPL instance, and start the REPL, emulating a Node shell prompt:

require('repl').start("> ").context.sayHello = function() {
return "Hello";
};

Enter sayHello() at the prompt, and the REPL will echo the function's return value, 'Hello', to standard out.
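
A REPL instance does not have to be attached to the terminal. As a rough sketch, assuming in-memory PassThrough streams in place of standard in and standard out (the input, output, and transcript names are illustrative), the same sayHello function can be driven programmatically:

```javascript
const repl = require('repl');
const { PassThrough } = require('stream');

// In-memory streams stand in for the terminal
const input = new PassThrough();
const output = new PassThrough();

// Collect everything the REPL writes
let transcript = '';
output.on('data', (chunk) => {
  transcript += chunk;
});

const server = repl.start({ prompt: '> ', input, output });
server.context.sayHello = () => 'Hello';

// Print what the REPL wrote once it shuts down
server.on('exit', () => {
  console.log(transcript);
});

// Feed the REPL a command, then ask it to exit
input.write('sayHello()\n');
input.end('.exit\n');
```

Any readable and writable stream pair can play the role of input and output, which is what makes it possible to attach a REPL to a network socket.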

Let's take everything we've learned in this chapter and create an interactive REPL that allows us to execute JavaScript on a remote server:

  1. Create two files, client.js and server.js, and type in the following code.
  2. Run each in its own terminal window, keeping both windows side by side on your screen:
// File client.js
let net = require("net");
let sock = net.connect(8080);
process.stdin.pipe(sock);
sock.pipe(process.stdout);

// File server.js
let repl = require("repl")
let net = require("net")
net.createServer((socket) => {
repl
.start({
prompt: "> ",
input: socket,
output: socket,
terminal: true
}).on('exit', () => {
socket.end();
})
}).listen(8080);

The client.js program creates a new socket connection to port 8080 through net.connect, and pipes any data coming from standard in (your terminal) through to that socket. Similarly, any data arriving from the socket is piped to standard out (back to your terminal). With this code, we've created a way to take terminal input and send it via a socket to port 8080, listening for any data that the socket may send back to us.

The other program, server.js, closes the loop. This program uses net.createServer and .listen to create and start a new TCP server. The callback passed to net.createServer receives a reference to the bound socket. Within the enclosure of that callback, we instantiate a new REPL instance, giving it a nice prompt (> here, but it could be any string). We indicate that it should both listen for input from, and broadcast output to, the passed socket reference, and that the socket data should be treated as terminal data (which has special encoding).

We can now type something like console.log("hello") into the client terminal, and see hello displayed.

To confirm that the execution of our JavaScript commands is occurring in the server instance, type console.log(process.argv) into the client, and the server will respond with an array containing the path to the Node executable and the path to the running script, which will be server.js.

With just a few lines of code, we've created a way to remotely control Node processes. It's the first step towards multi-node analytics tools, remote memory management, automatic server administration, and more.

 

 Summary

Experienced developers have all struggled with the problems that Node aims to solve:

  • How to serve many thousands of simultaneous clients efficiently
  • Scaling networked applications beyond a single server
  • Preventing I/O operations from becoming bottlenecks
  • Eliminating single points of failure, thereby ensuring reliability
  • Achieving parallelism safely and predictably

As each year passes, we see collaborative applications and software responsible for managing levels of concurrency that would have been considered rare just a few years ago. Managing concurrency, both in terms of connection handling and application design, is the key to building scalable architectures.

In this chapter, we've outlined the key problems Node's designers sought to solve, and how their solution has made the creation of easily scalable, high-concurrency networked systems easier for an open community of developers. We've seen how JavaScript has been given very useful new powers, how its evented model has been extended, and how V8 can be configured to further customize the JavaScript runtime. Through examples, we've learned how I/O is handled by Node, how to program the REPL, as well as how to manage inputs and outputs to the process object.

Node turns JavaScript into a systems language, creating a useful anachronism of scripting sockets as well as buttons, and cutting across decades of learning from the evolution of computing.

Node's design restores the virtues of simplicity the original Unix developers discovered in the 1970s. Interestingly, computer science rebelled against that philosophy in the intervening time period. C++ and Java favored object-oriented design patterns, serialized binary data formats, subclassing rather than rewriting, and other policies that caused codebases to often grow to one million lines or more before finally collapsing under the weight of their own complexity.

But then came the web. The browser's View, Source feature is a gentle on-ramp that brought millions of web users into the ranks of a new generation of software developers. Brendan Eich designed JavaScript with this novice prospective developer in mind. It's easy to start by editing tags and changing styles, and soon be writing code. Talk to the young employees of newly growing start-ups, now professional developers, engineers, and computer scientists, and many will recount View, Source as how they got their start.

Riding Node's time warp back, JavaScript found a similar design and philosophy in the founding principles of Unix. Perhaps connecting computers to the internet gave smart people new, more interesting computing problems to solve. Perhaps another new generation of students and junior employees arrived and rebelled against their mentors once again. For whatever reason, small, modular, and simple make up the prevailing philosophy today, as they did decades earlier.

In the decades ahead, how many more times will computing technology change enough to prompt the designers of the day to write new software and languages quite different from the practices taught and accepted as correct, finished, and permanent just a few years earlier? As Arthur C. Clarke noted, trying to predict the future is a discouraging and hazardous occupation. Perhaps we'll see several more revolutions in computers and code. Alternatively, it's possible that computing technology will soon plateau for a stretch of years, and within that stability, computer scientists will find and settle on the best paradigms to teach and use. Nobody knows the best way to code right now, but perhaps soon, we will. If that's the case, then this time now, when creating and exploring to find these answers is anyone's game, is a wonderfully compelling time to be working and playing with computers.

Our goal of demonstrating how Node allows applications to be intelligently constructed out of well-formed pieces in a principled way has begun. In the next chapter, we will delve deeper into asynchronous programming, learn how to manage more complex event chains, and develop more powerful programs using Node's model.

About the Authors
  • Sandro Pasquali

    Sandro Pasquali formed a technology company named Simple in 1997, that sold the world's first JavaScript-based application development framework and was awarded several patents for deployment and advertising technologies that anticipated the future of Internet-based software. Node represents, for him, the natural next step in the inexorable march towards the day when JavaScript powers nearly every level of software development. Sandro has led the design of enterprise-grade applications for some of the largest companies in the world, including Nintendo, Major League Baseball, Bang and Olufsen, LimeWire, AppNexus, Conde Nast, and others. He has displayed interactive media exhibits during the Venice Biennial, won design awards, built knowledge management tools for research institutes and schools, and started and run several start-ups. Always seeking new ways to blend design excellence and technical innovation, he has made significant contributions across all levels of software architecture, from data management and storage tools to innovative user interfaces and frameworks. He is the author of Deploying Node.js, also by Packt Publishing, which aims to help developers get their work in front of others. Sandro runs a software development company in New York and trains corporate development teams interested in using Node and JavaScript to improve their products. He spends the rest of his time entertaining his beautiful daughter, and his wife.

  • Kevin Faaborg

    Kevin Faaborg is a professional software developer and avid software hobbyist. At Harvard, he learned C programming from visiting professor Brian Kernighan. He witnessed and contributed to how digital technology has shaped music distribution, working first at MTV Networks, then Lime Wire LLC, and now Spotify AB, where he designed and started the patent program. Kevin travels frequently, spending time each year in San Francisco, Colorado, NYC, and Stockholm. Follow him at github/zootella
