Understanding and Developing Node Modules


Node Web Development


A practical introduction to Node, the exciting new server-side JavaScript web development stack



What's a module?

Modules are the basic building blocks of Node applications. We have already seen modules in action; every JavaScript file we use in Node is itself a module. It's time to see what they are and how they work.

The following code pulls in the fs module, giving us access to its functions:

var fs = require('fs');

The require function searches for modules, and loads the module definition into the Node runtime, making its functions available. The fs object (in this case) contains the code (and data) exported by the fs module.

Let's look at a brief example of this before we start diving into the details. Ponder over this module, simple.js:

var count = 0;
exports.next = function() { return count++; }

This defines an exported function and a local variable. Now let's use it:

[Figure: a Node session that loads ./simple into s and calls s.next() several times]

The object returned from require('./simple') is the same object, exports, we assigned a function to inside simple.js. Each call to s.next calls the function next in simple.js, which returns (and increments) the value of the count variable, explaining why s.next returns progressively bigger numbers.
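The interaction described above can be sketched as a standalone script. This is a minimal sketch rather than the original session: it recreates simple.js in a temporary directory so that it runs on its own, and the directory prefix and the results variable are illustrative assumptions.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Recreate simple.js in a scratch directory so this sketch is self-contained.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'simple-'));
fs.writeFileSync(path.join(dir, 'simple.js'),
    'var count = 0;\n' +
    'exports.next = function() { return count++; };\n');

// s is the exports object that simple.js assigned next to.
var s = require(path.join(dir, 'simple'));
var results = [s.next(), s.next(), s.next()];

console.log(results);          // [ 0, 1, 2 ]
console.log(typeof s.count);   // undefined (count stays private to the module)
```

Each call returns the current value of count and then increments it, and count itself never appears on the object returned by require.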

The rule is that anything (functions, objects) assigned as a field of exports is exported from the module, and objects inside the module but not assigned to exports are not visible to any code outside the module. This is an example of encapsulation.

Now that we've got a taste of modules, let's take a deeper look.


Node modules

Node's module implementation is strongly inspired by, but not identical to, the CommonJS module specification. The differences between them might only be important if you need to share code between Node and other CommonJS systems. A quick scan of the Modules/1.1.1 spec indicates that the differences are minor, and for our purposes it's enough to just get on with the task of learning to use Node without dwelling too long on the differences.


How does Node resolve require('module')?

In Node, modules are stored in files, one module per file. There are several ways to specify module names, and several ways to organize the deployment of modules in the file system. It's quite flexible, especially when used with npm, the de facto standard package manager for Node.

Module identifiers and path names

Generally speaking the module name is a path name, but with the file extension removed. That is, when we write require('./simple'), Node knows to add .js to the file name and load in simple.js.
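One way to observe the extension being added is require.resolve, which reports the file name that require would load without actually loading it. The sketch below is illustrative only; the temporary directory and the empty simple.js file it creates are assumptions made so that the example runs on its own.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Scratch directory holding an empty module file named simple.js.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-'));
fs.writeFileSync(path.join(dir, 'simple.js'), '');

// Ask Node which file the extension-less identifier maps to.
var resolved = require.resolve(path.join(dir, 'simple'));

console.log(path.basename(resolved));   // simple.js
```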

Modules whose file names end in .js are of course expected to be written in JavaScript. Node also supports binary code native libraries as Node modules. In this case the file name extension to use is .node. It's outside our scope to discuss implementation of a native code Node module, but this gives you enough knowledge to recognize them when you come across them.

Some Node modules are not files in the file system, but are baked into the Node executable. These are the Core modules, the ones documented on nodejs.org. Their original existence is as files in the Node source tree but the build process compiles them into the binary Node executable.

There are three types of module identifiers: relative, absolute, and top-level.

Relative module identifiers begin with "./" or "../" and absolute identifiers begin with "/". These follow POSIX file system semantics, with relative path names resolved against the directory of the file that calls require.

Absolute module identifiers are, of course, relative to the root of the file system. Top-level module identifiers do not begin with "./", "../", or "/" and instead are simply the module name. These modules are stored in one of several directories designated by Node to hold them, such as a node_modules directory or the directories listed in the require.paths array.
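The three kinds of identifier can be told apart by their prefix alone. The function below is a hypothetical helper written for illustration, not part of Node, and it covers only the prefixes named in the text.

```javascript
// Hypothetical helper: classify a module identifier by its prefix.
function identifierKind(id) {
    if (id.indexOf('./') === 0 || id.indexOf('../') === 0) {
        return 'relative';
    }
    if (id.charAt(0) === '/') {
        return 'absolute';
    }
    return 'top-level';
}

console.log(identifierKind('./simple'));        // relative
console.log(identifierKind('../utils'));        // relative
console.log(identifierKind('/usr/lib/thing'));  // absolute
console.log(identifierKind('express'));         // top-level
```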

Local modules within your application

The universe of all possible modules is split neatly into two kinds, those modules that are part of a specific application, and those modules that aren't. Hopefully the modules that aren't part of a specific application were written to serve a generalized purpose. Let's begin with implementation of modules used within your application.

Typically your application will have a directory structure of module files sitting next to each other in the source control system, and then deployed to servers. These modules will know the relative path to their sibling modules within the application, and should use that knowledge to refer to each other using relative module identifiers.

To see how this works, let's look at the structure of an existing Node package, the Express web application framework. It includes several modules structured in a hierarchy that the Express developers found to be useful. You can imagine creating a similar hierarchy for applications reaching a certain level of complexity, subdividing the application into chunks larger than a module but smaller than an application. Unfortunately Node has no word for this, so we're left with a clumsy phrase like "subdivide into chunks larger than a module". Each subdivided chunk would be implemented as a directory with a few modules in it.

[Figure: directory listing of the Express package, including lib/utils.js and a node_modules directory containing qs]

In this example, the most likely relative module reference is to utils.js. Depending on the source file which wants to use utils.js it would use one of the following require statements:

var utils = require('./lib/utils');
var utils = require('./utils');
var utils = require('../utils');





Bundling external dependencies with your application

Modules placed in a node_modules directory are required using a top-level module identifier such as:

var express = require('express');

Node searches the node_modules directories to find modules. There is not just one node_modules directory, but several that Node searches. Node starts at the directory of the current module, appends node_modules, and searches there for the module being requested. If the module is not found in that node_modules directory, Node moves to the parent directory and tries again, repeatedly moving to parent directories until it reaches the root of the file system.

In the previous example, you'll notice a node_modules directory containing a directory named qs. Because it is situated in that location, the qs module is available to any module inside Express with this code:

var qs = require('qs');

What if you want to use the Express framework in your application? That's simple: make a node_modules directory inside the directory structure of your application, and install the Express framework there:

[Figure: directory layout of drawapp, with index.js, lib/draw.js, lib/svg.js, and node_modules/express]

We can see this in a hypothetical application shown here, drawapp. With the node_modules directory situated where it is, any module within drawapp can access express with the code:

var express = require('express');

But those same modules cannot access the qs module stashed inside the node_modules directory within the express module. The search for node_modules directories containing the desired module goes upward in the filesystem hierarchy, and not into child directories.

Likewise, a module could be installed in lib/node_modules and be accessible from draw.js or svg.js but not from index.js.
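The upward-only search can be checked with a scratch copy of such a layout. This is a sketch under stated assumptions: express-standin and qs-standin are made-up stand-ins for express and qs, writeFile and canLoad are hypothetical helpers, and the tree is built in a temporary directory rather than in a real project.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: write a file, creating parent directories as needed.
function writeFile(file, text) {
    fs.mkdirSync(path.dirname(file), { recursive: true });
    fs.writeFileSync(file, text);
}

// Module source that reports whether its own require can find a given name.
var probe = 'exports.canLoad = function(name) {\n' +
            '    try { require(name); return true; } catch (e) { return false; }\n' +
            '};\n';

var root = fs.mkdtempSync(path.join(os.tmpdir(), 'drawapp-'));
writeFile(path.join(root, 'lib/draw.js'), probe);
writeFile(path.join(root, 'node_modules/express-standin/index.js'), probe);
writeFile(path.join(root,
    'node_modules/express-standin/node_modules/qs-standin/index.js'), '');

var draw = require(path.join(root, 'lib/draw'));
var express = require(path.join(root, 'node_modules/express-standin'));

console.log(draw.canLoad('express-standin'));   // true: found in a parent directory
console.log(draw.canLoad('qs-standin'));        // false: nested below express-standin
console.log(express.canLoad('qs-standin'));     // true: found in its own node_modules
```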

Node searches upward for node_modules directories, stopping at the first place it finds the module it's searching for. A module reference from draw.js or svg.js would search this list of directories:

  • /home/david/projects/drawapp/lib/node_modules
  • /home/david/projects/drawapp/node_modules
  • /home/david/projects/node_modules
  • /home/david/node_modules
  • /home/node_modules
  • /node_modules
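The list above can be derived mechanically by walking from the starting directory up to the root. The function below is a simplified sketch of that walk, not Node's actual implementation; nodeModulesPaths is a hypothetical name, and the sketch ignores details such as Node skipping directories that are themselves named node_modules.

```javascript
var path = require('path');

// Hypothetical helper: list the node_modules directories searched for a
// module located in startDir, nearest first. POSIX-style paths are assumed.
function nodeModulesPaths(startDir) {
    var dirs = [];
    var current = path.posix.resolve(startDir);
    while (true) {
        dirs.push(path.posix.join(current, 'node_modules'));
        var parent = path.posix.dirname(current);
        if (parent === current) {
            break;                 // reached the root of the file system
        }
        current = parent;
    }
    return dirs;
}

var searched = nodeModulesPaths('/home/david/projects/drawapp/lib');
console.log(searched.join('\n'));
```

Inside a running module, the module.paths array holds the list Node itself computed for that file.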

The node_modules directory plays a key role in keeping the Node package management from disappearing into a maze of conflicting package versions. Rather than having one place to put modules, and descend into confusion as dependencies on conflicting module versions slowly drive you nuts, multiple node_modules directories let you have specific versions of modules in specific places, if needed. Different versions of the same module can live in different node_modules directories, and they won't conflict with each other, so long as the node_modules directories are situated correctly.

For example, suppose you've written an application using the forms module (https://github.com/caolan/forms) to help build the forms in your application, and after you've written hundreds of different forms, the authors of the forms module make incompatible changes. With hundreds of forms to convert and test on the new API, you might not want to do it all at once, but spread out the effort. Doing so would require two directories within your application, each with its own node_modules directory, with a different version of the forms module in each. Then, as you convert a form to the new forms module, move its code into the directory where the new forms module lives.
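The two-directory arrangement can be sketched with a scratch tree. Everything named here is an assumption for illustration: forms-standin is a made-up placeholder for the forms module, legacy and updated are arbitrary directory names, and writeFile is a hypothetical helper.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: write a file, creating parent directories as needed.
function writeFile(file, text) {
    fs.mkdirSync(path.dirname(file), { recursive: true });
    fs.writeFileSync(file, text);
}

var root = fs.mkdtempSync(path.join(os.tmpdir(), 'forms-'));

// Two copies of the same module name, one per directory, with different contents.
writeFile(path.join(root, 'legacy/node_modules/forms-standin/index.js'),
    'exports.version = "old API";\n');
writeFile(path.join(root, 'updated/node_modules/forms-standin/index.js'),
    'exports.version = "new API";\n');

// The same page source in both directories; only its location differs.
var page = 'exports.formsVersion = require("forms-standin").version;\n';
writeFile(path.join(root, 'legacy/page.js'), page);
writeFile(path.join(root, 'updated/page.js'), page);

var legacyPage = require(path.join(root, 'legacy/page'));
var updatedPage = require(path.join(root, 'updated/page'));

console.log(legacyPage.formsVersion);    // old API
console.log(updatedPage.formsVersion);   // new API
```

Identical code picks up a different version purely because of where it lives, which is what makes the gradual migration possible.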

System-wide modules in the require.paths directories

The algorithm Node uses to find the node_modules directories extends beyond your application source tree. It goes to the root of the file system, and you could have a /node_modules directory with a global module repository to satisfy any search for modules.

Node provides an additional mechanism with the require.paths variable. This is an array of directory names that Node searches for modules.

An example is:

$ node
> require.paths;

The NODE_PATH environment variable can add directories to the require.paths array:

$ export NODE_PATH=/usr/lib/node
$ node
> require.paths;

It used to be a common idiom for Node programs to add directories to the require.paths array as follows: require.paths.push(__dirname). This is no longer recommended, because in practice it proved to be a troublesome source of confusion: the results are unpredictable when multiple modules push directories into require.paths. In the Node versions this article covers you can still do it, and modules using the idiom still exist, but it's sternly frowned upon, and later Node releases removed require.paths altogether.

The recommended practice is, in most cases, to install modules in node_modules directories.

Complex modules—modules as directories

A complex module might include several internal modules, data files, template files, documentation, tests, or more. These can be stashed inside a carefully constructed directory structure, which Node will treat as a module satisfying a require('moduleName') request. To do so, you place one of two files in the directory: either a module file named index.js, or a file named package.json. The package.json file contains data describing the module, in a format nearly identical to the package.json format defined by the npm package manager. The two are compatible, with Node using a very small subset of the fields that npm recognizes.

Specifically, Node recognizes these fields in package.json:

{ "name": "myAwesomeLibrary",
  "main": "./lib/awesome.js" }

With that package.json, the code require('myAwesomeLibrary') would find this directory and load the file lib/awesome.js inside it, as named by the main field.

If there were no package.json file, Node would instead look for index.js in the same directory and load that file.


Under either scenario (index.js or package.json), the complex module with internal modules and other assets is easy to implement. Some of the modules will use relative module identifiers to reference other modules inside the package, and you can use a node_modules directory to integrate modules developed elsewhere.
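Both scenarios can be tried out in a scratch directory. The sketch below makes several assumptions for the sake of being self-contained: the directories are required by absolute path instead of through a node_modules search, plainLibrary is a made-up name for the index.js case, and writeFile and the loadedFrom field are hypothetical.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: write a file, creating parent directories as needed.
function writeFile(file, text) {
    fs.mkdirSync(path.dirname(file), { recursive: true });
    fs.writeFileSync(file, text);
}

var root = fs.mkdtempSync(path.join(os.tmpdir(), 'complex-'));

// Scenario 1: package.json whose main field names the entry point.
writeFile(path.join(root, 'myAwesomeLibrary/package.json'),
    JSON.stringify({ name: 'myAwesomeLibrary', main: './lib/awesome.js' }));
writeFile(path.join(root, 'myAwesomeLibrary/lib/awesome.js'),
    'exports.loadedFrom = "lib/awesome.js";\n');

// Scenario 2: no package.json, so index.js is the entry point.
writeFile(path.join(root, 'plainLibrary/index.js'),
    'exports.loadedFrom = "index.js";\n');

// Requiring a directory makes Node pick the entry point inside it.
var awesome = require(path.join(root, 'myAwesomeLibrary'));
var plain = require(path.join(root, 'plainLibrary'));

console.log(awesome.loadedFrom);   // lib/awesome.js
console.log(plain.loadedFrom);     // index.js
```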


CommonJS modules

Node's module system is based on the CommonJS module system (http://www.commonjs.org/). While JavaScript is a powerful language with several advanced modern features (such as objects and closures), it lacks a standard object library to facilitate building applications. CommonJS aims to fill that gap with both a convention for implementing modules in JavaScript, and a set of standard modules.

The require function takes a module identifier and returns the API exported by the module. If a module requires other modules they are loaded as well. Modules are contained in one JavaScript file, and CommonJS doesn't specify how the module identifier is mapped into a filename.

Modules provide a simple mechanism for encapsulation to hide implementation details while exposing an API. Module content is JavaScript code which is treated as if it were written as follows:

(function() { ... contents of module file ... })();

This encapsulates (hides) every top-level object in the module within a private namespace that other code cannot access. This is how the Global Object problem is resolved.

The exported module API is the object returned by the require function. Inside the module it's implemented with a top-level object named exports whose fields contain the exported API. To expose a function or object from the module, simply assign it into the exports object.
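The wrapping described above can be imitated in a few lines. This is a deliberately simplified sketch and not Node's loader: loadFromSource is a hypothetical function, the module source is passed as a string instead of being read from a file, and only exports and module are supplied, whereas Node's real wrapper also passes require, __filename, and __dirname.

```javascript
// Hypothetical loader: run module source inside a function so that its
// top-level variables stay private, and return whatever it exported.
function loadFromSource(source) {
    var module = { exports: {} };
    var wrapper = new Function('exports', 'module', source);
    wrapper(module.exports, module);
    return module.exports;
}

var counter = loadFromSource(
    'var count = 0;\n' +
    'exports.next = function() { return count++; };\n');

var first = counter.next();
var second = counter.next();

console.log(first, second);   // 0 1
console.log(typeof count);    // undefined (count never became a global)
```

Because the source runs as a function body, its var declarations are local to that function, and the only thing the caller can reach is what was assigned to exports.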

Demonstrating module encapsulation

That was a lot of words, so let's do a quick example. Create a file named module1.js containing this:

var A = "value A";
var B = "value B";
exports.values = function() {
   return { A: A, B: B };
};

Then create a file named module2.js containing the following:

var util = require('util');
var A = "a different value A";
var B = "a different value B";
var m1 = require('./module1');
util.log('A='+A+' B='+B+' values='+util.inspect(m1.values()));

Then run it as follows (you must have already installed Node):

$ node module2.js
19 May 21:36:30 - A=a different value A B=a different value B
values={ A: 'value A', B: 'value B' }

This artificial example demonstrates encapsulation of the values in module1.js from those in module2.js. The A and B values in module1.js don't overwrite A and B in module2.js, because they're encapsulated within module1.js. Values encapsulated within a module can be exported, such as the .values function in module1.js.

The Global Object problem has to do with those variables which are outside the functions, putting them in the global context. In web browsers there is a single global context, and it causes a lot of problems if one JavaScript script steps on the global variables used in another script. With CommonJS modules each module has its own private global context, making it safe to share variables between functions within a module without danger of interfering with global variables in other modules.


In this article we learned how to implement modules and packages for Node, and how Node locates modules.
