Nginx 1 Web Server Implementation Cookbook

By Dipankar Sarkar

About this book

Nginx is an open source, high-performance web server that has gained considerable popularity in recent years. Thanks to its modular architecture and small footprint, it has become the default choice of many smaller Web 2.0 companies for use as a load-balancing proxy server. It supports most of the common back-end protocols, such as FastCGI, uwsgi, and SCGI. This book is for you if you want in-depth knowledge of the Nginx server.

Nginx 1 Web Server Implementation Cookbook covers the whole range of techniques that would prove useful for you in setting up a very effective web application with the Nginx web server. It has recipes for lesser-known applications of Nginx like a mail proxy server, streaming of video files, image resizing on the fly, and much more.

The first chapter of the book covers the basics that would be useful for anyone who is starting with Nginx. Each recipe is designed to be independent of the others.

The book has recipes based on broad areas such as core, logging, rewrites, security, and others. We look at ways to optimize your Nginx setup, set up your WordPress blog, block bots that post spam on your site, set up monitoring using Munin, and much more.

Nginx 1 Web Server Implementation Cookbook makes your entry into the Nginx world easy with step-by-step recipes for nearly all the tasks necessary to run your own web application.

Publication date: May 2011
Publisher: Packt
Pages: 236
ISBN: 9781849514965

 

Chapter 1. The Core HTTP Module

In this chapter, we will cover:

  • Installing new modules and compiling Nginx

  • Running Nginx in debug mode

  • Easy reloading of Nginx using the CLI

  • Splitting configuration files for better management

  • Setting up multiple virtual hosts

  • Setting up a default catch-all virtual host

  • Using wildcards in virtual hosts

  • Setting up the number of worker processes correctly

  • Increasing the size of uploaded files

  • Using dynamic SSI for simple sites

  • Adding content before and after a particular page

  • Enabling auto indexing of a directory

  • Serving any random web page from a directory

  • Serving cookies for identifying and logging users

  • Re-encoding the response to another encoding

  • Enabling Gzip compression on some content types

  • Setting up 404 and other error pages

 

Introduction


This chapter deals with the basics of Nginx configuration and implementation. By the end of it you should be able to compile Nginx on your machine, create virtual hosts, set up user tracking, and get PHP to work.

 

Installing new modules and compiling Nginx


Today, most software is designed to be modular and extensible. Nginx, with its great community, has an amazing set of modules out there that let it do some pretty interesting things. Although most operating system distributions have Nginx binaries in their repositories, it is a necessary skill to be able to compile new, bleeding-edge modules and try them out. Now we will outline how one can go about compiling and installing Nginx with its numerous third-party modules.

How to do it...

  1. The first step is to get the latest Nginx distribution, so that you are in sync with the security and performance patches (http://sysoev.ru/nginx/nginx-0.7.67.tar.gz). Do note that you will require sudo or root access for some of the installation steps ahead.

  2. Un-tar the Nginx source code. This is simple; you will need to enter the following command:

    tar -xvzf nginx-0.7.67.tar.gz
    
  3. Go into the directory and configure it. This is essential, as here you can enable and disable the core modules that already come with Nginx. The following is a sample configure command:

    ./configure --with-debug \
    --with-http_ssl_module \
    --with-http_realip_module \
    --with-http_perl_module \
    --with-http_stub_status_module
    

    You can find out more about the other modules and configuration flags by using:

    ./configure --help
    
  4. If you get an error, then you will need to install the build dependencies, depending on your system. For example, if you are running a Debian-based system, you can enter the following command:

    apt-get build-dep nginx
    

    This will install all the required build dependencies, like PCRE and TLS libraries.

  5. After this, you can simply go ahead and build and install it:

    make
    sudo make install
    
  6. That was the plain vanilla installation! If you want to install a new module, let's take the example of the HTTP subscribe-publish module:

  7. Download your module (http://pushmodule.slact.net/downloads/nginx_http_push_module-0.692.tar.gz).

  8. Un-tar it at a certain location: /path/to/module.

  9. Reconfigure the Nginx installation:

    ./configure ..... --add-module=/path/to/module
    

    The important part is to point the --add-module flag to the right module path. The rest is handled by the Nginx configure script.

  10. You can continue to build and install Nginx as shown in step 5.

    sudo make install
    

If you have followed steps 1 to 10, it will be really easy for you to install any Nginx module.
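Pulling steps 1 to 10 together, the whole flow can be sketched as a dry-run shell script. The version number and module path are the examples used above; the script only prints the commands, so you can review them before running anything for real:

```shell
#!/bin/sh
# Dry-run sketch of the complete build-and-install sequence.
# NGINX_VER and MODULE_DIR are the example values from this recipe.
NGINX_VER="0.7.67"
MODULE_DIR="/path/to/module"

steps="tar -xvzf nginx-${NGINX_VER}.tar.gz
cd nginx-${NGINX_VER}
./configure --with-debug --with-http_ssl_module --add-module=${MODULE_DIR}
make
sudo make install"

# Print each command instead of executing it
printf '%s\n' "$steps"
```

Removing the dry-run indirection and running the commands directly reproduces the recipe end to end.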

There's more...

If you want to check that the module is installed correctly, you can enter the following command:

nginx -V 

A sample output is something as shown in the following screenshot:

This basically gives you the compilation flags that were used to install this particular binary of Nginx, indirectly listing the various modules that were compiled into it.

 

Running Nginx in debug mode


Nginx is a fairly stable piece of software that has been running in production for years and has built a very strong developer community around it. But, like all software, there are issues and bugs that crop up under the most critical of situations. When that happens, it's usually best to reload Nginx with higher levels of error logging and, if possible, in the debug mode.

How to do it...

If you want the debug mode, then you will need to compile Nginx with the debug flag (--with-debug). Most distributions have packages where Nginx is pre-compiled with the debug flag. Here are the various levels of debugging that you can utilize:

error_log LOGFILE [debug | info | notice | warn | error | crit | debug_core | debug_alloc | debug_mutex | debug_event | debug_http | debug_imap];


If you do not set the error log location, it will log to a compiled-in default log location. This logging is in addition to the normal error logging that you can do per site. Here is what the various specific debug flags do:

  • debug_core: Logs debug information from the Nginx core

  • debug_alloc: Logs all memory allocation warnings and errors

  • debug_mutex: Logs potential mutex issues

  • debug_event: Logs events module issues

  • debug_http: The default HTTP debug logging

  • debug_imap: The default IMAP debug logging

There's more...

Nginx allows us to log errors for specific IP addresses. Here is a sample configuration that will log errors from 192.168.1.1 and the IP range of 192.168.10.0/24:

error_log logs/error.log;
events { 
    debug_connection   192.168.1.1; 
    debug_connection   192.168.10.0/24; 
}

This is extremely useful when you want to debug in a production environment, as logging all cases has unnecessary performance overhead. This feature allows you to avoid setting a global debug level on the error_log, while still seeing the debug output for requests from the matched IP blocks.

 

Easy reloading of Nginx using the CLI


Depending on the system that you have, it will offer one clean way of reloading your Nginx setup:

  • Debian based: /etc/init.d/nginx reload

  • Fedora based: service nginx reload

  • FreeBSD/BSD: service nginx reload

  • Windows: nginx -s reload

All the preceding commands reload Nginx by sending a HUP signal to the main Nginx process. You can send quite a few control signals to the Nginx master process, as outlined below. These let you manage some of the basic administrative tasks:

  • TERM, INT: Quick shutdown

  • QUIT: Graceful shutdown

  • HUP: Reload the configuration, gracefully shut down the old worker processes, and restart them

  • USR1: Reopen the log files

  • USR2: Upgrade the executable on the fly, when you have already installed the new one

  • WINCH: Gracefully shut down the worker processes

How to do it...

Let me run you through the simple steps of how you can reload Nginx from the command line.

  1. Open a terminal on your system. Most UNIX-based systems already have fairly powerful terminals, while you can use PuTTY on Windows systems.

  2. Type in ps auxww | grep nginx. This will output something as shown in the following screenshot:

    If nothing comes up, then Nginx is not running on your system.

  3. If you get the preceding output, then you can see the master process and the two worker processes (there may be more, depending on your worker_processes configuration). The important number is 3322, which is the PID of the master process.

  4. To reload Nginx, you can issue the command kill -HUP <PID of the nginx master process>. In this case, the PID of the master process is 3322. This will re-read the configuration, gracefully close the current connections, and start new worker processes. You can issue another ps auxww | grep nginx to see the new PIDs of the worker processes (4582, 4583):

  5. If the worker PIDs do not change, it means that there may have been a problem while reloading the configuration files. Go ahead and check the Nginx error log.

This is very useful while writing scripts that control the Nginx configuration. A good example is when you are deploying code on production: you will temporarily point the site to a static landing page.
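The manual steps above are easy to wrap in a small helper for such deployment scripts. This is a sketch, assuming the master process records its PID in a pidfile (the path comes from the pid directive in nginx.conf); the demo at the end signals a throwaway background process instead of a real Nginx master:

```shell
#!/bin/sh
# Sketch: reload Nginx by sending HUP to the PID stored in its pidfile.
reload_nginx() {
    pidfile="$1"
    if [ -r "$pidfile" ]; then
        kill -HUP "$(cat "$pidfile")" && echo "reload signal sent"
    else
        echo "pidfile not found: $pidfile" >&2
        return 1
    fi
}

# Demo: signal a placeholder process rather than a real master process.
sleep 30 &
echo $! > /tmp/demo_nginx.pid
result=$(reload_nginx /tmp/demo_nginx.pid)
echo "$result"
```

In a real script you would pass the pidfile configured for your installation, for example /var/run/nginx.pid.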

 

Splitting configuration files for better management


By default, when you install Nginx you get one monolithic configuration file that contains a whole lot of sample configuration. Due to its extremely modular and robust design, Nginx allows you to maintain your configuration as a set of multiple linked files.

How to do it...

Let's take a sample configuration file, nginx.conf, and see how it can be broken into logical, maintainable pieces:

user       www www; #This directive determines the user and group of the processes started
worker_processes  2;
error_log  logs/error.log;
pid        logs/nginx.pid;
events {
    worker_connections  1024;
}
http {
    include       mime.types;
    default_type  application/octet-stream;
    gzip on;
    gzip_min_length 5000;
    gzip_buffers    4 8k;
    gzip_types text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    gzip_proxied  any;
    gzip_comp_level 2;
    ignore_invalid_headers  on;
    server {
        listen       80;
        server_name  www.example1.com;
        location / {
            root   /var/www/www.example1.com;
            index  index.php index.html index.htm;
        }
        location ~ \.php$ {
            include conf/fcgi.conf;
            fastcgi_pass   127.0.0.1:9000;
        }
    }
}

The preceding configuration is basically serving a simple PHP site at http://www.example1.com using FastCGI. Now we can go ahead and split this file into the following structure:

  • nginx.conf: The central configuration file remains

  • fcgi.conf: This will contain all the FastCGI configurations

  • sites-enabled/: This directory will contain all the sites that are enabled (much like Apache2's sites-enabled directory)

  • sites-available/: This directory will contain all the sites that are not active, but available (again, much like Apache2's sites-available)

  • sites-enabled/site1.conf: This is the sample virtual host configuration of the sample PHP site
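As a rough sketch, this layout can be created with a few shell commands. Here /tmp/nginx-demo stands in for your real configuration prefix, and the symlink mirrors the Apache2 convention of enabling a site by linking it from sites-available into sites-enabled:

```shell
#!/bin/sh
# Sketch: build the split-configuration layout under a demo prefix.
base=/tmp/nginx-demo
rm -rf "$base"
mkdir -p "$base/sites-enabled" "$base/sites-available"

# The virtual host definition lives in sites-available...
cat > "$base/sites-available/site1.conf" <<'EOF'
# virtual host configuration for www.example1.com goes here
EOF

# ...and is enabled by symlinking it into sites-enabled
ln -s ../sites-available/site1.conf "$base/sites-enabled/site1.conf"
ls -l "$base/sites-enabled"
```

Disabling a site is then just a matter of removing the symlink and reloading Nginx.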

The following code is for the new nginx.conf:

user       www www; 
worker_processes  2;
error_log  logs/error.log;
pid        logs/nginx.pid;
events {
        worker_connections  1024;
}
http {
        include       mime.types;
        default_type  application/octet-stream;
        gzip on;
        gzip_min_length 5000;
        gzip_buffers    4 8k;
        gzip_types text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;
        gzip_proxied  any;
        gzip_comp_level 2;
        ignore_invalid_headers  on;
        include sites-enabled/*;
}

If you notice, you will see how the include directive has allowed the inclusion of external configuration files. It should be noted that if there is an error in any of the included files, the Nginx server will fail to reload.

Here is the FastCGI configuration used by this setup; most Nginx installations provide a default one.

The following is the code for fcgi.conf:

fastcgi_param  QUERY_STRING       $query_string;
fastcgi_param  REQUEST_METHOD     $request_method;
fastcgi_param  CONTENT_TYPE       $content_type;
fastcgi_param  CONTENT_LENGTH     $content_length;
fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
fastcgi_param  REQUEST_URI        $request_uri;
fastcgi_param  DOCUMENT_URI       $document_uri;
fastcgi_param  DOCUMENT_ROOT      $document_root;
fastcgi_param  SERVER_PROTOCOL    $server_protocol;
fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
fastcgi_param  SERVER_SOFTWARE    nginx/$nginx_version;
fastcgi_param  REMOTE_ADDR        $remote_addr;
fastcgi_param  REMOTE_PORT        $remote_port;
fastcgi_param  SERVER_ADDR        $server_addr;
fastcgi_param  SERVER_PORT        $server_port;
fastcgi_param  SERVER_NAME        $server_name;
fastcgi_index  index.php;
fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;

# PHP only, required if PHP was built with --enable-force-cgi-redirect
fastcgi_param  REDIRECT_STATUS    200;
fastcgi_connect_timeout 60;
fastcgi_send_timeout 180;
fastcgi_read_timeout 180;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
fastcgi_intercept_errors on;

The following is the code for sites-enabled/site1.conf:

server {
    listen       80;
    server_name  www.example1.com;
    location / {
        root   /var/www/www.example1.com;
        index  index.php index.html index.htm;
    }
    location ~ \.php$ {
        include conf/fcgi.conf;
        fastcgi_pass   127.0.0.1:9000;
    }
}

This sort of a file arrangement allows clean separation of the main configuration and the auxiliary ones. It also promotes structured thinking, which is useful when you have to quickly switch or deploy sites.

We will go over the various configurations that you see in these files in other chapters. For example, fcgi.conf is covered in the recipe to get PHP working with Nginx using FastCGI.

 

Setting up multiple virtual hosts


Usually any web server hosts one or more domains, and Nginx, like any good web server, allows you to easily configure as many virtual hosts as you want.

How to do it...

Let's take a simple example. You want to set up a simple set of web pages on www.example1.com. Here is the sample configuration, which needs to go into sites-enabled/site1.conf:

server {
	listen 80;
	server_name www.example1.com example1.com;
	access_log /var/log/nginx/example1.com/access.log;
	error_log /var/log/nginx/example1.com/error.log;
	location / {
		root /var/www/www.example1.com;
		index index.html index.htm;
	}
}

How it works...

So let's see how this works. The listen directive defines the port on which the web server listens (in this case, it's 80). The server_name directive lets you easily define the domains that map to this virtual host configuration. Inside, you can start defining how the virtual host works. In this case it serves a set of HTML pages from the /var/www/www.example1.com directory.

So when you reload your Nginx configuration, assuming that your DNS records point correctly at your server, you should see your HTML pages load when you access the web address (in this case, http://www.example1.com).

There's more...

Here is a quick checklist to get you started:

  1. Create a simple directory with the HTML files.

  2. Create a simple configuration file containing the virtual host configuration for www.example1.com.

  3. Reload Nginx.

  4. Point your DNS server to the correct server running Nginx.

  5. Load www.example1.com.

 

Setting up a default catch-all virtual host


Once you are comfortable setting up virtual hosts, you will end up in a situation where you have a lot of domains pointing at the same IP. In addition to the domains, the web server also responds to requests for the bare IP addresses it hosts, and for many unused subdomains of the domains pointing at it. Let's take a simple example: you have http://www.example1.com pointing at the IP address, and you have configured a virtual host to handle the domains www.example1.com and example1.com. In such a scenario, when a user types in abc.example1.com or the bare IP address, the web server will not be able to serve the relevant content (be it a 404 or some other promotional page).

How to do it...

For situations like the one above, you can utilize the default catch-all virtual host that Nginx provides. Here is a simple example where this default catch-all virtual host serves a simple set of web pages.

The following is the code for sites-enabled/default.conf:

server {
	listen   80 default; 
 	server_name  _;
	location / {
		root /var/www/default;
		index index.html index.htm;
	}
}

How it works...

The key thing to note is the fact that you are listening on the default port and that the server_name is "_", which is the catch-all mechanism. So whenever the user enters a domain for which you have no defined virtual host, the pages will get served from the /var/www/default directory.

 

Using wildcards in virtual hosts


Imagine a situation where you need to create an application that serves dynamic pages on subdomains. In that case, you will need to set up a virtual host in Nginx that utilizes wildcards. Nginx has been built from the ground up to handle such a scenario. So let's take our favorite example of http://www.example1.com. Say you are building an application that needs to handle various subdomains such as a.example1.com, b.example1.com, and so on. The following configuration lets the application behind Nginx handle all these various subdomains.

How to do it...

You will need to set up a wildcard record in your DNS. Without the DNS entries, the domain (and its subdomains) will never resolve to your server IP. Sample DNS entries are given below, which point example1.com and all of its subdomains to the IP 69.9.64.11:

example1.com.   IN A 69.9.64.11
*.example1.com. IN A 69.9.64.11

Once you know how your DNS works, you can add this to your nginx.conf inside the http section:

server {
	listen 80;
	server_name example1.com *.example1.com;
	location / {
		........
	}	
}

How it works...

The important part to note is that, in this case, you are serving all the subdomains using the same code base. We have also set the virtual host to serve the non-www domain as well (example1.com, which is different from www.example1.com).

So when you type a.example1.com, your web application will receive a.example1.com as the domain that was requested from the web server, and it can process the HTTP response accordingly.
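If the configuration itself needs the subdomain (say, to pick a per-subdomain document root), later Nginx versions also accept a regular expression server_name with a named capture. This is a sketch rather than part of the recipe: the per-subdomain root layout is an assumption, and regex captures require a sufficiently recent Nginx built against a recent PCRE:

```nginx
server {
	listen 80;
	# ~ marks a regex; the named capture makes the subdomain available as $sub
	server_name ~^(?<sub>.+)\.example1\.com$;
	location / {
		# hypothetical layout: one document root per subdomain
		root /var/www/subdomains/$sub;
	}
}
```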

 

Setting up the number of worker processes correctly


Nginx, like any other UNIX-based server software, works by spawning multiple processes, and it allows the configuration of various parameters around them as well. One of the most basic configurations is the number of worker processes spawned, and it is by far one of the first things that one has to configure in Nginx.

How to do it...

This particular configuration can be found at the top of the sample configuration file nginx.conf:

user       www www; 
worker_processes  5; 
error_log  logs/error.log; 
pid        logs/nginx.pid; 
worker_rlimit_nofile 8192; 
events { 
  worker_connections  4096; 
}

In the preceding configuration, we can see how the various process configurations work. You first set the UNIX user under which the process runs, then you set the number of worker processes that Nginx needs to spawn, and after that come the file locations where the errors are logged and the PID (process ID) is saved.

How it works...

By default, worker_processes is set to 2. It is a crucial setting in a high-performance environment, as Nginx uses it for the following reasons:

  • It makes use of SMP, which allows you to efficiently use multi-core or multi-processor systems and gives a definite performance gain.

  • Increasing the number of processes decreases latency, as workers otherwise get blocked on disk I/O.

  • It limits the number of connections per process when any of the various supported event types are used. A worker process cannot have more connections than specified by the worker_connections directive.

There's more...

It is recommended that you set worker_processes to the number of cores available on your server. If you know the values of worker_processes and worker_connections, you can easily calculate the maximum number of connections that Nginx can handle in the current setup:

Maximum clients = worker_processes * worker_connections
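For example, plugging in the sample values from the configuration above (5 worker processes, 4,096 connections each) makes the formula concrete:

```shell
#!/bin/sh
# Maximum clients = worker_processes * worker_connections,
# using the sample values from the configuration above.
worker_processes=5
worker_connections=4096
max_clients=$((worker_processes * worker_connections))
echo "$max_clients"   # prints 20480
```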

 

Increasing the size of uploaded files


Usually when you are running a site where users upload a lot of files, you will see that when they upload a file that is more than 1MB in size, you get an Nginx error stating "Request Entity Too Large" (413), as shown in the following screenshot. We will look at how Nginx can be configured to handle larger uploads.

How to do it...

This is controlled by one simple part of the Nginx configuration. You can simply paste this in the server part of the Nginx configuration:

client_max_body_size 100M; # M stands for megabytes

The preceding configuration will allow you to upload files of up to 100 megabytes. Anything more than that, and you will receive a 413. You can set this to any value that is less than the disk space available to Nginx, primarily because Nginx buffers the file in a temporary location before forwarding it to the backend application.

There's more...

Nginx also lets us control other factors related to people uploading files to the web application, such as timeouts in case the client has a slow connection. A slow client can keep one of your application threads busy and thus potentially slow down your application. This is a problem experienced on all heavy multimedia, user-driven sites, where consumers upload all kinds of rich data such as images, documents, videos, and so on. So it is sensible to set low timeouts.

client_body_timeout 60; # parameter in seconds
client_body_buffer_size 8k;
client_header_timeout 60; # parameter in seconds
client_header_buffer_size 1k;

So here, the first two settings help you control the timeout when the body is not received within one read-step (that is, the client stalls while sending the request body). Similarly, you can set the timeout for the HTTP header as well. The directives and limits you can set around client uploads are listed below.

  • client_body_in_file_only: Forces Nginx to always store the client request body in temporary disk files, even if the file size is 0. The file will not be removed at request completion.

  • client_body_in_single_buffer: Specifies whether to keep the whole body in a single client request buffer.

  • client_body_buffer_size: Specifies the client request body buffer size. If the request body is larger than the buffer, the entire body, or some part of it, is written to a temporary file.

  • client_body_temp_path: Assigns the directory for storing the temporary files that hold request bodies.

  • client_body_timeout: Sets the read timeout for the request body from the client.

  • client_header_buffer_size: Sets the buffer size for the request header from the client.

  • client_header_timeout: Sets the read timeout for the request header from the client.

  • client_max_body_size: Sets the maximum accepted body size of a client request, as indicated by the Content-Length request header.
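Putting a few of these directives together, a server section tuned for uploads might look like the following sketch (the values and the temporary path are illustrative, not recommendations):

```nginx
server {
	listen 80;
	server_name www.example1.com;

	client_max_body_size    100M;	# anything larger gets a 413
	client_body_timeout     60;	# drop clients that stall mid-upload
	client_body_buffer_size 128k;	# larger bodies spill to temporary files
	client_body_temp_path   /var/spool/nginx/client_temp;	# example path
}
```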

 

Using dynamic SSI for simple sites


With the advent of modern feature-rich web servers, most of them have Server Side Includes (SSI) built in. Nginx provides easy SSI support that lets you do pretty much all the basic web stuff.

How to do it...

Let's take a simple example and start understanding what one can achieve with it.

  1. Add the following code to the nginx.conf file:

    server {
    	…..
    	location / {
    		ssi on;
    		root /var/www/www.example1.com;
    	}
    }
  2. Add the following code to the index.html file:

    <html>
    <body>
      <!--# block name="header_default" -->
      the header testing
      <!--# endblock -->
      <!--# include file="header.html" stub="header_default" -->
      <!--# echo var="name" default="no" -->
      <!--# include file="footer.html" -->
    </body>
    </html>
  3. Add the following code to the header.html file:

    <h2>Simple header</h2>
  4. Add the following code to the footer.html file:

    <h2>Simple footer</h2>

How it works...

This is a simple example where we can see that you can include partials in the larger page, and in addition you can create blocks within the page. The block command allows you to create silent blocks that can be included later as stubs, while the include command can be used to include HTML partials from other files, or even URL end points. The echo command is used to output certain variables from within the Nginx context.

There's more...

You can utilize this feature for all kinds of interesting setups where:

  • You are serving different blocks of HTML for different browser types

  • You want to optimize and speed up certain common blocks of the sites

  • You want to build a simple site with template inheritance without installing any other scripting language

 

Adding content before and after a particular page


Today, on most of the sites that we visit, the webpage structure is formally divided into a set of boxes. Usually, all sites have a static header and a footer block. Here, in the following page, you can see the YUI builder generating the basic framework of such a page.

In such a scenario, Nginx has a really useful way of adding content before and after it serves a certain page. This will potentially allow you to separate the various blocks and optimize their performance individually, as well.

Let's have a look at an example page:

So here we want to insert the header block before the content, and then append the footer block:

How to do it…

The sample configuration for this particular page would look like this:

server {
	listen 80;
	server_name www.example1.com;
	location / { 
		add_before_body   /red_block;
		add_after_body    /blue_block;
		...
	}
	location /red_block/ {
		…
	}
	location /blue_block/ {
		….
	}
}

This can act as a performance enhancer by allowing you to load CSS selectively, based on the client's browser. There can also be cases where you want to introduce something into the header or the footer on short notice, without modifying your backend application. This provides an easy fix for those situations.

Note

This module is not installed by default, so it is necessary to enable it when building Nginx:

./configure --with-http_addition_module
 

Enabling auto indexing of a directory


Nginx has an inbuilt auto-indexing module. Any request where the index file is not found will be routed to this module. This is similar to the directory listing that Apache displays.

How to do it...

Here is an example of one such Nginx directory listing. It is pretty useful when you want to share some files over your local network. To enable auto-indexing on a directory, all you need to do is place the following example in the server section of the Nginx configuration file:

server {
	listen 80;
	server_name www.example1.com;
	location / {
		root /var/www/test;
		autoindex on;
	}
}

How it works...

This will simply enable auto indexing when the user types in http://www.example1.com. You can also control some other things in the listings in this way:

autoindex_exact_size off;

This will turn off the exact file size listing and will only show the estimated sizes. This can be useful when you are worried about file privacy issues.

autoindex_localtime on;

This will represent the timestamps on the files as your local server time (it is GMT by default):

This image displays a sample index auto-generated by Nginx using the preceding configuration. You can see the filenames, timestamp, and the file sizes as the three data columns.

 

Serving any random web page from a directory


There has been a recent trend for a lot of sites to test out their new pages based upon the A/B methodology. You can explore more about its history and the various companies that have adopted this successfully as a part of their development process at http://en.wikipedia.org/wiki/A/B_testing. In this practice, you have a set of pages and some metric (such as number of registrations, or the number of clicks on a particular element). Then you go about getting people to randomly visit these pages and get data about their behavior on them. This lets you iteratively improve the page and the elements on them.

Nginx has something that will let you run your own A/B test without writing any code at all. It allows you to randomly select any web page from a directory and display it.

How to do it...

Let's have a look at a sample configuration which needs to be placed within the HTTP section of the Nginx configuration:

server {
	listen 80;
	server_name www.example1.com;
	location  /  { 
		root /var/www/www.example1.com/test_index;
 		random_index  on; 
	}
}

How it works...

Let's assume that you have some files in the /var/www/www.example1.com/test_index directory. When you turn on the random index, Nginx will scan the directory and then send a randomly picked file instead of the default index.html. The only exceptions are files whose names start with a dot; these will never be part of the set of files to pick from.

So here are two sample test pages, with slightly differing headers. Notice that the URLs are the same. This lets you determine whether the end user clicks through more with the red link or the blue link, using pure statistical methods:

The preceding screenshot displays A.html on opening the site. There is an equal probability of opening either page, much like tossing a coin and getting heads or tails.

So, using A/B testing as an example, you can set up an A.html and a B.html, which will be served to users randomly. This allows you to easily measure a lot of interesting client behavior by simply analyzing the Nginx access logs.
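Setting up the directory for such a test is trivial. Here is a sketch that uses /tmp/test_index as a stand-in for the root configured above, including a dot-file to show what random_index skips:

```shell
#!/bin/sh
# Sketch: populate a directory for random_index to pick from.
dir=/tmp/test_index
rm -rf "$dir"
mkdir -p "$dir"
echo '<h1>Variant A</h1>' > "$dir/A.html"
echo '<h1>Variant B</h1>' > "$dir/B.html"
# Filenames starting with a dot are never picked by random_index
echo 'internal notes'     > "$dir/.notes"
ls "$dir"
```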

 

Serving cookies for identifying and logging users


Nginx has the useful functionality of serving cookies for identifying users. This is very useful for tracking anonymous user behavior when a website does not want to employ external analytics software. This module is compatible with the mod_uid module in Apache2, which provides similar functionality.

How to do it…

Here is a sample configuration for this module. This goes in the server section of the configuration:

userid          on;
userid_name     uid;
userid_domain   example1.com;
userid_path     /;
userid_expires  365d;
userid_p3p      'policyref="/w3c/p3p.xml", CP="CUR ADM OUR NOR STA NID"';

How it works...

Now let's see what the various directives are about. The first directive, userid, enables this module; the second assigns a name to the cookie that will be written on the client side. The next three directives provide the standard cookie information that is needed (the primary domain, the path, and the expiry time). The last directive lets the browser understand the privacy practices the website follows. This is done using the P3P protocol, which allows websites to declare their intended use of the data they collect about users. It is basically an XML file that lets you programmatically publish your privacy policy. The following is a simple example of a policy reference that expires after about four months:

<META xmlns="http://www.w3.org/2002/01/P3Pv1">
  <POLICY-REFERENCES>
    <EXPIRY max-age="10000000"/><!-- about four months -->
  </POLICY-REFERENCES>
</META>

This XML file, placed on the server, objectively declares the site's privacy policies to incoming bots and users.

There's more...

On enabling this module, some variables become available in the Nginx configuration context that allow you to do fairly interesting things, such as $uid_got and $uid_set.

These can be used to write interesting rewrite rules. A simple application is to log the users coming to your site and then determine user bounce rates on your website by parsing the logs.
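For instance, a logging setup along these lines could record the cookie state per request. This is only a sketch: the format name and log path are illustrative, not from the book.

```nginx
# Log the uid cookie state alongside each request; $uid_got is set when the
# client presented an existing cookie, $uid_set when a new one was issued.
log_format  users  '$remote_addr - [$time_local] "$request" '
                   'got=$uid_got set=$uid_set';
access_log  /var/log/nginx/users.log  users;
```

Counting distinct uid values in this log gives a rough measure of unique visitors without any external analytics software.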

 

Re-encoding the response to another encoding


File encoding is a major issue on most websites; a lot of the time the database (MySQL in most cases) is configured to run using the Latin-1 encoding instead of UTF-8, which is the prevalent standard. Nginx provides an easy solution for changing your web page encoding on the fly, so that your users do not end up with garbled characters on your website.

How to do it...

All you need to do is to place this in the server section of your Nginx configuration:

charset         windows-1251; 
source_charset  koi8-r;

How it works...

This defines koi8-r as the source character set. If it differs from the character set given by the charset directive, re-encoding is carried out. If your original response already has a "Content-Type" header present, you will need the following directive to override it and perform the re-encoding:

override_charset on;

There's more...

You can also decide how the re-encoding happens by defining a character mapping. A simple example is the following:

charset_map  koi8-r  windows-1251 { 
  C0  FE ; # small yu 
  C1  E0 ; # small a 
  C2  E1 ; # small b 
  C3  F6 ; # small ts 
  # ... 
}
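The first few entries of this map can be checked with Python's built-in codecs (cp1251 is Python's codec name for windows-1251). This is an illustrative sketch, independent of Nginx itself:

```python
# Verify a few entries of the koi8-r -> windows-1251 map above using
# Python's codec machinery.
pairs = [(0xC0, 0xFE), (0xC1, 0xE0), (0xC2, 0xE1), (0xC3, 0xF6)]
for koi8_byte, cp1251_byte in pairs:
    ch = bytes([koi8_byte]).decode("koi8_r")   # character in the source charset
    recoded = ch.encode("cp1251")              # same character in the target charset
    assert recoded[0] == cp1251_byte
    print(f"{koi8_byte:02X} -> {recoded[0]:02X} ({ch})")
```

Byte C0 in koi8-r is indeed the small Cyrillic "yu", which is byte FE in windows-1251, matching the first line of the map.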

Nginx lets you do these neat little things that can make your site more accessible and usable for the end-user.

 

Enabling Gzip compression on some content types


As the Web has evolved, we have had improvements in web server and browser technologies. In recent times, with the booming consumer Internet market, the web application has had to become faster.

Compression techniques, which were already present, have come of age, and now most sites enable a fairly high degree of compression on the pages they serve. Nginx, being state of the art, supports Gzip compression and allows a whole lot of options on how to go about it.

How to do it...

You will need to modify your Nginx configuration file and add the following directives:

http {
	gzip             on; 
	gzip_min_length  1000; 
	gzip_comp_level 6;
	gzip_proxied     expired no-cache no-store private auth; 
	gzip_types       text/plain application/xml; 
	gzip_disable     "MSIE [1-6]\.";
	server {
		….
	}
}

How it works...

This sample configuration allows you to turn on the Gzip compression of the outgoing page for all pages which are over 1000 bytes. This limit is set because compression technology performance degrades as the page size becomes smaller. You can then set the various MIME types for which the compression should occur; this particular example will compress only plain text files and XML files.

Older browsers are not the best at utilizing this, and you can disable Gzip depending on the browser type. One of the most interesting settings is the compression level, where you need to make a choice between the amount of CPU you want to spend on compressing pages and the resulting page size (the higher this number, the more of your CPU time goes towards compressing and sending pages). The client also spends more CPU time decompressing the page if you set this high, so it is recommended to follow a middle path on this particular setting; a sensible value would be six.
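The size trade-off across levels is easy to see with zlib, which implements the same DEFLATE algorithm that Gzip uses. This Python sketch compresses a repetitive HTML-like payload (purely illustrative data) at levels 1, 6, and 9:

```python
import zlib

# A repetitive payload, roughly mimicking HTML markup.
data = b"<html>" + b"<p>Hello, world!</p>" * 5000 + b"</html>"

for level in (1, 6, 9):
    compressed = zlib.compress(data, level)
    ratio = len(compressed) / len(data)
    print(f"level {level}: {len(data)} -> {len(compressed)} bytes ({ratio:.2%})")

# Higher levels do not produce a larger result here, but they cost more CPU.
assert len(zlib.compress(data, 9)) <= len(zlib.compress(data, 1))
```

For highly repetitive content the savings between level 6 and level 9 are usually marginal, which is why a mid-range level is the common recommendation.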

There's more...

For proxy requests, gzip_proxied actually allows or disallows the compression of the response of the proxy request based on the request and the response. You can use the following parameters:

Parameter         Function
off               Disables compression for all proxied requests
expired           Enables compression if the "Expires" header prevents caching
no-cache          Enables compression if the "Cache-Control" header contains "no-cache"
no-store          Enables compression if the "Cache-Control" header contains "no-store"
private           Enables compression if the "Cache-Control" header contains "private"
no_last_modified  Enables compression if the "Last-Modified" header is not set
no_etag           Enables compression if there is no "ETag" header
auth              Enables compression if there is an "Authorization" header
any               Enables compression for all requests

So, in the preceding example (expired no-cache no-store private auth), compression is enabled when the Expires header prevents caching; when the Cache-Control header contains no-cache, no-store, or private; or when an Authorization header is present. This allows tremendous control over how compression is delivered to the client's browser.
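To make the decision table concrete, here is a purely illustrative Python model of the logic. This is not Nginx source code, and for brevity it merges request and response headers into a single dictionary and simplifies the "expired" check:

```python
def should_compress(params, headers):
    """Rough model of gzip_proxied: 'params' is the set of configured
    parameters, 'headers' a merged dict of request/response headers."""
    if "off" in params:
        return False
    if "any" in params:
        return True
    cache_control = headers.get("Cache-Control", "")
    conditions = {
        "expired": "Expires" in headers,        # simplified: real check is whether Expires prevents caching
        "no-cache": "no-cache" in cache_control,
        "no-store": "no-store" in cache_control,
        "private": "private" in cache_control,
        "no_last_modified": "Last-Modified" not in headers,
        "no_etag": "ETag" not in headers,
        "auth": "Authorization" in headers,     # Authorization comes from the request
    }
    return any(conditions.get(p, False) for p in params)

# The book's example configuration:
params = {"expired", "no-cache", "no-store", "private", "auth"}
print(should_compress(params, {"Cache-Control": "no-cache"}))  # True
print(should_compress(params, {"Cache-Control": "public"}))    # False
```

Any single matching parameter is enough to enable compression, which is why listing several parameters broadens, rather than narrows, the set of compressible responses.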

 

Setting up 404 and other error pages


All web applications have errors and missing pages, and Nginx has easy methods of ensuring that the end user has a good experience when the application does not respond correctly. It successfully handles all the HTTP errors with default pages, which can gracefully notify the users that something has gone wrong.

How to do it...

Nginx allows you to do pretty interesting things with error pages. Following are some example configurations which can be placed within the HTTP or server section.

We are also going to define a named location using the "@" prefix after location. These locations are not used during the normal processing of any request and are intended to only process internally redirected requests.

location @fallback {
    proxy_pass http://backend;
}
error_page   404          /404.html; 
error_page   502 503 504  /50x.html; 
error_page   403          http://example1.com/forbidden.html; 
error_page   404          = @fallback; 
error_page 404 =200 /empty.gif; 

How it works...

The first example maps a simple 404 to a simple HTML page. The next example maps various application error codes to another generic application error HTML page. You can also map an error page to a different external site altogether (http://example1.com/forbidden.html). The fourth example maps the error to another location, defined as @fallback. The last example is interesting, as it actually changes the response code to a 200 (HTTP OK). This is useful in situations where you have excessive 404 pages on the site and would prefer not to send a 404 back as the reply, but a 200 with a very small GIF file in return.
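Note that when you map errors to a local page such as /50x.html, Nginx still needs a location from which to serve that file. A common companion snippet looks like the following; the root path is illustrative, not from the book:

```nginx
error_page  500 502 503 504  /50x.html;
location = /50x.html {
    root      /var/www/errors;   # directory containing 50x.html
    internal;                    # only reachable via internal redirects
}
```

Marking the location as internal prevents clients from requesting the error page directly.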

You can utilize this very effectively to give the end user a better experience when they inadvertently reach dead ends and application errors on your site.

If you do not set these error pages correctly, you will get the default Nginx error pages which may not be useful to the user and may turn them away.

About the Author

  • Dipankar Sarkar

    Dipankar Sarkar is a web and mobile entrepreneur. He has a Bachelor’s degree in Computer Science and Engineering from the Indian Institute of Technology, Delhi. He is a firm believer in the open source movement and participated in the Google Summer of Code in 2005-06 and 2006-07. He has conducted technical workshops on Windows Mobile and Python at various technical meetups, and recently took part in the Startup Leadership Program, Delhi Chapter. He worked with Slideshare LLC, one of the world’s largest online presentation hosting and sharing services, as an early engineering employee. He has since worked with Mpower Mobile LLC, a mobile payment startup, and Clickable LLC, a leading search engine marketing startup. He was a co-founder at Kwippy, which was one of the top micro-blogging sites. He is currently working in the social TV space and has co-founded Jaja.

