Migration from Apache to Lighttpd

by Andre Bogus | November 2008 | Linux Servers Open Source

In this article by Andre Bogus, we focus on migrating from Apache to the Lighttpd web server. Lighttpd is well suited to servers suffering from load problems: it has a small memory footprint compared to other web servers, manages CPU load effectively, and offers an advanced feature set including FastCGI, SCGI, authentication, output compression, URL rewriting, and more. Apache is still the most common web server in use today, so while we wait for Lighttpd's world domination, the migration from Apache warrants its own article. As this article is about Lighttpd and not Apache, it assumes some knowledge of Apache configuration. If anything is unclear, the Apache documentation at http://apache.org/docs/ will be of help.

Now starting from a working Apache installation, what can Lighttpd offer us?

  • Improved performance for most cases (as in more hits per second)
  • Reduced CPU time and memory usage
  • Improved security

Of course, the move to Lighttpd is not a small one, especially if our Apache configuration makes use of its many features. Systems tied into Apache as a module may make the move hard or even impossible without porting the module to a Lighttpd module or moving the functionality into CGI programs, if possible.

We can ease the pain by moving in small steps. The following descriptions assume that we have one Apache instance running on one hardware instance. But we can scale the method by repeating it for every hardware instance.

When not to migrate
Before we start this journey, we need to know that our hardware and operating system support Lighttpd, that we have root access (or access to someone who has), and that the system has enough space for an additional Lighttpd installation (yes, I know, Lighttpd should reduce space concerns, but I have seen Apache installations munching away entire RAID arrays). The effort probably only pays off if we plan on moving a large percentage of our traffic to Lighttpd. We might also make extensive use of Apache modules, in which case a complete migration would involve finding or writing suitable substitutes for Lighttpd.

Adding Lighttpd to the Mix

Install Lighttpd on the system that Apache runs on. Find an unused port (refer to a port scanner if needed) to set server.port to. For example, if port 4080 is unused on our system, we would look for server.port in our Lighttpd configuration and change it to:

server.port = 4080

If we want to use SSL, we should change all occurrences of the port 443 to another free port, say 4443. We assume our Apache is answering requests on HTTP port 80.
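
If no port scanner is at hand, a few lines of Python can tell us whether a candidate port is free. This is only a sketch: the helper names port_is_free and first_free_port are made up for this example, and the check covers only the loopback address, so a service bound to another interface would go unnoticed.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # bind fails when another process already holds the port
            return False
    return True

def first_free_port(candidates, host="127.0.0.1"):
    """Return the first free port from candidates, or None if all are taken."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None

if __name__ == "__main__":
    # candidate ports for server.port; the list is only an example
    print(first_free_port([4080, 4081, 8080]))
```

Whatever port this reports can then go into server.port; the same check works for the SSL port.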

Now let's use this Lighttpd instance as a proxy for our Apache by adding the following configuration:

server.modules = (
  #...
  "mod_proxy",
  #...
)
#...
proxy.server = (
  "" => (                      # proxy everything
    ( "host" => "127.0.0.1",   # localhost
      "port" => 80 )
  )
)

This tells our Lighttpd to proxy all requests to the server that answers on localhost, port 80, which happens to be our Apache server. Now, when we start our Lighttpd and point our browser to http://localhost:4080/, we should be able to see the same thing that our Apache is returning.

What is a proxy?
A proxy stands in front of another object, simulating the object by relaying all requests to it. A proxy can change requests on the fly, filter requests, and so on. In our case, Lighttpd is the web server to the outside, whilst Apache will still get all requests as usual.

Excursion: mod_proxy

mod_proxy is the module that allows Lighttpd to relay requests to another web server. It is not to be confused with mod_proxy_core (of Lighttpd 1.5.0), which provides a basis for other interfaces such as CGI. Usually, we want to proxy only a specific subset of requests; for example, we might want to proxy requests for JavaServer Pages (JSPs) to a Tomcat server. This could be done with the following proxy directive:

proxy.server = (
  ".jsp" => ( ( "host" => "127.0.0.1", "port" => 8080 ) )
  # given our Tomcat is on port 8080
)

Thus the Tomcat server only serves JSPs, which is what it was built to do, whilst our Lighttpd does the rest.

Or we might have another server which we want to include in our Web presence at some given directory:

proxy.server = (
  "/somepath" => ( ( "host" => "127.0.0.1", "port" => 8080 ) )
)

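Assuming the server is on port 8080, this will do the trick. Now requests for http://localhost/somepath/index.html are answered by the server on port 8080. Note that mod_proxy passes the request path on unchanged, so that server must itself answer under /somepath.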
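
To make the two kinds of keys concrete, here is a small Python model of how proxy.server keys are matched against a request path. It is a simplified sketch of my understanding, not Lighttpd's actual code: keys starting with a slash are treated as path prefixes, other keys as suffixes, the empty key matches everything, and taking the first matching rule is an assumption. The function name match_proxy_rule is invented for this example.

```python
def match_proxy_rule(path, rules):
    """Return the backend of the first rule whose key matches path, or None.

    rules is a list of (key, backend) pairs in configuration order.
    """
    for key, backend in rules:
        if key == "":
            return backend              # "" proxies everything
        if key.startswith("/"):
            if path.startswith(key):    # "/somepath" is a path prefix
                return backend
        elif path.endswith(key):        # ".jsp" is a suffix (extension)
            return backend
    return None                         # Lighttpd serves the request itself

if __name__ == "__main__":
    rules = [(".jsp", ("127.0.0.1", 8080)),
             ("/somepath", ("127.0.0.1", 8080))]
    for p in ("/shop/cart.jsp", "/somepath/index.html", "/index.html"):
        print(p, "->", match_proxy_rule(p, rules))
```

A path that matches no key returns None here, which corresponds to Lighttpd handling the request without asking the backend.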

Reducing Apache Load

Note that, like most Lighttpd directives, proxy.server can be moved into a selector, thereby reducing its reach. This way, we can reduce the set of files Apache has to touch in a phased manner. For example, YouTube™ uses Lighttpd to serve its videos. Usually, we want to make Lighttpd serve static files such as images, CSS, and JavaScript, leaving Apache to serve the dynamically generated pages.

Now, we have two options: we can either filter the extensions we want Apache to handle, or we can filter the addresses we want Lighttpd to serve without asking Apache.

Actually, the first can be done in two ways. Assuming we want to give all addresses ending with .cgi and .php to Apache, we could either use the matching of proxy.server:

proxy.server = (
  ".cgi" => ( ( "host" => "127.0.0.1", "port" => 80 ) ), # Apache on port 80, as above
  ".php" => ( ( "host" => "127.0.0.1", "port" => 80 ) )
)

or match by selector:

$HTTP["url"] =~ "\.(cgi|php)$" {
  proxy.server = ( "" => ( ( "host" => "127.0.0.1", "port" => 80 ) ) )
}

The second way filters by regular expression and also allows negative filtering: just use !~ instead of =~.
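
Since the selector is a regular expression, an unescaped dot matches any character, which can send more requests to Apache than intended. The following Python sketch illustrates the difference. Python's re module stands in for the PCRE syntax Lighttpd uses, which behaves the same way for these simple patterns; the function name goes_to_apache is made up for the example.

```python
import re

# "." unescaped: matches ANY character before "cgi" or "php"
loose = re.compile(r"(.cgi|.php)$")
# "\." escaped: matches only a literal dot
strict = re.compile(r"\.(cgi|php)$")

def goes_to_apache(url_path, pattern=strict):
    """Model of the =~ selector: True if the path would be proxied."""
    return pattern.search(url_path) is not None

if __name__ == "__main__":
    for p in ("/index.php", "/img/logo.png", "/downloads/gnuphp"):
        print(p, goes_to_apache(p, loose), goes_to_apache(p, strict))
```

The negative selector !~ corresponds to `not goes_to_apache(path)`: everything except the listed extensions matches.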

mod_perl, mod_php, and mod_python

There are no Lighttpd modules to embed scripting languages into Lighttpd (with the exception of mod_magnet, which embeds Lua) because this is simply not the Lighttpd way of doing things. Instead, we have the CGI, SCGI, and FastCGI interfaces to outsource this work to the respective interpreters.

Most mod_perl scripts are easily converted to FastCGI using CGI::Fast. Usually, our mod_perl script will look a lot like the following script:

use CGI;
my $q = CGI->new;
initialize(); # this might need to be done only once
process_query($q); # this should be done per request
print response($q); # this, too

The easiest way to convert this to FastCGI is:

use CGI::Fast; # instead of CGI
while (my $q = CGI::Fast->new) { # get requests in a while-loop
    initialize();
    process_query($q);
    print response($q);
}

If this runs, we may try to put the initialize() call outside of the loop to make our script run even faster than under mod_perl. However, this is just the basic case. There are mod_perl scripts that manipulate the Apache core or use special hooks, so these scripts can get a little more complicated to migrate.

Migrating from mod_php to php-fcgi is easier — we do not need to change the scripts, just the configuration. This means that we do not get the benefits of an obvious request loop, but we can work around that by setting some global variables only if they are not already set. The security benefit is obvious. Even for Apache, there are some alternatives to mod_php, which try to provide more security, often with bad performance implications.

mod_python can be a little more complicated, because Apache calls out to the Python functions directly, converting form fields to function arguments on the fly. If we are lucky, our Python scripts already implement WSGI (the Web Server Gateway Interface). In this case, we can just use a WSGI-FastCGI wrapper. Looking on the Web, I found two: a standalone one (http://svn.saddi.com/py-lib/trunk/fcgi.py), and one that is part of the PEAK project (http://peak.telecommunity.com/DevCenter/FrontPage). Otherwise, Python usually has excellent support for SCGI.
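
For readers who have not seen WSGI, the interface is just a callable that takes the request environment and a start_response function. The sketch below shows a minimal application and a tiny harness that calls it without any server. The names application and call_app are chosen for this example, and the wiring to one of the FastCGI wrappers is deliberately left out, since it depends on the wrapper.

```python
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    """A minimal WSGI application: echoes the requested path."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]

def call_app(app, path):
    """Invoke a WSGI app once, without a server, and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)   # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body

if __name__ == "__main__":
    print(call_app(application, "/hello"))
```

A script that already has this shape needs no changes when moving away from Apache; only the wrapper that calls application differs.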

As with mod_perl, there are some internals that have to be moved into the configuration (for example, dynamic 404 pages: the directive for this is server.error-handler-404, which can also point to a CGI script). However, for basic scripts, we can use SCGI (either from http://www.mems-exchange.org/software/scgi/ or as a Python-only version from http://www.cherokee-project.com/download/pyscgi/). We also need to change import cgi to import scgi and change CGIHandler and CGIServer to SCGIHandler and SCGIServer, respectively.

.htaccess

A lot of Lighttpd users converting from Apache ask whether Lighttpd has a substitute for .htaccess files, which were made popular by Apache and are now a de facto standard used by many web servers. It does not: Lighttpd has its own configuration syntax, so the old .htaccess files won't work with Lighttpd.

There is no perfect solution to this problem, but as the most used feature of .htaccess files is authentication, we can at least solve that. Let's have a look at the authentication directive format in Apache and Lighttpd:

  • Apache just assumes that the path required for authentication is the path where the .htaccess file resides, while Lighttpd needs to add this explicitly.
  • The Apache example relies on some settings that are inherited as defaults from httpd.conf. In the lighttpd.conf example, we do not assume such defaults.

Note that the Lighttpd configuration gets a little more complicated if we have multiple backends or user files. In this case, we need to use a selector to limit the reach of our directives. For example, assume that we want digest authentication for internal.mydomain.com, but htpasswd authentication for some directories in mydomain.com, with a different htpasswd file for the messages directory:

server.modules = (..., "mod_auth", ...)
auth.backend = "htpasswd"
auth.backend.htpasswd.userfile = "/web/general/.htpasswd"
$HTTP["host"] == "internal.mydomain.com" {
  auth.backend = "htdigest"
  auth.backend.htdigest.userfile = "/web/internal/.htdigest"
  auth.require = (
    "/" => (
      "method" => "digest",
      "realm" => "internal",
      "require" => "valid-user"
    )
  )
}
else $HTTP["url"] =~ "^/messages" {
  auth.backend.htpasswd.userfile = "/web/messages/.htpasswd"
  auth.require = (
    "/" => (
      "method" => "basic",
      "realm" => "messages",
      "require" => "valid-user"
    )
  )
}
auth.require = ( # This table assigns authentication requirements
                 # to directories or file types.
  "/admin/" => ( # everything below the /admin path
    "method" => "basic",
    "realm" => "admin",
    "require" => "user=andre|user=bob" # allow only bob and me
  ),
  "/download" => (
    "method" => "basic",
    "realm" => "download",
    "require" => "valid-user"
  ),
  ".private" => ( # all files ending with .private
    "method" => "basic",
    "realm" => "private",
    "require" => "user=andre"
  )
  # ... we could add more directories here.
)

The first selector marks out the region internal.mydomain.com, where we then use digest authentication. The next selector catches the messages directory everywhere else and switches to the /web/messages/.htpasswd user file. Finally, we add all the requirements for the other directories.

Note that the following two are identical:

$HTTP["url"] =~ "^/messages" {
  auth.require = ( "/" => ( ... ) )
}

auth.require = ( "/messages" => ( ... ) )

But the first version is more flexible, as it allows defining different user files and backends for each path that matches a certain pattern. Armed with this knowledge, we can write a small script that runs through our web root, finds all .htaccess files, and emits the corresponding Lighttpd configuration (at least for the access directives). In fact, we do not even need to do this, because I already did the coding:

#!/usr/bin/env python3
# htaccess2lighttpd.py: emit Lighttpd auth configuration for .htaccess files
import os
import sys

# web root to scan (first argument, defaults to the current directory)
path = sys.argv[1] if len(sys.argv) > 1 else "."

def toUserList(users):
    # "andre bob" becomes "user=andre|user=bob"
    return "|".join(["user=" + user for user in users.split()])

def groups(groupFileName, gps):
    # expand the named Apache groups into one Lighttpd user list
    groupDict = {}
    with open(groupFileName) as groupFile:
        for groupLine in groupFile:
            if ":" not in groupLine:
                continue
            group, users = groupLine.split(":", 1)
            groupDict[group.strip()] = users.strip()
    return "|".join([toUserList(groupDict[g]) for g in gps.split()])

for (root, dirs, files) in os.walk(path):
    if ".htaccess" not in files:
        continue
    # URL path of this directory, relative to the web root
    rel = os.path.relpath(root, path).replace(os.path.sep, "/")
    # try some sensible defaults; keys are lower-cased Apache directives
    r = {"authtype": "Basic", "url": "^/" + ("" if rel == "." else rel),
         "require": "nothing", "authname": os.path.basename(root) or root,
         "authuserfile": os.path.join(root, ".htpasswd"),
         "error": None}
    with open(os.path.join(root, ".htaccess")) as f:
        for line in f:
            try:
                directive, arguments = line.strip().split(None, 1)
                r[directive.lower()] = arguments.strip().strip('"')
            except ValueError:
                pass  # blank line or directive without arguments
    if r["require"].startswith("user "):
        r["require"] = toUserList(r["require"][5:])
    elif r["require"].startswith("group "):
        r["require"] = groups(r["authgroupfile"], r["require"][6:])
    if r["require"] != "nothing" and r["error"] is None:
        r["authtype"] = r["authtype"].lower()
        r["backend"] = {"basic": "htpasswd",
                        "digest": "htdigest"}[r["authtype"]]
        print("""$HTTP["url"] =~ "%(url)s" {
  auth.backend = "%(backend)s"
  auth.backend.%(backend)s.userfile = "%(authuserfile)s"
  auth.require = ( "/" => (
    "method" => "%(authtype)s",
    "realm" => "%(authname)s",
    "require" => "%(require)s"
  ) )
}""" % r)

The htaccess2lighttpd.py script is available at http://www.packtpub.com/files/code/2103_Code.zip.

Note that the script has one limitation: Lighttpd does not handle groups. However, it allows us to specify a list of users directly, as in the user=andre|user=bob that we required for admin access. The other way is to have a separate password file for each group. The script takes the first way, which means that we need to re-run it each time a group membership changes. So this solution should only be temporary; the move to per-group access files can then be made without haste.
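
The second way, one password file per group, can be prepared with a similar helper. The sketch below only works on text, so it can be tried without touching any real files. The function names parse_groups and split_htpasswd are invented here, and it assumes the usual Apache group file format of "group: user1 user2" per line.

```python
def parse_groups(group_text):
    """Parse Apache group file text ("group: user1 user2") into {group: [users]}."""
    groups = {}
    for line in group_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, members = line.split(":", 1)
        groups[name.strip()] = members.split()
    return groups

def split_htpasswd(htpasswd_text, groups):
    """Return {group: htpasswd text containing only that group's users}."""
    entries = {}
    for line in htpasswd_text.splitlines():
        if ":" in line:
            entries[line.split(":", 1)[0]] = line   # user name -> full line
    return {
        group: "".join(entries[u] + "\n" for u in users if u in entries)
        for group, users in groups.items()
    }

if __name__ == "__main__":
    g = parse_groups("admins: andre bob\nguests: carol\n")
    for name, text in split_htpasswd("andre:H1\nbob:H2\ncarol:H3\n", g).items():
        print("# users for group", name)
        print(text, end="")
```

Each resulting text would be written to its own file and referenced with auth.backend.htpasswd.userfile inside a selector, with "require" => "valid-user" taking the place of the explicit user list.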

.htaccess and PHP

Apart from that, some users might put PHP options into their .htaccess files. Pierre-Alain Joye maintains the htscanner program to ease the transition. It is available at http://pecl.php.net/package/htscanner. This program basically moves PHP options from .htaccess files into the php.ini file.

Rewriting Rules

On the Lighttpd forums, most former Apache administrators ask how they can adapt their rewrite rules to work with Lighttpd. There is no program (yet) to do this, but here are some typical constructs and advice on how to do that in Lighttpd lingua:

# Loading the rewrite module
Apache:
    LoadModule "rewrite_module"
    RewriteEngine on
Lighttpd:
    server.modules = (..., "mod_rewrite", "mod_redirect", ...)

# A simple rewrite
Apache:
    RewriteRule ^/from_here(.*) /to_there$1
Lighttpd:
    url.rewrite = ( "^/from_here(.*)" => "/to_there$1" )

# Rewriting by host name
Apache:
    RewriteCond %{HTTP_HOST} me\..*
    RewriteRule ^/(.*) /me/$1
Lighttpd:
    $HTTP["host"] =~ "me\..*" {
        url.rewrite = ( "^/(.*)" => "/me/$1" )
    }

# Redirecting a single file
Apache:
    RewriteRule move.html target.html [R]
Lighttpd:
    url.redirect = ( "move.html" => "target.html" )

# Solving the trailing slash problem
Apache:
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule (.*) $1/
Lighttpd:
    # nothing to do here. Lighttpd does not have this problem.

# Redirecting failed web pages to xyz.com
Apache:
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.+) http://xyz.com/$1
Lighttpd:
    # use a CGI error page that redirects
    server.error-handler-404 = "/redirect.cgi"

# Time-based multiplexing
Apache:
    RewriteCond %{TIME_HOUR} >07
    RewriteCond %{TIME_HOUR} <19
    RewriteRule ^foo.html foo.day.html
    RewriteRule ^foo.html foo.night.html
Lighttpd:
    # either use mod_magnet or solve this outside of Lighttpd,
    # for example by using a cron job to set symbolic links.

# Rewrite for google bot
Apache:
    RewriteCond %{HTTP_USER_AGENT} Google
    RewriteRule ^(.+) /bots/$1
Lighttpd:
    # match for useragent
    $HTTP["useragent"] =~ "Google" {
        url.rewrite = ( "^/(.*)" => "/bots/$1" )
    }

# Rewrite by cookie (missing session)
Apache:
    RewriteCond %{HTTP_COOKIE} !sess
    RewriteRule ^(.+) index.php
Lighttpd:
    # use a negative regexp match
    $HTTP["cookie"] !~ "sess" {
        url.rewrite = ( "(.*)" => "/index.php" )
    }

# Set environment variable based on query
Apache:
    RewriteCond %{QUERY_STRING} id=([^&]*)
    RewriteRule ^(.*)$ /$1 [E=ID:%1]
Lighttpd:
    server.modules += ( "mod_setenv" )
    $HTTP["url"] =~ "[?&]id=([^&]*)" {
        setenv.add-request-header = ( "ID" => "%1" )
    }

# Block images by referer
Apache:
    RewriteCond %{HTTP_REFERER} !^$
    RewriteCond %{HTTP_REFERER} !my.net [NC]
    RewriteRule ^images/.*\.png$ - [F]
Lighttpd:
    # deny for non-empty outside referers
    $HTTP["referer"] !~ "^($|.*my.net)" {
        url.access-deny = ( ".png" )
    }


Naturally this table cannot cover all aspects of Apache rewrite rules, but remember that all complex systems have emerged from simple systems.
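
For the time-based multiplexing entry, the cron-job alternative could look like the following Python sketch. It assumes that foo.day.html and foo.night.html live in the same directory as the foo.html link, and it mirrors the hour window of the Apache conditions (later than 07 and earlier than 19). The function names are made up for this example.

```python
import os
import time

def variant_for_hour(hour):
    """Day page from 08:00 to 18:59, night page otherwise."""
    return "foo.day.html" if 7 < hour < 19 else "foo.night.html"

def switch_link(directory, hour=None):
    """Point foo.html at the day or night page; return the chosen target."""
    if hour is None:
        hour = time.localtime().tm_hour
    target = variant_for_hour(hour)
    link = os.path.join(directory, "foo.html")
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)           # clean up a leftover from an aborted run
    os.symlink(target, tmp)      # relative link inside the same directory
    os.replace(tmp, link)        # rename over the old link in one step
    return target
```

Run once an hour from cron, for example with `switch_link("/path/to/docroot")`, this keeps foo.html pointing at the right page without any rewrite rule; the docroot path here is a placeholder.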

WebDAV

Apache does WebDAV out of the box, while Lighttpd needs the mod_webdav module to support WebDAV, and it still has some rough edges. For example, Windows users will find that their mod_auth login does not work when they activate WebDAV; this can be compensated for by a cookie-based login. Oh, and if we built our Lighttpd from source, we need to have WebDAV support configured at compile time. The configuration, as always, is pretty straightforward:

server.modules += ( "mod_webdav" )
# activate WebDAV for the server "dav.my.net"
$HTTP["host"] == "dav.my.net" {
  webdav.activate = "enable"
  # enable writing for members only (identify by sess cookie)
  $HTTP["cookie"] !~ "sess" {
    $HTTP["url"] =~ "^/members/" {
      webdav.is-readonly = "enable"
    }
  }
}

The important directives here are webdav.activate and webdav.is-readonly. The former activates WebDAV if we set it to enable; WebDAV is deactivated by default. The latter forbids operations that modify files on the server (such as PUT and DELETE), and is disabled by default. So once we enable this option, PUT and DELETE requests are no longer served.
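
The nested selectors are easy to misread, so here is a small Python model of the decision they encode. It is a sketch under assumptions, not a description of mod_webdav internals: a missing Cookie header is treated as an empty string, only the two modifying methods named above are considered, and the function name webdav_decision is invented for this example.

```python
WRITE_METHODS = {"PUT", "DELETE"}   # the modifying methods named in the text

def webdav_decision(host, path, cookie, method):
    """Return "not-webdav", "read-only" or "allowed" for one request."""
    if host != "dav.my.net":
        return "not-webdav"         # webdav.activate keeps its default
    # is-readonly is enabled only without a sess cookie AND below /members/
    read_only = "sess" not in cookie and path.startswith("/members/")
    if read_only and method in WRITE_METHODS:
        return "read-only"
    return "allowed"

if __name__ == "__main__":
    print(webdav_decision("dav.my.net", "/members/a.txt", "", "PUT"))
    print(webdav_decision("dav.my.net", "/members/a.txt", "sess=abc", "PUT"))
```

As modelled, the read-only restriction applies only below /members/; paths outside it on dav.my.net stay writable, which is worth checking against what we actually intend.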

Summary

There are some obstacles on the way from Apache to Lighttpd, but a planned and careful approach will allow us to keep our server working while we change it. The .htaccess scanner script can be a stopgap measure to smooth the transition for .htaccess authentication users. Finally, heavy use of rewrite rules can make the move tricky. However, we can translate them one by one into something that will work with Lighttpd, especially when we add Lua to the mix.

About the Author

Andre Bogus

Andre Bogus is a musician turned programmer. He has worked in different jobs from voice acting to programming to teaching to managing software projects. At the moment he works as a consultant and implementer for KOGIT GmbH, an Identity Management company based in Germany.

He found Lighttpd while searching for the ideal software for his personal web server and quickly learned the tricks to make it do what he wanted. He enjoys learning new things and telling others about them. When his full schedule allows it, he can be found on the #lighttpd IRC channel.
