Traditionally, web hosts have had a difficult time offering efficient, highly secure web space for a multitude of customers. Generally, a host will provide cheap accounts on a shared server and offer virtual machines as a more expensive option for the more security-conscious site owners. In this article, Joshua Kramer will explain how to provide highly secure hosting for Python-based web applications in an efficient manner. With the popularity of applications such as Trac, Django, and TurboGears, Python-based web applications will become more prevalent in the future, and the concepts presented in this article will become more valuable.
When contemplating the security of a web application, there are several attack vectors that you must consider. An outsider may attack the operating system by planting a remote exploit, abusing insecure operating system settings, or using some other method of privilege escalation. Or the outsider may attack other sites hosted on the same server without escalating privileges. (Note that this discussion does not cover attacks that steal data from a single site. Instead, I'm focusing on the ability to attack different applications on the same server.) For hosts that provide space for large numbers of PHP-based sites, security can be difficult because the httpd daemon traditionally runs under the same Unix user for all sites.
In order to prevent these kinds of attacks from occurring, you need to concentrate on two areas:
- Preventing the site from reading or modifying the data of another site, and
- Preventing the site from escalating privileges to tamper with the operating system and bypass user-based restrictions.
There are two toolboxes you can use to accomplish this. In the first case, you need to find a way to run all of your sites under different Linux users. This allows the traditional Linux filesystem security model to protect against a hacked site attacking other sites on the same server. In the second case, you need to find a way to prevent a privilege escalation to begin with and, barring that, to prevent damage to the operating system should an escalation occur. Let's first take a look at a method to run different sites under different users.
Python web applications can be run in several ways. Three methods are common: first, using Python's built-in HTTP server; second, running the script as a CGI application; and third, using mod_python under Apache (similar to what mod_perl and mod_php do). Each method has a disadvantage: respectively, a lack of scalability, performance issues due to CGI application loading, and the aforementioned “all sites under one user” problem.
To provide a scalable, secure, high-performance framework, you can turn to a relatively new delivery method: mod_wsgi. This Apache module, created by Graham Dumpleton, provides several methods by which you can run Python applications. In this case, we'll be focusing on the “daemon” mode of mod_wsgi. Much like mod_python, the daemon mode of mod_wsgi embeds a Python interpreter (and the requisite script) into a httpd instance. Much like with mod_python, you can configure sites based on mod_wsgi to appear at various locations in the virtual directory tree and under different virtual servers. You can also configure the number and behavior of child daemons on a per-site basis. However, there is one important difference: with mod_wsgi, you can configure each httpd instance to run as a different Linux user. During operation, the main httpd instance dispatches requests to the already-running mod_wsgi children, producing performance results that rival mod_python. But most importantly, since each httpd instance is running under a different Linux user, you can apply Linux security mechanisms to different sites running on one server.
Once you have your sites running on a per-user basis, you should next turn your attention to preventing privilege escalation and protecting the operating system. By default, the Targeted mode of SELinux provided by RedHat Enterprise Linux 5 (and its free cousins such as CentOS) provides strong protection against intrusions from httpd-based applications. Because of this, you will need to configure SELinux to allow access to resources such as databases and files that reside outside of the normal httpd directories.
To illustrate these concepts, I'll guide you as you install a Trac instance under mod_wsgi. The platform is CentOS 5. As a side note, it's highly recommended that you perform the installation and SELinux debugging in a XEN instance so that your environment only contains the software that is needed. The sidebar explains how to easily install the environment that was originally used to perform this exercise, and I will assume that is your primary environment. There are a few steps that require the use of a C compiler – namely, the installation of Trac – and I'll guide you through migrating these packages to your XEN-based test environment.
In this example, you'll use a standard installation of Trac. Following the instructions provided in the URL in the Resource section, begin by installing Trac 0.10.4 with ClearSilver 0.10.5 and SilverCity 0.9.7. (Note that with many Python web applications such as Trac and Django, “installing” the application means that you're actually installing the libraries necessary for Python to run the application. You'll need to run a script to create the actual site.)
Next, create a PostgreSQL user and database on a different machine. If you are using XEN for your development machine, you can use a PostgreSQL database running in your main DOM0 instance; all we are concerned with is that the PostgreSQL instance is accessed on a different machine over the network. (Note that MySQL will also work in this example, but SQLite will not. In this case, we need a database engine that is accessed over the network, not as a disk file.)
After that's done, you'll need to create an actual Trac site. Create a directory under /opt, such as /opt/trac. Next, run the trac-admin command and enter the information when prompted.
trac-admin /opt/trac initenv
You can find mod_wsgi at the source listed in the Resources. After you make sure the httpd-devel package is installed, installing mod_wsgi is as simple as extracting the tarball and issuing the usual ./configure, make, and 'make install' commands.
Running Trac under mod_wsgi
If you look under /opt/trac, you'll notice two directories: one labeled apache, and one with the label of the project that you assigned when you installed this instance of Trac. You'll start by creating an application script in the apache directory. The application script is listed in Listing 1.
Listing 1: /opt/trac/apache/trac.wsgi
import os
import sys
import trac.web.main
sys.stdout = sys.stderr
os.environ['TRAC_ENV'] = '/opt/trac/test_proj'
application = trac.web.main.dispatch_request
(Note the 'sys.stdout = sys.stderr' line. This is necessary due to the way WSGI handles communications between the Python script and the httpd instance. If there is any code in the script that prints to STDOUT (such as debug messages), then the httpd instance can crash.)
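For readers new to WSGI, the application name that mod_wsgi looks for is simply a callable that takes the request environment and a start_response function. The sketch below is a minimal stand-in written for illustration, not Trac's dispatcher; the response text and the hand-built environment in call_app are assumptions made only so the example can be run without a web server.

```python
def application(environ, start_response):
    """Tiny WSGI app: echoes the requested path as plain text."""
    path = environ.get("PATH_INFO", "/")
    body = ("You asked for %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path):
    """Invoke a WSGI app by hand and return (status, body bytes)."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app reports instead of writing to a socket.
        seen["status"] = status
        seen["headers"] = headers

    chunks = app({"REQUEST_METHOD": "GET", "PATH_INFO": path}, start_response)
    return seen["status"], b"".join(chunks)
```

Calling call_app(application, "/trac") exercises the same interface that the server uses when it hands a request to the script.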
After creating the application script, you'll modify httpd.conf to load the wsgi module and set up the Trac application. After the LoadModule lines, insert a line for mod_wsgi:
LoadModule wsgi_module modules/mod_wsgi.so
Next, go to the bottom of httpd.conf and insert the text in Listing 2. This text configures the wsgi module for one particular site; it can be used under the default httpd configuration as well as under VirtualHost directives.
Listing 2: Excerpt from httpd.conf:
WSGIDaemonProcess trac user=trac_user group=trac_user threads=25
WSGIScriptAlias /trac /opt/trac/apache/trac.wsgi
<Directory /opt/trac/apache>
    WSGIProcessGroup trac
    Allow from all
</Directory>
Note the WSGIScriptAlias directive. Its first parameter, /trac, specifies where in the URL tree the application will appear. With this configuration, if you go to your server's root address, you'll see the default CentOS splash page. If you add /trac after the address, you'll hit your Trac instance.
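The same pattern extends to several sites, each with its own daemon process and Linux user. As a rough sketch, the helper below generates the directives for a list of sites. The second site (wiki, wiki_user, /opt/wiki) is entirely hypothetical, and the Directory wrapper with WSGIProcessGroup reflects one common way to bind a script to its daemon process group; adjust it to your own layout.

```python
def wsgi_site_config(name, user, mount, script):
    """Return the httpd.conf lines for one daemon-mode mod_wsgi site."""
    script_dir = script.rsplit("/", 1)[0]
    return "\n".join([
        "WSGIDaemonProcess %s user=%s group=%s threads=25" % (name, user, user),
        "WSGIScriptAlias %s %s" % (mount, script),
        "<Directory %s>" % script_dir,
        "    WSGIProcessGroup %s" % name,
        "    Allow from all",
        "</Directory>",
    ])


# Hypothetical sites: only the first one appears in this article.
SITES = [
    ("trac", "trac_user", "/trac", "/opt/trac/apache/trac.wsgi"),
    ("wiki", "wiki_user", "/wiki", "/opt/wiki/apache/wiki.wsgi"),
]

if __name__ == "__main__":
    for site in SITES:
        print(wsgi_site_config(*site))
        print()
```

Because each generated block names a different user, each site's daemon process ends up under a separate account, which is the property the rest of this article relies on.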
Save the httpd.conf file. Finally, add a Linux user called trac_user. It is important that this user should not have login privileges. When the root httpd instance runs and encounters the WSGIDaemonProcess directive noted above, it will fork itself as the user specified in the directive; the fork will then load Python and the indicated script.
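One way to confirm that the daemon process really runs as the intended user is to serve a throwaway WSGI script that reports its own effective user. This is a diagnostic sketch only, written for a current Python interpreter rather than the Python 2.4 that ships with CentOS 5, and it is not part of Trac; mounting it with its own WSGIScriptAlias is an assumption on my part.

```python
import os
import pwd


def current_user():
    """Name of the account the current process is effectively running as."""
    uid = os.geteuid()
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this uid: fall back to the numeric id.
        return str(uid)


def application(environ, start_response):
    """WSGI app that reports the effective user of the serving process."""
    body = ("running as %s\n" % current_user()).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the directive is working, the page should name trac_user rather than the account that the main httpd instance uses.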
Securing Your Site
In this section, I'll focus on the two areas noted in the introduction: user-based security and SELinux. I will touch briefly on the theory of SELinux and explain the nuts and bolts of this particular implementation in more depth. I highly recommend that you read the RedHat Enterprise Linux Deployment Guide for the particulars about how RedHat implements SELinux. As with all activities involving some risk, if you plan to implement these methods, you should retain the services of a qualified security consultant to advise you about your particular situation.
Setting up the user-based security is not difficult. Because the httpd instance containing Python and the Trac instance will run as trac_user, you can safely set everything under /opt/trac/test_proj to read (plus execute for directories) for the owning user and no access for group and others. By doing this, you isolate this site from other sites and users on the system.
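The permission scheme just described can be sketched in a few lines of Python. This is an illustration under stated assumptions: the tree is already owned by the site's user, symbolic links are skipped, and the function name lock_down is my own.

```python
import os
import stat

DIR_MODE = stat.S_IRUSR | stat.S_IXUSR   # 0500: owner may list and enter
FILE_MODE = stat.S_IRUSR                 # 0400: owner may read


def lock_down(root):
    """Restrict everything under root to its owner; return paths changed."""
    changed = 0
    # Walk bottom-up so children are handled before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # chmod would follow the link; leave it alone
            os.chmod(path, FILE_MODE)
            changed += 1
        os.chmod(dirpath, DIR_MODE)
        changed += 1
    return changed
```

Run it as root or as the owning user, for example lock_down("/opt/trac/test_proj"). If the application needs to write into part of the tree, those paths would need owner write permission as well.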
Now, let's configure SELinux. First, you should verify that your system is running the proper Policy and Mode. On your development system, you'll be using the Targeted policy in its Permissive mode. If you choose to move your Python applications to a production machine, you would run under the Targeted policy in Enforcing mode. The Targeted policy is limited to protecting the most popular network services without making the system so complex as to prevent user-level work from being done. It is the default policy in RedHat Enterprise Linux 5 and, by extension, CentOS 5. In Permissive mode, SELinux policy violations are trapped and sent to the audit log, but the behavior is allowed. In Enforcing mode, the violation is trapped and the behavior is not allowed. To verify the Mode, run the Security Level Configuration tool from the Administration menu. The SELinux tab, shown in Figure 1, allows you to adjust the Mode.
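On a machine without a graphical desktop, the getenforce command reports the same information. The Mode can also be read from a script; the sketch below assumes the SELinux pseudo-filesystem is mounted at /selinux (as on CentOS 5) or at /sys/fs/selinux (as on newer kernels), and treats a missing file as SELinux being disabled.

```python
# Assumed mount points for the SELinux pseudo-filesystem.
ENFORCE_FILES = ("/selinux/enforce", "/sys/fs/selinux/enforce")


def selinux_mode(candidates=ENFORCE_FILES):
    """Return 'Enforcing', 'Permissive', or 'Disabled'."""
    for path in candidates:
        try:
            with open(path) as handle:
                flag = handle.read().strip()
        except (IOError, OSError):
            continue  # not mounted here; try the next location
        return "Enforcing" if flag == "1" else "Permissive"
    return "Disabled"


if __name__ == "__main__":
    print(selinux_mode())
```

A deployment script could refuse to continue on a production machine unless this returns Enforcing.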
After you have verified that SELinux is running in Permissive mode, you need to do two things. First, you need to change the Type of the files under /opt/trac. Second, you need to allow Trac to connect to the Postgres database that you configured when you installed Trac.
First, you need to tweak the SELinux file types attached to the files in your Trac instance. These file types dictate which processes are allowed to access them. For example, /etc/shadow has a very restrictive type, shadow_t, that only allows a few applications to read and write it. By default, SELinux expects web-based applications (indeed, anything using Apache) to reside under /var/www. Files created under this directory have the SELinux Type httpd_sys_content_t. When you created the Trac instance under /opt/trac, the files were created as type usr_t. Figure 2 shows the difference between these labels.
To properly label the files under /opt/trac, issue the following commands as root:
cd /opt
chcon -R -t httpd_user_content_t trac/
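You can inspect the result with ls -Z, which prints each file's security context. The Type is the third colon-separated field of that context. As a small sketch, the helpers below pull the Type out of a context string and, on Linux with a current Python interpreter, read it directly from the security.selinux extended attribute; the function names are my own.

```python
import os


def context_type(context):
    """Extract the type field from 'user:role:type[:level]'."""
    fields = context.strip().strip("\x00").split(":")
    if len(fields) < 3:
        raise ValueError("not an SELinux context: %r" % context)
    return fields[2]


def file_type(path):
    """Return the SELinux type of path, or None if no label can be read."""
    try:
        raw = os.getxattr(path, "security.selinux")
    except (OSError, AttributeError):
        # No label, no SELinux, or a platform without os.getxattr.
        return None
    return context_type(raw.decode("ascii", "replace"))
```

After relabeling, file_type("/opt/trac/apache/trac.wsgi") would be expected to report the Type passed to chcon.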
After the file types are configured, there is one final step to do: allow Trac to connect to PostgreSQL. In its default state, SELinux disallows outbound network connections for the httpd type. To allow database connections, issue the following command:
setsebool -P httpd_can_network_connect_db=1
In this case, we are using the -P option to make this setting persistent. If you omit this option, then the setting will be reset to its default state upon the next reboot.
After the setsebool command has been run, start HTTPD by issuing the following command:
/sbin/service httpd start
If you visit the URL http://127.0.0.1/trac, you should see a Trac screen like the one in Figure 3.
Once you have Trac up and running, it's time to see what happens when there is an access violation. Enter and compile the program in Listing 3. We're compiling a C program because we want to simulate a web application gaining root access. The easiest way to do this is to use a C program, because Linux does not honor the SUID bit on shell scripts.
Listing 3: evil_app.c
#include <stdlib.h>
int main(void) {
    system("/sbin/ifconfig > /opt/trac/test_proj/box_info.txt");
    system("cp /etc/shadow /opt/trac/test_proj/shadow");
}
Now, copy the program to your web server under the directory /opt/trac/apache. Set the owner to root, and set the set-user-ID (SUID) bit (via “chmod +s evil_app”). Finally, alter the Trac libraries to call the program. As I noted above, the Trac installation is stored under the Python directory tree. Therefore, you need to modify the file /usr/lib/python2.4/site-packages/trac/loader.py. Add the following line near line 47, before the 'for' statement related to plugins_dirs:
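The call can be as short as a single os.system invocation on the binary. The sketch below is a slightly more defensive variant and is an assumption on my part, not the exact line used when this exercise was first run; it presumes the binary sits at /opt/trac/apache/evil_app, following the copy step above, and the helper name is hypothetical.

```python
import os

# Assumed location, following the copy step described above.
EVIL_APP = "/opt/trac/apache/evil_app"


def run_helper(path=EVIL_APP):
    """Run the helper binary if present; return its raw exit status, else None."""
    if not os.access(path, os.X_OK):
        return None
    return os.system(path)
```

Inside loader.py, calling run_helper() at the point indicated triggers the simulated attack the next time Trac loads its plugins.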
Next, restart the web server and reload the page in your browser. If the server is running the monitor daemon and a graphical user interface, the currently logged-in user will see a violation notification appear on-screen. When you click on the notification, you will see additional details with instructions indicating how to configure SELinux to allow the trapped behavior, if that is what's desired. This Audit Log Analysis application is seen in Figure 4 displaying a warning about the line we noted above.
If you scroll down the list, you will also notice messages about the system denying permission for ifconfig to access the files under /proc. If this had been run on a properly-configured production server, the warnings noted in the troubleshooter would have resulted in a denial of access to the protected services.
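On a server without the graphical troubleshooter, the same denials can be picked out of the audit log, which on this platform normally lives at /var/log/audit/audit.log. The following is a rough sketch that extracts the key fields from AVC denial records; it assumes the usual key=value layout of those records and will skip lines that omit any of the fields it looks for.

```python
import re

# Matches the fields of interest in an "avc: denied" audit record.
AVC_DENIAL = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}'
    r'.*?\bcomm="(?P<comm>[^"]*)"'
    r'.*?\bscontext=(?P<scontext>\S+)'
    r'.*?\btcontext=(?P<tcontext>\S+)'
    r'.*?\btclass=(?P<tclass>\S+)'
)


def parse_denial(line):
    """Return a dict describing one AVC denial, or None for other lines."""
    match = AVC_DENIAL.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["perms"] = info["perms"].split()
    return info


def denials(lines):
    """Collect every AVC denial found in an iterable of log lines."""
    return [d for d in map(parse_denial, lines) if d is not None]
```

Passing an open audit log file to denials() yields one dictionary per denial, with the command name in comm and the source and target contexts in scontext and tcontext.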
That's it! This article described the basic steps to create a highly-secure Python-based web application. As with any unfamiliar technology, you should consult an expert or read the materials described in the resources if you plan to implement these methods on a production system.
When configuring applications to run with SELinux, it is highly recommended that you use a XEN instance of the operating system instead of your main development machine. When you use a XEN instance, you can install only the packages that will either be present on the production machine or are necessary for installation to the production machine. Also, as I have found, when SELinux becomes misconfigured, it is much easier to restore the pristine state of the machine by copying a disk file instead of attempting to re-install the operating system. There is an excellent tutorial describing how to install a Xen guest referenced in the resources section.
In the case of the webdev1 VM used in this article - my “staging” machine - I installed the minimum necessary Server and Graphical Environment packages. After I compiled and installed the requisite mod_wsgi, ClearSilver, SilverCity, and Trac packages, and the “evil_app.c” program on my development machine, I migrated them manually to webdev1. The files are as follows:
- the SilverCity directory
- the Trac directory
About the Author :
Joshua Kramer is taking his final class towards a Philosophy degree at Capital University in Bexley, Ohio. He blogs about all manner of topics, from car repair, to economic theory, to computer subjects. When he's not using Linux (as he has since Kernel version 1.2.8 in 1995) to do productive things, he is doing productive things at his employer, Belron US. Joshua likes to make complex ideas easy to understand so that those ideas have the highest practical use possible. His blog can be found at http://www.globalherald.net/jb01.