What is Folding @ Home?
Folding @ Home is a project started by Stanford University in late 2000. The idea is to enable distributed medical research by leveraging the computing power of the general public, instead of requiring the university to buy more and more computers. Instead of a single computer taking a million days to calculate a problem, distributing the research load lets many machines each handle a fraction of the work, so the same problem can be solved in ten days.
Contributing to Folding @ Home on Ubuntu is very simple, and can be done in just a few short steps. There are packages in the main Ubuntu repositories that support installation, management and removal of the Folding @ Home clients. Before I get into these, I'd like to outline what Folding @ Home actually does.
To understand how the system works we first need to understand a few key underlying principles. Perhaps the most important is to answer the question: What are proteins?
The Folding @ Home website defines proteins as:
Proteins are necklaces of amino acids --- long chain molecules. Proteins are the basis of how biology gets things done. As enzymes, they are the driving force behind all of the biochemical reactions which make biology work. As structural elements, they are the main constituent of our bones, muscles, hair, skin and blood vessels. As antibodies, they recognize invading elements and allow the immune system to get rid of the unwanted invaders. For these reasons, scientists have sequenced the human genome -- the blueprint for all of the proteins in biology -- but how can we understand what these proteins do and how they work?
In a nutshell, proteins are a critical part of everything around us, from our bones, muscles and hair to our immune system and everyday health. To improve our understanding of exactly how proteins work, the Folding @ Home project simulates the work of proteins in our bodies.
This brings us to the next question: why is it called folding? The Folding @ Home website continues:
However, only knowing this sequence tells us little about what the protein does and how it does it. In order to carry out their function (eg as enzymes or antibodies), they must take on a particular shape, also known as a "fold." Thus, proteins are truly amazing machines: before they do their work, they assemble themselves! This self-assembly is called "folding." One of our project goals is to simulate protein folding in order to understand how proteins fold so quickly and reliably, and to learn about what happens when this process goes awry (when proteins misfold).
Basically, contributing to this project can help simulate the folding of a protein and, hopefully, determine where and when proteins misfold. If this process of misfolding can be defined, perhaps it can be avoided. It is thought that diseases such as Alzheimer's, cystic fibrosis, Mad Cow and even many cancers are caused by misfolding proteins. Contributing to this project can help researchers make progress towards remedying these diseases.
I started contributing to the Folding @ Home project many years ago. Knowing that Alzheimer's disease is common in my family, it seemed like a very beneficial and relatable project to be working on. The best thing about it is that I can spend a few minutes setting up a client, and then I never have to think about it afterwards. The process of contributing is a "no-worry" process, one that allows me to contribute but doesn't get in my way.
Origami: Folding @ Home Made Easy
A few years after I started contributing I wrote a script to improve and automate the process of installing the Folding @ Home client on my machines. This little script has since turned into a full-blown program that is now included in the Ubuntu (and other) repositories. This program is called origami.
The goal of Origami is to make installation, automation and management of Folding @ Home clients as simple as possible. After installing the origami package it takes just one command to configure and install the research client. For those interested in contributing, here are a few basic steps.
To install the package from the Ubuntu repositories, simply use:
sudo aptitude install origami
For those on other distributions, Origami is also available in tar format from: http://zelut.org/projects/origami/ as well as on Launchpad. The latest bazaar trunk can be checked out using: bzr branch lp:origami.
If you've installed the package you're ready to go. If you downloaded the tar or checked out a branch from revision control, you simply need to copy the executable into your path:
sudo cp origami-*/origami /usr/bin/
Installation is then as simple as using the following command:
sudo origami install
Origami also supports a wide range of configuration options. The Folding @ Home system is set up to allow for user and team based contributions, so you can track your progress. You can also form or join teams and compete against other groups. If you'd like to contribute using a specific username or for a team, use the following installation syntax:
sudo origami install -u USERNAME -t TEAMNUMBER
For example, my username is 'Zelut' and I fold for TeamUbuntu, so I use:
sudo origami install -u Zelut -t 45104
Origami also allows you to specify pre-defined hours for the client to run. If you only want it to run outside business hours, add the cron option to the installer. This stops the client at 8:00 am and restarts it at 5:00 pm:
sudo origami install -u USERNAME -t TEAMNUMBER -c 1
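The article does not show what the cron option actually writes, so the following is only a sketch of how an equivalent schedule could be expressed in crontab syntax. The init script path /etc/init.d/origami is an assumption made for illustration and should be checked against your own installation.

```shell
#!/bin/sh
# Hypothetical crontab entries matching the schedule described above:
# stop the client at 8:00 am, start it again at 5:00 pm (17:00).
# The /etc/init.d/origami path is an assumption, not taken from Origami itself.
cron_entries=$(cat <<'EOF'
0 8 * * * /etc/init.d/origami stop
0 17 * * * /etc/init.d/origami start
EOF
)

# Print the entries for review; nothing is installed into any crontab.
printf '%s\n' "$cron_entries"
```

The five leading fields are minute, hour, day of month, month and day of week, so both entries run every day.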
You can also define the type of processor you have. The default is the 32-bit Folding @ Home client, but if you'd prefer to leverage your 64-bit processor you can use the proc option:
sudo origami install -u USERNAME -t TEAMNUMBER -p amd64
It is also possible to define proxy information as required. You would use the following:
sudo origami install -u USERNAME -t TEAMNUMBER -P port -H hostname
Replace 'port' with the port number and 'hostname' with your proxy server hostname or IP address.
You can also define the size of work unit you would like to be assigned:
sudo origami install -u USERNAME -t TEAMNUMBER -b (small|normal|big)
Select either small, normal or big and you will receive only work units of that size. Small units finish faster but earn fewer points for you or your team. These units are generally suited to slower processors. Normal is the default, and is a sane value for most installations. Big units take much longer to finish but earn big point rewards when completed. If you have a fast, new processor you might consider doing big work units.
Lastly, you can attach a unique PASSKEY to your Folding @ Home work, which allows you to verify that all of the work contributed under your username comes from your own clients. This is purely optional for local installations, but required for network-based deployments, which we'll discuss soon.
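If you script the installation, it may help to check the size value before handing it to the installer. The sketch below is one way to do that. The function name valid_wu_size is hypothetical, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Succeed only for the three sizes the article lists for the -b option.
valid_wu_size() {
    case $1 in
        small|normal|big) return 0 ;;
        *) return 1 ;;
    esac
}

size="big"
if valid_wu_size "$size"; then
    # Print the command instead of running it, so it can be reviewed first.
    echo "sudo origami install -u USERNAME -t TEAMNUMBER -b $size"
else
    echo "unknown work unit size: $size" >&2
fi
```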
To generate a PASSKEY for your username visit:
Just to be clear, it is perfectly reasonable to combine a number of these options into one line. Let's say I wanted to install to my local machine, using my username 'Zelut' and TeamUbuntu '45104', but I also wanted the 64-bit client, big work units and the cron job. The command would be:
sudo origami install -u Zelut -t 45104 -c 1 -p amd64 -b big
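For repeated installs, the combined command can be assembled from variables. This is a minimal sketch that uses only the flags described in this article (-u, -t, -c, -p, -b). The helper name build_origami_cmd is hypothetical, and the function only prints the command line without executing it.

```shell
#!/bin/sh
# Assemble an origami install command line from its parts.
# Arguments: username, team number, cron flag, processor type, unit size.
# The last three are optional; pass an empty string to leave one out.
build_origami_cmd() {
    user=$1
    team=$2
    cron=$3
    proc=$4
    size=$5
    cmd="sudo origami install -u $user -t $team"
    if [ -n "$cron" ]; then cmd="$cmd -c $cron"; fi
    if [ -n "$proc" ]; then cmd="$cmd -p $proc"; fi
    if [ -n "$size" ]; then cmd="$cmd -b $size"; fi
    echo "$cmd"
}

# Reproduces the combined example from the article.
build_origami_cmd Zelut 45104 1 amd64 big
```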
The beautiful thing about Origami is that it is set-and-forget. Unless I remove the Folding @ Home client from my machine I never need to consider all of the installation options again. I simply know that the client is running in the background, and will automatically run on each boot.
Network Installation & Management
Origami also supports network-based installation and management. If you have ssh access to a network of machines you can use the deploy option, along with any (or all) of the options above, to install the Folding @ Home client on remote machines. Note: the PASSKEY option is required for all network-based deployments.
To deploy Folding @ Home clients to machines on a network you'll need to define them first. Create a ~/.origamirc file and populate it with a list of hostnames or IPs (or a mix) that you'd like to install to, one per line. When the deploy option is run it will parse the file and attempt to install on each machine listed. Note that network deployment currently requires root access via ssh to each client installed.
An example ~/.origamirc file might look like:
[cedwards@daphne ~]$ cat .origamirc
As you can see, you can use a simple hostname, fully qualified domain or even an IP address. Any method that you might use to normally connect to that host is valid. These hosts don't even technically have to be within your local area network. As long as they are directly accessible via ssh they are considered valid by origami.
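For illustration, a host list mixing the three address styles could be created as follows. The three entries are hypothetical placeholders, not real machines, and the sketch writes to a scratch file rather than ~/.origamirc so that it cannot overwrite an existing list.

```shell
#!/bin/sh
# Write a sample host list to a scratch file. For real use the file
# would live at ~/.origamirc, with one host per line.
rc_file=$(mktemp)
cat > "$rc_file" <<'EOF'
workstation1
node2.example.com
192.0.2.15
EOF

# Count the hosts, ignoring any blank lines.
host_count=$(grep -c -v '^[[:space:]]*$' "$rc_file")
echo "$host_count host(s) listed in $rc_file"
```

Before deploying, it is worth confirming that you can reach each listed host over ssh in the usual way, since the article notes that deployment needs root access on every client.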
Once you have your Folding @ Home client installed it should automatically start along with the rest of your system services. To check its status you can use:
This will show you the status of your client, along with the defined username and team information and the completion percentage of any job(s) that you have running. These jobs are referred to as Work Units, or WU. The current status output on my machine looks like:
current status of origami on daphne
Status of FAH client(s): OK
Completed WU on CPU #1: 2
Completed WU on CPU #2: 2
Your Team: 45104
Your Username: Zelut
User ranking URL:
Current Work Unit
Download time: December 4 08:42:21
Due time: January 25 08:42:21
Progress: 94% [|||||||||_]
Current Work Unit
Download time: December 4 08:43:11
Due time: January 25 08:43:11
Progress: 94% [|||||||||_]
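If you save this status text and want just the percentages, for example to log them over time, a small filter is enough. The sketch below assumes the "Progress: NN%" line format shown in the output above. The function name progress_values is hypothetical.

```shell
#!/bin/sh
# Read status text on stdin and print one number per "Progress:" line.
progress_values() {
    sed -n 's/^[[:space:]]*Progress:[[:space:]]*\([0-9][0-9]*\)%.*/\1/p'
}

# Sample input copied from the two Progress lines in the status output.
sample='Progress: 94% [|||||||||_]
Progress: 94% [|||||||||_]'
printf '%s\n' "$sample" | progress_values
```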
If you'd like to check the progress of remote hosts installed via the deploy option, you can use:
Origami also allows you to save and restore your work. For example, perhaps you plan on reinstalling your machine but don't want to lose the current Work Unit progress you've made. You can archive your current progress and restore it when you're done with your reinstallation.
sudo origami archive
This function will archive your current work and settings into ~/.origami. Within this folder you should find a tar archive named after the hostname of the machine, along with the date stamp of the file. When you are ready to restore this archive, simply place it back into the ~/.origami directory on a new or reinstalled machine and run:
sudo origami restore
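Before reinstalling, you may want to copy the newest archive somewhere safe. The sketch below finds the most recently modified archive in a directory. The *.tar* file pattern is an assumption, since the article says only that the archive is a tar file named after the hostname and date. The helper name latest_archive is hypothetical.

```shell
#!/bin/sh
# Print the path of the most recently modified tar archive in a directory.
# Prints nothing if no matching file exists.
latest_archive() {
    dir=$1
    # ls -t sorts newest first; head keeps only the first entry.
    ls -t "$dir"/*.tar* 2>/dev/null | head -n 1
}

# Usage: show the newest archive in the directory Origami archives into.
latest_archive "$HOME/.origami"
```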
These functions also work for networked hosts:
These will store all archives on the central server and restore all archives from the central ~/.origami directory.
If you find that you no longer want to contribute to the Folding @ Home project you can safely remove all Folding data and Work Units using the erase function:
sudo origami erase
This command will remove all Folding @ Home data and leave only the origami package contents installed. You'll need to use your package manager to remove those (i.e., sudo aptitude remove origami).
To remove origami from network-managed machines you can use the armageddon command:
sudo origami armageddon
This will connect to each host listed in ~/.origamirc and run the erase tool. This will also run erase on the local machine. BE CAREFUL WITH THIS OPTION AS IT WILL REMOVE ALL FOLDING DATA, WORK UNIT PROGRESS AND SETTINGS!
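Given how destructive this is, it can be worth reviewing the host list first. The following sketch prints which machines would be affected, based on the behaviour described above (every host in the list, plus the local machine). It does not call origami, and the function name preview_hosts is hypothetical.

```shell
#!/bin/sh
# List every host in a host file, plus the local machine, so the
# targets can be reviewed before running a destructive command.
preview_hosts() {
    rc_file=$1
    if [ ! -r "$rc_file" ]; then
        echo "cannot read $rc_file" >&2
        return 1
    fi
    # The "|| [ -n ... ]" part also handles a final line with no newline.
    while IFS= read -r host || [ -n "$host" ]; do
        if [ -n "$host" ]; then
            echo "would erase Folding data on: $host"
        fi
    done < "$rc_file"
    echo "would erase Folding data on: $(uname -n) (local machine)"
}
```

Running preview_hosts "$HOME/.origamirc" prints one line per target without connecting to any of them.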
Origami supports quite a few more options than can be outlined in this article. See the 'origami help' output for more suggestions.
If you'd like to help support origami, or if you have questions about usage, or other Folding @ Home clients, consider joining us in the #ubuntu-folding channel on irc.freenode.net.
Folding @ Home is a worthwhile project that is making great progress toward medical research advancement. It is one simple thing that you can do to contribute positively to a wider community, and the contributions you make are greatly appreciated. I invite you to read more about Folding @ Home as well as Stanford's results and other papers at: http://folding.stanford.edu