Now that we have set up Hadoop and written and run our first application on it, we can look at the concept of Monte Carlo simulators and at how to calculate Pi (π) using both Hadoop and MPI. This brings together and compares the two technologies we have explored in Chapters 2 through 6.
You're reading from Raspberry Pi Super Cluster
Monte Carlo simulators, also known as Monte Carlo methods, are a class of computational methods found in a variety of fields ranging from physics to finance.
Monte Carlo simulators use repeated random sampling to obtain a numerical result for a particular mathematical question.
The name is derived from Monte Carlo in Monaco. It originated with Manhattan Project participants Stanislaw Ulam and John von Neumann, in reference to a relative of Ulam who had a taste for gambling.
Calculating π is a problem especially well suited to this type of algorithm. An early example is Buffon's needle. You can read more about the history of this experiment at Wolfram MathWorld:
http://mathworld.wolfram.com/BuffonsNeedleProblem.html
To calculate π, we can also use another method, which involves a diagram of a circle inside a square that is divided into four quarters.
In this diagram we are interested in the top-right quarter of...
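The quarter-circle idea can be sketched in plain C. This is a minimal illustration under the usual formulation of the method, not the book's code: random points are thrown into the unit square, and the fraction that lands within distance 1 of the origin approximates π/4. The function name estimate_pi_quarter_circle, the seed parameter, and the use of rand() are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: estimates pi from `throws` random points in the
 * unit square (the top-right quarter of the diagram). A point (x, y)
 * lies inside the quarter circle of radius 1 when x*x + y*y <= 1. */
double estimate_pi_quarter_circle(int throws, unsigned int seed)
{
    int inside = 0;
    int i;

    assert(throws > 0); /* at least one throw is needed for an estimate */

    srand(seed);
    for (i = 0; i < throws; i++) {
        double x = (double)rand() / RAND_MAX;
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y <= 1.0) {
            inside++;
        }
    }

    /* area of quarter circle / area of square = pi / 4 */
    return 4.0 * inside / throws;
}
```

The more throws are made, the closer the result tends to get to π, although any single run can still land slightly above or below it.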
Hadoop comes packaged with a number of example applications. We are, of course, particularly interested in the program that calculates π.
The source code for this application can be downloaded from Apache's website at the following URL:
The JAR file containing the compiled class can be found on your machine at:
/home/pi/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar
Let's navigate to this directory:
cd ~/hadoop/hadoop-1.2.1
We are now going to run the example. The program takes two inputs: the number of maps and the number of samples. Try running the following demonstration:
hadoop jar hadoop-examples-1.2.1.jar pi 2 4
You should now see something similar to:
Number of Maps = 2
Samples per Map = 4
Wrote input for Map #0
Wrote input for Map #1
Starting Job
…
Job Finished in 273.167 seconds
Estimated value...
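The two arguments multiply: 2 maps with 4 samples each give only 8 sample points in total, so the estimate is coarse. As a rough sketch of a MapReduce-style combination step (written in C rather than Java, with hypothetical names, and not taken from the Hadoop source), each map reports how many of its points fell inside the circle, and a single reduce step sums those counts:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduce step: sums the per-map "inside" counts and turns
 * the total into an estimate of pi. Every map is assumed to have drawn
 * the same number of samples. */
double combine_map_counts(const long *inside_per_map, size_t maps,
                          long samples_per_map)
{
    long total_inside = 0;
    size_t i;

    assert(maps > 0 && samples_per_map > 0);

    for (i = 0; i < maps; i++) {
        total_inside += inside_per_map[i];
    }

    /* fraction of points inside approximates pi / 4 */
    return 4.0 * (double)total_inside
               / ((double)maps * (double)samples_per_map);
}
```

For example, with 2 maps of 4 samples each and made-up counts of 3 and 4 points inside, combine_map_counts returns 4 * 7 / 8 = 3.5.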
We have seen that we can calculate π with Hadoop. We can now try a similar application in C. The program we will now write generates results similar to those of the example program included with MPICH, and also uses a MapReduce-style approach.
Create a new file at the following location to store your code in:
~/mpich3/code/monte_carlo_pi.c
Open this file and add the following code:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

double insidecircle(int throws);

#define GAMES 20
#define THROWS 100
The previous block of code includes the necessary header files, declares a function, and defines two constants. The function insidecircle() will be responsible for calculating π.
The first constant, GAMES, is the number of games, that is, attempts at calculating π. The second, THROWS, defines the number of throws in each game. Now add the following code to the end of the file:
int main (int argc, char *argv[]) { double jobaverage, calcpi...
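The roles of GAMES and THROWS can be illustrated with a serial sketch that leaves MPI out entirely. It assumes that each game produces its own estimate from THROWS random throws and that the final answer is the mean over all games. The names one_game and average_games are hypothetical, and this is not the book's listing.

```c
#include <assert.h>
#include <stdlib.h>

#define GAMES 20
#define THROWS 100

/* Hypothetical single game: estimate pi from `throws` random points. */
static double one_game(int throws)
{
    int inside = 0;
    int i;

    assert(throws > 0);
    for (i = 0; i < throws; i++) {
        double x = (double)rand() / RAND_MAX;
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y <= 1.0) {
            inside++;
        }
    }
    return 4.0 * inside / throws;
}

/* Hypothetical driver: average the estimates of `games` games. */
double average_games(int games, int throws, unsigned int seed)
{
    double sum = 0.0;
    int g;

    assert(games > 0);
    srand(seed); /* seed once so the games use different random points */
    for (g = 0; g < games; g++) {
        sum += one_game(throws);
    }
    return sum / games;
}
```

Calling average_games(GAMES, THROWS, seed) runs 20 games of 100 throws each. In a parallel version, the games could be shared out among the processes and the partial results combined at the end, much like the reduce step in the Hadoop example.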
In this chapter, we brought together the technologies we have studied so far and compared them by looking at how each solves the same problem through parallel computing.
This problem was calculating π using a Monte Carlo-style simulator.
In the case of MPI, we wrote a small application in C, which gave us more exposure to programming parallel applications.
Now that you have a taste for how these two technologies can be used, you have the context to explore both Hadoop and MPI in more detail, including editing the C program and writing your own Java application.
In the next chapter, we shall look at further tasks that we can perform with our Raspberry Pi cluster.