Raspberry Pi Super Cluster

Product type Book
Published in Nov 2013
Publisher Packt
ISBN-13 9781783286195
Pages 126 pages
Edition 1st Edition
Author Andrew K. Dennis

Chapter 4. Hadoop – Distributed Applications on the Raspberry Pi

In Chapter 1, Clusters, Parallel Computing, and Raspberry Pi, we touched upon the technology known as Apache Hadoop.

In this chapter, we will explore the subject in more detail. This includes setting up Hadoop so that we can write distributed applications for the Raspberry Pi in Java using the MapReduce paradigm.

We will start with a brief introduction to Hadoop and then walk you through the installation and the configuration process for a test cluster.

A brief introduction to Apache Hadoop


Apache Hadoop is an open-source framework, hosted by the Apache Software Foundation, for developing distributed applications. The framework contains a number of subprojects. The one we are interested in is the Hadoop Core, also known as Hadoop Common.

Hadoop Common allows the development of cloud computing environments on off-the-shelf hardware such as the Raspberry Pi. The developer interacts with it through its Java-based API.

Within Hadoop Common there are several significant areas that help us achieve our goal of developing parallel computing applications. Two of the most important areas are as follows:

  • Hadoop MapReduce environment

  • Hadoop Distributed File System (HDFS)

In this chapter, both subjects will be touched upon during the installation and setup process. Chapter 5, MapReduce Applications with Hadoop and Java, provides an in-depth look at the HDFS and MapReduce...

Installing Java


Java, as you may know, is an object-oriented programming (OOP) language whose syntax has its roots in the C and C++ programming languages. It is also the language we will be using to interact with the Hadoop framework.

We will start by installing Java on our Master Raspberry Pi. Make sure you are logged in to this machine.

The latest version of the Java Development Kit (JDK) can be found on the Oracle website at the following URL:

http://www.oracle.com/technetwork/java/javase/downloads/index.html

Note that some Java builds might not be compatible with your Raspberry Pi, depending, for example, on whether the build is hard-float or soft-float and which of the two your operating system uses.

For updates, you can refer to the eLinux.org RPi Java JDK installation guide, available at the following link:

http://elinux.org/RPi_Java_JDK_Installation

Tip

Remember that you can update your Raspberry Pi's package list by running sudo apt-get update.

To install the OpenJDK 7 JDK from the package repository, run the following command:

sudo apt-get install openjdk-7-jdk

You will now see the process running in...
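Once the installation has completed, a quick check that the JDK is usable can save trouble later. The following is a minimal sketch, not part of the original walkthrough: the helper name tool_status is hypothetical, and the script only assumes that the JDK package places java and javac on your PATH.

```shell
#!/bin/sh
# Print "found" or "missing" depending on whether a command is on the PATH.
# tool_status is a hypothetical helper name used only for this illustration.
tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# Check the runtime (java) and the compiler (javac) that a JDK should provide.
for tool in java javac; do
    echo "$tool: $(tool_status "$tool")"
done

# If the runtime is present, print its version (java writes this to standard error).
if [ "$(tool_status java)" = "found" ]; then
    java -version
fi
```

If javac is reported as missing while java is found, it is likely that only a runtime was installed rather than the full development kit.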

Installing Apache Hadoop


To install Hadoop, we first need to locate, on the Apache website, the tar.gz file that contains the most recent stable release.

Before downloading this file, you should create a directory on your Raspberry Pi to hold both the file and your Hadoop projects.

Under your /home/pi directory, create the hadoop folder using the following command:

mkdir hadoop

Next, navigate into this directory using the cd command:

cd hadoop
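The two steps above can also be combined so that they work from any starting directory and can safely be repeated. This is a small sketch under that assumption; the function name enter_workdir is hypothetical and not something the book defines.

```shell
#!/bin/sh
# Create a working directory if it does not exist yet, then change into it.
# mkdir -p does not fail when the directory is already present, so the
# function can be called again without harm.
# enter_workdir is a hypothetical name chosen for this illustration.
enter_workdir() {
    mkdir -p "$1" && cd "$1"
}
```

When logged in as the pi user, calling enter_workdir "$HOME/hadoop" should leave you in /home/pi/hadoop. The function needs to be defined in your current shell session (for example, pasted at the prompt), because a cd performed inside a separately executed script does not change the directory of the shell that launched it.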

Now that we have a place to store our code, we can grab the latest version of Hadoop at the following link:

http://www.apache.org/dyn/closer.cgi/hadoop/common

Using wget, we will download the tar.gz file you selected on the download page. The following command illustrates this process:

wget http://apache.osuosl.org/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz

Remember to replace the URL with the mirror you selected from the download page and the version number (in our example 1.2.1) with the one you have chosen.
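One way to make this substitution less error-prone is to keep the mirror and the version in variables and build the URL from them. The sketch below assumes your mirror uses the same path layout as the example URL (hadoop/common/hadoop-VERSION/hadoop-VERSION.tar.gz); the names MIRROR, VERSION, hadoop_url, and fetch_hadoop are hypothetical and chosen only for this illustration.

```shell
#!/bin/sh
# Defaults taken from the example command; replace both with your own choices.
# MIRROR is the base address of the mirror, without a trailing slash.
MIRROR="${MIRROR:-http://apache.osuosl.org}"
VERSION="${VERSION:-1.2.1}"

# Build the tarball URL from a mirror base ($1) and a version number ($2).
# Assumes the mirror follows the same directory layout as the example URL.
hadoop_url() {
    echo "$1/hadoop/common/hadoop-$2/hadoop-$2.tar.gz"
}

# Download the tarball into the current directory using wget.
fetch_hadoop() {
    wget "$(hadoop_url "$MIRROR" "$VERSION")"
}
```

Running fetch_hadoop from inside your hadoop directory should then download the archive there. To use another mirror or release, set the two variables first, for example MIRROR=... VERSION=... before defining or calling the functions.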

Once the file has finished...

Summary


In this chapter we introduced you to Apache Hadoop. You then installed Java and set up your first Raspberry Pi with the Hadoop software. Once this was done, you completed the setup of your second Raspberry Pi by installing Java and Hadoop.

With our environment ready to go, we can now explore Java-based MapReduce applications. The coming chapter will introduce you to this very topic and fill in some more detail on Hadoop Common.
