You're reading from  HBase Administration Cookbook

Product type: Book
Published in: Aug 2012
Publisher: Packt
ISBN-13: 9781849517140
Edition: 1st Edition
Author (1)
Yifeng Jiang

Yifeng Jiang is a Hadoop and HBase Administrator and Developer at Rakuten, the largest e-commerce company in Japan. After graduating from the University of Science and Technology of China with a B.S. in Information Management Systems, he started his career as a professional software engineer, focusing on Java development. In 2008, he started looking into the Hadoop project. In 2009, he led the development of his previous company's display advertisement data infrastructure using Hadoop and Hive. In 2010, he joined his current employer, where he designed and implemented the Hadoop- and HBase-based, large-scale item ranking system. He is also a member of the company's Hadoop team, which operates several Hadoop/HBase clusters.
Quick start


HBase has two run modes: standalone and distributed. Standalone mode is the default. In standalone mode, HBase uses the local filesystem instead of HDFS, and runs all HBase daemons and an HBase-managed ZooKeeper instance in the same JVM.

This recipe describes the setup of a standalone HBase. It leads you through installing HBase, starting it in standalone mode, creating a table via HBase Shell, inserting rows, and then cleaning up and shutting down the standalone HBase instance.

Getting ready

You are going to need a Linux machine to run the stack. Running HBase on top of Windows is not recommended. We will use Debian 6.0.1 (Debian Squeeze) in this book, because we have several Hadoop/HBase clusters running on top of Debian in production at my company, Rakuten Inc., and 6.0.1 is the latest Amazon Machine Image (AMI) we have, at http://wiki.debian.org/Cloud/AmazonEC2Image.

As HBase is written in Java, you will need to have Java installed first. HBase runs on Oracle's JDK only, so do not use OpenJDK for the setup. Although Java 7 is available, we don't recommend using it yet, because it needs more time to be tested. You can download the latest Java SE 6 from the following link: http://www.oracle.com/technetwork/java/javase/downloads/index.html.

Execute the downloaded bin file to install Java SE 6. We will use /usr/local/jdk1.6 as JAVA_HOME in this book:

root# ln -s /your/java/install/directory /usr/local/jdk1.6

We will add a user with the name hadoop, as the owner of all HBase/Hadoop daemons and files. We will have all HBase files and data stored under /usr/local/hbase:

root# useradd hadoop
root# mkdir /usr/local/hbase
root# chown hadoop:hadoop /usr/local/hbase
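
Before moving on, it can help to confirm that the Java prerequisite is really in place. The following is a minimal sketch, not part of the original recipe; the function name check_java_home is hypothetical, and it only checks that an executable bin/java exists under the given directory (it does not verify the Java vendor or version).

```shell
#!/bin/sh
# check_java_home DIR (hypothetical helper, not shipped with HBase):
# succeed if DIR/bin/java exists and is executable, fail otherwise.
check_java_home() {
    dir="$1"
    if [ -x "$dir/bin/java" ]; then
        echo "OK: found $dir/bin/java"
        return 0
    fi
    echo "ERROR: no executable java under $dir/bin" >&2
    return 1
}

# Example, using the JAVA_HOME chosen in this book:
#   check_java_home /usr/local/jdk1.6
```

Because the check follows symbolic links, it works with the /usr/local/jdk1.6 link created above.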

How to do it...

Get the latest stable HBase release from HBase's official site, http://www.apache.org/dyn/closer.cgi/hbase/. At the time of writing this book, the current stable release was 0.92.1.

You can set up a standalone HBase instance by following these instructions:

  1. Download the tarball and decompress it into our root directory for HBase. We will set an HBASE_HOME environment variable to make the setup easier, by using the following commands:

    root# su - hadoop
    hadoop$ cd /usr/local/hbase
    hadoop$ tar xfvz hbase-0.92.1.tar.gz
    hadoop$ ln -s hbase-0.92.1 current
    hadoop$ export HBASE_HOME=/usr/local/hbase/current
    
  2. Set JAVA_HOME in HBase's environment setting file, by using the following command:

    hadoop$ vi $HBASE_HOME/conf/hbase-env.sh
    # The java implementation to use. Java 1.6 required.
    export JAVA_HOME=/usr/local/jdk1.6
    
  3. Create a directory for HBase to store its data and set the path in the HBase configuration file (hbase-site.xml), between the <configuration> tags, by using the following commands:

    hadoop$ mkdir -p /usr/local/hbase/var/hbase
    hadoop$ vi /usr/local/hbase/current/conf/hbase-site.xml
    <property>
      <name>hbase.rootdir</name>
      <value>file:///usr/local/hbase/var/hbase</value>
    </property>
    
  4. Start HBase in standalone mode by using the following command:

    hadoop$ $HBASE_HOME/bin/start-hbase.sh
    starting master, logging to /usr/local/hbase/current/logs/hbase-hadoop-master-master1.out
    
  5. Connect to the running HBase via HBase Shell, using the following command:

    hadoop$ $HBASE_HOME/bin/hbase shell
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 0.92.1, r1298924, Fri Mar 9 16:58:34 UTC 2012
    
  6. Verify HBase's installation by creating a table and then inserting some values. Create a table named test, with a single column family named cf1, as shown here:

    hbase(main):001:0> create 'test', 'cf1'
    0 row(s) in 0.7600 seconds
    

    i. In order to list the newly created table, use the following command:

    hbase(main):002:0> list
    TABLE
    test
    1 row(s) in 0.0440 seconds
    

    ii. In order to insert some values into the newly created table, use the following commands:

    hbase(main):003:0> put 'test', 'row1', 'cf1:a', 'value1'
    0 row(s) in 0.0840 seconds
    hbase(main):004:0> put 'test', 'row1', 'cf1:b', 'value2'
    0 row(s) in 0.0320 seconds
    
  7. Verify the data we inserted into HBase by using the scan command:

    hbase(main):005:0> scan 'test'
    ROW COLUMN+CELL
     row1 column=cf1:a, timestamp=1320947312117, value=value1
     row1 column=cf1:b, timestamp=1320947363375, value=value2
    1 row(s) in 0.2530 seconds
    
  8. Now clean up all that was done, by using the disable and drop commands:

    i. In order to disable the table test, use the following command:

    hbase(main):006:0> disable 'test'
    0 row(s) in 7.0770 seconds
    

    ii. In order to drop the table test, use the following command:

    hbase(main):007:0> drop 'test'
    0 row(s) in 11.1290 seconds
    
  9. Exit from HBase Shell using the following command:

    hbase(main):010:0> exit
    
  10. Stop the HBase instance by executing the stop script:

    hadoop$ /usr/local/hbase/current/bin/stop-hbase.sh
    stopping hbase.......
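
Step 3 shows only the property that goes between the <configuration> tags. As an illustration of what the complete file might look like, here is a minimal sketch that writes a whole hbase-site.xml from the shell. The function name write_standalone_site is hypothetical, the data directory is assumed to be an absolute path, and the function overwrites any existing hbase-site.xml in the given directory, so treat it as a sketch rather than a drop-in tool.

```shell
#!/bin/sh
# write_standalone_site CONF_DIR DATA_DIR (hypothetical helper):
# create DATA_DIR and write a minimal hbase-site.xml into CONF_DIR that
# points hbase.rootdir at DATA_DIR on the local filesystem.
# Assumes DATA_DIR is absolute; OVERWRITES CONF_DIR/hbase-site.xml.
write_standalone_site() {
    conf_dir="$1"
    data_dir="$2"
    mkdir -p "$data_dir"
    cat > "$conf_dir/hbase-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$data_dir</value>
  </property>
</configuration>
EOF
}

# Example, using the paths from step 3:
#   write_standalone_site /usr/local/hbase/current/conf /usr/local/hbase/var/hbase
```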

How it works...

We installed HBase 0.92.1 on a single server and created a symbolic link named current that points to it, so that future version upgrades are easy to do.
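
To illustrate why the symbolic link helps, here is a minimal sketch of how the link could be repointed once a newer release has been unpacked next to the old one. The function name switch_release is hypothetical, the sketch assumes GNU ln (as on Debian) for the -n option, and it assumes HBase has been stopped before the switch.

```shell
#!/bin/sh
# switch_release BASE RELEASE_DIR (hypothetical helper):
# repoint BASE/current at BASE/RELEASE_DIR using a relative link,
# in the same way as `ln -s hbase-0.92.1 current` in step 1.
switch_release() {
    base="$1"
    release="$2"
    if [ ! -d "$base/$release" ]; then
        echo "ERROR: $base/$release does not exist" >&2
        return 1
    fi
    # -f replaces an existing link; -n treats an existing `current` link
    # as a file, so no new link is created inside the old release directory.
    ( cd "$base" && ln -sfn "$release" current )
}

# Example (hbase-X.Y.Z stands for whichever release was unpacked):
#   switch_release /usr/local/hbase hbase-X.Y.Z
```

Since HBASE_HOME points at /usr/local/hbase/current, it does not need to change after the switch.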

To tell HBase where Java is installed, we set JAVA_HOME in hbase-env.sh, which is the environment setting file of HBase. You will see some Java heap and HBase daemon settings in it too. We will discuss these settings in the last two chapters of this book.
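
If you prefer not to edit the file by hand, the same setting can be applied from the shell. This is a minimal sketch under the assumption that an active setting appears as a line starting with "export JAVA_HOME="; the function name set_java_home is hypothetical.

```shell
#!/bin/sh
# set_java_home ENV_FILE JAVA_HOME_DIR (hypothetical helper):
# append an `export JAVA_HOME=...` line to ENV_FILE unless an
# uncommented one is already present. Commented-out lines are ignored.
set_java_home() {
    env_file="$1"
    java_home="$2"
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        echo "JAVA_HOME already set in $env_file; leaving it unchanged"
    else
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Example, using the paths from this recipe:
#   set_java_home "$HBASE_HOME/conf/hbase-env.sh" /usr/local/jdk1.6
```

Running the function a second time leaves the file untouched, so it is safe to include in a repeatable setup script.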

In step 3, we created a directory on the local filesystem for HBase to store its data. For a fully distributed installation, HBase needs to be configured to use HDFS instead of a local filesystem. The HBase master daemon (HMaster) is started on the server where start-hbase.sh is executed. As we did not configure any region servers here, HBase also starts a single slave daemon (HRegionServer) in the same JVM.

As we mentioned in the Introduction section, HBase depends on ZooKeeper as its coordination service. You may have noticed that we didn't start ZooKeeper in the previous steps. This is because HBase will start and manage its own ZooKeeper ensemble, by default.

Then we connected to HBase via HBase Shell. Using HBase Shell, you can manage your cluster, access data in HBase, and do many other jobs. Here, we just created a table called test, inserted data into it, scanned it, disabled and dropped it, and then exited the shell.
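
The same session can also be run non-interactively, because HBase Shell accepts the path of a script file as an argument. The sketch below writes the commands from this recipe to a file and runs them; the function names write_smoke_test and run_smoke_test are hypothetical, and the run step assumes HBASE_HOME is set as in step 1 and that HBase is already running.

```shell
#!/bin/sh
# write_smoke_test FILE (hypothetical helper):
# write the HBase Shell commands used in this recipe to FILE.
# The final `exit` makes the shell quit instead of staying interactive.
write_smoke_test() {
    cat > "$1" <<'EOF'
create 'test', 'cf1'
list
put 'test', 'row1', 'cf1:a', 'value1'
put 'test', 'row1', 'cf1:b', 'value2'
scan 'test'
disable 'test'
drop 'test'
exit
EOF
}

# run_smoke_test FILE (hypothetical helper):
# feed FILE to HBase Shell, or report why it cannot be run.
run_smoke_test() {
    script="$1"
    if [ -x "${HBASE_HOME:-}/bin/hbase" ]; then
        "$HBASE_HOME/bin/hbase" shell "$script"
    else
        echo "HBASE_HOME is not set or bin/hbase was not found; skipping" >&2
        return 1
    fi
}

# Example:
#   write_smoke_test /tmp/hbase-smoke.txt && run_smoke_test /tmp/hbase-smoke.txt
```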

HBase can be stopped using its stop-hbase.sh script. This script stops both HMaster and HRegionServer daemons.
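
One way to confirm that the instance has really stopped is to look for the HMaster process. The following is a minimal sketch, assuming the JDK's jps tool is available and that the standalone instance is listed under the name HMaster; the function name hmaster_running is hypothetical.

```shell
#!/bin/sh
# hmaster_running (hypothetical helper):
# read `jps`-style output ("<pid> <name>" per line) on standard input and
# succeed if a process named HMaster is listed.
hmaster_running() {
    grep -q '[[:space:]]HMaster$'
}

# Example, after running stop-hbase.sh:
#   if /usr/local/jdk1.6/bin/jps | hmaster_running; then
#       echo "HBase is still running"
#   else
#       echo "HBase has stopped"
#   fi
```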
