
You're reading from HBase Administration Cookbook

Product type: Book

Published in: Aug 2012

Publisher: Packt

ISBN-13: 9781849517140

Edition: 1st Edition

Tools

Hadoop HBase

Concepts

Database Administration

Author (1)

Yifeng Jiang

Quick start

HBase has two run modes—standalone mode and distributed mode. Standalone mode is the default mode of HBase. In standalone mode, HBase uses a local filesystem instead of HDFS, and runs all HBase daemons and an HBase-managed ZooKeeper instance, all in the same JVM.

This recipe describes the setup of a standalone HBase. It leads you through installing HBase, starting it in standalone mode, creating a table via HBase Shell, inserting rows, and then cleaning up and shutting down the standalone HBase instance.

Getting ready

You are going to need a Linux machine to run the stack. Running HBase on top of Windows is not recommended. We will use Debian 6.0.1 (Debian Squeeze) in this book, because we have several Hadoop/HBase clusters running on top of Debian in production at my company, Rakuten Inc., and 6.0.1 is the latest Amazon Machine Image (AMI) we have, at http://wiki.debian.org/Cloud/AmazonEC2Image.

As HBase is written in Java, you will need to have Java installed first. HBase runs on Oracle's JDK only, so do not use OpenJDK for the setup. Although Java 7 is available, we don't recommend using it yet because it needs more time to be tested. You can download the latest Java SE 6 from the following link: http://www.oracle.com/technetwork/java/javase/downloads/index.html.

Execute the downloaded bin file to install Java SE 6. We will use /usr/local/jdk1.6 as JAVA_HOME in this book:

root# ln -s /your/java/install/directory /usr/local/jdk1.6
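Before going further, it can help to confirm that the symbolic link really points at a usable Java installation. The following is a minimal sketch of such a check; `check_java_home` is a hypothetical helper written for this illustration and is not part of HBase or the JDK:

```shell
#!/bin/sh
# Sketch: confirm that a directory looks like a usable JAVA_HOME.
# check_java_home is a hypothetical helper, not part of HBase or the JDK.
check_java_home() {
    java_home="$1"
    if [ -x "$java_home/bin/java" ]; then
        echo "ok: $java_home/bin/java is executable"
        return 0
    fi
    echo "error: no executable found at $java_home/bin/java" >&2
    return 1
}

# Example, using the path chosen in this recipe:
#   check_java_home /usr/local/jdk1.6 && /usr/local/jdk1.6/bin/java -version
```

The function only tests that `bin/java` exists and is executable; running `java -version` afterwards, as in the commented example, shows which release was actually installed.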

We will add a user with the name hadoop, as the owner of all HBase/Hadoop daemons and files. We will have all HBase files and data stored under /usr/local/hbase:

root# useradd hadoop
root# mkdir /usr/local/hbase
root# chown hadoop:hadoop /usr/local/hbase

How to do it...

Get the latest stable HBase release from HBase's official site, http://www.apache.org/dyn/closer.cgi/hbase/. At the time of writing this book, the current stable release was 0.92.1.

You can set up a standalone HBase instance by following these instructions:

1. Download the tarball and decompress it to our root directory for HBase. We will set an HBASE_HOME environment variable to make the setup easier, by using the following commands:
```
root# su - hadoop
hadoop$ cd /usr/local/hbase
hadoop$ tar xfvz hbase-0.92.1.tar.gz
hadoop$ ln -s hbase-0.92.1 current
hadoop$ export HBASE_HOME=/usr/local/hbase/current
```

2. Set JAVA_HOME in HBase's environment setting file, by using the following command:

hadoop$ vi $HBASE_HOME/conf/hbase-env.sh
# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/local/jdk1.6

3. Create a directory for HBase to store its data and set the path in the HBase configuration file (hbase-site.xml), between the <configuration> tags, by using the following commands:

hadoop$ mkdir -p /usr/local/hbase/var/hbase
hadoop$ vi /usr/local/hbase/current/conf/hbase-site.xml
<property>
<name>hbase.rootdir</name>
<value>file:///usr/local/hbase/var/hbase</value>
</property>

4. Start HBase in standalone mode by using the following command:

hadoop$ $HBASE_HOME/bin/start-hbase.sh
starting master, logging to /usr/local/hbase/current/logs/hbase-hadoop-master-master1.out

5. Connect to the running HBase via HBase Shell, using the following command:

hadoop$ $HBASE_HOME/bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.92.1, r1298924, Fri Mar 9 16:58:34 UTC 2012

6. Verify HBase's installation by creating a table and then inserting some values. Create a table named test, with a single column family named cf1, as shown here:
```
hbase(main):001:0> create 'test', 'cf1'
0 row(s) in 0.7600 seconds
```
i. In order to list the newly created table, use the following command:
```
hbase(main):002:0> list
TABLE
test
1 row(s) in 0.0440 seconds
```
ii. In order to insert some values into the newly created table, use the following commands:
```
hbase(main):003:0> put 'test', 'row1', 'cf1:a', 'value1'
0 row(s) in 0.0840 seconds
hbase(main):004:0> put 'test', 'row1', 'cf1:b', 'value2'
0 row(s) in 0.0320 seconds
```

7. Verify the data we inserted into HBase by using the scan command:

hbase(main):005:0> scan 'test'
ROW                  COLUMN+CELL
 row1                column=cf1:a, timestamp=1320947312117, value=value1
 row1                column=cf1:b, timestamp=1320947363375, value=value2
1 row(s) in 0.2530 seconds

8. Now clean up all that was done, by using the disable and drop commands:
i. In order to disable the table test, use the following command:
```
hbase(main):006:0> disable 'test'
0 row(s) in 7.0770 seconds
```
ii. In order to drop the table test, use the following command:
```
hbase(main):007:0> drop 'test'
0 row(s) in 11.1290 seconds
```
9. Exit from HBase Shell using the following command:
```
hbase(main):010:0> exit
```
10. Stop the HBase instance by executing the stop script:

hadoop$ /usr/local/hbase/current/bin/stop-hbase.sh
stopping hbase.......

How it works...

We installed HBase 0.92.1 on a single server. We used a symbolic link named current to point at the installation, so that future version upgrades are easy to do.
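To illustrate why the symbolic link helps, an upgrade can be reduced to unpacking the new release next to the old one and repointing current. The following is a minimal sketch under that assumption; `switch_release` is a hypothetical helper, and we assume HBase has been stopped before the link is changed:

```shell
#!/bin/sh
# Sketch: repoint the "current" symlink at another unpacked release.
# switch_release is a hypothetical helper; stop HBase before calling it.
switch_release() {
    base_dir="$1"      # for this recipe: /usr/local/hbase
    release_dir="$2"   # for this recipe: hbase-0.92.1 (relative to base_dir)
    if [ ! -d "$base_dir/$release_dir" ]; then
        echo "error: $base_dir/$release_dir does not exist" >&2
        return 1
    fi
    # -f replaces the existing link; -n keeps ln from following the old
    # link and creating the new one inside the old release directory.
    ( cd "$base_dir" && ln -sfn "$release_dir" current )
}

# Example, using the layout from this recipe:
#   switch_release /usr/local/hbase hbase-0.92.1
```

Because HBASE_HOME refers to /usr/local/hbase/current, nothing else in the environment has to change after the link is switched. Configuration files under conf would still need to be carried over to the new release by hand.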

To tell HBase where Java is installed, we set JAVA_HOME in hbase-env.sh, which is the environment setting file of HBase. You will see some Java heap and HBase daemon settings in it too. We will discuss these settings in the last two chapters of this book.
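If you prefer not to edit hbase-env.sh by hand, the same setting can be applied from a script. The following is a minimal sketch; `set_java_home` is a hypothetical helper that replaces any active JAVA_HOME export in the file and leaves commented lines untouched:

```shell
#!/bin/sh
# Sketch: set JAVA_HOME in an hbase-env.sh file without opening an editor.
# set_java_home is a hypothetical helper, not something shipped with HBase.
set_java_home() {
    env_file="$1"
    java_home="$2"
    tmp_file="$env_file.tmp.$$"
    # Keep every line except an active JAVA_HOME export (commented ones stay).
    grep -v '^export JAVA_HOME=' "$env_file" > "$tmp_file" || true
    echo "export JAVA_HOME=$java_home" >> "$tmp_file"
    # Write back in place so the original file's permissions are preserved.
    cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}

# Example, using the paths from this recipe:
#   set_java_home "$HBASE_HOME/conf/hbase-env.sh" /usr/local/jdk1.6
```

Running the helper twice leaves a single export line in the file, so it is safe to call from a provisioning script.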

In step 3, we created a directory on the local filesystem for HBase to store its data. For a fully distributed installation, HBase needs to be configured to use HDFS instead of a local filesystem. The HBase master daemon (HMaster) is started on the server where start-hbase.sh is executed. As we did not configure any region servers here, HBase also starts a single slave daemon (HRegionServer) in the same JVM.
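For reference, the complete hbase-site.xml for this standalone setup is small enough to generate from a script. The following is a minimal sketch; `write_standalone_site` is a hypothetical helper, and note that it overwrites the target file rather than merging with existing properties:

```shell
#!/bin/sh
# Sketch: write a minimal hbase-site.xml for standalone mode.
# write_standalone_site is a hypothetical helper; it OVERWRITES site_file.
write_standalone_site() {
    site_file="$1"
    root_dir="$2"   # absolute local path, e.g. /usr/local/hbase/var/hbase
    cat > "$site_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$root_dir</value>
  </property>
</configuration>
EOF
}

# Example, using the paths from this recipe:
#   write_standalone_site "$HBASE_HOME/conf/hbase-site.xml" /usr/local/hbase/var/hbase
```

Because root_dir is an absolute path, prefixing it with `file://` yields the three-slash form `file:///usr/local/hbase/var/hbase` used in the recipe.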

As we mentioned in the Introduction section, HBase depends on ZooKeeper as its coordination service. You may have noticed that we didn't start ZooKeeper in the previous steps. This is because HBase will start and manage its own ZooKeeper ensemble, by default.

Then we connected to HBase via HBase Shell. Using HBase Shell, you can manage your cluster, access data in HBase, and do many other jobs. Here, we just created a table called test, inserted data into it, scanned it, disabled and dropped it, and then exited the shell.
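The same shell session can also be replayed without typing each command. The following is a minimal sketch; `write_smoke_test` is a hypothetical helper that writes the commands from this recipe to a file. We assume here that the hbase shell command accepts a script path as its argument, so please verify this against your HBase version:

```shell
#!/bin/sh
# Sketch: save the shell commands from this recipe as a reusable script.
# write_smoke_test is a hypothetical helper, not part of HBase.
write_smoke_test() {
    script_file="$1"
    # The quoted 'EOF' keeps the shell from expanding anything in the body.
    cat > "$script_file" <<'EOF'
create 'test', 'cf1'
put 'test', 'row1', 'cf1:a', 'value1'
put 'test', 'row1', 'cf1:b', 'value2'
scan 'test'
disable 'test'
drop 'test'
exit
EOF
}

# Example (assumption: "hbase shell <file>" runs the file's commands):
#   write_smoke_test /tmp/hbase-smoke-test.rb
#   "$HBASE_HOME/bin/hbase" shell /tmp/hbase-smoke-test.rb
```

The final exit line matters: without it, the shell would stay open at the interactive prompt after the script finishes.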

HBase can be stopped using its stop-hbase.sh script. This script stops both HMaster and HRegionServer daemons.
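To confirm that nothing is left running after the stop script returns, you can look for the HBase daemons in the output of jps, the process listing tool that ships with the JDK. The following is a minimal sketch; `hbase_running` is a hypothetical helper that reads jps-style "pid name" lines from standard input:

```shell
#!/bin/sh
# Sketch: detect HBase daemons in jps-style output ("<pid> <main class>").
# hbase_running is a hypothetical helper; it succeeds if a daemon is listed.
hbase_running() {
    awk '$2 == "HMaster" || $2 == "HRegionServer" { found = 1 }
         END { exit (found ? 0 : 1) }'
}

# Example:
#   if jps | hbase_running; then
#       echo "HBase is still running"
#   else
#       echo "HBase has stopped"
#   fi
```

In standalone mode all daemons share one JVM, so we would expect at most a single HMaster entry; the check for HRegionServer is included only so that the helper also works against a distributed node.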

Yifeng Jiang is a Hadoop and HBase Administrator and Developer at Rakuten, the largest e-commerce company in Japan. After graduating from the University of Science and Technology of China with a B.S. in Information Management Systems, he started his career as a professional software engineer, focusing on Java development. In 2008, he started looking over the Hadoop project. In 2009, he led the development of his previous company's display advertisement data infrastructure using Hadoop and Hive. In 2010, he joined his current employer, where he designed and implemented the Hadoop- and HBase-based, large-scale item ranking system. He is also one of the members of the Hadoop team in the company, which operates several Hadoop/HBase clusters.