You're reading from Mastering Apache Storm

Product type: Book
Published in: Aug 2017
Reading level: Expert
ISBN-13: 9781787125636
Edition: 1st Edition
Author (1)
Ankit Jain

Ankit Jain holds a bachelor's degree in computer science and engineering. He has six years' experience in designing and architecting solutions for the big data domain and has been involved with several complex engagements. His technical strengths include Hadoop, Storm, S4, HBase, Hive, Sqoop, Flume, Elasticsearch, machine learning, Kafka, Spring, Java, and J2EE. He also shares his thoughts on his personal blog. You can follow him on Twitter at @mynameisanky. He spends most of his time reading books and playing with different technologies. When not at work, he spends time with his family and friends watching movies and playing games.

Introduction to Hadoop


Apache Hadoop is an open-source platform for developing and deploying big data applications. It was initially developed at Yahoo!, based on the MapReduce and Google File System papers published by Google. Over the past few years, Hadoop has become the flagship big data platform.

In this section, we will discuss the key components of a Hadoop cluster.

Hadoop Common

Hadoop Common is the base library on which the other Hadoop modules are built. It provides an abstraction over OS and filesystem operations so that Hadoop can be deployed on a variety of platforms.

Hadoop Distributed File System

Commonly known as HDFS, the Hadoop Distributed File System is a scalable, distributed, fault-tolerant filesystem. HDFS acts as the storage layer of the Hadoop ecosystem. It allows the sharing and storage of data and application code among the various nodes in a Hadoop cluster.

The following are the key assumptions made while designing HDFS:

  • It should be deployable on a cluster of commodity hardware.
  • Hardware...