Learning Hadoop 2


Product type: Book
Published: Feb 2015
Publisher: Packt
ISBN-13: 9781783285518
Pages: 382
Edition: 1st Edition

Table of Contents (18) Chapters

Learning Hadoop 2
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
1. Introduction
2. Storage
3. Processing – MapReduce and Beyond
4. Real-time Computation with Samza
5. Iterative Computation with Spark
6. Data Analysis with Apache Pig
7. Hadoop and SQL
8. Data Lifecycle Management
9. Making Development Easier
10. Running a Hadoop Cluster
11. Where to Go Next
Index

Summary


This chapter has given a whistle-stop tour through storage on a Hadoop cluster. In particular, we covered:

  • The high-level architecture of HDFS, the main filesystem used in Hadoop

  • How HDFS works under the covers and, in particular, its approach to reliability

  • How Hadoop 2 has added significantly to HDFS, particularly in the form of NameNode HA and filesystem snapshots

  • What ZooKeeper is and how it is used by Hadoop to enable features such as NameNode automatic failover

  • An overview of the command-line tools used to access HDFS

  • The API for filesystems in Hadoop and how, at the code level, HDFS is just one implementation of a more flexible filesystem abstraction

  • How data can be serialized onto a Hadoop filesystem and some of the support provided in the core classes

  • The file formats in which data is most frequently stored in Hadoop and some of their particular use cases

In the next chapter, we will look in detail at how Hadoop provides processing frameworks that can be used to process...
