More Information
  • Familiarize yourself with HDFS and daemons
  • Determine backup areas, disaster recovery principles, and backup needs
  • Understand the necessity for Hive metadata backup
  • Discover HBase to explore different backup styles, such as snapshot, replication, copy table, the HTable API, and manual backup
  • Learn the key considerations of a recovery strategy and restore data in the event of accidental deletion
  • Tune the performance of a Hadoop cluster and recover from scenarios such as failover, corruption, working drives, and NameNodes
  • Monitor node health, and explore various techniques for checks, including HDFS checks and MapReduce checks
  • Identify common hardware failure points and discover mitigation techniques

Hadoop offers distributed processing of large datasets across clusters and is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance. It enables computing solutions that are scalable, cost-effective, flexible, and fault tolerant, and that protect very large datasets from hardware failures.

Starting off with the basics of Hadoop administration, this book moves on to the best strategies for backing up distributed storage databases.

You will gradually learn about backup and recovery principles, discover the common failure points in Hadoop, and pick up the facts about backing up Hive metadata. A deep dive into Apache HBase will show you different ways of backing up data and compare them. Going forward, you'll learn how to define recovery strategies for various causes of failure, including failover recoveries, corruption, working drives, and metadata. Also covered are the concepts of Hadoop metrics and MapReduce. Finally, you'll explore troubleshooting strategies and techniques to resolve failures.

  • Learn the fundamentals of Hadoop’s backup needs, recovery strategy, and troubleshooting
  • Determine common failure points, get familiar with HBase, and explore different backup techniques to resolve failures
  • Explore common issues and their solutions using in-depth knowledge of Hadoop
Page Count 206
Course Length 6 hours 10 minutes
ISBN 9781783289042
Date Of Publication 27 Jul 2015


Gaurav Barot

Gaurav Barot is an experienced software architect and PMP-certified project manager with more than 12 years of experience. He has a unique combination of experience in enterprise resource planning, sales, education, and technology. He has served as an enterprise architect and project leader in projects in various domains, including healthcare, risk, insurance, media, and so on for customers in the UK, USA, Singapore, and India.

Gaurav holds a bachelor's degree in IT engineering from Sardar Patel University and has completed his postgraduate studies in IT at Deakin University, Melbourne.

Chintan Mehta

Chintan Mehta is a cofounder of KNOWARTH Technologies and heads the cloud/RIMS/DevOps team. He has rich, progressive experience in Linux server administration, AWS Cloud, DevOps, RIMS, and open source technologies. He is also an AWS Certified Solutions Architect. Chintan has authored MySQL 8 for Big Data, Mastering Apache Solr 7.x, MySQL 8 Administrator's Guide, and Hadoop Backup and Recovery Solutions. He has also reviewed Liferay Portal Performance Best Practices and Building Serverless Web Applications.

Amij Patel

Amij Patel is a cofounder of KNOWARTH Technologies and leads the mobile, UI/UX, and e-commerce verticals. He is an out-of-the-box thinker with a proven track record of designing and delivering the best design solutions for enterprise applications and products.

He has a lot of experience in the Web, portals, e-commerce, rich Internet applications, user interfaces, big data, and open source technologies. His passion is to make applications and products interactive and user friendly using the latest technologies. Amij has a unique ability—he can deliver or execute on any layer and technology from the stack.

Throughout his career, he has been honored with awards for making valuable contributions to businesses and delivering excellence in different roles, such as practice leader, architect, and team leader. He is a cofounder of various community groups, such as Ahmedabad JS and the Liferay UI developers' group. These groups focus on sharing knowledge of UI technologies and upcoming trends with the broader community. Amij is respected as a motivator who leads by example, a change agent, and a proponent of empowerment and accountability.