More Information
Learn
  • Understand the Sqoop import arguments and the provided examples to master moving data from RDBMS to Hadoop
  • Get to know the Sqoop incremental import feature
  • Understand the HBase table structure, HBase basic commands, and learn how to move data from RDBMS to HBase
  • Learn about the Hive table structure, Hive basic commands, and understand the provided examples to discover how to move data from RDBMS to Hive
  • Explore the Sqoop export arguments and learn how to move processed data from Hadoop to RDBMS
  • Learn how to move data from Hive to RDBMS
  • Discover Sqoop third-party connectors
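The import topics above center on a handful of `sqoop import` invocations. The following is a minimal sketch, not from the book itself; the host `db.example.com`, database `sales`, table `orders`, and column `order_id` are all hypothetical placeholders:

```shell
# Plain import: copy an RDBMS table into HDFS
# (connection details below are illustrative only)
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username analyst --password-file /user/analyst/.dbpass \
  --table orders \
  --target-dir /data/orders

# Incremental import: append only rows whose check column
# exceeds the last value seen in a previous run
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username analyst --password-file /user/analyst/.dbpass \
  --table orders \
  --incremental append --check-column order_id --last-value 100000

# Import directly into a Hive table instead of raw HDFS files
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username analyst --password-file /user/analyst/.dbpass \
  --table orders \
  --hive-import --hive-table orders

# Import into HBase, mapping rows to a column family keyed by order_id
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username analyst --password-file /user/analyst/.dbpass \
  --table orders \
  --hbase-table orders --column-family cf --hbase-row-key order_id
```

The book itself walks through these arguments and their variants in detail; the commands here only illustrate the shape of each workflow.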
About

In today’s world, data volumes are growing at a very fast rate, and people want to perform analytics by combining different sources of data (RDBMS, text, and so on). Using Hadoop for analytics requires you to load data from RDBMS into Hadoop, perform analytics on that data, and then load the processed data back into RDBMS to generate business reports.
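The return leg of that round trip is handled by `sqoop export`. A minimal sketch, assuming analytics results sit in an HDFS directory `/data/order_summary` and are destined for a hypothetical `reports` database (all names illustrative):

```shell
# Export processed HDFS data back into an RDBMS table for reporting
# (the target table must already exist in the database)
sqoop export \
  --connect jdbc:mysql://db.example.com/reports \
  --username analyst --password-file /user/analyst/.dbpass \
  --table order_summary \
  --export-dir /data/order_summary \
  --input-fields-terminated-by ','
```

`--input-fields-terminated-by` must match the delimiter the analytics job used when writing its output files.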

Instant Apache Sqoop is a practical, hands-on guide that provides you with a number of clear, step-by-step exercises that will help you to take advantage of the real power of Apache Sqoop and give you a good grounding in the knowledge required to transfer data between RDBMS and the Hadoop ecosystem.

Instant Apache Sqoop looks at the import/export process required in data transfer and discusses examples of each process. It will also give you an overview of HBase and Hive table structures and how you can populate HBase and Hive tables. The book will finish by taking you through a number of third-party Sqoop connectors.

You will also learn about the various import and export arguments and how you can use them to move data between RDBMS and the Hadoop ecosystem, along with the architecture of the import and export processes. If you want to move data between RDBMS and the Hadoop ecosystem, then this is the book for you.

You will learn everything that you need to know to transfer data between RDBMS and the Hadoop ecosystem as well as how you can add new connectors into Sqoop.

Features
  • Learn something new in an Instant! A short, fast, focused guide delivering immediate results
  • Learn how to transfer data between RDBMS and Hadoop using Sqoop
  • Add a third-party connector into Sqoop
  • Export data from Hadoop and Hive to RDBMS
  • Describe third-party Sqoop connectors
Page Count 58
Course Length 1 hour 44 minutes
ISBN 9781782165774
Date Of Publication 25 Aug 2013

Authors

Ankit Jain

Ankit Jain currently works as a senior research scientist at Uber AI Labs, the machine learning research arm of Uber. His work primarily involves the application of deep learning methods to a variety of Uber's problems, ranging from forecasting and food delivery to self-driving cars. Previously, he worked in a variety of data science roles at Bank of America, Facebook, and other start-ups. He has been a featured speaker at many top AI conferences and universities, including UC Berkeley and the O'Reilly AI Conference. He has a keen interest in teaching and has mentored over 500 students in AI through various start-ups and bootcamps. He completed his MS at UC Berkeley and his BS at IIT Bombay (India).