Hadoop Blueprints

Use Hadoop to solve business problems by learning from a rich set of real-life case studies

Anurag Shrivastava, Tanmay Deshpande

eBook: $35.99 (RRP $35.99)
Print + eBook: $44.99 (RRP $44.99)

Book Details

ISBN 13: 9781783980307
Paperback: 316 pages

Book Description

If you have a basic understanding of Hadoop and want to put your knowledge to use to build fantastic Big Data solutions for business, then this book is for you. Build six real-life, end-to-end solutions using the tools in the Hadoop ecosystem, and take your knowledge of Hadoop to the next level.

Start off by understanding various business problems which can be solved using Hadoop. You will also get acquainted with the common architectural patterns which are used to build Hadoop-based solutions. Build a 360-degree view of the customer by working with different types of data, and build an efficient fraud detection system for a financial institution. You will also develop a system in Hadoop to improve the effectiveness of marketing campaigns. Build a churn detection system for a telecom company, develop an Internet of Things (IoT) system to monitor the environment in a factory, and build a data lake – all making use of the concepts and techniques mentioned in this book.

The book also covers other technologies and frameworks, such as Apache Spark, Hive, and Sqoop, and shows how they can be used in conjunction with Hadoop. You will be able to try out the solutions explained in the book and use the knowledge gained to extend them further in your own problem space.

Table of Contents

Chapter 1: Hadoop and Big Data
The beginning of the big data problem
Building open source Hadoop
Enterprise Hadoop
The design of the Hadoop system
MapReduce
Building a MapReduce Version 2 program
Hadoop platform tools
Big data use cases
The architecture of Hadoop-based systems
Summary
Chapter 2: A 360-Degree View of the Customer
Capturing business information
Setting up the technology stack
Test driving Hive and Sqoop
Engineering the solution
Presenting the view
Summary
Chapter 3: Building a Fraud Detection System
Understanding the business problem
Selecting and cleansing the dataset
Machine learning for fraud detection
Designing the high-level architecture
Creating our fraud detection model
Putting the fraud detection model to use
Chapter 4: Marketing Campaign Planning
Creating the solution outline
Supervised learning
Tree-structure models for classification
Finding the right dataset
Setting up the solution architecture
Building the machine learning model
Running the Model on Hadoop
Creating the target list
Post campaign activities
Summary
Chapter 5: Churn Detection
A business case for churn detection
Creating the solution outline
Building a churn predictor using Hadoop
Summary
Chapter 6: Analyze Sensor Data Using Hadoop
A business case for sensor data analytics
Creating the solution outline
Technology stack
Batch data analytics
Stream data analytics
Summary
Chapter 7: Building a Data Lake
Data lake building blocks
Hadoop security
Apache Ranger
Apache Flume
Apache Zeppelin
Technology stack for Data Lake
Data Lake business requirements
Summary
Chapter 8: Future Directions
Hadoop solutions team
Hadoop on Cloud
NoSQL databases
Summary

What You Will Learn

  • Learn about the evolution of Hadoop as the big data platform
  • Understand the basics of Hadoop architecture
  • Build a 360-degree view of your customer using Sqoop and Hive
  • Build and run classification models on Hadoop using BigML
  • Use Spark and Hadoop to build a fraud detection system
  • Develop a churn detection system using Java and MapReduce
  • Build an IoT-based data collection and visualization system
  • Get to grips with building a Hadoop-based Data Lake for large enterprises
  • Learn about the coexistence of NoSQL and in-memory databases in the Hadoop ecosystem
