Talend for Big Data

More Information
Learn
  • Discover the structure of the Talend Unified Platform
  • Work with Talend HDFS components
  • Implement ELT processing jobs using Talend Hive components
  • Load, filter, aggregate, and store data using Talend Pig components
  • Integrate HDFS with RDBMS using Sqoop components
  • Use the streaming pattern for big data
  • Reuse the partitioning pattern for big data
About

Talend, a successful open source data integration solution, accelerates the adoption of new big data technologies and efficiently integrates them into your existing IT infrastructure. It is able to do this because of its intuitive graphical language, its multiple connectors to the Hadoop ecosystem, and its array of tools for data integration, quality, management, and governance.

This is a concise, pragmatic book that will guide you through designing and implementing big data transfers easily and performing big data analytics jobs using Hadoop technologies such as HDFS, HBase, Hive, Pig, and Sqoop. You will learn how to write complex processing jobs and how to leverage the power of Hadoop projects through the design of graphical Talend jobs using the business modeler, the metadata repository, and a palette of configurable components.

Starting with an understanding of how to process large amounts of data using Talend big data components, you will learn how to write job procedures for HDFS. You will then look at how to use Hadoop projects to process data and how to export the data to your favourite relational database system.

You will learn how to implement Hive ELT jobs, Pig aggregation and filtering jobs, and simple Sqoop jobs using the Talend big data component palette. You will also learn the basics of Twitter sentiment analysis and how to format data with Apache Hive.

Talend for Big Data will enable you to start working on big data projects immediately, from simple processing projects to complex projects using common big data patterns.

Features
  • Write complex processing jobs easily with the help of clear, step-by-step instructions
  • Compare, filter, evaluate, and group vast quantities of data using Hadoop Pig
  • Explore and perform HDFS and RDBMS integration with the Sqoop component
Page Count 96
Course Length 2 hours 52 minutes
ISBN 9781782169499
Date Of Publication 21 Feb 2014

Authors

Bahaaldine Azarmi

Bahaaldine Azarmi, or Baha for short, is a solutions architect at Elastic. Prior to this position, Baha co-founded ReachFive, a marketing data platform focused on user behavior and social analytics. Baha also worked for different software vendors such as Talend and Oracle, where he held solutions architect and architect positions. Before Machine Learning with the Elastic Stack, Baha authored books including Learning Kibana 5.0, Scalable Big Data Architecture, and Talend for Big Data. Baha is based in Paris and has an MSc in computer science from Polytech'Paris.