Hands-On Big Data Processing with Hadoop 3 [Video]

More Information
  • A practical introduction to the Hadoop ecosystem and how to understand each component
  • Understand data storage and data processing in Hadoop using Unix commands
  • Manage HDFS storage and move data
  • Import structured data and query it through Hive
  • Import data from non-RDBMS sources and store it in HDFS
  • Deal with semi-structured and unstructured data through Pig

Hadoop is one of the best open-source software frameworks for distributed computing, and learning it gives you a way to ramp up your career and skills. You will start out by learning the basics of Hadoop, including its file system, HDFS, its cluster resource manager, YARN, and its many libraries and programming tools. This course will get you started with the major Hadoop components that the industry demands. You will see how structured, unstructured, and semi-structured data can be processed with Hadoop.

This course focuses mainly on the problems faced in Big Data and the solutions offered by the respective Hadoop components. You will learn to use different components and tools such as MapReduce to process raw data, and you will learn how tools such as Hive and Pig aid in this process. You will then move on to data analysis techniques with Hadoop using tools such as Hive, and learn to apply them in a real-world Big Data application. This course will teach you to perform real-time data analytics and stream and batch processing on your application. Finally, this course will also teach you how to extend your analytics solutions to the cloud.
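As a rough illustration of the MapReduce idea mentioned above, here is a minimal sketch in plain Python. It is not taken from the course and does not use Hadoop itself; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are hypothetical and chosen only to show the map, shuffle, and reduce steps on a word count.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of raw text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical sample input standing in for raw data stored in HDFS.
    sample = ["Hadoop stores data in HDFS", "Hive and Pig query data in Hadoop"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel across many machines, and the framework handles the grouping step; this sketch runs everything in one process purely to show the data flow.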

The code for this course is available on GitHub: https://github.com/PacktPublishing/Hands-on-Big-Data-Processing-with-Hadoop-3

Style and Approach

This hands-on course covers all the important aspects of Big Data processing with Hadoop 3. With a good balance between theory and practice, it will give you a complete understanding of the subject.

  • Get a clear understanding of the storage paradigm of Hadoop.
  • Understand data processing for structured, unstructured, and semi-structured data.
  • Learn data movement from various sources such as RDBMSs, web log servers, syslog servers, social media, and other sources.
Course Length 4 hours 36 minutes
ISBN 9781788997553
Date Of Publication 31 Oct 2018


Sudhanshu Saxena

Sudhanshu Saxena is a renowned name in Big Data analytics who works as a Big Data scientist and speaker, machine learning expert, and Big Data analytics trainer. After completing a Bachelor of Technology, he gained 12+ years of corporate experience as an expert facilitator and corporate behavioral trainer, skilled in designing programs, developing content, and facilitating organizational development workshops.

He leads a team of data scientists to solve business problems. He is connected with more than 55 corporates and training bodies for data science and artificial intelligence training across India. He has mentored more than 5000 hours of online classes and webinars on Big Data, Hadoop, and various other programs. He has been a trainer for more than 33 corporate trainings and 350 classroom sessions in association with different international training organizations, and a speaker at 36+ corporate sessions on machine learning and Big Data analytics and visualization. As part of the fast-changing IT industry, he realized the gap between industry trends, native technologies, and understanding, and started to share his experience and knowledge of native technology and analysis through practical experience. Presently he also provides consulting on Big Data analytics, Hadoop, and ML projects for various MNCs.