Building a Big Data Analytics Stack [Video]
Building a Big Data ecosystem is hard. There are a variety of technologies available, and each has its own pros and cons. As software engineers building a big data pipeline, we need to use lower-level tools and APIs such as HBase and Apache Spark.
In this course, we’ll check out HBase, a database built on top of HDFS. Moving on, we’ll have a bit of fun with Spark MLlib. Finally, you’ll get an understanding of ETL and deploy a Hadoop project to the cloud.
By the end of the course, you’ll also be able to use higher-level tools with more user-friendly, declarative APIs, such as Pig and Hive.
Style and Approach
This course will give you both a knowledge-based understanding and practical hands-on experience of Hadoop 2.7. It also looks at Spark, Pig, Hive, HBase, and YARN, so you can understand how to implement these components while using Hadoop clusters.
- Publication date: November 2017
- Publisher: Packt
- Duration: 1 hour 31 minutes
- ISBN: 9781787125018