Reader small image

You're reading from  Simplify Big Data Analytics with Amazon EMR

Product typeBook
Published inMar 2022
PublisherPackt
ISBN-139781801071079
Edition1st Edition
Tools
Concepts
Right arrow
Author (1)
Sakti Mishra
Sakti Mishra
author image
Sakti Mishra

Sakti Mishra is an engineer, architect, author, and technology leader with over 16 years of experience in the IT industry. He is currently working as a senior data lab architect at Amazon Web Services (AWS). He is passionate about technologies and has expertise in big data, analytics, machine learning, artificial intelligence, graph networks, web/mobile applications, and cloud technologies such as AWS and Google Cloud Platform. Sakti has a bachelor’s degree in engineering and a master’s degree in business administration. He holds several certifications in Hadoop, Spark, AWS, and Google Cloud. He is also an author of multiple technology blogs, workshops, white papers and is a public speaker who represents AWS in various domains and events.
Read more about Sakti Mishra

Right arrow

EMR release history

As Amazon EMR is built on top of the open source Hadoop ecosystem, it tries to stay up to date with the open source stable releases, which includes new features and bug fixes.

Each EMR release comprises different Hadoop ecosystem applications or services that fit together with specific versions. EMR uses Apache Bigtop, which is an open source project within the Apache community to package the Hadoop ecosystem applications or components for an EMR release.

When you launch a cluster, you need to select the EMR cluster version and with advanced options, you can identify which version of each Hadoop application is integrated into that EMR release. If you are using AWS SDK or AWS CLI commands to create a cluster, you can specify the version using the release label. Release labels follow a naming convention of emr-x.x.x, for example, emr-6.3.0.

The EMR documentation clearly lists each release version and the Hadoop components integrated into it.

The following is a diagram of the EMR 6.3.0 release, which lists a few components of Hadoop services that are integrated into it and how it compares to previous releases of EMR 6.x:

Figure 1.7 – Diagram of EMR release version comparison

Figure 1.7 – Diagram of EMR release version comparison

If you were using open source Hadoop or any third-party Hadoop clusters and then migrating to EMR, it is best to go through the release documentation, understand different versions of Hadoop applications integrated into it, find the different configurations involved related to security, network access, authentication, authorization, and so on, and then evaluate it against your current Hadoop cluster to plan for migration.

With this, you have got a good overview of Amazon EMR, its benefits, its release history, and more. Now, let's compare it with a few other AWS services that are also based on Spark workloads and understand how they compare with Amazon EMR.

Previous PageNext Page
You have been reading a chapter from
Simplify Big Data Analytics with Amazon EMR
Published in: Mar 2022Publisher: PacktISBN-13: 9781801071079
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Sakti Mishra

Sakti Mishra is an engineer, architect, author, and technology leader with over 16 years of experience in the IT industry. He is currently working as a senior data lab architect at Amazon Web Services (AWS). He is passionate about technologies and has expertise in big data, analytics, machine learning, artificial intelligence, graph networks, web/mobile applications, and cloud technologies such as AWS and Google Cloud Platform. Sakti has a bachelor’s degree in engineering and a master’s degree in business administration. He holds several certifications in Hadoop, Spark, AWS, and Google Cloud. He is also an author of multiple technology blogs, workshops, white papers and is a public speaker who represents AWS in various domains and events.
Read more about Sakti Mishra