Hadoop Essentials


Product type Book
Published in Apr 2015
Publisher Packt
ISBN-13 9781784396688
Pages 194 pages
Edition 1st Edition
Author: Shiva Achari

Table of Contents (15) Chapters

Hadoop Essentials
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Introduction to Big Data and Hadoop
Hadoop Ecosystem
Pillars of Hadoop – HDFS, MapReduce, and YARN
Data Access Components – Hive and Pig
Storage Component – HBase
Data Ingestion in Hadoop – Sqoop and Flume
Streaming and Real-time Analysis – Storm and Spark
Index

An introduction to Spark


Spark is a cluster computing framework that was developed in the AMPLab at UC Berkeley and later contributed to Apache as an open source project. It processes data in memory, which makes it much faster than MapReduce: MapReduce writes intermediate data to disk, and the resulting disk access and data transfer slow it down, whereas Spark keeps intermediate data in memory. Because of these limitations and overheads of MapReduce, Spark can be thought of as an alternative to it, but not as a replacement. Spark is widely used for streaming data analytics, graph analytics, fast interactive queries, and machine learning. Its in-memory design has attracted many contributors, and it became a top-level Apache project in 2014, with over 200 contributors and more than 50 organizations. To achieve parallelism on a single node, Spark uses multiple threads rather than multiple processes.
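
The difference in how intermediate data is handled can be illustrated with a small toy sketch in plain Python. It does not use the Spark or MapReduce APIs; the function names (`tokenize`, `count`, `pipeline_via_disk`, `pipeline_in_memory`) and the sample input are hypothetical and exist only for this illustration. Both pipelines run the same two stages, but one writes the output of the first stage to a temporary file and reads it back, loosely mimicking the MapReduce style, while the other hands the data straight to the next stage in memory, loosely mimicking the Spark style.

```python
import json
import os
import tempfile
from collections import Counter

# Hypothetical sample input, used only for this illustration.
LINES = ["spark keeps data in memory", "mapreduce writes data to disk"]


def tokenize(lines):
    """Stage 1: split each line into words."""
    return [word for line in lines for word in line.split()]


def count(words):
    """Stage 2: count how often each word occurs."""
    return dict(Counter(words))


def pipeline_via_disk(lines):
    """MapReduce-like: stage 1 output is written to a file, then read back for stage 2."""
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(tokenize(lines), f)   # intermediate data goes to disk
        with open(path) as f:
            words = json.load(f)            # and is read back before stage 2
    finally:
        os.remove(path)
    return count(words)


def pipeline_in_memory(lines):
    """Spark-like: stage 1 output stays in memory and is passed directly to stage 2."""
    return count(tokenize(lines))


disk_result = pipeline_via_disk(LINES)
memory_result = pipeline_in_memory(LINES)
```

Both variants produce the same word counts; the only difference is the extra write and read of the intermediate data, which is the kind of overhead the paragraph above attributes to MapReduce. A real cluster adds network transfer and replication on top of this, which the sketch does not model.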

Spark's main motive was to develop a processing...
