You're reading from Simplifying Data Engineering and Analytics with Delta

Product type: Book
Published in: Jul 2022
Publisher: Packt
ISBN-13: 9781801814867
Edition: 1st Edition
Author (1)
Anindita Mahapatra

Anindita Mahapatra is a Solutions Architect at Databricks in the data and AI space, helping clients across all industry verticals reap value from their data infrastructure investments. She teaches a data engineering and analytics course at Harvard University as part of its extension school program. She has extensive big data and Hadoop consulting experience from Thinkbig/Teradata, prior to which she managed the development of algorithmic app discovery and promotion for both Nokia and Microsoft AppStores. She holds a Master's degree in Liberal Arts and Management from Harvard Extension School, a Master's in Computer Science from Boston University, and a Bachelor's in Computer Science from BITS Pilani, India.

Motivation for Delta

Data lakes have been around long enough that the need for them is no longer questioned. What matters now is how the solution is implemented. Consolidating siloed data in one place does not by itself constitute a data lake, though it is a starting point. Layering in governance makes the data consumable and is a step toward a curated data lake. Big data systems provide scale out of the box but force us to make some accommodations for data quality. Long-established guarantees of transactional integrity were compromised on distributed systems because ACID (atomicity, consistency, isolation, durability) compliance was very hard to maintain there, so the weaker BASE (basically available, soft state, eventual consistency) properties were favored instead. All of this moved the needle in the wrong direction: pristine data lakes were turning into data swamps, where the data could not be trusted and, consequently, neither could the insights generated from it. So, what is the point of building a data lake?

Let's consider a few common...
