
You're reading from  Simplifying Data Engineering and Analytics with Delta

Product type: Book
Published in: Jul 2022
Publisher: Packt
ISBN-13: 9781801814867
Edition: 1st Edition
Author: Anindita Mahapatra

Anindita Mahapatra is a Solutions Architect at Databricks in the data and AI space, helping clients across all industry verticals reap value from their data infrastructure investments. She teaches a data engineering and analytics course at Harvard University as part of its extension school program. She has extensive big data and Hadoop consulting experience from Thinkbig/Teradata, prior to which she managed the development of algorithmic app discovery and promotion for both the Nokia and Microsoft app stores. She holds a Master's degree in Liberal Arts and Management from Harvard Extension School, a Master's in Computer Science from Boston University, and a Bachelor's in Computer Science from BITS Pilani, India.

Monitoring data drift

The famous Greek philosopher Heraclitus said, "Change is the only constant in life."

Drift is movement away from an expected norm. In the world of data, it applies in several contexts: drift in the data, in the model, in performance, and in business metrics. Most model drift is caused by drift in the data. We detect drift in a model by monitoring its quality with metrics such as the F1 score, precision, and recall. If the values fall below a certain threshold, this signals that the business logic needs to be re-evaluated. Drift is usually noticed only once it surfaces as model drift, but that is too late in the pipeline. Profiling the data continuously helps detect drift sooner.
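The threshold check described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the book's implementation: the function names (`compute_metrics`, `drifted_metrics`), the 0.8 thresholds, and the sample labels are all hypothetical, and the sketch covers a binary classifier only.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    precision: float
    recall: float
    f1: float


def compute_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(precision, recall, f1)


def drifted_metrics(metrics, thresholds):
    """Return the names of the metrics that fell below their threshold."""
    return [name for name, floor in thresholds.items()
            if getattr(metrics, name) < floor]


# Hypothetical thresholds; real values depend on the use case.
THRESHOLDS = {"precision": 0.8, "recall": 0.8, "f1": 0.8}

# Hypothetical labels from a recent scoring window.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

current = compute_metrics(y_true, y_pred)
alerts = drifted_metrics(current, THRESHOLDS)
if alerts:
    print(f"Possible model drift, metrics below threshold: {alerts}")
```

In practice, a check like this would run on a schedule against each new batch of labeled predictions, and a non-empty `alerts` list would trigger a review of the model and its input data.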

Drift can be classified into two categories, as follows:

  • Data drift:
    • New fields get added, older fields get dropped or changed, or the statistical quality of the data changes because the product was introduced...