Data Observability for Data Engineering

Product type: Book
Published: Dec 2023
Publisher: Packt
ISBN-13: 9781804616024
Pages: 228
Edition: 1st Edition
Authors: Michele Pinto, Sammy El Khammal

Table of Contents (17 chapters)

Preface
Part 1: Introduction to Data Observability
Chapter 1: Fundamentals of Data Quality Monitoring
Chapter 2: Fundamentals of Data Observability
Part 2: Implementing Data Observability
Chapter 3: Data Observability Techniques
Chapter 4: Data Observability Elements
Chapter 5: Defining Rules on Indicators
Part 3: How to adopt Data Observability in your organization
Chapter 6: Root Cause Analysis
Chapter 7: Optimizing Data Pipelines
Chapter 8: Organizing Data Teams and Measuring the Success of Data Observability
Part 4: Appendix
Chapter 9: Data Observability Checklist
Chapter 10: Pathway to Data Observability
Index
Other Books You May Enjoy

Mastering lineage

Lineage, or process lineage, describes the action of a data application on the schemas of its data sources. It is a link between inputs and outputs: typically one or several input schemas and one output schema.

It expresses what happens to the data inside a specific application. By extension, the lineage of a data source is the set of all the transformations that led to its creation, together with all the computations or manipulations that are based on it.
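The two halves of a data source's lineage (what produced it and what depends on it) can be illustrated with a small graph traversal. This is a minimal sketch, not the book's implementation: the job names, the data source names, and the `source_lineage` helper are all hypothetical, and lineage is simplified to the data source level rather than the schema level.

```python
from collections import defaultdict

# Hypothetical lineage edges: (application, input data source, output data source).
EDGES = [
    ("ingest_job", "crm_export.csv", "raw.customers"),
    ("clean_job", "raw.customers", "staging.customers"),
    ("report_job", "staging.customers", "reports.churn"),
    ("ml_job", "staging.customers", "ml.features"),
]


def _reach(start, adjacency):
    """Collect every (application, data source) hop reachable from start."""
    seen, stack, hops = set(), [start], set()
    while stack:
        node = stack.pop()
        for app, nxt in adjacency[node]:
            hops.add((app, nxt))
            if nxt not in seen:  # guard against cycles
                seen.add(nxt)
                stack.append(nxt)
    return hops


def source_lineage(source, edges):
    """Return the upstream and downstream lineage of one data source."""
    upstream_adj = defaultdict(list)
    downstream_adj = defaultdict(list)
    for app, src, dst in edges:
        downstream_adj[src].append((app, dst))  # src feeds dst
        upstream_adj[dst].append((app, src))    # dst was built from src
    return {
        "upstream": _reach(source, upstream_adj),
        "downstream": _reach(source, downstream_adj),
    }


lineage = source_lineage("staging.customers", EDGES)
```

Here `lineage["upstream"]` holds the transformations that ended in creating `staging.customers`, and `lineage["downstream"]` holds the computations based on it.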

As we stated previously, lineage is a link between schemas. These schemas can come from the same data source. For instance, creating a new column inside a SQL table creates a new schema within that table, fed by data coming from another schema of the same data source.
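The SQL example can be made concrete with an in-memory SQLite database. This is a sketch under stated assumptions: the `orders` table, its columns, and the shape of the `lineage` record are hypothetical, chosen only to show an input schema and an output schema living in the same data source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (price REAL, quantity INTEGER)")
conn.execute("INSERT INTO orders VALUES (2.5, 4)")


def columns(table):
    """List the column names of a table via SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


schema_before = columns("orders")

# The "application": add a column and feed it from existing columns.
conn.execute("ALTER TABLE orders ADD COLUMN total REAL")
conn.execute("UPDATE orders SET total = price * quantity")

schema_after = columns("orders")

# Hypothetical lineage record: both schemas belong to the same data source.
lineage = {
    "application": "add_total_column",
    "inputs": {"orders": ["price", "quantity"]},
    "outputs": {"orders": ["total"]},
}
```

The input schema (`price`, `quantity`) and the output schema (`total`) both sit in `orders`, yet the link between them is still lineage, because it is created by the statement that computes the new column.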

Lineage is a unique combination of data flows – a data flow being a one-to-one relationship between an input schema and output schema that occurs inside the application. Without the application, there cannot be any lineage...
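One way to picture lineage as a combination of one-to-one data flows is with a few immutable records. This is an illustrative sketch only: the `Schema`, `DataFlow`, and `Lineage` classes and the `build_lineage` helper are hypothetical names, not an API from the book or from any observability library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    source: str    # data source the schema belongs to
    fields: tuple  # field names, kept as a tuple so the schema is hashable


@dataclass(frozen=True)
class DataFlow:
    """A one-to-one relationship between an input and an output schema."""
    input: Schema
    output: Schema


@dataclass(frozen=True)
class Lineage:
    """The set of data flows that occur inside one application."""
    application: str
    flows: frozenset


def build_lineage(application, inputs, output):
    # One data flow per input schema; a frozenset drops duplicates
    # and ignores ordering, so each combination is represented once.
    return Lineage(application, frozenset(DataFlow(i, output) for i in inputs))


customers = Schema("crm.customers", ("id", "country"))
orders = Schema("shop.orders", ("customer_id", "amount"))
revenue = Schema("reports.revenue", ("country", "total_amount"))

lineage = build_lineage("revenue_job", [customers, orders], revenue)
```

A join with two inputs and one output yields two data flows, and the lineage is tied to `revenue_job`: there is no `Lineage` value without an application to attach it to.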
