You're reading from Driving Data Quality with Data Contracts

Product type: Book
Published in: Jun 2023
Publisher: Packt
ISBN-13: 9781837635009
Edition: 1st Edition
Author (1)
Andrew Jones

Andrew Jones is a principal engineer at GoCardless, one of Europe's leading fintechs. He has over 15 years of experience in the industry, the first half primarily as a software engineer, before he moved into the data infrastructure and data engineering space. Joining GoCardless as its first data engineer, he led his team to build their data platform from scratch. After initially following a typical data architecture and growing frustrated with the same old challenges he'd faced for years, he started thinking there must be a better way, which led to him coining and defining the ideas around data contracts. Andrew is a regular speaker and writer, and he is passionate about helping organizations get maximum value from data.

Creating a data contract

We’ll start by defining a specification for data generators to create a data contract. We’ll discuss why we have chosen to define it in this way, and how it acts as the foundation of our sample implementation.

We’ll be using this data contract to drive the contract-driven architecture we’ll be building out in this chapter. It will be the foundation that drives the following resources and services:

  • A BigQuery table, acting as the interface to the data.
  • Code libraries for the data generators to use, created by converting our data contract to JSON Schema and using existing open source libraries.
  • A schema registry, so the schemas are available to others. Again, we use our JSON Schema representation of the data contract to interact with it.
  • An anonymization service, which uses the data contract directly to anonymize some data.
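To make the list above more concrete, here is a minimal sketch of how a single contract definition might drive several of these resources. The contract layout (the `fields`, `required`, and `anonymize` keys), the `customer` example, and all function names are assumptions made for illustration; they are not the specification defined in this chapter.

```python
import hashlib

# Hypothetical data contract. The keys used here are illustrative
# assumptions, not the book's actual specification.
CONTRACT = {
    "name": "customer",
    "description": "A customer of the business.",
    "fields": {
        "id": {"type": "string", "required": True},
        "email": {"type": "string", "required": True, "anonymize": "hash"},
        "age": {"type": "integer", "required": False},
    },
}

# Mapping from the contract's (assumed) type names to BigQuery column types.
BQ_TYPES = {
    "string": "STRING",
    "integer": "INTEGER",
    "number": "FLOAT",
    "boolean": "BOOLEAN",
}


def to_json_schema(contract):
    """Build a JSON Schema document from the hypothetical contract."""
    properties = {}
    required = []
    for name, spec in contract["fields"].items():
        properties[name] = {"type": spec["type"]}
        if spec.get("required"):
            required.append(name)
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": contract["name"],
        "description": contract["description"],
        "type": "object",
        "properties": properties,
        "required": required,
    }


def to_bigquery_fields(contract):
    """Build a BigQuery-style JSON field list (name, type, mode)."""
    return [
        {
            "name": name,
            "type": BQ_TYPES[spec["type"]],
            "mode": "REQUIRED" if spec.get("required") else "NULLABLE",
        }
        for name, spec in contract["fields"].items()
    ]


def anonymize(contract, record):
    """Hash every field the contract marks with anonymize == "hash".

    Plain SHA-256 is used only to keep the sketch short; a real
    service would likely need a stronger strategy.
    """
    result = dict(record)
    for name, spec in contract["fields"].items():
        if spec.get("anonymize") == "hash" and name in result:
            value = str(result[name]).encode("utf-8")
            result[name] = hashlib.sha256(value).hexdigest()
    return result
```

The point of the sketch is the shape of the design: the JSON Schema, the table definition, and the anonymization behavior are all derived from the one contract, so a change to the contract flows through to each of them.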

The following diagram shows how each of these resources is driven by the data contract...
