You're reading from Data Engineering with dbt

Product type: Book
Published in: Jun 2023
Publisher: Packt
ISBN-13: 9781803246284
Edition: 1st Edition
Author: Roberto Zagni

Roberto Zagni is a senior leader with extensive hands-on experience in data architecture, software development, and agile methodologies. Roberto is an Electronic Engineer by training with a special interest in bringing software engineering best practices to cloud data platforms and growing great teams that enjoy what they do. He has been helping companies make better use of their data and, more recently, transition to cloud-based data automation with an agile mindset and proper software engineering tools and processes, also known as DataOps. Roberto also coaches data teams hands-on in practical data architecture and the use of patterns, testing, version control, and agile collaboration. Since 2019, his go-to tools have been dbt, dbt Cloud, and Snowflake or BigQuery.

Building the STG model for the first dimension

We have instructed dbt to load our CSV data into a table, and we can use that table as we would any other model, through the ref(…) function.

That would work, but a CSV file under the seed folder could be only a temporary solution, as it is in this case. We therefore prefer to access the data loaded as a seed through sources, as we do with any external data.

Defining the external data source for seeds

We will define an external data source to read our seeds in a metadata-driven way, so that we can easily adapt if the data stops coming from a seed and starts coming from a different place, such as a table from a master data system or a file from a data lake.

Let’s create a YAML file to define the sources for the seeds, and then add the config for the seed that we have just created:

  1. Create a new file named source_seed.yml in the models folder.
  2. Add the configuration for the seed external source, as in the...
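
The configuration that the step above refers to is not reproduced here. As a rough illustration only, a seed exposed as a source might be declared along the following lines. The source name seeds, the table name my_seed_table, and the use of target.schema are assumptions made for this sketch, not the book's actual values; the schema must match wherever your project loads its seeds.

```yaml
# models/source_seed.yml
# Minimal sketch under stated assumptions; all names below are hypothetical.
version: 2

sources:
  - name: seeds                      # hypothetical source name
    # Assumes seeds are loaded into the default target schema.
    # Adjust this if your project configures a custom schema for seeds.
    schema: "{{ target.schema }}"
    tables:
      # Must match the name of the table that dbt created from the CSV,
      # which by default is the CSV file name without its extension.
      - name: my_seed_table          # hypothetical seed table name
```

With a definition like this in place, a staging model could read the data with source('seeds', 'my_seed_table') instead of ref('my_seed_table'). If the data later moved to a different table, only the YAML entry would need to change, not the model that reads it.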